December 22, 2023 – According to reports from VentureBeat, Apple has recently unveiled two research papers showcasing significant advancements in artificial intelligence.
One of these papers introduces a novel technology for efficient language model inference, offering the potential for complex AI systems to run smoothly on devices with limited memory, such as the iPhone and iPad. In this paper, Apple’s research team tackled a key challenge associated with deploying Large Language Models (LLMs) on memory-constrained devices.
Large models like GPT-4 reportedly comprise hundreds of billions of parameters, and running them directly on consumer-grade hardware is prohibitively resource-intensive. Apple's engineers minimized data transfers from flash storage to memory during inference, achieving a 4-5x speedup on CPU and a 20-25x speedup on GPU compared with naive loading.
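The core idea reported in the paper is to avoid re-reading weights from flash on every inference step. A minimal sketch of one such strategy: cache the weights of recently activated neurons in RAM and read from flash only the rows newly needed at the current step. The class name, the toy dict standing in for flash storage, and the eviction policy below are all illustrative assumptions, not Apple's actual implementation.

```python
class WindowedWeightCache:
    """Toy sketch: load neuron weight rows from 'flash' only on cache misses,
    keeping rows activated within a sliding window of recent steps in RAM."""

    def __init__(self, flash_store, window=2):
        self.flash = flash_store   # dict: neuron id -> weight row (stands in for flash)
        self.window = window       # how many recent steps to keep cached
        self.cache = {}            # neuron id -> weight row held in RAM
        self.history = []          # sets of activated neuron ids, most recent last
        self.flash_reads = 0       # count of rows actually read from flash

    def fetch(self, active_ids):
        """Return weight rows for active_ids, reading flash only for misses."""
        for nid in active_ids:
            if nid not in self.cache:
                self.cache[nid] = self.flash[nid]
                self.flash_reads += 1
        self.history.append(set(active_ids))
        # Evict rows that were not activated within the sliding window.
        if len(self.history) > self.window:
            self.history.pop(0)
            keep = set().union(*self.history)
            self.cache = {nid: w for nid, w in self.cache.items() if nid in keep}
        return [self.cache[nid] for nid in active_ids]


flash = {i: [float(i)] for i in range(8)}   # toy "flash" holding 8 weight rows
cache = WindowedWeightCache(flash, window=2)
cache.fetch([0, 1, 2])    # cold cache: 3 flash reads
cache.fetch([1, 2, 3])    # only neuron 3 is read from flash
print(cache.flash_reads)  # prints 4 instead of 6
```

Because consecutive tokens tend to activate overlapping sets of neurons, most weight rows are served from RAM, which is the effect that reduces flash-to-memory traffic.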
This breakthrough holds immense significance for deploying advanced LLMs in resource-constrained environments, greatly enhancing their applicability and accessibility. For Apple users, these optimizations may soon enable complex AI assistants and chatbots to run seamlessly on iPhones, iPads, and other mobile devices.
Earlier, Ming-Chi Kuo, an analyst at TF International Securities, reported that the iPhone 16 is set to introduce AI-related features. Apple restructured its Siri team in the third quarter of this year with the aim of integrating AI-generated content (AIGC) capabilities and Large Language Models (LLMs).
On smartphones, voice input will serve as a crucial interface for AI, AIGC, and LLMs, so enhancing Siri's software capabilities is essential for promoting AI functionality. Kuo's latest research indicates that all iPhone 16 models will see significant upgrades in microphone specifications, including better water resistance and higher signal-to-noise ratios, improving the overall Siri user experience.