Microsoft's definition of what constitutes an AI PC is beginning to take shape. With the latest version of Windows, PCs get a new Copilot key and must have an NPU capable of at least 40 trillion operations per second (TOPS). Why? Because soon you will be able to run Microsoft Copilot locally on your computer.
Redmond's requirements for AI PCs running Windows were spelled out by Intel – one of the strongest players in the AI PC category – during the company's AI Summit in Taipei this week.
Running a large language model (LLM) locally will have some inherent benefits. End users should see lower latency and thus faster response times, since queries won't have to travel to and from a remote data center. Users should also, in theory at least, get better privacy.
For Microsoft, meanwhile, shifting the bulk of its AI workload to its customers' devices will free up its own resources for other work, such as helping to train OpenAI's next model.
Microsoft may be hoping to run the Copilot LLM entirely on NPUs – neural processing units – in its upcoming Windows AI PCs, judging by comments made by Intel executives at the summit. We can imagine Intel pushing in this direction to convince everyone that its silicon is powerful enough to run Redmond's stuff in the home or office.
Of course, the idea of disconnecting Copilot from the Azure umbilical may be attractive to Microsoft, but not everyone seems to be in favor of it. Personally, I would expect at least some of the processing to remain cloud-only for the foreseeable future.
Whatever route Microsoft takes, a combination of local and remote AI models is something we will see soon, though it's not yet clear under what circumstances the local models will be used. We can think of a few scenarios for how Microsoft might use local AI. The first is to offload work from its own servers.
The second would be as a fallback in the event of a network outage: think of your AI PC getting dumber rather than stopping entirely when disconnected from the network.
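That fallback pattern can be sketched in a few lines. This is purely illustrative – the class and function names (`CloudModel`, `LocalModel`, `answer`) are hypothetical stand-ins, not any real Copilot API – but it shows the "degrade, don't fail" routing described above:

```python
class LocalModel:
    """Stands in for a small on-device LLM running on the NPU."""
    def generate(self, prompt: str) -> str:
        return f"[local] {prompt}"

class CloudModel:
    """Stands in for the full Copilot model running in Azure."""
    def __init__(self, online: bool = True):
        self.online = online

    def generate(self, prompt: str) -> str:
        if not self.online:
            raise ConnectionError("no network")
        return f"[cloud] {prompt}"

def answer(prompt: str, cloud: CloudModel, local: LocalModel) -> str:
    # Prefer the stronger cloud model; if the network is down,
    # degrade to the weaker local model instead of failing outright.
    try:
        return cloud.generate(prompt)
    except ConnectionError:
        return local.generate(prompt)
```

The same dispatcher could just as easily route on cost or latency rather than connectivity, which is how the first scenario – offloading work from Microsoft's servers – would look in practice.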
We should mention that no machines currently on the market meet the hardware requirements – and we're not talking about the Copilot key on the keyboard.
The point is that NPUs are still relatively new and not yet powerful enough. AMD was among the first to put an NPU in its mobile processors, with the launch of the Ryzen 7040-series chips in early 2023. That lineup got a bump in December at the House of Zen's Advancing AI event, and AMD also brought its NPUs to the desktop with the launch of its 8000G APUs at CES in January this year.
Intel released Meteor Lake in late December.
Unfortunately, these chips manage only 10 to 16 trillion operations per second (typically INT4), well short of Microsoft's 40 TOPS specification. This means that most of the so-called AI computers on the market today will not meet the requirements.
Both Intel and AMD have more capable chips on the way, with Lunar Lake and Strix Point silicon respectively. In the near future, we will probably see moves from Qualcomm as well.
Laptops built on Qualcomm's Snapdragon X Elite processor are due sometime in mid-2024 and will feature NPUs rated at 45 TOPS. Combined with a 4.6-teraFLOPS FP32 Adreno GPU, Qualcomm says the part will be able to run AI models with up to 13 billion parameters entirely on-device, and generate 30 tokens per second when running smaller LLMs of 7 billion parameters.
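A quick back-of-envelope check suggests those figures are plausible on the compute side. Using the common rule of thumb that decoding one token costs roughly 2 FLOPs per model parameter (an approximation, not a vendor number), 30 tokens per second from a 7-billion-parameter model needs well under the Adreno GPU's quoted 4.6 teraFLOPS:

```python
# Vendor-quoted figures from the claims above.
params = 7e9          # 7B-parameter model
tokens_per_s = 30     # claimed generation rate
gpu_tflops = 4.6      # Adreno GPU, FP32

# Rule-of-thumb decode cost: ~2 FLOPs per parameter per token.
flops_per_token = 2 * params                          # ~1.4e10 FLOPs
required_tflops = flops_per_token * tokens_per_s / 1e12

print(f"~{required_tflops:.2f} TFLOPs needed of {gpu_tflops} TFLOPS available")
```

That works out to roughly 0.42 TFLOPs, a small fraction of the GPU's peak – which is consistent with the usual experience that on-device LLM decoding is limited by memory bandwidth rather than raw compute.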
As PCs arrive with higher-performance NPUs and more memory, and as small models become more capable, we suspect Microsoft will start pushing more features onto local devices – once the hardware can handle them, and once we can keep up with the ever-increasing hardware demands.