Microsoft Takes On AWS, Google And Nvidia With Maia 200 AI Chip Launch
Microsoft claims that the newly launched Maia 200 AI accelerator chip is ‘the most performant, first-party silicon from any hyperscaler.’ It’s the latest in Microsoft’s efforts to lower its reliance on third-party silicon vendors such as Intel, AMD and Nvidia.
Microsoft is taking on Amazon and Google with the launch of its Maia 200 AI accelerator chip, calling the processor “the most performant, first-party silicon from any hyperscaler.”
In announcing the inference-focused Maia 200 launch on Monday, the Redmond, Wash.-based tech giant claimed that the processor outperforms homegrown AI chips from Amazon Web Services and Google across several key measures, including low-precision numerical formats that are important for a growing number of AI inference workloads.
The Maia 200 is the latest in Microsoft’s efforts to lower its reliance on third-party silicon vendors such as Intel, AMD and Nvidia. Nvidia has dominated the AI infrastructure market with an increasingly vertically integrated stack of hardware and software, one that leaves customers and partners with fewer customization options for its fastest GPUs.
Microsoft said it has already deployed Maia 200 systems in its U.S. Central region near Des Moines, Iowa, with its U.S. West 3 region near Phoenix, Ariz., next in line to come online. More regions are expected to follow.
The systems are powering Microsoft Copilot and Microsoft Foundry workloads, according to Microsoft. They are also being used to run advanced AI models, including OpenAI’s latest GPT-5.2 models, as well as those in development by Microsoft’s Superintelligence team, which is led by Microsoft AI CEO Mustafa Suleyman.
Scott Guthrie, executive vice president of Microsoft’s cloud and AI group, said in a blog post that with the Maia 200 the company was able to achieve “higher utilization, faster time to production and sustained improvements in performance-per-dollar and per-watt at cloud scale.” He attributed those gains to efforts by Microsoft’s silicon development programs to “validate as much of the end-to-end system as possible ahead of final silicon availability.”
Microsoft claimed that the Maia 200 can achieve nearly 10,200 teraflops of 4-bit floating-point (FP4) performance. By that measure, the chip is four times as powerful as Amazon Web Services’ Trainium3 chip. The company also said Maia 200 can reach just over 5,000 teraflops of 8-bit floating-point (FP8) performance, giving it a 9 percent advantage over Google’s seventh-generation TPU and more than double the FP8 throughput of Trainium3.
Built with HBM3E high-bandwidth memory, the Maia 200 comes with 216 GB of memory and 7 TBps of memory bandwidth, compared with 144 GB and 4.9 TBps for Trainium3 and 192 GB and 7.4 TBps for TPU v7. The chip also supports 2.8 TBps of scale-up bandwidth, versus a maximum of 2.56 TBps for Trainium3 and 1.2 TBps for TPU v7.
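For readers who want to sanity-check the comparison, here is a minimal Python sketch that restates the vendor-published per-chip figures above and computes the relative advantages. The inputs are the companies’ own claims, not independent benchmarks:

```python
# Vendor-published per-chip specs quoted above (marketing claims, not benchmarks).
specs = {
    "Maia 200":  {"hbm_gb": 216, "mem_bw_tbps": 7.0, "scaleup_tbps": 2.8},
    "Trainium3": {"hbm_gb": 144, "mem_bw_tbps": 4.9, "scaleup_tbps": 2.56},
    "TPU v7":    {"hbm_gb": 192, "mem_bw_tbps": 7.4, "scaleup_tbps": 1.2},
}

maia = specs["Maia 200"]
for rival in ("Trainium3", "TPU v7"):
    r = specs[rival]
    print(f"Maia 200 vs {rival}: "
          f"{maia['hbm_gb'] / r['hbm_gb']:.2f}x HBM capacity, "
          f"{maia['mem_bw_tbps'] / r['mem_bw_tbps']:.2f}x memory bandwidth, "
          f"{maia['scaleup_tbps'] / r['scaleup_tbps']:.2f}x scale-up bandwidth")
```

By these figures, Maia 200 leads both rivals on memory capacity and scale-up bandwidth but actually trails TPU v7 slightly on raw memory bandwidth, at 7 TBps versus 7.4 TBps.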
What Microsoft didn’t address in the Monday blog post are the total performance and other specs of a server rack housing Maia 200 chips. Such details can illuminate how much performance a rack full of AI chips delivers and how much power it requires.
AWS, for instance, said its Trn3 UltraServers can pack up to 144 Trainium3 chips to deliver up to 362 petaflops of FP8 performance. Google, for its part, said its TPU v7 pod connects 9,216 seventh-generation TPUs to deliver 42.5 exaflops of FP8.
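Those rack- and pod-level figures also imply per-chip numbers that can be checked against Microsoft’s claims. A rough back-of-the-envelope calculation in Python, assuming the three vendors’ FP8 figures are directly comparable:

```python
# Per-chip FP8 throughput implied by the rack- and pod-level figures above,
# compared against Maia 200's claimed 5,000-plus FP8 teraflops per chip.
# Assumes all three vendors' FP8 numbers are directly comparable.

trn3_per_chip = 362_000 / 144        # Trn3 UltraServer: 362 PF across 144 chips
tpu7_per_chip = 42_500_000 / 9_216   # TPU v7 pod: 42.5 EF across 9,216 chips
maia200 = 5_000                      # Microsoft's stated per-chip FP8 teraflops

print(f"Trainium3: ~{trn3_per_chip:,.0f} FP8 teraflops per chip")   # ~2,514
print(f"TPU v7:    ~{tpu7_per_chip:,.0f} FP8 teraflops per chip")   # ~4,611
print(f"Maia 200 advantage: {maia200 / trn3_per_chip:.2f}x over Trainium3, "
      f"{maia200 / tpu7_per_chip:.2f}x over TPU v7")
# Output: 1.99x over Trainium3 and 1.08x over TPU v7 -- consistent with
# Microsoft's "over double" and roughly 9 percent claims above.
```

The implied per-chip figures line up with Microsoft’s stated advantages, though cross-vendor FLOPS claims rarely translate directly into comparable real-world throughput.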
A Microsoft spokesperson did not respond to a request for similar details.
While Microsoft also didn’t provide competitive comparisons on energy consumption or cost, the company said Maia 200 delivers 30 percent more performance-per-dollar than the first-generation Maia 100. And it achieves that improvement with a 750-watt thermal design power, only 50 watts higher than its predecessor’s maximum power envelope. (Microsoft has said it provisions Maia 100 at 500 watts.)
How Maia 200 ultimately competes against AWS’ Trainium3 and Google’s TPU v7 will come down to two factors: how much it costs customers to run their workloads, and how effectively they can use each cloud service provider’s software stack.