How Arm Is Winning Over AWS, Google, Microsoft And Nvidia In Data Centers
In an in-depth interview with CRN, Mohamed Awad, the head of Arm’s infrastructure business, talks about how the British chip designer has managed to convince Amazon Web Services, Google Cloud, Microsoft Azure and Nvidia to use its chip technologies to design custom CPUs and other critical components in data centers. He also teases what’s next for Arm.
When Amazon Web Services executive Dave Brown announced in December that more than half of the cloud giant’s new CPU capacity from the past two years came from its Arm-based Graviton chips, it showed how much AWS has shifted away from x86 processors designed by Intel and AMD in favor of homegrown silicon.
The disclosure came as a surprise to Mohamed Awad, the general manager of Arm’s infrastructure business, who is responsible for driving the British chip designer’s business with the likes of AWS, Microsoft Azure, Google Cloud and Nvidia. These companies have increasingly embraced Arm’s instruction set architecture, CPU blueprints and other modular technologies for their own custom chip designs.
“It was that meaningful where I—to be honest with you—even I was like, ‘Whoa.’ I knew it was big. I’m not sure I quite appreciated how committed these guys are. And we’re seeing, in many ways, all of the hyperscalers with similar sorts of commitment,” Awad, a Broadcom veteran who has run Arm’s infrastructure business for a total of four years, said in a December interview with CRN, a week after Brown made the statement.
While Intel has long been the dominant provider of server CPUs for data centers and cloud infrastructure, the semiconductor giant’s influence has been weakened over the past decade by two main forces: AMD, with its processors that are based on the same x86 instruction set architecture, and Arm, whose namesake architecture has enabled the introduction of competing chips from some of Intel’s biggest customers—AWS, Microsoft Azure and Google Cloud—and its fast-growing rival, Nvidia.
AWS was the first among the group to introduce Arm-based server CPUs with its Graviton chips in 2018. Nvidia eventually followed with the reveal of its Grace CPU for the Grace and Grace Hopper Superchips in 2022. Two years later, Microsoft Azure and Google Cloud launched cloud instances powered by their Cobalt and Axion CPUs, respectively.
In his interview with CRN, Awad explained how and why Arm—which became a publicly traded company again in late 2023 after a failed acquisition attempt by Nvidia and seven years of ownership by Japanese investment giant SoftBank Group—has been able to win over the world’s largest hyperscalers and AI computing giant Nvidia.
While Awad credited Arm’s ascent in data centers to the investments Arm and its partners have made in enabling software to run well on Arm-based CPUs, as well as to the server-class Neoverse cores it introduced in 2019, the executive also said the ever-growing compute and networking requirements of AI workloads are another reason these firms turned to Arm.
With AI requiring a massive amount of data and compute within the constraints of power, space and money, there has been a growing need for these companies to design at the system level so that each component is optimized to deliver the best possible efficiency and performance, according to Awad. And this has resulted in them increasingly building their own components, whether that’s a CPU, an accelerator chip or networking silicon.
“So whether you’re Google or Microsoft or AWS or Nvidia, you're looking at your system and you’re going, ‘Wow, I can’t just take off-the-shelf stuff and keep doing what I'm doing. I have to redesign from the ground up,’” he said.
This is seen in Nvidia’s new Grace Blackwell GB200 NVL72 rack-scale server platform, which the company has promoted as the flagship vehicle for its Blackwell GPU and which is starting to become available through OEMs and cloud service providers.
The platform includes Nvidia-designed server boards with 36 Grace Blackwell GB200 Superchips, each of which contains an Arm-based Grace CPU and two Blackwell GPUs designed by Nvidia. But it also includes Nvidia’s BlueField-3 data processing units (DPUs) and NVLink Switch chips for networking purposes.
Awad said AWS, Microsoft Azure and Google Cloud are following a similar strategy to Nvidia, which he sees as a major benefit to Arm because of how the company’s chip technologies can address different parts of a system.
“They're all building their own accelerators. They’re all building their own networking. They're all building their own CPUs. They're building them together, and now they're optimizing all of that AI software around this co-architected silicon chipset and system, which, frankly, I think it's a huge tailwind for Arm,” he said.
Now that all four companies have begun commercializing their Arm-based products, Awad said his team is focused on ensuring they are successful while at the same time working alongside these companies to define Arm’s road map for future technologies.
But Awad said he’s also looking at how Arm can enable other companies to design their own chips, whether it’s a startup or a much larger company. A critical way his company plans to make this happen is by continuing to lower the costs associated with designing custom chips, which Awad said Arm has done through efforts like the Neoverse cores, the pre-integrated Neoverse Compute Subsystems and the Arm Total Design initiative.
“There's a lot of other players, a lot of other big players out there, quite candidly, and we're going to continue to work with them and find ways to lower the cost of adoption, lower what it takes for them to get silicon in-house, and that's going to be a big focus, so that we can really make it as ubiquitous as Arm is in some of the other markets,” he said.
What follows is a lightly edited transcript of CRN’s interview with Awad, who also talked about his growth expectations for Arm in the cloud infrastructure market, to what extent Arm is looking at the market for on-premises servers, what it took for Arm to gain acceptance among mainstream cloud customers and what Arm hopes to accomplish next.
I think at the end of the day, it comes down to three things. There [are] three things which are all kind of coming together at a perfect time.
The first one is, it's the culmination of just investment that we've been making. And when I say we, [the investments] both Arm and the broader ecosystem have been making in the software ecosystem for 15 years at least. So this is all about focusing on software, which is cloud-native, which is taking advantage of the new client-server paradigms which have really emerged over the last decade-plus. We've been steadily going after that market. So I think that's the first thing, which is making Arm a viable option. So that's kind of a table stakes thing.
And then I think the second thing is about how we approached just technology more broadly. So we've built out our Neoverse CPU cores first and foremost. And that, of course, leverages our long-standing and well-known heritage around things like power consumption, etc., but then adds on top of it the features and the performance required for infrastructure. And that's really a dynamic which started in 2019 or so, when we launched Neoverse. So there's that.
And then on top of that, what we've done more recently is build out [Neoverse] compute subsystems. And so when you think about what Neoverse and compute subsystems sort of represent, they represent making silicon design more accessible to more players in the market. If you go back five or 10 years and look at who was building Arm for infrastructure, for servers, these were all people trying to build up their own CPU cores using their own microarchitecture, and then they would have to put together all of the other pieces. And then on top of that, they would have a big software lift. So the software is now in good shape. And on top of that, the silicon and the accessibility of this stuff [are ready], because we not only have infrastructure-class cores but also have now integrated them with compute subsystems, making it very easy for them to just go adopt. That has happened.
And then the third point is just this massive, massive inflection point that we're seeing around AI. If you look at what's happening in AI—and I know that I don't have to spell this out for you, because you are living it day in and day out—but power is just so important. Systems-level thinking is so important because of just the sheer amount of data and compute required. And the reality is that data centers are limited. Capex spending is way up.
And so what you've got is this dynamic where these hyperscalers, companies like Nvidia, are looking at their entire solution. So whether you're Google or Microsoft or AWS or Nvidia, you're looking at your system and you're going, “Wow, I can't just take off-the-shelf stuff and keep doing what I'm doing. I have to redesign from the ground up.” And if you look at all those companies, by the way, it's not just about the Arm CPU. What they're actually doing is building complete chipsets, which include networking gear, oftentimes based on Arm. And then it also includes accelerators and other aspects of the system with our IP in them. It's about designing all of that together to optimize the efficiency of the system, to optimize the performance of the system, and to really start to build the silicon around what they want the data center to be, rather than the way they had done it before, which was build the data center around what is available silicon, right? And I think that's what's driving all this today.
With Nvidia, especially, it's so interesting that when you look at their flagship AI products, it was the Grace Hopper Superchip, and now it's becoming Grace Blackwell with the GB200 NVL72 system. Dell’s selling it, and AWS is using it, and Microsoft's using it. Google's using it.
It's a full rack. They're selling a full rack, and they're selling it with the compute. What's interesting, too, if you look at all of those guys: Nvidia, AWS, Google, Microsoft. I mean, they're all building their own accelerators. They're all building their own networking. They're all building their own CPUs. They're building them together, and now they're optimizing all of that AI software around this co-architected silicon chipset and system, which, frankly, I think it's a huge tailwind for Arm.
I've seen people talk about using CPUs for AI workloads themselves, probably more for inference. But what is your view on what Arm-based CPUs will be used for in the AI realm?
So it's a few things. The most obvious answer is around the management of the accelerator. An accelerator is just that: It's an accelerator, whether it's a GPU [or something else, like an application-specific integrated circuit]. It's just that. It can't exist without a CPU alongside it to help do some of the pre-processing, the checkpointing, all this other stuff associated [with accelerator chips], which is why you've got the Grace Blackwell- or Grace Hopper-type systems and why they exist.
Historically, you'd have one x86 node with four accelerators or whatever tied to it. Now you've got a one-to-one or a one-to-two type ratio, and it's really about managing that. It's also about creating a coherent memory domain, all that kind of stuff. So I think that is one aspect, and I think it's certainly interesting.
What you're asking about, I think, is probably more about like, well, where are CPUs going to be used specifically for inference? And I think in that case, the way that I tend to think about it is that, at the end of the day, inference is going to permeate all aspects of compute, meaning it's not going to be tied to just these large, pod-sized systems in the cloud. It is going to literally permeate everything from the temperature sensor on my wall through to that data center. And within each of those use cases, you are going to have effectively general-purpose applications, which are running in every one of those cases, and then they are going to need to go off and be able to execute inference. And the size of that inference will dictate—and this is not unlike what we've seen in every other compute dynamic—the size of that inference will dictate where that inference lands. So it's really all about the granule of compute.
For very large chunks of inference, you're going to need to go to a pod, which is made up of the latest Grace Blackwell or whatever—choose your accelerator system. For the smallest inference loads, it might just happen on the CPU. And then there's going to be a bunch of stuff in between. There'll be things like SVE, [the Scalable Vector Extension in Arm's latest instruction set architectures]. SVE will handle some right now on the CPU, or little accelerators, which will sit very closely coupled to every CPU core within the SoC, for example, could handle some. And so with the fullness of time, there will be acceleration which [will come in] all sorts of different shapes and sizes, and really how [far] you go from that core CPU application will determine where that happens. Sometimes it will happen directly on the device. Sometimes it won't.
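For a concrete sense of what the smallest of those inference loads looks like on the CPU, here is a minimal sketch, not taken from the interview, of the multiply-accumulate pattern at the heart of most inference kernels, written with SVE's standard C intrinsics (the Arm C Language Extensions); the function name is illustrative.

```c
/* Minimal sketch: a predicated dot product using SVE intrinsics.
   Dot products are the core multiply-accumulate step in most
   inference kernels. Compile with: gcc -O2 -march=armv8-a+sve */
#include <arm_sve.h>
#include <stddef.h>

/* Illustrative name; not from any specific library. */
float sve_dot(const float *a, const float *b, size_t n) {
    svfloat32_t acc = svdup_f32(0.0f);          /* vector accumulator */
    for (size_t i = 0; i < n; i += svcntw()) {  /* svcntw(): 32-bit lanes per vector */
        /* Predicate masks off lanes past n, so no scalar tail loop is needed. */
        svbool_t pg = svwhilelt_b32((uint64_t)i, (uint64_t)n);
        svfloat32_t va = svld1_f32(pg, a + i);
        svfloat32_t vb = svld1_f32(pg, b + i);
        acc = svmla_f32_m(pg, acc, va, vb);     /* acc += va * vb on active lanes */
    }
    return svaddv_f32(svptrue_b32(), acc);      /* horizontal sum across lanes */
}
```

Because SVE is vector-length agnostic, the same binary exploits whatever vector width a given Neoverse implementation provides.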
AWS executive Dave Brown said in December that more than half of AWS’s new CPU capacity came from the company’s Graviton CPUs. That’s a pretty revealing fact, and I was just wondering: Do you guys have any further visibility into the share of Arm-based processors among cloud service providers?
It's continuing to grow. I don't have specific numbers off the top of my head, but it certainly is continuing to grow pretty meaningfully. AWS has been leading the way on this. We're seeing great adoption with all the other hyperscalers now as well. I think the rest of that quote was something like, “more than 50 percent over the last two years, more than any other architecture combined.” It was that meaningful where I—to be honest with you, even I was like, "Whoa." I knew it was big. I'm not sure I quite appreciated how committed these guys are. And we're seeing, in many ways, all of the hyperscalers with similar sorts of commitment.
In terms of, “We're expanding our infrastructure, and maybe half of it is Arm-based?”
We’re definitely seeing all of them very meaningfully commit to increased deployment, like on the level of multiples from where they are now moving forward. And you got to remember, for some of these guys, they've just launched products, and they're just hitting [general availability] and the rest of it, so the growth trajectory is substantial.
Microsoft and Google are starting from a smaller place, but it sounds like what you're seeing is they're still going to be growing their capacity—
Absolutely. I think the way to think about it [is] not all hyperscalers are the same in terms of the way that their infrastructure is laid out or what it's servicing. Some of these hyperscalers have a distribution of compute which services their internal workloads, maybe primarily, and others are servicing external workloads. And the reason why I highlight that point is because for those that are servicing external workloads, in many cases, thanks to all the great work that's been done with AWS and the ecosystem, there's already a broad swath of cloud-based customers who are ready to very quickly adopt. So that's a good story.
But also, for many of these guys who are doing even more with their internal workloads, it's actually very straightforward for them to move their internal workloads over. So to the extent that the CPU or the silicon is providing a better [total cost of ownership], it creates a natural anchor customer for them with their internal properties to drive those deployments. And so it's an interesting, dual-prong demand dynamic.
Can you talk about why designing Arm-based processors can be attractive from a cost perspective for these companies?
There are two aspects to cost. And I kind of touched on this a little bit at the beginning. The first aspect is just the actual cost of designing the silicon. And at the end of the day, here at Arm, the team and I have really been focused on bringing that cost down and that time to market down. That's a lot of what we're doing with things like Arm Total Design. It's a lot of what we're doing with our compute subsystems. And so we're continuing to ratchet that down. Designing silicon is expensive. We're trying to figure out how to lower that barrier. So that's that.
And then obviously, the other part to that silicon design is people tend to think about the actual physical design of the chip, but CPUs are nothing without software that runs on them. And what's actually happening now is that because you have so many of these players (Oracle, Microsoft, Google, AWS, etc.) who have adopted Arm and have invested so much in it, they're actively investing in the software ecosystem, and so that flywheel is actually accelerating, where we've gone from “software works on Arm” to now, in some cases, and in many cases, “software works best on Arm.” So that actually is driving a whole new dynamic. When I say “best on Arm,” I mean performance-per-watt. Just think of it that way: perf-per-watt-per-dollar. So that's driving one dynamic.
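As a small illustration of how that per-architecture tuning typically works in practice (a sketch under stated assumptions, not something Awad described): on Linux, an application can ask the kernel at startup which Arm features the CPU advertises and dispatch to an Arm-optimized code path only when one is present.

```c
/* Sketch: runtime feature detection on Linux/aarch64 via the auxiliary
   vector, a common basis for "works best on Arm" code-path dispatch. */
#include <stdio.h>
#include <sys/auxv.h>
#ifdef __aarch64__
#include <asm/hwcap.h>   /* HWCAP_SVE and related feature bits */
#endif

int main(void) {
#ifdef __aarch64__
    unsigned long hwcap = getauxval(AT_HWCAP);
    if (hwcap & HWCAP_SVE)
        puts("SVE available: select the vectorized kernel");
    else
        puts("No SVE: fall back to NEON or scalar code");
#else
    puts("Not an aarch64 host");
#endif
    return 0;
}
```

Libraries and language runtimes do this kind of dispatch internally, which is one way the ecosystem investment Awad describes turns into perf-per-watt gains without application changes.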
I think the other dynamic—which I don't think that we highlight often enough—is just that Arm was born out of this idea of driving efficiency in our CPU designs. I mean, yes, Neoverse is architected and designed for infrastructure. But the reality is that its roots come from places like mobile, where billions of devices have been shipped and every milliwatt matters. And I think that heritage, and just continuing to carry it forward, really sets us apart. And in a world where energy is such a scarce resource as people are trying to scale more and more AI, that perf-per-watt-per-dollar becomes so important because it's as much about spending less on the compute—i.e., your question about the cost—as it is about freeing up energy to use for other use cases, like that power-hungry accelerator. So it's not just the cost in terms of dollars. It's the cost in terms of power. And I think it's the confluence of those things that is driving a lot of it.
How do you see Arm growing in the on-prem server space?
The reality is that we made a conscious decision back right around the time that Neoverse was launched—so end of 2018, beginning of 2019—to really focus our efforts on the largest consumers of compute, and, effectively, where the puck was going, and a lot of that was in the cloud. So that's been our focus, and some of that was because of the software lift associated with driving adoption there versus driving some of the legacy enterprise use cases. I certainly think that there is room for Arm-based offerings and what I would call the next tier of players in the market who are also consuming a tremendous amount of compute, have a sort of a cloud-native-ish software stack, maybe are running a hybrid cloud and are looking for that on-prem offering.
It's part of the reason why we have worked so closely to establish the Arm Total Design ecosystem, because what that actually is doing is it's starting to create what are effectively interchangeable chiplets. So we've got a bunch of these parties who are building out compute chiplets based on our compute, which look very similar to what the hyperscalers are building, but now they're off-the-shelf. And if you think about it, those that were building architectural implementations were spending hundreds of millions of dollars, if not a billion, to get their silicon out. Those that are taking IP spend something less than that. Those that take [Neoverse Compute Subsystems] spend less still and get to market even faster. Folks that are doing chiplet-type offerings can get to market much more quickly for much lower cost. And it's really about us lowering that burden so that we can address the next chunk of the market, and so that's an area that we're continuing to look to innovate around.
There was a time—maybe a few years ago, maybe more—when people were wondering, can I run this stuff on Arm? But like you said, it's gone from being just as good as x86 to better. When did Arm cross those thresholds, first “as good as” and then “better”? And how in your mind has acceptance been among the customers of AWS, Microsoft and Google?
I do think that we really started on this trajectory, and it's really difficult for me to pinpoint a moment in time where you cross that. My son is going to turn 18 on Monday, and I'm like, at what point did he become a man? Like, it's sort of unclear to me exactly when that moment happened, like all of a sudden we're there. And in some ways, this kind of feels like the same thing. It's been in the making for almost as long. It's something that we've been investing in for a long time. When I think about it, I think of a couple of key milestones, which are important.
I think, quite candidly, when AWS launched that first Graviton instance, that was a key moment in time because it made Arm-based compute [that was] widely available, so people could [at] low risk, low cost, very easily try it in the cloud. And the result of that is clear, right? The top 50,000 customers [are] using it; 90 percent of their top 1,000 and all of their top 100 [are] using Arm. And you heard the thing from [Dave] in terms of the amount of compute, so that was one.
I think the launch of Neoverse, when we moved away from this idea that, hey, you can either take a mobile core and try to cobble together a server [CPU], or you had to do a complete ground-up design using an architectural license, when we moved away from that and said, "No, actually, we're going to take on more of the lift and make it easier for you to go off and adopt,” so that was around 2019. That was a big milestone.
And then I think there's a little bit of this trajectory, which has happened generation over generation with this stuff, where every generation seems to be getting a little bit better than the one before. And it's like anything, when you see all of these companies which are trying it and getting 40 percent, 60 percent better [total cost of ownership], you try it as well. And that's been something which I would suggest has really taken off in the last three or four years. So I don't think there's a particular moment in time. I think we've been building this, but there are a few specific milestones which have driven it.
So Nvidia, Microsoft and Google all making commitments to Arm-based CPU road maps has obviously been a huge deal for Arm. You're now getting adopted by the biggest hyperscalers and Nvidia, but where does Arm go from here? Do you see Arm being used by other large companies in the future, or will it happen more on the startup side? What are your expectations?
So first of all, in terms of startup types of companies, we have a very robust ecosystem of younger companies who leverage Arm's technology, and we work very closely with some. Even if you look at Annapurna [Labs], which ultimately got rolled into AWS [to start the Graviton line], we were one of the early investors in them. That is a long-term heritage that we've had, and we're going to continue to do that, continue to work with those guys. So I do think that that's definitely a vector.
I think to your point earlier, a lot of these players are now just deploying Arm in a meaningful way, and a lot of what we're focused on is helping them be successful in that, which is going to mean there's a lot of software work still to be done. There's a lot of partnering with them to be done.
And frankly, a lot of what we do today is really thinking about what next-generation compute looks like. And so we work actually very closely with a lot of these hyperscalers very early in the process, where we're working side-by-side with them to actually modify and optimize microarchitecture for CPUs that will be coming out two, three, four, five years from now, so that they're optimized for where they see the infrastructure going. So there's a lot of work there that I think we're going to continue to lean into just more broadly.
And then I think the last thing, which you also highlighted a little bit earlier, is that next tier of compute, right? We've got these—you talked about those four guys up at the top. There's a lot of other players, a lot of other big players out there, quite candidly, and we're going to continue to work with them and find ways to lower the cost of adoption, lower what it takes for them to get silicon in-house, and that's going to be a big focus, so that we can really make it as ubiquitous as Arm is in some of the other markets.