Broadcom CEO: Hyperscalers’ Generative AI Spending Spiking ‘Very, Very Dramatically’
‘We’re seeing that increase very, very dramatically. We are seeing urgency in our hyperscale customers come to us to secure products, to secure the ability to put in place those very, very low latency networks that can scale. And Ethernet is what makes those networks scale,’ Broadcom CEO Hock Tan says during earnings.
Broadcom CEO Hock Tan said that over the last few months the San Jose, Calif.-based company has found itself sitting at the heart of the generative AI boom.
“That happened over the last 90 days. We are seeing a lot of that urgency. You might call it excitement,” Tan told investors Thursday. “We’re seeing some of these hyperscalers bringing on a sense of urgency and focus, and of course, spending to be up to speed and to not be left behind. As we see the excitement, hype, perhaps in pushing applications and workloads in generative AI. That’s what we see driving a lot of this excitement.”
Tan said Broadcom’s generative AI business is poised to quadruple, from $200 million last year to $800 million this year, as that “urgency” among hyperscale customers drives demand for the networking gear needed to make the technology scale.
“We’re seeing that increase very, very dramatically. And we are seeing urgency in our hyperscale customers come to us to secure products. To secure the ability to put in place those very, very low latency networks that can scale. And Ethernet is what makes those networks scale,” Tan said while presenting the company’s earnings Thursday.
Generative AI as a business for Broadcom is still in “early innings,” Tan said. However, he said today’s data center is undersized to run thousands of AI engines in parallel, which involves large, synchronized bursts of data at speeds of 400 GB to 800 GB.
“With generative AI many more billions of parameters come into the models, you are talking about the scale out of data centers driving AI engines networked together in a manner that we probably have not seen before,” he said.
Broadcom’s more advanced switches, routers and networking fabric are already part of the low-latency networks inside hyperscalers that make such deployments possible. With latency the enemy of the massive compute deployments that generative AI requires, Tan said the network becomes the bottleneck to scaling products.
“In 2022, we estimated our Ethernet switch shipments deployed in AI was over $200 million,” Tan said. “With the expected exponential demand from our hyperscale customers, we forecast that this could grow to well over $800 million in 2023.”
That is also driving Broadcom’s compute offload business with its hyperscale customers, which is on track to exceed $3 billion this year, Tan said, an increase of 36 percent.
“The networks to support this massive processor density is critical and as important as the AI engines,” he said. “Such networks have to be lossless, low-latency and be able to scale. So as you know, such AI networks have already been deployed at hyperscalers through our Jericho-2 Switches and Ramon Fabric.”
Here is everything Tan told investors and analysts Thursday about Broadcom’s future in generative AI:
On Broadcom’s place in the generative AI process
In 2022, generative AI is just barely starting to kick off, but there exist AI networks within the hyperscalers, particularly, in fairly significant volume.
What we are trying to say is very similar to traditional CPUs in traditional workloads in those same data centers: we’re constrained on performance of those silicon CPUs. And we are starting to see scale out by positioning rows and rows of servers, CPUs, and networking them together to work closely in parallel.
As we step up to large language models in generative AI, GPUs are starting to be strung together in hundreds, soon to be thousands, of racks and working in parallel, and you know how that goes. Those GPUs work in parallel in a fairly synchronous manner to run and do what you call parametric exchange.
Basically, you run all AI engines together, whether they are GPUs, AI GPUs, or other AI engines. You run them together. It becomes a network. The network now becomes, potentially, the critical part of this whole AI phenomenon in hardware.
To make it work, you have to put together many, many racks of AI engines in parallel, very similar to what hyperscalers have been doing on CPUs to make them run faster with high performance as Moore’s law comes to an end. It doesn’t make any difference in terms of AI engines. They come from silicon; they face similar constraints.
So network becomes the constraint. Network becomes a very key part of fulfilling generative AI dreams. What I’m saying in my comments is that last year, in 2022, the AI workloads running at hyperscale and the advent of generative AI were still fairly fresh and new. We were doing $200 million, as far as we could estimate, of silicon, Ethernet switches, and fabric that goes into those AI networks, as far as we could identify in hyperscalers.
With generative AI and the urgency and excitement of it coming in that we’re seeing today, we’re seeing that increase very, very dramatically. And we are seeing urgency in our hyperscale customers come to us to secure products. To secure the ability to put in place those very, very low latency networks that can scale. And Ethernet is what makes those networks scale.
On customers asking for in-development work such as Tomahawk 5 and Jericho 3 next-generation switching and routing products to meet demand for AI work
Yes, we are seeing all of the foregoing. That happened over the last 90 days. We are seeing a lot of that urgency. You might call it excitement, but you hit it right on. Which is accounting for the color in my commentary, both about generative AI networks pushing us to develop a new generation altogether of Ethernet switching that can support these kinds of very compute- and data-intensive workloads.
So that’s one side of it. The other side of it, you are right. We are typically not one to talk much about compute offload, which is another way of saying, ‘yeah, these are very related to some of the engines that are fairly customized and dedicated to certain hyperscalers.’
On how much share generative AI may take of future data center workloads
I think it’s early innings on generative AI, but we also see a very strong sense of urgency among our customers, especially in the hyperscale environment, not to miss out, not to be late to this trend.
With generative AI, many more billions of parameters come into the models, you are talking about the scale out of data centers driving AI engines networked together in a manner that we probably have not seen before.
It’s not a problem that is not solvable. It is very solvable, as evidenced by the fact that we have and deploy technology to support AI networks today to certain hyperscalers, where we are talking about hundreds if not thousands of AI engines, AI servers networked together and working in a synchronous manner.
So this is about the ability to scale out in a fairly substantial manner. It’s really about trying to make sure that happens and not be the bottleneck to our ability to get the best system performance of an AI data center.
On where Broadcom stands in deploying generative AI networks
Where it’s at right now is frankly, how to network them and how to do those massive parametric exchanges, so to speak, when you run large numbers of engines or machines in parallel as you grind through this whole database.
We are in early innings, which is why we think we have time to work on a new generation of switches in Ethernet that are specifically designed and dedicated to these kinds of workloads, which are very different from the workloads we see today.
They have to be literally, or virtually, lossless, very low latency, and be able to scale into thousands of engines. Those are the three main criteria. We are driving silicon solutions that enable that. We have it. But we think we need to improve the performance of what we have in anticipation of a trend we see over the next several years. So we are putting our investments in that direction.