Nvidia's Ian Buck: A100 GPU Will 'Future-Proof' Data Centers For AI

'By having one infrastructure that can be both used for training at scale as well as inference for scale out at the same time, it not only protects the investment, but it makes it future-proof as things move around,' says Buck, Nvidia's head of accelerated computing, in an interview with CRN.


The ability to do inference and training: is that just going to appeal to customers who want to do both, or is it going to also appeal to customers who want one or the other?

You have to think about building these data centers. These are multimillion-dollar, potentially hundreds-of-millions-of-dollars investments. You don't do that on a whim, on the fly. And when you build up an infrastructure, you'd like to make sure that its utility is maximized well into the future. By having one infrastructure that can be used both for training at scale and for inference at scale-out at the same time, it not only protects the investment, it makes it future-proof. As things move around and networks change, you can configure your data center in any way possible, well after you've purchased and physically built it. So Ampere's flexibility to be both an amazing training GPU and an amazing inference GPU makes it really game-changing for the data center, because, as you know, this field changes every six months. There's always a new network, and that's going to change the workload demands, but customers can rely on Ampere to carry them into that future with both its training and inference capabilities.

So if an organization needs to make an investment in training first, with the A100 it already has that inference capability by the time it gets to deployment.

Exactly. And they can move it around. It can be used for real-time production inference use cases where latency matters. Or as those needs and that infrastructure shift, they can shift that infrastructure to training capabilities, which is ideal and allows them to really lean in on investing and building up very large-scale infrastructure.

How long has Nvidia known that it needed to get to this point of having these inference and training capabilities within one GPU?

It's a good question; I have to think back now. Utilization is always top of mind. People always want to make sure that their infrastructure, as they deploy it, is well utilized and well capitalized. One of the great features of Ampere, and one of the reasons we created the multi-instance GPU [MIG] capability, is that customers can configure one GPU for the exact amount of inference performance they need for a particular use case. They may need only a single MIG instance, which is about the performance of a V100. But for some of these larger networks, where they want to go to the next level of conversational AI and the models, as you saw, are getting huge for natural language understanding, they can scale up to bigger GPU instances, all the way up to the capabilities of a full A100, which is probably more than anyone needs right now for real-time inference. But I say that knowing we just launched Ampere; I'm sure someone will build a model that uses it, because now that capability exists. So utility is critical for data centers: making sure they can be flexible, protect the investment they're making, and know they can trust it, and they can do that with Ampere. That was a really important message that we felt was compelling. It was also important because we were seeing the rise of T4 and the rise of Volta at the same time, and data centers were managing those two capacities as the different workloads created demand.
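As context for the MIG capability Buck describes: below is a minimal sketch, assuming an A100 with MIG mode enabled and the NVML Python bindings installed (pip install nvidia-ml-py), of how one might enumerate the MIG instances carved out of a single physical GPU. The device index and output formatting are illustrative, not part of anything Nvidia ships with the A100.

```python
# Minimal sketch: list the MIG instances on GPU 0 via NVML.
# Assumes an A100 in MIG mode and the nvidia-ml-py package.
from pynvml import (
    NVMLError,
    nvmlDeviceGetHandleByIndex,
    nvmlDeviceGetMaxMigDeviceCount,
    nvmlDeviceGetMigDeviceHandleByIndex,
    nvmlDeviceGetMigMode,
    nvmlDeviceGetName,
    nvmlDeviceGetUUID,
    nvmlInit,
    nvmlShutdown,
)

nvmlInit()
try:
    gpu = nvmlDeviceGetHandleByIndex(0)  # illustrative: first GPU in the box
    current, pending = nvmlDeviceGetMigMode(gpu)
    print(f"{nvmlDeviceGetName(gpu)}: MIG current={current}, pending={pending}")

    # Each MIG instance appears as its own schedulable device; a workload
    # targets one by exporting its UUID in CUDA_VISIBLE_DEVICES.
    for i in range(nvmlDeviceGetMaxMigDeviceCount(gpu)):
        try:
            mig = nvmlDeviceGetMigDeviceHandleByIndex(gpu, i)
        except NVMLError:
            continue  # this slot is not populated
        print(f"  MIG device {i}: {nvmlDeviceGetUUID(mig)}")
finally:
    nvmlShutdown()
```

The partitioning itself is typically done with Nvidia's nvidia-smi tool (for example, nvidia-smi -i 0 -mig 1 enables MIG mode on the first GPU), after which each instance can serve an independent inference workload.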
