VMware, Nvidia Reveal Private AI Foundation To Help Enterprises Run Gen AI Apps

Justin Boitano, vice president of enterprise computing at Nvidia, says the new full-stack software platform will help enterprises take advantage of ‘state-of-the-art’ Large Language Models like Llama 2 and build custom generative AI applications using proprietary information on VMware’s cloud infrastructure to significantly improve productivity.

VMware is hoping to entice enterprises to build and run generative AI applications on its cloud infrastructure with a new full-stack software platform it has built with AI chip powerhouse Nvidia.

Unveiled Tuesday at VMware Explore in Las Vegas, the upcoming platform is called VMware Private AI Foundation with Nvidia, and it’s designed to help businesses use their proprietary data to build custom Large Language Models and run generative AI applications on VMware Cloud Foundation.

Platform Will Let Businesses Create Apps With ‘State-Of-The-Art’ Models

Justin Boitano, vice president of enterprise computing at Nvidia, said VMware Private AI Foundation will help enterprises take advantage of generative AI—which he called the “most transformational technology of our lifetime”—and build conversational interfaces that connect with business systems.

As examples, Boitano said companies could train Large Language Models “against call records with your customers, your IT tickets and your security configurations” to improve business processes.

“We see AI being infused into every business over the next decade to make people 10 times more productive, to help them answer these complex questions about their business faster and more efficiently,” he said in a briefing with journalists and analysts.

A critical aspect of the new platform is that it will make it “easy for any enterprise” to build AI applications with “state-of-the-art” Large Language Models like Llama 2, Falcon LLM, MPT and Nvidia NeMo, according to Boitano.

The platform will accomplish this by letting businesses “pull in those models with the best-in-class tools and easily combine them with [their] proprietary information,” Boitano said. This will result in a new model that “has a nuanced understanding” of a business’ proprietary and private information.

Boitano said this is a welcome alternative for enterprises that want to avoid feeding proprietary data into public Large Language Models due to privacy concerns.

“You don’t want to provide those to a model that’s basically taking your proprietary data and encoding it into this publicly available thing, and that’s why the concept of private AI is so important,” he said.

What’s Included With VMware Private AI Foundation

VMware Private AI Foundation will consist of software components from VMware and Nvidia that will help businesses train AI models and then run inference on them for live applications.

“It’s a full workflow product for your AI developers, and it really takes away the complexity of how do I get started on this? Where do I go? What do I need? All of it comes prepackaged from both of us,” said Paul Turner, vice president of cloud platform at VMware.

At the top of the software stack is Nvidia AI Workbench, a new toolkit from the GPU designer that lets developers create, test and customize pretrained Large Language Models on a PC or workstation and then scale them to any data center or cloud infrastructure.

The top of the stack also includes the Nvidia NeMo framework, which includes customization frameworks, guardrail toolkits, data curation tools and pretrained models to help enterprises build, customize and deploy generative AI models on any infrastructure powered by Nvidia GPUs.

A critical component of NeMo is Nvidia’s TensorRT framework for Large Language Models, which ensures such models run on the company’s GPUs with the best possible inference performance.

On VMware’s end, the platform includes the vSphere Deep Learning VM images and image repository, which will give users a “stable turnkey solution image” with preinstalled frameworks and performance-optimized libraries for running AI workloads in virtual machines.

The platform also makes use of the VMware vSAN Express Storage Architecture for enabling performance-optimized NVMe storage and supporting GPUDirect Storage over RDMA, the latter of which allows direct data transfer between storage devices and GPUs without involving the CPU.

VMware has also tuned the platform to support running AI workloads across up to 16 GPUs or virtual GPUs in a single virtual machine and across multiple nodes. Part of what makes this possible is a “deep integration” between vSphere and Nvidia’s NVSwitch technology that connects multiple GPU systems.

One of the platform’s two foundational elements is Nvidia AI Enterprise, a software suite that contains important building blocks for developing and running AI applications, such as NeMo and TensorRT, among other frameworks, pretrained models and tools. Nvidia launched it exclusively with VMware in 2021 before opening up the platform to other virtualization vendors and cloud service providers.

The platform’s other foundational element is VMware Cloud Foundation, the virtualization giant’s hyperconverged infrastructure platform for running applications in private or public environments.

VMware Private AI Foundation Will Have GPU-Based Pricing

When VMware Private AI Foundation is released next year, it will be available as a single product sold through channel partners, either from VMware itself or bundled with OEM systems, according to Turner.

While Turner declined to provide pricing details, he said VMware plans to charge customers based on the number of GPUs they use to create and run their generative AI applications. Nvidia has used a per-GPU pricing model for commercial software it has developed in the past, including Nvidia AI Enterprise.

“As they scale their environment based on the number of GPUs, the Private AI Foundation pricing will be relative to that value that they’re getting,” he said.