AWS Unleashes New Nvidia Blackwell-Based Servers For Advanced AI Tasks
The availability of the new P6e-GB200 UltraServers, according to the cloud platform giant, will help customers push the boundaries of compute-intensive AI workloads such as large model training.
Amazon Web Services is expanding its lineup of GPU-based servers, this week announcing the general availability of a new system based on Nvidia Grace Blackwell Superchips targeting the training and deployment of “the largest, most sophisticated AI models.”
The new P6e-GB200 UltraServers “represent our most powerful GPU offering to date,” said David Brown, vice president of AWS Compute and ML, in a blog post. The new compute services “build on everything we’ve learned about delivering secure, reliable GPU infrastructure at a massive scale, so that customers can confidently push the boundaries of AI,” Brown wrote.
The availability of the P6e-GB200 UltraServers closely follows AWS’ launch in May of Amazon EC2 P6-B200 instances powered by Nvidia B200 GPUs. Those instances are primarily targeted at large-scale, distributed AI training and inference for foundation models.
[Related: AWS’ 10 Coolest New Products And Tools Of 2025 (So Far)]
AWS also said the Blackwell-based servers are the first liquid-cooled hardware platform the company has deployed at scale. The cooling system utilizes the company’s In-Row Heat Exchanger (IRHX) technology to support the compute density of GB200 NVL72 racks, according to AWS.
Each P6e-GB200 UltraServer includes up to 72 Blackwell GPUs interconnected using fifth-generation Nvidia NVLink, all functioning as a single compute unit. Each UltraServer provides 360 petaflops of FP8 compute power, 13.4TB of high-bandwidth memory, and up to 28.8 Tbps of Elastic Fabric Adapter (EFAv4) networking.
The Blackwell instances are built on several recent AWS infrastructure innovations, including the AWS Nitro System and EC2 UltraClusters, according to the company, and work with AWS managed services such as Amazon SageMaker HyperPod and Amazon Elastic Kubernetes Service (EKS).
The P6e-GB200 UltraServer is designed to accelerate innovation across emerging generative AI development initiatives such as reasoning models and agentic AI systems, Brown said in his blog post.
“The scale of the AI systems that our customers are building today—across drug discovery, enterprise search, software development, and more—is truly remarkable,” Brown wrote. The goal with the new Blackwell systems, he added, is to provide “secure, reliable GPU infrastructure at a massive scale, so that customers can confidently push the boundaries of AI.”
The P6e-GB200 UltraServer “is ideal” for the most compute- and memory-intensive AI workloads, such as “training frontier models at the trillion-parameter scale,” Brown wrote. The P6-B200 instances, meanwhile, support a broad range of AI workloads and are “an ideal option for medium to large-scale training and inference workloads,” according to the blog.