Nvidia’s Craig Weinstein: Groq AI Racks Will Become A Channel Play ‘Over Time’

In an interview with CRN, Nvidia Americas Channel Chief Craig Weinstein explains why the company’s upcoming Groq LPX and Vera CPU offerings for AI data centers will become a channel play ‘over time.’ Two partners say enterprise interest could be limited for now.

When Nvidia revealed its Groq 3 LPX server rack last Monday, the company indicated that it’s focused on direct engagements with top AI model providers that will take advantage of the product’s ultra-low latency to deliver super-fast, premium AI services.

But in an interview with CRN the next day, Nvidia Americas Channel Chief Craig Weinstein said he expects the Groq LPX racks to become appealing to sophisticated enterprise customers and, as a result, a product for channel partners to sell “over time.”

[Related: Nvidia To Use Intel Xeon 6 CPUs For DGX Rubin NVL8 Systems]

The executive also said the same for the AI infrastructure giant’s new push to provide its custom, Arm-compatible Vera CPU as a stand-alone offering.

However, executives at Lenovo and a top Nvidia channel partner told CRN that the Groq LPX racks will have limited appeal among enterprises—at least for now. The Lenovo leader expressed similar views about Vera CPU servers.

Nvidia plans to offer the Groq 3 LPX rack, which features 256 Groq 3 language processing units (LPUs), alongside its flagship Vera Rubin NVL72 rack, which contains 36 Vera CPUs and 72 Rubin GPUs, starting in the second half of the year. These new products were announced at the company’s GTC 2026 event last week.

“We believe as inference scales into the enterprise—and it already is—the tokenomics side of that equation is going to play out very well for the LPU system,” said Weinstein, who is vice president of Nvidia’s Americas partner organization, at GTC.

His “tokenomics” comment refers to the company’s claims that the Groq 3 LPX and Vera Rubin NVL72 racks, when connected and running in tandem, can significantly speed up the rate at which trillion-parameter models produce tokens. Depending on the application, these models can range from chatbots with reasoning capabilities to coding assistants, generating large volumes of text or code as tokens.

In one example, Nvidia said the two systems can boost inference throughput for a 1-trillion-parameter GPT model by 35 times per megawatt consumed, compared with the previous-generation Grace Blackwell NVL72 platform.

This would result in the model producing 300 tokens per second per megawatt while serving 500 tokens per second for every user, which is expected to help large and influential AI model developers provide premium, higher-priced services.

“The primary message [at GTC] was around token factories or high-end hyperscalers or AI natives whose business is built around tokens,” Weinstein said. “But as enterprises scale and tokenomics become even more important to those enterprises, that LPU system is going to be super important, [providing] low-cost inference that allows that enterprise to do millions if not billions of tokens to scale their token factories.”

The eventual channel play would apply to the “many” partners that are already delivering and implementing Nvidia’s rack-scale platforms such as the Grace Blackwell NVL72, according to Weinstein, who said one unnamed partner is handling 18,000 racks a year. These kinds of server racks often require liquid cooling and hundreds of kilowatts in electricity, limiting the hardware to customers that have the proper infrastructure for such products.

“These partners are in the rack-scale game, either in the enterprise or for neoclouds. Many of them even partner and do business with hyperscalers. The responsibility of the build side of that becomes a core ingredient of the services portfolio that those partners are building,” he said.

Weinstein said there is no timeline for when partners could start handling Groq LPX deployments, noting that the company will approach the channel “opportunistically.”

Lenovo, Mark III Execs See Limited Groq Enterprise Interest—For Now

While no OEM support has been announced for Groq 3 LPX, the sales leader of Lenovo’s data center business, Vlad Rozanovich, said his company would consider offering the rack if it starts seeing demand from customers, even if there isn’t a big enterprise play yet.

“We’ve heard about interest in Groq in places like Saudi Arabia,” he said in an interview with CRN last week, citing interest from the Middle Eastern country’s state-backed AI company, Humain, which had been using Groq chips prior to Nvidia’s announcement.

Elsewhere, Lenovo is seeing interest from “many of the same companies that are looking at large language models, but it’s not something that all enterprises are asking for,” said Rozanovich, who is senior vice president of Lenovo’s Infrastructure Solutions Group.

Andy Lin, CTO and vice president of strategy and innovation at Houston-based Nvidia systems integration partner Mark III Systems, told CRN that he thinks the Groq 3 LPX will find appeal among cloud service providers, neoclouds and AI-native companies “that really differentiate themselves on providing a great user experience for models.”

However, he acknowledged that those companies represent a “very small subset” of his company’s customer base, because taking advantage of a new chip architecture like Groq’s requires organizations with a “certain amount of scale and advanced capability” in development and integration resources.

As for Groq’s potential among enterprise customers, Lin said he sees limited appeal.

“I think it will require a special type of enterprise to want to do the work to integrate it into their current pipelines,” said the solution provider executive, whose company has won Nvidia Partner Network awards for multiple years in a row, including this year. “I would say it’s probably [going to appeal to] the small minority of enterprises because enterprises find it challenging enough to spin up a true AI Center of Excellence or AI factory.”

Vera CPU Opportunities In The Channel: ‘Not An Immediate Opportunity’

With Nvidia’s plan to offer the Vera CPU in its first CPU-only server rack, the company sees a multibillion-dollar opportunity to speed up important functions of agentic AI workloads that run better on CPUs than on GPUs.

These functions fall under the umbrella of what Nvidia is calling “sandbox execution,” and they include things such as tool calling, database queries and code compilation. The faster these processes run, the quicker any resulting data can reach GPUs to output tokens based on tasks given by a user, according to the company.

As with Groq LPX, Weinstein said stand-alone Vera CPU offerings such as the Vera CPU rack represent a “specialized opportunity for certain workloads that can benefit from that architecture.” While that means it’s “probably not an immediate opportunity” for the channel, he thinks it could become one “over time.”

This specialized opportunity is mainly focused on hyperscalers, including Meta and Oracle Cloud Infrastructure, as well as neocloud providers like Lambda and Nebius.

But Nvidia is working with top OEMs—including Dell Technologies, HPE, Lenovo and Supermicro—to offer the Vera CPU in a variety of configurations, according to Ian Buck, Nvidia’s vice president of hyperscale and high-performance computing.

“They can take this processor, and they are absolutely welcome to build CPU servers for the market,” he said in a meeting with journalists at GTC last week.

Vera CPU Interest Dependent On Enterprise Arm Support

Rozanovich, the Lenovo sales executive, said he sees a limited opportunity for stand-alone Vera CPU servers in the channel, at least when it comes to enterprises in the near term.

“When I think of a Vera compute-only solution through channel partners, maybe [it will be sold to the] early adopters. But it’s not going to be the mass part of the market,” he said.

That’s because “traditional enterprises are still in that x86 world,” according to Rozanovich, and enabling software on Arm-based CPUs requires a “heavy lift.”

“There are definitely software applications that Nvidia is driving to that Arm infrastructure, and it’s going to happen—it will happen—but it’s going to take time,” he said.

The Lenovo executive said he saw this issue with the CPU servers the vendor sold based on Vera’s predecessor, Grace, which used an off-the-shelf CPU core design from Arm rather than the custom core Nvidia created for Vera.

“Grace was not very channel-friendly because the ecosystem wasn’t ready. Are there some software vendors that are going down that path? Yes. We’re not there yet,” he said. “x86 is still that ubiquitous part of the market, but you can’t forget about the power of Nvidia and the reach that they have right now to software developers.”

Lin, the executive at Mark III Systems, said he thinks “there’s a lot of interest” from customers in Vera CPU servers, particularly around high-performance computing, where workloads like simulation can benefit from CPU optimizations.

The solution provider CTO noted that some customers had hiccups with Nvidia’s first server CPU, Grace, because “their codes weren’t optimized for Arm yet.” However, he expects the situation to improve with the AI infrastructure giant’s second generation.

“I think there’s going to be additional adoption and folks willing to explore even more because I think a lot of the efficiencies are even more pronounced now,” Lin said.