Vast Data And CoreWeave Knit $1.17 Billion AI Pact

The partnership merges CoreWeave’s GPU-accelerated infrastructure with the Vast AI Operating System. Together, according to the announcement, the companies are creating “a new class of intelligent data architecture,” which is made to support continuous training, real-time inference, and large-scale data processing for mission-critical industries.

New York City-based storage innovator Vast Data struck an agreement with GPU provider CoreWeave that will layer Vast’s AI operating software over the hyperscaler’s data to deliver “instant access to massive data sets.”

“The Vast AI Operating System underpins key aspects of how we design and deliver our AI cloud,” Brian Venturo, co-founder and chief strategy officer of CoreWeave, said in a statement. “This partnership enables us to deliver AI infrastructure that is the most performant, scalable, and cost-efficient in the market, while reinforcing the trust and reliability of a data platform that our customers depend on for their most demanding workloads.”

The deal, valued at $1.17 billion, was announced Thursday.

In a statement, Vast said its software-defined storage is built on an infinitely scalable architecture that can be deployed in large data center environments that demand reliability at scale.

“At Vast, we are building the data foundation for the most ambitious AI initiatives in the world,” Renen Hallak, founder and CEO of Vast Data, said in a statement. “Our deep integration with CoreWeave is the result of a long-term commitment to working side by side at both the business and technical level. By aligning our roadmaps, we are delivering an AI platform that organizations cannot find anywhere else in the market.”

Vast Data has become one of the favored vendors for Nvidia, which has also struck a partnership with the company around its data fabric. In a sit-down with Hallak last year, Nvidia CEO and founder Jensen Huang heaped praise on the company and said he looked forward to working with Vast Data for the next “80 years.”

Vast Data entered the market in 2018 with a software-defined storage product that was assembled to spec by Arrow or Avnet for each purchase order, Howard Marks – whose title at Vast Data is technologist extraordinary and plenipotentiary – told CRN this week.

Marks said that was the best way to build a fully highly available (HA) NVMe over Fabrics JBOF (just a bunch of flash) enclosure with no single point of failure, hardware that was not widely commercially available at the time.

As the company has grown, it has expanded its hardware availability list to include standard x86 servers from Dell, HPE, Lenovo, Cisco and Supermicro, Marks said, but the minimum cluster size remains relatively large in order to provide the degree of resilience Vast considers necessary.

Marks said what makes the company’s storage software compelling is its resiliency: A customer’s data is protected across all servers, so that even if some servers fail, the data remains available.

Now Vast Data has taken on jobs “up the stack,” including the database engine, the event broker, and other elements beyond storage that were typically left to outside vendors that specialized in managing those abstraction layers.

“Two years ago, we would have been a software-defined storage vendor, but we’ve moved up stack from storage,” he told CRN. “We said, OK, let’s merge the understanding of what blocks on what SSDs make up a file, and what blocks on what SSDs make up a table. Instead of abstracting that a table points to locations within a file, and then there’s a separate index from the file that says what SSDs it is – we’ll just skip that abstraction, and we’ll understand what tables are, just like we understand what files are.”

He said adding database workloads and handling structured data has expanded Vast’s total addressable market to include platforms running the open-source framework Hadoop and analytics applications, and has added Snowflake and Databricks to its list of competitors.

“Now we’re adding the workflow automation too so that when a file gets added to this folder, it automatically gets sent through this workflow so that it can get added to your RAG solution, so that your chat bot knows about it in real time,” Marks told CRN. “And so the whole task automation, and Kafka-compatible event broker, and managing a Kubernetes cluster, so that the pieces that do that run on a GPU server to calculate the vectorization can be run by us. And then the database now can hold those vectors and do vector search, so that we’re just more and more tied into the AI ecosystem.”
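
The flow Marks describes (a file lands in a folder, an event fires, a worker vectorizes the file, and the database then serves vector search) can be sketched in a few lines of Python. This is a toy illustration only and not Vast's API: the names `EventBroker`, `VectorTable`, and `embed` are hypothetical, the in-process broker stands in for a Kafka-compatible service, and the bag-of-words embedding stands in for a model that would normally run on a GPU server.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding (assumption): a normalized bag-of-words sparse vector."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}


class VectorTable:
    """Hypothetical table that stores one vector per file path."""

    def __init__(self):
        self.rows = {}  # path -> sparse vector

    def upsert(self, path, vector):
        self.rows[path] = vector

    def search(self, query_vector, k=1):
        # Cosine similarity; vectors are already unit length, so a dot product suffices.
        scored = [
            (sum(w * vec.get(word, 0.0) for word, w in query_vector.items()), path)
            for path, vec in self.rows.items()
        ]
        scored.sort(reverse=True)
        return [path for _, path in scored[:k]]


class EventBroker:
    """Hypothetical in-process stand-in for a Kafka-compatible event broker."""

    def __init__(self):
        self.subscribers = {}

    def subscribe(self, topic, handler):
        self.subscribers.setdefault(topic, []).append(handler)

    def publish(self, topic, event):
        for handler in self.subscribers.get(topic, []):
            handler(event)


table = VectorTable()
broker = EventBroker()


def on_file_added(event):
    # The "workflow": vectorize the new file and store the vector for search.
    table.upsert(event["path"], embed(event["text"]))


broker.subscribe("file.added", on_file_added)

# Simulate two files arriving in a watched folder.
broker.publish("file.added", {"path": "docs/gpu.txt", "text": "gpu cluster scheduling guide"})
broker.publish("file.added", {"path": "docs/hr.txt", "text": "vacation policy and holidays"})

# A chatbot's retrieval step can find the new file as soon as the handler has run.
result = table.search(embed("gpu scheduling"), k=1)
print(result)
```

The point of the sketch is the coupling: because ingestion is triggered by the file event itself, the search index is updated in the same step that stores the file, which is what lets a retrieval-augmented chatbot see new content without a separate batch job.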