These Are The Stellar Startup Big Data Vendors To Know In 2025
It’s clear that without good data, all these AI applications and agents now being developed and deployed are, well, not all that intelligent. As part of CRN’s Stellar Startups for 2025, here are five big data technology startups, founded in 2019 or later, that solution providers should be aware of.
Big Data, Big Opportunities
Wrangling ever-growing volumes of business data has always been a challenge for IT managers. But successfully doing so has become all the more critical with the proliferation of AI applications and agents across business operations and processes, given that those AI systems require huge amounts of data to function. Complicating those chores is the fact that data is increasingly distributed across broad IT estates—both in the cloud and on premises.
And the sheer volume of data businesses and organizations are working with continues to explode: More than 400 million terabytes of digital data are generated every day, according to market researcher Statista, including data created, captured, copied and consumed worldwide.
That’s why there is a steady stream of startup companies developing leading-edge technologies to help businesses access, collect, manage, move, transform, analyze, understand, measure, govern, maintain and secure all this data.
All this means that big data presents a major opportunity for solution providers, MSPs and strategic service providers.
As part of CRN’s Stellar Startups for 2025, here are five big data technology startups, founded in 2019 or later, that solution providers should get to know. The following slides include lightly edited company and product descriptions provided by the startups.
DataPelago
Founded: 2021
Top Executive: Rajan Goyal, CEO and Co-Founder
DataPelago says it is transforming the economics of data processing with Nucleus, the company’s universal data processing engine built for accelerated computing in the GenAI and analytics era. Nucleus is purpose-built to process any type of data, across any hardware, and supports any query engine, delivering new price/performance benefits for data processing.
DataPelago Accelerator for Spark combines native execution, CPU vectorization, and GPU acceleration for Apache Spark workloads. The Accelerator is powered by Nucleus, DataPelago's universal data processing engine, and delivers up to 10x speedup and 80 percent cost reduction for customer workloads without requiring code changes.
Website: https://www.datapelago.ai/
Diliko
Founded: 2022
Top Executive: Glenn Hazard, CEO
Diliko provides intelligent, secure, and compliant data infrastructure for mid-sized enterprises. Its cloud-native platform automates the data lifecycle with Agentic AI, enabling real-time ETL, governance, and analytics readiness without complex infrastructure or expert staff. Designed for regulated industries, Diliko empowers partners to deliver fast, scalable solutions with built-in compliance.
Diliko’s cloud-native data platform is built for mid-sized enterprises. It automates ETL, reverse ETL, data tagging and policy enforcement using Agentic AI, which dynamically adapts data pipelines to schema, compliance, and business logic changes. Zero trust security, encryption, and multi-factor authentication ensure audit-ready performance without the overhead of traditional infrastructure.
Website: https://diliko.ai/
Firebolt
Founded: 2019
Top Executive: Hemanth Vedagarbha, President
Firebolt provides a high-performance cloud data warehouse designed for millisecond-latency analytics at scale. With a decoupled storage-compute architecture, vectorized query engine, and native support for data apps, Firebolt powers fast, cost-effective analytics for data engineering teams building operational and customer-facing applications across SaaS, MarTech, AdTech, and beyond.
The Firebolt Cloud Data Warehouse delivers millisecond analytics at scale with high concurrency and low cost. Key features include a decoupled storage-compute architecture, vectorized query engine, native support for semi-structured data, and Iceberg integration. It's built for data engineers and developers powering real-time analytics and AI-driven applications.
Website: firebolt.io
Onehouse
Founded: 2021
Top Executive: Vinoth Chandar, CEO and Co-Founder
Onehouse was founded by data lakehouse pioneer Vinoth Chandar, who built the first data lakehouse in 2016 at Uber. Onehouse today delivers the data lakehouse in minutes as a fully managed, cloud-based service. The company is backed by Craft Ventures, Addition, and Greylock Partners.
The Onehouse Universal Data Lakehouse provides lightning-fast data ingestion, incremental data transformations, and intelligent optimizations in a data platform that’s instantly accessible from any engine, from BI to AI. Purpose-built for data lakehouse workloads, Onehouse delivers a 2x-3x price-performance advantage over alternative Spark- and SQL-based data platforms.
Website: https://www.onehouse.ai
Tessell
Founded: 2021
Top Executive: Bala Kuchibhotla, CEO and Co-Founder
Tessell develops a cloud-native Database-as-a-Service (DBaaS) platform that simplifies and secures database operations across PostgreSQL, MySQL, Oracle, and more. Focused on enterprises modernizing data infrastructure, Tessell runs on AWS and Azure, offering high-performance storage, predictable pricing, and automated management for mission-critical, multi-cloud workloads.
Tessell’s fully managed database-as-a-service platform for Google Cloud supports Oracle, PostgreSQL, SQL Server, MySQL, and more. It automates provisioning, scaling, and disaster recovery while ensuring performance and compliance. It’s targeted at enterprises and mid-sized organizations that are modernizing mission-critical workloads in regulated industries.
Website: https://www.tessell.com/