Emerging Big Data Vendors To Know In 2022

As data becomes an increasingly valuable asset for businesses—and a critical component of many digital transformation and business automation initiatives—demand is growing for next-generation data management and data analytics technology. Here’s a look at 14 startups that are providing it.

Tackling The Big Challenges Of Big Data

In 2025 the total amount of digital data and information created, captured, copied and consumed worldwide will reach 181 zettabytes, up from 79 zettabytes in 2021, according to an estimate from market researcher Statista.

Businesses are struggling to manage all this data—not to mention analyze and otherwise leverage it for competitive advantage. Global spending for big data products and services, not surprisingly, is expected to explode from $162.6 billion last year to $273.4 billion in 2026, according to a MarketsandMarkets report.
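For a sense of scale, the annual growth rates implied by these forecasts can be worked out with a compound annual growth rate calculation. The sketch below is a rough illustration only: it assumes "last year" means 2021 and treats both forecasts as smooth compound growth, which neither research firm is quoted as saying here. The function name `cagr` is our own.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.10 == 10%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the article; the year spans are assumptions.
data_growth = cagr(79, 181, 2025 - 2021)        # zettabytes, 2021 -> 2025
spend_growth = cagr(162.6, 273.4, 2026 - 2021)  # $ billions, 2021 -> 2026

print(f"Implied data growth:     {data_growth:.1%} per year")
print(f"Implied spending growth: {spend_growth:.1%} per year")
```

Under those assumptions, data volume grows at roughly 23 percent a year and spending at roughly 11 percent a year.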

That’s why the big data technology space remains one of the most active segments of the IT industry. A new generation of startups is setting the pace in providing leading-edge technologies in data management and data analytics.

Demand for big data technology, services and expertise is running high right now. And that means opportunity for solution providers. As part of CRN’s Emerging Vendors for 2022, here are 14 big data startups, founded in 2016 or later, that solution providers should be aware of.

*Ahana

*Airbyte

*Bigeye

*Cribl

*Databand

*Equalum

*Firebolt

*Molecula

*Monte Carlo

*Nexla

*Promethium

*Prophecy

*Starburst

*Syncari

Ahana

Founded: 2020

Top Executive: Steven Mih, Co-Founder, CEO

Ahana offers a managed service for the Presto SQL query engine on AWS, with the goal of simplifying open data lake analytics. Ahana Cloud delivers easy-to-use Presto SaaS and enables data platform teams to provide high-performance SQL analytics on S3 data lakes and other data sources.

Ahana, based in San Mateo, Calif., recently launched a free community edition of its software and announced a $7.2 million continuation of its Series A funding, bringing its total funding to $32 million.

Website: https://ahana.io/

Airbyte

Founded: 2020

Top Executive: Michel Tricot, Co-Founder, CEO

Airbyte develops an open-source data integration engine that businesses use to unify their ETL (extract, transform and load) pipelines under a single platform for consolidating data in data warehouses and data lake systems.

Airbyte, based in San Francisco, said it is disrupting the data integration/ETL space with its pricing model that’s based on compute time rather than data volume.

In April the company launched Airbyte Cloud, a managed cloud service based on its core platform.

In December Airbyte raised $150 million in Series B funding.

Website: https://airbyte.com/

Bigeye

Founded: 2019

Top Executive: Kyle Kirwan, Co-Founder, CEO

Bigeye develops data observability software tools for measuring, improving and communicating the quality of data used for self-service business analytics, machine learning models and other data-intensive tasks.

Bigeye’s data observability platform, including its Autometrics and Autothresholds products, monitors the quality of data as it flows between systems, helping data management teams identify and fix data quality problems and maintain data integrity.
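As a rough illustration of what monitoring a data quality metric against automatically derived thresholds can look like, here is a minimal sketch. It is not Bigeye's implementation: the metric (daily row counts), the three-standard-deviation band and all function names are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def auto_threshold(history: list[float], k: float = 3.0) -> tuple[float, float]:
    """Derive a (lower, upper) band from past metric values: mean +/- k * stdev."""
    if len(history) < 2:
        raise ValueError("need at least two historical values")
    center = mean(history)
    spread = stdev(history)
    return center - k * spread, center + k * spread


def is_anomalous(value: float, history: list[float], k: float = 3.0) -> bool:
    """Return True when the new value falls outside the band learned from history."""
    lower, upper = auto_threshold(history, k)
    return not (lower <= value <= upper)


# Hypothetical daily row counts for a table loaded by a pipeline.
row_counts = [10_120, 9_980, 10_050, 10_210, 9_940, 10_075, 10_130]

print(is_anomalous(10_060, row_counts))  # a typical day
print(is_anomalous(4_300, row_counts))   # a load that dropped most rows
```

A production system would track many metrics per table (freshness, null rates, duplicates) and account for seasonality; the fixed band here only shows the basic idea.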

San Francisco-based Bigeye raised $45 million in Series B funding in September 2021, close on the heels of a $17 million Series A round in April of that year.

Website: https://www.bigeye.com/

Cribl

Founded: 2017

Top Executive: Clint Sharp, Co-Founder, CEO

Cribl develops a line of data observability products, including Cribl Stream and Cribl Edge, that the company said “makes open observability a reality” for tech professionals and provides “radical levels of choice and control” over their machine-generated data.

In August 2021 the San Francisco-based company launched Cribl Cloud, making it possible to monitor and manage data through the cloud while keeping the data itself in local processing and storage systems.

Cribl raised $150 million in Series D funding in May, following a $200 million Series C round in August 2021.

Website: https://cribl.io/

Databand

Founded: 2018

Top Executive: Josh Benamram, Co-Founder, CEO

Databand is another player in the data observability space, developing a proactive data observability platform for detecting, troubleshooting and resolving data quality issues in near real-time.

IBM acquired Israel-based Databand on June 27.

Website: https://databand.ai/

Equalum

Founded: 2016

Top Executive: Guy Eilon, CEO

Equalum’s continuous data integration platform natively supports all data integration modes under one unified platform with zero coding. Equalum offers next-generation change data capture, real-time streaming, ETL/ELT and batch ETL capabilities with native cloud support and enterprise-grade scalability.

Equalum, based in Sunnyvale, Calif., released the Equalum Continuous Data Integration Platform 3.0 in April, natively supporting all data integration, ingestion and transformation use cases.

Website: https://www.equalum.io/

Firebolt

Founded: 2019

Top Executive: Eldad Farkash, Co-Founder, CEO

Firebolt has developed a high-performance cloud data warehouse for data-intensive applications and interactive analytical systems that tap into huge volumes of data. The company aims the system at data engineers and data application developers.

The Tel Aviv, Israel-based startup exited stealth in December 2020 with $37 million in Series A funding and added to that in June of last year with a $127 million Series B round.

Website: https://www.firebolt.io/

Molecula

Founded: 2019

Top Executive: Higinio Maycotte, CEO

Molecula develops the FeatureBase feature-oriented database that data engineers use to support real-time analytical tasks and machine learning applications. The database simultaneously executes low-latency, high-throughput and highly concurrent workloads.

Molecula is based in Austin, Texas.

Website: https://www.molecula.com/

Monte Carlo

Founded: 2019

Top Executive: Barr Moses, Co-Founder, CEO

Monte Carlo’s data observability and reliability platform helps businesses and organizations ensure that their data is accurate—critical for powering data analytics systems, machine learning applications and digital products.

As businesses rely on data to power digital products and drive decision-making, it’s critical that data is accurate and reliable. Monte Carlo’s Data Observability Platform automatically monitors data as it flows through pipelines and alerts teams to data issues across data warehouses, data lakes, ETL systems and business intelligence tools.

Monte Carlo, based in San Francisco, raised $135 million in Series D funding in May, boosting its total financing to $236 million.

Website: https://www.montecarlodata.com/

Nexla

Founded: 2016

Top Executive: Saket Saurabh, Co-Founder, CEO

Nexla provides a data engineering automation platform that unifies data operations with a single interface for managing all flavors of data flows including ETL, ELT, API Integration, API Proxy and Data as a Service.

The platform automatically creates data as a product using a data fabric architecture powered by “continuous metadata intelligence,” giving users the ability to track data lineage and prepare data pipelines.

Nexla is based in San Mateo, Calif.

Website: https://www.nexla.com/

Promethium

Founded: 2018

Top Executive: Kaycee Lai, Founder, CEO

Promethium develops a collaborative data and analytics acceleration system that the company said makes it possible for everyone to make data-driven decisions without the complexity of data management.

The Promethium technology can discover, prepare, query and visualize data without the need to move data or switch between tools.

Promethium raised $26 million in Series A funding in February. The Menlo Park, Calif.-based company was recently awarded a patent for natural language processing-based automation.

Website: https://www.pm61data.com/

Prophecy

Founded: 2017

Top Executive: Raj Bains, Founder, CEO

Prophecy markets a low-code data engineering system that the company said “democratizes” the development and deployment of high-quality data pipelines. The platform is built on the Apache Spark data processing engine and the Kubernetes container orchestration system.

In June the company unveiled Prophecy for Databricks, which allows data engineers to use Prophecy in conjunction with the Databricks Lakehouse Platform to quickly build data pipelines for business intelligence and machine learning tasks.

In January Prophecy, based in Palo Alto, Calif., raised $25 million in Series A funding.

Website: https://www.prophecy.io/

Starburst

Founded: 2017

Top Executive: Justin Borgman, Co-Founder, CEO

Starburst has been one of the more visible big data startups in recent years, raising $414 million in venture funding—including a $250 million Series D round in February that put the Boston-based company’s value at $3.35 billion.

Starburst’s systems, including Starburst Enterprise and the cloud-based Starburst Galaxy, can query data across any source and location, without needing to move data, making it instantly actionable for data-driven applications and operations. The company’s products are based on the Trino SQL query engine.

In June Starburst acquired Varada, an Israeli developer of data lake analytics acceleration software.

Website: https://www.starburst.io/

Syncari

Founded: 2019

Top Executive: Nick Bonfiglio, Co-Founder, CEO

Syncari develops a no-code data management and process orchestration platform that addresses a range of data challenges in data quality and governance, data transformation, workflow automation, system alignment and reverse ETL.

The Syncari system is designed to help businesses and organizations overcome the many point-to-point data integrations that have developed over time, especially with the growth of cloud applications and data sources. The software is used by IT teams, sales and marketing operations, revenue operations and customer support to help synchronize splintered data.

Syncari, based in San Francisco, raised $17.3 million in a Series A funding round in May 2021.

Website: https://syncari.com/