The 10 Coolest Big Data Startups Of 2020

Rick Whiting

Businesses and organizations today are turning to next-generation big data software and services startups to help them identify, integrate, prepare and manage the huge volumes of data they need for business analytics, digital transformation and other initiatives. Here’s a look at 10 startups that offer ground-breaking technology that helps solution providers and their customers meet their big data challenges.


Data Preparation, Integration Challenges Drive Demand For Next-Gen Big Data Tools

As businesses look to utilize huge volumes of data for a range of tasks, from business analytics and AI projects to digital transformation initiatives, they are running into a number of problems.

Finding, combining, preparing and transforming data for specific tasks has become a huge challenge: Gartner estimates that data and analytics leaders are spending 36 percent of their time on data preparation and data integration, more than any other data management job.

And as IT environments splinter across on-premises, hybrid and multi-cloud systems, so does a company’s data. That makes finding, integrating and managing data scattered across many systems for big data initiatives increasingly difficult.


Here’s a look at 10 big data startups with ground-breaking products—many designed to solve big data preparation, integration and management challenges—that caught our attention in 2020.



Alluxio

Top Executive: Founder and CEO Haoyuan Li

Headquarters: San Mateo, Calif.

Alluxio offers a data orchestration platform for managing data for analytics and machine learning applications when the compute and data storage functions are in separate locations, an increasingly common situation in today’s complex hybrid cloud and multi-cloud IT environments.

The virtual distributed storage system is based on a memory-centric, fault-tolerant architecture that enables the separation of storage and compute functions. The technology is based on U.C. Berkeley’s AMPLab Tachyon open-source project, whose creators founded Alluxio in 2015.

In October the company released Alluxio Data Orchestration Platform 2.4. The release offers new tools and functionality for linking data-driven applications, such as AI tools and business analytics software, with dispersed data sources like Hadoop-based data lakes, Amazon S3 and Google Cloud Storage.


Aparavi

Top Executive: Founder, CEO Adrian Knapp

Headquarters: Santa Monica, Calif.

Aparavi, which originally focused on data management for data backup tasks, launched its Data Intelligence and Automation Platform in March, touting the system as the answer for how companies can deal with data chaos, risk and opportunity in a distributed IT world.

Founded in 2017, Aparavi exited stealth in 2018.

The Aparavi platform is used to find, classify, automate and govern distributed data across on-premises and cloud systems for a range of tasks including data discovery and access, data retention and protection, and data governance, risk and compliance. The system provides analytics, machine learning and collaboration tools with access to distributed data, helping users transform it into a competitive asset.


DataKitchen

Top Executive: Founder, CEO Christopher Bergh

Headquarters: Cambridge, Mass.

DataKitchen and its DataKitchen DataOps platform have been attracting attention in the emerging realm of data operations or “DataOps.”

DataOps, which borrows some concepts from Agile development and DevOps, means adopting a methodical, agile approach to how data architectures and data pipelines are designed, operated and used to support business analytics and data management teams.

The DataKitchen DataOps system automates and coordinates people, tools and activities around an organization’s data pipelines, from data system orchestration, testing and monitoring to development and deployment.

In October DataKitchen launched the DataOps Transformation Advisory Service.


Equalum

Top Executive: Founder, CEO Nir Livneh

Headquarters: Sunnyvale, Calif., and Tel Aviv, Israel

Businesses and organizations are generating a lot more data today. That means a lot more data needs to be moved between systems—often in real time. The next wave of cloud migrations, for example, will require data streaming capabilities, according to industry analyst Kevin Petrie of the Eckerson Group.

Equalum has created a data ingestion platform for developing and managing both batch and streaming data pipelines for such tasks as data warehouse ETL (extract, transform and load), data consolidation into data lakes, and continuous data replication for change data capture. The company touts the “infinite speed and scalability” of its technology and its ability to develop data pipelines with zero coding.

Founded in 2015, Equalum has raised $25 million in equity financing.


Hammerspace

Top Executive: CEO David Flynn

Headquarters: Los Altos, Calif.

Hammerspace is a player in the growing Data-as-a-Service space with technology that provides access to data across hybrid and multi-cloud IT systems. The company’s software-defined Hybrid Cloud Data Control Plane, which relies on metadata-driven machine learning, virtualizes and abstracts data from multiple storage systems, both on-premises and cloud-based, making it available to any application, service, container or developer.

Hammerspace, founded in 2018, launched the Hammerspace Channel Program in December 2019 and expanded the program into the European Union in July of this year. The company’s channel initiatives are led by Mark Glasgow, senior vice president, worldwide sales and field operations.


Isima

Top Executive: Co-Founder, CEO Darshan Rawal

Headquarters: Palo Alto, Calif.

Big data startup Isima, founded in 2016, just emerged from stealth in August backed by $10 million in seed funding.

The company’s BiOS data convergence and analytics platform is designed to help businesses and organizations manage the whole life cycle of developing and deploying data-driven applications and easily add new data sources. The platform consolidates many of the capabilities of traditionally separate big data tools including data warehouse, ETL, enterprise service bus and business intelligence software.

The software is already in use at a number of customer sites for supply chain optimization, fraud detection, trade reconciliation and churn reduction applications.


Octopai

Top Executive: Co-Founder, CEO Amnon Drori

Headquarters: Rosh Ha’ayin, Israel

Octopai, founded in 2015, develops an automated, centralized, metadata management and search engine system that data scientists and business intelligence groups use to discover, govern and track shared metadata.

The software is used to maintain companywide data consistency and help business analysts find and understand available data and the data’s lineage. It can also be used for big data governance and compliance tasks where data lineage is key.


Okera

Top Executive: CEO Nick Halsey

Headquarters: San Francisco

Providing user access to increasingly large volumes of data—“democratizing access” is the buzz term—while enforcing data security and data governance policies is becoming a major challenge.

Okera has developed a platform that IT managers use to automatically discover and tag sensitive data, develop and enforce data governance policies, and audit data security and governance operations. This year the company was named a Gartner Cool Vendor in DataOps.

In April Okera, founded in 2016, raised $15 million in Series B funding, bringing its total financing to $29.6 million, money the company is investing in expanded engineering, sales and marketing. At the same time Okera named Nick Halsey, previously president and CEO at ZoomData, as the company’s new CEO.


Rivery

Top Executive: Co-Founder, CEO Itamar Ben Hemo

Headquarters: New York

Rivery offers an “intuitive” data integration and preparation platform that simplifies the process of aggregating and transforming both internal and external data into a single stream for loading into cloud-based analytics systems such as Amazon Redshift, Google BigQuery and Snowflake.

The company’s platform includes a no-code ETL tool, software for automatically migrating data from on-premises systems to cloud data warehouses, and data orchestration tools to connect and orchestrate all in-house and third-party data sources.

Founded in 2018, Rivery received $5 million in a seed round of funding in November 2019.


Stardog

Top Executive: Co-Founder, CEO Kendall Clark

Headquarters: Arlington, Va.

Stardog has developed the Enterprise Knowledge Graph Platform, which the startup says creates a flexible, reusable data layer for answering complex queries across multiple data silos, using connectors to all major SQL systems and the most popular NoSQL databases. The technology unifies data based on its meaning, creating what the company calls a data fabric and “a connected network of knowledge.”

The company’s BITES pipeline even extracts concepts from unstructured data such as research papers, resumes and regulatory documents.

Founded in 2015, Stardog has raised $23.3 million in venture funding.


Rick Whiting

Rick Whiting has been with CRN since 2006 and is currently a feature/special projects editor. Whiting manages a number of CRN’s signature annual editorial projects including Channel Chiefs, Partner Program Guide, Big Data 100, Emerging Vendors, Tech Innovators and Products of the Year. He also covers the Big Data beat for CRN.
