The 10 Hottest Machine Learning And Data Science Startups In 2021

Businesses looking to bring the rewards of big data analysis to everyday users need ways to prepare and organize data and develop machine learning models for analyzing it. Here’s a look at 10 hot startups developing leading-edge data science and machine learning technologies that help them do that.

A Scientific Approach

Businesses today are struggling to leverage exploding volumes of data for competitive advantage. To meet that challenge, data engineers and data scientists are increasingly turning to emerging technologies in data science, artificial intelligence, machine learning and even deep learning to prepare and organize big data and develop the machine learning algorithms and predictive models that support business intelligence applications used by business analysts and information workers.

Here’s a look at 10 data science and machine learning startups with leading-edge products that solution providers should be aware of.


Top Executive: Gideon Mendels, Co-Founder, CEO

Comet’s platform allows data scientists and machine learning teams to manage and optimize the entire machine learning life cycle with a single system, including model management, model production monitoring, and tracking of data sets, code changes and experimentation history.

The latest release of the Comet product included Comet Artifacts for managing data set versions across any step of the ML pipeline and the addition of continuous integration and continuous deployment capabilities for automating the deployment of models into production.

Comet says it has achieved five-fold growth in annual recurring revenue over the last year and boasts a rapidly growing customer base. In November Comet, founded in 2017 and based in New York, raised $50 million in Series B funding.

Top Executive: Josh Benamram, Co-Founder, CEO

Databand’s unified data observability and machine learning development platform helps data engineers and data scientists identify, troubleshoot and fix data quality issues for data pipelines running on cloud-native systems such as Snowflake, Apache Spark and Apache Airflow.

Databand, founded in 2018 and based in Tel Aviv, Israel, raised $14.5 million in a Series A round of funding in December 2020.


Top Executive: Gleb Mezhanskiy, Co-Founder, CEO

Datafold offers a data reliability platform that data engineers and data scientists use to manage data workflows and monitor and improve analytical data quality. The technology can reduce the number of data quality incidents that make it into production by a factor of 10, according to the company.

In November Datafold, founded in 2020 and based in San Francisco, raised $20 million in Series A funding.


Top Executive: Ryohei Fujimaki, Founder, CEO

DotData develops what it calls AutoML 2.0 solutions for automating data science workflows. The dotData Enterprise machine learning and data science automation platform handles data ingestion and wrangling, automated feature engineering, AutoML and model operationalization tasks—all with zero coding.

In February dotData, founded in 2018, launched dotData Cloud, an AI/ML automation platform and services that enable business intelligence teams, especially those within smaller organizations that lack their own data science teams, to quickly automate AI/ML development tasks. And in May the San Mateo, Calif.-based company debuted dotData Py Lite, a containerized AI automation system for data scientists using Python.


Top Executive: Maor Shlomo, Co-Founder, CEO

Explorium develops an automated data acquisition platform for integrating a company’s internal data with thousands of external data sources for use by data scientists and business analysts.

The Explorium technology portfolio also includes an AutoML engine for automated data discovery and feature generation, and the Signal Studio for finding and integrating the most relevant external data signals.

Explorium, founded in 2017 and based in San Mateo, Calif., closed a $75 million Series C round of funding in May, bringing its total financing to $127 million.

In August the startup stepped up its go-to-market efforts, hiring former VMware worldwide sales director Sam Pugmire as chief revenue officer and former Mulesoft executive Tim Marsh as vice president of alliances and channels. They join Ajay Khanna, previously marketing vice president at Reltio, who joined Explorium as chief marketing officer earlier this year.

Top Executive: Dmitry Petrov, Co-Founder, CEO

Iterative builds open-source tools used to extend traditional development technologies for machine learning projects, especially ML projects involving unstructured data.

Iterative’s portfolio includes the DVC version control system, Continuous Machine Learning (CML) for continuous integration/continuous delivery and deployment, and the just-released DVC Studio for project collaboration. New versions of DVC and CML launched in March eliminate the need for proprietary AI platforms such as AWS SageMaker and Microsoft Azure ML Engineer, according to the San Francisco-based company.

Founded in 2018, Iterative raised $20 million in a Series A funding round in June.


Top Executive: Luis Ceze, Co-Founder, CEO

The OctoML platform, built on the open-source Apache TVM framework, is used to deploy machine learning models on varied hardware configurations and provide automation and performance when bringing trained models to production. The platform supports cloud services and edge computing hardware endpoints and helps businesses and organizations optimize ML models to match edge resources.

OctoML, founded in 2019 and based in Seattle, was spun out of the University of Washington Paul G. Allen School of Computer Science & Engineering, where work on deploying machine learning models led OctoML’s founders to create the open-source Apache TVM deep learning compiler.

In November OctoML raised $85 million in Series C funding, bringing the company’s total funding to $132 million.

Top Executive: Raj Bains, Co-Founder, CEO

Prophecy provides a low-code data engineering platform for developing and deploying data pipelines used to manage streams of data for business analytics and machine learning tasks. The system combines visual drag-and-drop development with Agile software engineering practices.

In February Prophecy debuted a SaaS-based version of the platform built on Apache Spark, the open-source analytics engine, and the Kubernetes container management system, and running on the Databricks system on Amazon Web Services, Microsoft Azure and Google Cloud Platform.

Prophecy, based in Palo Alto, Calif., raised $6.75 million in funding in February.

Top Executive: Serkan Piantino, Co-Founder, CEO

Headquarters: New York

Spell develops a machine learning platform for deep learning operations (DLOps) that the company says goes beyond traditional machine learning with its capabilities for preparing, training, deploying and managing the full life cycle of machine learning and deep learning models.

Deep learning is a segment of the machine learning world that incorporates complex learning models that rely on AI-based neural networks and is often used for complex tasks such as image recognition and natural language processing. Deep learning models are compute-intensive and often require high-performance systems running on GPUs and next-generation AI processors.

Spell, founded in 2017, says its cloud-agnostic platform can help reduce the costs of deep learning model development. In addition to serving customers such as Square, Healx and Conde Nast, Spell is teaming up with systems integrators with deep learning practices and application developers building deep learning-driven software.

Top Executive: Mike Del Balso, Co-Founder, CEO

Tecton exited stealth in April 2020 with its data platform for machine learning that’s designed to enable data scientists to turn raw data into the predictive signals that power machine learning models. The company’s goal is to solve the data challenges that constitute the biggest impediment to deploying machine learning in the enterprise.

The company’s founders (CEO Mike Del Balso, CTO Kevin Stumpf and Engineering Vice President Jeremy Hermann) worked together at Uber when the ride-sharing giant was building and deploying new machine learning models. They created Uber’s Michelangelo machine learning platform and then went on to found Tecton to develop technology to help other companies meet their operational machine learning data challenges.

Tecton, founded in 2019 and based in San Francisco, has raised $60 million in several rounds of funding.