The 10 Hottest Data Science And Machine Learning Startups of 2021 (So Far)

Businesses looking to bring the rewards of big data analysis to everyday users need ways to prepare and organize data and develop machine learning models for analyzing it. Here’s a look at 10 hot startups developing leading-edge data science and machine learning technologies that help them do that.

The Scientific Method


Businesses today are leveraging ever-increasing volumes of data for competitive advantage. That means employing emerging technologies in data science, artificial intelligence, machine learning and even deep learning to prepare and organize big data and develop the machine learning algorithms and predictive models that support business intelligence applications used by analysts and information workers.

Here’s a look at 10 startups with leading-edge data science and machine learning products that solution providers should be aware of.



Top Executive: Robin Rohm, Co-Founder and CEO

Headquarters: Berlin, Germany

Apheris, founded in 2019, offers a platform for cross-company data science operations and collaboration. The software makes it possible to securely analyze distributed data from multiple parties while keeping proprietary information private.

In August 2020, Apheris raised 2.5 million euros (approximately $2.98 million) in seed funding.


Top Executive: Constantinos Venetsanopoulos, Founder and CEO

Headquarters: San Mateo, Calif.

Arrikto’s flagship product, Arrikto Enterprise Kubeflow, is a complete machine learning operations (MLOps) platform that the company says brings together data scientists and DevOps to simplify, accelerate and secure model development through production. The goal, according to the company, is to bring the same principles used in DevOps to machine learning data.

The company, founded in 2014, also offers the cloud-native Rok Data Management Platform to manage the data – wherever it resides – needed for machine learning development and operations.

Top Executive: Gideon Mendels, Co-Founder and CEO

Headquarters: New York

Comet develops a self-hosted, cloud-based MLOps platform for machine learning model development and monitoring. The system helps data scientists track, compare, explain and optimize machine learning experiments and production models and manage related datasets.

Founded in 2017, Comet raised $13 million in Series A funding in April.

Top Executive: Josh Benamram, Co-Founder and CEO

Headquarters: Tel Aviv, Israel

Databand’s unified data observability and machine learning development platform helps data engineers and data scientists identify, troubleshoot and fix data quality issues for data pipelines running on cloud-native systems such as Snowflake, Apache Spark and Apache Airflow.

Founded in 2018, Databand raised $14.5 million in December 2020 in a Series A round of funding led by Accel.


Top Executive: Ryohei Fujimaki, Founder and CEO

Headquarters: San Mateo, Calif.

dotData develops what it calls AutoML 2.0 solutions for automating data science workflows. The dotData Enterprise machine learning and data science automation platform handles data ingestion and wrangling, automated feature engineering, AutoML and model operationalization tasks – all with zero coding.

In February dotData, founded in 2018, launched dotData Cloud, an AI/ML automation platform and services that enable business intelligence teams, especially those within smaller organizations that lack their own data science teams, to quickly automate AI/ML development tasks. And in May the company debuted dotData Py Lite, a containerized AI automation system for data scientists using Python.


Top Executive: Maor Shlomo, Co-Founder and CEO

Headquarters: San Mateo, Calif.

Explorium develops an automated external data platform for advanced analytics and machine learning tasks. The system provides combined access to a wide range of external sources for data scientists and business analysts.

The Explorium technology portfolio also includes a powerful Auto ML engine for automated data discovery and feature generation, and the Signal Studio for finding and integrating the most relevant external data signals.

Explorium, founded in 2017, closed a $75 million Series C round of funding in May, bringing its total financing to $127 million.

Top Executive: Dmitry Petrov, Co-Founder and CEO

Headquarters: San Francisco

Iterative builds open-source tools used to extend traditional development technologies for machine learning projects – especially ML projects involving unstructured data.

Iterative’s portfolio includes the DVC version control system, Continuous Machine Learning (CML) for continuous integration/continuous delivery and deployment, and the just-released Studio for project collaboration. New versions of DVC and CML launched in March eliminate the need for proprietary AI platforms such as AWS SageMaker and Microsoft Azure Machine Learning, according to the company.

Founded in 2018, Iterative just raised $20 million in a Series A funding round.


Top Executive: Assaf Egozi, Co-Founder and CEO

Headquarters: Tel Aviv, Israel

Noogata in March debuted its modular no-code AI data analytics platform to help businesses and organizations scale their enterprise data analytics initiatives. The platform collects, enriches and models data insights, predictions and recommendations and provides actionable self-serve analytics throughout a company.

Founded in 2019, Noogata just raised $12 million in seed funding. The startup lists Colgate-Palmolive and PepsiCo among its early customers.

Top Executive: Serkan Piantino, Co-Founder and CEO

Headquarters: New York

Spell develops a machine learning platform for deep learning operations (DLOps) that the company says goes beyond traditional machine learning with its capabilities for preparing, training, deploying and managing the full lifecycle of machine learning and deep learning models.

Deep learning is a segment of the machine learning world that incorporates complex learning models that rely on AI-based neural networks and is often used for complex tasks such as image recognition and natural language processing. Deep learning models are compute-intensive and often require high-performance systems running on GPUs and next-generation AI processors.

Spell, founded in 2017, says its cloud-agnostic platform can help reduce the costs of deep learning model development. In addition to serving customers such as Square, Healx and Conde Nast, Spell is teaming up with systems integrators with deep learning practices and application developers building deep learning-driven software.

Top Executive: Mike Del Balso, Co-Founder and CEO

Headquarters: San Francisco

Tecton exited stealth in April 2020 with its data platform for machine learning that’s designed to enable data scientists to turn raw data into the predictive signals that power machine learning models. The company’s goal is to solve the data challenges that constitute the biggest impediment to deploying machine learning in the enterprise.

Founders Mike Del Balso (CEO), Kevin Stumpf (CTO) and Jeremy Hermann (engineering vice president) worked together at Uber when the ride-sharing giant was building and deploying new machine learning models. They created Uber’s Michelangelo machine learning platform and then went on to found Tecton to develop technology to help other companies meet their operational machine learning data challenges.

Tecton, founded in 2019, has raised $60 million in several rounds of funding.