The Coolest Data Science And Machine Learning Tool Companies Of The 2021 Big Data 100

Part 5 of CRN’s Big Data 100 includes a look at the vendors solution providers should know in the data science and machine learning tool space.

Learning Curve

As businesses and organizations strive to manage ever-growing volumes of data and, even more important, derive value from that data, they are increasingly turning to data engineering and machine learning tools to improve and even automate their big data processes and workflows.

As part of the 2021 Big Data 100, CRN has compiled a list of data science and machine learning tool companies that solution providers should be aware of. While most of these are not exactly household names, some, including DataRobot, Dataiku and H2O, have been around for a number of years and have achieved significant market presence. Others, including dotData, are more recent startups.

This week CRN is running the Big Data 100 list in slideshows, organized by technology category, with vendors of business analytics software, database systems, data management and integration software, data science and machine learning tools, and big data systems and platforms.

(Some vendors market big data products that span multiple technology categories. They appear in the slideshow for the technology segment in which they are most prominent.)


Anaconda

Top Executive: CEO Peter Wang

Anaconda, perhaps best known in the development community for its Python and R development tools, develops a data science and machine learning platform that’s available in multiple editions (individual, team, commercial and enterprise).

The company’s flagship enterprise edition provides a comprehensive foundation for data science and machine learning tasks. Use cases include building machine learning and neural network models and assisting with predictive analytics and data visualization processes.

Anaconda is based in Austin, Texas.


BigML

Top Executive: Co-Founder and CEO Francisco Martin

Under the motto “machine learning made beautifully simple for everyone,” BigML, headquartered in Corvallis, Ore., offers a comprehensive machine learning platform with robust machine learning algorithms that the company says can solve real-world problems using a single, standardized framework across an organization.

Given the importance of providing access to data for machine learning workflows, the winter 2020 release of the BigML platform included expanded support for MySQL, Microsoft SQL Server and Elasticsearch databases, in addition to PostgreSQL.


Dataiku

Top Executive: Co-Founder and CEO Florian Douetteau

Dataiku’s all-in-one data science and machine learning platform can create, share and reuse applications that leverage big data and machine learning to extend and automate decision making. The system is a key component for data ops, data preparation, data visualization, machine learning operations and analytical application initiatives.

Last week, New York-based Dataiku received a strategic investment (amount undisclosed) from Snowflake Ventures, the venture arm of data cloud vendor Snowflake. In August 2020 the company raised $100 million in Series D funding.


DataRobot

Top Executive: CEO Dan Wright

DataRobot’s enterprise AI platform is used to prepare data for machine learning and AI applications; automate the creation of machine learning and time series models; and centrally deploy, monitor, manage and govern production machine learning models.

In December 2020, Boston-based DataRobot announced a $320 million Series F round of financing from investors that included Snowflake, Salesforce Ventures and Hewlett Packard Enterprise.

Dan Wright, previously president and chief operating officer at DataRobot, succeeded co-founder Jeremy Achin as CEO on March 10.

Domino Data Lab

Top Executive: Co-Founder and CEO Nick Elprin

Domino Data Lab markets the Domino Enterprise MLOps Platform, a machine learning operations and data science platform that centralizes data science work and accelerates machine learning model deployment.

Earlier this month, San Francisco-based Domino Data Lab partnered with chip manufacturer Nvidia to develop a series of integrated solutions and product enhancements for deploying Domino Data Lab tools.


dotData

Top Executive: Founder and CEO Ryohei Fujimaki

dotData develops what it calls AutoML 2.0 solutions for automating data science workflows. The dotData Enterprise machine learning and data science automation platform handles data ingestion and wrangling, automated feature engineering, AutoML and model operationalization tasks – all with zero coding.

In February, San Mateo, Calif.-based dotData launched dotData Cloud, an AI/ML automation platform and services offering that enables business intelligence teams, especially those within smaller organizations that lack their own data science teams, to quickly automate AI/ML development tasks.

H2O.ai

Top Executive: Founder and CEO Sri Ambati

H2O.ai develops artificial intelligence and machine learning tools, including the company’s H2O open-source machine learning and predictive analysis platform for building machine learning models.

Other H2O products include Deep Water and Sparkling Water, versions of H2O integrated with other systems including TensorFlow and Spark; Steam, the company’s commercial offering for building and deploying ML models and applications; and Driverless AI for non-technical users. H2O.ai is headquartered in Mountain View, Calif.


Iguazio

Top Executive: Co-Founder and CEO Asaf Somekh

The Iguazio Data Science Platform accelerates and scales the development, deployment and management of AI applications and automates machine learning operations and pipelines.

In December, New York-based Iguazio launched an integrated feature store within its Data Science Platform to accelerate the deployment of AI applications across hybrid- and multi-cloud environments.


KNIME

Top Executive: Co-Founder and CEO Michael Berthold

KNIME got its start as a developer of data mining technology and now offers its KNIME (Konstanz Information Miner) Analytics Platform and KNIME Server for building data science solutions and putting them into production.

KNIME, headquartered in Zurich, Switzerland, released version 4.3 of its platform in December with expanded ETL (data extract, transform and load), deep learning, collaboration and deployment monitoring capabilities.


RapidMiner

Top Executive: CEO Peter Lee

Another company to come from the data mining space, RapidMiner markets its data science and machine learning platform for data preparation, machine learning, deep learning, text mining and predictive analytics activities.

The Boston-based company’s individual products include RapidMiner Studio for accelerating the development of predictive models and the RapidMiner Go AutoML tool for making data science capabilities accessible to analysts and business users.