The Coolest Data Science And Machine Learning Tool Companies Of The 2022 Big Data 100

Part 6 of CRN’s Big Data 100 includes a look at the vendors solution providers should know in the data science and machine learning tool space.

Fast Learners

As businesses and organizations struggle to manage ever-growing volumes of data and, more importantly, derive value from that data, they are increasingly turning to data science and machine learning tools to automate and improve big data processes and workflows.

As part of the 2022 Big Data 100, CRN has compiled a list of data science and machine learning tool companies that solution providers should be aware of.

This week CRN is running the Big Data 100 list in slideshows, organized by technology category, with vendors of business analytics software, database systems, data management and integration software, data warehouse systems, data science and machine learning tools, and big data systems and platforms.

Some vendors market big data products that span multiple technology categories. They appear in the slideshow for the technology segment in which they are most prominent.


Anaconda

Top Executive: Co-Founder, CEO Peter Wang

Anaconda offers a data science platform that includes the Anaconda distribution of the Python and R programming languages, providing a foundation for data science and machine learning tasks. The platform, offered in multiple editions, is used by data scientists, developers and IT teams who build machine learning and neural network models and work with predictive analytics and data visualization processes.

In November Anaconda, based in Austin, Texas, launched an Embedded Partner Program through which software companies can embed Anaconda tools, packages and repositories into their own products and services.


BigML

Top Executive: Co-Founder, CEO Francisco Martin

BigML expects that in the near future all applications will be predictive applications and take advantage of machine learning and other AI techniques.

BigML offers a comprehensive machine learning platform and toolset that the Corvallis, Ore.-based company says makes machine learning “easy and beautiful” for everyone. The platform provides what BigML calls robust machine learning algorithms for developing predictive applications that solve real-world problems using a single, standardized framework across an organization.


Dataiku

Top Executive: Co-Founder, CEO Florian Douetteau

Dataiku’s data science and machine learning platform is used to build and deploy applications that deliver data and advanced analytics, extending and automating decision-making across an organization.

Dataiku offers the system as a key component for data ops, data preparation, data visualization, machine learning and analytical application operations and initiatives.

In August Dataiku, based in New York, raised $400 million in Series E funding, putting the company’s valuation at $4.6 billion.


DataRobot

Top Executive: CEO Dan Wright

DataRobot has been among the most visible AI and machine learning tech companies in the industry. That’s due, in part, to the significant amounts of funding the company has raised, including a $320 million Series F round of financing in December 2020 and a $300 million Series G round in August 2021, the latter putting the Boston-based company’s valuation at $6.3 billion.

DataRobot’s AI Cloud platform (the 8.0 edition was released in March) is used by data scientists, data engineers, business analysts and developers to leverage augmented intelligence. The system prepares data for machine learning and AI applications, automates the creation of machine learning and time series models, and centrally deploys, monitors, manages and governs production machine learning models.

In December the company debuted DataRobot Core, which broadens the AI Cloud platform for “code-first data science experts.”

Domino Data Lab

Top Executive: Co-Founder, CEO Nick Elprin

Domino Data Lab markets the Domino Enterprise MLOps Platform, a machine learning and data science system that allows businesses and organizations to develop, deploy and manage models on a large scale.

In March Domino Data Lab unveiled extended integrations with Nvidia processors to enhance the deployment of GPU-accelerated machine learning models.

San Francisco-based Domino Data Lab raised $100 million in a funding round in October 2021.


dotData

Top Executive: Founder, CEO Ryohei Fujimaki

The dotData Enterprise data science automation platform allows enterprises to automate data science workflows and build and deploy AI models in days instead of months, according to the company. The system handles data ingestion and wrangling, automated feature engineering, AutoML and model operationalization tasks—all with zero coding.

San Mateo, Calif.-based dotData, a 2018 spinoff from NEC, this week raised $31.6 million in Series B funding, bringing its total funding to $74.6 million.

H2O.ai

Top Executive: Founder, CEO Sri Ambati

H2O.ai offers a platform and tools for building and operating artificial intelligence applications and infusing AI into business workflows. The company is seeing rapid adoption of its H2O AI Cloud, launched in January 2021, which brings together the company’s AI and automated machine learning products in a unified cloud platform.

In November 2021 H2O.ai raised $100 million in a Series E funding round that boosted the Mountain View, Calif.-based company’s valuation to $1.7 billion.


Iguazio

Top Executive: Co-Founder, CEO Asaf Somekh

The Iguazio MLOps Platform automates machine learning pipelines, handling such tasks as data ingestion and transformation, model training and evaluation, operational pipeline deployment, and data and model monitoring.

In October New York-based Iguazio became a Pure Storage technology partner, integrating the two companies’ technologies to allow Iguazio MLOps users to tap into Pure Storage data for developing machine learning models and deploying them across hybrid and multi-cloud environments.


KNIME

Top Executive: Co-Founder, CEO Michael Berthold

KNIME, which in its early days focused on data mining with its Konstanz Information Miner, leveraged that technology into an end-to-end data science system that includes the KNIME Analytics Platform for data science tasks and the KNIME Server for deploying, automating and managing data science workflows as analytical applications and services.

A new release of the KNIME software in December provided improved performance for Python development, new data wrangling options and new workflows as services.

KNIME is based in Zurich, Switzerland, with its U.S. office in Austin, Texas.


RapidMiner

Top Executive: CEO Peter Lee

RapidMiner’s data science and machine learning platform provides integrated data preparation, machine learning, deep learning, text mining and predictive analytics capabilities.

The Boston-based company’s individual products include RapidMiner Studio for accelerating the development of predictive models and the RapidMiner Go autoML tool for making data science capabilities accessible to analysts and business users. The company also offers Automated Data Science tools and RapidMiner AI Cloud.