The Coolest Big Data Systems And Platform Companies Of The 2021 Big Data 100

Part 2 of CRN’s Big Data 100 includes a look at the vendors solution providers should know in the systems and platforms space.

A Systemic Approach

Business analytics and data visualization applications, database software, data science and data engineering tools are all critical components of a comprehensive initiative to leverage a business’s data assets for competitive gain.

But all those components run on hardware servers, operating software and cloud platforms that pull all those pieces together.

As part of the 2021 Big Data 100, CRN has compiled a list of major system and cloud platform companies that solution providers should be aware of. They include major computer system vendors like Dell Technologies, Hewlett Packard Enterprise and IBM that provide servers and operating software packaged for big data applications; cloud service providers like Amazon Web Services, Google and Snowflake that offer cloud-based big data services; and leading big data software developers including Microsoft, Oracle and SAP.

This week CRN is running the Big Data 100 list in slideshows, organized by technology category, with vendors of business analytics software, database systems, data management and integration software, data science and machine learning tools, and big data systems and platforms.

(Some vendors market big data products that span multiple technology categories. They appear in the slideshow for the technology segment in which they are most prominent.)

Amazon Web Services

Top Executive: CEO Andy Jassy

For many businesses and solution providers, AWS has become an indispensable cloud platform for storing and managing data and running big data applications.

In addition to serving as a platform for other vendors’ big data systems, AWS offers its own growing portfolio of big data services including databases (Amazon Aurora, RDS and DynamoDB); analytics (Amazon Athena, Elasticsearch Service, Kinesis, QuickSight and Redshift); and data management (AWS Glue, Lake Formation, Data Pipeline and Data Exchange).

CEO Andy Jassy has led AWS’ astounding growth and he is about to take over as CEO of parent company Amazon once current CEO Jeff Bezos steps aside, as announced in February. Tableau CEO Adam Selipsky, who previously worked at Seattle-based AWS, is taking over as AWS CEO.


Cloudera

Top Executive: CEO Robert Bearden

While Cloudera struggled to find its footing after its early 2019 merger with rival Hortonworks, the company has developed significant momentum over the past two years and emerged as a key player in the big data platform space.

The Cloudera Data Platform is a complete big data platform for on-premises, hybrid cloud and multi-public-cloud deployments. An edition for private clouds debuted in August 2020. On March 31 Cloudera became available on the Google Cloud Platform.

The Cloudera system provides comprehensive data engineering, machine learning, data visualization, operational database, data hub and data warehouse functionality. Other products include Cloudera DataFlow for developing and managing streaming data applications and the Data Science Workbench.

For the company’s fiscal 2021 (ended Jan. 31, 2021) Cloudera, based in Palo Alto, Calif., reported revenue of $869.3 million, up 9.5 percent from $794.2 million one year before.

Dell Technologies

Top Executive: Chairman, CEO Michael Dell

Dell, of course, develops servers and storage systems used to power big data workloads and applications.

Beyond its hardware offerings, Dell, based in Round Rock, Texas, provides a number of pre-integrated systems specifically designed to simplify the deployment and operation of big data analytics projects. The Ready Solutions for Data Analytics combines Dell EMC infrastructure with software from other vendors, including Splunk, Cloudera and Confluent, for such tasks as edge analytics and real-time data streaming.

Google Cloud

Top Executive: CEO Thomas Kurian

Like AWS, Google Cloud is a popular public cloud platform for running big data workloads and applications.

Google Cloud also provides a line of big data services including the BigQuery data warehouse, Cloud Dataflow stream and batch processing, Dataproc managed Hadoop and Spark, Cloud Data Fusion data integration, Cloud Data Catalog, Data Studio for data visualization and dashboards, and the Cloud Bigtable managed NoSQL database.

In 2020 Google, Mountain View, Calif., bought business analytics software vendor Looker. Today the software is a key component of the BigQuery service.

Hewlett Packard Enterprise

Top Executive: President, CEO Antonio Neri

Hewlett Packard Enterprise markets high-performance computing systems such as the HPE Superdome and HPE ProLiant for big data, artificial intelligence and machine learning tasks. The HPE Apollo 6000 system, for example, is offered as a deep learning platform.

The Houston-based company also offers HPE GreenLake for Big Data, an enterprise-grade, as-a-service offering for the Hadoop big data platform.

Hitachi Vantara/Pentaho

Top Executive: CEO Gajen Kandiah

Hitachi Vantara offers a range of data management and data analytics software, as well as converged and hyperconverged servers and storage systems to run big data projects.

The Santa Clara, Calif.-based company’s big data software lineup includes the Lumada DataOps Suite that provides data integration, data catalog and data optimization for Hadoop capabilities. The Lumada Suite also includes Pentaho, Hitachi Vantara’s business intelligence software.


IBM

Top Executive: CEO Arvind Krishna

IBM’s broad big data software portfolio includes its Db2 and Informix database systems, the Cognos business intelligence toolset, and the IBM InfoSphere Information Server suite of data integration and data governance tools.

On the hardware side the company, based in Armonk, N.Y., markets the Netezza Performance Server data warehouse platform.

The IBM Big Data Platform is the company’s flagship big data solution. The integrated system offers Hadoop-based analytics, data warehousing, stream computing, and data integration and governance capabilities. It incorporates a number of IBM products including InfoSphere BigInsights, InfoSphere Streams, IBM Smart Analytic Systems, and IBM PureData Systems for Analytics, for Hadoop and for Operational Analytics.


Micro Focus

Top Executive: CEO Stephen Murdoch

The Micro Focus analytics and big data software lineup includes the Vertica analytics SQL database, the IDOL unstructured data (text, speech and video) analytics system, and a number of analytical tools under the ArcSight name including ArcSight Data Platform, ArcSight Investigate and ArcSight User Behavior Analytics.

Micro Focus is based in Newbury, U.K.


Microsoft

Top Executive: CEO Satya Nadella

Microsoft, of course, is a long-time player in the big data software space with its SQL Server database and Power BI business analysis and data visualization tool.

Microsoft Azure has become a popular cloud platform for running big data applications and workloads. Microsoft, Redmond, Wash., also uses Azure as a platform for a number of its own big data offerings including the Azure Synapse Analytics data warehouse, HDInsight Hadoop, Azure Data Factory, Azure Stream Analytics, Azure Data Lake Analytics and Azure Analysis Services.


Oracle

Top Executive: CEO Safra Catz

With its flagship Oracle Database (current release 21c) the dominant relational database in the industry, there is no arguing with the company’s position as a leading big data company. (Oracle’s database portfolio also includes the open-source MySQL database.)

The Oracle Database is also the foundation for a range of other big data packages and offerings from the company including the Oracle Autonomous Database and the Oracle Autonomous Data Warehouse.

Oracle’s analytics software includes Oracle Essbase, Oracle Analytics Server, Analytics Cloud and Fusion Analytics, and the company offers a broad range of data management and data integration tools.

Oracle, now based in Austin, Texas, even provides hardware for running big data systems including the Oracle Database Appliance and the Oracle Exadata Database machine.


SAP

Top Executive: CEO Christian Klein

SAP is best known for its operational ERP, CRM and supply chain management applications. But those systems generate a lot of data and the Walldorf, Germany-based company offers a portfolio of software for managing and analyzing all that data.

SAP’s HANA in-memory database is the foundation for much of the company’s big data offerings, providing the platform for the SAP Data Intelligence, SAP BW/4HANA, SAP Data Warehouse Cloud and SAP Analytics Cloud systems.

SAP also offers its business analytics software, including its BusinessObjects and SAP Crystal reporting software, as part of its SAP Business Technology Platform, which also includes AI and machine learning capabilities.


Snowflake

Top Executive: CEO Frank Slootman

Snowflake launched in 2012 as a cloud data warehouse service provider but has since expanded its vision to become a data cloud company providing a range of data warehouse, data lake, data engineering, data science, data sharing and data application services. The platform’s data sharing capabilities also underlie Snowflake’s data marketplace.

Snowflake, based in San Mateo, Calif., went public in September 2020 in an impressive IPO that put the company’s initial market cap at more than $70 billion. For its fiscal 2021 ended Jan. 31, 2021, Snowflake reported revenue of $592.0 million, up 124 percent from $264.7 million one year earlier.


Splunk

Top Executive: CEO Doug Merritt

Under the moniker “Data-to-Everything,” Splunk, based in San Francisco, develops and markets its Splunk Enterprise and Splunk Cloud Platform for a range of big data tasks, particularly in security, IT and DevOps.

In addition to the platform, Splunk offers its own applications for those tasks, such as Unified Security Operations and Infrastructure Monitoring & Troubleshooting. ISVs and in-house developers also build applications that leverage the platform’s big data capabilities.

Splunk just hired former AWS and Microsoft executive Teresa Carlson as president and chief growth officer, overseeing the company’s sales, marketing and customer-focused operations.


Teradata

Top Executive: President, CEO Steve McMillan

Teradata, based in San Diego, is generally seen as the original data warehouse company as it developed an integrated hardware/software data warehouse system back in 1979.

Today Teradata’s Vantage multi-cloud data analytics platform is the company’s core product along with the Vantage Analyst self-service data loading, data discovery, machine learning and advanced analytics software.

The company also offers Vantage CX, its customer experience analysis software, as well as analytical software for specific industries including financial services, automotive, retail, manufacturing, energy and health care.

Yellowbrick Data

Top Executive: CEO Neil Carson

Yellowbrick Data, which got its start as a data warehouse appliance supplier, now develops massively parallel processing (MPP) data warehouse and SQL analytics products.

Its core data warehouse offering, incorporating its MPP database, can be distributed across data centers, public and private clouds, and the network edge. Earlier this month the Palo Alto, Calif.-headquartered company debuted Yellowbrick Manager for unified control of data warehouses across distributed clouds.

The company positions itself as an upgrade option for current Teradata and IBM Netezza data warehouse owners and a less-costly alternative to Snowflake.