The Coolest System And Cloud Platform Companies Of The 2022 Big Data 100

Part 5 of CRN’s Big Data 100 includes a look at the vendors solution providers should know in the big data system and cloud platform space.

Foundational Support For Big Data

Business analytics software, databases and data management tools are critical for managing big data and leveraging it for competitive advantage. But all those technologies need to run on foundational systems including hardware servers, operating systems and cloud platforms. And most of those are provided by some of the biggest names in the IT industry.

As part of the CRN 2022 Big Data 100, we’ve put together the following list of big data system and cloud platform companies that solution providers should be familiar with.

Many of these companies are household names like IBM, Dell Technologies and Hewlett Packard Enterprise that develop the underlying hardware/software that power big data analytics and operational applications. In the cloud, where many businesses are deploying big data projects, cloud platform companies like Amazon Web Services and Google Cloud provide the platforms for those initiatives.

Long-established software giants like Microsoft and Oracle provide foundational cloud systems and databases for big data initiatives, in addition to offering their own broad portfolios of data management and data analysis software. Other vendors like Cloudera, Databricks and Snowflake represent a new generation of big data platform providers.

This week CRN is running the Big Data 100 list in a series of slideshows, organized by technology category, spotlighting vendors of business analytics software, database systems, data warehouse systems, data management and integration software, data science and machine learning tools, and big data systems and cloud platforms.

Some vendors market big data products that span multiple technology categories. They appear in the slideshow for the technology segment in which they are most prominent.

Amazon Web Services

Top Executive: CEO Adam Selipsky

For many businesses and solution providers, AWS is their cloud platform provider for managing their data and running their big data applications and data analysis tools.

That’s not surprising given that AWS stores huge volumes of data for many businesses and organizations in S3 and other systems and is the cloud platform that many enterprises use to run their operational and analytical cloud workloads. AWS is also the cloud platform for many IT vendors’ big data offerings—even some competitors.

AWS also offers its own growing portfolio of big data services including databases (Amazon Aurora, Amazon DynamoDB and Amazon RDS); data analytics (Amazon Athena, Amazon QuickSight, Amazon Redshift and Amazon SageMaker); data movement (AWS Glue and Amazon Kinesis); and data lake management (AWS Lake Formation and AWS Data Exchange).

While AWS often competes with many companies in the big data arena, the cloud giant sometimes pursues a “co-opetition” approach with companies that offer competing products. That includes data cloud company Snowflake, whose own services run on AWS, and database developer MongoDB, which recently formed a strategic alliance with AWS.

Cloudera

Top Executive: CEO Robert Bearden

Cloudera offers a range of data management and analytics software and services around the company’s flagship Cloudera Data Platform, available both as a hybrid cloud service and as an on-premises private cloud.

The Cloudera portfolio includes CDP Data Engineering, CDP Data Warehouse, the CDP Data Hub “edge-to-AI” analytics service, CDP Machine Learning, CDP Operational Database and the Cloudera DataFlow platform for real-time data streaming.

Cloudera, based in Santa Clara, Calif., merged with rival Hortonworks in early 2019. In October 2021 Cloudera, at the time a publicly traded company, was acquired by private equity firms Clayton, Dubilier & Rice and KKR in a deal valued at $5.3 billion.

Databricks

Top Executive: CEO Ali Ghodsi

Databricks has been one of the hottest startups in the big data space in recent years. The company, founded in 2013 by the developers of the Apache Spark unified analytics engine for large-scale data processing, has become the leading proponent of the data lake concept for data analysis, an alternative to traditional data warehouse systems.

One reason Databricks has generated headlines is the company’s funding: Databricks has raised some $3.6 billion from investors, including $1 billion and $1.6 billion funding rounds in February and August 2021, putting the company’s valuation at an astonishing $38 billion. Some industry watchers expect an initial public offering from the company sometime this year.

The company develops its flagship Databricks Lakehouse Platform to handle a range of big data tasks including data analytics and data warehousing, data engineering, data streaming, and data science and machine learning. Recently the company has developed solutions for specific vertical industries, including financial services, health care and life sciences, and retail and consumer packaged goods, that run on the company’s platform.

That dovetails with the company’s recent launch of its Brickbuilder Solutions initiative to work with solution provider and systems integrator partners to develop industry-specific data and AI solutions that run on the Databricks Lakehouse Platform.

Dell Technologies

Top Executive: Founder, Chairman, CEO Michael Dell

Dell Technologies manufactures its PowerEdge line of servers, multiple storage system lines including PowerStore, and appliances like PowerProtect—all used to run big data workloads and applications.

Beyond its “big iron” hardware offerings, Dell provides a number of pre-integrated hardware, software and service solutions specifically designed to simplify the deployment and operation of big data analytics projects. The Dell EMC Ready Solutions for Big Data, for example, support big data technologies like Hadoop, Spark, NoSQL databases and Apache Kafka and emphasize self-service analytics, simpler deployment and lower costs.

Dell, based in Round Rock, Texas, also has offerings that combine Dell EMC storage infrastructure with software from other vendors, including Dremio, Splunk, VMware (Tanzu Greenplum) and Yellowbrick Data, for big data tasks.

Dremio

Top Executive: CEO Billy Bosworth

Dremio develops a data lake platform and SQL-based query software that analysts and data engineers use to manage, curate and analyze data and share insight. The platform is based on the Apache Arrow open-source technology for developing analytical applications that can process in-memory columnar data.

The company launched the Dremio Cloud data lakehouse in July 2021 and in March debuted Dremio Sonar, a new release of the company’s SQL query engine that powers Dremio Cloud. The company is also previewing Dremio Arctic, a new metadata and data management service that will work with Dremio Cloud.

In January Dremio, based in Santa Clara, Calif., raised $160 million in Series E funding, boosting its market valuation to $2 billion.

Google Cloud

Top Executive: CEO Thomas Kurian

Google Cloud has become one of the leading public cloud platforms for running big data workloads and applications.

Google Cloud provides its own lineup of services for managing and analyzing big data including the BigQuery data warehouse system. A key component of BigQuery is the Looker business analytics software that Google acquired in 2020.

Other big data management services in the Google Cloud portfolio include Cloud Dataflow stream and batch processing, Dataproc managed Hadoop and Spark, Cloud Data Fusion data integration, Cloud Data Catalog, Data Studio for data visualization and dashboards, and the Cloud Bigtable managed NoSQL database.

Hewlett Packard Enterprise

Top Executive: President, CEO Antonio Neri

Hewlett Packard Enterprise provides server and storage hardware and cloud services that power big data applications for many businesses and organizations. HPE’s server lineup, for example, includes composable infrastructure, hyperconverged infrastructure and high-performance computing systems.

But HPE’s focus recently has been on its HPE GreenLake as-a-service offerings for big data tasks, including the HPE GreenLake for Big Data edge-to-cloud platform for deploying and managing big data workloads running on Apache Hadoop. In September 2021 the Houston-based company launched the HPE Ezmeral Unified Analytics data lakehouse platform.

IBM

Top Executive: Chairman, CEO Arvind Krishna

IBM markets a broad portfolio of big data and AI-driven analytics software, much of it under the Watson AI umbrella. The Armonk, N.Y.-based company’s portfolio includes the IBM Cognos Analytics data analysis software and database systems including IBM Db2, Netezza and Informix.

IBM’s big data lineup also includes data integration tools (IBM DataStage), data governance and data replication software, and data lake and data warehouse offerings—the latter including the IBM Db2 Warehouse, Netezza Performance Server and the IBM Integrated Analytics System cloud data warehouse.

Infoworks

Top Executive: CEO Buno Pati

Infoworks develops a big data operations and orchestration platform with a comprehensive range of functionality for hybrid, cloud and multi-cloud environments. The system helps businesses on-board, prepare and operationalize enterprise data, create analytics workflows and deploy big data projects.

Infoworks, based in Palo Alto, Calif., also offers Infoworks Replicator for migrating on-premises Hadoop data lakes to the cloud. Also available are more than 200 connectors for moving data from any on-premises or cloud system including relational databases, data warehouses, business systems and Software-as-a-Service applications.

Microsoft

Top Executive: Executive Chairman, CEO Satya Nadella

Microsoft is a longtime player in the big data software space with its SQL Server database and Power BI business analysis and data visualization tool.

Microsoft Azure, along with AWS and Google Cloud, has become a popular cloud platform for running big data analytics workloads, including hosting big data software from other vendors, solution providers and their customers.

Azure is also Microsoft’s vehicle for a number of its own big data offerings including the Azure Synapse Analytics data warehouse and its related tools, Azure Data Explorer, Azure Stream Analytics and the Azure Cosmos DB NoSQL database.

Oracle

Top Executive: CEO Safra Catz

The industry dominance of the Oracle Database alone makes the Austin, Texas-based software giant one of the leading players in the big data space. While that relational database software is Oracle’s flagship product, the company’s database product line also includes the Oracle Autonomous Database cloud-based service and the popular MySQL open-source database.

Beyond database software Oracle boasts a deep portfolio of big data products including the Oracle Analytics platform, Oracle Big Data Service for building Hadoop-based data lakes, the Apache Spark-based Oracle Cloud Infrastructure Data Flow, and Oracle Cloud Infrastructure Data Catalog for data discovery and governance.

Oracle is even a player in the big data hardware space with its Exadata Database Machine, the high-performance server for running enterprise-class database workloads.

SAP

Top Executive: CEO Christian Klein

SAP is best known for its operational ERP, CRM and supply chain management applications. But those systems generate a lot of data and the Walldorf, Germany-based company offers an extensive portfolio of software for managing all that data and using it to competitive advantage.

At the core of SAP’s big data offerings is the Business Technology Platform, which includes the company’s data management, integration and analytics software. That includes the SAP HANA Cloud database, SAP Data Intelligence and SAP Master Data Governance for data management, and SAP Analytics Cloud and SAP Data Warehouse Cloud for analytics and planning.

Snowflake

Top Executive: CEO Frank Slootman

Snowflake initially focused on providing cloud-based data warehousing services following its 2012 launch. But in recent years, especially since its blockbuster IPO in September 2020, the company has broadened its sights to become “the data cloud” with a broad—and growing—lineup of offerings spanning data warehouse, data lake, data engineering, data science, data sharing and data application services.

More recently Snowflake has been expanding its service offerings for specific vertical industries, debuting in March the new Retail Data Cloud and Healthcare & Life Sciences Data Cloud solutions.

Also in March, Snowflake struck a deal to acquire Streamlit, the developer of a framework for accelerating the creation of data applications.

For the company’s fiscal 2022 ended Jan. 31, fast-growing Snowflake reported revenue of $1.22 billion, more than double the $592 million in revenue reported for fiscal 2021.

Splunk

Top Executive: CEO Gary Steele

Calling its core product “the data platform for the hybrid world,” Splunk develops its system for collecting, indexing and searching machine data, putting it at the center of a wide range of big data operations and initiatives that span such areas as IT observability, unified security and a broad range of custom applications.

Splunk Enterprise and Splunk Cloud Platform incorporate such capabilities as data streaming, machine learning, search and visualization, and data collaboration and orchestration. On top of those platforms Splunk provides offerings for specific functions including Splunk Enterprise Security, Splunk SOAR, Splunk Infrastructure Monitoring and Splunk Application Performance Monitoring.

In March Splunk, based in San Francisco, hired Proofpoint founder and CEO Gary Steele as its new CEO.

Sumo Logic

Top Executive: President, CEO Ramin Sayar

Sumo Logic’s cloud-based machine data analytics platform provides log data management and analytics services for security, IT operations and business intelligence use cases.

The Redwood City, Calif.-based company, for example, touts the platform’s ability to provide insights into cloud infrastructure performance, helping businesses and organizations accelerate cloud migration and optimize cloud infrastructure reliability.

In late 2021 Amazon Web Services named Sumo Logic its ISV Partner of the Year.