2017 Big Data 100: 15 Coolest Big Data Platform Vendors

Elevating Big Data Operations

Business analytics and data management software may be the most visible components of a company's big data ecosystem. But under those software products are complex, integrated systems – either on premise or in the cloud – that serve as platforms for big data applications, processing huge volumes of data and providing the horsepower that keeps big data systems running.

The CRN editorial team has created the fifth annual Big Data 100 list, identifying vendors that have demonstrated an ability to innovate in bringing to market products and services that help businesses work with big data.

Here are 15 big data platform companies whose on-premise and cloud offerings are used to build and operate big data systems, including data warehouses and data lakes.

Amazon Web Services

Top Executive: CEO Andy Jassy

Amazon Web Services has become the de facto data management system for many business applications. So it's no surprise that AWS is on a rapid growth trajectory and recently reported a $15-billion annual revenue run rate.

AWS' big data platform service offerings include analytic frameworks such as the Athena interactive query service and Elasticsearch service; real-time analytics with Kinesis Firehose; data management with Amazon DynamoDB NoSQL and Amazon Aurora relational database systems; the Redshift cloud data warehouse system; and the QuickSight business intelligence system.

Earlier this year AWS announced Redshift Spectrum, a new feature of the data warehouse system that allows customers to run queries on exabytes of data (1 exabyte equals 1,000 petabytes) stored in Amazon S3.

BlueData Software

Top Executive: CEO Kumar Sreekanti

BlueData's software, which incorporates Docker's container technology, is used to deploy big data infrastructure and applications in an on-premise model or on Amazon Web Services. BlueData EPIC (Elastic Private Instant Cluster) is a platform that provides Hadoop-as-a-Service and Spark-as-a-Service.

The company's spring release of BlueData EPIC provides the ability to run big data workloads in hybrid on-premise and public cloud systems.

In January BlueData, founded in 2012 and based in Santa Clara, Calif., said sales grew by 426 percent in 2016 with the addition of such customers as State Farm Insurance, Barclays and Panera Bread.

Cazena

Top Executive: CEO Prat Moghe

Cazena's Big Data-as-a-Service moves data processing tasks to the cloud with just a few clicks, automating what is generally a long, complex process. Cazena bundles cloud databases, analytics engines, data movers, security and other tools into a big data Platform-as-a-Service offering that runs on Microsoft Azure and AWS.

The vendor also provides data lake and datamart cloud services, and in February the company debuted its Data Science Sandbox cloud service for building, testing and running data science analytical applications.

Founded in 2014 and based in Waltham, Mass., Cazena has attracted attention – and financing – because CEO Prat Moghe and board members Jit Saxena and Jim Baum were among the founders of Netezza, a pioneering developer of data warehouse appliances that IBM bought in 2010 for $1.7 billion.

Cloudera

Top Executive: CEO Tom Reilly

Cloudera is a leading provider of a big data platform for machine learning and advanced analytics built on the latest open-source technologies.

The company's products include its flagship Cloudera Enterprise Data Hub, the Cloudera Analytic DB and the Cloudera Operational DB. The company just announced the general availability of the Cloudera Data Science Workbench, a self-service tool for data scientists.

Based in Palo Alto, Calif., Cloudera filed earlier this year to become a public company and plans to issue 15 million shares of common stock at $15.00 per share.

Dell Technologies

Top Executive: CEO Michael Dell

With the $58 billion EMC acquisition under its belt, Dell now boasts a broad range of big data products in such areas as data management (Dell Master Data Management Services), data integration (Dell Boomi), infrastructure (servers, storage systems and networks) and analytics and business intelligence (through its alliance with NTT Data Services).

The company also markets big data technologies through relationships with other vendors, such as Cloudera's Hadoop system.

Statistica, the advanced analytics software Dell acquired when it bought StatSoft in 2014, was part of the Quest spinoff in 2016.

Google

Top Executive: CEO Sundar Pichai

The Google Cloud Platform includes a number of big data management and analytics tools including the BigQuery analytics data warehouse; Cloud Datalab for visually exploring and analyzing big data sets; and Cloud Dataproc, a managed Hadoop, MapReduce, Spark, Pig and Hive service.

Cloud Dataprep, a data cleansing and preparation tool, is currently in private beta testing.

Hewlett Packard Enterprise

Top Executive: President and CEO Meg Whitman

Hewlett Packard Enterprise's big data platform includes software (IDOL data analytics and Vertica advanced analytics) and information management and governance tools; hardware such as the HPE ConvergedSystem for Big Data and HPE Apollo; and a range of big data services.

The company also develops turnkey platform systems such as the HPE ConvergedSystem for SAP HANA and HPE ConvergedSystem 300 for Microsoft Analytics.

Hortonworks

Top Executive: CEO Rob Bearden

Hortonworks offers a range of big data management products built around its Hortonworks Data Platform (HDP), which is itself based on the Apache Hadoop system. It also develops the Hortonworks DataFlow software that collects and analyzes streaming data in real time.

In April Hortonworks launched HDP version 2.6 with the ability to provide real-time operational analytics using information stored in data lakes.

IBM

Top Executive: President and CEO Virginia Rometty

IBM's big data initiatives have increasingly centered on its Watson supercomputer, including the Watson Data Platform and Watson Analytics.

Individual big data products include SPSS predictive analytics software, the DB2 database, Cognos Analytics on Cloud, BigInsights Hadoop software, and machine learning technology.

In April IBM enhanced Watson's data analysis and discovery capabilities on the IBM Cloud by expanding the functionality of the Watson Discovery Service and introducing the Watson Company Profiler experimental platform.

Infoworks

Top Executive: CEO Amar Arsikere

Infoworks provides a Hadoop-based data warehouse system that can run either on premise or, more recently, in the cloud.

Infoworks, founded in 2014 and based in San Jose, closed on $15 million in Series B funding in March.

MapR Technologies

Top Executive: CEO Matt Mills

MapR Technologies develops a converged data platform that integrates Hadoop, Spark and the Apache Drill SQL engine with real-time database capabilities, event streaming and scalable storage.

With the Internet of Things expected to be a major driver of demand for big data technologies, MapR in March announced a small-footprint edition of its platform that can capture, process and analyze data closer to the IoT devices that generate it.

Based in San Jose, MapR was founded in 2009.

Oracle

Top Executives: CEOs Safra Catz and Mark Hurd

Oracle is one of the long-time leaders in the big data space with its flagship Oracle Database and MySQL relational databases, data management and integration tools, data warehousing technology, and data analytics and visualization software.

Oracle's big data platforms include hardware systems such as the Big Data Appliance, the Exadata Database Machine and the Exalytics In-Memory Machine.

In April Oracle signed a deal to acquire Moat, a developer of a cloud platform for marketing data and analytics. Oracle plans to add the technology to the Oracle Data Cloud system.

Ryft Systems

Top Executive: CEO Des Wilson

Ryft develops a line of hardware compute accelerators that use FPGA and x86 processors, a library of data discovery algorithms, and other technology to create a high-performance analytics engine. The company says its systems, running as a hosted system or on premise, outperform other analytics platforms by a factor of 100 or more.

Ryft, founded in 2000 and based in Rockville, Md., recently partnered with AWS to deliver Ryft Virtual, a heterogeneous cloud-based version of its Ryft One system, for customers of AWS EC2 F1 instances.

Snowflake Computing

Top Executive: CEO Bob Muglia

Startup Snowflake Computing began offering its cloud-based Snowflake Elastic Data Warehouse service nearly two years ago, providing an alternative to traditional on-premise data warehouse systems that tend to be complex, expensive and time-consuming to build.

On April 5 Snowflake closed on $100 million in Series D funding, bringing its total financing to $205 million.

The San Mateo, Calif.-based Snowflake, founded in 2012, said that in its fiscal year ended Jan. 31, the company nearly doubled its customer base and increased total customer data storage by 300 percent.

Teradata

Top Executive: President and CEO Victor Lund

Teradata was in many respects the original data warehouse system vendor, developing hardware/software systems specifically for data warehouse applications, in contrast to competitors that adapted their transaction processing systems to function as data warehouses.

Today Teradata, based in Dayton, Ohio, offers a line of purpose-built data warehouse platforms running the company's Teradata Database, business analytics software and other big data products. The systems run on premise and in private clouds: In March the company launched the Teradata IntelliCloud packaged software-as-a-service offering.

Teradata was in the leader quadrant in the 2017 Gartner Magic Quadrant for Data Management Solutions for Analytics and achieved the highest position for "completeness of vision."