2016 Big Data 100: 20 Coolest Platform And Tools Vendors

Platform And Tools Vendors

Business analytics and data management software may be the most visible components of a company's big data ecosystem. But underlying those systems are critical platforms, tools, cloud services and other infrastructure that keep everything running.

The CRN editorial team has created the fourth annual Big Data 100 list, identifying vendors that have demonstrated an ability to innovate in bringing to market products and services that help businesses work with big data.

Here are 20 big data platform and tools companies offering everything from hardware servers, to software platforms and applications, to cloud-based services. Some, such as IBM, Hewlett Packard Enterprise and Oracle, have broad product lines that also include analytics, data management and infrastructure technologies for tackling big data.


Altiscale

Top Executive: CEO Raymie Stata

Altiscale, a provider of big data as a service, recently launched Altiscale Insight Cloud, a self-service analytics service that allows business analysts to rapidly query a data lake using familiar business intelligence tools such as Tableau and Excel without heavy involvement from the IT department.

In March Altiscale, founded in 2012 and based in Palo Alto, Calif., established a strategic alliance with Tableau through which Altiscale customers can use Tableau's data visualization software in conjunction with Altiscale's services for data discovery applications.

Amazon Web Services

Top Executive: CEO Andy Jassy

Amazon Web Services has become the de facto data management system for many business applications. So it's no surprise that AWS has seen meteoric growth and is expected to reach $10 billion in revenue this fiscal year after hitting $7.9 billion in fiscal 2015.

Last month AWS launched a database migration service that moves customers' on-premises Oracle, SQL Server, MySQL and PostgreSQL databases to the AWS cloud.

AWS also offers a host of business analytics services including the Redshift data warehousing, QuickSight business intelligence, Kinesis real-time streaming data and Elasticsearch services.

BlueData Software

Top Executive: CEO Kumar Sreekanti

BlueData's EPIC software uses Docker container technology to make it easier for businesses to leverage big data by enabling big data as a service in an on-premises model. Businesses can quickly spin up virtual Hadoop or Spark clusters for on-demand access to data, applications and infrastructure at cost savings of up to 75 percent compared with traditional approaches, according to the company.

BlueData, launched in 2012 and based in Santa Clara, Calif., recently debuted a real-time pipeline accelerator system that helps data scientists and analysts build real-time data pipelines using Spark Streaming, Kafka and Cassandra.


BlueTalon

Top Executive: CEO Eric Tilenius

BlueTalon provides data access control software for Hadoop, SQL and other big data environments. The BlueTalon Policy Engine specifically addresses Hadoop's security shortcomings by helping manage data access when departments and even individuals require different levels of authorization to view the same data.

BlueTalon, founded in 2013 and based in Redwood City, Calif., recently won the 2016 Cybersecurity Excellence Award for Data-Centric Security.


Cazena

Top Executive: CEO Prat Moghe

Startup Cazena's big data as a service moves data processing tasks to the cloud with just a few clicks, automating what traditionally is a long, complex process. The vendor's data lake as a service and DataMart as a service can provision and optimize cloud infrastructure and big data technologies such as Hadoop, MPP SQL and Spark.

Founded in 2014 and based in Waltham, Mass., Cazena exited stealth in July 2015. The company has attracted attention – and financing – because CEO Prat Moghe and board members Jit Saxena and Jim Baum were among the founders of Netezza, a pioneering developer of data warehouse appliances that IBM bought in 2010 for $1.7 billion.


Cloudera

Top Executive: CEO Tom Reilly

Cloudera is one of the leading distributors of Hadoop software and developer of related tools and technologies for managing and securing Hadoop clusters.

In April the Palo Alto, Calif.-based company debuted Cloudera Enterprise 5.7, a new edition of the company's flagship data management and analytics platform that's based on Apache Hadoop. Among the enhancements is an average 3X data processing improvement through support for Hive-on-Spark.


DataGravity

Top Executive: CEO Paula Long

DataGravity develops technology that is at the nexus of storage, data management and business analytics.

The company's DataGravity Discovery Series data-aware storage systems are used by IT managers and line-of-business users to store, protect, search and govern their data. At the core of the systems is the DataGravity Engine, which analyzes data as it is ingested, making it easier for administrators and business users to explore and use the data.

DataGravity was founded in 2012 and is based in Nashua, N.H.


Dell

Top Executive: CEO Michael Dell

Dell is a player in the big data arena with its Boomi AtomSphere application integration tools, Statistica predictive analytics software and Toad Data Point data access and analysis product. Statistica, which Dell acquired in 2014, is listed as a leader in the 2016 Gartner Magic Quadrant for Advanced Analytics Platforms report.

Dell is in the process of acquiring storage technology giant EMC for $67 billion in a move that will expand its presence in multiple big data sectors once the acquisition wraps up sometime this year.


EMC

Top Executive: CEO Joseph Tucci

EMC is in the process of being acquired by Dell in a $67 billion deal that's expected to be completed by the end of this year's third quarter.

EMC positions a number of its product lines for big data applications, including its VCE Vblock converged hardware, XtremIO and DSSD storage systems, and the Isilon Data Lake systems.

EMC is based in Hopkinton, Mass.


Google

Top Executive: CEO Sundar Pichai

The Google Cloud Platform includes a number of big data management and analytics tools including the BigQuery analytics data warehouse; Cloud Datalab for large-scale data exploration, analysis and visualization; and Cloud Dataproc, a managed Hadoop, MapReduce, Spark, Pig and Hive service.

Google Cloud Dataflow and Google Cloud Pub/Sub make it easier for developers' code to use huge amounts of data. And Google Cloud Machine Intelligence, currently in alpha development, is a cloud-based machine learning system.

Google is based in Mountain View, Calif.

Hewlett Packard Enterprise

Top Executive: President and CEO Meg Whitman

Hewlett Packard Enterprise’s big data software and services include the Vertica advanced analytics platform; IDOL analytics for unstructured data such as text, audio and video; the Haven cloud-based big data platform; and the ArcSight platform for collecting, storing and analyzing machine data.

In February the company debuted HPE Investigative Analytics, hosted software that leverages machine learning and archiving technology to help financial services companies identify fraudulent behavior.

HPE is based in Palo Alto, Calif.


Hortonworks

Top Executive: CEO Rob Bearden

Hortonworks offers a range of big data management products built around its Hortonworks Data Platform (HDP), which is itself based on the Apache Hadoop system. It also develops the Hortonworks DataFlow software that collects and analyzes streaming data in real time.

For 2015 publicly held Hortonworks reported revenue of $121.9 million, up 165 percent from 2014.

Hortonworks is based in Santa Clara, Calif.


IBM

Top Executive: President and CEO Ginni Rometty

IBM's big data initiatives have increasingly centered on its Watson supercomputer and its cloud-based data discovery and predictive analytics services. But the Cognos Analytics software IBM acquired in 2008 remains a big part of the company's business intelligence software lineup.

In February IBM launched Quarks, open-source technology that embeds streaming analytics capabilities within Internet of Things devices.

IBM was listed among the leaders in the 2016 Gartner Magic Quadrant for Advanced Analytics Platforms report.

The company is based in Armonk, N.Y.

MapR Technologies

Top Executive: CEO John Schroeder

MapR Technologies joins Cloudera and Hortonworks as one of the leading Hadoop distributors with the MapR Converged Data Platform that integrates Hadoop and Spark with real-time database capabilities, event streaming, security and enterprise storage.

In March the company shipped a new release of the Converged Data Platform with extended security and data governance features and improved performance.

MapR is based in San Jose, Calif.


Oracle

Top Executives: Co-CEOs Safra Catz and Mark Hurd

Oracle is one of the longtime leaders in the big data space with its Oracle Database and MySQL relational databases, data management and integration tools, data warehousing technology, and business analytics software.

Oracle's 2010 acquisition of Sun allowed Oracle to expand into the big data hardware arena with such products as the Big Data Appliance, the Exadata Database Machine and the Exalytics In-Memory Machine.

In April Oracle delivered the Exadata X6 Database Machine, based on the latest Intel Xeon processors with improved capacity and performance for online transaction processing and analytical workloads.

Oracle is based in Redwood Shores, Calif.


Pepperdata

Top Executive: CEO Sean Suchter

Managing Hadoop environments can be a challenge. Pepperdata develops software tools for managing Hadoop clusters with hundreds and even thousands of nodes. The technology allows IT to monitor and control system usage to meet service-level agreements, increase data throughput and improve system visibility.

Based in Sunnyvale, Calif., Pepperdata was founded in 2012.

Pivotal Software

Top Executive: CEO Rob Mee

EMC spinoff Pivotal, based in Palo Alto, Calif., offers a number of software products for the big data market within the Pivotal Big Data Suite. The major products include the Greenplum open-source massively parallel data warehouse system, the GemFire in-memory data grid system, and the Pivotal HDB Hadoop system – the latter based on Hortonworks' Hadoop software.

Snowflake Computing

Top Executive: CEO Bob Muglia

Startup Snowflake Computing began offering its cloud-based Snowflake Elastic Data Warehouse service nearly one year ago, providing an alternative to traditional on-premises data warehouse systems that tend to be complex, expensive and time-consuming to build.

The company has actively established technology integration alliances with business analytics software vendors including MicroStrategy and Tableau.

Founded in 2012 and based in San Mateo, Calif., Snowflake Computing is run by former Microsoft executive Bob Muglia.


Syncsort

Top Executive: CEO Josh Rogers

Once largely focused on utility software for mainframe computers, Syncsort has reinvented itself as a big data software company with data transformation and integration technology for Hadoop, Microsoft Windows, Linux and other systems.

The Woodcliff Lake, N.J.-based company's key products include DMX for data integration, DMX-h for managing Hadoop workloads, and Ironcluster Hadoop ETL for Amazon Elastic MapReduce.


Teradata

Top Executive: President and CEO Mike Koehler

Teradata was in many respects the original data warehouse system vendor, developing hardware/software systems specifically for data warehouse applications, in contrast to competitors that adapted their transaction processing systems to function as data warehouses.

Today Teradata, based in Dayton, Ohio, offers data warehouse systems, business analytics software and other big data products. The company was positioned as a leader in the 2016 Gartner Magic Quadrant for Data Warehouse and Data Management Solutions for Analytics report.

Teradata has also operated a marketing applications business, but in April it struck a deal to sell that business to Marlin Equity Partners, choosing instead to focus on its core data and analytics business.