2016 Big Data 100: 30 Coolest Data Management Vendors

Data Management Vendors

The total amount of digitally stored data is expected to reach 40 zettabytes by 2020, according to market researcher IDC, representing a 50-fold increase since 2010. That's 40 trillion gigabytes, or 5.2 terabytes for every man, woman and child on Earth.
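As a rough sanity check on those figures, the short Python sketch below redoes the arithmetic. It assumes decimal (SI) units, which the forecast as quoted here does not specify, and the variable names are illustrative only.

```python
# Assumption: decimal (SI) units, i.e. 1 GB = 10**9 bytes,
# 1 TB = 10**12 bytes and 1 ZB = 10**21 bytes.
GB = 10**9
TB = 10**12
ZB = 10**21

# Forecast quoted above: 40 zettabytes by 2020.
total_bytes = 40 * ZB

# The same volume expressed in gigabytes.
total_gb = total_bytes // GB

# A 50-fold increase since 2010 implies this 2010 baseline, in zettabytes.
implied_2010_zb = 40 / 50

# Head count implied by the 5.2 TB-per-person figure.
implied_population = total_bytes / (5.2 * TB)

print(f"{total_gb:,} GB in total")
print(f"{implied_2010_zb} ZB implied for 2010")
print(f"{implied_population / 1e9:.2f} billion people implied")
```

The last number is simply the head count implied by the per-person figure, not a population estimate taken from IDC.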

Managing huge – and ever-growing – volumes of data is a significant challenge for businesses and IT executives. Thankfully, there have also been rapid developments in data management technology to help with that challenge.

The CRN editorial team has created the fourth annual Big Data 100, identifying vendors that have demonstrated an ability to innovate in bringing to market products and services that help businesses work with big data. Here are 30 data management technology companies offering everything from next-generation database software to advanced data integration technology.


Alation

Top Executive: CEO Satyen Sangani

Alation exited stealth last year, debuting its enterprise data-accessibility platform that's designed to help people more easily find, understand, use and govern their data for making faster and better decisions.

Alation said its platform combines elements of machine learning with human insight to capture information about what the data describes, where it comes from, who's using it and how it's being used. The company's key executives and technologists came from Oracle, Google, Apple and other IT companies.

Based in Redwood City, Calif., Alation was founded in 2012.


AtScale

Top Executive: CEO Dave Mariani

AtScale's software makes it possible to use popular business intelligence tools such as Tableau and Qlik to access data stored in Hadoop clusters. The technology creates a semantic layer between Hadoop and third-party tools, essentially turning Hadoop into an online analytical processing server that can be tapped for multidimensional analysis.

In March the company unveiled the AtScale Intelligence Platform 4.0, which introduced some 100 new features and system enhancements, including the new AtScale Hybrid Query Service that natively supports both SQL and MDX query languages used by business intelligence tools.

Founded in 2013, AtScale is based in San Mateo, Calif.


Attunity

Top Executive: CEO Shimon Alon

Attunity develops integration software that enables access, management, sharing and distribution of data across heterogeneous enterprise platforms and cloud systems.

In February, Burlington, Mass.-based Attunity launched Attunity Compose, an automated data warehouse system that's designed to speed time-to-analytics through the use of a model-based, agile approach that supports the complete cycle of building, populating and maintaining a data warehouse.

Bedrock Data

Top Executive: CEO John Marcus

Bedrock Data offers a data integration Platform-as-a-Service that constantly reviews and automatically synchronizes data in IT systems to ensure consistent records across CRM, marketing automation, email, customer support and finance systems.

Boston-based Bedrock Data, founded in 2012, targets its iPaaS offering toward small and midsize businesses.


Confluent

Top Executive: CEO Jay Kreps

Confluent offers a data platform, based on the Apache Kafka open-source messaging system, for collecting, managing and analyzing streaming data in real time – a major challenge in the worlds of big data and the Internet of Things.

Confluent was launched in September 2014 to provide technology and services that help businesses adopt and use Kafka. The company was co-founded by Jay Kreps, Neha Narkhede and Jun Rao, who created Kafka while working at LinkedIn.

In March the company launched the Confluent Partner Program to support the growing ecosystem of technology developers, systems integrators and consultants in the Kafka and Confluent market.


Couchbase

Top Executive: President, CEO Bob Wiederhold

Couchbase and other vendors in the crowded NoSQL database arena position their products as alternatives to the relational databases that dominate most data centers today. Their next-generation technologies can better handle the huge volumes of data and different data types that businesses are increasingly working with.

Couchbase's products include the Couchbase Server and Couchbase Mobile.

In March Couchbase, founded in 2011 and based in Mountain View, Calif., raised $30 million in Series F financing.


Databricks

Top Executive: CEO Ali Ghodsi

Databricks was founded in 2013 by the creators of Apache Spark, the open-source big data processing engine that turbocharges Hadoop. The San Francisco-based company develops commercial software and services around Spark, including the Databricks Cloud end-to-end hosted data platform that launched in June 2015.

In February Databricks unveiled the beta release of the community edition of its platform, a move designed to help businesses learn about working with Spark for free. General availability is expected by midyear.


DataStax

Top Executive: CEO Billy Bosworth

DataStax markets a commercial version of Apache Cassandra, the open-source NoSQL database designed to manage huge volumes of data across multiple data centers and the cloud, as well as providing a line of supporting administration, management, development and analysis tools.

In April DataStax expanded its product lineup by introducing DataStax Enterprise Graph, a scalable real-time graph data platform built for cloud applications that need to manage highly connected data.

Based in Santa Clara, Calif., DataStax was founded in 2010.


DataTorrent

Top Executive: CEO Phu Hoang

DataTorrent markets a big data platform for unified stream and batch processing on Hadoop that enables users to process, monitor, analyze and act on big data in real time.

In April the Apache Software Foundation said that Apache Apex, an open-source implementation of the DataTorrent RTS core engine, had been designated a top-level project.

Based in San Jose, DataTorrent was founded in 2012.


EnterpriseDB

Top Executive: President, CEO Ed Boyajian

EnterpriseDB markets an Oracle-compatible relational database system based on the open-source PostgreSQL database, along with security and performance enhancements, management tools, and other support and services.

In January Bedford, Mass.-based EnterpriseDB launched EDB Postgres Advanced Server 9.5 with improved performance and scalability, enhanced security, the ability to handle more complex workloads, and expanded facilities for migration from Oracle. The company also shipped EDB Postgres Enterprise Manager 6.0.


Informatica

Top Executive: CEO Anil Chakravarthy

Informatica is a veteran developer of big data technologies including tools for master data management, data and cloud integration, and data quality. The vendor's ETL (extract, transform and load) software has long been a key component of many businesses' data integration practices.

The company was positioned as a leader in Gartner's 2016 Magic Quadrant for Enterprise Integration Platform report and was ranked No. 1 by Gartner in 2015 global market share for Integration Platform-as-a-Service (iPaaS).

In March the company introduced Informatica Intelligent Data Lake, a new data management system for turning big data into more valuable, useful information assets.


Infoworks

Top Executive: CEO Amar Arsikere

Startup Infoworks exited stealth mode in 2015 with the launch of its Infoworks Dynamic Data Warehousing platform, which runs on a single Hadoop cluster. The software automatically crawls enterprise databases, ingests data into Hadoop and keeps it synchronized. It also organizes the data into data warehouses, cubes and other data models for multiple use cases.

In September San Jose, Calif.-based Infoworks received $5 million in Series A financing. The company plans to develop additional software that runs on the DDW platform.


JethroData

Top Executive: CEO Eli Singer

Hadoop is a notoriously difficult system on which to run interactive queries. JethroData developed a SQL-on-Hadoop engine that acts as an acceleration layer, speeding up big data queries from business intelligence tools such as Tableau, Qlik and MicroStrategy against data sources such as Hadoop or Amazon S3.

The New York-based company debuted Jethro 1.6.0 in April with new concurrency and range-index features.


MarkLogic

Top Executive: President, CEO Gary Bloom

MarkLogic offers an enterprise NoSQL database built with a flexible data model to store, manage, query and search structured and unstructured data and facilitate heterogeneous data integration.

MarkLogic, founded in 2001 and based in San Carlos, Calif., stunned the industry last May when it received $102 million in Series F financing. The company is using the new funding to accelerate its global expansion to Europe, Japan and Asia-Pacific.


MemSQL

Top Executive: CEO Eric Frenkiel

San Francisco-based MemSQL develops a distributed in-memory database that can process transactions and run analytics in real time using SQL.

In March the company debuted MemSQL 5 with a range of new technologies and enhanced capabilities to improve the software's performance on database, data warehouse and streaming workloads. The new release can merge transactions and analytics into a single database through its hybrid transaction/analytical processing technology that supports OLTP and OLAP queries.

In April MemSQL, founded in 2011, raised $36 million in Series C financing.


MongoDB

Top Executive: President, CEO Dev Ittycheria

MongoDB develops a NoSQL database that, like competing NoSQL databases, positions itself as an alternative to traditional relational database systems that struggle to meet the demands of today's big data environments.

In November MongoDB launched MongoDB 3.2 with new data storage engines and data governance features, capabilities that the company expects will extend the software's potential market for enterprise-class applications.

In March MongoDB, based in New York and Palo Alto, Calif., formed a partner advisory council made up of executives from 24 of MongoDB's 1,000-plus partner companies.

Neo Technology

Top Executive: CEO Emil Eifrem

Neo develops the Neo4j graph database, a type of NoSQL database that uses graph theory to map, store and query data relationships. Graph databases are generally considered to work more quickly with associative data sets and scale more easily to handle large data sets.

Neo was founded in 2007 and is based in San Mateo, Calif. In April the company debuted Neo4j 3.0, a release the company expects will help the database gain traction for mainstream applications with its scalability, new language drivers and other new development functionality.


Paxata

Top Executive: CEO Prakash Nanduri

Paxata's Adaptive Data Preparation platform, built on Apache Spark and optimized to run in Hadoop environments, provides data integration, data quality, semantic enrichment, collaboration and governance capabilities.

The latest release of the Paxata platform improves the software's ability to provide users with connected information through advanced "filtergrams" for comprehensive data profiling. Also new is granular searching ability across columns of wide data sets and new options for data discovery with statistical selections. The release also includes new IT controls to improve system governance, security and scale.

Paxata, based in Redwood City, Calif., was founded in 2012.


Qubole

Top Executive: CEO Ashish Thusoo

Qubole develops the Qubole Data Service, a unified interface that helps users analyze data stored in cloud systems such as Amazon Web Services, Google Cloud and Microsoft Azure.

Qubole, founded in 2011 and based in Mountain View, Calif., raised $30 million in Series C funding in January.

Redis Labs

Top Executive: CEO Ofer Bengal

Redis Labs supports the open-source Redis high-performance, in-memory NoSQL "data structure store" that can perform as a database, caching layer or message broker for fast transactions and real-time analytics.

In January Redis Labs, based in Mountain View, Calif., said its revenue grew 300 percent in 2015. Gartner also positioned Redis Labs as a leader in its 2015 Magic Quadrant for Operational Database Management Systems report.


Reltio

Top Executive: CEO Manish Sood

Reltio Cloud, launched in March last year, combines aspects of metadata management and NoSQL graph databases to create a platform for developing enterprise data-driven applications.

Based in Redwood Shores, Calif., Reltio was founded by the team that developed Informatica's MDM technology, along with big data folks from IBM, Salesforce and other companies.


SnapLogic

Top Executive: CEO Gaurav Dhillon

The SnapLogic Integration Cloud provides "elastic integration" for connecting enterprise applications with on-premises and cloud-based data, putting it squarely in the middle of the heavily competitive Integration Platform-as-a-Service arena.

San Mateo, Calif.-based SnapLogic reported in January that annual bookings more than doubled in 2015 and the company added more than 300 customers through direct and indirect sales.

SnapLogic was co-founded in 2006 by Gaurav Dhillon, former CEO and co-founder of Informatica. In April former Informatica chief technology officer James Markarian joined SnapLogic as CTO.

Splice Machine

Top Executive: CEO Monte Zweben

Splice Machine develops a Hadoop-based relational database that is more scalable than traditional relational database systems from Oracle, Microsoft and others, but still provides a familiar SQL interface for application developers – unlike many next-generation NoSQL databases.

In November 2015 the company debuted Splice Machine 2.0 with the ability to simultaneously perform transaction processing and business analytics tasks.

In January San Francisco-based Splice Machine, founded in 2012, received $9 million in Series C financing.


Striim

Top Executive: President, CEO Ali Kutay

Striim, pronounced "stream" with the two i's standing for integration and intelligence, was founded in 2012 by former executives from GoldenGate Software, Oracle, Informatica, WebLogic and other big-name data management companies.

The Palo Alto, Calif.-based company's software combines streaming data integration and streaming operational intelligence in one system. That makes possible continuous query/processing and streaming analytics.

Striim raised $10 million in additional financing in March, bringing its total Series B funding to $30 million.


Talena

Top Executive: CEO Nitin Donde

In March startup Talena debuted its Talena Rx predictive analytics software that incorporates machine-learning algorithms and data visualization to better administer big data management workloads and more accurately predict data availability.

The software also offers "active copy analytics" capabilities that businesses can use to turn idle backed-up data into useful assets.

Founded in 2013, Talena is based in San Jose, Calif.


Talend

Top Executive: CEO Mike Tuchen

Talend develops a range of open-source software for data integration, master data management, data quality management and other big data tasks.

Originally founded in Paris, France, Talend today is based in Redwood City, Calif.

Talend has been stepping up its channel efforts, especially in Europe where it launched a VAR program in March 2015. In December the company named Michael Pickett to the new post of vice president of business development and partners ecosystems to accelerate the company's channel efforts.


Tamr

Top Executive: CEO Andy Palmer

Cambridge, Mass.-based Tamr developed a data unification platform that transforms "dark, dirty and disparate data" from hundreds and even thousands of data sources both inside and outside an organization into clean, connected data. The technology, for example, automatically catalogs all metadata associated with internal and external data sources and stores it in a central location.

In March the company announced that its product was compatible with Apache Spark.

Database industry veterans Andy Palmer and Michael Stonebraker started Tamr in 2013.


Trifacta

Top Executive: CEO Adam Wilson

Trifacta develops "data wrangling" software for transforming raw, complex data into clean, structured formats for analysis – one of the biggest challenges in data analysis processes.

In March the San Francisco-based company launched the Photon Compute Framework, new technology at the core of the Trifacta software's user interface that provides users with a rich, interactive data exploration and transformation experience when working with large, in-memory data sets.

Trifacta was founded in 2012.


VoltDB

Top Executive: President, CEO Bruce Reading

VoltDB develops an in-memory SQL database that combines streaming analytics with transaction processing capabilities in a single platform. Businesses use the software to develop applications that process streaming data the instant it arrives to make immediate decisions.

VoltDB was founded in 2009 and is based in Bedford, Mass.


Xplenty

Top Executive: CEO Yaniv Mor

Another player in the data integration platform market, Xplenty offers a cloud-based data integration platform that pulls together structured and unstructured data without any coding work. Running on Hadoop, Xplenty positions itself as an alternative to legacy ETL tools.

In January Xplenty added to its platform the ability to integrate data from Salesforce.com, Mixpanel, Intercom, Slack and PagerDuty.

Launched in 2011, Xplenty has headquarters in San Francisco and Tel Aviv, Israel.