Big Data 100: Data Management

Big Data Data Management Vendors

Businesses are struggling with the rapidly increasing volume, speed and variety of information being generated today – what's come to be known as big data. Companies are seeking technologies that not only help them process and manage all that data, but tap into it to develop insights about the markets they compete in as well as their own performance within those markets.

Recognizing that need, we present the inaugural Big Data 100 list, developed by the CRN editorial team, identifying vendors that have demonstrated an ability to innovate in bringing to market products and services that help businesses manage big data. Here are 25 data management companies, including industry stalwarts and startups.

10gen

Location: New York
Top Executive: CEO Max Schireson

10gen develops and provides commercial support for MongoDB, the open-source document database designed to address performance and scalability issues with relational database technology. Co-founded in 2007 by DoubleClick founder and CTO Dwight Merriman, 10gen has raised more than $81 million in venture funding.

Actian

Location: Redwood City, Calif.
Top Executive: President and CEO Steve Shine

Actian provides a number of technologies for handling big data, including the Vectorwise analytical database, Ingres OLTP database, Action Apps business applications, and OpenRoad rapid application development system. Actian is now acquiring Pervasive Software, a provider of big data analytics software, for $162 million.

Actifio

Location: Waltham, Mass.
Top Executive: CEO Ash Ashutosh

Actifio offers tools for managing "copy data," the exploding volumes of information copied by data protection, backup, disaster recovery, analytics, business continuity, test and development, and other systems. Actifio says its technology reduces licensing and hardware costs, cuts bandwidth usage, and shrinks the data storage footprint.

Attunity

Location: Burlington, Mass.
Top Executive: CEO Shimon Alon

Attunity provides software that businesses use to enable data access, sharing and distribution across heterogeneous systems -- including cloud systems. Its lineup includes tools for data replication, change data capture, data connectivity, file replication, managed file transfer and cloud data delivery.

Basho Technologies

Location: San Francisco
Top Executive: President and CEO Gregory Collins

Basho develops the Riak open-source, NoSQL distributed database that automatically redistributes data as IT systems scale up and keeps it available when physical machines fail. Basho is targeting fast-growing Web businesses, cloud system operators and large companies.

Citus Data

Location: San Francisco
Top Executive: CEO Umur Cubukcu

Citus Data's CitusDB is a distributed database built on the battle-tested PostgreSQL database that runs analytical queries over very large data sets. A key selling point is the system's ability to run SQL queries on data in Hadoop clusters without loading it into the database, making real-time SQL queries on Hadoop-stored data possible.

Couchbase

Location: Mountain View, Calif.
Top Executive: President and CEO Bob Wiederhold

Couchbase is the company behind the open-source, NoSQL database of the same name that's popular for interactive Web and mobile applications. While supporting the Couchbase community, the company pays the bills by selling commercially licensed and supported versions of the database.

DataStax

Location: San Mateo, Calif.
Top Executive: CEO Billy Bosworth

DataStax markets the DataStax Enterprise big data platform that combines the Apache Cassandra NoSQL database with Hadoop and Apache Solr, the latter an open-source enterprise search technology. In January the company debuted DSE 3.0, which it touts as having the most comprehensive security of any NoSQL database.

Datawatch

Location: Chelmsford, Mass.
Top Executive: President and CEO Michael Morrison

Datawatch offers a range of "information optimization" and unified information management software that helps businesses combine structured, unstructured and semistructured data and make it available to analytical applications and other big data systems. The company recently announced Hadoop support in its Datawatch Data Pump software.

EnterpriseDB

Location: Bedford, Mass.
Top Executive: President and CEO Ed Boyajian

EnterpriseDB develops enterprise-scale software and services based on the open-source PostgreSQL database. EnterpriseDB positions itself as a lower-cost alternative to Oracle's relational database and the MySQL open-source database that Oracle now owns after it acquired Sun Microsystems.

Garantia Data

Location: Santa Clara, Calif.
Top Executive: CEO Ofer Bengal

Garantia Data provides advanced in-memory, NoSQL cloud storage services, specifically for hosting Redis and Memcached datasets. Redis is an open-source, in-memory database while Memcached is a general-purpose, distributed memory caching system. The services just became generally available in February, with monthly subscriptions on a per-GB basis.

Informatica

Location: Redwood City, Calif.
Top Executive: CEO Sohaib Abbasi

Informatica is perhaps the grande dame of data integration, founded in 1993, long before the term "big data" was coined. The vendor develops a broad range of software for enterprise data integration, cloud data integration, data quality, data replication, master data management, data virtualization and information life cycle management, among others.

MarkLogic

Location: San Carlos, Calif.
Top Executive: President and CEO Gary Bloom

MarkLogic's NoSQL technology-based flagship MarkLogic Server works with popular business analytics software such as IBM Cognos and Tableau, and supports the Hadoop Distributed File System. Other big data-related products, including search and application development tools, round out its offerings.

MemSQL

Location: San Francisco
Top Executive: CEO Eric Frenkiel

MemSQL developed an in-memory database of the same name that the company says processes big data applications 30 times faster than other systems. The company is targeting its technology toward such markets as financial services and digital advertising where fast analysis of machine data is crucial.

Neo Technology

Location: San Mateo, Calif.
Top Executive: CEO Emil Eifrem

Neo Technology developed the Neo4j graph database, a category of NoSQL database technology that uses graph structures rather than indexes to quickly model and query connected data. A marquee customer is Cisco, which reportedly replaced a master data management system running Oracle's Real Application Clusters database with Neo4j.

Rainstor

Location: San Francisco
Top Executive: CEO John Bantleman

Rainstor develops database software and related products for managing big data. The company's database, which runs natively on Hadoop, includes advanced data compression technology that the company says cuts data storage costs by up to 90 percent.

Recommind

Location: San Francisco
Top Executive: CEO Bob Tennant

Recommind develops software for unstructured data management, governance and analysis. The company's CORE (Context Optimized Relevancy Engine) platform automatically accesses, organizes and analyzes large volumes of information from many sources inside and outside a corporate network.

Revelytix

Location: Cockeysville, Md.
Top Executive: CEO Michael Lang

Revelytix just began offering early access availability of its Loom Dataset Management for Hadoop data integration software. The company says the technology makes it easier for data scientists to work with Hadoop, manage big data files, and build analytical applications. The tools also provide companies with data tracking and auditing capabilities.

Simba Technologies

Location: Vancouver, B.C.
Top Executive: President and CEO Amyn Rajan

Simba Technologies develops data access, connectivity and analysis software for relational and multidimensional data sources based on such standards as ODBC, JDBC, SQL and XML. The company, for example, developed ODBC drivers for Apache Hive and HBase, and it worked with Intel to provide ODBC access to the Intel Distribution for Apache Hadoop.

Splice Machine

Location: San Francisco
Top Executive: CEO Monte Zweben

While NoSQL databases have been getting a lot of attention, Splice Machine is bucking that trend with a SQL-compliant database designed for big data applications. The company says its Splice SQL Engine, built on the Hadoop stack, offers the same scalability benefits as NoSQL without having to rewrite SQL-based applications and business analytics tools.

Syncsort

Location: Woodcliff Lake, N.J.
Top Executive: CEO Flavio Santoni

Syncsort offers a range of data integration and protection software and services. Products such as DMExpress work with systems including HP's Vertica and EMC's Greenplum to speed up big data processing. The company also contributes data integration technology to the Apache Software Foundation's Hadoop project.

Talend

Location: Los Altos, Calif.
Top Executive: CEO Bertrand Diard

Talend develops open-source and commercial software for working with big data, including data integration, data quality, metadata management and data governance tools. The Talend Big Data Platform combines multiple tools with such capabilities as a Hadoop job scheduler and NoSQL connectivity.

Unravel

Location: Durham, N.C.
Top Executive: CEO Kunal Agarwal

Unravel provides optimization tools to help users spend less time worrying about their Hadoop infrastructures and focus on analyzing big data to make better business decisions. The company develops a series of visualization tools such as Profiler, which maps out data flows.

WibiData

Location: San Francisco
Top Executive: CEO Christophe Bisciglia

WibiData leverages Apache Hadoop, HBase and other technologies to manage and analyze huge volumes of data about user behavior, including profile data and log-oriented transactional data. The idea is that all data pertaining to a given user or customer is kept in one place.

XtremeData

Location: Schaumburg, Ill.
Top Executive: President Mike Lamble

XtremeData develops a massively parallel database management system for data warehouse applications that can be deployed either on premises or in the cloud. The SQL database engine can handle hundreds of terabytes of data and is positioned as an alternative to other SQL databases, data warehouse appliances and the Apache Hive data warehouse software.