2015 Big Data 100: Data Management

Data Management Vendors

Last year, the total amount of digital data on the planet was estimated to be 4.4 zettabytes, according to IDC's annual Digital Universe report. (A zettabyte equals 1,000 exabytes.) That number is expected to explode by a factor of 10 to 44 zettabytes by 2020.

That's a lot of data to manage. But there's also been a wave of new technologies that help businesses not just effectively manage all that data, but find ways to put it to use and derive value from it.

The CRN editorial team has created the third annual Big Data 100 list, identifying vendors that have demonstrated an ability to innovate in bringing to market products and services that help businesses work with big data. Here are 30 data management technology companies offering everything from next-generation database systems, to advanced data integration technology, to tools for developing applications that take advantage of big data.


1010data

Top Executive: Co-Founder and CEO Sandy Steier

Founded in 2000, 1010data offers its Big Data Discovery platform for data discovery and sharing applications, especially when working with very large datasets. The company has been particularly successful in winning customers for its software and services within the retail, financial service and gaming industries.

In April, the New York-based company debuted Version 8 of Big Data Discovery with the next generation of the platform's QuickApps tools for developing form-based analysis, management dashboards and complex analytical applications.


Actian

Top Executive: President and CEO Steve Shine

Actian positions itself as a leading supplier of SQL analytics for Hadoop with its Actian Analytics Platform. Actian also offers a number of operational database and data integration software products. The Redwood City, Calif.-based company scores as a "visionary" in Gartner's Magic Quadrant for data warehouse and data management solutions for analytics.


Actifio

Top Executive: CEO Ash Ashutosh

Actifio markets a copy data management platform that eliminates the problem of "data sprawl" across a company by creating a single copy of an organization's production data and making it virtually available for backup, disaster recovery, software development and testing, business analytics and archiving purposes.

In February, Actifio unveiled Actifio One, "business resiliency cloud" technology that runs on the company's copy data virtualization technology, and provides a range of data management and protection capabilities in a single application.

The Waltham, Mass.-based company was founded in 2009.


Aerospike

Top Executive: CEO John Dillon

Aerospike develops an open-source NoSQL database for running high-performance applications. The flash-optimized, in-memory database meets ACID (atomicity, consistency, isolation and durability) requirements for reliable transaction processing.

The Mountain View, Calif.-based company, launched in 2009, tapped Dillon for the CEO job in February. Dillon was a sales executive at Oracle in its early days and was Salesforce.com's CEO from 1999 to 2001. More recently, he was CEO at development technology vendor Engine Yard.


Alation

Top Executive: CEO Satyen Sangani

Startup Alation just came out of stealth mode in March, debuting its data-accessibility technology that's designed to help people more easily find, understand, use and govern their data for making faster and better decisions. Early customers include eBay, MarketShare and Inflection.

The Redwood City, Calif.-based company, founded in 2012, also recently scored $9 million in Series A financing -- money the company plans to use to accelerate product development, and expand its sales and marketing efforts.


AtScale

Top Executive: Founder and CEO Dave Mariani

Another startup that recently exited stealth mode, San Mateo, Calif.-based AtScale develops the AtScale Intelligence Platform software that allows commonly used business intelligence tools to access data stored in Hadoop clusters.

AtScale, founded in 2013, is aiming to bridge what's become a major stumbling block for many big data projects. While corporate data is increasingly being collected and stored in Hadoop, there are few straightforward ways to access that data with the kinds of reporting and business analytics tools many information workers use today.


Attunity

Top Executive: CEO Shimon Alon

Attunity is in the information availability business, providing tools for data replication, change data capture, data connectivity, enterprise file replication, managed file transfer and cloud data delivery.

In March, Attunity acquired Appfluent Technology, a developer of data usage analytics software for big data environments, for $18 million. Appfluent's software helps businesses analyze data usage patterns and move large volumes of data and processing workloads to Hadoop.

Founded in 1998, Attunity is based in Burlington, Mass. The company reported 41 percent revenue growth in 2014 to $35.7 million.

Basho Technologies

Top Executive: President and CEO Adam Wray

Basho, launched in 2008 and based in Bellevue, Wash., develops Riak, a distributed, NoSQL database that's designed for tasks that require extremely high availability, and Riak CS, a cloud-based, object-storage database that runs on Riak.

In 2014, Basho went through some difficult times that included the departures of then-CEO Greg Collins and CTO Justin Sheehy. But Basho raised $25 million in Series G financing in January and reported that first-quarter sales bookings were up 65 percent year-over-year.

Citus Data

Top Executive: CEO Umur Cubukcu

Citus Data developed CitusDB, a massively parallel columnar database built on PostgreSQL that the company said can process petabytes of data in seconds. The company targets both transactional and analytical processing tasks with the software.

Citus Data, founded in 2010 and based in San Francisco, released CitusDB 4.0 in March with faster query performance and support for realtime workloads.

ClearStory Data

Top Executive: Founder and CEO Sharmila Mulligan

ClearStory Data's software is designed to make it easier to access internal and external data sources, including corporate databases, Hadoop and the Internet, and use that data to uncover trends and patterns.

ClearStory Data, founded in 2011 and based in Menlo Park, Calif., recently enhanced its cloud-based Intelligent Data Harmonization engine for analysts and business users. The company has more tightly integrated its software with the Apache Spark in-memory analytics engine.


Couchbase

Top Executive: President and CEO Bob Wiederhold

Couchbase competes in the crowded "alternative database" arena against the likes of MongoDB and Cassandra with its Couchbase Server and Couchbase Mobile products, based on open-source, distributed, document-oriented NoSQL database technology that supports massive data volumes in real time.

The company launched Couchbase Server 4.0 in March with multidimensional scaling that the company said boosts performance by independently assigning and scaling index, query and data services to specific servers.

Couchbase was founded in 2011 and is based in Mountain View, Calif.


Databricks

Top Executive: CEO Ion Stoica

Databricks was founded in 2013 by the creators of Apache Spark, the open-source, super-fast big data processing engine that turbo-charges Hadoop -- and that some industry watchers said could even replace the big data platform.

The San Francisco-based company develops commercial software services around Spark, including the Databricks Cloud end-to-end hosted data platform.


DataStax

Top Executive: CEO Billy Bosworth

Santa Clara, Calif.-based DataStax developed a massively scalable data platform based on Apache Cassandra, the open-source distributed database for storing and managing huge amounts of data across multiple data centers and the cloud.

In April, DataStax said it had grown its customer base to more than 500 enterprise customers worldwide, including Netflix, Target, Comcast and ING. The company was founded in 2010.


DataTorrent

Top Executive: Co-Founder and CEO Phu Hoang

DataTorrent develops its DataTorrent RTS realtime stream processing system, based on Hadoop 2.0, that businesses use to process, monitor, analyze and act on big data instantly.

Founded in 2012 and based in Santa Clara, Calif., DataTorrent in April raised $15 million in Series B financing, bringing its total funding to $23.8 million.


EnterpriseDB

Top Executive: President and CEO Ed Boyajian

EnterpriseDB provides software and services around the popular PostgreSQL open-source relational database. The company markets the Postgres Plus Advanced Server that's compatible with the Oracle Database, as well as that vendor's database management and replication tools and other products.

EnterpriseDB was founded in 2004 and is based in Bedford, Mass. In April, EnterpriseDB announced a partnership with Lenovo through which the two companies will jointly market Postgres Plus Advanced Server running on Lenovo servers, including working with a global network of mutual reseller partners.


Hazelcast

Top Executive: CEO Greg Luck

Hazelcast develops in-memory data grid software that evenly distributes data across multiple nodes in a cluster, allowing for better horizontal scaling in both data storage and data processing. The software is offered under an Apache open-source license with Hazelcast developing commercial software and services around the core technology.

The company, founded in 2008 and based in Palo Alto, Calif., raised $11 million in Series B financing in September. The company is using the money to continue its development initiatives to make its data grid technology into a complete enterprise in-memory NoSQL computing system.


Informatica

Top Executive: CEO Sohaib Abbasi

Informatica, launched in 1993, is, perhaps, the preeminent data integration technology company with its data ETL (extract, transform and load) tools, data quality management software and master data management products.

The Redwood City, Calif.-based company has continued to expand its technology lineup, including providing its data integration tools through the cloud as an integration Platform-as-a-Service offering.

In early April, the company announced that it was being taken private by Permira Funds and the Canada Pension Plan in a $5.3 billion deal. For its first quarter ended March 31, Informatica reported that sales grew 3 percent year-over-year to $250.5 million.


JethroData

Top Executive: Co-Founder and CEO Eli Singer

Hadoop is generally not an effective platform for running interactive queries, a problem that means businesses still have to run their enterprise data warehouse systems alongside their Hadoop systems to accomplish everyday business intelligence tasks.

JethroData developed an index-based SQL engine for Hadoop, technology that the company said makes interactive business intelligence with Hadoop possible. A public beta of the software was released in late 2014 and on April 7, after two years of development, the company debuted JethroData 1.0, the first generally available release of the product.

The company was founded in 2012 and is based in Natanya, Israel.


MarkLogic

Top Executive: President and CEO Gary Bloom

MarkLogic, founded in 2001, has been addressing the big data problem with its NoSQL database since before the term "big data" was invented.

In February, the company announced the general availability of MarkLogic 8, the latest iteration of the company's NoSQL database with support for server-side JavaScript and JSON (JavaScript Object Notation), capabilities that make it easier for developers to build and deploy data-intensive, realtime applications on the database.


MemSQL

Top Executive: Co-Founder and CEO Eric Frenkiel

MemSQL develops an in-memory database that enables businesses to process transactions and perform business analytics simultaneously, using both realtime and historical data, in a single database.

Founded in 2011 and based in San Francisco, MemSQL began selling its software two years ago. MemSQL's investors include In-Q-Tel, the strategic investment firm that identifies leading-edge technologies that are of interest to the U.S. intelligence community.


MongoDB

Top Executive: President and CEO Dev Ittycheria

While the market is crowded with NoSQL database vendors, MongoDB, which develops the open-source NoSQL database of the same name (which comes from "humongous"), is among the few that have risen above the noise.

In February, the company launched MongoDB 3.0 with major enhancements to the database's performance and scalability, thanks to the new WiredTiger storage engine.

Founded in 2007, MongoDB has dual U.S. headquarters in New York and Palo Alto, Calif. In January, the company raised $80 million in Series G funding, bringing its total financing to more than $311 million.

Neo Technology

Top Executive: CEO Emil Eifrem

Neo Technology is the San Mateo, Calif.-based company behind the Neo4j graph database. Graph databases, a type of NoSQL database, use graph structures rather than indexes to represent and store data, a design that boosters tout as massively scalable and more efficient for managing and querying highly connected data.

In March, Neo Technology, founded in 2007, debuted Neo4j 2.2 with enhanced read and write performance for building faster graph database applications.

In January, the company raised $20 million in Series C funding, bringing its total financing to $44.1 million.


Paxata

Top Executive: Co-Founder and CEO Prakash Nanduri

Paxata develops "self-service adaptive data preparation" software that simplifies the often tedious work of transforming raw data so that it can be analyzed with business analytics tools. The company positions itself as an alternative to the traditional approach of relying on data warehouse systems built and maintained by IT.

Redwood City, Calif.-based Paxata was founded in 2012. In March, the company struck an alliance with government IT solutions provider Carahsoft Technology Corp. through which Carahsoft will bring the Paxata Adaptive Data Preparation platform to government agencies.


Recommind

Top Executive: CEO Bob Tennant

Recommind develops an enterprise search and data categorization platform that organizes, manages and distributes huge volumes of data from multiple sources. Law firms and the legal community are a key market for the company's technology.

In January, the San Francisco-based company debuted an upgraded release of its cloud-based Axcelerate 5 eDiscovery and analysis platform with new business intelligence capabilities. The company was founded in 2000.


SnapLogic

Top Executive: CEO Gaurav Dhillon

SnapLogic provides data integration Platform-as-a-Service (iPaaS) tools for connecting cloud data sources. The company launched its SnapLogic Elastic Integration Platform in late 2013 and now counts such companies as AstraZeneca, CapitalOne, Cisco and Yelp among its customers.

The San Mateo, Calif.-based company, founded in 2006, has been getting more attention as data integration emerges as a leading hurdle for many big data projects.

SnapLogic raised $20 million in Series D venture financing in October, bringing its total funding to $58.8 million.

Splice Machine

Top Executive: Co-Founder and CEO Monte Zweben

Founded in 2012, Splice Machine developed a full-featured, transactional SQL database on Hadoop that can run operational applications and realtime analytics using Hadoop data. After months of development and beta testing, the company began shipping Release 1.0 of its software in November.

San Francisco-based Splice Machine just announced a partnership with mrc (Michaels, Ross & Cole Ltd.), under which the companies are integrating Splice Machine's Hadoop RDBMS with mrc's m-Power web application development platform, making it easier to build and deploy applications on Hadoop.


Talend

Top Executive: CEO Mike Tuchen

Talend has developed an extensive lineup of open-source data management software, including tools for data integration, data quality management, master data management and business process management, as well as an enterprise service bus.

Founded in Paris, France, in 2005, Talend now has its headquarters in Redwood City, Calif.

In March, the company unveiled a cloud version of its data integration technology targeting cloud and hybrid integration products, the first step in what is expected to be a "cloud first" approach to new product development.


Tamr

Top Executive: CEO Andy Palmer

You have to love a company that has the stated goal of preventing "schema proliferation." Tamr develops enterprise data unification software that businesses use to integrate diverse, siloed data for business analytics tasks.

Based in Cambridge, Mass., the company was founded in 2013 by database industry veterans Michael Stonebraker and Andy Palmer, the pair who started Vertica Systems. Today Palmer is CEO and Stonebraker is CTO.


Trifacta

Top Executive: CEO Adam Wilson

Making big data accessible and usable is a major challenge for many businesses. Trifacta develops technology that's used to transform raw, complex data into clean and structured formats for analysis. Trifacta calls it "data wrangling."

In February, Trifacta, founded in 2012 and based in San Francisco, established integration and ongoing development alliances with MapR Technologies, Waterline Data and Zoomdata.


Xplenty

Top Executive: Founder and CEO Yaniv Mor

Xplenty's cloud-based, Hadoop-as-a-Service platform integrates and transforms structured, semistructured and unstructured data into analyzable data.

Xplenty was founded in 2011 and is based in Tel Aviv, Israel. The company raised $3 million in Series A financing in October, money that's being used to continue developing the company's technology and expand its marketing efforts.