The 10 Coolest Big Data Startups Of 2014 (So Far)

Big Data's Rising Stars

Industry hype around big data continued in full force in 2014. And, vying for a piece of what IDC expects to be a $32.4 billion market by 2017, a number of big data startups either burst onto the scene or continued to fine-tune their strategies.

Ranging from companies whose big data solutions are helping doctors better treat cancer, to those whose platforms promise to drastically simplify big data search and analytics, here are 10 of the coolest big data startups of the year (so far).


SumAll

CEO: Dane Atkinson

There's no denying that the dawn of social media has flipped traditional marketing on its head. And SumAll, a New York-based big data startup, has built a business around helping organizations make the most out of their social marketing efforts.

SumAll offers an online analytics platform that lets users visualize data from 42 (and counting) social media and e-commerce sites, including Facebook, Twitter, eBay and Instagram, in one intuitive, interactive chart. The company rolled out several stand-out new features this year, including an alert system for Twitter that pings users when they get, for instance, a certain number of retweets or mentions (good or bad).

Also cool about SumAll is the fact that it allocated 10 percent of its ownership to a non-profit that aims to use data analytics for "social good."


Luminoso

CEO: Dr. Catherine Havasi

It's been a big year so far for Luminoso, a Cambridge, Mass.-based startup specializing in text analytics.

As a spin-out of the Massachusetts Institute of Technology (MIT) Media Lab, Luminoso leverages what it calls "the world's first cloud-based, massively multilingual, scalable solution" for understanding and analyzing text. The purpose of the platform, Luminoso says, is to turn big data into big insights, helping organizations understand how customers really feel about their company or product by deriving meaning from even the most indirect language or subtlest of hints.

Luminoso garnered a lot of attention this year when it was selected by Sony to fuel the tech giant's "One Stadium Live," an online portal compiling all World Cup-related social media content from Twitter, Facebook, and Google+. In June, the startup also nabbed $6.5 million in Series A funding.

Flatiron Health

Co-Founders: Nat Turner and Zach Weinberg

Founded in 2012 by former Google employees Nat Turner and Zach Weinberg, Flatiron Health is harnessing the power of big data analytics to help doctors better understand, and treat, one of the world's most complex diseases: cancer.

Flatiron Health, based in New York, is developing OncologyCloud, an advanced data platform that is 100 percent focused on oncology. The idea is that the platform aggregates and transforms critical data from electronic medical records and billing systems in real time to provide a comprehensive view of each patient's experience in the oncology office.

The need for a platform like OncologyCloud, Flatiron says, is great; according to the company's website, only 4 percent of cancer patients in the U.S. participate in clinical trials, meaning the industry isn't learning from the other 96 percent of patients.

Earlier this year, Flatiron announced a $130 million round of funding, led by Google Ventures.


Domo

CEO: Josh James

Aiming to bring the right information, at the right time, to enterprise users' fingertips, Domo offers a cloud-based platform designed to give users real-time access to data scattered throughout different sources via a single dashboard.

Founded in 2011, Domo says its platform can quickly derive structured and unstructured data from almost any source, whether a spreadsheet, a database, or a social media site.

In February the company raised $125 million in Series C financing, doubling its total venture funding. At the time, the company said its annual growth was "far exceeding" 100 percent and that it had signed roughly 500 customers.

Alpine Data Labs

CEO: Joe Otto

Not everyone has the coding or analytical know-how to glean insights from large, complex data sets. Enter Alpine Data Labs, one of the better-known big data startups dedicated to bringing predictive analytics to the masses.

Founded in 2010, San Francisco-based Alpine Data Labs offers a platform that lets users create analytical queries using a simple and familiar drag-and-drop approach. The platform works with both Hadoop-based data sources and traditional relational databases. It also has built-in collaboration features that let team members, no matter how far apart, work together on a single predictive analytics model.

Alpine Data Labs in November raised $16 million in Series B venture funding, bringing its total funding to $23.5 million.


Altiscale

CEO: Raymie Stata

Altiscale, formed in 2012 by former Yahoo CTO Raymie Stata, is a big data startup that prides itself on having developed the industry's first cloud service to run Apache Hadoop, the open-source platform for managing big data.

According to the Palo Alto, Calif.-based company, its Hadoop-as-a-Service gives organizations an on-demand, pay-as-you-go model for consuming the Hadoop big data platform. The pricing structure allows companies to pay only for what they use on a monthly basis, eliminating the need for major up-front capex spend.

Altiscale also has a proactive Hadoop help desk that not only assists with the operation of the hardware, but monitors jobs running in Hadoop and ensures the software is always up to date. The company announced general availability of its Hadoop-as-a-Service platform in January.


Tamr

CEO: Andy Palmer

Another hot big data startup to emerge from MIT is Tamr, a Cambridge, Mass.-based company promising to dramatically reduce the time and effort required to connect and enrich data sources.

Tamr formally launched in the spring of this year and is backed by a number of investors, including Google Ventures and New Enterprise Associates. The Tamr platform leverages machine learning algorithms to identify data sources, understand the relationships between them, and then curate massive amounts of siloed data.

Tamr's founders include Andy Palmer and Michael Stonebraker, the big data gurus who previously co-founded Vertica Systems, the big data startup acquired by HP in 2011. Tamr in June opened up a second office in San Francisco.


Cloudera

CEO: Tom Reilly

One of the most prominent big data startups to emerge over the past few years is Cloudera, which, along with competitors Hortonworks and MapR Technologies, offers its own distribution of Apache Hadoop. Cloudera enhances the open-source Hadoop platform with software add-ons, management tools, services and other offerings that make it easier for customers and partners to work with the platform.

The Palo Alto, Calif.-based company has been busy so far in 2014. In addition to rolling out the latest version of its Hadoop platform -- Cloudera Enterprise 5 -- Cloudera in April bulked up its Connect Partner Program with new resources, including training and certification, for its more than 900 solution provider partners.

In June, Cloudera announced the closing of a $900 million round of funding, and named Kim Stevenson, corporate vice president and CIO at Intel, to its board of directors.


DataGravity

CEO: Paula Long

DataGravity describes itself as an early stage company with a mission to turn data into information. The Nashua, N.H.-based company, expected to launch its first product sometime this year, says it's developing a technology that will transform stored data into easily digestible information without the need for complex software packages.

In February, DataGravity took the wraps off an early access channel program during VMware's Partner Exchange event, aimed at giving solution providers a sneak peek at DataGravity's solution. Early partners in the program include Axis Business Solutions and Qumulus Solutions.

DataGravity is led by CEO Paula Long, who was a co-founder of EqualLogic, the storage company acquired by Dell in 2008.


Elasticsearch

CEO: Steven Schuurman

Elasticsearch has generated a lot of interest over the past year and a half with its open-source search solution that's purpose-built for big data.

Elasticsearch's platform, which the company has dubbed the "most advanced search and analytics engine" on the planet, is designed to help users quickly search through massive amounts of data, "scrub" that data clean, and then visualize it through a number of analytics tools.

Elasticsearch says it's seeing massive growth in the market. In June, when the company closed a $70 million funding round, Elasticsearch said the adoption of its three core products -- Elasticsearch, Logstash and Kibana -- had grown three-fold, climbing to more than 8 million total downloads.