The Coolest Emerging Big Data Companies Of The 2019 Big Data 100

Part 5 of CRN's 2019 Big Data 100 looks at the emerging big data companies you need to know.

Big Data, Big Potential

While this year's Big Data 100 list included many established vendors, more than a quarter of the list are young companies that are pushing the envelope in developing leading-edge technology.

As part of this year's seventh annual Big Data 100, we've included a list of the companies started in 2013 or later that are bringing innovative products and services to market that help businesses and organizations solve their big data challenges.


Anodot

Top Executive: CEO David Drai

Anodot's Autonomous Analytics software uses machine learning to identify behavioral patterns within time series business data and detect outliers and anomalies, providing business insights and helping managers remedy problems and take advantage of opportunities.

Based in Ra'anana, Israel, Anodot was founded in 2014.


AtScale

Top Executive: CEO Chris Lynch

AtScale has developed a data warehouse virtualization platform that utilizes the company's Universal Semantic Layer technology to connect business analytics software to any number of on-premises and cloud data sources as though they were a single system.

In December AtScale, founded in 2013 and based in San Mateo, Calif., secured $50 million in Series D financing, which the company has applied to expanding product development and accelerating its sales and marketing efforts.

In June of 2018 the company named former Vertica CEO Chris Lynch to be its new CEO. Co-founder Dave Mariani, who had been CEO, is now chief strategy officer, overseeing the company's development efforts.


Cazena

Top Executive: CEO Prat Moghe

Cazena offers its Data-as-a-Service platform as a way to simplify big data workloads in the cloud, including business intelligence, DevOps, machine learning, data engineering and cloud migrations.

In April Cazena, founded in 2014 and based in Waltham, Mass., was awarded a U.S. patent for "Intelligent Analytic Cloud Provisioning" technology used to iteratively provision cloud resources to achieve optimal cost performance for analytic workloads.


Confluent

Top Executive: CEO Jay Kreps

Confluent markets the Confluent Platform, based on the Apache Kafka technology for managing streaming, event-based data, along with additional products such as the KSQL streaming SQL engine that enables real-time data processing against Kafka.

Confluent, based in Palo Alto, Calif., was founded in 2014 by the original developers of Kafka. In January the company raised $125 million in Series D funding.

In April the company struck a strategic partnership with Google Cloud to help customers move streaming data workloads to the Google Cloud Platform.


Databricks

Top Executive: CEO Ali Ghodsi

Databricks, founded in 2013 by the creators of the Apache Spark analytics engine for big data, develops the Databricks Unified Analytics Platform that combines data science and data engineering to handle all data analysis processes.

In February Databricks, based in San Francisco, raised an impressive $250 million in Series E funding, bringing the company's valuation to $2.75 billion. The company also said it had experienced a three-fold increase in subscription revenue in the final quarter of 2018 and its annual recurring revenue had reached $100 million in 2018.


Dataiku

Top Executive: CEO Florian Douetteau

Dataiku offers the Dataiku DSS collaborative data science platform that allows teams of data scientists, data analysts and engineers to explore, prototype, build and deploy AI- and machine learning-based systems for such tasks as data management, demand forecasting, spatial analytics, churn analytics, fraud detection, lifetime value optimization and analytical CRM.

In March the company, founded in 2013 and based in New York, launched Dataiku 5.1 with user experience upgrades, more customized coding features and improved regulatory compliance capabilities.


Dremio

Top Executive: CEO Tomer Shiran

Dremio, which launched in 2017, develops an open-source Data-as-a-Service platform that the company says makes data engineers more productive and data consumers more self-sufficient.

Dremio, headquartered in Santa Clara, Calif., reported in March that its annual recurring revenue had increased 10-fold since 2017 and enterprise customer acquisition grew five-fold.


Erwin

Top Executive: CEO Adam Famularo

Erwin, which developed data modeling technology in its early days, has expanded into new data management areas with a primary focus on data governance with its erwin EDGE system for enterprise modeling, data cataloging and data literacy.

In September the Melville, N.Y.-based company acquired metadata management technology developer AnalytiX DS to strengthen its data governance and management portfolio.

Erwin, which has had several owners through its history, was acquired from CA Technologies in 2016 by a private equity firm and now operates as an independent company.


Fivetran

Top Executive: CEO George Fraser

As more businesses move their data warehouse operations to cloud systems like Amazon Redshift and Snowflake Computing, they need better ways to collect business data from multiple sources and feed it to those cloud systems. Fivetran offers an integrated data pipeline that replaces older ETL (extract, transform and load) systems to support cloud data warehouses.

In December Fivetran, founded in 2013 and based in Oakland, raised $15 million in Series A financing and said it had experienced a three-fold increase in revenue in the previous 12 months and doubled its customer base.


Fluree

Top Executives: Co-CEOs Brian Platz and Flip Filipowski

Fluree has developed a data management platform, based on powerful graph database technology, that businesses use to develop custom blockchain applications. The company, based in Winston-Salem, N.C., was founded in 2017 and the first production release of the software became available in August 2018.


Iguazio

Top Executive: CEO Asaf Somekh

Iguazio, founded in 2014, launched its Continuous Data Platform in 2017, providing a system that ingests, enriches and analyzes data from a wide range of disparate sources, simplifying the development and deployment of data-driven applications.

New York-based Iguazio launched its inaugural channel program in early 2018 in a bid to globally recruit VAR, system integrator and OEM partners.

Imanis Data

Top Executives: Co-CEOs John Mracek and Nitin Donde

Imanis Data provides data management and machine learning technology, used in conjunction with Hadoop and NoSQL databases and cloud databases like Microsoft Azure Cosmos DB, to ensure data resiliency for petabyte-scale data sets, both on premises and in the cloud. Use cases include data backup and disaster recovery, cloud migration, security, copy data management, and data archiving and recovery.

Imanis Data, originally called Talena, was founded in 2013 and is headquartered in San Jose.


Immuta

Top Executive: CEO Matthew Carroll

Immuta provides enterprise data management software that data scientists, data owners and data stewards use to locate, access, share, control and monitor data.

In April the company, founded in 2014 and based in College Park, Md., debuted the Immuta Automated Data Governance Platform with automated governance capabilities that data scientists and business analysts use to securely share data and scripts without violating data policies and industry regulations.


Incorta

Top Executive: CEO Osama Elkady

Incorta develops a hyper-converged analytics software platform that speeds up business analysis and reporting by performing data modeling and extract-transform-load (ETL) tasks that often take a great deal of time with traditional data warehouse systems.

In October Incorta, founded in 2013 and based in San Mateo, Calif., raised $15 million in Series B funding from Microsoft's M12 venture fund and Telstra Ventures.


Infoworks

Top Executive: CEO Buno Pati

Infoworks markets a software platform for the creation and ongoing management of big data workflows from source to consumption, simplifying data engineering and DataOps tasks. The system automates data ingestion, transformation and preparation for business analytics, automated data modeling and OLAP cube generation, and manages DataOps and data governance processes.

Buno Pati, who had been the company's chairman, became CEO in February, while co-founder Amar Arsikere, who had been CEO, became CTO and chief product officer. Founded in 2014, the company is based in Palo Alto, Calif.

Magnitude Software

Top Executive: CEO Chris Ney

Magnitude's software provides unified application data management capabilities, pulling data from operational applications for analysis and providing businesses with insights into their operations.

In September the Austin-based company unveiled the Magnitude Gateway, a universal data connectivity platform with access to data from more than 100 sources. In December Magnitude, launched in 2014, deepened its presence in the SAP ecosystem by acquiring Z Option, a developer of tools used to reduce the cost of implementing and using SAP applications.


MariaDB

Top Executive: CEO Michael Howard

MariaDB develops the MariaDB database, an open-source, commercially supported fork of the MySQL relational database. In February the company launched a new release of MariaDB Enterprise Server with fine-grained auditing, reliable backups for large databases and end-to-end encryption for data at rest.

MariaDB was founded in 2014 and is headquartered in Menlo Park, Calif.


Naveego

Top Executive: CEO Katie Horvath

Naveego markets master data management and data quality management software, which together make up the Naveego Complete Data Accuracy Platform, targeted at helping businesses improve the accuracy of their corporate data.

Naveego, based in Traverse City, Mich., was founded in 2015.


Octopai

Top Executive: CEO Amnon Drori

Octopai has developed an automated, centralized, cross-platform metadata management and data lineage search engine that business intelligence organizations use to quickly discover and govern shared metadata.

Octopai, based in Rosh Ha'ayin, Israel, and launched in 2015, gained visibility in 2018 in part because it was named one of 10 cloud-based startups to participate in Microsoft's ScaleUp 2018, a four-month program that provides promising startups with tools, resources, connections, knowledge and expertise to accelerate their growth.


OmniSci

Top Executive: CEO Todd Mostak

OmniSci has developed a GPU-accelerated, SQL-based analytics platform that offers increased speed and scale for big data querying and visualization tasks. The company also markets a portfolio of query and analysis applications that run on the platform for specific industries (such as automotive and telecommunications), use cases (housing and urban development analysis) and roles (big data analysts and data scientists).

In April the company, founded in 2013 and based in San Francisco, began offering its software through the Microsoft Azure Cloud.


Outlier

Top Executive: CEO Sean Byrnes

Outlier develops artificial intelligence-based analysis software that collects data from business applications like Adobe, Zendesk and MailChimp and analytical systems such as Amazon Redshift and Google Analytics, and sifts through it to identify insights from unexpected changes and data outliers. The system also monitors business data and provides notification when unexpected changes occur.

Outlier was founded in 2015 and is based in Oakland.


Plotly

Top Executive: Founder Alex Johnson

Plotly offers Chart Studio, an online data analysis and graphing tool for creating D3.js and WebGL charts, and Dash, a Python framework for building analytical web applications.

Based in Montreal, Plotly was founded in 2013.


StreamSets

Top Executive: CEO Girish Pancha

StreamSets touts its product, a data operations platform for full life-cycle management of data-in-motion, as the point where DevOps meets data integration. The StreamSets DataOps Platform includes a data integration engine for flowing data from batch and streaming sources to analytics systems.

In March the company, founded in 2014 and headquartered in San Francisco, said subscription revenue more than tripled in its recently completed fiscal year.


Tamr

Top Executive: CEO Andy Palmer

Tamr is targeting enterprise data unification tasks with its machine learning-based master data management system.

In January the company, founded in 2013 and based in Cambridge, Mass., joined Accenture's life sciences partner ecosystem, through which its software will be used by Accenture clients for drug discovery and scientific research.


Timescale

Top Executive: CEO Ajay Kulkarni

Timescale develops TimescaleDB, a SQL database built on PostgreSQL for storing and processing time-series data.

In January the company, founded in 2015 and based in New York, released an enterprise edition of its software and announced that it had raised $15 million in Series A1 financing.

Unravel Data

Top Executive: CEO Kunal Agarwal

Unravel Data markets an application performance management platform designed for big data applications, simplifying the planning and management of business-critical data applications and maintaining their performance and reliability.

In January Unravel Data secured $15 million in Series B financing. The company was founded in 2013 and is headquartered in Palo Alto, Calif.

Waterline Data

Top Executive: CEO Kailash Ambwani

Waterline provides an AI-driven enterprise data catalog system that data management professionals use to discover, inventory, govern and rationalize data in an organization's data lakes.

In February Waterline Data debuted the 5.0 edition of its software that leverages the company's newly patented "fingerprinting" and automated tagging technology, providing for faster and easier discovery of information amongst vast amounts of data.

The company was founded in 2013 and is based in Mountain View, Calif.

Yellowbrick Data

Top Executive: CEO Neil Carson

Yellowbrick Data has developed an all-flash, small-form-factor data warehouse appliance designed to compete with more complex, more expensive data warehouse systems. Yellowbrick is particularly targeting owners of aging Netezza data warehouse systems with a next-generation alternative.

The company, founded in 2014 and based in Palo Alto, Calif., emerged from stealth mode just last August and announced that it had received $44 million in Series A funding. In February Yellowbrick hired database luminary Brian Bulkowski as the company's chief technology officer.