The Coolest Big Data Management And Integration Software Of The 2019 Big Data 100

Part 3 of CRN's 2019 Big Data 100 looks at the companies you need to know that provide software and tools for big data management and integration.

Middle Management

Top-level business analysis tools may be the software products that data "consumers" most often see. But underlying those applications are the database software and data integration, preparation and management tools that are the real guts of every big data system. So it's no surprise the developers of such products make up nearly half of this year's CRN Big Data 100 list.

As part of the 2019 Big Data 100, we've put together a list of companies that provide database systems and tools for managing, integrating and transforming data for big data initiatives.

This week CRN is posting the Big Data 100 list in a series of slide shows covering vendors of software for business analytics, data management and integration, data science and machine learning, and big data systems and platforms.


Actian

Top Executive: President and CEO Rohit De Souza

Actian provides a number of data management, integration and analytic database products, including the legacy Ingres database, the new Actian Vector high-performance columnar analytic database and the Actian Avalanche cloud data warehouse, which just debuted in March.

Actian was acquired by HCL Technologies and Sumeru Equity Partners in July 2018 for $330 million.


Actifio

Top Executive: CEO Ash Ashutosh

Actifio develops copy data virtualization technology that's used for a range of data-as-a-service, hybrid and multi-cloud data management, data migration, compliance, application development, and backup and recovery tasks.

In February Actifio inked an OEM partnership with IBM under which IBM will use Actifio's patented Virtual Data Pipeline technology as the foundation for its IBM InfoSphere Virtual Data Pipeline product.


Alation

Top Executive: CEO Satyen Sangani

The Alation Data Catalog software finds and indexes all of an organization's data, even across disparate systems, to improve the effectiveness of business analysis tasks.

In January Alation raised $50 million in Series C funding, money the company will use to accelerate development of the Alation Data Catalog. The company reported that revenue in 2018 grew at a triple-digit rate.


AtScale

Top Executive: CEO Chris Lynch

AtScale has developed a data warehouse virtualization platform that utilizes the company's Universal Semantic Layer technology to connect business analytics software to any number of on-premises and cloud data sources as though they were a single system.

In December AtScale secured $50 million in Series D financing, funding the company has applied to expanded product development and accelerated sales and marketing efforts.

In June of 2018 the company named former Vertica CEO Chris Lynch to be its new CEO. Co-founder Dave Mariani, who had been CEO, is now chief strategy officer, overseeing the company's development efforts.


Cazena

Top Executive: CEO Prat Moghe

Cazena offers its Data-as-a-Service platform as a way to simplify big data workloads in the cloud, including business intelligence, DevOps, machine learning, data engineering and cloud migrations.

In April Cazena was awarded a U.S. patent for "Intelligent Analytic Cloud Provisioning" technology used to iteratively provision cloud resources to achieve optimal cost performance for analytic workloads.


Confluent

Top Executive: CEO Jay Kreps

Confluent markets the Confluent Platform, based on the Apache Kafka technology for managing streaming, event-based data, along with additional products such as the KSQL streaming SQL engine that enables real-time data processing against Kafka.

Confluent was founded by the original developers of Kafka. In January the company raised $125 million in Series D funding.

In April the company struck a strategic partnership with Google Cloud to help customers move streaming data workloads to the Google Cloud Platform.


Couchbase

Top Executive: President and CEO Matt Cain

Couchbase, one of a number of companies with next-generation NoSQL database products that position themselves as alternatives to traditional relational database systems, provides the Couchbase Server NoSQL database.

Couchbase offers a cloud-native edition of its data platform, as well as the N1QL query language, Couchbase Analytics, full-text search, event processing and development tools.


Databricks

Top Executive: CEO Ali Ghodsi

Databricks, founded by the creators of the Apache Spark analytics engine for big data, develops the Databricks Unified Analytics Platform that combines data science and data engineering to handle all data analysis processes.

In February Databricks raised an impressive $250 million in Series E funding, bringing the company's valuation to $2.75 billion. The company also said it had experienced a three-fold increase in subscription revenue in the final quarter of 2018 and its annual recurring revenue had reached $100 million in 2018.


DataStax

Top Executive: CEO Billy Bosworth

DataStax's data management software lineup is built on the Apache Cassandra open-source NoSQL database. The hybrid cloud DataStax Enterprise system provides a broad range of cloud data management, deployment and development capabilities.


Delphix

Top Executive: CEO Chris Cook

DataOps tech developer Delphix addresses the data sprawl problem with the Delphix Dynamic Data Platform, which provides a way for businesses and organizations to connect, virtualize, secure and manage data in the cloud or in on-premises environments.


Dremio

Top Executive: CEO Tomer Shiran

Dremio, which launched in 2017, develops an open-source Data-as-a-Service platform that the company says makes data engineers more productive and data consumers more self-sufficient.

Dremio reported in March that its annual recurring revenue had increased 10-fold since 2017 and enterprise customer acquisition grew five-fold.


EnterpriseDB

Top Executive: President and CEO Ed Boyajian

EnterpriseDB provides software and services based on the PostgreSQL open-source database, including the EDB Postgres Platform, which is positioned as an alternative to the Oracle Database with its Oracle compatibility features. Other products include tools for database management, backup and recovery functions, and data migration and integration. EnterpriseDB Postgres Platform 11 debuted in December.


Erwin

Top Executive: CEO Adam Famularo

Erwin, which developed data modeling technology in its early days, has expanded into new data management areas with a primary focus on data governance with its Erwin Edge system for enterprise modeling, data cataloging and data literacy.

In September the company acquired metadata management technology developer AnalytiX DS to strengthen its data governance and management portfolio.

Erwin, which has had several owners through its history, was acquired from CA Technologies in 2016 by a private equity firm and now operates as an independent company.


Fivetran

Top Executive: CEO George Fraser

As more businesses move their data warehouse operations to cloud systems like AWS Redshift and Snowflake Computing, they need better ways to collect business data from multiple sources and feed it to those cloud systems. Fivetran offers an integrated data pipeline that replaces older ETL (extract, transform and load) systems to support cloud data warehouses.

In December Fivetran raised $15 million in Series A financing and said it had experienced a three-fold increase in revenue in the previous 12 months and doubled its customer base.


Fluree

Top Executives: Co-CEOs Brian Platz and Flip Filipowski

Fluree has developed a data management platform, based on powerful graph database technology, that businesses use to develop custom blockchain applications. The company was founded in 2017 and the first production release of the software became available in August 2018.


Iguazio

Top Executive: CEO Asaf Somekh

Iguazio launched its Continuous Data Platform in 2017, providing a system that ingests, enriches and analyzes data from a wide range of disparate sources, simplifying the development and deployment of data-driven applications.

Iguazio launched its inaugural channel program in early 2018 in a bid to globally recruit VAR, system integrator and OEM partners.

Imanis Data

Top Executives: Co-CEOs John Mracek and Nitin Donde

Imanis Data provides data management and machine learning technology, used in conjunction with Hadoop and NoSQL databases and cloud databases like Microsoft Azure Cosmos DB, to ensure data resiliency for petabyte-scale data sets, both on premises and in the cloud. Use cases include data backup and disaster recovery, cloud migration, security, copy data management, and data archiving and recovery.


InfluxData

Top Executive: CEO Evan Kaplan

InfluxData offers the InfluxDB Platform specifically designed for handling time-series metrics and events in use cases such as DevOps, Internet of Things and real-time analytics. Developers can use the software to build next-generation monitoring, analytics and IoT applications.

InfluxData's platform is built on the company's open-source time-series TICK stack, comprising the Telegraf, InfluxDB, Chronograf and Kapacitor projects.


Informatica

Top Executive: CEO Anil Chakravarthy

A pioneer in developing data ETL (extract, transform and load) software, Informatica today has a broad portfolio of software that covers integrated Platform as a Service (iPaaS), data integration, data quality and governance, master data management, data security and big data management.

In February Informatica acquired AllSight, a developer of AI-enabled customer data analysis software, providing the vendor with its new Informatica Customer 360 Insights product.


Infoworks

Top Executive: CEO Buno Pati

Infoworks markets a software platform for the creation and ongoing management of big data workflows from source to consumption, simplifying data engineering and DataOps tasks. The system automates data ingestion, transformation and preparation for business analytics, automated data modeling and OLAP cube generation, and manages DataOps and data governance processes.

Buno Pati, who had been the company's chairman, became CEO in February while CEO and co-founder Amar Arsikere became CTO and chief product officer.

Magnitude Software

Top Executive: CEO Chris Ney

Magnitude's software provides unified application data management capabilities, pulling data from operational applications for analysis and providing businesses with insights into their operations.

In September the company unveiled the Magnitude Gateway, a universal data connectivity platform with access to data from more than 100 sources. In December the company deepened its presence in the SAP ecosystem by acquiring Z Option, a developer of tools used to reduce the cost of implementing and using SAP applications.


MariaDB

Top Executive: CEO Michael Howard

MariaDB develops the MariaDB database, an open-source, commercially supported relational database. In February the company, founded by some of the original developers of the MySQL database, launched a new release of MariaDB Enterprise Server with fine-grained auditing, reliable backups for large databases and end-to-end encryption for data at rest.


MarkLogic

Top Executive: President and CEO Gary Bloom

MarkLogic develops a NoSQL database for large-scale operational and transactional applications, along with related data integration tools. The company uses the system as the platform for a series of big data solutions for digital transformation, regulatory compliance, mainframe-to-SQL links and operational data hubs.


Matillion

Top Executive: CEO Matthew Scullion

Matillion provides ETL (extract, transform and load) tools for transforming and moving data from operational applications and other sources into cloud data warehouse systems including AWS Redshift, Google BigQuery and Snowflake Computing.


MemSQL

Top Executive: CEO Nikita Shamgunov

MemSQL has developed a distributed, in-memory, SQL database system, which the company calls the "no-limits database," for running transactional and analytical workloads at scale, both on-premises and in the cloud.


MongoDB

Top Executive: President and CEO Dev Ittycheria

MongoDB's namesake flagship software is a distributed cross-platform, document-oriented database system designed for developing and running high-performance applications. The company also provides its database as a cloud service called MongoDB Atlas.

MongoDB recently struck a deal to acquire Realm, developer of the Realm mobile database and synchronization platform.


Naveego

Top Executive: CEO Katie Horvath

Naveego markets master data management and data quality management software, which together make up the Naveego Complete Data Accuracy Platform, targeted at helping businesses improve the accuracy of their corporate data.


Neo4j

Top Executive: CEO Emil Eifrem

Neo4j provides the Neo4j Graph Platform including the Neo4j graph database, an ACID-compliant system with native graph storage and processing. (Graph databases treat the relationships between data as equally important as the data itself.) Other offerings include the Cypher graph query language, ETL and data integration tools, and its Neo4j Bloom data visualization software.

Neo4j raised $80 million in a Series E round of funding in November.


Octopai

Top Executive: CEO Amnon Drori

Octopai has developed an automated, centralized, cross-platform metadata management and data lineage search engine that business intelligence organizations use to quickly discover and govern shared metadata.

Octopai really began showing up on people's radar screens in 2018 due, in part, to the company being named one of 10 cloud-based startups to participate in Microsoft's ScaleUp 2018, a four-month program that provides promising startups with tools, resources, connections, knowledge and expertise to accelerate their growth.


Paxata

Top Executive: CEO Prakash Nanduri

Paxata markets self-service data integration and data preparation software that's designed to help data consumers and business analysts prepare data for analytical tasks with less reliance on programmers and data scientists.

The spring release of the company's Adaptive Information Platform included Intelligent Automation, a smart algorithmic capability that business analysts use to automate and operationalize complex data preparation projects and flows.


Pepperdata

Top Executive: CEO Ash Munshi

Pepperdata develops tools for monitoring and managing the performance of big data infrastructure resources (Platform Spotlight APM) and big data applications (Application Spotlight APM). The software provides instrumentation for big data ecosystems, helping to pinpoint and resolve big data performance bottlenecks.


Qubole

Top Executive: CEO Ashish Thusoo

Qubole markets a cloud-native, self-service data platform, powered by the Apache Spark engine, for a range of big data tasks including analytics, machine learning, data science and data engineering.

In April Qubole announced an expanded partnership with Google Cloud to make Qubole available on the Google Cloud Platform.

Redis Labs

Top Executive: CEO Ofer Bengal

Redis Labs offers the open-source Redis in-memory NoSQL database for developing high-performance applications, running high-speed transactions and performing real-time analytics.

In February Redis Labs raised $60 million in Series E financing and reported 60 percent revenue growth in its fiscal 2019 ended Jan. 31.


Reltio

Top Executive: CEO Manish Sood

Reltio develops a cloud-native master data management and Platform-as-a-Service system for organizing enterprise data and running data-driven applications, including "customer 360" applications. Reltio IQ provides predictive analytics capabilities.

In February the release of Reltio Cloud 2019.1 offered new integration, collaboration and globalization capabilities.


SnapLogic

Top Executive: CEO Gaurav Dhillon

SnapLogic's integration Platform-as-a-Service (iPaaS) tools are used to connect cloud data sources and data from on-premises and SaaS applications. The SnapLogic Intelligent Integration Platform facilitates data flows between applications, databases, data warehouses, big data streams and IoT networks.

Splice Machine

Top Executive: CEO Monte Zweben

Splice Machine's flagship software is a scale-out SQL relational database, data warehouse and machine learning platform. The company touts the ability of its database to handle both transactional and analytical processing.

The company positions its software as an operational AI data platform and in April the company launched a beta program for its new ML Manager data science and machine learning software. In February the company closed on a $16 million Series B round of financing.


Splunk

Top Executive: President and CEO Doug Merritt

Fast-growing Splunk develops a portfolio of software used to collect and process machine data for a range of applications including IT management, security, business analytics and IoT.

Splunk just debuted Splunk Business Flow, which makes it easier to visualize business processes, and Connected Experiences, which uses augmented reality and mobile applications to improve access to data.

SQream Technologies

Top Executive: CEO Ami Gal

The SQream DB relational database is specifically designed to run on Nvidia GPU-enabled hardware, creating a SQL data warehouse that's capable of processing and analyzing massive data sets at high speed.


StreamSets

Top Executive: CEO Girish Pancha

StreamSets touts its product, a data operations platform for full life-cycle management of data-in-motion, as the point where DevOps meets data integration. The StreamSets DataOps Platform includes a data integration engine for flowing data from batch and streaming sources to analytics systems.

In March the company said subscription revenue more than tripled in its recently completed fiscal year.


Striim

Top Executive: President and CEO Ali Kutay

Striim's platform is an end-to-end streaming data integration and operational intelligence system that can perform continuous query processing and streaming analytics.

Striim recently announced strategic alliances with Snowflake Computing, Google Cloud and AWS.


Syncsort

Top Executive: CEO Josh Rogers

Syncsort calls itself the leader in "big iron to big data" software. The company develops data infrastructure optimization tools and high-speed data sorting and data integration software and services for data residing on multiple systems including mainframes, relational databases, Hadoop and NoSQL databases.


Talend

Top Executive: CEO Mike Tuchen

Talend offers a portfolio of data integration, data preparation, data quality management and data governance tools for on-premises and cloud systems. The company recently announced the addition of Pipeline Designer, a new data integration design environment, to Talend Cloud, the vendor's comprehensive integration Platform-as-a-Service (iPaaS) system.


Tamr

Top Executive: CEO Andy Palmer

Tamr is targeting enterprise data unification tasks with its machine learning-based master data management system.

In January the company joined Accenture's life sciences partner ecosystem, through which its software will be used by Accenture clients for drug discovery and scientific research.


Timescale

Top Executive: CEO Ajay Kulkarni

Timescale develops TimescaleDB, a SQL database built on PostgreSQL for processing time-series data.

In January the company released an enterprise edition of its software and announced that it had raised $15 million in Series A1 financing.


Trifacta

Top Executive: CEO Adam Wilson

Trifacta is a leading provider of "data wrangling" tools used to clean up and prepare data to make it ready for use in business analytics tasks. Recently the company launched new Active Profiling and Smart Cleaning tools to make data quality assessment, remediation and monitoring more intelligent and efficient.

Later this year the company will bring more data quality processing to the automation process, part of the company's strategy to expand beyond data preparation to advance Trifacta into a more complete DataOps platform.

Unravel Data

Top Executive: CEO Kunal Agarwal

Unravel Data markets an application performance management platform designed for big data applications, simplifying the planning and management of business-critical data applications and maintaining their performance and reliability.

In January Unravel Data secured $15 million in Series B financing.


VoltDB

Top Executive: President and CEO David Flower

VoltDB is an in-memory, ACID-compliant relational database designed to handle huge volumes of transactions, powering applications that require real-time decisions on streaming data.

In April the company unveiled its Smart Stream Processing Architecture to support fast data strategies within edge computing systems.

Waterline Data

Top Executive: CEO Kailash Ambwani

Waterline provides an AI-driven enterprise data catalog system that data management professionals use to discover, inventory, govern and rationalize data in an organization's data lakes.

In February Waterline Data debuted the 5.0 edition of its software that leverages the company's newly patented "fingerprinting" and automated tagging technology, providing for faster and easier discovery of information amongst vast amounts of data.


Zaloni

Top Executive: CEO Ben Sharma

Zaloni calls itself "the data lake company" with its Zaloni Data Platform for on-premises, cloud and hybrid computing environments. The self-service data system operationalizes data processes from data source to data consumer and automates data management, cataloging and governance tasks.