Big data software developer Dremio has debuted a new release of its self-service data platform that the startup says accelerates access to disparate data sources and learns to adapt to changing analytical workloads.
The new Dremio 2.0 also offers support for Looker, a popular business intelligence and analytics software tool.
Providing links between BI tools like Tableau, Qlik and Microsoft's Power BI, and the ever-growing range of disparate data sources such as data lakes, NoSQL data stores and relational databases, is one of the biggest challenges in big data. Developing those links can be a complex, time-consuming process that can sink a business intelligence project.
Dremio, founded in 2015 and based in Mountain View, Calif., emerged from stealth last year with software that links business intelligence tools to the underlying data sources from which they extract data for analysis.
Dremio's platform provides a SQL interface to data, even if it's in a system that's not SQL-based, such as a data lake running on Hadoop or a NoSQL database.
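As a rough illustration of what a SQL interface over non-SQL data means, the sketch below loads a few JSON documents (the kind of records a NoSQL store might hold) into an in-memory SQLite table and queries them with ordinary SQL. It does not use Dremio or its APIs, and it copies the data first, which is a simplification. The documents, the `users` table and the `load_documents` helper are all hypothetical names invented for the example.

```python
import json
import sqlite3

# Hypothetical JSON documents, standing in for records in a NoSQL store.
documents = [
    '{"name": "ana", "plan": "pro", "logins": 12}',
    '{"name": "ben", "plan": "free", "logins": 3}',
    '{"name": "cy", "plan": "pro"}',  # schemaless data: "logins" is missing
]


def load_documents(conn, raw_docs):
    """Project each JSON document onto a fixed set of SQL columns."""
    conn.execute("CREATE TABLE users (name TEXT, plan TEXT, logins INTEGER)")
    for raw in raw_docs:
        doc = json.loads(raw)
        # Missing fields become NULL rather than raising an error.
        conn.execute(
            "INSERT INTO users VALUES (?, ?, ?)",
            (doc.get("name"), doc.get("plan"), doc.get("logins")),
        )


conn = sqlite3.connect(":memory:")
load_documents(conn, documents)

# A BI tool could now issue plain SQL against the document data.
pro_logins = conn.execute(
    "SELECT COALESCE(SUM(logins), 0) FROM users WHERE plan = 'pro'"
).fetchone()[0]
print(pro_logins)
```

The point of the sketch is only the shape of the idea: semi-structured records are mapped to columns so that tools that speak SQL can query them.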
At the core of the platform is the company's Data Reflections technology that's based on the Apache Arrow cross-language development platform for columnar in-memory data. Data Reflections also includes a SQL execution engine, a data catalog, and data curation and data technologies.
Dremio 2.0 includes a number of significant technical enhancements to Data Reflections, including the ability to automatically detect, and accelerate queries for, star and snowflake schemas (and combinations known as "starflake") and other variations of joined datasets within a database.
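For readers unfamiliar with the term, a star schema places one central fact table at the middle of several dimension tables, and analytical queries join the fact table to those dimensions. The sketch below builds a tiny example in SQLite to show the join pattern that such queries follow. It is not Dremio code and says nothing about how Dremio detects or accelerates these schemas; the table names, column names and figures are made up for illustration.

```python
import sqlite3


def build_star_schema(conn):
    """Create one fact table surrounded by two dimension tables."""
    conn.executescript(
        """
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
        CREATE TABLE dim_store (store_id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE fact_sales (
            product_id INTEGER REFERENCES dim_product(product_id),
            store_id INTEGER REFERENCES dim_store(store_id),
            amount REAL
        );
        INSERT INTO dim_product VALUES (1, 'books'), (2, 'games');
        INSERT INTO dim_store VALUES (10, 'west'), (20, 'east');
        INSERT INTO fact_sales VALUES (1, 10, 5.0), (1, 20, 7.0), (2, 10, 11.0);
        """
    )


def sales_by_category_and_region(conn):
    """Typical star-schema query: join the fact table to each dimension."""
    return conn.execute(
        """
        SELECT p.category, s.region, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_store AS s ON s.store_id = f.store_id
        GROUP BY p.category, s.region
        ORDER BY p.category, s.region
        """
    ).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
rows = sales_by_category_and_region(conn)
for row in rows:
    print(row)
```

A snowflake schema follows the same pattern but splits the dimension tables further, so a query may need additional joins to reach an attribute such as region.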
"This is a big advantage for users in lowering the costs of deploying and administering Dremio for these kinds of applications," said Kelly Stirman, Dremio chief marketing officer and vice president of strategy, in an interview. "It opens the door to making more data available to users without putting all that data into a proprietary data store."
Data Reflections now supports cloud data lakes, including those running on Microsoft Azure Data Lake and Amazon S3. A new management engine within Data Reflections improves its scalability 100-fold, according to the company. And it incorporates recent improvements in Apache Arrow that reduce query latency.
Also new in Dremio 2.0 is the Dremio Learning Engine, a combination of artificial intelligence and machine learning technology that analyzes usage patterns and boosts workload processing through recommended joins, schema learning and predictive metadata caching.
"It makes the system smarter over time and easier to use," Stirman said.
The 2.0 release is available now in both a commercial Enterprise Edition and an open-source Community Edition.
Currently most of Dremio's sales are direct, although it does work with systems integrators who implement Dremio as part of larger big data and business analytics projects, Stirman said. The CMO said the company could consider recruiting additional types of solution provider partners as it ramps up sales.