CelerData Targets Data Lakehouse Analytics Performance With Latest Release
The launch of CelerData V3 comes on the heels of the company’s decision to turn its StarRocks analytics database over to the Linux Foundation to be supported as an open-source community project.
CelerData is taking aim at the fast-growing data lakehouse space with a new release of its high-performance unified analytics platform that boasts a cloud-native architecture, real-time streaming analytics, and support for open table formats Hudi, Iceberg and Delta Lake.
The new release, CelerData V3, comes just a month after CelerData contributed StarRocks, the MPP SQL database for real-time analytics that CelerData’s software is built on, to the Linux Foundation where it continues as an open-source project.
CelerData’s founders, including CEO James Li, developed StarRocks in 2020, and the company retained the StarRocks name until October 2022, when it incorporated as CelerData. Today the company sells its commercial on-premises CelerData Enterprise and the CelerData Cloud managed cloud service.
“So right now, StarRocks is officially under the governance of the Linux Foundation project,” said Li Kang, CelerData strategy vice president, in an interview with CRN. “The idea was to better support the open-source community since we have more and more contributors from other companies. Outside of CelerData, it makes it easier to contribute to the project.”
CelerData, headquartered in Menlo Park, Calif., targets the Enterprise and Cloud editions of its analytics engine at high-performance and real-time data analytics tasks. The company takes the position that data lakehouse analytics today remains limited and cost-prohibitive, and that many query engines struggle to support ad-hoc queries, real-time analytics and large numbers of concurrent users.
With the new capabilities of the latest release of the CelerData system, “We’re providing the flexibility of data lake analytics, with the performance of data warehouse analytics, and adding other real-time analytics to the same platform. And all these great features without the cost of [a] cloud data warehouse,” Kang said.
CelerData works with a number of systems integration partners who assemble end-to-end data lake solutions using the CelerData platform. Kang said those partners, who have been working with the updated software, are seeing improved query and analytical performance with the new capabilities in the V3 release.
Key to that is CelerData V3’s integration with open table formats including Hudi, Iceberg and Delta Lake, which makes it possible to run the CelerData query engine directly on data lakes without first ingesting the data, according to CelerData. Kang said other cloud data warehouse systems offer limited support for open table formats.
The new release makes real-time streaming analytics possible on a data lakehouse, in contrast to the common practice of creating a separate system for streaming data analytics.
With the new release, users have the option to bring data into the CelerData storage format to improve data lakehouse query performance, according to the company. Multi-table materialized views can also be created to further improve query performance.
The V3 release, which is slated for availability in early April, provides a cloud-native architecture that leverages cloud object storage for improved reliability and reduced data storage costs, according to CelerData. It also enables better workload and resource isolation for creating different data warehouses for different use cases.
The new multi-table materialized view capability simplifies data pipelines by making it possible to build materialized views from multiple joined base tables to speed up query performance, the company said. And users can now ingest raw data and perform data transformation within CelerData, which also simplifies the data processing pipeline.
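For readers unfamiliar with the concept, the general idea behind a multi-table materialized view can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not CelerData's or StarRocks' syntax or implementation: it uses SQLite, which has no native materialized views, so a precomputed table stands in for one, and the table names (orders, customers, revenue_by_region) and helper functions are hypothetical.

```python
import sqlite3


def build_demo():
    """Create two hypothetical base tables in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, amount REAL);
        CREATE TABLE customers (customer_id INTEGER, region TEXT);
        INSERT INTO orders VALUES (1, 10, 20.0), (2, 10, 5.0), (3, 11, 7.5);
        INSERT INTO customers VALUES (10, 'west'), (11, 'east');
    """)
    return conn


def refresh_revenue_by_region(conn):
    """Recompute the join and aggregation once and store the result.

    The stored table plays the role of a materialized view built from
    two joined base tables (an assumption made for illustration).
    """
    conn.executescript("""
        DROP TABLE IF EXISTS revenue_by_region;
        CREATE TABLE revenue_by_region AS
        SELECT c.region AS region, SUM(o.amount) AS revenue
        FROM orders o
        JOIN customers c ON o.customer_id = c.customer_id
        GROUP BY c.region;
    """)


def revenue_by_region(conn):
    """Answer the query from the precomputed table, with no join at read time."""
    return dict(conn.execute("SELECT region, revenue FROM revenue_by_region"))
```

In this sketch, queries read the small precomputed table instead of repeating the join, which is the source of the speedup. The trade-off is that the stored result goes stale until it is refreshed after the base tables change; how and when a real engine refreshes its views is product-specific and not covered here.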