New Databricks Offering Targets Next-Generation Data Streaming
The data and AI platform developer is now marketing its new Zerobus Ingest software as an alternative to legacy message-based software for real-time and near-real-time data movement.
Databricks is stepping up its data and AI platform's ability to rapidly process streaming data, a critical capability as the growing wave of deployed artificial intelligence applications and agents increases demand for near-real-time data ingestion.
This week Databricks announced the general availability of Zerobus Ingest, a fully managed, serverless service that streams data directly from data sources, such as operational manufacturing systems, financial trading applications, or telemetry from cybersecurity tools and IoT devices, into a data lakehouse.
Databricks is marketing Zerobus Ingest, part of the Lakeflow Connect capabilities within the Databricks Data Intelligence Platform, as an alternative to technologies such as the Apache Kafka event streaming platform.
“We started noticing that our customers were building more and more real-time data use cases,” said Bilal Aslam, Databricks senior director of product management, in an interview with CRN. But he said many were assembling “super complicated” IT architectures using software such as Kafka or Apache Flink.
Aslam estimated that between 30 percent and 40 percent of Databricks’ customers have either real-time or near-real-time data streaming use cases.
Databricks said Zerobus Ingest can achieve sub-five-second latency while supporting thousands of concurrent clients, delivering data at up to 100 MB/second per connection for more than 10 GB/second of aggregate throughput into a single Delta table within a data lakehouse. (Delta tables are the data table format for Delta Lake, the open-source data storage framework for data lakehouse systems developed by Databricks.)
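To put those figures in perspective, a quick back-of-the-envelope calculation shows how many connections would have to run at the per-connection ceiling to reach the aggregate number. This is only a rough sketch: it assumes decimal units (1 GB = 1,000 MB) and that every connection sustains the full 100 MB/second, neither of which Databricks specified, and the function name is purely illustrative.

```python
import math


def min_connections(aggregate_mb_per_s: float, per_connection_mb_per_s: float) -> int:
    """Smallest number of connections needed to reach an aggregate rate,
    assuming each connection runs at its per-connection maximum."""
    if per_connection_mb_per_s <= 0:
        raise ValueError("per-connection rate must be positive")
    return math.ceil(aggregate_mb_per_s / per_connection_mb_per_s)


# Figures quoted by Databricks; the unit conversion is an assumption.
PER_CONNECTION_MB_S = 100      # up to 100 MB/second per connection
AGGREGATE_MB_S = 10 * 1000     # 10 GB/second, taking 1 GB = 1,000 MB

print(min_connections(AGGREGATE_MB_S, PER_CONNECTION_MB_S))  # prints 100
```

Under those assumptions, roughly 100 fully loaded connections would account for the 10 GB/second aggregate figure, which is consistent with the "thousands of concurrent clients" claim if most clients send far less than the per-connection maximum.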
In addition to offering higher data processing performance and lower latency, Databricks says Zerobus Ingest reduces cost and complexity by eliminating the need for messaging buses like Kafka, and improves data security by reducing or eliminating the time that data in transit sits outside of centralized data governance systems.
Zerobus Ingest was designed to benefit from the data governance capabilities of Databricks’ Unity Catalog, which provides unified data governance across the Databricks platform, Aslam said.
“This was built from the ground up as modern, serverless architecture,” he said of Zerobus Ingest, contrasting it with older message-based systems like Kafka that “were not designed with serverless compute in mind” and so require a lot of supporting infrastructure and hands-on data governance.
The Partner Perspective
Databricks works with system integrator and solution provider partners, and Aslam said Zerobus Ingest offers them a range of value propositions and opportunities. The most obvious is faster time-to-value when implementing Databricks-based systems for real-time and near-real-time analytical applications.
“It’s minutes to hours of development, not weeks and months,” he said.
The new product also creates opportunities for partners to propose modernization projects for their clients around legacy cybersecurity and IoT systems.
“Now partners can take high-volume sensor telemetry without having to build a lot of edge infrastructure. That’s huge,” Aslam said. “We see growth in cybersecurity, so getting the security events and logs into the lakehouse so that you can do real-time anomaly detection, real-time queries and threat detection.”
“I think this opens up a pretty large [sales] funnel for partners,” he said.