Databricks Looks To Disrupt Legacy Database Space With New ‘Lakebase’ Offering

Transactional database technology “hasn’t actually changed that much in the last 40 years” and is inadequate for today’s operational AI applications and agents, Databricks co-founder and CEO Ali Ghodsi said Wednesday during his keynote at the company’s Data + AI Summit.


Databricks has launched a fully managed Postgres database for building data-intensive applications and AI agents, throwing down a challenge to long-time database leaders such as Oracle and Microsoft.

The new Databricks Lakebase offering, unveiled Wednesday during CEO and co-founder Ali Ghodsi’s keynote address at the Databricks Data + AI Summit in San Francisco, adds an operational database layer to the Databricks Data Intelligence Platform. The company said the new layer meets the demand of today’s data applications, AI agents, recommendation engines and automated workflows for fast, reliable data.

In launching Lakebase, Ghodsi took aim at “traditional transaction” databases such as the Oracle Database, Microsoft SQL Server, the MySQL database (also owned by Oracle) and others. Databricks, in the Lakebase announcement, said operational OLTP (online transaction processing) databases underpin every application and today are a $100-billion-plus market.

[Related: If You Build Them: Databricks To Launch New Workflow, AI Agent Development Tools]

In his keynote, Ghodsi said Lakebase and other newly announced Databricks products are a way to “bring data and AI to as many people on the planet as possible. The reality on the ground is that it’s still really hard to succeed with data and AI.”

He noted that data today is spread across cloud and on-premises systems, includes both structured and unstructured data, is managed using customized ETL (extract, transform and load) tools, and is analyzed using multiple business intelligence tools within an organization.

“The data estate is fragmented. This creates complexity within most organizations,” Ghodsi said. “This is why everything moves slowly within organizations.”

For the Lakebase unveiling, legacy transaction databases were the focus of the CEO’s speech.

“If you look at these databases, they were really built for a different era. Technology in transactional databases hasn't actually changed that much in the last 40 years,” Ghodsi said.

The CEO said a major reason for this is that “data is so sticky” and once an organization has implemented a transactional database and loaded data into it, “it's just nearly impossible” to move off of it or move data to a new database.

“And once you lock yourself into that, those vendors don't need to really innovate. They don't need to challenge the status quo because you can't go anywhere,” Ghodsi said.

Along with hindering analytics and AI, legacy databases are also a roadblock to building AI capabilities into today’s operational applications, he said.

“If you look at all the different categories of software that exists today, whether it's CRM, ERP, human capital management, contracting and so on and so forth, it doesn't require that much imagination to see how AI is going to completely disrupt each one of these. AI is going to be infused into these.”

Databricks executives also made the argument that operational and analytical systems need to converge to reduce latency for AI systems and to provide enterprises with current information for making real-time decisions.

While analytics systems have evolved, “OLTP databases are kinda stuck in the past,” said Reynold Xin, Databricks co-founder and chief architect, joining Ghodsi on stage. They are slow, difficult to provision and difficult to scale, “and so are fairly disconnected from modern day developer workflows,” he said.

Lakebase Details

Lakebase, currently in public preview, is being used by some 300 Databricks customers. It is based on open-source Postgres database technology that Databricks gained through its $1-billion acquisition of database startup Neon. While that deal was announced just last month, Ghodsi said Databricks was an investor in Neon and had been working with the startup prior to the acquisition.

Lakebase incorporates a data lakehouse architecture that Databricks says is more flexible than legacy databases and separates compute and storage for independent scaling. Its cloud-native architecture reduces latency and supports high-concurrency and high-availability needs, according to the company. It automatically syncs data to and from lakehouse tables and is integrated with Databricks Apps development capabilities and Unity Catalog.

Databricks also says Lakebase is designed for AI and agent development with its “unique branching capability,” which enables low-risk development for both human programmers and AI agents.

“We actually feel like this is how databases should be built in the future,” Xin said. “And our prediction is that every other database, every other transactional database, will evolve towards this architecture in the coming years.”

“We think this is going to be the future of all databases,” Ghodsi said.

During a press conference following the keynote, a CRN reporter asked Ghodsi how he thought Lakebase would be received by Databricks channel partners, given that some work with other database vendors and have invested in those vendors’ database technologies.

“For this era, you need a different type of database,” Ghodsi said, maintaining that many businesses and organizations are “eager to get off of those old databases. I think we're going to see many more of those projects.”

And that, he said, will grow an ecosystem of VARs, systems integrators and consulting partners to help businesses and organizations undertake those data migration projects.

“I think it's a massive opportunity,” Ghodsi said. He noted that database migration projects, especially those involving transactional databases, are traditionally very difficult, but that data migrations from OLTP databases to Lakebase will be easier. He added that partners will also benefit from the elimination of what he called vendor lock-in.

But he added that this transition won’t happen overnight. “I think this is going to be a long marathon and we’re taking the first step on this marathon.”

Databricks’ overall goal of improving data pipelines and data quality for data analytics, AI and generative AI tasks resonates with Hugh Burgin, principal of the AI and data consulting practice at IT services giant EY, a Databricks Global Elite-level partner that develops solutions and accelerators on the Databricks platform.

In an interview with CRN, Burgin said technologies like those offered by Databricks to improve data management and data quality are a boon for EY when it undertakes data analytics and AI projects for clients. “The biggest value, the biggest ROI opportunity, is to access that AI-ready data across the enterprise,” he said.

Burgin said EY takes the position that between 80 and 90 percent of the dollars spent on AI projects should be spent on the underlying data infrastructure. “It’s a big opportunity for us,” he said.

Agent Bricks Demonstrated

In addition to the Lakebase announcement, Databricks, as expected, also unveiled Agent Bricks, a new unified workspace for building high-quality AI agents. Some of Wednesday’s nearly three-hour keynote session was devoted to detailing and demonstrating the capabilities of Agent Bricks, which is currently in beta.

“These are production agents [that] auto-optimize on your data,” Ghodsi said onstage. Agent Bricks addresses several common problems around AI agents today including difficulty assessing their performance, choosing which techniques to best optimize agents, and balancing agent quality against the costs of developing agents and putting them into production, the CEO said.

“We think that with Agent Bricks, it is both easy to use [and] also able to produce incredibly high-quality agents,” said Hanlin Tang, Databricks CTO for Neural Networks, also speaking on stage during the keynote.

Databricks also announced that Databricks Apps, initially unveiled in October 2024, is now generally available. Databricks Apps, according to the company, makes it easy to build secure data intelligence applications that are deployed directly in Databricks and governed through the company’s Unity Catalog data governance software.

Ghodsi said that even before being generally available, Databricks Apps has been adopted by 2,500 customers who used it to develop more than 20,000 applications on the Databricks platform.

The company also unveiled a free edition of the Databricks Data Intelligence Platform targeting universities, as well as individuals including students, hobbyists and “aspiring professionals.” Databricks also pledged a $100 million investment in global data and AI education, which, the company said in a press release, is aimed at “closing the industry-wide talent gap and preparing the next generation of data and AI engineers, data analysts and data scientists.”

Databricks also announced that it is open-sourcing its core declarative ETL framework as Apache Spark Declarative Pipelines. That follows the recent launch of Apache Spark 4.0.