Google Cloud Data Analytics GM: Goal For Partners Is ‘Everything Simpler’

‘A lot of these friction points are being removed by design,’ Google Cloud exec Gerrit Kazmaier tells CRN in an interview ahead of Next 2022.

Gerrit Kazmaier, Google’s vice president and general manager for database, data analytics and Looker

Some of Google Cloud’s biggest announcements at Next 2022 around data cloud and BigQuery – new BigLake support for Apache Iceberg, BigQuery support for unstructured data and new integrations with MongoDB and Collibra – should make “everything simpler” for Google Cloud data partners.

Gerrit Kazmaier – Google’s vice president and general manager for database, data analytics and Looker – described the company’s partner-driven strategy around data analytics offerings for customers and how opening up the partner ecosystem helps everyone in an interview with CRN before Next.

“It makes everything simpler,” Kazmaier said. “A lot of these friction points are being removed by design. When you are a partner, you need to consider complexity in what you’re offering because you need to bring it to your customer.”

Kazmaier continued: “By taking all of the plumbing out in between our partners, they can really focus on the business outcomes, how they position the value scenario to our customers. They are not required now to go into the depth of the technical integration side, audit for themselves, or take their customers through that.”

He said that Google Cloud has more than 70 systems integrators (SI) with data specialties and more than 170 who have analytics expertise.

Google Next 2022 runs online Tuesday to Thursday with small in-person events.

Miles Ward, chief technology officer of Los Angeles-based Google partner SADA – No. 102 on CRN’s 2022 Solution Provider 500 – told CRN in an interview that customers call BigQuery in particular “the killer app” and “the superpower” for price performance advantages.

Customers have taken work down from weeks to minutes and tens of thousands of dollars “down to dollars.”

“This is a problem that customers will increasingly have, and this is an unbelievable solution to it,” Ward said. “And so the surrounding ecosystem of tools becomes increasingly important and Google’s making, I think, really important investments.”

Some of the biggest updates from Next 2022 around Google Cloud’s data offerings include new support from Google storage engine BigLake for the Apache Iceberg open-source table format, with upcoming support for Delta and Hudi.

“After extensive work with our partners and customers, it became clear that actually partners leverage open source file formats because it gives them independence and flexibility,” Kazmaier said. “Customers increasingly understand data as an important asset so they want to have more control over how they manage it, and understand it’s not just something in some cloud storage system, but something that they can actually have control over themselves directly.”

BigQuery has also gained support for unstructured data. Entering preview is new BigQuery support for open-source analytics engine Apache Spark, according to Google.

Google’s data cloud gained new integrations with MongoDB, Palantir, Collibra and Elastic, according to the company. MongoDB will launch new templates for customers to move data between Atlas and BigQuery.

Palantir will use BigQuery as the data engine for Foundry Ontology. And Collibra will integrate with Dataplex for more controls on data stored across all major clouds and on-premises environments, according to the company.

Kazmaier called Google’s partner ecosystem and data technology a differentiator in the competitive cloud and analytics space.

“Everyone has a keen interest to leverage Google’s differentiated data technology and build their own differentiation based off of that, and it’s a rapidly growing number because we have such a big developer and channel centricity and are making sure that we have APIs for our products and the marketplace concept to support them,” he said.

Here’s what else Kazmaier had to say.

How big is Google Cloud’s data partner ecosystem?

We have more than 800 companies building their apps on our data cloud … from security companies like Exabeam to large telco (telecommunications) conglomerates like Broadcom to Equifax.

Everyone has a keen interest to leverage Google’s differentiated data technology and build their own differentiation based off of that, and it’s a rapidly growing number because we have such a big developer and channel centricity and are making sure that we have APIs for our products and the marketplace concept to support them. … We are not a services company. We are a product and solutions company.

So we work extensively with global SIs (systems integrators) and small RSIs (regional systems integrators). … We have 74 SIs working with us on our data specialization, 16 in data management – basically, operational database systems. … And then we have a large number, 175 ones who have analytics expertise.

Not a full out-build specialization, but acquired expertise in running these services.

What do you want partners to know about Google’s support of Apache Iceberg?

If you think about it, it has a pivotal importance for the partner ecosystem because it means that if they are supporting that open standard, they are deeply integrating then with our data cloud because we are also committing to the standard. … So they don’t need to decide anymore if they want to do first-party storage formats on Google.

If they want to use native APIs, they can build their stack around Iceberg, and all of our services will work seamlessly with that. So a big step for us. … We’re going to support Delta and Hudi, two other file formats from that same idea, same ecosystem, making it easier for customers and for partners to pre-integrate these offerings.

Along the same lines … the Spark API, it’s a technical framework for data engineering, data processing, which used to require a second software stack for that specific one workload.

What do partners need to know about the MongoDB Atlas integration?

If you’re building an application on MongoDB, if you want to give it intelligence driven out of BigQuery, you don’t have to do the plumbing in between, but get it delivered out of the box from Google itself.

Same with Palantir. They are increasingly moving to Google-native technology. Foundry is their technology platform, and they are going to build this on top of BigQuery now so that a customer who wants to use Palantir as your application platform, you can do this without now requiring your data to be moved out of BigQuery into a second tech stack making the implementation harder and the cost higher for our customers.

Lastly, a very important point, data management used to be a topic that no one really talked as much about in the past. Their systems were mostly on-premises and only selectively in the cloud.

And legislation and regulation was not as focused on data. But it has completely changed 180 degrees to become one of the most important topics for CDOs (chief data officers). … You need to be compliant to various sovereign regulations. You suddenly have a landscape, which potentially goes Google, AWS and on-premises and everything in between.

And still, you need to have an understanding about your data inventory. You need to have clarity on what policies you apply when and where. You need to have strong auditing. You need to have data lineage, and so forth.

And our strategy, partner-led, we start with the Dataplex. We can offer everything that you need on Google. We understand our services deeply. We can pre-integrate them in a way so it’s really seamless from a customer-experience perspective.

But at the same point in time, we recognize that our customers’ data landscape is bigger than GCP (Google Cloud Platform) in many cases. And our company … really wants to be that catalog of catalogs to go across cloud and combine all of these systems.

So instead of making this a separate thing, we sat down and said … ‘How about you run on top of Dataplex?’ So you get all of the benefits that we build in APIs for own data management, and you can add additional value on top of that for everything that goes beyond that, be it on-premises, be it on other clouds – living that idea of really driving the partner ecosystem through open APIs.

Why should partners be excited by these announcements?

It makes everything simpler. A lot of these friction points are being removed by design.

When you are a partner, you need to consider complexity in what you’re offering because you need to bring it to your customer.

Second of all, you need to think about the skills that actually you need to possess to build an integrated data landscape like that.

And by taking all of the plumbing out in between our partners, they can really focus on the business outcomes, how they position the value scenario to our customers. They are not required now to go into the depth of the technical integration side, audit for themselves, or take their customers through that.

That is one piece that just makes it simpler and removes friction from that process. And the other part is that, by our openness, we extend the catalog from which our channel partners can choose from.

It’s more support in the marketplace. It’s more offerings that we can support on Google Cloud. So it becomes, for them, just so much more compelling to work with Google Cloud because the set of capabilities get so big for them.

Were partners asking for these changes?

Very much so. If you think about support of Iceberg and open file formats, we have thought for a long time, what is the best way we can support the ecosystem? Is it through APIs in BigQuery itself, making that more open?

But after extensive work with our partners and customers, it became clear that actually partners leverage open source file formats because it gives them independence and flexibility.

Customers increasingly understand data as an important asset so they want to have more control over how they manage it, and understand it’s not just something in some cloud storage system, but something that they can actually have control over themselves directly.

Are Google Cloud’s data offerings better than those of AWS and Microsoft?

Let me talk about our strengths, and you be the judge on how it compares. For one thing, we believe in simplicity in the sense that there should be a holistic experience in the sense that you’re not dealing with 20 or 30 different services to get your data landscape together. Preferably, a select set of capabilities.

Our data cloud is really designed around this idea to say it’s an integrated experience that connects from operational databases to streaming analytics to machine learning and AI (artificial intelligence) to BI (business intelligence) … and doing it in a way so as a customer, as a partner, you’re using capabilities. You’re not deploying and connecting systems that made that happen.

That’s one big difference. Because there has been this idea of one use case, one database. That’s a horrible idea. Because the whole idea of data-driven value generation is that you have all of your data together because you want to see these deeply hidden and latent patterns that only emerge once you actually combine your multiple datasets, multiple data aspects together. And you don’t require someone to rethink all of that.

That’s the challenge of these integration architectures, that someone has to do something to bring data together and anticipate everyone’s needs. … Because it’s a great idea, we could combine our call center data, people talk to us, with our marketing data to understand who we should be segmenting in our marketing campaign, to identify who’s happy and who’s not.

And doing that on a single data backbone without somebody having rethought all of these integrations actually unleashes that creativity that data ultimately is about. You get new ideas, new business models, new experiments that you can run.

So that’s one. And a second piece is that we are fully committed to openness. We are a provider and consumer of open standards. … Apache Iceberg is nothing that Google has invented, but we understand it’s important to the community, so we want to support it.

And last but not least, we differentiate actually by saying we want to have 100 percent partner attachment for all of our customer engagements. So we’re not looking at partners and asking them, ‘What can you bring to us?’ We look at our customers and think with our partners, how do we put our strengths together in a way so we can get to the best experience?

Bottom line, if you would press me, I would say you cannot be customer focused and not be partner focused at the same point of time. And that’s one of the things that differentiate us from other vendors.

What do partners need to know about the Collibra integration with Dataplex?

It’s anticipating what is going to be the next wave of enterprise demand. Right now, everyone is focusing on moving data to the cloud and at the beginning of recognizing how important security and governance and management around this is.

Think about it. If you are a customer of a company and you’re entrusting them with your data, it needs to be secured to the standard that you are satisfied with. Not someone else, which will be a pretty high bar.

It is just a moment of reckoning in the market that that understanding of securing and managing data reliably, of doing it at scale, of doing it across clouds and on-premises is actually going to be the nucleus … of many data projects in the future.

How is growth in the midmarket going?

Virtually all our products are serverless, which means that they can be deployed at any size. You don’t need to deploy a large cluster, manage a large system.

Either small or big, you just have, for instance, BigQuery. And it’s going to work for you if you are someone the size of a Walmart or if you are just a very small shop with relatively small data processing because the system behind the scenes is fully elastic and fully serverless.

Is Google Cloud’s competitive advantage in data analytics growing?

It’s growing immensely fast. Just to give you a statistic, Google BigQuery, that system every single second … processes per second on average 110 terabytes of data.

Just as a context point, what our customers deployed on our platform in machine learning models has increased by 250 percent year over year. … Our BI products, Data Studio and Looker, have more than 11 million monthly active users. So that gives you an incredible sense of the scale at which these systems are running.

Google built uniquely different systems that are nothing alike because they have been built for the purpose of data processing at scale. And Google has built these systems to solve Google’s own problems. No one at Google said, ‘Let’s build a data warehouse and sell it.’

Google built a system to run Google. And now, we are saying, ‘Let’s externalize it via GCP.’ And as it so happens, I believe every company is becoming a data company like Google, more or less.

Every company is going to use AI and machine learning at scale. So all of these systems are highly applicable to our customers, highly relevant to them.

And hence, a lot of momentum of departments across the space. Because for them, that’s also a great opportunity. … I would say that every data model built on these traditional ideas of warehousing by VMs and traditional capacity paradigms, they will always come to the point where the system is bursting at the seams because data growth and data demand will always eclipse. … I can say one thing for certain. All of the customers who run on Google Cloud made a fantastic choice.

What else is Google announcing at Next?

We are also going to add unstructured data support in BigQuery. … Unstructured data has been around, of course, but the challenge now is that once you want to connect unstructured data to machine learning systems, you need to think about how you actually manage unstructured data at scale, similar to the ways that we have managed structured data at scale in the past.

And most of the machine learning platforms out there – at least all of the ones that I know – they are required to replicate your entire data stack again. File store systems, then for your audio and video files, then you have machine learning models, building attributes on top of them – for instance, telling you what’s on a picture.

And again, fragmenting that out from the rest of your data estate in BigQuery. … Unstructured data support, which means that through the same APIs, for the same processing paradigms that we talked about, you can work with structured and unstructured data at the same point in time.

The reason why we do that is because it’s ultimately the bridge, the gateway to AI and machine learning systems. … Would everyone (vendors) say, can you do it? Everyone would say, ‘Yes.’ But it … means a 10-month implementation project versus a ‘yes’ where it’s turnkey.

Very important for our partners … SQL translation service, but for services partners. That means that if you want to now move from the legacy data architectures … and you want to move to BigQuery, just give us the SQL content and the schema content, and it’s going to be translated on the fly in BigQuery for you into our APIs, into our data formats.

So, there is now zero skill required for someone to perform that task intellectually because, for instance, you just take Teradata SQL and you hand it over to BigQuery, and it will just translate it automatically into BigQuery SQL syntax – going to be a major speed up factor for migration projects.

What do partners need to know about Looker Studio and Looker Studio Pro?

Looker Studio is based off of Data Studio, which has been a self-service, dashboarding technology at Google.

And we brought it into an integrated set of capabilities with Looker. So it’s driven from Looker governance, Looker APIs. It’s run on GCP’s terms of service. And it’s going to have Workspace-like collaboration features. And it’s going to be a major part of the BI experience for GCP customers. … Looker Studio is going to be free.

And we’re going to continue to provide it for free. And Looker Studio Pro is going to have a set of specific enterprise features and specific support guarantees that because of this cost structure that stands behind them, we will have a minimal fee.