Microsoft AI Research Team ‘Accidentally’ Exposes 38 Terabytes Of Private Data: Wiz

Here’s a look at the fallout from Microsoft’s AI research team exposing 38 terabytes of private data including secrets, private keys, passwords and over 30,000 internal Microsoft Teams messages.


Microsoft’s AI research team “accidentally” exposed 38 terabytes of private data including secrets, private keys, passwords and over 30,000 internal Microsoft Teams messages, according to security researcher and software vendor Wiz.

The exposed data included disk backups of two employees’ workstations. The exposure occurred when the team published a bucket of open-source training data on GitHub, said Wiz.

Microsoft, for its part, noted that the issue was “responsibly reported under a coordinated vulnerability disclosure” and has already been addressed. Further, Microsoft pointed out that “no customer data was exposed, and no other internal services were put at risk.”

The latest Wiz research report comes just two months after a widely publicized Microsoft cloud email breach in June that impacted U.S. government email accounts. In that case, Wiz CTO Ami Luttwak told CRN that the potential scope of the incident is not restricted only to Microsoft cloud email accounts.

With the Azure AD key, “you can basically impersonate anyone on any service,” Luttwak told CRN. With the longer incident timeline and apparent lack of log evidence, “can we say clearly that this [key] wasn’t used in the last two years?”

It also comes as Microsoft faces intense criticism of its security shortcomings from CrowdStrike CEO George Kurtz. In an interview with CRN, Kurtz said the security trade-offs that come with adopting the Microsoft security stack are not worth it.

“[It’s] death by a thousand cuts,” he told CRN. “It’s the technology which is insecure, which is your zero-day Tuesdays. It’s things like the U.S. government being breached because of Microsoft’s failures. There are only so many opportunities to say, ‘Hey, you get it for free, use it’ when people are saying, ‘Well, you’re putting us at risk.’ And that’s really what we’re hearing from customers—Microsoft is putting them at risk.”

In the case of the Microsoft AI researchers incident, Wiz called out the software giant’s failing as “an example of the new risks organizations face when starting to leverage the power of AI more broadly, as more of their engineers now work with massive amounts of training data.”

Wiz said as data scientists and engineers race to bring new AI solutions to production, the “massive amounts of data they handle require additional security checks and safeguards.”

Wiz discovered the incident as part of its Research Team’s ongoing work on “accidental exposure” of cloud-hosted data. As part of a scan of the internet for misconfigured storage containers, Wiz found the GitHub repository, which was being used to provide open-source code and AI models for image recognition.

David Stinner, founder and president of US itek, a Buffalo, N.Y., MSP and Microsoft partner, said that although he was concerned about the misconfiguration that resulted in exposed data, he was heartened by the fact that it was not a partner- or customer-facing services issue.

“This is a tough learning experience for Microsoft’s AI research team,” said Stinner. “This proves that those of us in IT security only need to be wrong once, while the bad actors only have to be right once.”

Microsoft needs to learn from the data exposure incident and tighten up its security processes, said Stinner. “This all comes down to zero trust and least privileged permissions,” he said. “Had Microsoft followed least privileged permissions this storage container would not have exposed data for any of these secret keys and passwords.”

Ironically, Stinner said, the most troubling aspect of the data exposure is that bad actors will have an easier time finding such open doors in the future with the power of AI. “We are entering a time where the arms race between bad actors using AI and security companies using AI is reaching a critical tipping point,” he said. “Sloppy data governance, as we saw in this case, has the potential to destroy reputations.”

In its research note, Wiz said it is important to note that the storage account involved in the incident wasn’t directly exposed to the public, but was rather a “private” storage account. “The Microsoft developers used an Azure mechanism called ‘SAS tokens,’ which allows you to create a shareable link granting access to an Azure Storage account’s data, while upon inspection, the storage account would still seem completely private,” said Wiz.

Wiz said that SAS tokens pose a security risk because they allow sharing information with external unidentified identities. The Microsoft token involved in the incident was valid until 2051, said Wiz.

“Due to a lack of monitoring and governance, SAS tokens pose a security risk, and their usage should be as limited as possible,” said Wiz. “These tokens are very hard to track, as Microsoft does not provide a centralized way to manage them within the Azure portal. In addition, these tokens can be configured to last effectively forever, with no upper limit on their expiry time. Therefore, using Account SAS tokens for external sharing is unsafe and should be avoided.”
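To illustrate the kind of check Wiz’s warning implies, here is a minimal Python sketch that inspects a SAS URL before it is shared. It reads the standard SAS query parameters `se` (expiry), `sp` (permissions) and `srt` (resource types, which appears on account-level SAS tokens) and flags long lifetimes or broad permissions. The function name `audit_sas_url`, the seven-day limit, the choice of which permissions count as risky, and the example URL are all assumptions made for illustration; they are not taken from the Wiz report or from any Microsoft tooling.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import parse_qs, urlsplit

# Hypothetical policy: links shared externally should expire within a week.
MAX_LIFETIME = timedelta(days=7)
# Hypothetical policy: write (w) and delete (d) are too broad for sharing.
RISKY_PERMISSIONS = {"w", "d"}


def audit_sas_url(url, now=None):
    """Return a list of warnings based on the SAS query parameters in url."""
    now = now or datetime.now(timezone.utc)
    params = parse_qs(urlsplit(url).query)
    warnings = []

    # 'se' carries the signed expiry time as an ISO 8601 UTC timestamp.
    expiry_raw = params.get("se", [None])[0]
    if expiry_raw is None:
        warnings.append("no expiry (se) parameter found")
    else:
        expiry = datetime.fromisoformat(expiry_raw.replace("Z", "+00:00"))
        if expiry.tzinfo is None:
            expiry = expiry.replace(tzinfo=timezone.utc)
        if expiry - now > MAX_LIFETIME:
            warnings.append(f"expiry {expiry.date()} exceeds the allowed lifetime")

    # 'sp' lists the granted permissions as single letters (r, w, d, l, ...).
    granted = set(params.get("sp", [""])[0])
    risky = sorted(granted & RISKY_PERMISSIONS)
    if risky:
        warnings.append("grants risky permissions: " + ", ".join(risky))

    # 'srt' is only present on account-level SAS tokens.
    if "srt" in params:
        warnings.append("account-level SAS token (srt present)")

    return warnings


if __name__ == "__main__":
    # Placeholder URL; the account name and signature are not real.
    example = (
        "https://example.blob.core.windows.net/container"
        "?sv=2020-08-04&ss=b&srt=sco&sp=rwdl&se=2051-01-01T00:00:00Z&sig=REDACTED"
    )
    for warning in audit_sas_url(example):
        print(warning)
```

A check like this only covers links that pass through it. It does not address the tracking gap Wiz describes, since a token generated elsewhere would never be seen by the script.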