Amazon S3 Users Exposing Sensitive Data, Study Finds

A review of publicly visible content on Amazon's S3 storage service found that some sensitive data is publicly accessible and could be used in a future network attack, according to Rapid7, which conducted the study. Misconfiguration is common when users set up the S3 service, exposing data that would otherwise likely be deemed private, the firm said.

Boston-based vulnerability management vendor Rapid7 conducted an analysis of nearly 13,000 Amazon S3 buckets and found 2,000 were publicly available. The researchers gathered a list of more than 126 billion files, and a random sampling found 40,000 publicly visible files, many of which contained sensitive data, the firm said.

"This is ultimately a misconfiguration issue," said Tod Beardsley, engineering manager for Metasploit, the penetration testing tool maintained by Rapid7. "The surprise here was that it wasn't just regular people doing this; it was enterprise-level IT pros and third-party contractors who manage your S3 presence for you."

Security experts have identified data leakage as one of the top cloud security threats. Rapid7 said the most sensitive data it found exposed on the S3 service was account credentials. Most of the data exposure was associated with database backups and website files that stored database passwords and other sensitive information.

Amazon's popular S3 storage service is used to store server backups, company documents and Web logs, among other data. Files are organized into buckets, each of which is accessible at a predictable URL, Beardsley said. Access controls are applied in two tiers: to the bucket itself and to the individual files and directories stored in it. A public bucket will list all the files and directories it contains to any user who asks, according to Rapid7. Sometimes a bucket is publicly available but its files are password protected; however, Rapid7 said sensitive information can still be exposed through the file names themselves.
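For administrators who want to see what a public listing of their own bucket would reveal, the following is a minimal sketch in Python, not the method Rapid7 used. It assumes the common bucket URL form `https://<bucket>.s3.amazonaws.com/` and the XML listing format S3 returns; the bucket name, file names and sample responses are hypothetical, and no network request is made.

```python
import xml.etree.ElementTree as ET

# XML namespace S3 uses for bucket listing responses.
S3_NS = "{http://s3.amazonaws.com/doc/2006-03-01/}"


def bucket_url(bucket_name):
    """Return the predictable URL at which a bucket is reachable."""
    return "https://{}.s3.amazonaws.com/".format(bucket_name)


def listed_keys(response_body):
    """Return the file names (keys) exposed by a bucket listing response.

    An empty list means the body was not a listing (for example, an
    AccessDenied error) or the listing contained no files.
    """
    root = ET.fromstring(response_body)
    if root.tag != S3_NS + "ListBucketResult":
        return []
    return [key.text for key in root.iter(S3_NS + "Key") if key.text]


# Hypothetical response from a bucket that allows public listing.
SAMPLE_PUBLIC = (
    '<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
    "<Name>example-bucket</Name>"
    "<Contents><Key>backups/db.sql</Key></Contents>"
    "<Contents><Key>logs/2013-03.log</Key></Contents>"
    "</ListBucketResult>"
)

# Hypothetical response from a bucket that refuses anonymous listing.
SAMPLE_DENIED = (
    "<Error><Code>AccessDenied</Code><Message>Access Denied</Message></Error>"
)

if __name__ == "__main__":
    print(bucket_url("example-bucket"))
    for name in listed_keys(SAMPLE_PUBLIC):
        print("exposed file name:", name)
```

Even when every file in the listing is itself protected, the names returned by `listed_keys` are visible to anyone, which is the file-name exposure Rapid7 describes.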

"When you have this two-tiered authentication system, that in itself is complex, and complexity breeds security problems," Beardsley told CRN.

The firm said 1 in 6 buckets were left open to anyone attempting to access them. Many of the files were images. Rapid7 discovered personal photos posted to social network services, sales records and account information for a large car dealership, unprotected database backups containing site data and encrypted passwords, and employee personal information and member lists across various spreadsheets.

Rapid7 said many of the files were PHP source files, some of which contained database usernames, passwords and API keys. PHP source files are associated with websites powered by PHP and are commonly targeted by attackers trying to gain database passwords, Beardsley said.

The most common exposure, the firm said, was log backups stored in public buckets that were globally accessible. Log files and service data exposed sensitive details about an organization and its customers.

In addition, Rapid7 said it found more than 200,000 CSV files, which included personal information such as names, email addresses and phone numbers. CSV files are typically a plain text version of an Excel file used by administrators as an easy way to make tables without having a database, Beardsley said.

Organizations are also handling password management incorrectly, storing passwords in plain text or with only one layer of encryption, which can be easily broken with a password cracker, Beardsley said.
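As an illustration of a sturdier alternative to plain-text storage, the sketch below derives a salted, iterated hash with PBKDF2 from Python's standard library. This is a commonly used approach offered as an assumption, not something Rapid7 or Beardsley prescribed; the function names, the 16-byte salt and the iteration count are illustrative choices.

```python
import hashlib
import hmac
import os

SALT_BYTES = 16
ITERATIONS = 200_000  # illustrative work factor, not a figure from the study


def hash_password(password, salt=None):
    """Return salt + PBKDF2-HMAC-SHA256 digest for storage."""
    if salt is None:
        salt = os.urandom(SALT_BYTES)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt + digest


def verify_password(password, stored):
    """Recompute the digest with the stored salt and compare in constant time."""
    salt, expected = stored[:SALT_BYTES], stored[SALT_BYTES:]
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return hmac.compare_digest(candidate, expected)
```

Because each password gets its own random salt and the hash is deliberately slow to compute, a leaked backup containing these values is far more costly to crack than one holding plain-text or lightly protected passwords.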

Researchers have warned in the past about the dangers of mistakenly exposing sensitive data on the service. Two years ago, independent researcher and penetration tester Robin Wood published a tool that automates the process of crawling S3 and checking for enterprise-created public buckets and the information they contain.

Beardsley said Rapid7 is working closely with Amazon to improve its documentation, address the configuration problems and limit the amount of exposed data. Amazon is also reaching out to firms that had sensitive data identified in the Rapid7 study, Beardsley said.