Amazon Cloud Outage Highlights Need For Transparency


The Amazon cloud outage that rocked the industry and called into question the reliability of cloud computing has reinforced the need for cloud computing providers to be open and up front with their clients, while also reaffirming that cloud computing support options need to be revisited.

Amazon's Elastic Compute Cloud (EC2) and Relational Database Service (RDS) suffered interruptions starting last Thursday and continuing through Sunday, causing intermittent downtime and sluggishness for a host of sites that use Amazon Web Services' widely popular cloud computing services. For some customers, the outage persisted for days.

As of Monday, Amazon said it had the vast majority of its customers back up and running at full steam, though a small fraction of customers have data that will not be fully recoverable.

"We have completed our remaining recovery efforts and though we've recovered nearly all of the stuck volumes, we've determined that a small number of volumes (0.07% of the volumes in our US-East Region) will not be fully recoverable. We're in the process of contacting these customers," Amazon wrote on its AWS Service Health Dashboard shortly after 4 p.m. Eastern Monday.

Amazon is still searching for the root cause of the outage, during which data was getting "stuck" in its Elastic Block Store (EBS) service in its Northern Virginia data center.

Aside from the Service Health Dashboard updates, Amazon has been tight-lipped about the outage. The company has not responded to CRN's request for comment and has not issued an official statement detailing the issue. Amazon's lack of communication around the downtime has soured some users and solution providers, who say that success in the cloud hinges on transparency and that a lack of dialogue could force some customers to jump ship.

"Amazon has been extremely quiet around how the failure occurred and how it will be avoided in the future," said Joseph Coyle, CTO for North America for global solution provider Capgemini. "We need to remember that this was not a system wide outage so although it is a major hit to Amazon, I believe that if they explain the issue and how to avoid it they can hold back the damage. That being said, there will be some fallout from companies that have the funds available to either move to another vendor or consider bringing things back in house. But the bulk of Amazon’s clients will not be going anywhere due to the favorable cost structure of the Amazon cloud and the complexity of migrating out."

Paul Burns, president of Neovise, a cloud computing research and analysis firm, said he doesn't see the outage sparking a mass exodus from Amazon's cloud, but it could prompt some users to investigate and weigh other cloud providers. He said the Amazon cloud outage highlighted the need for cloud providers to communicate.

"They really need to communicate during the outage," Burns said. "There have been a lot of complaints about Amazon's lack of communication during this outage. But cloud providers also have a lot of responsibility to ensure their infrastructure is available. Amazon customers expect individual servers to go down more often than in their own datacenters. But they sure didn't expect a widespread, multi-day outage. They understand that is a possibility now and can take some steps."
