Amazon Cloud Outage Highlights Need For Transparency
Andrew R. Hickey
The Amazon cloud outage that rocked the industry and called into question the reliability of cloud computing has reinforced the need for cloud computing providers to be open and up front with their clients, while also reaffirming that cloud computing support options need to be revisited.
Amazon's Elastic Compute Cloud (EC2) and Relational Database Service (RDS) suffered downtime and interruptions starting last Thursday and continuing through Sunday, an outage that resulted in intermittent downtime and sluggishness for a host of sites using Amazon Web Services' widely popular cloud computing services. The Amazon cloud outage, in some instances, persisted for days.
As of Monday, Amazon said it had the vast majority of its customers back up and running at full steam, with a small fraction of customers whose data will not be fully recoverable.
"We have completed our remaining recovery efforts and though we've recovered nearly all of the stuck volumes, we've determined that a small number of volumes (0.07% of the volumes in our US-East Region) will not be fully recoverable. We're in the process of contacting these customers," Amazon wrote on its AWS Service Health Dashboard shortly after 4 p.m. Eastern Monday.
Amazon is still searching for the root cause of the outage, during which data was getting "stuck" in its Elastic Block Store (EBS) service in its Northern Virginia data center.
Aside from the Service Health Dashboard updates, Amazon has been tight-lipped about the outage. The company has not responded to CRN's request for comment and has not issued an official statement detailing the issue. Amazon's lack of communication around the downtime has soured some users and solution providers, who say that success in the cloud hinges on transparency and that a lack of dialogue could force some customers to jump ship.
"Amazon has been extremely quiet around how the failure occurred and how it will be avoided in the future," said Joseph Coyle, CTO for North America for global solution provider Capgemini. "We need to remember that this was not a system wide outage so although it is a major hit to Amazon, I believe that if they explain the issue and how to avoid it they can hold back the damage. That being said, there will be some fallout from companies that have the funds available to either move to another vendor or consider bringing things back in house. But the bulk of Amazon’s clients will not be going anywhere due to the favorable cost structure of the Amazon cloud and the complexity of migrating out."
Paul Burns, president of Neovise, a cloud computing research and analysis firm, said he doesn't see the outage sparking a mass exodus away from Amazon's cloud, but it could prompt some users to investigate other cloud options and weigh different providers. He said the Amazon cloud outage highlighted the need for cloud providers to communicate.
"They really need to communicate during the outage," Burns said. "There have been a lot of complaints about Amazon's lack of communication during this outage. But cloud providers also have a lot of responsibility to ensure their infrastructure is available. Amazon customers expect individual servers to go down more often than in their own datacenters. But they sure didn't expect a widespread, multi-day outage. They understand that is a possibility now and can take some steps."
Many cloud solution providers said the Amazon outage is a cautionary tale and customers will stay put while entertaining the idea of making the leap to another cloud provider. Still, some Amazon customers are fed up with the lack of communication and have brought their outrage to Amazon user forums. One user called the service updates throughout the outage inadequate.
"Also a note to the Amazon customer service team. If you are going to post something, posting [sic] something meaningful or some sort of useful bit of information not just regurgitated and vague automated replies that just makes us even more confused and hopeless. Understand that this is a major outage and lots of us are losing money because of the downtime and VERY LIKEY to switch cloud providers once this is resolved!"
And while the outage, and the subsequent radio silence from the cloud provider, hasn't sparked a mass exodus from Amazon to its competitors, some industry experts said it could open the door for other vendors to swoop in and poach unhappy Amazon customers.
"This particular outage is a huge opportunity for other vendors," said Brian Fino, managing director of Fino Consulting, a New York-based consulting firm. "It really drives home the fact that there are a lot of cloud vendors out there. Those who may have been on the fence about Amazon may reconsider."
The lack of a response from Amazon will also prove beneficial for cloud solution providers who are able to offer cloud services and support that a company like Amazon cannot, answering the pressing cloud support conundrum. Fino said solution providers will be able to play a bigger role in helping clients select which cloud vendors fit best in their environments. It will also educate users to ask cloud support questions up front.
"They'll want to know what happens. Who do I call if there's an outage?" he said.
Michael Kirven, co-founder and principal of New York-based cloud solution provider Bluewolf, said Amazon's outage raises the issue of cloud support, and that cloud users require a provider they can reach out to with questions and concerns, even when there isn't an outage.
"They need to partner with a firm they can get on the phone and talk through issues with," Kirven said, adding that "If you're going to stake your infrastructure on it, you need a throat to choke."