Amazon Breaks Cloud Outage Silence With Apology, Credit


Amazon Web Services broke more than a week of silence Friday with an apology to users who were rocked by last week's Amazon cloud outage. In an attempt to stop the bleeding, Amazon is also giving users of the affected Availability Zone a 10-day cloud services credit.

In its lengthy mea culpa, Amazon said it is sorry for the outage, which started early Thursday, April 21, and either knocked a host of customers' Web sites offline or left them with sluggish performance. In some cases, the Amazon cloud outage lasted several days.

"We know how critical our services are to our customers' businesses and we will do everything we can to learn from this event and use it to drive improvement across our services," Amazon said. "As with any significant operational issue, we will spend many hours over the coming days and weeks improving our understanding of the details of the various parts of this event and determining how to make changes to improve our services and processes."

In its event summary, Amazon said the outage was sparked by an error made during a network configuration change. That error led to disruptions and service outages for Elastic Compute Cloud (EC2) and Relational Database Service (RDS) customers using Amazon's Northern Virginia data center. A network traffic shift, Amazon said, was "executed incorrectly": instead of being routed to the other router on the primary network, traffic was shifted to the lower-capacity redundant Elastic Block Store (EBS) network. Amazon said the issue caused EBS volumes in the affected Northern Virginia Availability Zone to become "stuck" in a "re-mirroring storm." That made the volumes unavailable and created latency and outages.

Along with explaining what caused the outage, Amazon is giving cloud customers that use EBS or run RDS database instances in the affected Availability Zone a 10-day credit equal to 100 percent of their usage of EBS volumes, EC2 instances and RDS database instances in that zone, whether or not they were affected by the cloud outage.

"For customers with an attached EBS volume or a running RDS database instance in the affected Availability Zone in the US East Region at the time of the disruption, regardless of whether their resources and application were impacted or not, we are going to provide a 10 day credit equal to 100% of their usage of EBS Volumes, EC2 Instances and RDS database instances that were running in the affected Availability Zone. These customers will not have to do anything in order to receive this credit, as it will be automatically applied to their next AWS bill. Customers can see whether they qualify for the service credit by logging into their AWS Account Activity page," Amazon said.
