Amazon also vowed to improve its communication with customers during a massive service outage it suffered in the U.S. in April, an incident that cast a dark shadow over the cloud giant.
"Communication in situations like this is difficult," Amazon wrote. "Customers are understandably anxious about the timing for recovery and what they should do in the interim. We always prioritize getting affected customers back to health as soon as possible, and that was our top priority in this event, too. But, we know how important it is to communicate on the Service Health Dashboard and AWS Support mechanisms."
While Amazon said it communicated more frequently during the Dublin outage than during prior bouts of downtime, it acknowledged there is still room for improvement.
"First, we can accelerate the pace with which we staff up our support team to be even more responsive in the early hours of an event," Amazon wrote. "Second, we will do a better job of making it easier for customers (and AWS) to tell if their resources have been impacted. This will give customers (and AWS) important shared telemetry on what's happening to specific resources in the heat of the moment. We’ve been hard at work on developing tools to allow you to see via the APIs if your instances/volumes are impaired, and hope to have this to customers in the next few months. Third, as we were sending customers recovery snapshots, we could have been clearer and more instructive on how to run the recovery tools, and provided better detail on the recovery actions customers could have taken. We sometimes assume a certain familiarity with these tools that we should not."
For the Dublin cloud outage, Amazon said it will provide a 10-day credit equal to 100 percent of usage of the affected EBS volumes, EC2 instances and RDS instances. Customers affected by the EBS software bug will receive a 30-day credit along with access to Amazon's Premium Support Engineers via the AWS Support Center. The credits will be automatically applied to customers' next AWS bill, Amazon said. Microsoft also issued a credit to BPOS customers affected by the Dublin cloud outage.
"Last, but certainly not least, we want to apologize," Amazon wrote. "We know how critical our services are to our customers' businesses. We will do everything we can to learn from this event and use it to drive improvement across our services. As with any significant operational issue, we will spend many hours over the coming days and weeks improving our understanding of the details of the various parts of this event and determining how to make changes to improve our services and processes."