Solution Providers Detail Amazon Web Services Outage


An Amazon Web Services power outage Thursday night cut services to customers for about six hours, reviving questions about the reliability of the cloud.

The Amazon services affected included Amazon Elastic Compute Cloud, Amazon Relational Database Service and AWS Elastic Beanstalk, which are run from Amazon's U.S. East region data centers in Virginia.

The first report of a problem was listed on Amazon Web Services' Dashboard at 8:50 p.m. PST Thursday. Full services were restored by 3:26 a.m. Friday. The company said it is still investigating the cause of the outage.


Amazon published periodic status updates throughout the incident, and when the outage was resolved, the company posted a notification.

“The service is now fully recovered and is operating normally,” Amazon wrote. “Customers with impaired volumes may still need to follow the instructions above to recover their individual EC2 and EBS resources. We will be following up here with the root cause of this event.”

Among those affected were cloud managed services and platform providers, including Stratalux, a cloud-based managed services provider based in Santa Monica, Calif.; Digitaria, a San Diego-based digital marketing company; and San Francisco-based Heroku, the cloud platform-as-a-service provider owned by Salesforce.com.

Stratalux lost services for one of its customers for about an hour Thursday, said Jeremy Przygode, CEO and founder of Stratalux. Przygode said the issue stemmed from the Amazon Relational Database Service.

“I like Amazon, but when these things happen you have to drop everything, although I understand these things also happen in traditional IT centers as well,” he said.

Digitaria had 10 clients affected, with service outages ranging from one to four hours, said Kevin Chu, manager of systems & infrastructure.

“I thought AWS's response, which was posted on their Status Dashboard, was better than in the past,” Chu said. “We were getting updates every 30-45 minutes.

“I don't think this outage will hurt AWS's reputation because this outage was isolated to a single Availability Zone,” Chu added. “Anyone with experience should know to never put all your eggs in one basket, or in this case, one availability zone. None of our clients' production environments that are configured for multiple availability zones were affected.”


