Solution Providers Detail Amazon Web Services Outage

An Amazon Web Services power outage Thursday night cut services to customers for about six hours, reviving the issue of the reliability of the cloud.

The Amazon services affected included Amazon Elastic Compute Cloud, Amazon Relational Database Service and AWS Elastic Beanstalk, which are run from Amazon's U.S. East region data centers in Virginia.

The first report of a problem was listed on Amazon Web Services' Dashboard at 8:50 p.m. PST Thursday. Full services were restored by 3:26 a.m. Friday. The company said it was still investigating the cause of the outage.

Amazon published periodic status updates throughout the incident, and when the outage was resolved, the company posted a notification.

"The service is now fully recovered and is operating normally," Amazon wrote. "Customers with impaired volumes may still need to follow the instructions above to recover their individual EC2 and EBS resources. We will be following up here with the root cause of this event."

Among those affected were cloud managed services and platform providers, including Stratalux, a cloud-based managed services provider based in Santa Monica, Calif.; Digitaria, a San Diego-based digital marketing company; and San Francisco-based Heroku, the cloud platform-as-a-service provider owned by Salesforce.com.

Stratalux lost services for one of its customers for about an hour Thursday, said Jeremy Przygode, CEO and founder of Stratalux. Przygode said the issue stemmed from the Amazon Relational Database Service.

"I like Amazon, but when these things happen you have to drop everything, although I understand these things also happen in traditional IT centers," he said.

Digitaria had 10 clients affected, with service outages ranging from one to four hours, said Kevin Chu, manager of systems & infrastructure.

"I thought AWS's response, which was posted on their Status Dashboard, was better than in the past," Chu said. "We were getting updates every 30-45 minutes.

"I don't think this outage will hurt AWS's reputation because this outage was isolated to a single Availability Zone," Chu added. "Anyone with experience should know to never put all your eggs in one basket, or in this case, one availability zone. None of our clients' production environments that are configured for multiple availability zones were affected."

Heroku published its own dashboard outlining its outage. The company said it was investigating.

"There is no finish line when it comes to the trust and success of our customers," Daylan Burlison, director of communications at Salesforce.com, wrote in an email to CRN. "The service is available now, and we’re investigating what happened. Stay tuned to status.heroku.com for the latest or follow @HerokuStatus on Twitter."

Amazon received a barrage of criticism last April when an outage affected customers for more than 24 hours.

Other well-known cloud providers have suffered outages as well.

But one analyst said the Thursday outage was not as widespread as last year's and affected a much smaller number of users.

"It [the outage] was not pleasant for anybody who was taken out, but it's not as serious as last year’s outage," said Carl Brooks, a cloud analyst for 451 Research. "It happened in the northeast area, which is the largest and the oldest part of Amazon’s infrastructure, so the law of averages is that something is eventually going to go."