Amazon Probes Cause Of Cloud Collapse, Some Issues Remain
9:35 AM EST Mon. Apr. 25, 2011
Amazon Web Services Monday searched for the cause of the massive Amazon cloud outage that crippled a host of its customers' Web sites late last week and into this weekend.
Amazon's Elastic Compute Cloud (EC2) and its Relational Database Service (RDS) suffered widespread outages and service disruptions starting Thursday.
As of Sunday, most of its cloud customers and services were back on track, Amazon indicated on its AWS Service Health Dashboard, which offers updates on AWS cloud performance levels. Amazon said a number of its accounts suffered "stuck" data in its Elastic Block Store (EBS) service, causing poor site performance and downtime. EBS appears to be the main point of failure behind the outage.
"As we posted last night, EBS is now operating normally for all APIs and recovered EBS volumes," Amazon wrote Sunday at 10:35 p.m. Eastern. "The vast majority of affected volumes have now been recovered. We're in the process of contacting a limited number of customers who have EBS volumes that have not yet recovered and will continue to work hard on restoring these remaining volumes."
Amazon said it is working to find what caused the outages and interruptions that plagued its Northern Virginia data center and rattled popular sites like Foursquare, HootSuite, Quora and Reddit, plus a host of others. Amazon customers reported sluggish site performance or total outages throughout the ordeal.
"We are digging deeply into the root causes of this event and will post a detailed post mortem," Amazon wrote.
Amazon cloud customers who fell victim to the cloud outage expressed anger and confusion on a host of Amazon forums as the disruption moved into the weekend.
Amazon asked customers who are "still having issues related to this event" and have not been contacted by Amazon to create a service request.
Amazon has not responded to CRN's requests for additional information.