Overnight AWS Outage Reminds World How Important AWS Stability Really Is

Amazon Web Services, the world's largest public cloud provider, suffered a rare outage in Monday's early-morning hours, a service disruption that took many popular websites offline.

The problems appeared to originate at an AWS data center in northern Virginia, where the AWS status page listed a range of errors.

Amazon's status page reported "increased error rates" for its Elastic Compute Cloud, or EC2, and "elevated errors" for its Simple Storage Service, known as S3, between 12:08 and 3:40 a.m. Pacific time. Those two AWS workhorse services were not working for many customers during that time, according to partner accounts and reports on Twitter.

Amazon's status page also documented problems with a number of other AWS services, including Elastic Beanstalk, MapReduce and Elastic Load Balancing, all emanating from the northern Virginia data center.

Amazon reported that the S3 service experienced trouble in its US Standard region "due to a configuration error in one of the systems that Amazon S3 uses to manage request traffic." US Standard workloads are often routed to the northern Virginia facility.

The EC2 malfunction caused increased API error rates for RunInstances, which is used to launch instances, and CreateSnapshot, which is used to store EBS volumes in S3. A later message said the problems affected user metrics until after 7 a.m.

Amazon referred CRN to the status page without providing further comment.

Some AWS partners told CRN they experienced disruptions of their customer workloads; others said they hadn't noticed any problems.

Kevin RisonChu, director of systems and infrastructure at Mirum Agency in San Diego, a developer of digital media services built atop AWS, said good design practices prevented problems.

"We build out our clients’ environments with high-availability and redundancy in mind so absent a complete regional outage, we’re fairly well shielded from most outages," RisonChu told CRN.

With so many online services reported as disrupted, the outage reminded the world how important Amazon's platform has become.

Irony: couldn't post a screenshot of this morning's AWS S3 outage to our group chat, because Flowdock's images are hosted on S3.

Amazon investigating major outage, GitHub and Heroku report issues

Other Twitter users noted the rarity of AWS disruptions these days.

Ah, it's kind of nice to be up in the middle of the night dealing with a major AWS outage. Feels like old times.

And one user, still expecting Amazon to meet its SLA, found the silver lining.

Silver lining to 3-hour S3 outage: Since AWS claims 99.99% uptime, there'll be no downtime for the next 3 years! That's how it works, right?
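
For readers curious about the arithmetic behind the joke, the figures can be checked with a short calculation. This is only a rough sketch: it takes the tweet's 3-hour outage and 99.99% uptime claim at face value and treats the uptime figure as a simple long-run average, which is an assumption made here for illustration and not a description of how any AWS SLA is actually measured. The function name and constant are hypothetical.

```python
# Rough check of the tweet's arithmetic.
# Assumption: the uptime percentage is treated as a long-run average,
# which is not necessarily how a real SLA is assessed.

HOURS_PER_YEAR = 365.25 * 24  # average year length in hours


def hours_to_amortize(outage_hours: float, uptime_fraction: float) -> float:
    """Return the total elapsed hours over which `outage_hours` of downtime
    would exactly use up the downtime allowed by `uptime_fraction`."""
    allowed_downtime_fraction = 1.0 - uptime_fraction
    return outage_hours / allowed_downtime_fraction


total_hours = hours_to_amortize(outage_hours=3.0, uptime_fraction=0.9999)
total_years = total_hours / HOURS_PER_YEAR

print(f"About {total_hours:,.0f} hours, or {total_years:.1f} years")
```

Under those assumptions, three hours of downtime corresponds to roughly 30,000 hours of total elapsed time, or about 3.4 years, so the tweet's "3 years" is in the right neighborhood.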

PUBLISHED AUG. 10, 2015