Amazon's Not Alone: 10 Notable Cloud Outages In The Last Year
When The Cloud Turns To Vapor
Cloud computing has changed the face of IT. But every once in a while the cloud goes down. It shuts off. It vanishes. Amazon's recent cloud outage has brought to light the impact cloud outages can have. Users expect their cloud apps, platforms and infrastructure to just work. And when they don't, it's a jarring realization.
Here we take a look at 10 recent cloud outages and what caused them.
Amazon Web Services
On April 21, Amazon's Elastic Compute Cloud (EC2) and Relational Database Service (RDS) suffered sweeping outages and service interruptions for customers using Amazon's Northern Virginia data center, aka Availability Zone. Initially, the Amazon outage took down several popular Web sites, including Foursquare, HootSuite, Reddit and Quora. Service hiccups from Amazon's cloud outage persisted for several days, angering users. Amazon's lack of communication around the cloud outage prompted calls for transparency.
While Amazon is still investigating the root cause of the cloud outage, the company said that the issue stemmed from data getting "stuck" within its Elastic Block Store (EBS) service.
Amazon has not said how many users were affected or how affected users will be compensated for the downtime.
Yahoo Mail
Yahoo! Mail, the search company's cloud-based e-mail service, went down Thursday. Yahoo could not say how many users were impacted when its popular e-mail service was down for several hours.
Yahoo acknowledged the outage on its Twitter feed.
"Yahoo Mail is currently inaccessible to some users. We are working to correct the issue and restore all functionality immediately," Yahoo said.
As of this writing, Yahoo had not said what caused the Mail outage.
Google Gmail
Google's widely popular cloud e-mail service Gmail suffered a massive outage in late February 2011 that wiped out thousands of Gmail inboxes. Gmail users awoke to find that messages, folders and other data had vanished from their accounts. At its peak, the outage affected roughly 150,000 Gmail users.
In the days that followed, Google apologized for the outage, calling it a "scare." Google said a software bug that was introduced by a storage update had caused the downtime.
Google Gmail was back to full service within a few days.
Intuit
A number of Intuit's hosted services for SMBs were wracked by a string of service outages in late March 2011. The outages occurred on a Monday, a Tuesday and a Friday, but many users reported issues lasting the entire week. Popular cloud-based Intuit services like QuickBooks Online, QuickBooks Online Payroll and Intuit Payments Solutions conked out during the outages, which were blamed on errors introduced during maintenance operations.
Intuit suffered a similar outage in June 2010.
Microsoft Windows Live Hotmail
Microsoft's Windows Live Hotmail cloud-based e-mail service rang in the new year with an outage that temporarily deleted the inboxes of more than 17,000 users and persisted for more than four days.
The Windows Live Hotmail outage started on Dec. 30, 2010 and continued into January 2011. Users said they logged into their accounts and found that e-mails, folders and other data had vanished and could not be recovered. Microsoft said it had fixed the problem by Jan. 2, but users continued to report issues for two more days.
Microsoft blamed the Hotmail outage on a load balancing issue between Hotmail servers.
Skype
Popular Internet voice and video calling service Skype was knocked offline for several hours in late December 2010. The Skype outage affected millions of users of Skype's Web-based phone and video calling service. The outage lasted about three hours before Skype services slowly returned to normal. Skype blamed the outage on a lack of "supernodes," computers that act as phone directories on Skype's network. A number of supernodes failed because of a software issue, Skype said.
Twitter
Known for sporadic outages, usually for minutes at a time, Twitter suffered several outages throughout much of June 2010 and warned that the downtime could continue into July.
At the time, the cloud-based micro-blogging monster blamed the outages on major world events like the 2010 World Cup, saying that global interest boosted activity to unsustainable levels.
Trouble started on June 11, when Twitter suffered poor site performance and a host of errors because the site was running over capacity. Issues persisted on Monday, June 14, with several hours of ups and downs. Twitter said at the time that periodic outages could continue through the beginning of July, but that it was making internal network adjustments in hopes of avoiding future problems.
Hosting.com
Hosting.com's New Jersey data center was taken down on June 1, 2010, triggering a cloud outage and connectivity loss for nearly two hours. The outage degraded some cloud services and knocked out others. Once service was back up and running, Hosting.com said the connectivity loss was due to a software bug in a Cisco switch that caused the switch to fail.
The Planet
The Planet was rocked by a pair of network outages that affected the operations of a number of customers hosted in its Houston and Dallas data centers. The first knocked The Planet offline for about 90 minutes on May 2, 2010; an investigation found that it was caused by a fault in a router in one of the company's data centers. The next morning, a separate, unrelated outage caused disruptions for another 90 minutes. That disruption was blamed on a circuit issue between Dallas and Houston.
NetSuite
On April 26, 2010, NetSuite suffered a service outage that rendered its cloud-based applications inaccessible to customers worldwide for 30 minutes. Some customers experienced sluggish performance long after that half-hour. NetSuite blamed a network issue for the downtime.