AWS, February 28
This was the outage that shook the industry.
An Amazon Web Services engineer trying to debug an S3 storage system in the provider's Virginia data center accidentally typed a command incorrectly, and much of the Internet – including many enterprise platforms like Slack, Quora and Trello – was down for four hours.
The post-mortem said the employee was using "an established playbook," and intended to pull down a small number of servers that hosted subsystems for the billing process. Instead, the accidental command resulted in a far broader swath of servers being taken offline, including one subsystem necessary to serve specific requests for data storage functions and another allocating new storage.
The outage from a provider that owns roughly a third of the global cloud market reignited debate on the risks of public cloud.