Microsoft's Windows Azure Hit With Global Compute Performance Glitch


Microsoft's Windows Azure cloud infrastructure-as-a-service ran into compute and management problems on Wednesday that prevented users around the world from moving code from Azure's staging environment into production.

On its Windows Azure status dashboard, Microsoft classified the issue as a "performance degradation" and not a "service interruption."

The issues began at 7:35 p.m. PST Tuesday when Microsoft, in a bulletin to its Azure status dashboard, warned of "partial performance degradation" in its North Central U.S., South Central U.S., North Europe, Southeast Asia, West Europe, East Asia, East U.S. and West U.S. regions.

[Related: Microsoft Adds Backup And Recovery To Windows Azure Public Cloud]

A few hours later, Microsoft said the compute issues had begun to affect Azure's compute service management functions in all of these regions. It recommended users hold off on using Azure's Swap Deployment function, which developers use to move code between Azure's staging and production environments.

"Swap Deployment is a virtual IP address swap between the staging and production deployment environments for a service," and it's used for moving code from test/dev into production, a Microsoft spokesperson said in an email.

Microsoft certainly isn’t the only cloud vendor to suffer service outages. In an industry where "five-nines" (99.999 percent uptime) means your service can only be down for five minutes per year, any cloud outage tends to get a ton of scrutiny.

But while the glitch did not affect running applications or Azure's compute service, The Register said the fact that it was global in nature is "likely to damage confidence in Microsoft's ability to manage services at scale."

Amazon Web Services has had its share of cloud outages, too, including a massive one in April 2011, and one last December that took down Netflix for several hours on Christmas Eve. But as noted by GigaOm, while Amazon's U.S. East facility had experienced issues in the past, AWS hasn't seen them hit multiple global regions.

In this case, Microsoft said it has fixed the Azure compute and service management issues by 3:45 a.m. PST Thursday, the spokesperson told CRN.

Another service management issue flared up on Thursday morning at 10:45 a.m. PST that prevented users in the West U.S. region from deploying new hosted services, but Microsoft said it had fixed that in a 12:20 p.m. PST dashboard update.

Microsoft has recently added a bunch of new features aimed at luring more enterprises to the Azure public cloud, including backup and recovery and Windows Azure HDInsight, a cloud-based Hadoop service developed in partnership with Hortonworks.

PUBLISHED OCT. 31, 2013

See the latest cloud technologies, learn best practices, and interact with your peers at the channel’s first all-inclusive cloud event: NexGen Cloud Conference & Expo, December 4-5, 2014 at the San Diego Convention Center. Register now at  www.NexGenCloudCon.com