Microsoft Apologizes For Exchange, Lync Cloud Outages, But Says Only Small Number Of Customers Affected

Microsoft has apologized to customers for major outages of its Lync and Exchange cloud services earlier this week, and says it has fixed the glitches that caused them.

"We have a full understanding of the issues, and the root causes of both the Exchange Online and Lync Online services have already been fixed," Rajesh Jha, corporate vice president of Office 365 Engineering, said Thursday in a post to Microsoft's Office 365 user forum.

On Tuesday, Microsoft's Exchange Online cloud email service experienced problems that rendered the service inaccessible for up to nine hours for some customers. These issues led to "prolonged email delays" for messages sent and received by some customers' organizations, Jha said in the forum post.

The Exchange Online outage was caused by an "intermittent failure in a directory role that caused a directory partition to stop responding to authentication requests," which left a "small set of customers" without email access, according to Jha.

"Our recovery strategy was two pronged: 1) We partitioned the mail delivery system away from the failed directory partition and 2) directly addressed the root cause for the failed directory partition. In addition to fixing the root cause trigger, we are working on further layers of hardening for this pattern," Jha said in the forum post.

Jha described the Exchange issue as "unique" and said this is what led to Microsoft's "prolonged" recovery time. However, Jha reiterated that the outage affected only a small set of customers.

Many customers affected by the Exchange outage voiced their frustrations on the Office 365 user forum and on Twitter. Some were confused that Microsoft's Service Health Dashboard, an online tool that shows customers whether or not Office 365 apps are available, didn't indicate anything was wrong.

Jha acknowledged that this was due to a separate glitch in the SHD's publishing process, which has since been fixed.

On Monday, Microsoft's Lync Online instant messaging and VoIP service went down for some users for up to eight hours. Jha said that the outage was caused by "external network failures" that triggered a "brief loss of client connectivity" in Microsoft's North America data centers.

"Even though connectivity was restored in minutes, the ensuing traffic spike caused several network elements to get overloaded, resulting in some of our customers being unable to access Lync functionality for an extended duration," Jha said in the forum post.

The issues with Lync were unrelated to the ones that took down the Exchange Online service, Jha said.

Several Microsoft partners told CRN earlier this week they think Microsoft should have provided more frequent updates about the outages. By not doing so, Microsoft put partners in a tough position with their own customers.

Microsoft is aware of this, and Jha offered an apology to those affected. "I want to apologize on behalf of the Office 365 team for the impact and inconvenience this has caused. Email and realtime communications are critical to your business, and my team and I fully recognize our accountability and responsibility as your partner and service provider," he said in the forum post.

Reed Wilson, founder and president of Palmetto Technology Group, a Greenville, S.C.-based Microsoft partner, told CRN he's satisfied with Jha's explanation.

"It makes sense. My only issue was the communication cadence," Wilson said. "I believe Microsoft is continuing to work on these processes. They know they can do better and are always open to feedback."

Microsoft claims that Office 365 is the fastest-growing business in its history and is now at an annual sales rate of $2.5 billion. And for the past couple of years, Office 365 has had a sterling track record for uptime.

Microsoft's service level agreement promises 99.9 percent uptime per quarter, which amounts to a little over two hours of allowable downtime per quarter, or just under nine hours over a full year. But during the first calendar quarter, Office 365 saw 99.99 percent uptime, according to Microsoft.
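
For readers who want to check the arithmetic behind an uptime target, the short Python sketch below converts a percentage into allowable downtime. It is an illustration only: the function name and the 365-day year are assumptions made here, not anything taken from Microsoft's service terms.

```python
HOURS_PER_DAY = 24
DAYS_PER_YEAR = 365  # assumption: non-leap year, quarter = one fourth of it


def allowed_downtime_hours(uptime_percent: float, period_days: float) -> float:
    """Return the hours of downtime an uptime target permits over a period.

    Hypothetical helper for illustration; not part of any Microsoft tooling.
    """
    downtime_fraction = 1 - uptime_percent / 100
    return downtime_fraction * period_days * HOURS_PER_DAY


# A 99.9 percent target measured over one quarter and over one year.
per_quarter = allowed_downtime_hours(99.9, DAYS_PER_YEAR / 4)
per_year = allowed_downtime_hours(99.9, DAYS_PER_YEAR)

# The 99.99 percent figure reported for the first calendar quarter.
reported_quarter = allowed_downtime_hours(99.99, DAYS_PER_YEAR / 4)

print(f"99.9% per quarter:  {per_quarter:.2f} hours")
print(f"99.9% per year:     {per_year:.2f} hours")
print(f"99.99% per quarter: {reported_quarter * 60:.0f} minutes")
```

Under these assumptions, a 99.9 percent target allows roughly 2.2 hours of downtime in a quarter and about 8.8 hours in a year, while 99.99 percent over a quarter corresponds to roughly 13 minutes. Outages of eight to nine hours, as some customers experienced this week, would exceed a 99.9 percent quarterly target for those customers.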

On the other hand, Microsoft is trying to get more partners selling Office 365, and its response to this week's cloud outages could give pause to those considering it.

The Office 365 outages will no doubt be a topic of discussion at Microsoft's annual Worldwide Partner Conference, being held July 13-16 in Washington, D.C.