Microsoft has apologized to customers for major outages to its Lync and Exchange cloud services earlier this week, and says it has fixed the glitches that caused them.
"We have a full understanding of the issues, and the root causes of both the Exchange Online and Lync Online services have already been fixed," Rajesh Jha, corporate vice president of Office 365 Engineering, said Thursday in a post to Microsoft's Office 365 user forum.
On Tuesday, Microsoft's Exchange Online cloud email service experienced problems that rendered the service inaccessible for up to nine hours for some customers. These issues led to "prolonged email delays" for messages sent and received by some customers' organizations, Jha said in the forum post.
The Exchange Online outage stemmed from an "intermittent failure in a directory role that caused a directory partition to stop responding to authentication requests," which left a "small set of customers" without email access, according to Jha.
"Our recovery strategy was two pronged: 1) We partitioned the mail delivery system away from the failed directory partition and 2) directly addressed the root cause for the failed directory partition. In addition to fixing the root cause trigger, we are working on further layers of hardening for this pattern," Jha said in the forum post.
Jha described the Exchange issue as "unique" and said its unusual nature is what led to Microsoft's "prolonged" recovery time. However, Jha reiterated that the outage affected only a small set of customers.
Many customers affected by the Exchange outage voiced their frustrations on the Office 365 user forum and on Twitter. Some were confused that Microsoft's Service Health Dashboard (SHD), an online tool that shows customers whether Office 365 apps are available, didn't indicate anything was wrong.
Jha acknowledged that this was due to a separate glitch in the SHD's publishing process, which has since been fixed.
On Monday, Microsoft's Lync Online instant messaging and VoIP service went down for some users for up to eight hours. Jha said that the outage was caused by "external network failures" that triggered a "brief loss of client connectivity" in Microsoft's North America data centers.
"Even though connectivity was restored in minutes, the ensuing traffic spike caused several network elements to get overloaded, resulting in some of our customers being unable to access Lync functionality for an extended duration," Jha said in the forum post.
The issues with Lync were unrelated to the ones that took down the Exchange Online service, Jha said.