Microsoft services suffer downtime following failed wide-area network update

Microsoft Corp. customers were none too pleased today after the company suffered a widespread outage that resulted in services including Azure, Teams and Outlook being unavailable for nearly three hours.

The outage resulted from a planned update to the Microsoft Wide Area Network that started at 2 a.m. EST. According to an Azure status update, “customers experienced issues with networking connectivity, manifesting as network latency and/or timeouts when attempting to connect to Azure resources in Public Azure regions, as well as other Microsoft services including Microsoft 365 and PowerBI.”

Microsoft addressed the issue by rolling back the change implemented in the WAN update. Azure services were restored by 4:35 a.m. EST, with other Microsoft cloud services restored around the same time.

Beyond attributing the outage to the scheduled update, Microsoft did not disclose the exact cause. The Microsoft 365 team and others at Microsoft described it only as a “networking issue.”

Following the outage, Microsoft committed to a follow-up, starting with a preliminary “Post Incident Review” that will cover the initial root cause and repair items. A final review, which will include a deep dive into the incident, will be completed within 14 days.

“The Microsoft service outage is a more common event than many realize,” Alex Hoff, co-founder and chief product officer at network management software company Auvik Networks Inc., told SiliconANGLE. “For most organizations, changes to the network occur daily or weekly, and the IT team doesn’t always have complete visibility into those changes.”

Hoff noted that documentation of network changes and configurations is often incomplete or lags significantly behind the actual state of the network. “This makes it far more difficult for IT teams and network managers to pinpoint and correct issues when the network goes down,” he added.

Matthew Hodgson, chief executive officer of secure messaging platform Element, highlighted that this wasn’t the first time Teams has gone down, forcing businesses to fall back on cumbersome email — except that this time, Outlook also failed.

“One of the biggest problems with using centralized platforms like Teams is that when it goes down, you have put all your eggs in one basket: Your critical conversations have been held hostage in a single system, with a single point of failure,” Hodgson explained.

Photo: Georgetown University
