Windows Azure Cloud Outage: Service Restored

Microsoft's Windows Azure cloud suffered a service outage, but service has been restored. Microsoft's sister cloud, Office 365, was not impacted.

As Microsoft's Office 365 cloud prepares for a major February 27 upgrade, sister cloud Windows Azure suffered an outage, but service has been restored. Cloud critics may raise new questions about cloud reliability and availability, but overall Talkin' Cloud believes the major public clouds are far more reliable than traditional on-premises servers.

According to a Microsoft (NASDAQ: MSFT) blog entry from Steven Martin, general manager, Windows Azure Business & Operations, dated Feb. 24:

"Windows Azure Storage experienced a worldwide outage impacting HTTPS traffic due to an expired SSL certificate.  HTTP traffic was unaffected but the event impacted a number of Windows Azure services that are dependent on Storage.  We executed the repair steps to update the SSL certificate on the impacted clusters and availability was restored to >99% worldwide by 1:00 AM PST on February 23.  At 8:00 PM PST on February 23, we completed the restoration effort and confirmed full availability worldwide."

Microsoft plans to offer credits to impacted customers, and the company is performing a root cause analysis to gather more information about what went wrong.

Azure, Amazon Web Services, Rackspace and other big cloud services providers (CSPs) continue to suffer outages from time to time. And the Azure outage comes just as Microsoft is ramping up to launch major enhancements to Office 365, a sister cloud that runs Exchange, SharePoint, and Lync Online.

While I don't have hard data at my fingertips, I suspect CSPs continue to offer better reliability and availability than on-premises servers. As I've frequently noted, comparing public cloud outages to on-premises outages is a bit like comparing plane crashes to car crashes.

Plane crashes are extremely rare, but everyone hears about them because the outcomes can be terrifying and impact communities of people. Car crashes are local, isolated events that don't earn the worldwide spotlight because they are frequent and rarely impact communities of people.

Despite that reasoning, Microsoft will need to raise its cloud game. Amazon is widely viewed as the market leader, Rackspace just cut pricing, and Hewlett-Packard has been promoting aggressive SLAs (service level agreements) for the HP Public Cloud.
