Thursday, 1 March 2012

Microsoft Azure outage - Leap Year Related Issue

Yesterday wasn't just a day for an increased number of proposals; the Microsoft Azure cloud services platform suffered a major outage as well. Estimates suggest the platform was down for at least 8 hours for some customers.

According to Data Center Knowledge, the Microsoft Azure outage may have been caused by a date glitch involving the 29th of February and security certificates. Microsoft's Bill Laing seems to confirm that the leap year date caused the issue. A fix was deployed and rolled out during the day. Reports vary on how long the outage lasted, but Bill indicates that normal service was resumed in most regions after 10 hours.
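To illustrate how a leap day can trip up date handling, here is a minimal Python sketch. It assumes, purely for illustration, that a certificate's expiry date is computed by adding one to the issue year; the reports above do not describe the actual mechanism, and the function names `naive_expiry` and `safe_expiry` are hypothetical.

```python
from datetime import date


def naive_expiry(issued: date) -> date:
    """Hypothetical expiry calculation: same month and day, next year.

    Raises ValueError when issued on 29 February, because the
    following year has no such date.
    """
    return issued.replace(year=issued.year + 1)


def safe_expiry(issued: date) -> date:
    """Same idea, but falls back to 28 February when the target
    year has no 29 February."""
    try:
        return issued.replace(year=issued.year + 1)
    except ValueError:
        # Only 29 February can fail here; use the last valid day instead.
        return issued.replace(year=issued.year + 1, day=28)


if __name__ == "__main__":
    leap_day = date(2012, 2, 29)
    try:
        naive_expiry(leap_day)
    except ValueError as err:
        print("naive calculation failed:", err)
    print("safe calculation:", safe_expiry(leap_day))
```

Whether to fall back to 28 February or roll forward to 1 March is a design choice; the point is only that the leap day needs explicit handling.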

Cloud outages will remain a point of frustration. Last year Amazon EC2 experienced downtime when a power outage at a Dublin facility affected Amazon cloud services. With the push to move services to the cloud, high availability and communication with customers are key for service providers. Amazon and Microsoft both face the challenge of keeping customers informed and restoring services when an outage occurs.