Are We Getting Too Outage-Sensitive?

As GigaOM reported last week, an AWS outage took down Netflix on Christmas Eve. Not a good time, considering how many people were at home and streaming.

“GigaOM’s Janko Roettgers reported, Amazon’s US east facility reported issues with its Elastic Load Balancing service that carried over into Christmas morning.”

Interestingly, Amazon’s Prime Instant Video streaming service, which competes head-on with Netflix and also runs on AWS, appeared to be unaffected by the US East outages. Pretty sure that was not the plan.

These outages are concerning. However, I’m not sure it is productive to get too wrapped up in the fact that cloud providers have to use physical computers, drives, and networking devices, and that these fail at times. Just look at the number of times your internal enterprise systems also experience outages. I suspect it’s more often than at AWS, Rackspace, Microsoft, or the other larger providers.

AWS is learning how to manage through these outages, and how to get better at operations as a result. I’m sure there will be many more outages this year and next. You just need to build those types of events into your cloud service usage and operations planning. I just don’t see the point of hand-wringing over each of these outages.