Netflix’s new Chaos Engineering push aims to hire staff to help break its cloud-based system

Netflix on Thursday detailed a new engineering initiative dubbed Chaos Engineering, built on the premise that unforeseen bugs could shut down Netflix’s services, so the company must be fully prepared for emergencies.

As Netflix’s infrastructure has grown more complex since the 2010 creation of its original Chaos Monkey tool, which tests the stability of its Amazon Web Services system, the company decided that it needs to better prepare itself for server failures and outages. With its Chaos Engineering initiative as a possible remedy, Netflix wants to hire engineers whose sole purpose is to constantly scout for errors in its massive distributed infrastructure while developing new tools that measure how resilient that infrastructure really is.

While this may sound similar to the company’s Simian Army, a set of tools Netflix developed to simulate failures in its AWS environment in order to better prepare for actual emergencies, Chaos Engineering goes a step further by adding a methodology in which engineers are personally in charge of running simulations of the Netflix environment and looking for ways to break the system.

Besides creating new tools that simulate system crashes so Netflix can identify realistic examples of what might go wrong on a given day, the Chaos Engineering team will also be responsible for designing Netflix’s infrastructure to better handle unforeseen failures.

From the Netflix blog:
[blockquote person=”Netflix” attribution=”Netflix”] Ideally distributed systems are designed to be so robust and fault-tolerant that they never fail. We must anticipate failure modes, determine ways to inject these conditions in a controlled manner and evolve our reliability design patterns. Anticipating such events requires creativity and deep understanding of distributed systems; two of the most critical characteristics of Chaos Engineers.[/blockquote]
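
To make the idea of injecting failures "in a controlled manner" concrete, here is a minimal Python sketch. It is not Netflix's tooling: the service function, the retry-and-fallback caller, and every name in it are hypothetical, and the only assumption is that a failure can be simulated by raising an exception at a chosen rate.

```python
import random


class InjectedFailure(Exception):
    """Raised by the fault injector in place of a real outage (hypothetical)."""


def with_fault_injection(func, failure_rate, rng):
    """Wrap func so that each call fails with probability failure_rate."""
    def wrapper(*args, **kwargs):
        if rng.random() < failure_rate:
            raise InjectedFailure("simulated outage")
        return func(*args, **kwargs)
    return wrapper


def fetch_recommendations(user_id):
    """Hypothetical downstream service call."""
    return [f"title-{user_id}-{i}" for i in range(3)]


FALLBACK = ["popular-1", "popular-2"]


def resilient_fetch(fetch, user_id, retries=2):
    """Try fetch up to retries + 1 times, then fall back to a static list."""
    for _ in range(retries + 1):
        try:
            return fetch(user_id)
        except InjectedFailure:
            continue
    return list(FALLBACK)


def run_experiment(failure_rate, calls=1000, seed=42):
    """Return how many calls were served by the fallback under injected failures."""
    rng = random.Random(seed)  # seeded so the experiment is repeatable
    flaky = with_fault_injection(fetch_recommendations, failure_rate, rng)
    fallbacks = 0
    for user_id in range(calls):
        result = resilient_fetch(flaky, user_id)
        # The resilience property under test: callers always get an answer.
        assert result, "caller must always get a non-empty response"
        if result == FALLBACK:
            fallbacks += 1
    return fallbacks
```

Calling `run_experiment(0.3)` runs the caller against a service that fails roughly 30 percent of the time and reports how often it had to degrade to the fallback list. The seeded random generator keeps the experiment controlled, so a run that exposes a weakness can be reproduced.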

Netflix plans to detail its research in the coming months as its Chaos Engineering team takes shape.

Post and thumbnail images courtesy of Shutterstock user Twin Design.