Google is harnessing machine learning to cut data center energy

Leave it to Google to have an engineer so brainy he hacks out machine learning models in his 20 percent time. Google says it has recently been using machine learning, developed by data center engineer Jim Gao (his Googler nickname is “Boy Genius”), to predict the energy efficiency of its global data centers with 99.6 percent accuracy, and then to optimize the data centers incrementally if they become less efficient for whatever reason.

Part of Gao’s day-to-day job at Google is to track its data centers’ power usage effectiveness, or PUE, which indicates how much of a facility’s energy goes to computing equipment rather than to overhead such as cooling. Traditionally, many data center operators saw about half of their energy consumed by cooling equipment, but in recent years data center leaders like Google, Facebook and others have focused on techniques like using outside air for cooling, or running server rooms at warmer temperatures, to dramatically cut energy use.
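For reference, PUE is conventionally defined as the ratio of a facility’s total energy use to the energy delivered to its IT equipment, so a value of 1.0 would mean no overhead at all. As a rough worked example, assume a facility where cooling and other overhead take half of the total energy and computing equipment takes the other half (an illustrative split, not a figure reported by Google):

```latex
\mathrm{PUE} = \frac{E_{\text{total facility}}}{E_{\text{IT equipment}}},
\qquad \text{e.g.} \quad \frac{100\ \text{kWh}}{50\ \text{kWh}} = 2.0
```

Under that assumption, every kilowatt-hour of computing costs a second kilowatt-hour of overhead, which is why cutting cooling energy moves PUE so much.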

Accuracy vs predicted in Google's PUE

Google has been calculating its data centers’ PUE every five minutes for over five years, which means Gao had a whole lot of data to work with. The PUE data includes 19 different variables, such as cooling tower speed, processing water temperature, pump speed, outside air temperature and humidity. The result is that Google had hundreds of millions of data points to analyze and put to use in improving data center energy operations.
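To get a feel for the scale, here is a back-of-the-envelope count for a single data center. The sampling interval, time span and variable count come from the article; ignoring leap days and gaps in the record is a simplifying assumption.

```python
# Rough count of PUE readings for ONE data center.
# Interval, duration and variable count are taken from the article;
# leap days and missing samples are ignored (simplifying assumption).
MINUTES_PER_SAMPLE = 5
YEARS = 5
VARIABLES = 19

samples_per_day = 24 * 60 // MINUTES_PER_SAMPLE   # readings per day
samples_per_site = samples_per_day * 365 * YEARS  # readings over five years
values_per_site = samples_per_site * VARIABLES    # individual variable values

print(samples_per_site)
print(values_per_site)
```

That works out to roughly ten million values per site, so a total in the hundreds of millions presumably reflects many data centers or additional readings; the article does not break the figure down.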

A data set that large is best analyzed by computers. So Gao took an online class about machine learning and started building a model trained on all of this PUE data. Google started testing it and quickly found it was extremely accurate at predicting the PUE of any of its data centers across the world at any time.
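The article does not say what kind of model Gao built, so the following is only a minimal sketch of the general idea: learn a mapping from operating conditions to PUE, then use it to predict. It swaps in one-variable ordinary least squares as a stand-in for the real 19-variable model, and the readings are invented for illustration.

```python
# Illustrative only: fit PUE as a linear function of outside air temperature.
# The readings are invented for this example (not Google data), and
# one-variable least squares is a stand-in for the real 19-variable model.

def fit_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict(model, x):
    """Predict PUE for input x using a (slope, intercept) model."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical (outside temperature in deg C, measured PUE) pairs.
temps = [5.0, 10.0, 15.0, 20.0, 25.0]
pues = [1.09, 1.10, 1.11, 1.12, 1.13]

model = fit_line(temps, pues)
predicted = predict(model, 30.0)  # predicted PUE on a 30 deg C day
```

A production model would take all the measured variables at once and capture nonlinear interactions between them, which a straight line cannot.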

Why is that important? If Google knows what the PUE should be at any time, its engineers can set up alerts to inform them if any data center falls outside of the prediction for whatever reason (i.e., there’s probably something wrong going on). Google can also use the model to pick the low-hanging fruit of energy efficiency in the data center: the kind of fine-grained savings that a human looking at a spreadsheet can’t spot, but that software crunching the massive data set can. For example, Google can use it to decide how often, and at what time, to clean data center heat exchangers so as to save both money and energy. Previously they’d been doing that with some guesswork.
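The alerting idea can be sketched in a few lines: compare each measured PUE against the model’s prediction and flag readings that stray too far. The function name, the tolerance and the sample readings below are all hypothetical.

```python
# Flag readings whose measured PUE strays from the model's prediction.
# The tolerance and the sample readings are hypothetical.

def find_anomalies(readings, tolerance=0.02):
    """Return timestamps where |measured - predicted| exceeds tolerance.

    readings: iterable of (timestamp, predicted_pue, measured_pue) tuples.
    """
    return [
        ts
        for ts, predicted, measured in readings
        if abs(measured - predicted) > tolerance
    ]


readings = [
    ("00:00", 1.10, 1.10),
    ("00:05", 1.10, 1.11),  # within tolerance
    ("00:10", 1.10, 1.15),  # well above the prediction
]
alerts = find_anomalies(readings)
print(alerts)
```

In practice the tolerance would have to be tuned to the model’s actual error, so that normal prediction noise does not trigger alerts.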

Google PUE

Finally, Google can use the model to simulate changes to its data centers and see how they would affect PUE. Such tests probably wouldn’t be run in a live environment, because they could have unintended consequences and pose some risk to the system.
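A simulated test of this kind amounts to changing one input to the model and comparing the predicted PUE before and after. The sketch below assumes a toy linear model with invented coefficients and input names; it is not Google’s model.

```python
# Toy what-if simulation: change one input and compare predicted PUE.
# The linear model, its coefficients and the input names are invented
# for illustration; a real model would be learned from historical data.

def predict_pue(inputs):
    """Hypothetical model mapping operating conditions to PUE."""
    return (
        1.05
        + 0.002 * inputs["outside_temp_c"]
        + 0.0005 * inputs["pump_speed_pct"]
    )


def what_if(baseline, **changes):
    """Return the predicted PUE change if `changes` were applied to `baseline`."""
    scenario = {**baseline, **changes}  # copy, so the baseline is untouched
    return predict_pue(scenario) - predict_pue(baseline)


baseline = {"outside_temp_c": 20.0, "pump_speed_pct": 80.0}
delta = what_if(baseline, pump_speed_pct=60.0)  # negative means PUE improves
```

Because the scenario is evaluated only against the model, nothing in the running facility is touched, which is the point of simulating rather than experimenting on live equipment.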

Google’s head of data center operations, Joe Kava, says that the company is now rolling out the machine learning model for use on all of its data centers. Gao has spent about a year building it, testing it out and letting it learn and become more accurate. Kava says the model uses unsupervised learning, so Gao didn’t have to specify the interactions between the variables; the model learns those interactions over time.

Kava says Google has no plans to turn the machine learning model into a product, but by describing its process in a white paper, it lets others learn to build their own models. Once built, the model runs on just a single server, says Kava, so you don’t need a huge amount of compute power to do this.