When big IT goes after big data on the smart grid

This article originally appeared on GigaOM Pro, our premium research service (subscription required).

With many utilities facing the task of storing petabytes of smart meter data for as long as seven years in order to satisfy regulatory requirements, the ability to house and leverage the massive load of data accumulating from the smart grid is a significant IT challenge. And it isn’t a competency that many utilities have had historically.

I’ve written before about how utilities are transforming into IT companies, as they must learn to manage and mine increasing amounts of data. The first task for all utilities engaged in smart meter deployments has been to process smart meter readings to ensure integrity of billing, a market known as meter data management systems (MDMS). Two of the key players in that market, Ecologic Analytics and eMeter, were snapped up by Landis+Gyr and Siemens respectively over the past four months.

That market has been fairly staked out at this point, and has relatively finite growth potential because there are only so many utilities that will need an MDMS. But the opportunity to do something with that data is a crucial big data challenge and potentially a much bigger market, because tools to mine data can continue to evolve to solve problems for utilities and save them money.

Stepping into the fray have been established IT companies. EMC made a strategic investment in smart grid networking company Silver Spring Networks in December, and data warehousing and analytics firm Teradata has partnered with global meter maker Itron. Companies like EMC and Teradata are quick to point out that existing MDMS software platforms for processing meter data have none of the high-performance data processing engines needed to effectively manage and mine data.

EMC, for example, is relying on its Greenplum analytics platform to make an impact on utilities’ bottom lines. The company has been using its reputation as a leader in storage warehousing to make the case to utilities that its hardware and software integration is best suited to help them manage their big data problem. And the core of that message has been that utilities need to evolve beyond MDMS.

“Once the data is collected by an MDMS, the fundamental problem with utilities is they’re trying to apply analytics to the MDMS on an application that is already performance bound,” said Thomas Price, the Global Sales Director for the Utility Industry at EMC.

The fact that Silver Spring Networks was already deploying Greenplum internally for its utility clients was one of the reasons for EMC’s investment. Price added that Silver Spring was one of the few smart grid providers he’s dealt with that is using Hadoop to develop approaches to unstructured data on the grid, thinking in terms of what IT analytics tools can be brought to bear on the petabytes of data accumulating.

And what could an analytics engine specifically built to aggregate, correlate and mine disparate data, ranging from meter readings to weather data, actually do for a utility? For starters, a high-performance data engine should increase the speed with which meter readings are batched, a step toward providing real-time data. Reports are that data warehousing software not optimized for the smart grid can take as long as 12 hours to turn over a four-hour block of smart meter readings, whereas Price says Greenplum will do it in 15 minutes. If both figures hold, that is roughly a 48-fold speedup.

But much more interesting is the possibility that analytics can save utilities money by solving problems like “non technical loss,” or people stealing power. We don’t usually think about the possibility that a pot farm is stealing power, but power theft is an actual problem for utilities. It amounts to a couple of percent of power in places like the U.S., but almost 20 percent of power is stolen in countries like Brazil. Analytics engines can pick up unusual patterns by comparing the amount of power going out the door with what’s getting billed, in order to see where anomalies exist. BC Hydro, a major Canadian utility, has issued a request for proposals solely focused on theft prevention.
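To make the "power out the door versus power billed" comparison concrete, here is a minimal Python sketch of a feeder-level energy balance. It is an illustration only, not how Greenplum or any vendor product works. The names (`FeederBalance`, `flag_suspect_feeders`), the input shapes and the 8 percent threshold are all assumptions made for the example; a real system would also have to separate ordinary technical line losses from theft.

```python
from dataclasses import dataclass


@dataclass
class FeederBalance:
    """Energy delivered to a feeder vs. energy billed to meters on it (hypothetical model)."""
    feeder_id: str
    delivered_kwh: float
    billed_kwh: float

    @property
    def loss_fraction(self) -> float:
        # Share of delivered energy that never shows up on a bill.
        if self.delivered_kwh <= 0:
            return 0.0
        return (self.delivered_kwh - self.billed_kwh) / self.delivered_kwh


def flag_suspect_feeders(delivered, billed_by_meter, meter_to_feeder, threshold=0.08):
    """Return feeders whose unbilled share exceeds `threshold`, worst first.

    delivered:        {feeder_id: kWh measured leaving the substation}
    billed_by_meter:  {meter_id: kWh billed for the same period}
    meter_to_feeder:  {meter_id: feeder_id}
    threshold:        assumed cutoff (8%) for this sketch, not an industry standard
    """
    # Sum billed energy per feeder, ignoring meters mapped to unknown feeders.
    billed = {fid: 0.0 for fid in delivered}
    for meter_id, kwh in billed_by_meter.items():
        fid = meter_to_feeder.get(meter_id)
        if fid in billed:
            billed[fid] += kwh

    balances = [FeederBalance(fid, delivered[fid], billed[fid]) for fid in delivered]
    suspects = [b for b in balances if b.loss_fraction > threshold]
    return sorted(suspects, key=lambda b: b.loss_fraction, reverse=True)
```

A flagged feeder only tells the utility where to look. Narrowing the anomaly down to an individual meter or an unmetered tap would take further analysis or a field visit.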

Other applications include using analytics to figure out the source of an outage on the grid, rather than rolling a truck to the end user reporting the outage, which is costly and inefficient. Additional applications revolve around addressing power quality issues like low voltage spots on the grid.

In our conversation, Price noted that EMC is employing 25 data scientists with PhDs in math and statistics to correlate data and write algorithms specific to industries like the energy sector. The fact that a $60 billion data storage and IT company is applying its software know-how to the smart grid tells us that just ingesting and warehousing that data will not be enough going forward. Rather, the real opportunity isn’t installing the meter or verifying the meter read, but helping utilities use that data to build better businesses.