Obama seeks data scientists for election edge

In 2008, then-Senator Barack Obama famously used social media to build a loyal following of young voters who helped him win the presidency. In 2012, it looks like big data — capturing, processing and analyzing that social media activity, and more — will be a driving force behind the president’s re-election campaign. To that end, President Obama’s data team is taking to the streets to find data scientists and analytics engineers, including at an event Tuesday afternoon at Stanford.

If you look at the numbers, it’s easy to see why President Obama is putting his campaign dollars behind a big data team. As I explained in 2009, Obama’s team did a lot of work around social media to build a large base of supporters, and then undertook an extensive data-integration effort to corral all its data into a single source. By doing this, it was able to, for example, match voter registration information with online information to better personalize the campaign experience for voters. When all was said and done, the Obama for America campaign had amassed more than 13 million email addresses and 5 million online friends, among various other datasets.
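To make the matching step concrete, here is a minimal sketch of joining voter-registration records to online sign-ups on a shared email address. It is an illustration only, not the campaign's actual method: the field names (`name`, `precinct`, `email`, `volunteered`), the sample records and the choice of email as the join key are all assumptions made for the example.

```python
def normalize_email(email):
    """Trim whitespace and lowercase so trivially different spellings match."""
    return email.strip().lower()


def match_records(voter_file, online_signups):
    """Join voter records to online sign-ups that share an email address."""
    # Index the online sign-ups by normalized email for constant-time lookup.
    by_email = {normalize_email(s["email"]): s for s in online_signups}
    merged = []
    for voter in voter_file:
        key = normalize_email(voter["email"])
        signup = by_email.get(key)
        if signup is not None:
            # Combine both sources into one record, keeping the normalized email.
            merged.append({**voter, **signup, "email": key})
    return merged


# Hypothetical sample data, invented for this sketch.
voter_file = [
    {"name": "A. Voter", "precinct": "12", "email": " A.Voter@Example.com "},
    {"name": "B. Voter", "precinct": "7", "email": "b.voter@example.com"},
]
online_signups = [
    {"email": "a.voter@example.com", "volunteered": True},
]

matched = match_records(voter_file, online_signups)
```

Real voter files rarely share a clean key with online data, so an effort like the one described would likely need fuzzier matching on names and addresses; exact email matching is just the simplest case to show.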

Obama for America’s data architect Luke Peterson told me then that the campaign “used data mining and integration to figure out ‘who the cream was on our barrel,’ on whom it should unleash its hordes of volunteers.” He said the campaign’s data-mining effort was its single biggest technology-derived success.

But the data landscape for 2012 is much different. On Twitter alone, President Obama has surpassed the 10-million-follower mark, and more than 23 million people “Like” his Facebook profile. That’s on top of the various other information sources that campaigns typically use, so there’s lots of opportunity for the President’s data team to mine social data alongside traditional data sources to determine where to campaign and how to do it most effectively.

Those large follower counts are why social media data could be such a game-changer for Obama 2012; their sheer scale is unrivaled. The President has essentially built himself a giant focus group that can be sorted, with relative accuracy, by age, geography and other relevant demographic factors. By way of comparison, Mitt Romney has about 93,000 Twitter followers, and Rick Perry — who utilized data very effectively in his last Texas gubernatorial campaign — has about 88,000 followers.

I recently spoke with someone familiar with the analytics efforts of political campaigns, who explained to me just how important analytics have become since the Clinton re-election campaign in 1996. Essentially, he said, Clinton brought about the idea of focusing on swing voters instead of expending resources trying to reach the entire voting base, many of whom were already locked in to one candidate or another. Spearheaded by Karl Rove, the George W. Bush campaigns took targeted campaigning to another level by focusing on swing voters they actually could swing (i.e., non-voting Republicans).

The Bush team, the expert explained, showed how to meld information from voter-registration lists, polls, and sources such as Experian and Acxiom to figure out even more about whom to target. Voters earning $200,000 a year and driving large SUVs or luxury cars, for example, might not be as receptive to campaign literature about gas prices as would cash-strapped voters driving 15-year-old cars. It’s all about targeting the right groups with the right messages, which requires data experts.

Going forward, I expect all campaigns will have to place greater emphasis on social media, because it offers the potential to do unprecedented things around personalization and message-targeting. Imagine, for example, attributing a greater weight to a particular poll response because of that person’s demographics, social network or Klout score. Or using a real-time advertising service like 33Across to intelligently place ads based on where certain segments of a candidate’s social graph are likely to visit, instead of just buying banners on certain high-traffic sites or relying on Google AdWords.
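The poll-weighting idea can be sketched in a few lines. This is a hypothetical illustration, not a description of any campaign's model: the `weight` values are invented, and how a campaign would actually derive a weight from demographics or an influence score is left open.

```python
def weighted_support(responses):
    """Return the share of support when each response counts by its weight."""
    total_weight = sum(r["weight"] for r in responses)
    if total_weight == 0:
        raise ValueError("responses must carry a positive total weight")
    supporting_weight = sum(r["weight"] for r in responses if r["supports"])
    return supporting_weight / total_weight


# Hypothetical responses: the first respondent is given double weight,
# standing in for a larger social network or a higher influence score.
responses = [
    {"supports": True, "weight": 2.0},
    {"supports": False, "weight": 1.0},
    {"supports": True, "weight": 1.0},
]

unweighted = sum(r["supports"] for r in responses) / len(responses)  # 2 of 3
weighted = weighted_support(responses)  # 3.0 of 4.0
```

In this toy case, two of three respondents support the candidate, but weighting the first response double raises measured support to three-quarters.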

Even with all the data mining, though, campaigns still place a premium on privacy. As my expert explained, nobody’s sitting around sifting through data on an individual basis and intentionally targeting specific individuals. Campaigns are very cognizant of not using or exposing personally identifiable information or otherwise violating individuals’ privacy, and data security is paramount. After all, a security breach or the exposure of unscrupulous data practices could spell doom for a campaign.