The personalized web is just an interest graph away

You know how our social graphs are creeping into every aspect of our web lives, from search results to coupons? Well, get ready for something a lot more personal, a lot more targeted and, perhaps, a lot more creepy.

Much as social graphs are maps of our social media connections that follow us across the web, interest graphs are maps of our interests. Some companies want them to follow us across the web, too, meaning that wherever we go, there we are. There’ll be no more need to search through news sites for the stories we want, or shopping sites for the products we want, because the site will know as soon as we hit its system who we are and what we like.

Whether you’re fascinated or appalled by the idea of interest graphs, here’s a taste of how they might work.

1. Figure out what you like

I recently discussed the idea of interest graphs with Gravity CTO Jim Benedetto, who described how his company determines visitors’ interests so its content-industry customers can deliver personalized experiences. Gravity does that by tracking a user’s activity across its network of sites to get a true sense of that user’s interests, so it can automatically feature relevant content whenever the user visits a Gravity-powered site. An example on the Gravity Labs site explains the process of building a user graph in some detail.

Technologically, Gravity accomplishes this via a big-data engine that sits above expansive data sets such as Freebase and DBpedia and helps its system determine what a user meant when he clicked on a particular article or made a certain comment, that is, what his actual interest is. Describing the system last week in a post about software patents, I called it “a system that can determine with high accuracy that a person tweeting about Vanessa Laine (Los Angeles Laker Kobe Bryant’s ex-wife), for example, is probably more interested in basketball than about Laine’s date of birth or other accurate but irrelevant information.”
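To make the idea concrete, here is a minimal Python sketch of how activity might be mapped to interests. It is not Gravity’s actual engine: the tiny `KNOWLEDGE_BASE` dictionary stands in for a data set like Freebase or DBpedia, and the topic relevance numbers, the `SIGNAL_WEIGHTS` values and the `build_interest_graph` function are all assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical stand-in for a large knowledge base such as Freebase or DBpedia:
# each entity maps to the topics it relates to, with an assumed relevance weight.
KNOWLEDGE_BASE = {
    "Vanessa Laine": {"basketball": 0.8, "celebrity news": 0.2},
    "Kobe Bryant": {"basketball": 1.0},
    "Lake Tahoe": {"skiing": 0.6, "travel": 0.4},
}

# Assumed weights per kind of activity. The values are invented; they only
# illustrate treating clicks and comments as stronger signals than tweets.
SIGNAL_WEIGHTS = {"click": 1.0, "comment": 1.5, "tweet": 0.5}


def build_interest_graph(events):
    """Turn (signal, entity) events into normalized topic scores.

    Unknown signals and unknown entities contribute nothing.
    Returns an empty dict when no event maps to a known topic.
    """
    scores = Counter()
    for signal, entity in events:
        weight = SIGNAL_WEIGHTS.get(signal, 0.0)
        for topic, relevance in KNOWLEDGE_BASE.get(entity, {}).items():
            scores[topic] += weight * relevance
    total = sum(scores.values())
    if total == 0:
        return {}
    return {topic: score / total for topic, score in scores.items()}


if __name__ == "__main__":
    activity = [
        ("tweet", "Vanessa Laine"),
        ("click", "Kobe Bryant"),
        ("comment", "Lake Tahoe"),
    ]
    graph = build_interest_graph(activity)
    for topic, score in sorted(graph.items(), key=lambda item: -item[1]):
        print(f"{topic}: {score:.2f}")
```

In this toy version, a tweet about Vanessa Laine nudges the user’s score toward basketball rather than toward biographical trivia, because the knowledge base links her to that topic. A production system would need entity disambiguation and far richer data.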

Actually, though, Benedetto said Twitter and Facebook aren’t as good for determining people’s actual interests as clicks and comments on other sites are; Gravity uses the two social networks primarily to figure out what’s generally popular. He said a high percentage of tweets are from people sharing information that’s important to them professionally, while Facebook is limited largely to sharing information among a collection of people with whom users are often quite close. In either case, he said, “There’s a lot of posturing [to impress others] … that results in incomplete data.”

At MySpace, where Benedetto was senior vice president of technology, he said his team often found people would join groups based more on their feeling toward the person inviting them than on an actual interest in the issue. We’ll sign up to save the whales not because we care so much about whales but because we don’t want to disappoint a friend.

2. Take it with you

“One day,” Benedetto told me, “people will be able to apply their interest graphs on every site across the Internet.”

It’s a compelling vision. Much as our social graphs influence our search results when we use Bing or Google, our interest graphs will influence what we see wherever we go on the web. If you like skiing, expect to see deals on skiing gear when you’re perusing ecommerce sites, or stories about winter sports on news sites. And expect to see this without having to log in, as it will be based on everything you’ve done across the entire web.

If you think about it, we all probably have several disparate interest graphs hanging around already, even if they’re not as advanced as what Gravity is building. The uproar around Google’s personalized search? Based on an interest graph. Amazon’s recommendation engine? Interest graph. Yahoo News? Interest graph.

The real trick will be in figuring out how to connect these disparate graphs to build a single, portable account of who we are and what we’re interested in. Presumably, methods such as OAuth, which enable API-based data sharing among services, will have to play a role. Log in once when you get online, and your interest graph is ready to go, ready to make the web a place that serves you rather than a place you must learn to navigate.

3. Rein it in

Of course, for every person in love with the idea of a personalized web-surfing experience, there’s probably one person who considers it a privacy nightmare. There likely are others who want an objective experience not tainted by any system’s assumption about what they want to see. For these reasons, the personalized web will almost certainly have to be an opt-in experience.

Privacy could be a particularly difficult problem to solve. Discussing the issue of the personalized web in a November report for GigaOM Pro (subscription required), I explained the necessity of empowering consumers to decide what data is a part of their digital fingerprint, or interest graph. Most people probably don’t want their affinity for pornography affecting the news they’re shown, or to be reminded of an impending foreclosure everywhere they go just because they’ve done a lot of research on the subject. Site permissions and “do not track” buttons might have to get increasingly granular, and increasingly easy to understand, to ensure the personalized web doesn’t become an enemy.
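As a rough illustration of what granular control could look like, the Python sketch below lets a user strike specific topics from an interest graph before any site sees it. The `apply_privacy_settings` function and the dictionary format are hypothetical, not a description of any existing “do not track” mechanism.

```python
def apply_privacy_settings(interest_graph, blocked_topics):
    """Drop user-blocked topics and rescale the remaining scores to sum to 1.

    interest_graph: dict mapping topic name -> score (assumed format).
    blocked_topics: iterable of topic names the user wants excluded;
    matching is case-insensitive.
    Returns an empty dict if nothing remains after filtering.
    """
    blocked = {topic.lower() for topic in blocked_topics}
    kept = {
        topic: score
        for topic, score in interest_graph.items()
        if topic.lower() not in blocked
    }
    total = sum(kept.values())
    if total == 0:
        return {}
    return {topic: score / total for topic, score in kept.items()}


if __name__ == "__main__":
    profile = {"skiing": 0.5, "foreclosure": 0.3, "news": 0.2}
    shared = apply_privacy_settings(profile, ["Foreclosure"])
    for topic, score in shared.items():
        print(f"{topic}: {score:.2f}")
```

The point of the sketch is that filtering happens on the user’s side of the exchange: a blocked topic never reaches the site at all, rather than being sent and then ignored.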

A true picture of our interests is one thing, but we’ll always need boundaries, especially as our lives get even more connected. Do you want ads for Valtrex popping up while watching Google TV with your new girlfriend? Do you want her popping on your computer to check the news and see wedding planning stories front and center, ruining your romantic proposal? Just as we can have our own rooms in our homes, or can drive across town and shop anonymously, we need pockets of privacy in the personalized web.

If you find the possibilities and pros and cons of a personalized web as fascinating as I do, be sure to get to our Structure:Data conference next week in New York, or at least watch the livestream. We’ll be talking all about the latest techniques for analyzing web and mobile data, and I’ll be speaking with Gravity’s Benedetto and Gibson Dunn partner Ashlie Berlinger about the line between what’s possible with analytics and what’s ethical, or even legal.

Interest graph image courtesy of Gravity; sherpa image courtesy of Ian S.