
The world according to Twitter as seen through a high-performance computer

I watched the 2012 elections on Twitter, and I wasn’t alone. Data from the social network bears that out, with more than 31 million election-related tweets recorded that night. But what does that type of activity look like in real time? Using the SGI UV 2000 Big Brain supercomputer at the University of Illinois, two data aficionados were able to generate a heat map of activity in real time.

Below is the video of their efforts, set to music that is somewhat overdramatic for an election (maybe if it were 2000, y’all). And here is where you can see the live trending data showing sentiment on Twitter right now.

[youtube http://www.youtube.com/watch?v=oVaBws-3BVs]

The two researchers, Kalev H. Leetaru of the University of Illinois and Shaowen Wang of the CyberInfrastructure and Geospatial Information (CIGI) Laboratory at the University of Illinois at Urbana-Champaign, also tracked tweets related to Hurricane Sandy as part of what they call the Global Twitter Heartbeat Project. Beyond producing cool, real-time heat maps, the effort shows how scientists (and eventually marketers) can use high-performance computers to track real-time unstructured data. It is something people can do today. From a release on the project:

The Global Twitter Heartbeat project performs real-time stream processing of ten percent of Twitter’s 400M daily tweets as they are posted. The project analyses every tweet to assign location (not just GPS-tagged tweets, but processing the text of the tweet itself), and tone values and then visualizes the conversation in a heat map infographic that combines and displays tweet location, intensity and tone. With SGI UV, the entire process from data analysis to heat map was produced once per second.
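To make the numbers in that release concrete, here is a minimal Python sketch, not the project's actual code. It works out how many tweets per second a ten percent sample of 400 million daily tweets implies, then bins tweets that already carry a location and a tone score into grid cells holding intensity (tweet count) and average tone, which is roughly the data a heat map like this would need. The function names, the 10-degree grid and the sample tweets are all assumptions for illustration; the release does not say how the project geocodes text or scores tone.

```python
from collections import defaultdict
import math

SECONDS_PER_DAY = 24 * 60 * 60


def sampled_tweets_per_second(daily_total, sample_fraction):
    """Average tweets per second in a sample of the daily stream."""
    return daily_total * sample_fraction / SECONDS_PER_DAY


def bin_tweets(tweets, cell_deg=10.0):
    """Group (lat, lon, tone) tuples into grid cells.

    Returns {(row, col): (count, mean_tone)}. The grid size is a
    hypothetical choice, not something stated in the release.
    """
    sums = defaultdict(lambda: [0, 0.0])  # cell -> [count, tone total]
    for lat, lon, tone in tweets:
        row = math.floor((lat + 90.0) / cell_deg)
        col = math.floor((lon + 180.0) / cell_deg)
        cell = sums[(row, col)]
        cell[0] += 1
        cell[1] += tone
    return {key: (count, total / count)
            for key, (count, total) in sums.items()}


if __name__ == "__main__":
    # Ten percent of 400M daily tweets, as quoted in the release.
    print(round(sampled_tweets_per_second(400_000_000, 0.10)))

    # Made-up tweets: two near Urbana-Champaign, one near London.
    sample = [(40.1, -88.2, 0.5), (40.9, -88.9, -0.5), (51.5, -0.1, 1.0)]
    print(bin_tweets(sample))
```

By that arithmetic, the sampled stream averages a few hundred tweets per second, each of which has to be located, scored and folded into the map before the next one-second frame is drawn.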

The SGI UV 2000 is an impressive machine, with a maximum of 4,096 cores, up to 64 terabytes of cache-coherent shared memory and a peak I/O rate of four terabytes per second. Presumably one could use another high-performance machine with a lot of parallel processing and fast I/O, but Leetaru and Wang apparently had an SGI machine on their desks. In the release, Leetaru likened the process of looking at all this data to peering into a telescope focused on the “post-demographic world” where individuals could be processed directly in the flow of information rather than forcing them into a specific demographic cohort.