Splunk wants to webify big data

IT analytics company Splunk has received a patent for its method of organizing and presenting big data to mirror the experience of browsing links on the web. The patent validates Splunk’s unique approach to the problem of analyzing mountains of machine-generated data and hints at a future where writing big data applications doesn’t require a Ph.D.

Splunk began as a simple IT search company that let systems administrators easily peruse log files, but Co-Founder and CTO Erik Swan said the goal has always been bigger. In the five years since it filed the patent, Splunk has been transforming its product to fit its vision of creating what Swan calls a navigable space linking one event to another using “what effectively look like hyperlinks.” Essentially, he explained, Splunk wants users to think about big data like a web problem and not like an analytics problem. And it wants to transform its product from an indexing engine into an application engine.

Ideally, the result of Splunk’s efforts is that even web developers can use its products to extract meaningful business insights from machine-generated data. Traditionally, writing big data applications and making sense of the results have required what have come to be called data scientists, but Swan said that’s in part because tools like Hadoop present results as CSV files. Splunk, on the other hand, turns data into HTML. It’s not about algorithmic horsepower, he explained, but about learning how to move around within the web of data.

To a large degree, the web navigation experience is present in Splunk’s product today, but the one thing missing is true web-style application development. That’s why, Swan said, the company has hired a team of developers in Seattle to create software development kits and APIs to open Splunk data to Java, Ruby and other web developers. If this effort is successful, Swan said, developers will be able to get MapReduce results without writing heavy-duty applications.

Splunk is ahead of the game when it comes to democratizing big data, but it’s not alone. Even for something as relatively complex as Hadoop, there are numerous startups (e.g., Hadapt and Datameer) building products on top of it to mask the complexity and let everyday business users run Hadoop jobs. However we get there, the end has to be big data products that take the high science out of analytics. Everybody is interested in big data, but it’s likely still rather intimidating for most companies without the means to hire teams of data scientists.

Image courtesy of Flickr user jonny golstein.