Platfora Founder and CEO Ben Werther doesn’t want to make Hadoop less painful to use; he wants Hadoop to underpin a whole new way of analyzing and visualizing business data. Platfora’s flagship product, which it unveiled publicly for the first time on Tuesday after launching in September 2011, looks to do just that by turning the business intelligence experience on its head. It’s part of a bigger trend toward democratizing data analysis, but with an emphasis on scale.
Werther, who spent time at DataStax and Greenplum (s emc) before starting Platfora, describes the current state of business intelligence as being like a double-edged sword. Traditional data warehouses are mature, he explained, but limited because strict processes for adding data and changing schema essentially force analysts to “live inside the constraints of whatever’s available to them.” Hadoop, on the other hand, is irresistible but flawed — it stores lots of raw data without fixed schema, but tools such as Hive that try to add a familiar facade “don’t in any way make it interactive or suitable for general use.”
The state of affairs has Fortune 500 companies “literally screaming out for an answer,” Werther said.
Big, fast data
With its eponymous software product, Platfora thinks it has that answer. It uses Hadoop as a scalable data store from which users can grab data sets and manipulate which variables are shown and how that data set relates to others. Platfora calls this data-management process building a “lens.”
What makes the product really hum, though, is how users can interact with data once it’s visualized as a graph. As with Tableau, they can drag and drop new variables into a graph and watch it automatically account for them. However, Platfora also uses its own massively parallel in-memory database to store more than a terabyte of related metadata, which underpins what the company calls its “Fractal Cache” technology. It means users can change their minds about what data to include or how it’s related, and then start analyzing it anew without a hitch.
Based on what I saw in a demonstration, Platfora’s HTML5 canvas rendering makes the process as visually rich as it is fast. You can highlight a portion of a graph and drill down into just the data points included in that zone, and then drill down even further or just start a whole new analysis using the smaller data set as the starting point. Using concepts from the “grammar of graphics,” Platfora VP of Products and Marketing Peter Schlampp said users can create just about any type of visualization they can imagine.
Essentially, Werther said, Platfora has turned Hadoop into a sub-second interactive engine that operates much faster than any Hadoop-to-data-warehouse connector or Hive-based approach could ever hope to do. (Hive is the SQL-like query language developed for Hadoop that companies such as Facebook (s fb) use to turn Hadoop into a data warehouse for unstructured data.) “At the point where you can synthesize on the fly,” he said, “[legacy BI tools] start to look like relics of another age.”
If legacy dies, who wins?
As proof that its approach works, Werther points to the tens of well-known customers taking part in Platfora’s private beta and the hundreds it hopes to let in now that it has officially taken the lid off the product. One customer, he said, was able to get up and running on a petabyte of data in just four hours. Another, a large media company, is using Platfora to analyze 2.4 petabytes and more than 2 billion user records. In that deployment, Platfora replaced a BI strategy that included Hadoop, Hive, Vertica (s hpq) and Tableau, and that required users to request a new Vertica database every time they wanted to change the data underlying their analyses.
However, there are many similar data environments out there, and getting companies to make the switch to something new won’t necessarily be easy. Even when it’s relatively slow and cumbersome, the old-world stack of Hadoop, a data warehouse and a BI tool generally serves its purpose, Werther acknowledges. And it’s not as if legacy companies such as Informatica (s infa) or MicroStrategy (s mstr) are going gently into that good night and just letting the big data revolution pass them by.
Even if companies really are looking to make a change, they might also look at more-modern alternatives from vendors including Hadapt, ClearStory, Birst and even Teradata (s tdc). Although they’re more batch-oriented and less geared toward interactive BI, established Hadoop startups such as Datameer and Karmasphere offer some impressive products as well. Datameer, in particular, has put a focus on letting business users create top-notch graphics to illustrate their data, and we’ll see if young startups such as Datahero decide to tackle big data in the future.
Despite some stiff competition, though, Werther isn’t sweating Platfora’s future. He knows his company has promise, and he knows every one of his competitors is part of a larger movement to disrupt a lucrative market for BI products that analysts estimate at $12 billion a year and growing. “We all win if we can accelerate Hadoop usage,” Werther said.
That’s true even if some companies in the Hadoop distribution space are happy propping up legacy BI vendors for as long as companies still want to buy those products. However, he cautioned, “The old guys never win, they never survive the change.”