WibiData gets $15M to help it become the Hadoop application company

WibiData — the big data startup from Cloudera Co-founder Christophe Bisciglia and Aaron Kimball — doesn’t have overly big plans. It only wants to become one of the first, if not the first, company selling off-the-shelf software that lets other companies build valuable, customer-facing applications on Hadoop. On Thursday, WibiData announced $15 million in Series B funding from Canaan Partners, as well as existing investors NEA and Google Chairman Eric Schmidt, to help make the goal a reality.

Kidding aside, that’s actually quite an ambitious goal in a Hadoop market that’s big and growing, but that’s characterized by expensive consulting arrangements and purpose-built applications. That is even more true for companies that want to do something other than transform unstructured data into structured data (often called ETL) or run back-office analytics jobs. In fact, WibiData has spent the last 18 months doing just this type of deal, and Bisciglia says every single customer has already engaged with one of the big three Hadoop vendors (Cloudera, Hortonworks and MapR).

Home energy-management startup Opower is a good example of this process. It’s actually one of Cloudera’s banner customers, but “when they wanted to take [their software-as-a-service tool] beyond batch analysis and ETL workloads,” Bisciglia said, Opower came to WibiData. So whereas the Opower service was originally focused on nightly data analysis comparing users’ energy usage against that of other users, it’s now working on dynamic recommendations for users and letting them engage with the application in new ways.

The WibiData architecture

During these engagements, WibiData has been building up its core technology for connecting those brawny back-office Hadoop environments to predictive customer-facing applications — a collection of HBase, data-formatting tools and machine learning algorithms that the company has been slowly open-sourcing under the Kiji banner. It has also been learning the similarities among the applications it’s building for customers in the same field, figuring out what’s repeatable. What does any given company in the retail space, for example, need to get started on its own recommendation engine?

And now, Bisciglia says, WibiData is going to double down on building application software based on what it has learned. The first two industries it targets will likely be financial services and retail, two areas where the company has seen a lot of traction. He envisions the finished product including some pre-defined schema for formatting data and some pre-built predictive models, both broadly applicable across that industry rather than specific to a single user.

There will also be different interfaces that allow different types of users (e.g., data scientists, systems engineers and business users) to interact with the data in the ways they need to.

Time will tell if WibiData can actually accomplish its goal of turning Hadoop into a collection of somewhat specialized software packages, but someone has to. Even industry heavyweights like Cloudera see the need, but their hands are full just getting Hadoop integrated into existing environments and getting those early uses up and running. As Cloudera CEO Mike Olson said at Structure: Data in 2012 to anyone ambitious enough to tackle the Hadoop-application gap, “Call me, I’ll connect you with funding. The money is out there.”

If you want to hear more about the need for Hadoop applications, check out this panel from Structure: Data 2013, where I speak with WibiData’s Omer Trajman, Continuuity’s Jonathan Gray and Pivotal’s Muddu Sudhakar.