A few months ago, I posited that additional funding for Cloudera and Karmasphere signifies a large market opportunity for solutions that utilize the open-source analytics tool Hadoop. After all, the argument goes, data volumes are growing for organizations across the board, as is the competitive need to extract as much insight as possible from those stores.
Enter the high-powered, highly scalable Hadoop, along with solutions designed to make using it as easy as possible. This week, Yahoo hosted its third annual Hadoop Summit, and the sheer amount of news the event generated only affirmed my belief that, either directly or indirectly, Hadoop will make its early champions lots of money.
What happened? Without getting into too much detail, Cloudera rolled out v.3 of its distribution, new tools and a commercial Enterprise edition; Karmasphere advanced its Hadoop management and development solutions; Datameer and Zementis partnered on predictive analytics; Talend incorporated Hadoop into its new data integration solution; MicroStrategy announced Hadoop support within its BI solution; and Appistry announced ecosystem support (from Datameer, Concurrent and Kitenga) for its entirely distributed HDFS alternative.
These vendors represent just a portion of those selling Hadoop-management tools, building products on top of Hadoop or integrating Hadoop support into their existing data-management products. Even in these early days for commercialized Hadoop, it’s a crowded field.
The only potential obstacle for these and aspiring Hadoop vendors appears to be Hadoop’s open source roots. Debate has raged over the past few months about whether there’s money to be made in open source software; if there is, successful vendors will have to strike a balance between what’s free and what costs money, and they’ll have to find a way to transition free-version users into paying customers.
Notably, Cloudera and Karmasphere are currently trying to navigate this transition, a task made only more difficult by the fact that Hadoop creator Yahoo keeps releasing new open source tools and distributions (as it did this week with Hadoop with Security and Oozie) already proven to work at web scale. In fact, Yahoo has been touting its new releases as enterprise-class. Certainly, many companies will pay for enterprise functionality and support, but many others might be content downloading any of the high-quality free versions and building their own Hadoop competencies.
Perhaps a survey is on the horizon to see how, exactly, organizations are using Hadoop and what distributions and tools they’re employing. For now, it’s safe to assume the vast majority of usage is free, because proprietary management tools and Hadoop-powered analytics solutions are relatively new. Will it remain that way forever, or even over the next year? I think not.