If you thought the Hadoop war of words was over, think again

Different year, different CEOs, different arguments, same old story: Cloudera and Hortonworks do not see eye to eye about the best way to run a business based on Hadoop. Two years ago, Hadoop startups Cloudera and Hortonworks took great pleasure in publicly debating who was more Hadoop than who by counting Apache contributors, commits and bug fixes. Today, the debate is over business models — specifically, whether it’s better to be Red Hat or IBM.

This time around, Cloudera fired the opening shot by declaring in October that it doesn’t consider Hortonworks or fellow Hadoop vendor MapR to be its competitors anymore. Cloudera CEO Tom Reilly told me then that while those companies are focused on Hadoop as a technology and business focal point, Cloudera is now focused on becoming an “enterprise data hub” that offers a whole suite of data products a la IBM or, presumably, Oracle. The logic goes that although Hadoop is the foundation of that strategy, the real value comes from higher-level features and data-analysis products, many of which Cloudera is building itself and some of which are open source even if not via Apache.

On Tuesday, Hortonworks fired back via a blog post by VP of Corporate Strategy Shaun Connolly, who wrote, essentially, that Cloudera’s (er, “one company’s”) model is the wrong one, at least right now. The Hortonworks team believes it’s poised to take the Hadoop crown and Cloudera is poised to fail because the market is still too young to buy the kind of package Cloudera is selling. Modeling itself after Red Hat, Hortonworks is content to keep making Apache Hadoop better, keep helping customers who want basic Hadoop as the product, and keep letting partners like Microsoft, SAP and Teradata add the bells and whistles and drive adoption by their customers.

It is, as Connolly’s chart mapping Red Hat revenue shows, a long-term strategy.

Source: Hortonworks

As tiresome as the back-and-forth can get, though, it is kind of fascinating that, because Hadoop is really a market unto itself (and one based on open source technology, at that), there’s relatively little debate about whose technology is better. The distributions are all technically distinct, but only one pure-play Hadoop company, MapR, has ever really sought to distinguish itself based on pure technological edge. For the most part, the vendors are really asking customers to choose one business model (or maybe one support contract) over another.

The truth is that it’s probably too soon to declare either company a winner or a loser; there’s good reason to believe they’ll both win. On one hand, Hortonworks’ strategy is nothing if not prudent, and its big-time partners (if they stick around) do represent a foot in the door to some major corporations that already rely on those partners’ software. On the other hand, Cloudera’s strategy is more audacious (it’s essentially battling IBM et al. and Hortonworks et al., whether it wants to or not) and probably has more revenue upside if the company can execute it.

Anyone interested in hearing more about this should attend our Structure Data conference in March, where we’ll have executives from Cloudera and Hortonworks on stage answering questions about how they view the big data world. Otherwise, I guess everyone can just rest assured knowing that Hadoop probably really is the best thing to happen to data since sliced bread (or SQL?), so a step toward Hadoop is probably a step in the right direction. If it’s as revolutionary as everyone says it is, what’s a couple degrees more or less in either direction?