Is Hadoop Champion Cloudera the Next Red Hat?

Cloudera, a startup based in Burlingame, Calif., today announced the release of its first commercial product, Cloudera Desktop. It’s a graphical interface for managing Hadoop, the open-source framework that is catalyzing the data mining renaissance. Cloudera’s Hadoop now works on almost all major cloud platforms: Amazon Web Services, Rackspace and, soon, VMware’s vCloud.

In light of today’s news, as well as my recent conversations with industry insiders such as Netezza CEO Jim Baum, I realized how synonymous Cloudera has become with Hadoop, even though its three co-founders didn’t really have anything to do with its early development. I wonder if you guys see that as well?

One of the upsides of getting old is that you see history repeat itself. Or as Yogi Berra would say, it’s like déjà vu all over again. I see some interesting parallels between Cloudera and Red Hat, which rose to prominence on the back of Red Hat Linux, a version of Linux optimized for corporate users. (Related Research from GigaOM Pro, sub required: Open-Source Startups Follow Red Hat’s Path to Profit.)

| | Red Hat | Cloudera |
|---|---|---|
| Offerings | Red Hat Linux OS, Services | Cloudera Hadoop, Data Warehouse software, Services |
| Open Source Linkage | Linux | Hadoop |
| Key Open Source Champion | Marc Ewing | Doug Cutting |
| Venture Investors | August Capital, Greylock, Benchmark Capital | Greylock, Accel Partners |
| Key Executive | Bob Young, CEO | Mike Olson, CEO |

For example, Red Hat CEO Bob Young used to run a catalog business that sold Linux and Unix software accessories and books before he bought Marc Ewing’s Red Hat Linux in 1995, merged it with his ACC Corp. and named the new company Red Hat Software.

Cloudera, by comparison, was started in 2008 by Christophe Bisciglia, who created and led Google’s academic cloud computing initiative; Dr. Amr Awadallah, Yahoo’s former VP of engineering; and Jeff Hammerbacher, formerly of Facebook. The founders’ pedigree gave the startup instant credibility, which in turn allowed them to snag Mike Olson, a well-respected open source executive, as their CEO and the fourth co-founder.

In the case of Red Hat, the Young-Ewing combo enabled the company to raise $6.25 million in funding from the likes of August Capital, which allowed it to quickly scale. I see the same happening at Cloudera.

Cloudera’s talent pool has paved the way for it to raise $11 million in two rounds of funding from Accel Partners and Greylock. Cloudera’s other backers include Diane Greene (former CEO of VMware), Marten Mickos (former CEO of MySQL) and Jeff Weiner (president of LinkedIn).

Back in the late 1990s, the focus was on lowering the cost of infrastructure by moving away from proprietary software platforms to open-source operating systems. The focus today is on data and using it smartly. Unlocking the data, mining it for intelligence and analyzing it is the next big opportunity. “The web changed the way we radiate and consume information and in doing so, created a new opportunity to measure and monetize it,” writes Gary Orenstein. “The preferred architectural model for this web-derived data warehouse, a combination of low-cost server hardware, distributed systems and open-source software, set off an innovation path that outpaced the commercial market.” Hadoop is well on its way down that path.

Hadoop was developed to support the distribution of the open-source search engine project known as Nutch, and was inspired by Google’s MapReduce and Google File System work. A top-level Apache project, it was created by Doug Cutting (and named after his child’s stuffed elephant) and championed by Yahoo, which quickly became the largest contributor to the project as it implemented the technology in its web and advertising businesses.

The big change came this past August, when Doug Cutting left Yahoo and joined Cloudera. Cutting’s involvement is like the icing on the cake, giving the company the ability to corner all the Hadoop talent out there. It also helps that Cloudera has started to make inroads into newer markets, including biotech and retail. “Hadoop is going to find potential markets in any industry where there are large data sets that need complex analysis,” CEO Olson told me.

I remember talking to Red Hat executives back in the day and listening to their pitch about Linux everywhere, how they were going to go beyond the web community and help drive Linux into other corporate environments and eventually, build a services business around it.

Cloudera is following that same path. It’s developed its own version of Hadoop, one that’s optimized for the needs of large corporations, especially those that prefer a little hand-holding from their suppliers. By giving them this version of Hadoop, Cloudera hopes to make revenue from services. And the timing — the company unveiled Cloudera Desktop at Hadoop World (we are media partners) in New York, an event it organized — is perfect.

Game, set, match for Cloudera.
