Exclusive: Pivotal CEO says open source Hadoop tech is coming

Pivotal, the cloud computing spinoff from EMC and VMware that launched in 2013, is preparing to blow up its big data business by open sourcing a whole lot of it.

Rumors of changes began circulating in November, after CRN reported that Pivotal was in the process of laying off about 60 people, many of whom worked on the big data products. The flames were stoked again on Friday by a report in VentureBeat claiming the company might cease development of its Hadoop distribution and/or open source various pieces of its database technology such as Greenplum and HAWQ.

Gigaom has confirmed at least part of this is true, via an emailed statement credited to Pivotal CEO Paul Maritz:

“We are anticipating an interesting set of announcements on Feb 17th. However rumors to the effect that Pivotal is getting out of the Hadoop space are categorically false. The announcements, which involve multiple parties, will greatly add to the momentum in the Hadoop and Open Source space, and will have several dimensions that we believe customers will find very compelling.”

Those announcements will take place via webcast.

Paul Maritz at Structure Data 2014. (© Photo by Jakub Mosur).

Multiple external sources have told Gigaom that Pivotal does indeed plan to open source its Hadoop technology, and that it will work with former rival (but, more recently, partner) Hortonworks to maintain and develop it. IBM was also mentioned as a partner.

Members of the Hadoop team were let go around November when active development stopped, the sources said, and some senior big data personnel — including Senior Vice President of R&D Hugh Williams and Chief Scientist Milind Bhandarkar — departed the company in December, according to their LinkedIn profiles. Both of them claim to be working on new startup projects.

When EMC first introduced its Hadoop distribution, called Pivotal HD, in February 2013 (it was one of the technologies that Pivotal the company inherited), executive Scott Yara touted the size of EMC’s Hadoop engineering team and the quality of its technology over that of its smaller rivals Cloudera, MapR and Hortonworks. However, Pivotal has been getting noticeably more in touch with its open source side recently, including through the Hortonworks partnership referenced above (around the Apache Ambari project) and a big commitment to the open source Tachyon in-memory file system project.

The current Pivotal HD Enterprise architecture.

Pivotal has been a big proponent of the “data lake” strategy, whereby companies store all their data in a big Hadoop cluster and use various higher-level programs to access and analyze it. Last April, the company took a somewhat brave step toward ensuring its customers could do that by relaxing its product licensing and making Pivotal HD storage free.

Whatever happens with Pivotal’s technology, it’s not shocking that the company would decide to take the open source path. Its flagship technology is the open source Cloud Foundry cloud computing platform, and Cloudera, Hortonworks and MapR have cornered the market on Hadoop sales, by all accounts. If Pivotal has some good code in its base, it’s probably best to get it into the open source world and ride the momentum rather than try to fight against it.

For more on the fast-moving Hadoop space, be sure to attend our Structure Data conference March 18-19 in New York. We’ll have the CEOs of Cloudera, Hortonworks and MapR on stage to talk about the business, as well as Databricks CEO Ion Stoica discussing the Apache Spark project that is presently kicking parts of Hadoop into hyperdrive.