MapR didn’t miss the memo that the key to success in the Hadoop space is building a data platform that can do many things. On Wednesday, the company released its take on HBase, called M7.
Last week, I explained how HBase is fast becoming the star of the Hadoop ecosystem because it allows users to build more real-time, almost transactional applications on top of Hadoop. True to form, MapR has taken HBase even further with M7 by promising greater availability (99.999 percent), instant recovery, faster operations and the ability to handle 1 trillion tables in a single cluster. In open source versions of HBase, MapR VP of Marketing Jack Norris told me, the accepted table limit per cluster is several hundred.
Additionally, M7 shares a single data layer with the Hadoop file system, meaning less performance overhead and, presumably, easier management.
As we’re seeing with other Hadoop vendors, including Cloudera (which released its Impala SQL query engine on Tuesday), the Hadoop market is fast becoming one where each vendor is trying to set itself apart from the rest by building the best platform with the broadest set of capabilities. In furtherance of that mission, MapR also announced on Wednesday full-text search on its Hadoop distribution thanks to a partnership with Lucene specialist LucidWorks. It already has its own Hadoop distribution complete with proprietary code to bolster the file system and speed up MapReduce, as well as an open source SQL-on-Hadoop project called Drill in the works.
MapR employees are probably sleeping a lot easier these days as a result of this platform push. Others in the Hadoop market used to talk about the fear of fragmentation and then point at MapR as the example of a company helping foment that outcome with its proprietary software. Now, however, even if everyone else is building open source products, they’re all still backing their own and largely dismissing the others.
I suspect the result is feature lock-in even if there’s no technological lock-in, kind of like using Amazon Web Services for cloud computing and then hoping to replicate its various services elsewhere. It might be easy enough to move your data, but impossible or very difficult to replicate those additional capabilities on another platform. If MapR can build a better version of HBase and companies are willing to pay for it, then so be it.