IBM’s Hadoop Effort Grows from Project to Product

Two big data-driven conferences kicked off today – IBM (s IBM) Information On Demand and the Teradata (s TDC) Partners Conference – and brought product news with them. Both companies announced slews of upgrades and new products that directly target those hoping to do more with more data.

From my perspective, IBM’s biggest news is the beta availability of its Hadoop-powered InfoSphere BigInsights software, as well as a technology preview of the offering running in IBM’s Smart Business Development and Test cloud. Details on BigInsights and some early trials are available in this piece (sub req’d) I wrote for GigaOM Pro earlier this year, but the gist is that BigInsights lets users churn through massive amounts of data and receive results in a variety of useful visual formats (e.g., spreadsheets and tag clouds).

We’ve been covering the uptick in Hadoop interest over the past year or so, and it behooves IBM to get its first Hadoop product to market before the majority of users start down the Hadoop path. It’s doubtful the product will be inexpensive – especially when compared with free distributions and products from startups like Cloudera and Karmasphere (as well as Yahoo) – so IBM will want to get its hooks in users before they invest too much time and too many resources building Hadoop deployments with alternative software. The upshot of BigInsights is that it abstracts much of the complexity usually associated with getting results from Hadoop, in a manner similar to startup Datameer’s Datameer Analytics Appliance, making it more useful for business-level users.

The option of BigInsights on IBM’s Development and Test cloud is intriguing enough, although a production-grade cloud service would be a far bigger deal. Currently, Amazon Web Services has the Hadoop-in-the-cloud market cornered with Elastic MapReduce, but the potential for such services seems large. One big area could be social media analytics: When I was in Armonk in August, IBM VP of Emerging Technologies Rod Smith indicated that the appetite for social media analytics is “huge,” citing one BigInsights customer that is analyzing more than a terabyte of Twitter data per day and maintaining a 30-day archive. The value doesn’t go away with an on-premises BigInsights deployment, but imagine being able to pull, store, analyze and visualize terabytes of web-based data, all without dedicating a single disk or processor.

IBM’s other Big Data releases today are Cognos 10 business intelligence software; DB2 10 database software, which boosts performance and integrates data from multiple sources; and InfoSphere Server 8.5, which IBM describes as “the data backbone for organizations.”

For its part, Teradata rolled out Teradata database 13.10, as well as a line of new and improved appliances. The highlight of the new database is called Temporal, a feature that shows how data has changed over time so that users can ascertain trends without having to undertake manual processes. The coolest data appliance upgrade might be the Extreme Performance Appliance 4600, which runs entirely on solid-state drives. This comes on the heels of SAP’s (s SAP) in-memory High-Performance Analytics Appliance (HANA), which it announced last week. Both in-memory and SSD processing are ideal for real-time analysis.

Teradata has been fingered as the next data warehouse vendor to get acquired as larger vendors try to fill out their big data offerings. Competitors Greenplum and Netezza (s NZ) recently became the property of EMC (s EMC) and IBM, respectively, and EMC unveiled its first Greenplum-based data appliance a couple of weeks ago. Teradata does more business than Greenplum and Netezza combined, so whoever buys it will have to pay a premium, but will get a pre-built Big Data portfolio and customer base in return.

Elephant image courtesy of Flickr user RachScottHalls.
