Published on Oct 23, 2012
IBM has taken up the Big Data Challenge with a multilayered announcement promising fully integrated, enterprise-grade Hadoop/PureData systems that are orders of magnitude simpler to set up and manage and provide up to 25X performance increases. The announcements include a fully integrated Hadoop appliance incorporating the complete new IBM Big Data stack that one Beta tester had set up and running in its shop 89 minutes after the system rolled off the truck.
The "Big Data at the Speed of Business" stack includes BLU Accelerator, a set of technologies based on more than 25 patents developed by IBM Research in 10 laboratories worldwide and designed to provide greater performance, storage efficiency, and simplicity for analytics workloads. The four major parts of BLU Accelerator are:
Dynamic In-Memory Processing: Columnar data processing with the intelligence to move the data required for a specific analysis from storage into memory when it is needed, and to move data that is no longer needed back to the storage array. This provides the performance benefits of a flash-based in-memory system without the size and cost limitations of a flash-only system.
Parallel Processing with "Single Instruction, Multiple Data" (SIMD): This automatically applies multiple processor cores to a single analysis while using the latest advances in IBM processors so that a single instruction can operate on multiple data elements at once. The combined effect is to dramatically increase performance.
Actionable Compression: This new compression technique dramatically reduces the space required to store large amounts of data, and keeps the data in a form that can be processed and analyzed without decompression. This again increases the speed of the system while also saving space in the in-memory flash storage as well as the background storage array.
Data Skipping: This provides BLU Accelerator with the intelligence to completely skip processing data that is irrelevant to the analysis being run at the moment.
The combined effect of these four technologies, says Bernie Spang, director of strategy and marketing for the IBM Software Group, is to dramatically accelerate performance. Beta test clients saw 8X to 25X improvements in reporting and analytics workloads. The business impact is to facilitate interactive querying by delivering what IBM calls "Speed of Thought Analytics" to business end-users of IBM's Datamart and BI Patterns. Users can run queries and get results in seconds or minutes, allowing them to refine queries or ask other questions as those occur to them.
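The compression and data-skipping ideas described above can be illustrated with a small sketch. This is not IBM's implementation, which the announcement does not detail; it is a generic illustration under stated assumptions: a single column is dictionary-encoded with an order-preserving dictionary, so a range predicate can be evaluated on the compressed codes, and each block of codes keeps a min/max summary so that blocks which cannot match are skipped. The names `Block`, `build_column`, and `count_at_least`, and the sample data, are hypothetical.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass
class Block:
    """A fixed-size chunk of one encoded column, plus a min/max summary."""
    codes: list
    min_code: int
    max_code: int


def build_column(values, block_size=4):
    """Dictionary-encode values and split the codes into blocks."""
    # Sorted dictionary: code order matches value order, so range
    # predicates can be evaluated on the codes without decoding them.
    dictionary = sorted(set(values))
    code_of = {v: i for i, v in enumerate(dictionary)}
    codes = [code_of[v] for v in values]
    blocks = []
    for start in range(0, len(codes), block_size):
        chunk = codes[start:start + block_size]
        blocks.append(Block(chunk, min(chunk), max(chunk)))
    return dictionary, blocks


def count_at_least(dictionary, blocks, threshold):
    """Return (rows with value >= threshold, number of blocks scanned)."""
    # Translate the predicate into code space once, up front.
    min_code = bisect_left(dictionary, threshold)
    matches = 0
    scanned = 0
    for block in blocks:
        if block.max_code < min_code:
            continue  # data skipping: no row in this block can match
        scanned += 1
        # Comparison runs on the compressed codes, not the original values.
        matches += sum(1 for c in block.codes if c >= min_code)
    return matches, scanned


# Hypothetical sample column: values happen to be clustered by block.
sales = [10, 12, 11, 13, 50, 55, 52, 51, 90, 95, 91, 99]
dictionary, blocks = build_column(sales)
print(count_at_least(dictionary, blocks, 90))  # (4, 1): two of three blocks skipped
```

Skipping pays off most when similar values sit near each other in storage, as in the sample data; on randomly ordered data, most blocks would still need to be scanned.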
The announcement also includes a new version of IBM Infosphere Streams that supports analysis of data in motion.
It also includes V2.1 of BigSQL, IBM's SQL end-user interface for its BigInsights Hadoop distribution. "We have a consumability issue with Hadoop in a lot of our enterprises," says IBM Director of Big Data Product Strategy Nancy Kopp-Hensley. "There's a shortage of skills, and we need to leverage the skills we've got. My favorite quote from the Strata Conference in February was 'It's ironic that the future of noSQL is SQL.'"
The PureData for Hadoop Data Appliance
The other major announcement is a new Hadoop data appliance. This is a fully pre-configured hardware/software system that includes the entire IBM Big Data stack. It is designed to solve a major problem for companies trying to harness Big Data: the complexity, cost, and risk of building a large Hadoop cluster with technologies that are new to the data warehouse professionals who are frequently asked to take on Big Data.
PureData for Hadoop is designed to meet two major challenges: fast deployment and simplified, unified management. The first alpha/beta test customer to try it out reported that they had the system running in 89 minutes, compared to the several weeks it had taken them to build a custom cluster with an alternative Open Source Hadoop distribution.
The appliance is designed for both analytics and archiving use cases, Kopp-Hensley said. IBM has found that customers are using Hadoop as an archiving system for cold data from their data warehouses and therefore built archiving capabilities into the new release of BigInsights. The archived data stays active, making it a strong solution for compliance and other use cases where the data needs to be accessed occasionally.
Because the appliance can be up and running less than two hours after delivery, it also supports exploration, allowing companies, for instance, to tap new sources of data and to collect and lightly integrate unstructured data.
The system also includes BigSheets, a spreadsheet-like visualization tool designed to be easy for business end-users to employ. Finally, it includes the new Application Analytic Accelerator, which provides three specific Big Data patterns: Social Media, Text Analytics, and Machine Data. This is IBM's entry in the aggregation of analytics functionality, which has become the new arms race in Big Data analytics.