Oracle Teams With Cloudera To Tackle 'Big Data'


Oracle is shipping its Big Data Appliance, a system combining Oracle's new NoSQL database with Cloudera's distribution of Hadoop that's designed for managing and analyzing huge volumes of information.

The appliance is the latest addition to Oracle's line of what it calls "engineered systems" that meld Sun server hardware with Oracle software. In the current quarter Oracle expects to sell some 300 of the engineered systems, including the Oracle Exadata Machine and Oracle Exalogic Elastic Cloud, CEO Larry Ellison said during the company's second-quarter earnings call last month.

The new system also strengthens links between Oracle's flagship relational database and Hadoop, the fast-growing technology for managing "Big Data," the industry term for the rising volumes of information -- especially unstructured data -- generated by social networks, Web sites, radio frequency ID (RFID) systems and other sources.

"To get real value from big data, you're going to need to integrate these environments," said George Lumpkin, Oracle' vice president of product management, data warehousing, in an interview.

As an example, he cited log files and location data generated by mobile applications that businesses are eager to capture and analyze. Another example would be combining information about consumer behavior on Web sites with sales transaction data.

Announced at Oracle OpenWorld in October, the Big Data Appliance includes the Oracle NoSQL database, which is based on Berkeley DB, and Cloudera's distribution of the open-source Apache Hadoop. "It's a very natural relationship between two industry leaders," said Kirk Dunn, Cloudera's chief operating officer, in an interview.

Oracle and Cloudera began discussions about the project last summer and recently completed the necessary development work, Dunn said.

Other critical components include the Cloudera Manager toolset for configuring and managing Apache Hadoop, an open-source distribution of the R development environment for building statistical applications, and a series of connectors for linking the Big Data Appliance to other Oracle systems. The connectors include the Oracle Loader for Hadoop, which loads data into Oracle Database 11g, and the Oracle Direct Connector for Hadoop Distributed File System (HDFS), which enables the Oracle Database SQL engine to access HDFS data.

On the hardware side, the Big Data Appliance scales up to a rack configuration of 18 Intel-based Oracle Sun servers with 864 GB of main memory, 216 CPU cores, 648 TB of raw disk storage, 40 Gbit-per-second InfiniBand connectivity between nodes and other Oracle engineered systems, and 10 Gbit-per-second Ethernet data center connectivity.
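
For a rough sense of what those rack totals mean per server, the figures can be divided across the 18 nodes. The short Python sketch below does that arithmetic. It assumes the resources are spread evenly across identical servers, which the rack totals suggest but Oracle's quoted figures do not state outright; the names `RACK` and `per_node` are illustrative only.

```python
# Rack-level totals as quoted for a full Big Data Appliance rack.
RACK = {
    "servers": 18,
    "memory_gb": 864,
    "cpu_cores": 216,
    "raw_disk_tb": 648,
}


def per_node(rack):
    """Divide each rack total by the server count.

    Assumption: all servers are identical, so an even split is meaningful.
    """
    n = rack["servers"]
    return {name: total / n for name, total in rack.items() if name != "servers"}


node = per_node(RACK)
for name, value in node.items():
    print(f"{name}: {value:g} per server")
```

Under that even-split assumption, each server works out to 48 GB of memory, 12 CPU cores and 36 TB of raw disk.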

Targeted toward large businesses, the Big Data Appliance will be sold direct and through channel partners and systems integrators, particularly those who already sell Exadata systems, Lumpkin said.