Project Savanna Aims To Bring Hadoop To OpenStack Clouds

Project Savanna, unveiled last week at the OpenStack Summit in Portland, Ore., includes a framework that connects Hadoop management tools with OpenStack infrastructure. Mirantis is building the framework and is partnering with Red Hat for the underlying infrastructure and Hortonworks for the Hadoop aspects.

Mirantis, as leader of Project Savanna, will ensure the project meets its goal of developing open-source APIs that allow Hadoop workloads to be moved back and forth between public and private clouds, Mirantis CEO Adrian Ionel said last week in an interview.

"We're providing the hardware and software development expertise, and our role going forward will be to drive the road map," Ionel told CRN. "We will incorporate ideas from Red Hat and Hortonworks, but we intend to be the leader for the project."

Many organizations are using Hadoop in the cloud today; Amazon's Elastic MapReduce (EMR), for example, works in conjunction with Amazon EC2 and Amazon Simple Storage Service (Amazon S3). But because Amazon's APIs are proprietary, projects that begin on EMR can't be moved in-house to a private cloud without additional development work. That's the issue Project Savanna seeks to address.

"We see a need for customers to have a unified computing infrastructure using open source, as opposed to Hadoop islands and OpenStack islands," Ionel said.

According to Greg Kleiman, senior director of storage strategy at Red Hat, Amazon EMR is good for Hadoop pilot projects but prohibitively expensive for running production services at scale.

"We have a lot of big data customers that use the cloud today for Hadoop. They want a more open approach, but they don't want to lose that cost effectiveness or ease of use," Kleiman told CRN.

Ionel said the idea for Project Savanna came about during the course of Mirantis' work in building large-scale production platforms for its customers. Mirantis has around 320 engineers focused on building OpenStack cloud platforms, Ionel said, and its customers include PayPal, Gap, NASA, Dell and Hewlett-Packard, among others.

Other vendors are also exploring ways to bring Hadoop, an open-source big data platform for developing and deploying distributed, data-intensive applications, to the cloud. Last June, VMware launched Serengeti, an open-source project that includes a free toolkit for deploying a Hadoop cluster on vSphere.