Amazon Web Services Unveils Data Pipeline To Manage Workloads

By Jack McCarthy, CRN 3:47 PM EST Thu. Nov. 29, 2012

Amazon Web Services (AWS) on Wednesday unveiled Data Pipeline, an automated data flow system to help businesses organize and route data from disparate repositories.

AWS also announced two new EC2 instance types, or configurations of computing capacity, for heavy analytics workloads and application building.


Werner Vogels, CTO of Amazon.com, made the announcements at the AWS re:Invent partner conference in Las Vegas.

Vogels said cloud networks are being populated with a growing number of data storage and processing systems, such as Amazon Simple Storage Service (S3), DynamoDB, Elastic MapReduce and Amazon's newly launched Redshift data warehouse.

AWS Data Pipeline is a way to manage and schedule the flow of data from those systems, he said.

"The new AWS Data Pipeline is a data-driven service that helps you move data through several steps to get it where you want to go," Vogels said.

Data Pipeline will feature a drag-and-drop interface to allow users to schedule and run data-intensive programs.

Vogels also announced two new instance types for building applications that handle very large analytics workloads.

The "cluster high memory" EC2 instance type includes 240 GB of RAM and two 120 GB solid-state drives. The other EC2 instance type, called "high storage," is intended for heavy analytics and offers 117 GB of RAM.
