EMC Intros Quantum Dedupe Technology


EMC, Hopkinton, Mass., used the opening day of its EMC World conference in Las Vegas to unveil these and other enhancements to its product line.

Much of the drive behind EMC's new product offerings comes from the incredible growth in data to be stored, combined with the need to improve the efficiency of that growth, said David Donatelli, executive vice president of the company's storage product operations.

By 2011, the industry can be expected to be storing about 1,773 billion Gbytes of data, or about 1.8 zettabytes, roughly ten times the amount of data stored in 2006, Donatelli said. "But 85 percent of that will be managed by businesses," he said.

To help those businesses get ready, EMC on Monday unveiled two new disk backup appliances featuring dedupe technology from San Jose, Calif.-based Quantum.

Deduplication, also called "dedupe," removes duplicate information as data is backed up or archived. It can be done on the file level, where duplicate files are replaced with a marker pointing to one copy of the file, and/or at the sub-file or byte level, where duplicate bytes of data are removed, resulting in a significant decrease in storage capacity requirements.
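As a rough illustration of the two levels described above, the following Python sketch dedupes first whole files and then fixed-size blocks within a byte stream. It is a minimal sketch under stated assumptions, not how EMC's or Quantum's products work: the function names, the SHA-256 fingerprint, and the fixed 4-byte block size are all hypothetical choices made for the example.

```python
import hashlib


def dedupe_files(files):
    """File-level dedupe: store one copy per unique content.

    `files` maps a file name to its bytes. Returns (store, catalog), where
    the catalog entry for each file acts as the marker pointing at the
    single stored copy.
    """
    store = {}    # fingerprint -> content (stored once)
    catalog = {}  # file name -> fingerprint (the "marker")
    for name, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        store.setdefault(digest, content)
        catalog[name] = digest
    return store, catalog


def dedupe_blocks(data, block_size=4):
    """Sub-file dedupe: split data into fixed-size blocks (an assumption)
    and store each unique block once. The recipe lists the fingerprints
    in order so the original data can be rebuilt."""
    store = {}
    recipe = []
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)
        recipe.append(digest)
    return store, recipe


def restore(store, recipe):
    """Rebuild the original bytes from the block store and the recipe."""
    return b"".join(store[digest] for digest in recipe)


if __name__ == "__main__":
    files = {"a.txt": b"hello", "b.txt": b"hello", "c.txt": b"world"}
    file_store, catalog = dedupe_files(files)
    print(len(files), "files,", len(file_store), "stored copies")

    data = b"ABCDABCDABCDXYZ!"
    block_store, recipe = dedupe_blocks(data, block_size=4)
    print(len(recipe), "blocks,", len(block_store), "stored blocks")
    print(restore(block_store, recipe) == data)
```

In this toy run the three files collapse to two stored copies, and the four blocks collapse to two stored blocks, which is where the capacity savings come from.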

Dedupe products can be classified in a couple of different ways. The primary difference between them lies in where the dedupe process takes place.

Some products dedupe the data as it is being sent across a LAN or WAN. Known as in-line dedupe, this results in fewer files and less data being sent over the network, but can affect the performance of the backup because of the overhead caused by the dedupe process. Data Domain features in-line dedupe technology.

Other products use post-processing dedupe, in which the full data to be backed up is copied onto a destination drive, after which it is deduped. This mitigates the backup performance bottleneck by accepting the full data set and eliminating duplicates only after it has been stored, but in this case the customer must have enough storage capacity to temporarily store the entire data set.
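The capacity trade-off between the two approaches can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not a description of any vendor's implementation: the function names are hypothetical, blocks are fingerprinted with SHA-256, and the post-processing case assumes the staging area must hold the full data set before any duplicates are removed.

```python
import hashlib


def _fingerprint(block):
    return hashlib.sha256(block).hexdigest()


def inline_backup(blocks):
    """In-line dedupe: each block is fingerprinted in the write path,
    so only unique blocks are ever stored."""
    store = {}
    stored_bytes = 0
    for block in blocks:
        digest = _fingerprint(block)  # overhead paid while the backup runs
        if digest not in store:
            store[digest] = block
            stored_bytes += len(block)
    # Stored bytes only grow, so the final value is also the peak.
    return store, stored_bytes


def post_process_backup(blocks):
    """Post-processing dedupe: the full data set lands first and is
    deduped afterward, so peak capacity is at least the full data set."""
    staged = list(blocks)  # backup completes without dedupe overhead
    peak_bytes = sum(len(block) for block in staged)
    store = {}
    for block in staged:   # dedupe pass runs after the backup
        store.setdefault(_fingerprint(block), block)
    return store, peak_bytes


if __name__ == "__main__":
    blocks = [b"AAAA", b"BBBB", b"AAAA", b"AAAA"]
    inline_store, inline_peak = inline_backup(blocks)
    post_store, post_peak = post_process_backup(blocks)
    print("in-line peak bytes:", inline_peak)
    print("post-process peak bytes:", post_peak)
    print("same final contents:", inline_store == post_store)
```

Both paths end with the same deduped store; they differ in when the fingerprinting cost is paid and in how much capacity is needed along the way.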

The EMC Disk Library 3D 1500 and DL3D 3000 virtual tape libraries (VTL) include dedupe software licensed from Quantum. The DL3D 1500 is based on EMC's Clariion CX310 array, and provides up to 36 Tbytes of usable capacity, while the DL3D 3000 is based on the Clariion CX340 array for a usable capacity of up to 148 Tbytes. Both use 1-Tbyte SATA hard drives and feature RAID 6 protection.

When EMC acquired Avamar in November of 2006, it got its first dedupe technology. However, the Avamar technology allows only in-line dedupe, while the Quantum technology is flexible and allows both in-line and post-processing dedupe, Donatelli said. The Quantum technology also allows policy-based dedupe, which can either take place automatically during the backup process or be shut off for certain time periods to improve IT performance, he said.

EMC on Monday also said it will add the Quantum dedupe technology to its existing DL 4000 disk-based VTL as an upgrade by the end of July.

Keith Norbie, director of the storage division of Nexus Information Systems, a Plymouth, Minn.-based solution provider and partner to EMC, Quantum, and Data Domain, the Santa Clara, Calif.-based data dedupe technology leader, said that Quantum has technology on par with that of Data Domain but has not been able to market itself as well.

"We'll have to see if EMC, the ultimate marketing machine, can do it," Norbie said. "I'm encouraged that EMC is offering an integrated solution, not just using the Quantum piece. I can't say how much EMC will impact the market for Data Domain. Branding is everything. When you have a company like Data Domain, which invented the category, it's hard to overcome. But EMC has a larger partner organization."

EMC also added low-power 1-Tbyte SATA hard drives, as well as technology to spin down unused hard drives, to its VTL family of products as a way to cut power usage, Donatelli said.

On the software side, EMC added a fast start capability to its NetWorker data protection software. EMC NetWorker Fast Start, which also includes Avamar dedupe technology as well as continuous data protection, differs from previous versions in that it cuts the number of key clicks needed to configure it by about 80 percent, Donatelli said.

Norbie said that just making it easier to install is nothing to brag about.

"As an analogy, think about how amateurs talk about the importance of backing up data while professionals talk about the importance of data recovery," he said. "Ease-of-install is easier to fix than ease-of-use. What is unknown is whether NetWorker is easier to use. Is its look-and-feel pleasing? Is the workflow any easier? That's far different from whether it's easier to install. How many times do you install a software?"

The DL3D 1500 and DL3D 3000 are both expected to be available May 28, with starting prices of $150,000 and $200,000, respectively. The spin-down and low-power drive technologies for the Disk Library 4000 are also expected to be ready by May 28.