Hewlett-Packard is the latest major storage vendor to enter the data deduplication market and start gunning for market leader Data Domain.
HP this week unveiled a two-pronged data deduplication strategy, introducing families of both in-line and post-processing dedupe appliances.
Meanwhile, Data Domain, Santa Clara, Calif., this week introduced the ability to lock data stored on its appliances for regulatory compliance and IT governance purposes.
Deduplication, also called "dedupe," removes duplicate information as data is backed up or archived. It can be done on the file level, where duplicate files are replaced with a marker pointing to one copy of the file, and/or at the sub-file or byte level, where duplicate bytes of data are removed, resulting in a significant decrease in storage capacity requirements.
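The sub-file approach described above can be sketched in a few lines of Python. This is only an illustration, not how any vendor named here implements it: the fixed 8-byte chunk size, the use of SHA-256 as the fingerprint, and the function names `dedupe_chunks` and `restore` are all assumptions made for the example. File-level dedupe is the same idea with an entire file treated as one chunk.

```python
import hashlib


def dedupe_chunks(data: bytes, chunk_size: int = 8):
    """Split data into fixed-size chunks and keep one copy of each unique chunk."""
    store = {}    # fingerprint -> chunk bytes (the single stored copy)
    recipe = []   # ordered fingerprints needed to rebuild the original data
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        fingerprint = hashlib.sha256(chunk).hexdigest()
        store.setdefault(fingerprint, chunk)  # a duplicate chunk is not stored again
        recipe.append(fingerprint)            # the "marker" pointing at the stored copy
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data by following the markers in order."""
    return b"".join(store[fingerprint] for fingerprint in recipe)


# Hypothetical backup stream: the same 8-byte block three times, then one new block.
backup = b"ABCDEFGH" * 3 + b"12345678"
store, recipe = dedupe_chunks(backup)
stored_bytes = sum(len(chunk) for chunk in store.values())
print(f"{len(backup)} bytes backed up, {stored_bytes} bytes stored")
```

Commercial products typically use far larger and often variable-size chunks, but the store-once, point-many structure is the same.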
Dedupe products can be classified in a couple of different ways. The primary difference between them lies in where the dedupe process takes place.
Some products dedupe the data as it is being sent across a LAN or WAN. Known as in-line dedupe, this results in fewer files and less data being sent over the network, but can affect the performance of the backup because of the processing overhead caused by the dedupe process.
Other products use post-processing dedupe, in which the full data to be backed up is copied onto a destination drive, after which it is deduped. This mitigates the bottleneck by accepting the full data set first and eliminating duplicates only after it is stored, but in this case the customer must have enough storage capacity to temporarily hold the entire data set.
HP has integrated in-line dedupe capabilities into its existing disk-to-disk storage appliance line, including the D2D2500 and D2D4000 appliances, aimed at small and midsized businesses and branch offices of larger businesses, said David Rogers, product marketing manager of disk-based data protection products for HP StorageWorks.
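The trade-off between the two approaches can be illustrated with a toy Python model. It is a sketch under stated assumptions rather than a description of any product: the block sizes, the SHA-256 fingerprint, and the function names `inline_backup` and `post_process_backup` are hypothetical. Both paths end with the same deduped store, but the post-processing path needs room for the whole data set before dedupe runs, while the in-line path does its hashing in the write path.

```python
import hashlib


def _fingerprint(block: bytes) -> str:
    return hashlib.sha256(block).hexdigest()


def inline_backup(blocks):
    """Dedupe each block before it is written; duplicates never reach storage."""
    store = {}
    peak = 0
    for block in blocks:
        store.setdefault(_fingerprint(block), block)  # hashing sits in the write path
        peak = max(peak, sum(len(b) for b in store.values()))
    return store, peak


def post_process_backup(blocks):
    """Land the full data set first, then dedupe it in a second pass."""
    staging = list(blocks)                # full copy accepted without dedupe overhead
    peak = sum(len(b) for b in staging)   # capacity needed before dedupe runs
    store = {}
    for block in staging:
        store.setdefault(_fingerprint(block), block)
    return store, peak


# Hypothetical backup: four 4-byte blocks, three of them identical.
blocks = [b"A" * 4, b"B" * 4, b"A" * 4, b"A" * 4]
inline_store, inline_peak = inline_backup(blocks)
post_store, post_peak = post_process_backup(blocks)
print(f"in-line peak: {inline_peak} bytes, post-processing peak: {post_peak} bytes")
```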
The new dedupe capability of the D2D appliances is designed to let companies with limited IT resources and no storage specialists administer the storage without help, Rogers said. The dedupe capability is integrated at no extra charge.
The company has also introduced a new series of appliances based on an OEM contract with Sepaton Inc., a Marlborough, Mass.-based dedupe and virtual tape library developer. That family, the VLS line, emulates multiple tape libraries or tape drives, and features post-processing dedupe technology, Rogers said.
HP has already started talking about its entry into the dedupe market, said Dhruv Gulati, executive vice president of Lilien Systems, a Larkspur, Calif.-based solution provider and HP partner.
Unfortunately, Gulati said, HP is very late with the product. "HP has lost a lot of the market to Data Domain and EMC and others," he said. "It's three years late."
As a result, HP solution providers have had to offer alternative products to HP customers who require dedupe capability, Gulati said. "We didn't really have a good solution until now," he said. "We had at one point signed up with Avamar, but it was acquired by EMC, so there wasn't much there."
That acquisition of Avamar by EMC in 2006 was EMC's first move into the dedupe market.
EMC last month also unveiled a long-expected deal with Quantum, San Jose, Calif., under which EMC is OEMing software from Quantum to add dedupe capabilities to its Clariion product line.
Meanwhile, HP and EMC arch-rival IBM, of Armonk, N.Y., made its own entry into the dedupe market when it acquired privately held Diligent Technologies in April.
Other vendors such as Sun Microsystems, Santa Clara, Calif., have also entered the dedupe market with OEM deals. In Sun's case, it partnered with FalconStor Software, Melville, N.Y., on dedupe technology.
Data Domain, on the other hand, is the dedupe market leader, and the company against which other vendors measure themselves.
Rogers said that HP's D2D4009fc product lists for about 14 percent less than Data Domain's DD530, giving it a lower cost per Tbyte by up to 45 percent while offering a transfer rate just slightly slower than that of its competitor.
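Taken together, the two percentages Rogers cites imply a capacity difference, since cost per Tbyte is simply price divided by capacity. The short Python calculation below only unpacks that arithmetic. The article gives no actual prices or capacities for either product, so the variable names are illustrative, and because the 45 percent figure is an "up to" claim, the result is an upper bound.

```python
# Ratios taken from the figures quoted above (HP relative to the Data Domain DD530).
price_ratio = 1 - 0.14         # HP list price is about 14 percent lower
cost_per_tb_ratio = 1 - 0.45   # HP cost per Tbyte is up to 45 percent lower

# cost_per_tb = price / capacity, so capacity ratio = price ratio / cost-per-Tbyte ratio
implied_capacity_ratio = price_ratio / cost_per_tb_ratio
print(f"Implied capacity advantage: up to about {implied_capacity_ratio:.2f}x")
```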
Jeff Sosa, director of product management at Data Domain, said that it is common for vendors like HP and NetApp, Sunnyvale, Calif., to bolt dedupe capabilities onto existing products, but that such moves validate his own company's focus on this market and bring new customers in to talk to Data Domain.
"It creates interest in Data Domain as well, as customers get interested in the technology and start looking at alternatives," Sosa said.
Data Domain this week introduced RetentionLock, a new WORM (write-once, read-many) capability for its dedupe appliances.
RetentionLock allows files to be locked for specific periods of time to comply with SEC regulations and IT governance rules, said Sosa.
RetentionLock also allows trusted IT administrators to do certain things to those files while they are locked, such as change the retention parameters, Sosa said.
"Customers can change the retention period," he said. "For instance, in the European Union, if an employee leaves, a company has to delete all personal information. So now the company can change the retention period of the data. It also allows the permissions of a locked file to be updated. For example, a company may need to grant access to a file to a new group of users, or cut access to other users."
The new version of the Data Domain software with RetentionLock and other new features is expected to be available on July 1. The capability can be turned on with all of Data Domain's operating systems by purchasing a license key, with prices starting at $500.
HP's new dedupe appliances are expected to be available in July, Rogers said.
The dedupe license for the VLS6000 costs either $8,750 for a shelf full of 500-Gbyte hard drives, or $17,500 for a shelf full of 750-Gbyte hard drives.
The license for the VLS9000 costs about $100,000 for each 30-Tbyte or 40-Tbyte storage model.
The license for the VLS12000 costs $5,000 per 2-Tbyte LUN, Rogers said.