HDS Enhances Content Archiving Solution

HDS' Hitachi Content Archive Platform (HCAP) is based on technology that came with its February acquisition of Archivas, a developer of software for archiving and searching data for compliance applications.

HCAP puts content from multiple sources into a centralized repository, which enables a single tool to search all the content regardless of how it was created. The contents of a file, along with its metadata and the policies that govern its retention, are converted into a nonrewritable, nonerasable object.

When data is stored, HCAP performs a full-text scan of the content. This lets users search the data while a copy of the original content is preserved, and it allows a hash computed from the original data to be used to verify that the data has not been changed.
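The hash-based authenticity check can be illustrated with a minimal sketch. The article does not say which hash algorithm HCAP uses, so SHA-256 here is an assumption, and the function names `fingerprint`, `ingest` and `is_unchanged` are hypothetical and not part of any HDS interface.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return a hex digest of the content (SHA-256 is an assumed choice)."""
    return hashlib.sha256(data).hexdigest()


def ingest(data: bytes) -> dict:
    """Store the content together with the digest computed at ingest time."""
    return {"content": data, "digest": fingerprint(data)}


def is_unchanged(record: dict) -> bool:
    """Recompute the digest and compare it with the one saved at ingest."""
    return fingerprint(record["content"]) == record["digest"]


record = ingest(b"quarterly report, final version")
print(is_unchanged(record))  # content still matches the stored digest
```

Any later change to the stored bytes makes the recomputed digest differ from the saved one, which is what lets an archive show that an object was not altered.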

HCAP has been shipping as a complete solution, including the Archivas software, AMD Opteron-based servers and HDS' WMS100 storage arrays, for about a year, thanks to an OEM agreement HDS had with Archivas before it bought the company, said Asim Zaheer, senior director of business development for content archiving at HDS.

One of the biggest changes in version 2.0 of HCAP is that customers and their solution providers can now use any of HDS' current storage arrays as part of the archiving solution, including the entry-level midrange WMS100; the modular AMS200, AMS500 and AMS1000; the flagship USP V; and the NSC55 SAN gateway.

The Universal Storage Platform V (USP V), which HDS unveiled earlier this month, allows nearly 250 petabytes of HDS and non-HDS storage to be connected behind its storage controller in a virtual storage pool.

The ability to do content archiving across different HDS platforms, as well as non-HDS platforms via the virtualization capability of the USP V, allows solution providers to approach existing HDS customers and use their spare storage capacity for archiving, Zaheer said.

"This is unique," he said. "No one else offers this flexibility."

Competing archiving solutions, on the other hand, are based on a single storage platform, or consist of low-cost clustered servers with internal storage that lead to reliability issues as they scale upward in capacity, Zaheer said.

"With our system, our servers are only used for content access and management, and so the storage scales as it is needed," he said.

That scalability across multiple platforms is an important differentiator, said Dave Cerniglia, president of Consiliant Technologies, an Irvine, Calif.-based solution provider and HDS partner.

"It's pretty consistent with Hitachi's technology," Cerniglia said. "They want to standardize their software across all their platforms. By originally limiting it to WMS100, it was a limited package like EMC's Centera. But Hitachi bought Archivas, which always worked on multiple products."

Making HCAP available across multiple arrays, rather than limiting it to a single storage platform, also simplifies the job of solution providers, Cerniglia said.

"Why complicate the channel?" he said. "Make it available on multiple platforms. Generally speaking, people are trying to understand the data they have and put a value on it. As they classify the data, they find the data has a certain time for which it is more valuable. So they move it to an archive. Hitachi has the technology to archive the data and search for it."

Also new with HCAP is automatic AES encryption of the content, metadata and search index of files sent to the archive, Zaheer said. Encryption key management is handled by a patent-pending HDS implementation of what is known in the security industry as "secret sharing," under which the key is broken up among the various members of the archiving solution so that all members must be active for the key to work.

"Other companies need external key management solutions," Zaheer said. "So the encryption key is sitting on the outside."
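The "all members must be active" property of secret sharing can be shown with a simple XOR-based split. This is only an illustrative sketch of one well-known scheme with that property. The article does not describe HDS' patent-pending implementation, and the names `split_key` and `combine_shares` are hypothetical.

```python
import secrets


def _xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_key(key: bytes, members: int) -> list[bytes]:
    """Split key into one share per member; every share is needed to rebuild it."""
    if members < 2:
        raise ValueError("need at least two members")
    # All but the last share are random; the last one is chosen so that
    # XOR-ing every share together yields the original key.
    shares = [secrets.token_bytes(len(key)) for _ in range(members - 1)]
    last = key
    for share in shares:
        last = _xor(last, share)
    shares.append(last)
    return shares


def combine_shares(shares: list[bytes]) -> bytes:
    """XOR all shares together to recover the key."""
    key = bytes(len(shares[0]))
    for share in shares:
        key = _xor(key, share)
    return key


key = secrets.token_bytes(32)          # e.g. a 256-bit AES key
shares = split_key(key, members=4)     # one share per hypothetical node
print(combine_shares(shares) == key)   # all four shares rebuild the key
```

With this kind of split, any subset of shares that is missing even one member reveals nothing about the key, so no single node holds the key on its own.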

HDS also added optional deduplication, or dedupe, capability to decrease the amount of capacity required to store archived information, Zaheer said. Duplicate files are removed and replaced with a pointer to a single stored copy.

HDS computes a hash for each deduped file, and then does a binary comparison between deduped files to guard against a "collision," which could occur if two different files hash to the same value. Zaheer admitted he has never heard of an actual case of collision.

"But it's possible," he said. "Why not do binary comparison? This is an insurance policy."
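The combination of hashing and binary comparison can be sketched as follows. This is a minimal in-memory illustration, not HDS' code. The `DedupStore` class, its method names, and the choice of SHA-256 as the default hash are all assumptions made for the example.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class DedupStore:
    """Keeps one copy of identical content; names act as pointers to it."""

    def __init__(self, hash_fn=sha256_hex):
        self._hash_fn = hash_fn
        self._blobs = {}  # digest -> list of distinct contents with that digest
        self._names = {}  # file name -> (digest, index into the list)

    def put(self, name: str, data: bytes) -> bool:
        """Store data under name. Returns True only if a new copy was written."""
        digest = self._hash_fn(data)
        bucket = self._blobs.setdefault(digest, [])
        for index, existing in enumerate(bucket):
            # Byte-for-byte check: a matching hash alone is not trusted.
            if existing == data:
                self._names[name] = (digest, index)
                return False
        # New content, or a hash collision with different content.
        bucket.append(data)
        self._names[name] = (digest, len(bucket) - 1)
        return True

    def get(self, name: str) -> bytes:
        digest, index = self._names[name]
        return self._blobs[digest][index]

    def stored_copies(self) -> int:
        return sum(len(bucket) for bucket in self._blobs.values())


store = DedupStore()
store.put("report_a.doc", b"same bytes")
store.put("report_b.doc", b"same bytes")  # duplicate: only a pointer is added
print(store.stored_copies())
```

Because the comparison is done on the bytes themselves, two different files that happened to share a hash would still be stored as separate copies rather than one silently replacing the other.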

While encryption is slowly becoming important for customers looking to protect their data, dedupe is huge, said Cerniglia.

"The cost advantage is important," he said. "As for collisions, I think there is some validity. What if there is actually a collision some day? What are the consequences? Hitachi sells reliability and integrity above all else."

Also new is automated object replication, which replicates a data object as it is ingested into any archiving folder that has the capability turned on, Zaheer said. In this case, the object is replicated with its encryption intact, compressed, and then sent to another storage device, which may or may not be similar to the primary storage device, he said.

HCAP can now scale to 32 billion objects, or up to 400 million objects on each of up to 80 nodes in an archiving cluster, Zaheer said.

HDS has already signed 50 software partners since HCAP was released about a year ago, Zaheer said. The one major holdout is Documentum, the enterprise content management software acquired by HDS archrival EMC in 2003.

"We asked Documentum to participate in validating the HCAP solution," he said. "But they declined."

Other vendors offer solutions similar to HCAP. EMC, for instance, last April put together a unified archiving platform with software from several acquisitions, including e-mail archiving from Legato, document imaging from Captiva, report archiving from Acartus, and document management from Documentum, along with its own Centera line of content-addressable storage appliances.

Others include IBM's DR550 family of storage arrays for storing and securing regulated and nonregulated data, and Hewlett-Packard's HP StorageWorks Reference Information Storage System (RISS), an application-aware, content-based archiving array.

HCAP version 2.0 is expected to ship in mid-June.

"Most if not all HCAP solutions sold had a channel partner," Zaheer said.