Sun Aims For 100-year Data Archiving With Honeycomb

The "long term," for Sun, means looking at how to protect data and evolve a storage system over the next 100 years, said John Considine, director of the company's Storage Systems Product group.

The StorageTek 5800 Honeycomb uses a number of technologies including RAID and RAIN (redundant array of independent nodes) to build a storage system with an estimated mean time between data loss of 2 million years, Considine said.

That is something RAID alone cannot do, Considine said. "With RAID, at some point you lose too much data," he said. "Even with RAID 6, if you lose two hard drives, you lose data. The Honeycomb spreads data over a number of nodes."

Those nodes are combined into cells, which include up to 16 storage nodes, each with 2 Tbytes of raw capacity, plus redundant switches and a separate storage processor. Customers can start with a half-cell or a full cell, and then add new cells to expand both capacity and performance, Considine said.

In their initial version, the Honeycomb cells do not allow what Considine called "sloshing." With sloshing, data in one cell is automatically spread across new cells as they are added to the system, so the load is rebalanced without manual intervention. Considine said that capability will be added later.
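As a rough illustration of the idea, sloshing can be pictured as evening out object counts when empty cells join the system. The sketch below is an assumption made for illustration only; the function name `slosh` and the list-of-lists model are hypothetical and do not describe Sun's actual design, which the company has not detailed here.

```python
def slosh(cells, new_cell_count):
    """Spread stored objects evenly over the existing cells plus new, empty cells.

    Hypothetical model: each cell is just a list of object IDs.
    """
    # Copy the existing cells and append the newly added, empty ones.
    all_cells = [list(cell) for cell in cells] + [[] for _ in range(new_cell_count)]

    # Work out how many objects each cell should hold after rebalancing.
    total = sum(len(cell) for cell in all_cells)
    base, extra = divmod(total, len(all_cells))
    targets = [base + (1 if i < extra else 0) for i in range(len(all_cells))]

    # Collect the surplus from over-full cells...
    surplus = []
    for cell, target in zip(all_cells, targets):
        while len(cell) > target:
            surplus.append(cell.pop())

    # ...and hand it to the cells that are below their target.
    for cell, target in zip(all_cells, targets):
        while len(cell) < target:
            cell.append(surplus.pop())

    return all_cells


# One full cell holding six objects, with two empty cells added.
balanced = slosh([["a", "b", "c", "d", "e", "f"]], new_cell_count=2)
print([len(cell) for cell in balanced])
```

A real system would also have to weigh object sizes, redundancy placement, and the cost of moving data, none of which this toy model attempts.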

In addition to its redundancy features, the StorageTek 5800 Honeycomb also has a number of technologies to make sure the data is available in the future, Considine said.

For instance, the StorageTek 5800 includes the ability to move data from one hardware generation to the next quickly and easily as the underlying server technology is upgraded, Considine said.

Sun has also adopted an open source strategy for the Honeycomb, including making both the APIs and the source code available to third parties and to customers. "The technology will be on the Web forever," Considine said. "It will let people keep evolving the product over time."

The StorageTek 5800 also couples the data with the ability to search it, Considine said. "We realize that as these things scale to hundreds of millions or to billions of objects, we need to scale the ability to do searches," he said. "As customers grow the archives, they also grow the number of nodes and the total compute power. So customers get high-performance searches."

One 16-node cell, including 32 Tbytes of raw storage capacity, lists for about $245,000. Customers can also purchase a half-cell with eight nodes for about half the price, Considine said. Customers can upgrade the half-cell to a full cell, and then increase capacity one cell at a time.
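For readers who want to check the numbers, the quoted figures work out as follows. This is a back-of-the-envelope sketch: the constant and function names are made up for illustration, and the half-cell price is assumed to be exactly half the full-cell list price, whereas Considine said only "about half."

```python
# Figures quoted in the article.
TB_PER_NODE = 2                  # raw Tbytes per storage node
FULL_CELL_NODES = 16
FULL_CELL_PRICE_USD = 245_000    # approximate list price

# Assumption: a half-cell costs exactly half; the article says "about half".
HALF_CELL_NODES = 8
HALF_CELL_PRICE_USD = FULL_CELL_PRICE_USD / 2


def raw_capacity_tb(nodes):
    """Raw capacity in Tbytes for a given number of nodes."""
    return nodes * TB_PER_NODE


def price_per_raw_tb(price_usd, nodes):
    """List price in dollars per raw Tbyte."""
    return price_usd / raw_capacity_tb(nodes)


full_tb = raw_capacity_tb(FULL_CELL_NODES)
full_cost = price_per_raw_tb(FULL_CELL_PRICE_USD, FULL_CELL_NODES)
half_cost = price_per_raw_tb(HALF_CELL_PRICE_USD, HALF_CELL_NODES)

print(f"Full cell: {full_tb} Tbytes raw, about ${full_cost:,.2f} per raw Tbyte")
print(f"Half-cell: about ${half_cost:,.2f} per raw Tbyte")
```

These are raw-capacity figures. Usable capacity would be lower once redundancy overhead is taken into account, and the article does not give that number.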