Storage: Reports Of My Death Have Been Greatly Exaggerated
8:00 AM EST Thu. Dec. 20, 2012
The future of storage: The phrase brings up images of quantum computing, of atomic-scale data storage, of encoding books in DNA -- new ways to squeeze the massive and ever-growing data stores into easily managed, easily used elements.
Stop waiting. It won't happen, at least not in any time frame that will let today's working IT professionals use those technologies.
Instead, for the foreseeable future, the ways IT professionals collect, store, access, manage, archive and delete data will remain familiar to anyone with enterprise storage experience today.
Sure, big changes are coming over the next decade. Faster, more powerful servers and storage devices, coupled with new ways to tie them together and with the cloud, will combine to revolutionize the data center. New technologies will begin to take the hard drive out of the corporate data center and replace it with flash memory, with cloud gateways, or someday even with an empty room.
The real revolution, the revolution today's IT professional will experience before he or she retires, has already started. Here is part one of our special report on the state of the storage industry, which originally ran as an exclusive on the CRN Tech News App in November.
Moving Off Hardware
To get a glimpse of what the future will look like, travel to the California cities of Sacramento and Santa Clara or the Connecticut cities of Stamford and Wallingford to watch how NextCloud, a provider of managed data center services, moves customers off their hardware infrastructures.
NextCloud uses a VMware-based architecture combined with high-speed metro Ethernet in conjunction with the Sacramento-based Herakles Internet data center to enable customers to run their entire business with no storage, servers or PCs, said Founder and CTO Gary Lamb. Everything can be run with thin clients or terminals, he said.
"Life with no hard drive?" Lamb said. "We believe that today. We're living that. I don't believe there is much that cannot be moved from a customer's own premises to a cloud or to a hosted infrastructure."
Lamb cited one customer with 65 users and no PCs in their office. "Before working with us, they had 23 servers that we vMotioned to our facility, which took 1.5 days. Now they have three virtual servers running 65 clients. If they need to increase performance, we can add to the virtual servers."
For another look into the future, head to Los Gatos, Calif., and visit the offices of Pertino, a startup developer of cloud-based virtual networks.
At Pertino, there are no servers on-site. No storage arrays. In fact, the only hint of an IT infrastructure is in the PCs Pertino's employees use to develop its technology and run its business.
Instead, Pertino's entire business IT infrastructure is running in a cloud, including Salesforce.com for CRM, Marketo for marketing automation and lead nurturing, Intacct for accounting, and others, said Todd Krautkremer, vice president of marketing for the company.
That lack of a local IT infrastructure makes sense for a company that's developing a software-defined, cloud-based network-as-a-service platform that will span multiple Internet data centers across multiple service providers and multiple geographies. The service can spin up and down as required to meet capacity and proximity needs.
Common to both Pertino's product and business IT strategies is the ability to leverage cloud elasticity and economics on a demand and subscription basis to reduce operating costs, Krautkremer said. "So we can turn services up or down depending on our business traction," he said. "As a startup, that capability is immensely valuable."
NEXT: Exponential Data Growth
Valuable indeed. The pressure to move storage arrays, servers and other boxes out of the corporate data center and turn them into services is growing quickly as businesses large and small face mountains of data that threaten to overwhelm their IT management capabilities.
Research firm IDC, in its annual "State of the Universe" study, estimated that the amount of information created and replicated in 2011 surpassed 1.8 zettabytes, or 1.8 trillion gigabytes, up 900 percent from five years earlier. That data exists in about 500 quadrillion files. By 2015, IDC estimates, the amount of data created and replicated will reach nearly 8 zettabytes. The San Diego Supercomputer Center surveyed CIOs and CTOs at 30 large enterprises this year and found that corporate data grew at a median compound annual growth rate of 40 percent, meaning it basically doubles every two years.
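Those growth figures are easy to sanity-check. A quick back-of-the-envelope calculation, using only the numbers cited above, confirms that a 40 percent compound annual growth rate comes out to roughly a doubling every two years, and that IDC's 2015 projection implies a similar growth rate:

```python
# Sanity-check the growth figures cited above (all inputs are from the
# article; the arithmetic is just compound growth).
cagr = 0.40  # 40% median compound annual growth rate

# 40% CAGR over two years: 1.4 * 1.4 = 1.96, i.e. roughly a doubling.
two_year_factor = (1 + cagr) ** 2
print(f"Two-year growth factor at 40% CAGR: {two_year_factor:.2f}x")

# IDC: 1.8 ZB created/replicated in 2011, nearly 8 ZB projected for 2015.
zb_2011, zb_2015, years = 1.8, 8.0, 4
implied_cagr = (zb_2015 / zb_2011) ** (1 / years) - 1
print(f"Implied CAGR from 1.8 ZB (2011) to 8 ZB (2015): {implied_cagr:.0%}")
```

The two independent estimates land in the same range, which is why "doubling every two years" became the rule of thumb.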
If it were a simple question of throwing more people at the data centers, all this growth could be managed with relatively little pain. But that's not going to happen.
IDC estimated that, even as data centers will be managing 50 times, or 5,000 percent, more data over the next decade, they will be doing it with only about 50 percent more people than they now employ.
Handling exponentially more data with limited personnel is only the start. Businesses over the foreseeable future will also be dealing with increasingly more complex data requirements.
The San Diego Supercomputer Center notes that there are at least five data types based on persistence, or how long the data is kept. These include temporal (transactional data that may last as little as a fraction of a second), active (available for immediate use by an application), retained (backups, copies and replicas), historical (aged data on lower-cost storage devices), and archive (data that may never be accessed again but must be kept for regulatory or compliance purposes, perhaps forever).
These disparate data types must be handled using different technologies, ranging from being able to generate data for a transaction and then flush it away to understanding which data should be archived while making sure it can be accessed 30 or more years later if needed.
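The mapping from persistence class to handling policy can be sketched in a few lines of code. Everything here beyond the five class names is illustrative: the tiers and retention actions are hypothetical examples, not taken from any product:

```python
# A minimal sketch of routing data to a handling policy by persistence
# class, following the five classes described above. The POLICY table
# entries (tiers and actions) are illustrative assumptions.
from enum import Enum

class Persistence(Enum):
    TEMPORAL = "temporal"      # transactional; may live a fraction of a second
    ACTIVE = "active"          # available for immediate use by an application
    RETAINED = "retained"      # backups, copies, replicas
    HISTORICAL = "historical"  # aged data on lower-cost storage
    ARCHIVE = "archive"        # kept for compliance, possibly forever

# Hypothetical policy table: each class maps to a storage tier and a
# retention action.
POLICY = {
    Persistence.TEMPORAL: ("memory", "flush after transaction commits"),
    Persistence.ACTIVE: ("primary disk/flash", "keep online"),
    Persistence.RETAINED: ("backup appliance", "expire per backup schedule"),
    Persistence.HISTORICAL: ("low-cost disk", "migrate as access declines"),
    Persistence.ARCHIVE: ("tape or cloud archive", "hold for regulatory term"),
}

def handle(p: Persistence) -> tuple:
    """Return the (tier, action) pair for a persistence class."""
    return POLICY[p]

print(handle(Persistence.ARCHIVE))
```

The point of the exercise is that each class demands a different technology, which is exactly why no single storage product covers the full data life cycle.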
Another issue complicating management of storage is that the fastest-growing part of the information explosion, unstructured data, also just happens to be the hardest to manage. Unstructured data, which accounts for about 90 percent of the digital information being collected and stored, includes text, audio and video files, photographs, and other data that is not easy to handle using traditional database management tools.
The bigger question is, after the data is collected and stored, now what? That's where the concept of big data, which is technology for real-time analysis of huge amounts of data, comes in, and where assumptions about a company's ability to manage storage with existing tools goes out the door.
Further complicating the issues related to uncontrolled data growth is the fact that so much of that data is actually useless. Deidre Paknad, founder of the Compliance, Governance and Oversight Council and director of information life-cycle governance solutions at IBM, wrote in a Forbes magazine article that a survey of businesses found that, at a typical organization, 1 percent of data is on litigation hold, 5 percent falls under some form of records retention, and 25 percent has current business value.
That leaves 69 percent of the information at a typical business with no business, legal or regulatory value. At the same time, Paknad wrote, IT needs to make a billion choices to determine what part of that data can be safely tossed.
NEXT: Into The Cloud
The cloud can be a safe harbor from many if not all of these issues. In fact, over the next decade, the cloud will become a preferred technology for storing all or part of a company's data, once some key issues are solved.
Of the 7.9 zettabytes IDC estimates will be stored worldwide in 2015, 0.8 zettabytes is expected to be maintained in the cloud, while another 1.4 zettabytes will be stored or processed in a cloud at some point between when the data is first created and when it is eventually disposed of.
Roberto Basilio, vice president of storage product management at Hitachi Data Systems (HDS), said a majority of data will go to the cloud as a way to provide IT as a service. "It will happen when everything is secure, and when everything happens very fast," Basilio said.
How much data moves to the cloud over the next decade is predicated on the question of how much of the rest of the IT infrastructure, including servers, will move to the cloud, said David Scott, senior vice president and general manager for storage at Hewlett-Packard.
"I won't say it won't be an inexorable trend for plenty of data to move to the cloud whether the servers are there or not," Scott said. "Massive stores of information are going to the cloud over the next decade, and content depots will be searched. ... But if there are terabytes of data in the results, there will still be issues with connections between a business and the cloud. The speed of light limits the speed of data in the cloud."
Still, the attraction of the cloud as a place to store data is compelling, in part because of its flexibility vs. storing data in a company's own physical infrastructures. As the amount of data a company stores continues to grow, there has to be somewhere to put it all. With on-site or remote physical storage devices, that means purchasing more boxes.
That raises the question of how much capacity should be purchased. Few if any companies truly know how fast their storage needs are growing, so they are forced to purchase far more capacity than they need to ensure they don't run out of space. Even with such technologies as compression and deduplication, thin provisioning, and virtualization of multiple arrays into large storage pools, keeping up with data growth is a logistical nightmare.
An enterprise storage cloud allows capacity and data volumes to grow as needed and when needed, all without the IT manager worrying about when and how to purchase new physical capacity. Cloud storage also shrinks when needed. For instance, with the cloud, it is possible to build a storage infrastructure for testing or staging new applications on real data, run the tests, and then delete the data when done.
NEXT: Cloud Storage Use Cases
Clouds can be used for data protection, file sharing, collaboration, archiving, test and development, pre-staging of new applications, disaster recovery, and other storage-related tasks that in the past were grounded in physical infrastructures.
They will also be the way queries to large databases will be handled, said Dave Hitz, executive vice president and co-founder of NetApp.
A database search might be wrapped in a kilobyte of data that is sent to a data store in a cloud, Hitz said. "The search will be done on the cloud, which will return two to three kilobytes of results," he said.
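Hitz's point is about bandwidth asymmetry: shipping a small query to the data is vastly cheaper than shipping the data back to the query. A rough illustration, where the query and result sizes come from Hitz's description but the dataset size is a made-up example:

```python
# Illustrative comparison (the dataset size is hypothetical) of moving
# the query to the data versus moving the data to the query.
KB, GB = 1024, 1024 ** 3

query_bytes = 1 * KB       # query wrapped in about a kilobyte
result_bytes = 3 * KB      # two to three kilobytes of results
dataset_bytes = 100 * GB   # assumed size of the cloud data store

transfer_query_to_data = query_bytes + result_bytes  # send query, get results
transfer_data_to_query = dataset_bytes               # pull the whole store

ratio = transfer_data_to_query / transfer_query_to_data
print(f"Moving the data costs {ratio:,.0f}x more bytes on the wire")
```

Even for a modest data store, the wire-cost difference is seven orders of magnitude, which is why query processing migrates toward where the data lives.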
Meanwhile, nearly all the headaches related to storage, including data management and making sure data is properly secured and backed up, become someone else's problem. And that is a beautiful thing for the IT manager, who can now invest in better ways to use the company's data to gain business value rather than in the money pit of day-to-day data management.
It's a decision that smaller companies are already making whether or not they realize it, Pertino's Krautkremer said. Many companies sit in what he called a "partly cloudy" world.
"Part of their business is in the cloud, like their Exchange data, and part is in on-site legacy apps, like MRP [manufacturing resource planning]," he said. "For these customers, who make up the majority of SMBs today, a cloud-based network can seamlessly bridge both worlds and aid in migration between them while providing unified access, visibility and control."
Indeed, in an ideal world, the biggest decision about storing most or all of a company's data in the cloud often comes down to which cloud storage provider to use. But cloud storage is no silver lining. Despite all the hype surrounding it, cloud storage is not cheap. Issues related to Internet bandwidth, security, and long-term archiving remain in flux, though remedies are in the works.
For example, IT administrators can mitigate cloud latency by taking advantage of new cloud gateway technologies coming to market. Cloud gateways combine a local storage appliance, which keeps frequently accessed data available for high-performance applications, with software that moves the bulk of a business' data to the cloud, where it can sit largely undisturbed once access tapers off, typically after 60 or 90 days.
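The caching pattern behind a cloud gateway can be sketched simply. The function and file names below are hypothetical, and the 90-day threshold is just one point in the 60-to-90-day range mentioned above; real gateways implement this logic in their own firmware or software:

```python
# A minimal sketch of cloud-gateway tiering: keep recently accessed
# files on the local appliance, evict cold files to cloud object
# storage. All names and the threshold are illustrative assumptions.
import time

COLD_AFTER_DAYS = 90
SECONDS_PER_DAY = 86400

def partition_files(files, now=None):
    """Split {path: last_access_epoch} into (keep_local, send_to_cloud)."""
    now = time.time() if now is None else now
    local, cloud = [], []
    for path, last_access in files.items():
        age_days = (now - last_access) / SECONDS_PER_DAY
        (cloud if age_days > COLD_AFTER_DAYS else local).append(path)
    return sorted(local), sorted(cloud)

now = 1_000_000_000
files = {
    "/data/q4-report.xlsx": now - 5 * SECONDS_PER_DAY,     # hot: stays local
    "/data/2010-archive.zip": now - 400 * SECONDS_PER_DAY,  # cold: to cloud
}
print(partition_files(files, now=now))
```

The local tier absorbs latency-sensitive reads while the cloud tier absorbs the capacity growth, which is the whole appeal of the gateway approach.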
NEXT: Convergence On The Horizon
While cloud storage grows, the next decade also will see servers and storage meld together into a single technology as customers demand IT systems that require fewer resources to manage and as vendors look for ways to increase storage performance.
Today, every major storage and server vendor has some sort of converged infrastructure bringing servers, storage and networking technologies together as a single system that can be managed as a whole. Hewlett-Packard, IBM and Dell have done so on their own, while Cisco, EMC, NetApp, Oracle and Hitachi Data Systems have done so in partnership with each other or with other networking vendors.
Technological constraints have kept servers and storage separate, said Vincent Hsu, an IBM Fellow. However, he said, flash memory and other new technologies are now allowing the two to be integrated.
"The trick is, how can those silos work together?" Hsu said. "People are working very hard to get those silos to work together, both from an interoperability view and in systems from a single vendor, and then managing it all as one."
One way to do that is to forget the traditional data query process of making data available to a server. "In the future, users will send the queries to where the data is stored," he said. "The queries will be processed there locally, and then return the results faster. We will send the function to where the node is, and it will be processed there."
EMC plans to start demonstrating capabilities next year that allow customers to run virtual machines inside storage arrays, said Pat Gelsinger, former president and COO at EMC and current CEO of VMware.
"This allows you to say, 'Hey, the data is, like, really big and really hard to move around, so why don't I move that comparatively light application right over to the big heavy data, and run it as close as I can to the data.'"
NEXT: What Happens To Hard Drives
So with the growth of convergence and cloud storage, will the hard drive, or any of today's other common storage technologies, disappear from the corporate data center?
No. While it would be nice to get rid of every hard drive as a way to improve the management of storage, it will not happen in the coming decade. Individual desktops and mobile PCs will likely continue to use hard drives as their primary local storage, even if customers add SSDs or a flash memory cache to increase performance. And one or more storage appliances will remain on-site with a copy of a business' data for staging latency-constrained backups and for fast recovery of lost or corrupt data.
It's like the tape drive. Forty years ago, tape was declared dead, said HDS' Basilio. "And yet we're still using it," he said. "Disk will follow a similar path."
People want to get off disks because the power and space they need are costly, Basilio said. "But disk drives remain cost-effective," he said. "Despite falling flash drive prices, they are still much higher than disk prices."
Hard drive technology is still evolving in a couple of areas. The first is capacity, or how much data can be stored on an individual drive, which can grow in two ways.
The first is areal density, or the number of bits of data that can be stored per square inch on the spinning platters. Average areal density in 2011 was 744 Gbits per square inch, but could grow to 1,800 Gbits per square inch by 2016, according to research firm IHS iSuppli. That would result in capacity-per-drive of 30 TB to 60 TB for 3.5-inch drives, and 10 TB to 20 TB for 2.5-inch drives, compared to a 4-TB maximum capacity this year.
The second measure of capacity is the maximum number of platters that can safely spin inside a drive, which is currently five. That limit stems from drag at the edges of platters spinning at high speed, which forces drives to use more power and requires spacing between the platters to account for vibration in the heads and platters caused by the turbulence. Drive makers are addressing the issue by replacing the air inside drives with helium, which significantly reduces the friction that causes the turbulence and could allow seven or more platters to spin in the same space.
However, don't expect hard drives to get any faster. If anything, they will get slower. The speed at which platters inside the disk spin, as measured by revolutions per minute, is currently at a maximum of 15,000 rpm. But the amount of power needed to push that speed higher is such that drives with higher spin rates are not expected to be offered commercially.
In fact, don't be surprised if 15,000-rpm hard drives get phased out over the course of the next two years. The development of flash-based storage technology will take a bite out of the need for faster hard drives, as even a small amount of flash storage when tied to large amounts of disk capacity results in a significant increase in storage performance.
NEXT: Flash Storage
Flash memory-based technology promises to speed up the performance of storage, either locally or on the cloud, but it is no panacea for managing the coming data deluge.
Flash storage is evolving rapidly, with the technology being applied in a number of ways to attack the problem of storage performance from multiple angles.
For instance, vendors are adding SSDs to portable PCs to serve as a boot drive or even as primary storage, to storage arrays to serve as either cache or as a spinning drive replacement for higher-performance data, or to servers to act as boot drives or as a spinning drive replacement.
Flash storage is also available on PCIe cards for servers to speed up the performance of a specific application, and in appliances that give multiple servers access to shared high-performance flash. New technologies released this year allow flash memory in servers to be pooled across multiple servers.
Yet while flash storage is being substituted directly for disk in consumer devices like tablets and thin-and-light mobile PCs, that is not how it will be used in the data center, said Claus Mikkelsen, chief scientist at HDS.
Mikkelsen said that only 5 percent of data can actually benefit from dynamic tiering, which automatically moves data between different types of storage media depending on how quickly it will be accessed.
"That limits the overall impact to dynamic tiering," he said. "So that will limit the volume of SSD production."
Also, despite all the advances, flash memory is still many times more expensive on a per-Gbyte basis than spinning disk. Gelsinger said that flash memory costs 30 to 100 times more per Gbyte than hard drives, making flash a small part of the storage mix for years to come.
"If I come to you as a great customer of EMC tomorrow and say, 'You should move to all-flash, it's just going to increase your storage costs by, let's say, maybe 50-X,' how do you think the conversation is going to go?" Gelsinger asked.
Looking over the next 10 years, Gelsinger said he does not see the 30-times to 100-times per-Gbyte cost differential between flash memory and hard disk storage changing substantially.
"To change this equation, it would have to change dramatically," he said. "If I'm off by a factor of two, then it might be 15 to 50 times more expensive. ... We feel very confident in saying that we continue in this hybrid world where you need both Flash and hard disk drives for as far as we can see into the future."
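Gelsinger's arithmetic can be made concrete with a back-of-the-envelope blended-cost calculation. The disk price and the 5 percent flash fraction below are hypothetical illustrations (the fraction echoes Mikkelsen's tiering figure), and the 50x multiplier sits inside Gelsinger's 30x-to-100x range:

```python
# Back-of-the-envelope blended cost per GB for a hybrid array.
# The disk price and tier split are assumptions; the 50x multiplier
# is one point in Gelsinger's stated 30x-100x range.
disk_per_gb = 0.05           # hypothetical disk cost, $/GB
flash_multiplier = 50        # flash at 50x disk
flash_per_gb = disk_per_gb * flash_multiplier

flash_fraction = 0.05        # put 5% of capacity on flash (illustrative)
blended = flash_fraction * flash_per_gb + (1 - flash_fraction) * disk_per_gb

print(f"All-disk:  ${disk_per_gb:.3f}/GB")
print(f"All-flash: ${flash_per_gb:.3f}/GB ({flash_multiplier}x disk)")
print(f"Hybrid (5% flash): ${blended:.4f}/GB, "
      f"{blended / disk_per_gb:.2f}x the all-disk cost")
```

Even a thin flash tier multiplies the blended cost several times over, while going all-flash multiplies it fifty-fold, which is the economics behind the hybrid prediction.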
Falling prices and rising volumes for flash storage will eventually change the equation with hard drives, but not in the foreseeable future, said Steve Sicola, CTO of storage company X-IO.
"You couldn't satisfy the world's needs with the entire amount of flash available today if price was no issue," Sicola said. "But over the next 10 years, I see the percentage of flash used in storage systems that also have hard drives going up as the price difference slowly erodes."
NEXT: Tape's Role
When looking at the storage technology of the future, it is too easy to forget tape, the big storage technology of the past, but it will still be a fixture on the storage landscape for the foreseeable future.
Now, it's OK to forget tape as a backup medium. Tape is too slow for routine, everyday backup and recovery of data. SOHO and SMB customers can more easily use the cloud for routine data protection, perhaps combined with disk-based appliances that keep data locally for quick restores and stage the backups when Internet latency is an issue. And enterprises will continue to adopt a combination of the cloud for protecting non-mission-critical data and other technologies for mission-critical data.
Tape, however, will continue to prove to be the lowest-cost, most reliable technology for archiving most types of data.
Horison Information Strategies estimates that more than 85 percent of tape drives use LTO technology, which is currently in its fifth generation. With compression, LTO-5 tapes can store 3.0 TB per cartridge, with a data rate of up to 280 MB per second. Road maps have been announced for the next three generations of LTO technology, with LTO-8 eventually expected to store up to 32 TB per cartridge at a data rate of up to 472 MB per second.
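Those LTO-5 figures translate directly into a backup window. A quick calculation, assuming the drive sustains its maximum compressed data rate for the whole job (a best case, since real transfers rarely hold the peak rate):

```python
# Estimate the time to fill one LTO-5 cartridge, using the capacity
# and data rate cited above. Assumes the maximum sustained rate is
# held throughout, which is a best-case simplification.
capacity_tb = 3.0   # compressed capacity per cartridge
rate_mb_s = 280     # maximum compressed data rate

capacity_mb = capacity_tb * 1_000_000  # decimal units, as drive specs use
seconds = capacity_mb / rate_mb_s
hours = seconds / 3600
print(f"Time to fill one LTO-5 cartridge: {hours:.1f} hours")
```

A roughly three-hour fill time per cartridge is fine for overnight archive jobs but illustrates why tape lost the routine-backup role to disk and cloud.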
Tape as a storage technology for archival data will continue to offer significant advantages over disk-based storage, and even cloud-based storage. Disk storage, when used for backups, costs about four times as much as tape; when used for archiving, the cost difference is 15-to-1.
Tape storage also remains far cheaper than the lowest-cost cloud storage on a per-GB basis, and will likely remain so for the next decade and beyond.
So while tape storage and other familiar technologies aren't going away anytime soon, the storage market is transitioning and big changes are ahead. For a look at how these changes will impact IT professionals and solution providers, make sure to read the next installment of our storage revolution series.