Storage 101: Cost Control Through Storage Virtualization


At a time when budgets are shrinking and businesses are scrutinizing their spending patterns, it is especially important to use every storage resource to its fullest extent. The average customer is expected to buy 100 percent more storage this year than last, increasing the administrative burden. According to Gartner, a research and advisory firm, every dollar of storage requires five dollars of IT resources to manage it. Yet, while half of IT budgets are typically spent on storage, less than 50 percent of the existing storage capacity is actually in use. As environments continue to grow and become increasingly complex, it is therefore important not only to leverage existing IT staff and optimize the use of all storage resources, but also to minimize the rising cost of managing ever-larger amounts of data. Technologies deployed within an IT environment must support a simplified, centralized storage management strategy that reduces the learning curve for IT professionals and thereby increases efficiency and contains these costs.

Typically, end users are not interested in the physical aspects of the storage serving their applications (e.g., seek times or how many disks are in a string). What they do care about are the business issues: application response time and throughput, sufficient capacity for their data as it grows, and the reduction or elimination of application downtime. In short, they care about the availability and accessibility of their data, not the physical aspects of the storage. Storage virtualization presents multiple ways to access, manage and use existing storage resources.

The following discussion defines storage virtualization and how new technologies offer even more ways to address the functional challenges of storage.

What is Storage Virtualization?
Storage virtualization is the process of consolidating multiple physical storage devices from various vendors and reorganizing them into logical (virtual) pools, or units, of storage. These units are presented to the operating system (OS) for use by the appropriate applications and end users. Despite the recent surge in interest, storage virtualization is not new, either in concept or in practice. Defined almost 20 years ago in mainframe computing, storage virtualization is finding new life and importance with the emergence of storage area networks (SANs). With all the excitement about SAN virtualization, it's easy to lose sight of the immediate benefits that storage virtualization can also deliver in the prevailing direct-attached storage architecture. Whether in mainframe or open system environments, storage virtualization technologies are used to simplify and centralize storage administration and provide flexibility in meeting the demands of today's data requirements.

Storage virtualization removes the physical restrictions of storage by creating a layer of abstraction above the physical storage itself, accessible as a logical 'pool' of storage, which can be allocated when and where needed.

This layer provides the ability to combine heterogeneous physical devices into virtual entities designed to meet individual application requirements. For example, a virtual pool of storage can be created using the fastest physical disks to optimize performance for a mission-critical application, while shielding the user and the application from the details of the hardware implementation. Then, as new hardware becomes available or application characteristics change, modifications to the physical layer can be made without interrupting access to the data on the logical device. Several storage virtualization technologies have evolved, including storage-based, host-based, in-band and out-of-band solutions.
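The abstraction layer described above can be pictured as an address map from logical blocks to physical ones. The following is only a minimal sketch in Python, not any vendor's implementation: the class names PhysicalDisk and VirtualVolume, the method names, and the block-based addressing are all assumptions made for illustration, and the copying of data during a migration is deliberately not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class PhysicalDisk:
    """Hypothetical physical device; capacity is counted in blocks."""
    name: str
    blocks: int


@dataclass
class VirtualVolume:
    """Hypothetical logical volume built by concatenating disk extents."""
    # Each extent is a tuple: (disk, first physical block, length in blocks)
    extents: list = field(default_factory=list)

    def add_extent(self, disk, start, length):
        # Refuse extents that would run past the end of the physical disk.
        if start < 0 or length <= 0 or start + length > disk.blocks:
            raise ValueError("extent does not fit on disk")
        self.extents.append((disk, start, length))

    @property
    def size(self):
        # Logical size is the sum of all extent lengths.
        return sum(length for _, _, length in self.extents)

    def resolve(self, logical_block):
        """Translate a logical block number to (disk name, physical block)."""
        if logical_block < 0:
            raise IndexError("negative logical block")
        offset = logical_block
        for disk, start, length in self.extents:
            if offset < length:
                return disk.name, start + offset
            offset -= length
        raise IndexError("logical block beyond end of volume")

    def migrate_extent(self, index, new_disk, new_start):
        """Repoint one extent at different hardware.

        Only the mapping changes here; a real product would also copy
        the data. Logical block numbers seen by the application stay
        the same.
        """
        _, _, length = self.extents[index]
        if new_start < 0 or new_start + length > new_disk.blocks:
            raise ValueError("extent does not fit on new disk")
        self.extents[index] = (new_disk, new_start, length)
```

An application that reads logical block 0 keeps using that same address before and after migrate_extent is called; only the physical location returned by resolve changes, which is the sense in which the physical layer can be modified without disturbing the logical device.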

Storage-based solutions were among the first virtualization technologies and allow multiple servers to share access to data on one (large) individual array. The downside, however, is that none of these servers can access data beyond that array. Additionally, users run the risk of becoming locked into vendor-dependent solutions because of the hardware compatibility restrictions of the large array or the direct-attached servers. Storage-based virtualization can therefore be more expensive than the other, less proprietary, alternatives.

Host-based solutions permit disks within multiple arrays and from multiple vendors to be represented as a virtual pool to a single host server. This gives IT staff greater flexibility by making them completely storage vendor independent while still providing the benefits of centralized storage administration from a single console. And, though the data is only available through a single server, this server can be made highly available to ensure 24x7 data access across the enterprise. The potential disadvantage of this solution arises when multiple servers require shared access to the same data. In this case, data replication or another method of storage virtualization may be considered.

The next step in the storage virtualization evolution was to ensure that disks from multiple arrays and multiple vendors are represented virtually to more than one server. This can be accomplished with an in-band solution, often referred to as symmetric virtualization. Like host-based solutions, this type of virtualization is storage vendor independent and can be used in both SAN and local area network (LAN) environments. However, because the in-band appliance sits in the data path, it is critical that the appliance itself can scale to meet expanding application and storage management requirements.

Out-of-band solutions provide the power of symmetric virtualization but move the storage appliance out of the data path. This type of virtualization is also best positioned to utilize the fast connection power of a SAN switch. To achieve these benefits, however, software must run on the application servers themselves for virtual device communication. This may increase central processing unit (CPU) utilization and will make OS upgrades more precarious when these solutions are in place.

All of the different virtualization technologies are useful, and which option to select depends on the IT environment and specific end user business requirements. It is important to understand one's data flow when selecting a storage virtualization solution to ensure that today's business needs will be met, while maintaining an ability to adapt to the unknown requirements that will present themselves as technologies continue to grow and change. A storage virtualization layer that supports rather than restricts business decisions is critical.

Flexibility and Intelligent Provisioning
Because data is not tied to specific hardware devices, virtualization allows an unprecedented degree of flexibility in using storage resources to meet both end user and application requirements. The virtual storage devices are not restricted by the capacity, speed or reliability limitations of the physical devices that comprise them. Applying intelligent storage software in the virtualization layer offers a way to address the functional challenges of storage without compromising the availability needs of the data.

Provisioning is the act of providing users or applications with the right amount and right type of storage, at the right time. Virtualization makes provisioning much easier. With storage virtualization and centralized administration, changes to the physical storage layer can be made without interrupting data access, to continually provide the best Quality of Storage Service (QoSS) through real time provisioning.
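Provisioning as defined above, the right amount and the right type of storage, can be sketched as a simple allocation against a pool that is divided into tiers. This is an illustrative sketch only: the function name provision, the tier names, and the use of gigabytes as the unit are assumptions made here, not features of any particular product.

```python
def provision(free_gb_by_tier, requests):
    """Grant each request from the pool tier it asks for, in order.

    free_gb_by_tier: dict mapping a tier name to its free capacity in GB
    requests: list of (application, tier, size_gb) tuples
    Returns (allocations, unmet, remaining_free); the input dict is
    not modified.
    """
    free = dict(free_gb_by_tier)  # work on a copy of the pool state
    allocations, unmet = [], []
    for app, tier, size_gb in requests:
        if free.get(tier, 0) >= size_gb:
            # Right type (tier) and right amount are both available.
            free[tier] -= size_gb
            allocations.append((app, tier, size_gb))
        else:
            # Not enough free capacity of the requested type.
            unmet.append((app, tier, size_gb))
    return allocations, unmet, free
```

A centralized administrator would call this against the whole pool rather than against individual devices; an unmet request signals that capacity of that type needs to be added to the pool, which the virtualization layer allows without interrupting access for existing allocations.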

Consolidation and Cost Control
The cost of managing storage is significant and over time exceeds the purchase price of the storage hardware. One way to monitor this cost is to evaluate the amount of storage managed per administrator. Efficient organizations may have one administrator per terabyte of storage or better, while others with more difficult environments may have a much lower storage-to-staff ratio. Storage virtualization can simplify the management of resources in heterogeneous environments, ultimately reducing the true cost of the storage. Management is simplified because staff need to learn and deploy only one tool to centrally manage entire pools of storage.

Virtualization also reduces costs through better hardware utilization and consolidation. Consolidation is the act of combining storage resources into a virtual pool of storage, accessible to many applications or, in clustered or SAN environments, to many servers. Before there is even a need to add resources, storage virtualization enables the proactive consolidation of all storage devices. As part of a virtual unit, storage can become available to applications and servers that previously had no access to it. This applies particularly well to SAN environments, but is by no means exclusive to them.
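The effect of consolidation on utilization can be shown with a little arithmetic. The sketch below is an illustration under assumed figures: the function names, the server names and the capacities are hypothetical, chosen only to show how free space stranded on one server becomes usable once it is part of a pool.

```python
def utilization(islands):
    """Overall fraction of capacity in use.

    islands: dict mapping a server name to (used_gb, total_gb)
    """
    used = sum(u for u, _ in islands.values())
    total = sum(t for _, t in islands.values())
    return used / total


def fits_isolated(islands, server, extra_gb):
    """Can the server grow using only its own attached storage?"""
    used, total = islands[server]
    return total - used >= extra_gb


def fits_pooled(islands, extra_gb):
    """Can the request be met when all free space forms one pool?"""
    free = sum(t - u for u, t in islands.values())
    return free >= extra_gb
```

With three hypothetical servers at 90, 20 and 30 GB used out of 100 GB each, overall utilization is under 50 percent, yet the first server cannot grow by 40 GB on its own storage. Treated as one pool, the same hardware has 160 GB free and the request fits without a new purchase.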

Intelligent provisioning and consolidation as a result of storage virtualization will help control IT costs and reduce overhead, allowing effective data storage management with less manpower.

Be sure to look for our second Storage 101 class, "Networked Storage Solutions," which will be posted on January 21.