Storage 101: What is Networked Storage?


In the beginning, a computer was connected directly to its storage, and no other computer could access that storage. All applications ran on a single mainframe computer, and the world was at peace. As client-server computing developed, applications ran on dedicated servers each with its own storage, and soon those applications needed to share data. And, as disk array capacities grew, a single array could supply the storage needs of multiple servers. So, networked storage was born.

The first requirement--sharing data between servers--is addressed by Network Attached Storage (NAS). A NAS device provides file access to the clients connected to it, using file access protocols (primarily CIFS and NFS) carried over Ethernet and TCP/IP.

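The second requirement--supplying the storage needs of multiple servers from shared arrays--is addressed by Storage Area Networks (SANs), which allow multiple servers to share disk space from one or more disk arrays. SANs provide block-level storage access to servers using Fibre Channel technology.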

In the NAS paradigm, the file system that organizes blocks of storage into objects that are convenient for applications to deal with resides in the storage device. The NAS storage device is responsible for allocating storage space and for keeping clients from stepping on each other's toes as they make file access requests. On the host side of the interconnect, a file access client translates applications' file I/O requests into network messages and sends them to the NAS device for execution.

By contrast, in today's SAN paradigm, the file system is on the computer side of the interconnect. System-wide storage capacity management and conflicts among client data access requests are resolved by cooperation among the SAN-attached servers. This makes host-side software much more complex than with NAS devices.
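The difference in where the file system lives can be sketched with a toy model. This is an illustrative sketch only, not how any real NAS or SAN product works: the class names (BlockDevice, ToyFileSystem, NasDevice, SanHost) and the 16-byte block size are assumptions made up for this example. It shows a NAS device receiving file-level requests, and two SAN hosts that each run their own file system against the same array without cooperating.

```python
class BlockDevice:
    """Storage that only understands numbered, fixed-size blocks."""

    def __init__(self, num_blocks, block_size=16):
        self.block_size = block_size
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]

    def write_block(self, number, data):
        if len(data) > self.block_size:
            raise ValueError("data larger than one block")
        # Pad short writes so every stored block has the same size.
        self.blocks[number] = data.ljust(self.block_size, b"\0")

    def read_block(self, number):
        return self.blocks[number]


class ToyFileSystem:
    """Organizes blocks into named files (allocation is a simple counter)."""

    def __init__(self, device):
        self.device = device
        self.table = {}      # file name -> (block numbers, length in bytes)
        self.next_free = 0   # next unallocated block, as seen by THIS file system

    def write_file(self, name, data):
        size = self.device.block_size
        numbers = []
        for start in range(0, len(data), size):
            self.device.write_block(self.next_free, data[start:start + size])
            numbers.append(self.next_free)
            self.next_free += 1
        self.table[name] = (numbers, len(data))

    def read_file(self, name):
        numbers, length = self.table[name]
        raw = b"".join(self.device.read_block(n) for n in numbers)
        return raw[:length]


class NasDevice:
    """NAS model: the file system is inside the device; clients send file requests."""

    def __init__(self, num_blocks):
        self._fs = ToyFileSystem(BlockDevice(num_blocks))

    def handle(self, operation, name, data=None):
        if operation == "write":
            return self._fs.write_file(name, data)
        if operation == "read":
            return self._fs.read_file(name)
        raise ValueError("unknown operation: " + operation)


class SanHost:
    """SAN model: the file system is on the host; the array sees only block requests."""

    def __init__(self, shared_array):
        self.fs = ToyFileSystem(shared_array)


# NAS: one file system in the device arbitrates for every client.
nas = NasDevice(num_blocks=8)
nas.handle("write", "report.txt", b"shared through the NAS device")
nas_result = nas.handle("read", "report.txt")

# SAN: two hosts with independent allocation tables both pick block 0.
array = BlockDevice(num_blocks=8)
host_a, host_b = SanHost(array), SanHost(array)
host_a.fs.write_file("a.dat", b"AAAA")
host_b.fs.write_file("b.dat", b"BBBB")
clobbered = host_a.fs.read_file("a.dat")  # host_a now reads b"BBBB"
```

In the SAN half of the sketch, host_a's data is silently overwritten because nothing coordinates block allocation between the two hosts. That coordination is the extra host-side work described above.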

By absorbing the file system into the storage device, the NAS model makes concurrent data access by different types of computers easy. In fact, today, NAS is the only widely available way to make the same data accessible to computers of different types.

Additionally, NAS file access protocols are very general and functionally rich. Moreover, they usually connect to TCP/IP-based networks, which are designed to support very general interconnection topologies. Because of their functional richness and generality, these protocols are predominantly implemented in software, which executes slowly compared to the device-level firmware and hardware typically used to implement SAN protocols. Raw data access performance of NAS devices therefore tends to be lower than that of otherwise comparable SAN devices, and both client and server processor utilization for accessing data tends to be higher. In simple terms, the trade-off today is as follows:

  • Choose NAS for simplicity of data sharing, particularly among computers and operating systems of different types.
  • Choose SAN for the highest raw I/O performance between data client and data server. Be prepared to do some additional design and operational management to make servers cooperate (or at least not interfere) with each other.

    Benefits of Networked Storage
    Throughout the journey into storage networking, it's important to keep sight of the benefits being sought. The specific benefits that storage networking delivers are different in every situation, but with storage networking, as with any aspect of information technology, benefits can be broadly classified as either:

  • Reducing the cost of providing today's information services.
    or
  • Providing or enabling new services that contribute positively to overall enterprise goals.

    Storage networking offers ample opportunity for an information services department to deliver both types of benefits. For example, in the realm of cost savings:

  • If all online storage is accessible by all computers, then no extra temporary storage is required to stage data that is produced by one computer and used by others. This can represent a substantial capital cost saving.
  • Similarly, if tape drives and robotic media handlers can be accessed directly by all computers, fewer of these expensive and infrequently used devices are needed throughout the enterprise. This, too, reduces total enterprise capital cost for information processing without diminishing the quality of service delivered.
  • Probably most important, however, are the administrative and operational savings in not having to implement and manage procedures for copying data from place to place. This can greatly reduce the cost of people--the one component cost of providing information services that doesn't go down every year!

    The Secret to Networked Storage Success: Software
    Today, most of the attention given to networked storage is focused on the interconnects (such as Fibre Channel) that allow universal storage connectivity and the storage devices and computers that connect to them. But interconnects by themselves don't add any functionality to information processing; they only enable functionality to be added. To realize the benefits promised by networked storage, the hardware connectivity, performance, availability and function must be in place to enable them, and system and application software must also take advantage of the hardware to deliver them.

    When evaluating networked storage technology, the hardware components deserve close scrutiny, to be sure. More important, however, one must also scrutinize the software capabilities carefully to ensure that the implementation will deliver the functionality enabled by the hardware. The following are some examples of how software helps realize the benefits of networked storage:

    Sharing Tape Drives: Tape drives are expensive, and they're only actually in use while backups are occurring, so it makes sense to share a SAN-attached tape drive among servers. If a tape drive is connected to computers through a SAN, different computers can use it at different times. All the computers get backed up, the tape drive investment is used efficiently, and capital expenditure stays low.
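The "different computers at different times" rule needs software to enforce it. The following is a minimal sketch of one way to do that, assuming a simple reserve/release discipline; SharedTapeDrive, run_backups, and the server names are hypothetical and do not correspond to any particular backup product or device command set.

```python
class SharedTapeDrive:
    """A SAN-attached tape drive that one server at a time may reserve."""

    def __init__(self, name):
        self.name = name
        self.holder = None   # server currently holding the reservation
        self.log = []        # record of what was written, in order

    def reserve(self, server):
        # Refuse the reservation if a different server already holds the drive.
        if self.holder is not None and self.holder != server:
            return False
        self.holder = server
        return True

    def release(self, server):
        # Only the current holder can give the drive back.
        if self.holder == server:
            self.holder = None


def run_backups(drive, servers):
    """Back up each server in turn, one reservation at a time."""
    for server in servers:
        if not drive.reserve(server):
            raise RuntimeError(drive.name + " is busy")
        try:
            drive.log.append(server + ": backup written")
        finally:
            drive.release(server)   # always free the drive for the next server
    return drive.log


drive = SharedTapeDrive("tape0")
first_try = drive.reserve("mail-server")    # succeeds, drive was idle
second_try = drive.reserve("web-server")    # fails, mail-server holds it
drive.release("mail-server")
backup_log = run_backups(drive, ["mail-server", "web-server", "db-server"])
```

One drive serves all three servers here because each holds the reservation only for the duration of its own backup.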

    Sharing Online Storage Devices: Sharing online storage that's housed in an enterprise RAID subsystem is similar to sharing tape drives, except that more of it goes on and requirements for configuration changes are more dynamic. A typical enterprise RAID subsystem makes the online storage capacity of one or more arrays of disks appear to be one or more very large, very fast, or very reliable disks. For now, accept that a RAID subsystem can look like several virtual disks from the viewpoint of host servers. It is quite reasonable that different servers be able to access those virtual disks at different times. For example, one server might collect a business day's transaction records on disk and hand them off to another server at the end of the day for summarization, analysis, or backup.

    Application Failover: Since SANs connect all of an organization's storage devices to all of its servers, it should be possible to create highly available computing environments, in which a substitute computer can take over from a failing one, restart its applications, and resume processing its data. These things are computers after all, so they should be able to recognize the symptoms of failure and fail over automatically. Ideally, this would happen transparently to applications, since it's not really practical to rewrite all of the world's applications overnight to take advantage of highly available computing environments.
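As a rough illustration of "recognize the symptoms of failure and fail over automatically," the sketch below treats a missed heartbeat as the symptom. It is a simplified model under stated assumptions: FailoverMonitor, the timeout value, and the server and application names are invented for this example, and real cluster software must also handle storage fencing, data recovery, and network partitions.

```python
class FailoverMonitor:
    """Tracks server heartbeats and moves applications off silent servers."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds of silence tolerated before failover
        self.last_seen = {}      # server -> time of its last heartbeat
        self.placement = {}      # application -> server currently running it

    def heartbeat(self, server, now):
        self.last_seen[server] = now

    def place(self, app, server):
        self.placement[app] = server

    def healthy_servers(self, now):
        return [s for s, seen in self.last_seen.items()
                if now - seen <= self.timeout]

    def check(self, now):
        """Restart applications from failed servers on a healthy substitute."""
        healthy = self.healthy_servers(now)
        moves = []
        for app, server in list(self.placement.items()):
            if server in healthy or not healthy:
                continue   # still running, or nowhere to move it
            substitute = healthy[0]
            # Because the SAN connects every server to the same storage, the
            # substitute can open the application's data without copying it.
            self.placement[app] = substitute
            moves.append((app, server, substitute))
        return moves


monitor = FailoverMonitor(timeout=5)
monitor.heartbeat("server1", now=0)
monitor.heartbeat("server2", now=0)
monitor.place("orders-db", "server1")

monitor.heartbeat("server2", now=8)   # server1 has gone silent
moves = monitor.check(now=10)         # [("orders-db", "server1", "server2")]
```

Times are passed in explicitly rather than read from a clock so the example behaves the same on every run.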

    Sharing Data: More advanced forms of computer clustering allow for the concurrent sharing of data among different applications running on different servers. This can be extraordinarily useful, for example, for incremental application growth or scaling. Simply stated, if an application outgrows the server on which it is running, don't replace the server with a bigger one. Instead, connect another server with the necessary incremental power to the SAN, leaving the original system in place. Both servers can run separate copies, or instances, of the application, processing the same copy of the data. More servers can be added as application capacity requirements grow.

    Securing Your Data in a SAN
    Building a SAN provides tangible benefits to your business by enabling increased data sharing, equipment utilization, and centralized management of storage resources. This must be balanced by a rational approach to data security that allows you to capture the benefits of a SAN without sacrificing the security and integrity of your most important corporate asset, your data.

    A common data security issue in today's SAN environments is the inflexible and incompatible zoning mechanisms on current switches and HBAs. Depending on the vendor's implementation of zoning and your knowledge of their software, your data protection scheme may no longer be valid if you reconfigure your SAN (e.g., move a host or storage device from one switch port to another) or add new hosts to your SAN (e.g., new Microsoft Windows hosts writing labels to LUNs currently being used by a Unix host).

    Software can help alleviate this problem, by controlling access to zoning tools that manage switches and HBAs, and by providing access control lists per LUN that protect existing data if the SAN is reconfigured or new hosts are added.
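A per-LUN access control list can be pictured as a table that is consulted before any host touches a LUN. The sketch below is an assumption-laden illustration, not a vendor implementation: LunAccessControl, write_label, the host names, and the LUN identifier are all hypothetical. It replays the scenario above, in which a newly added Windows host tries to write a label to a LUN that a Unix host is using.

```python
class LunAccessControl:
    """Access control list keyed by LUN: which hosts may use each LUN."""

    def __init__(self):
        self._allowed = {}   # LUN id -> set of host names

    def grant(self, lun, host):
        self._allowed.setdefault(lun, set()).add(host)

    def revoke(self, lun, host):
        self._allowed.get(lun, set()).discard(host)

    def is_allowed(self, lun, host):
        # Hosts with no entry are denied by default, so a newly added
        # host cannot touch existing LUNs until someone grants access.
        return host in self._allowed.get(lun, set())


def write_label(acl, labels, lun, host, label):
    """Write a disk label only if the ACL lets this host use the LUN."""
    if not acl.is_allowed(lun, host):
        raise PermissionError(host + " may not write to " + lun)
    labels[lun] = label


acl = LunAccessControl()
labels = {}
acl.grant("lun-7", "unix-host")
write_label(acl, labels, "lun-7", "unix-host", "unix volume")

try:
    # A Windows host is added to the SAN and tries to label the same LUN.
    write_label(acl, labels, "lun-7", "new-windows-host", "windows disk")
    blocked = False
except PermissionError:
    blocked = True
```

Because the list is tied to the LUN rather than to a switch port, the check does not depend on where a host or storage device is plugged in.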

    Business Benefits of Networked Storage
    In summary, networked storage enables any-to-any connectivity between servers and storage devices and improves the ability of organizations to move their data. Networked storage can lower information processing capital costs through increased device and capacity sharing, as well as through more efficient communications. This lays the groundwork for truly global computing: the ability to process the same information around the clock from any data center around the world, with both the data and primary application execution site migrating to optimal locations as requirements dictate.