Data Protection Techniques: Ensuring Stored Information Integrity

Data Protection 101

As threats against customer networks continue to mount, effective data protection is crucial to ensuring the integrity of stored information. If antivirus, e-mail archiving or other security tools are compromised, or if a system crashes, reliable backups and a strong disaster recovery plan can make the critical difference.

There are several approaches and technologies that solution providers can leverage with their offerings to contribute to effective data protection. With a little help from CRN's Channel Encyclopedia, we'll examine various data protection methods, including different backup types, disaster recovery plans and data deduplication.

Backup Types

Data backups are a critical part of the data protection and recovery process. Backing up data enables duplicate copies of data to be stored separately from the primary storage medium and ensures that data is available if the primary storage fails. When it comes to backing up files in a local data repository, there are different data backup types that serve different purposes.

A full backup copies all of the files in a repository. A differential backup copies the files that have changed since the last full backup. An incremental backup copies only the files that have changed since the most recent backup of any kind, whether full or incremental. A delta backup copies only the data that has changed within a file, not the entire file. Several service providers also offer hosted, online backups. Storing data off-site through an online backup service or the cloud can be essential to a disaster recovery plan.
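To make the distinction concrete, here is a minimal sketch of how a backup tool might pick files for each type, using the conventional definitions (differential: changed since the last full backup; incremental: changed since the last backup of any kind). The `FileInfo` record, the `select_files` helper and the timestamps are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class FileInfo:
    path: str
    modified: float  # last-modified time (hypothetical units)


def select_files(files, kind, last_full, last_backup):
    """Return the paths a backup of the given kind would copy.

    last_full   -- time of the most recent full backup
    last_backup -- time of the most recent backup of any kind
    """
    if kind == "full":
        return [f.path for f in files]
    if kind == "differential":
        return [f.path for f in files if f.modified > last_full]
    if kind == "incremental":
        return [f.path for f in files if f.modified > last_backup]
    raise ValueError(f"unknown backup kind: {kind}")


files = [
    FileInfo("a.txt", modified=50),   # unchanged since the full backup
    FileInfo("b.txt", modified=150),  # changed after the full backup
    FileInfo("c.txt", modified=250),  # changed after the last incremental
]

# Assume the full backup ran at t=100 and the latest incremental at t=200.
print(select_files(files, "full", 100, 200))          # ['a.txt', 'b.txt', 'c.txt']
print(select_files(files, "differential", 100, 200))  # ['b.txt', 'c.txt']
print(select_files(files, "incremental", 100, 200))   # ['c.txt']
```

The trade-off shows up at restore time: a differential restore needs the full backup plus the latest differential, while an incremental restore needs the full backup plus every incremental taken since.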

Disk And Tape Backup

Backups can be performed on both tape and disk media. Tape backups involve using magnetic tapes to store duplicate copies of data separate from the primary storage medium. Magnetic tape can also be used to move and back up data from one storage array to another.

While the restore process for tape backups can be complex, tape is much less expensive than disk. Disk-based backups aim to reduce backup management complexity by offering fewer read errors and faster access to data, but they cost more. It is therefore common to see storage environments where backups are first written to disk for fast backups and restores, and then copied to tape for long-term and off-site archiving.

LAN Free Backup

LAN free backup enables data to be backed up without transferring it over the LAN or WAN. LAN free backups can be accomplished with either a separate backup server or with a storage-area network (SAN) in place of a server.

Backup servers handle the backups separately from the normal LAN traffic of customers' primary servers. A SAN, on the other hand, requires a separate network, typically configured with Fibre Channel, set up specifically for centralizing the storage and management of data.

Data Deduplication

Deduplication removes duplicate information as data is stored, backed up or archived. It can be done at the file level, where duplicates are replaced with a marker pointing to a copy of the file, and/or at the sub-file or byte level, where duplicates are removed and replaced by pointers, resulting in a significant decrease in storage capacity requirements.
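As a rough illustration of sub-file deduplication, the sketch below splits data into fixed-size chunks, stores each unique chunk once, and keeps a list of hash "pointers" per file. The `DedupStore` class, the 4-byte chunk size and the file names are assumptions made for the example; real products typically use much larger or variable-size chunks.

```python
import hashlib

CHUNK_SIZE = 4  # tiny on purpose; an assumption for illustration only


class DedupStore:
    def __init__(self):
        self.chunks = {}  # digest -> chunk bytes; each unique chunk kept once
        self.files = {}   # file name -> list of digests (the "pointers")

    def put(self, name, data):
        pointers = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # store only if unseen
            pointers.append(digest)
        self.files[name] = pointers

    def get(self, name):
        # Rebuild the file by following its pointers.
        return b"".join(self.chunks[d] for d in self.files[name])

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())


store = DedupStore()
store.put("mon.bak", b"AAAABBBBCCCC")
store.put("tue.bak", b"AAAABBBBDDDD")
print(store.get("tue.bak"))  # b'AAAABBBBDDDD'
print(store.stored_bytes())  # 16
```

Here 24 bytes of input occupy 16 bytes of chunk storage, because the two files share their first two chunks.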

Dedupe products can be classified in several ways. The first is according to where the dedupe takes place. Source dedupe dedupes the data before it is sent across a LAN or WAN. This results in fewer files and less data sent over the network, but affects backup performance because of the processing overhead. However, with new high-performance processors, this is less of an issue.

Target dedupe starts the dedupe after the data is copied onto a destination device such as a virtual tape library. This takes away any overhead related to source dedupe, but requires more capacity at the target to temporarily store the entire data set.
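The network trade-off between the two placements can be sketched as follows. Both functions are hypothetical and model the target as a simple dictionary keyed by chunk hash; the point is only to compare how many bytes cross the network in each case.

```python
import hashlib


def split(data, size=4):
    # Fixed-size chunking; the 4-byte size is an assumption for illustration.
    return [data[i:i + size] for i in range(0, len(data), size)]


def digest(block):
    return hashlib.sha256(block).hexdigest()


def source_dedupe_send(data, target_index):
    """Hash at the source and send only chunks the target does not hold."""
    sent = 0
    for block in split(data):
        key = digest(block)  # this hashing is the overhead paid at the source
        if key not in target_index:
            target_index[key] = block
            sent += len(block)
    return sent


def target_dedupe_send(data, target_index):
    """Send everything; the target stages the full copy, then dedupes it."""
    staging = bytes(data)  # the whole data set lands on the target first
    for block in split(staging):
        target_index.setdefault(digest(block), block)
    return len(staging)


data = b"AAAABBBBAAAABBBB"
print(source_dedupe_send(data, {}))  # 8
print(target_dedupe_send(data, {}))  # 16
```

Both approaches end with the same two unique chunks on the target; they differ in where the hashing work happens and in how much data travels to get there.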

The second classification is according to when the dedupe takes place. With in-line dedupe, files are deduped as they are written to a device. This adds processing overhead as the data is ingested, but requires no extra storage capacity for staging.

With post-process technology, data is first written to the target device and deduped afterward. This requires extra capacity to temporarily store the incoming files before they are deduped, but takes the overhead away from the originating storage device.
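The capacity difference between the two timings can be shown with a small model of peak storage use. The function names and the assumption that the staging copy is held until post-process dedupe finishes are illustrative only; actual products may release staging space incrementally.

```python
import hashlib


def chunks(data, size=4):
    # Fixed 4-byte chunks, an assumption chosen to keep the example small.
    return [data[i:i + size] for i in range(0, len(data), size)]


def inline_peak(data):
    """Dedupe each chunk as it arrives; only unique chunks ever reach disk."""
    store = {}
    peak = 0
    for block in chunks(data):
        store.setdefault(hashlib.sha256(block).digest(), block)
        peak = max(peak, sum(len(b) for b in store.values()))
    return peak


def post_process_peak(data):
    """Land the raw data first, then dedupe it while the staging copy remains."""
    staging = bytes(data)
    store = {}
    peak = len(staging)
    for block in chunks(staging):
        store.setdefault(hashlib.sha256(block).digest(), block)
        peak = max(peak, len(staging) + sum(len(b) for b in store.values()))
    return peak


data = b"AAAABBBBAAAABBBB"
print(inline_peak(data))        # 8
print(post_process_peak(data))  # 24
```

In this model the in-line path never holds more than the 8 unique bytes, while the post-process path briefly holds the 16-byte staging copy plus the 8 deduped bytes.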

Disk Mirroring And Storage Replication

Disk mirroring, known as RAID 1 in a RAID array, is the process of writing multiple copies of data for fault-tolerant operation, so that data remains available if one copy is lost or corrupted. Mirroring can be done across separate partitions of the same disk or across separate disks within the same system, though only separate disks protect against the failure of a drive.
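The behavior of a mirror can be sketched with a toy in-memory model: every write goes to all healthy replicas, and a read is served by any replica that still holds the block. `MirroredVolume` is a hypothetical class for illustration and not how a RAID controller is actually implemented.

```python
class MirroredVolume:
    """Toy RAID 1 model: writes go to every replica, reads use any healthy one."""

    def __init__(self, replicas=2):
        self.disks = [{} for _ in range(replicas)]  # block number -> data
        self.failed = set()                         # indexes of failed disks

    def write(self, block_no, data):
        for i, disk in enumerate(self.disks):
            if i not in self.failed:
                disk[block_no] = data

    def read(self, block_no):
        for i, disk in enumerate(self.disks):
            if i not in self.failed and block_no in disk:
                return disk[block_no]
        raise IOError("no healthy replica holds this block")

    def fail(self, disk_no):
        self.failed.add(disk_no)


vol = MirroredVolume()
vol.write(0, b"payroll")
vol.fail(0)         # simulate losing the first disk
print(vol.read(0))  # b'payroll'
```

The read still succeeds after the simulated failure because the second replica holds an identical copy of the block.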

Storage replication is similar to disk mirroring, except that the second and subsequent copies of a data set are written over long distances, typically to a remote location.

The Encryption Algorithm

Encryption is the conversion of data into scrambled code, using an encryption algorithm, before it is transmitted over a public or private network. The algorithm uses a string of bits known as a "key" to perform the calculations needed to transform the data into a new form that cannot be read without the corresponding key. For protecting data in transit, encryption is critical for ensuring confidentiality and, when paired with integrity checks such as message authentication codes, integrity and in some cases even authenticity.
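The role of the key can be illustrated with a deliberately simplified sketch that uses only Python's standard library. It derives a keystream from the key with SHA-256, XORs it with the data, and adds an HMAC tag so that tampering is detected. This is a toy construction assumed for teaching purposes, not a vetted cipher; production systems should rely on a reviewed library and a standard algorithm such as AES-GCM.

```python
import hashlib
import hmac
import secrets


def _keystream(key, nonce, length):
    # Toy keystream: SHA-256 over key, nonce and a running counter.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def encrypt(enc_key, mac_key, plaintext):
    nonce = secrets.token_bytes(16)  # fresh random value for every message
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce, ciphertext, tag


def decrypt(enc_key, mac_key, nonce, ciphertext, tag):
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("message was altered or the key is wrong")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


enc_key = secrets.token_bytes(32)
mac_key = secrets.token_bytes(32)
nonce, ciphertext, tag = encrypt(enc_key, mac_key, b"quarterly backup manifest")
print(decrypt(enc_key, mac_key, nonce, ciphertext, tag))  # b'quarterly backup manifest'
```

The keystream provides confidentiality, while the HMAC check is what supplies integrity: a receiver holding the wrong key, or a message changed in transit, fails verification before any data is returned.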

Disaster Recovery Planning

A disaster recovery plan is exactly what it sounds like: a plan for duplicating all IT operations in the event of a disaster, such as a fire or flood. Disaster recovery planning ensures that backed-up data, as well as procedures for activating business-critical systems, are available at a location some distance from a user's primary data center. A key element of a disaster recovery plan is testing. It is critical to confirm that backup systems and recovery methods will actually restore day-to-day business operations seamlessly and make the data accessible to all customers and employees.
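One small piece of such testing can be automated: checking that restored data matches what was backed up. The sketch below assumes a hypothetical manifest of SHA-256 checksums recorded at backup time and compares it against the restored contents; the names `build_manifest` and `verify_restore` are illustrative, and a real test would also exercise applications and failover procedures.

```python
import hashlib


def checksum(data):
    return hashlib.sha256(data).hexdigest()


def build_manifest(files):
    """Record a checksum per file at backup time (files: name -> bytes)."""
    return {name: checksum(data) for name, data in files.items()}


def verify_restore(manifest, restored):
    """Return the names that are missing or differ after a test restore."""
    return sorted(
        name
        for name, expected in manifest.items()
        if name not in restored or checksum(restored[name]) != expected
    )


original = {"orders.db": b"order data", "hr.db": b"staff data"}
manifest = build_manifest(original)

# Simulated test restore in which one file came back corrupted.
restored = {"orders.db": b"order data", "hr.db": b"staff dat\x00"}
print(verify_restore(manifest, restored))  # ['hr.db']
```

An empty result means every file in the manifest was restored intact; anything else names the files that need investigation before the plan can be trusted.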

In addition to the multiple service providers hosting disaster recovery services, there are various cloud computing platforms that also provide disaster recovery capabilities.