CIS 303a - Computer Architecture

Chapter 12, File and Secondary Storage Management, part 2

Objectives:

This lesson concludes the discussion of file storage and secondary storage concerns that relate to system development. Objectives important to this lesson:

  1. File migration, backup, and recovery
  2. Fault tolerance methods
  3. Storage consolidation methods

Concepts:

Chapter 12, part 2

Chapter 12 continues with a discussion of file migration, which begins oddly. The author tells us that some applications and some file systems include features that automatically save the previous version of a file whenever that file is changed. This feature is also found in transaction processing systems. In a transaction processing system, a series of time-stamped actions is saved in a file and then used as data to update the appropriate files. Usually the transactions are saved in a log, which can be used to roll a database back in time or to bring it up to date if a backup copy has to be restored to the live system.
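
To make the roll-forward idea concrete, here is a minimal sketch in Python. It assumes a dict-like database and a log file holding one JSON transaction per line; those details, and the field names, are my own illustration rather than anything specified in the text.

    import json
    from datetime import datetime

    def roll_forward(database, log_path, restore_time):
        """Replay time-stamped transactions recorded after the restored backup
        was taken, bringing the database back up to date."""
        with open(log_path) as log:
            for line in log:
                txn = json.loads(line)                 # one transaction per line
                # Skip anything already reflected in the restored backup.
                if datetime.fromisoformat(txn["timestamp"]) <= restore_time:
                    continue
                if txn["action"] == "update":
                    database[txn["record_id"]] = txn["value"]
                elif txn["action"] == "delete":
                    database.pop(txn["record_id"], None)
        return database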

On page 466, the text explains that a sequence of file versions may first be referred to as parent and child; when a new version is added to the set, the parent becomes the grandparent, the original child becomes the parent, and the new version becomes the new child.

Another kind of versioning simply adds a number to each new version of a file, which raises the question that leads us to migration: how many versions does it make sense to keep in readily available storage? At the end of this section of the chapter, the author finally mentions that a system can be set up to move older versions of files out of standard storage to progressively more remote tiers of storage, including offsite backups.
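
A migration policy along those lines might look like the following sketch in Python. The numbered-suffix naming scheme, the retention count, and the archive folder are all assumptions made for illustration, not details given in the text.

    import re
    import shutil
    from pathlib import Path

    KEEP_VERSIONS = 3  # assumed retention limit; the text does not name one

    def migrate_old_versions(folder, archive_folder):
        """Move all but the newest KEEP_VERSIONS copies of each numbered file
        (report.txt.1, report.txt.2, ...) to slower, more remote storage."""
        folder, archive_folder = Path(folder), Path(archive_folder)
        versions = {}
        for f in folder.iterdir():
            match = re.match(r"(.+)\.(\d+)$", f.name)
            if match:
                versions.setdefault(match.group(1), []).append((int(match.group(2)), f))
        for base, copies in versions.items():
            copies.sort(reverse=True)             # newest (highest number) first
            for _, old_file in copies[KEEP_VERSIONS:]:
                shutil.move(str(old_file), str(archive_folder / old_file.name))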

Having introduced the idea of backups, the text discusses three common methods used to create them. The author takes an odd approach, talking about tagging files to show whether they have been copied to a backup device. This is usually explained a bit differently, so let's clear up the discussion. First, some terms:

  • Target - the device, volume, folder, or group of files being backed up
  • Archive bit - a file attribute that is turned ON whenever the file is changed; it flags files that have changed since the last backup. Most backup programs look for files whose archive bits are ON, copy those files, then reset the archive bits (turn them OFF) on the files they copied; see the sketch after this list
  • Full - a backup of all files in the target; sets the archive bit of each file to OFF once the backup is made
  • Incremental - a backup of target files that are new or changed since the last backup; depends on the fact that the archive bit is turned ON whenever a file is changed; sets the archive bit to OFF for every file it copies
  • Differential - a backup of all files new or changed since the last Full backup; copies all files whose archive bit is set to ON; does not change the archive bit of the files it copies, because they will be copied again in the next Differential backup
  • Copy - like a Full backup, but it does not change the archive bits of the files it copies. This is typically not part of a standard backup strategy, but an option for making an ad hoc copy without disturbing the regular backup cycle.
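
To make that behavior concrete, here is a minimal sketch in Python. The File class and run_backup function are invented for illustration; real backup software reads the archive attribute from the file system rather than from a Python object.

    from dataclasses import dataclass

    @dataclass
    class File:
        name: str
        archive_bit: bool  # ON (True) means the file changed since the bit was last reset

    def run_backup(files, kind):
        """Return the files a backup of the given kind copies, and reset
        archive bits the way that kind of backup does."""
        if kind in ("full", "copy"):
            copied = list(files)                      # everything in the target
        else:                                         # "incremental" or "differential"
            copied = [f for f in files if f.archive_bit]
        if kind in ("full", "incremental"):
            for f in copied:
                f.archive_bit = False                 # Full and Incremental reset the bit
        return copied                                 # Differential and Copy leave every bit alone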

This needs more explanation. Assume we use a tape drive to make backups. In a Full backup strategy, the entire target is backed up to tape every time we make a backup tape. This strategy consumes the most time and the most tapes to carry out a backup. To restore, we simply restore the most recent tape(s). This is the least time-consuming strategy for restoring, but the most time-consuming for creating backups.

The second method, Incremental backup, means that we start with a Full backup of the target, and then each successive backup tape we create backs up only the elements that are new or changed since the last backup was created. Successive backups will therefore not always be the same length. Because each tape holds only what changed since the previous backup, this is the least time-consuming method for creating backups, but the most time-consuming for restoring. To restore, we must first restore the last Full backup made, and then restore EVERY tape made since then, in order, to be sure of getting all changes.

The third strategy, Differential backup, also starts with a Full backup tape. Each successive tape then contains all the files changed since the last Full backup was made. This means we will have to restore only one or two tapes in a restore operation: if the last tape made was a Full tape, we restore only that one; if the last tape made was a Differential tape, we restore the last Full tape, then the last Differential tape.

The fourth strategy, Copy, is not mentioned in this text, but it is no different from Full in terms of backup or restore time. In both Incremental and Differential backup strategies, you will typically use a rotation schedule. For example, you could use a one-week cycle: once a week you make a Full backup, then every day after that you make the other kind you have chosen to use, Incremental or Differential.
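
The restore rules for the three strategies can also be expressed as a short sketch in Python. The tape labels and the example week are invented for illustration; the point is only which tapes each strategy needs and in what order.

    def tapes_to_restore(tapes, strategy):
        """Given tapes as (label, kind) tuples in the order they were made,
        return the tapes needed for a restore, oldest first."""
        last_full = max(i for i, (_, kind) in enumerate(tapes) if kind == "full")
        if strategy == "full":
            return [tapes[last_full]]          # the newest Full tape is enough
        if strategy == "incremental":
            return tapes[last_full:]           # the Full plus every tape made after it
        if strategy == "differential":
            chain = [tapes[last_full]]
            if last_full < len(tapes) - 1:
                chain.append(tapes[-1])        # the Full plus only the latest Differential
            return chain
        raise ValueError(strategy)

    # Example: three Incremental tapes made after a Sunday Full backup.
    week = [("Sun", "full")] + [(day, "incremental") for day in ("Mon", "Tue", "Wed")]
    print(tapes_to_restore(week, "incremental"))  # all four tapes, Sunday first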

To keep them straight in your mind, remember these facts:
Backup type     What does it back up?                            What does it do to the archive bit?
Full            Copies everything in the target                  Resets all archive bits in the target set.
Incremental     Everything changed since the last backup         Resets the archive bits of the files it copies.
Differential    Everything changed since the last Full backup    Does not reset any archive bits.
Copy            Everything (the same as a Full backup)           Does not reset any archive bits.

The time required to create backups should be considered along with the time required to restore from them. When you weigh the two as the two sides of the question "What method should I use?", the answer is often the most common choice: Differential. It is the best compromise between backup time and restore time. Note also that all standard methods require a Full backup on a regular cycle; the usual recommendation is to run a Full backup weekly.

The discussion above assumes that your backups are being written to tape, which has been the most common method for many years. The text discusses three other methods, each requiring different hardware. Copying to other drives is faster, but only if they are connected by a fast channel, such as being in the same computer; that raises the problem of getting the copy away from the same location as the original. Copying to a disk in another data center is possible, and fast if the two sites are connected by fiber, but costly in terms of setup.

The text discusses fault tolerance, by which it means the ability of a system to tolerate the failure of a part. In particular, it is concerned with the failure of a hard drive that holds important data. Eventually, all hard drives fail; systems that provide tolerance for this kind of event typically use a form of RAID, an acronym that has been defined several ways. One common meaning is Redundant Array of Independent Drives. The word "independent" seems unnecessary, and is in fact misleading: hard drives set up in a RAID array perform functions that relate to each other. Several kinds of RAID exist to provide redundant storage of data or a means to recover lost data, allowing a system to continue running in most failure cases. The text lists several types and discusses a few. Follow the link below to a nice summary of RAID level features not listed in these notes, as well as helpful animations showing how they work. Note that RAID 0 does not provide fault tolerance, the ability to survive a device failure; it only improves read and write times.

RAID levels and features:

  • RAID 0: Disk striping - writes to multiple disks but does not provide fault tolerance. Performance is increased because each successive block of data in a stream is written to the next device in the array. Failure of one device will affect all data, so RAID 0 actually decreases fault tolerance rather than improving it.
  • RAID 1: Mirroring and Duplexing - provides fault tolerance by writing the same data to two drives. Two mirrored drives use the same controller card. Two duplexed drives each have their own controller card. Aside from that difference, mirroring and duplexing are the same: Two drives are set up so that each is a copy of the other. If one fails, the other is available.
  • RAID 5: Parity saved separately from data - provides fault tolerance by a different method. Data is striped across several drives, but the parity data for each stripe is saved on a drive that does not hold data for that stripe, so the contents of a failed drive can be rebuilt from the drives that survive (see the parity sketch after this list). The text notes that workstations cannot use this method; it is only supported by server operating systems.
  • RAID 0+1: Striping and Mirroring - uses a striped array like RAID 0, but mirrors the striped array onto another array, kind of like RAID 1
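
To see why parity provides fault tolerance, consider this small Python sketch of the XOR arithmetic behind RAID 5. The stripe layout is simplified to one parity block for a handful of equal-sized data blocks, and it is not meant to mirror any particular controller's implementation.

    from functools import reduce

    def parity(blocks):
        """XOR the data blocks of a stripe together to produce the parity block."""
        return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

    def rebuild(surviving_blocks, parity_block):
        """Recover the block that was on a failed drive: XOR the parity block
        with the surviving data blocks and the missing block falls out."""
        return parity(surviving_blocks + [parity_block])

    # Three data blocks striped across three drives, with parity on a fourth.
    d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
    p = parity([d0, d1, d2])
    assert rebuild([d0, d2], p) == d1   # the drive holding d1 has failed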

The last topic in the chapter is storage consolidation. The text finally tells us that it considers secondary storage to be any resource that can be readily accessed, like a local hard drive or a folder on a mapped network drive. (I presume that it would call RAM "primary storage".) It refers to any secondary storage in the same computer you are using as direct-attached storage. The cost of adding more storage of this type can be high, since expanding it for each user on a network means adding devices to each user's computer. Network storage is more economical, and its backups can be centrally controlled.

The text moves on to discuss two related systems: Network Attached Storage (NAS) and Storage Area Network (SAN). One version of a SAN is illustrated on page 474, showing servers on a LAN (Local Area Network). The servers are also connected to a SAN switch, essentially putting them on another network that has access to a dedicated file storage device, in this case the SAN server. These servers use different network protocols when storing on the SAN than they would when storing on devices on the general LAN. The SAN server is actually not a true server, in that it does not have the capacity to function as a general-purpose computer.

The illustration on the next page is a little less clear, but the explanation in the text conveys the idea: a NAS device is simply that, a device "hung" on an existing network that provides additional storage beyond what is already on the workstations and servers on the LAN. A NAS device is a member of your LAN, and it uses common network file protocols. The NAS device is described as having all the capabilities of a general-purpose computer, in addition to its role as a storage device.

One distinction between the two systems is that a NAS system can provide file service like any other network resource, while a SAN system must be accessed at a lower level, which the text describes as block-oriented or sector-oriented access. A security-related distinction is that NAS devices can be exploited, and protected, in the same way as hard drives on any other computer on your LAN. Using NAS devices on a network without high-bandwidth connections to them can produce a service bottleneck.
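
The file-level versus block-level distinction can be sketched as follows in Python. The mount point and device path are hypothetical, and on a real SAN the raw volume would be reached over a protocol such as Fibre Channel or iSCSI rather than simply appearing under that name.

    import os

    # File-level access, as with a NAS share mounted into the local file system:
    # the device speaks a network file protocol and hands back whole files.
    with open("/mnt/nas_share/reports/q3.txt", "rb") as f:   # hypothetical mount point
        file_data = f.read()

    # Block-level access, as with a SAN volume presented to the host as a raw disk:
    # the host's own file system (or database) decides what each block means.
    BLOCK_SIZE = 4096
    fd = os.open("/dev/san_volume0", os.O_RDONLY)            # hypothetical device node
    block_7 = os.pread(fd, BLOCK_SIZE, 7 * BLOCK_SIZE)       # read block 7 by offset
    os.close(fd)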