Information Systems Theory

Chapter 5: Data Storage



This chapter discusses various storage methods. The objectives important to this chapter are:

  1. Understanding storage
  2. Understanding floppy disks and hard disks
  3. Understanding data compression
  4. Understanding optical disks
  5. Understanding magnetic tape
  6. Understanding mass storage


Key Concepts:

Storage refers to two concepts: volatile (temporary) storage, and nonvolatile (permanent or secondary) storage. This chapter is really about nonvolatile storage.

Storing data on some medium is usually referred to as writing or saving the data, while retrieving the saved data is referred to as reading it.

Most data storage is done on some kind of magnetic medium, such as a floppy disk. These have been available in several sizes and several capacities over the years. The most common size currently is the 3.5 inch floppy disk (diskette), which can hold up to 1.44 MB of information.

In order for floppy disks to store data, there has to be a system of organizing the data. Disks are divided into tracks, which are concentric circles. Most floppy disks have 80 tracks per side. Floppy drives read and write to the tracks with magnetic read/write heads (like in a tape recorder or VCR). In most machines, the heads take turns, writing to one side then the other, so the user does not have to be aware that both sides are used (and does not have to flip the disk over, as I had to with my first computer: an Apple IIc).

The disks are further divided into pie-shaped wedges called sectors. The number of sectors on a disk varies from type to type. Our common example, the 1.44 MB 3.5 inch disk, has 18 sectors on each side. Now for the confusing part: the word "sector" has another meaning. Sector can mean one of the wedges on a disk, but it can also mean the part of a track inside that wedge. If we consider that a 1.44 MB disk has 18 sectors on a side, and that it has 80 tracks on that side, then those tracks are divided by the sector lines into 1,440 track segments that are also called sectors. It would be tedious to flag the intended meaning every time the word appears, and it would be unrealistic: the real world does not label it for you. You have to understand which meaning of the word is intended from the context in which it is used, just like any other oddity in English.

Something that does not vary, at present, is that a sector can only hold 512 bytes of data, no matter what kind of disk is being used. (In this paragraph, every time I use the word "sector", I mean "a segment of a specific track".) Since there are two sides to that disk, it can hold 512 bytes per sector, times 1,440 sectors per side, times 2 sides, making 1,474,560 bytes. That is 1,440 kilobytes (at 1,024 bytes per kilobyte), which is labeled as 1.44 megabytes. Information about where data is stored on a disk is kept in a reserved area of that disk. Often this record is a table called the File Allocation Table (FAT).
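The arithmetic above can be checked with a short sketch. The function and parameter names are made up for illustration; the figures (80 tracks, 18 sectors per track, 2 sides, 512 bytes per sector) come from the notes.

```python
BYTES_PER_SECTOR = 512  # fixed sector size stated in the notes


def disk_capacity_bytes(tracks_per_side, sectors_per_track, sides=2):
    """Total bytes = tracks x sectors per track x sides x bytes per sector."""
    return tracks_per_side * sectors_per_track * sides * BYTES_PER_SECTOR


capacity = disk_capacity_bytes(tracks_per_side=80, sectors_per_track=18)

print(capacity)                  # 1474560 bytes
print(capacity // 1024)          # 1440 kilobytes (1,024 bytes each)
print(capacity / 1_000_000)      # 1.47456 million bytes
print(capacity / (1024 * 1024))  # 1.40625 megabytes (1,024 x 1,024 bytes each)
```

The "1.44 MB" label mixes units: it is 1,440 kilobytes of 1,024 bytes each, divided by 1,000 rather than by 1,024.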

Along the way, the text mentions that the platters in a diskette (the disk itself) are made of mylar, a kind of plastic, and are coated with either iron oxide or cobalt oxide, which are compounds sensitive to magnetism. You may want to know that higher density (capacity) diskettes tend to use cobalt oxide, while lower density diskettes use iron oxide. Drives that use the higher density diskettes must use stronger magnetic fields to affect those cobalt particles. This is part of the reason why you should not buy cheaper, lower density diskettes and format them for higher densities. The two kinds of media behave differently, and the stronger magnetic fields meant for cobalt oxide disks will cause the iron oxide disks to "bleed" information from one track into another.

The next major concept is the cluster. Think of sectors and tracks as being physical aspects of a disk that are dealt with by the hardware. Think of clusters as being logical aspects of a disk that are dealt with by the Operating System. A cluster is defined as "the smallest unit of data that can be read from or written to a disk at one time". For the type of disk we are discussing (1.44 MB), the smallest amount of data that can be written to the disk (a cluster) is the size of one sector. For the other types of floppy disks, it varies between one sector and two sectors.
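One consequence of the cluster definition is that a file always occupies a whole number of clusters, so the last cluster is usually only partly used. The sketch below is a minimal illustration of that arithmetic, assuming 512-byte sectors as stated earlier; the function names and the 1,300-byte example file are hypothetical.

```python
import math

SECTOR_BYTES = 512  # sector size stated in the notes


def clusters_needed(file_bytes, sectors_per_cluster):
    """Number of whole clusters required to hold a file of the given size."""
    cluster_bytes = SECTOR_BYTES * sectors_per_cluster
    return math.ceil(file_bytes / cluster_bytes)


def unused_bytes(file_bytes, sectors_per_cluster):
    """Bytes allocated to the file but not used by it (the partly filled last cluster)."""
    cluster_bytes = SECTOR_BYTES * sectors_per_cluster
    return clusters_needed(file_bytes, sectors_per_cluster) * cluster_bytes - file_bytes


# A hypothetical 1,300-byte file on a disk with 1-sector clusters (such as the 1.44 MB disk)
print(clusters_needed(1300, 1), unused_bytes(1300, 1))  # 3 236

# The same file on a disk with 2-sector clusters
print(clusters_needed(1300, 2), unused_bytes(1300, 2))  # 2 748
```

Larger clusters mean fewer clusters to keep track of, but more space left unused at the end of each file.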

You should know that a disk can be formatted more than once, and each time it is formatted you erase/remove any data that might be on it.

When you use a floppy disk, you should be aware of the parts on it. As shown on page 5.5, you can expect that there will be a square hole in each of two corners of a "high density" floppy disk. The hole on the right in that picture is the write-protect window. Inside the drive, a light shines on one side of the disk, and a sensor looks for the light on the other side. If the write-protect window is open, the sensor can see the light, and the drive will not write to the disk. If the window is closed, the sensor cannot see the light, and the drive will be able to write to the disk. The text compares this to the write-protect knock-out on a VHS video tape. The other hole tells you (and the drive) that this is a high density disk. If it is not there, suspect that this is a "double-density" disk.

Floppy Disk Types
Type                          Storage Capacity   Tracks per side   Sectors per side   Cluster Size
3.5-inch extra-high-density   2.88 MB            80                36                 2 sectors
3.5-inch high-density         1.44 MB            80                18                 1 sector
3.5-inch double-density       720 KB             80                9                  2 sectors
5.25-inch high-density        1.2 MB             80                15                 1 sector
5.25-inch double-density      360 KB             40                9                  2 sectors
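As a quick consistency check, the capacities in the table can be recomputed from the track and sector counts. This sketch assumes every disk type is two-sided with 512-byte sectors, and it reads "2.88 MB", "1.44 MB", and "1.2 MB" as 2,880 KB, 1,440 KB, and 1,200 KB; the variable and function names are made up for illustration.

```python
# (type, stated capacity in KB, tracks per side, sectors per side) from the table
FLOPPY_TYPES = [
    ("3.5-inch extra-high-density", 2880, 80, 36),
    ("3.5-inch high-density", 1440, 80, 18),
    ("3.5-inch double-density", 720, 80, 9),
    ("5.25-inch high-density", 1200, 80, 15),
    ("5.25-inch double-density", 360, 40, 9),
]


def capacity_kb(tracks, sectors, sides=2, bytes_per_sector=512):
    """Capacity in kilobytes of 1,024 bytes, assuming a two-sided disk."""
    return tracks * sectors * sides * bytes_per_sector // 1024


for name, stated_kb, tracks, sectors in FLOPPY_TYPES:
    computed_kb = capacity_kb(tracks, sectors)
    status = "matches" if computed_kb == stated_kb else "differs"
    print(f"{name:30} stated {stated_kb:>5} KB, computed {computed_kb:>5} KB ({status})")
```

Under those assumptions, every row of the table agrees with the computed value.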

The table above lists the storage capacity, track and sector counts, and cluster size for several types of floppy disks.

On page 5.8, there are some graphics illustrating things you should never do with a disk you care about. Rephrasing these as rules:

  • Do not touch the surface of a disk
  • Do not expose the disk to magnetic fields
  • Do not expose the disk to contamination from food
  • Do keep disks in a protective holder when possible
  • Do not smash the disk with heavy objects
  • Do not smoke near the disk (or the computer, for that matter)
  • Do not expose the disk to extreme heat or cold
  • Treat the disk gently

Hard disks are like floppy disks, in that they are coated with magnetic storage material and are used for storing programs, data, etc. Hard disks are usually aluminum instead of plastic (harder) and usually hold much more than floppies. The disks themselves are sealed in hard drives. Their capacity may be measured in megabytes (millions of bytes), gigabytes (billions), or terabytes (trillions). The disks in a hard drive spin very fast, the magnetic heads fly very close to them, and there is the potential for a head crash if the hard drive is bumped while running. This would result in the head carving a trench in the disk medium. Access time for hard drives is much shorter than for floppy drives.

Disk cartridges come in several varieties. Think of them as hard drives that can have their disks changed for new disks by the user.

Files are stored in clusters, as noted above. Typically a user stores files, deletes some files, and stores more files. The operating system will store a file in the first available cluster it finds, and if the file does not fit, it continues storing the file in the next available cluster, and so on. In this way, since deleting files opens clusters, a file may be stored in clusters that are contiguous (sequential, in a row) or it may not. If files are stored in pieces that are scattered all over a disk, access time increases, and performance suffers. A disk with many files in this state is said to be fragmented. To correct the problem, you use a defragmentation utility, which moves clusters around on the disk until files are as contiguous as possible.
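The way fragmentation arises can be seen in a toy simulation. This is a minimal sketch, not how any particular operating system works: the disk is modeled as a Python list of ten clusters, files are identified by single letters, and all function names are hypothetical.

```python
def allocate(disk, name, clusters_needed):
    """Store a file in the first free clusters found, scanning from the start of the disk."""
    if clusters_needed <= 0:
        return []
    placed = []
    for index, owner in enumerate(disk):
        if owner is None:
            disk[index] = name
            placed.append(index)
            if len(placed) == clusters_needed:
                return placed
    # Not enough free clusters: undo the partial allocation.
    for index in placed:
        disk[index] = None
    raise ValueError("not enough free clusters")


def delete(disk, name):
    """Free every cluster that belongs to the named file."""
    for index, owner in enumerate(disk):
        if owner == name:
            disk[index] = None


def is_fragmented(clusters):
    """A file is fragmented if its clusters are not all consecutive."""
    return any(later != earlier + 1 for earlier, later in zip(clusters, clusters[1:]))


def defragment(disk):
    """Return a new layout with each file's clusters moved together, free space at the end."""
    names = list(dict.fromkeys(owner for owner in disk if owner is not None))
    compacted = [name for name in names for _ in range(disk.count(name))]
    return compacted + [None] * (len(disk) - len(compacted))


disk = [None] * 10
allocate(disk, "A", 3)               # clusters 0, 1, 2
allocate(disk, "B", 2)               # clusters 3, 4
allocate(disk, "C", 3)               # clusters 5, 6, 7
delete(disk, "B")                    # opens clusters 3 and 4
d_clusters = allocate(disk, "D", 4)  # fills the gap, then continues after C

print(d_clusters, is_fragmented(d_clusters))  # [3, 4, 8, 9] True
print(defragment(disk))
```

After defragmenting, file D occupies four consecutive clusters between A and C.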

All disks go bad. You can't stop it, only delay it. This is one reason for making backups of your data and programs. A backup is a safety copy: it is your insurance against the day that will eventually come, when your data is not where you left it.

CD-ROMs are laser disk systems. They work by burning data into a platter with one kind of laser, and reading that data with another kind of laser. A pit is where a laser has burned a depression into a disk. A land is where a laser has not burned a depression, so it is a flat area. In the simplified model used here, pits are read as 0 bits and lands are read as 1 bits. (Real CDs actually mark a 1 bit at each change between pit and land, but the simplified model is enough for this chapter.) A typical CD can hold up to 650 MB of data. A DVD (digital video disc, also called digital versatile disc) is a newer system that is similar to a CD-ROM. A single-layer DVD can hold up to 4.7 GB.

A CD-R drive is one that can record CDs. A CD-RW drive is one that can record and rewrite to special CDs. Both kinds of drives can read regular CDs. A DVD drive is needed to read a DVD disk, but it can also read the other kinds.

Magnetic tape is another storage option. The drawback is that access to data on a tape is sequential. Suppose you had a database with a thousand records on tape, and you wanted to access the 400th record. You would have to read through the first 399 records to get to the 400th in a sequential access system. A hard drive offers random access, which means that you can skip ahead to any record you want without reading the ones in between. Tape systems are more useful for making and storing backup copies than they are for regular daily access. Two major kinds of tape usage exist: reel to reel systems and cartridge systems.
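The difference between sequential and random access can be sketched with an in-memory stand-in for the storage medium. This is only an illustration: the fixed 32-byte record size, the record contents, and the function names are all assumptions, and a real tape drive is not a Python `BytesIO` object.

```python
import io

RECORD_SIZE = 32  # assumed fixed record length, in bytes


def make_store(record_count):
    """Build a store of fixed-size records numbered from 1."""
    data = b"".join(
        f"record {number:04d}".encode().ljust(RECORD_SIZE)
        for number in range(1, record_count + 1)
    )
    return io.BytesIO(data)


def read_sequential(store, target):
    """Tape-style access: start at the beginning and read every record up to the target."""
    store.seek(0)
    record = b""
    reads = 0
    for _ in range(target):
        record = store.read(RECORD_SIZE)
        reads += 1
    return record.strip().decode(), reads


def read_random(store, target):
    """Disk-style access: jump straight to the target record and read only that one."""
    store.seek((target - 1) * RECORD_SIZE)
    return store.read(RECORD_SIZE).strip().decode(), 1


store = make_store(1000)
print(read_sequential(store, 400))  # ('record 0400', 400)
print(read_random(store, 400))      # ('record 0400', 1)
```

Both functions return the same record, but the sequential version had to pass over the first 399 records to get there.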

Some PC Cards can also be used as storage devices. Usually this is done to add storage to a laptop, or to make it easier to carry large amounts of data from one machine to another.

A RAID system is a redundant array of inexpensive disks. This is a series of hard drives. A RAID level 1 system has two hard drives, one of which is a mirror (carbon copy) of the other. Another kind of RAID uses data striping, which means it has several hard drives, and a little of each data file is written to each of them. When striping is combined with extra recovery information (parity) stored on the drives, the devices are redundant in the sense that any one of them could fail, and the data could be recovered or reconstructed from the remaining ones. Striping by itself, with no mirror or parity, does not provide that protection.
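The notes do not say how a striped array reconstructs lost data. One common approach, assumed here, is to store a parity block computed with XOR alongside the data stripes. The sketch below models drives as plain byte strings; the names and the sample data are made up for illustration.

```python
def xor_blocks(*blocks):
    """XOR equal-length blocks together, byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for index, byte in enumerate(block):
            result[index] ^= byte
    return bytes(result)


data = b"STORAGE!"  # 8 bytes of sample data

# Mirroring (RAID level 1): both drives hold a full copy.
mirror_a = data
mirror_b = data
assert mirror_b == data  # if drive A fails, drive B still has everything

# Striping with parity (assumed scheme): half the data on each of two drives,
# plus a third drive holding the XOR of the two halves.
stripe_1 = data[:4]
stripe_2 = data[4:]
parity = xor_blocks(stripe_1, stripe_2)

# Suppose the drive holding stripe_2 fails. XOR of the survivors rebuilds it.
rebuilt_stripe_2 = xor_blocks(stripe_1, parity)
print(stripe_1 + rebuilt_stripe_2)  # b'STORAGE!'
```

The same XOR step recovers whichever single drive is lost, which is why one extra drive of parity is enough to survive one failure.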

Data that is saved on tapes is often stored for years without being needed. A mass storage system of many tapes is shown in the picture on page 5.21. In this example, the tapes are stored in a small round room called a silo. Tapes are in two racks, one that covers the inner wall of the silo, and another that is a kiosk in the center of the silo. A robot reader, controlled by computer, can follow a track in the silo to retrieve and read any tape stored in it. After reading the tape, the robot can return it to the proper slot in its rack.

Two more concepts are shown on page 5.22. A smart card is a card shaped like a credit card which contains a processor. The processor stores data about the person the card is issued to, such as the amount of time the user has purchased on a telephone access system. When the card is used, it is updated. Optical memory cards are like flat, rectangular laser storage media. A laser system is used to read and write information to these cards.