This chapter sets a long list of objectives, but it is another
short one, so take heart about that. The chapter begins with a review
of some commands and a quick summary of others.
Selection commands are commands that extract information from files.
Manipulation and transformation commands do something with or to data or files.
The text explains that the pipe
operator (|) is another redirection
operator. The explanation on page 215 is correct. If you put a pipe
on the command line between two commands, the first command will be executed
and its output will be sent (piped) to the second command as input. The
generic form is: command1 | command2
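Here is a minimal sketch of a pipe at work. The sample names are made up for illustration; sort and head are standard commands, but they are my choice and not an example from the text.

```shell
# Hypothetical sample data: three names, one per line, out of order.
names=$(printf 'carol\nalice\nbob\n')

# printf writes the names, sort reads them as its input,
# and head reads whatever sort writes.
first=$(printf '%s\n' "$names" | sort | head -n 1)
echo "$first"
```

Each command in the chain only sees the output of the command to its left, so you can add or remove stages without touching the others.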
Regarding commands we have already discussed, the cut command
can be viewed as a way of displaying only a part of the information found
in some file. Its syntax reflects the fact that, like most UNIX commands,
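it looks at a file one line at a time. With the -c option, cut shows the character at a given position on each line.
If we want to see a sequence of characters, we might give -c a range of positions instead of a single number.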
If the file is organized (as we have seen) into fields, with
all fields separated (or delimited) by the same character
(like a colon, or a tab), we can tell cut to show us certain fields.
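The three uses of cut described above might look like the sketch below. The file people.txt and its contents are invented for this illustration; the -c, -d, and -f options are standard.

```shell
# Build a small, hypothetical colon-delimited file to practice on.
workdir=$(mktemp -d)
printf 'alice:x:1001:Alice Adams\nbob:x:1002:Bob Brown\n' > "$workdir/people.txt"

# One character: position 1 of every line.
one_char=$(cut -c1 "$workdir/people.txt")

# A sequence of characters: positions 1 through 3 of every line.
range=$(cut -c1-3 "$workdir/people.txt")

# Fields: use a colon as the delimiter and show fields 1 and 4.
fields=$(cut -d: -f1,4 "$workdir/people.txt")

printf '%s\n' "$one_char" "$range" "$fields"
rm -r "$workdir"
```

Note that cut never changes the file. It only decides which part of each line to display.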
The grep command is discussed
next. The text suggests three possible meanings for grep. The one I recall
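is their second choice: Global Regular
Expression Parser. (The name is more often traced to the ed editor command g/re/p, which globally searches for a regular expression and prints each matching line.) Global in the sense that it will search through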
a file, a list of files, or all the files in a folder. Parser in the sense
that it looks through the parts of a file (characters and words). Regular
Expression because that is what someone once called search strings that
are allowed to include wildcards and other operators. Note the options
on page 217. The default behavior is to return the lines that match the search string, each prefixed with its filename when more than one file is searched, but you can use the -l option
to limit the return to just a list of filenames that contain hits.
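A short sketch of those behaviors follows. The files a.txt and b.txt and their contents are hypothetical; only the grep invocations matter.

```shell
# Two small, made-up files to search.
workdir=$(mktemp -d)
start_dir=$PWD
cd "$workdir" || exit 1
printf 'the pipe operator\nthe cut command\n' > a.txt
printf 'the sort command\nthe uniq command\n' > b.txt

# One file: grep prints just the matching lines.
one=$(grep 'cut' a.txt)

# Several files: each matching line is prefixed with its filename.
many=$(grep 'command' a.txt b.txt)

# -l: print only the names of files that contain a match.
names=$(grep -l 'sort' a.txt b.txt)

printf '%s\n' "$one" "$many" "$names"
cd "$start_dir" || exit 1
rm -r "$workdir"
```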
The uniq command has a very specific function. You feed it a file that consists of lines of text. It examines
each line, and it returns each line, but only if the line it just
returned does not match the current line. In this way, it returns one
and only one copy of each combination of characters in a line found in
that file, as long as the file was sorted alphabetically to start with.
You may wonder, what the hell good is that?!? Well, it was written
before the sort command had a -u
switch, which does the same thing, but it sorts the file first then
looks for "unique" lines. UNIX admins are not known for updating their
systems, so the sort filename | uniq
method will still work, even if the "newer" version of sort has not
been installed. Note: this filter method is not meant to be used when
editing text that depends on multiple instances of the same string.
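To see why the sorting matters, try a sketch like this one. The file fruit.txt is invented for the illustration.

```shell
# A hypothetical file with repeated lines that are NOT next to each other.
workdir=$(mktemp -d)
printf 'pear\napple\npear\napple\nfig\n' > "$workdir/fruit.txt"

# uniq alone only compares neighboring lines, so nothing is removed here.
unsorted=$(uniq "$workdir/fruit.txt")

# Sorting first puts the duplicates side by side, so uniq can drop them.
piped=$(sort "$workdir/fruit.txt" | uniq)

# The -u switch of sort gives the same result in one step.
switched=$(sort -u "$workdir/fruit.txt")

printf '%s\n' "$piped"
rm -r "$workdir"
```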
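The comm command compares two sorted files, and it produces three columns of output: lines found only in the first file, lines found only in the second file, and lines found in both files.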
If you want a mnemonic for the comm command, remember that the third column reports lines that are common to both files. Note that each column can be suppressed in the output by using -column_number as a switch.
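The sketch below shows comm with and without column suppression. The two file names and their contents are made up; comm expects both files to be sorted already, which these are.

```shell
# Two hypothetical sorted lists.
workdir=$(mktemp -d)
printf 'apple\nbanana\ncherry\n' > "$workdir/first.txt"
printf 'banana\ncherry\ndate\n' > "$workdir/second.txt"

# All three columns, separated by tabs.
comm "$workdir/first.txt" "$workdir/second.txt"

# Suppress columns 1 and 2: only the lines common to both files remain.
common=$(comm -12 "$workdir/first.txt" "$workdir/second.txt")

# Suppress columns 2 and 3: only the lines unique to the first file remain.
only_first=$(comm -23 "$workdir/first.txt" "$workdir/second.txt")

# Suppress columns 1 and 3: only the lines unique to the second file remain.
only_second=$(comm -13 "$workdir/first.txt" "$workdir/second.txt")

rm -r "$workdir"
```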
The diff command examines two files that are supposed to be similar, and it gives us a report about the lines in each that are different. Take a look at the discussion on pages 221 and 222, then be glad we will not be wasting a lot of time on this command. It may be wonderful for instances that need it, but it is hard to imagine finding ourselves in such an instance.
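If you ever do need it, a minimal sketch of diff looks like this. The files old.txt and new.txt are hypothetical, and the exact layout of the report varies between diff implementations, so the sketch only checks that the changed line is mentioned.

```shell
# Two made-up versions of the same file; only line 2 differs.
workdir=$(mktemp -d)
printf 'line one\nline two\nline three\n' > "$workdir/old.txt"
printf 'line one\nline 2\nline three\n' > "$workdir/new.txt"

# diff prints a report of the differing lines.
# Its exit status is 0 when the files match and 1 when they differ.
report=$(diff "$workdir/old.txt" "$workdir/new.txt")
status=$?

printf '%s\n' "$report"
echo "exit status: $status"
rm -r "$workdir"
```

The exit status is handy in scripts, where you may only care whether two files match and not how they differ.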
The wc, word count, command is used to count three kinds of things about a text file. It can count the number of words (-w), the number of lines (-l), the number of bytes (-c), or any combination of those three options. Note that the switch for bytes is -c, which assumes one byte per character, as in the ASCII or extended ASCII sets.
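A quick sketch of the three counts, using an invented file called sample.txt. Feeding the file to wc with < keeps the filename out of the output, and the arithmetic wrapper strips the padding that some versions of wc add.

```shell
# A hypothetical two-line file.
workdir=$(mktemp -d)
printf 'one two three\nfour five\n' > "$workdir/sample.txt"

lines=$(( $(wc -l < "$workdir/sample.txt") ))   # count of lines
words=$(( $(wc -w < "$workdir/sample.txt") ))   # count of words
bytes=$(( $(wc -c < "$workdir/sample.txt") ))   # count of bytes, newlines included

echo "$lines lines, $words words, $bytes bytes"
rm -r "$workdir"
```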
The text explains that you may (will?) sometimes want to make global changes to huge files. In a case like that, you want an editing tool that is made to dig into large bodies of data. The tool the text describes is sed, whose name may stand for stream editor. It helps to have already accepted the concept of streams, which can be treated as files. sed is happy to apply your instructions to each line in several files tirelessly, where a human (or other biological life form) might grow inattentive and error prone. The link I have just given you goes to a web page with much more information about sed than we are given in the current chapter in your text.
One method of using sed is from the command line, such as sed -e 's/text_to_find/text_to_write/' filename
The -e option means to read the sed commands entered on the command line (as opposed to -f, which means to read them in a sed script). The quoted text has three parts, separated by slashes. The "s" means to substitute text. The "text_to_find" is the text that sed will search for on each line. The "text_to_write" is the text that will replace the "text_to_find" on each line. Finally, filename could be the name of a specific file to process, or a phrase that expands to many filenames.
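Here is a small sketch of a substitution, using a made-up file called pets.txt. One detail worth noticing: without a trailing g flag, sed replaces only the first match on each line.

```shell
# A hypothetical file; the last line contains the search text twice.
workdir=$(mktemp -d)
printf 'the cat sat\nno match here\ncat and cat\n' > "$workdir/pets.txt"

# Replace the first "cat" on each line with "dog".
changed=$(sed -e 's/cat/dog/' "$workdir/pets.txt")

# Add the g flag to replace every "cat" on each line.
changed_all=$(sed -e 's/cat/dog/g' "$workdir/pets.txt")

printf '%s\n' "$changed"
rm -r "$workdir"
```

Used this way, sed writes the edited text to standard output and leaves the file itself alone, so redirect the output if you want to keep it.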
The command may also specify a line number or a range
of line numbers to process in the named file(s). However, this may not
be a good way to do it, because it gets tricky when more than one file is named.
For example, sed -e '1,250s/text_to_find/text_to_write/' filename applies the substitute command only to lines 1 through 250 in filename. If filename expands to mean several files, line numbering does not restart with each file: the first file in the list contains lines 1 through x, the second contains lines x+1 through y, the third contains lines y+1 through z, and so on. For the purposes of sed, all lines in all files in the list are treated as consecutive lines in a single file, with ever increasing line numbers. If we intend to process all lines in the first file, and half the lines in the second file, we had better use the wc command to count the lines first.
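The counting step can be sketched as follows. The two files and their contents are invented so the result is easy to check: the first file has 3 lines, the second has 4, and we want to change all of the first and half of the second.

```shell
# Two hypothetical files whose lines are numbered 1-3 and 4-7 as sed sees them.
workdir=$(mktemp -d)
printf 'old 1\nold 2\nold 3\n' > "$workdir/first.txt"
printf 'old 4\nold 5\nold 6\nold 7\n' > "$workdir/second.txt"

# Count the lines in each file with wc before choosing the range.
first_count=$(( $(wc -l < "$workdir/first.txt") ))
second_count=$(( $(wc -l < "$workdir/second.txt") ))

# All of the first file plus half of the second: 3 + 2 = 5.
last=$(( first_count + second_count / 2 ))

# Double quotes let the shell fill in the value of $last.
result=$(sed -e "1,${last}s/old/new/" "$workdir/first.txt" "$workdir/second.txt")

printf '%s\n' "$result"
rm -r "$workdir"
```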
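Using a sed script file is similar to the section above: replace -e and the quoted command with -f and the name of the script file, as in sed -f scriptname filename, where scriptname stands for whatever file holds your sed commands.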
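A minimal sketch of the script file approach is below. The names fix.sed and letter.txt are hypothetical. The script holds one sed command per line, with no quotes around them.

```shell
workdir=$(mktemp -d)

# A hypothetical script file containing two substitute commands.
printf 's/colour/color/\ns/centre/center/\n' > "$workdir/fix.sed"

# A hypothetical file to edit.
printf 'the colour of the centre\n' > "$workdir/letter.txt"

# -f tells sed to read its commands from the script file.
fixed=$(sed -f "$workdir/fix.sed" "$workdir/letter.txt")

echo "$fixed"
rm -r "$workdir"
```

A script file is worth the trouble once you have more than a couple of commands, or when you expect to run the same edits again later.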
The sed command also has a delete command, d. Remember that, being
a line editor, it deletes entire lines, not parts of lines.
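Deleting with sed might be sketched like this. The file notes.txt is made up; the d command and the three ways of addressing lines (a number, a range, a pattern) are standard.

```shell
# A hypothetical four-line file with one comment line.
workdir=$(mktemp -d)
printf 'keep 1\n# a comment\nkeep 2\nkeep 3\n' > "$workdir/notes.txt"

# Delete line 2.
by_number=$(sed -e '2d' "$workdir/notes.txt")

# Delete lines 2 through 3.
by_range=$(sed -e '2,3d' "$workdir/notes.txt")

# Delete every line that starts with #.
by_pattern=$(sed -e '/^#/d' "$workdir/notes.txt")

printf '%s\n' "$by_pattern"
rm -r "$workdir"
```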
Your text explains that sed can append to a file, but it might also be thought of as an insertion. You use the a\ command, preceded by an address such as the number of the line to append after, and followed by the text to add. If you do not supply an address, sed appends the text after every line; the address $ appends after the last line of the file.
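Here is a sketch of appending, using the portable form in which the address comes first, then a\, then the new text on the following line. The file lines.txt and the appended text are invented for the illustration.

```shell
# A hypothetical three-line file.
workdir=$(mktemp -d)
printf 'one\ntwo\nthree\n' > "$workdir/lines.txt"

# Append after line 2.
after_two=$(sed -e '2a\
inserted after line two' "$workdir/lines.txt")

# Append after the last line; $ is the address of the final line.
at_end=$(sed -e '$a\
last line' "$workdir/lines.txt")

# No address at all: the text is appended after every line.
everywhere=$(sed -e 'a\
---' "$workdir/lines.txt")

printf '%s\n' "$after_two"
rm -r "$workdir"
```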
The text explains the tr (translate) command as being useful for translating sets of characters. It is hard to see what value this one has, so let's take a look at some examples from another source. It is clearer from the material on the web site that the two character strings you need to supply can be specified in different ways, but they usually must be the same length. There are some exceptions to that "rule" in the examples on the web page.
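A few short examples may make tr easier to appreciate. These are my own illustrations, not the ones from the text or the web page. The last two show the exceptions to the same-length "rule": with -d or -s you supply only one set of characters.

```shell
# Translate lowercase letters to uppercase (two sets of equal length).
upper=$(printf 'hello world\n' | tr 'a-z' 'A-Z')

# Translate every colon to a space.
spaced=$(printf 'alice:x:1001\n' | tr ':' ' ')

# -d deletes every character in the set; no second set is needed.
no_digits=$(printf 'abc123def\n' | tr -d '0-9')

# -s squeezes runs of a repeated character down to one.
squeezed=$(printf 'too    many   spaces\n' | tr -s ' ')

printf '%s\n' "$upper" "$spaced" "$no_digits" "$squeezed"
```

Note that tr reads only from standard input, so it is almost always used with a pipe or with < redirection.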
The pr command prints to the standard output stream. Its format assumes you want paged output in standard 66-line pages. (That makes 6 lines per vertical inch on 11 inch sheets of paper.) In the 66 lines, the command assumes a 5 line header, which displays information about the file by default. There is also a 5 line footer, which the text calls a trailer. You can override the default settings with optional switches, as noted on page 225.
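The following sketch tries pr on a very short, made-up file. The -t switch (omit the header and trailer) and the -l switch (set the page length) are standard, but check page 225 and your own system for the full list.

```shell
# A hypothetical two-line file.
workdir=$(mktemp -d)
printf 'first line\nsecond line\n' > "$workdir/short.txt"

# Count the lines pr emits for one page with the default settings.
page_lines=$(( $(pr "$workdir/short.txt" | wc -l) ))

# -t omits the header and the trailer, leaving just the text.
bare=$(pr -t "$workdir/short.txt")

# -l sets the page length; here, a 20-line page.
custom_lines=$(( $(pr -l 20 "$workdir/short.txt" | wc -l) ))

echo "default page: $page_lines lines, custom page: $custom_lines lines"
rm -r "$workdir"
```

On a system that uses the defaults described above, the first count should match the standard page length even though the file has only two lines, because pr pads the page out to full length.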
On page 225, the text begins a new section that leads into a major project: designing an application that makes use of the functions of the operating system we have discussed. It is debatable whether the end product produced by the projects in the chapter is really an application, since it produces no machine language files. It does, however, give us a reason to discuss several points about application design which will help you in this project and in others that will follow in other classes.
Browse the pages at the end of the chapter to get an idea of what this project is about. You may want to practice skills with the various commands in this chapter by doing some of projects 5-1 through 5-11.