Teach Yourself UNIX Shell Programming in 14 Days

Chapter 4: Using Metacharacters


This chapter discusses the use of metacharacters: wild cards and other characters with special meanings to the shell. Objectives important to this chapter are:

  • Quotation characters
  • Filename metacharacters
  • Substitution metacharacters
  • Input / Output uses of metacharacters

Metacharacters can be used with filename specifications to specify a long list of possible filenames. For instance:

  • asterisk - matches any number of other characters
  • question mark - matches any one other character
  • square brackets - used to enclose a list of allowed characters, either a set or a range

The command

	ls c*

would list every file whose name starts with the letter c. Any number of any characters (or none) can appear after the c.

The command

	ls c?

means to list all files whose names begin with a c followed by exactly one other character.

The command

	ls *c*

means to list all files whose names have a c someplace in them.
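The three patterns above can be tried safely in a scratch directory. This is a minimal sketch: the file names are made up for illustration, and it assumes a mktemp command that accepts -d. It uses echo rather than ls to show that the shell itself does the expansion before the command ever runs.

```shell
# Work in a scratch directory so the patterns only see our sample files.
demo=$(mktemp -d)
cd "$demo" || exit 1
touch c ca cat abc dog          # hypothetical file names

starts_with_c=$(echo c*)        # c followed by anything, or nothing
c_plus_one=$(echo c?)           # c followed by exactly one character
c_anywhere=$(echo *c*)          # c anywhere in the name

echo "$starts_with_c"           # c ca cat
echo "$c_plus_one"              # ca
echo "$c_anywhere"              # abc c ca cat

# Clean up the scratch directory.
cd / && rm -r "$demo"
```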

Recursion is a handy thing and also a dangerous one. The shell, not ls, expands the asterisk, and it matches only names in the current directory. If one of the matches is a directory, ls lists its contents one level down; to get the full contents of all subdirectories, use ls -R. This is handy. The rm command can recurse through all subdirectories as well, and this is dangerous. That is why rm does not recurse by default: it will do so only if you use the -r switch.
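The difference between an expanded asterisk and a recursive listing can be seen in a small sketch. The directory and file names here are invented for the demonstration, and mktemp -d is assumed to be available.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
mkdir -p top/sub                # hypothetical directory tree
touch top/a top/sub/b

# The shell expands * to "top"; ls then lists what is inside top,
# but does not descend into top/sub.
flat=$(ls *)

# With -R, ls itself walks down through every subdirectory.
deep=$(ls -R)

echo "$flat"
echo "$deep"

cd / && rm -r "$demo"
```

The first listing shows a and sub but not b; only the -R listing reaches the file in top/sub.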

The use of square brackets to enclose sets of possible letters or numbers is less familiar to most users. Your author indicates the characters specified in the brackets may be:

  • lowercase letters
  • uppercase letters
  • digits 0 through 9
  • special characters

For instance the command

	ls [a-e]*

would list all filenames starting with the letters a through e.

You may use several bracketed sets. The command

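	ls [AE]*[1-4]

would list all filenames starting with an uppercase A or E and ending with a digit 1 through 4. (To match names starting with a through e instead, write the first set as the range [a-e].) Note that the pattern itself must not contain spaces: the only space in this command is the one after ls.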
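A short sketch of both bracket forms follows. The file names are hypothetical, and LC_ALL=C is set as an assumption so that a range such as [a-e] means only the lowercase ASCII letters; in some other locales a range may also match uppercase letters.

```shell
# Use the C locale so ranges follow plain ASCII order.
LC_ALL=C
export LC_ALL

demo=$(mktemp -d)
cd "$demo" || exit 1
touch apple egg fig A1 E4 Eclipse3 B2 E5   # hypothetical file names

range_match=$(echo [a-e]*)       # names starting with a, b, c, d or e
set_match=$(echo [AE]*[1-4])     # names starting with A or E, ending in 1-4

echo "$range_match"              # apple egg
echo "$set_match"                # A1 E4 Eclipse3

cd / && rm -r "$demo"
```

B2 is skipped because B is not in the set [AE], and E5 is skipped because 5 is outside the range [1-4].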

If you want to turn off the expansion of filenames using metacharacters, you can set the shell's f (noglob) option with the command

	set -f

and turn expansion back on again with

	set +f

This is not always a practical option, however, so we more often put various forms of quotes around material we do not want expanded.
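The effect of the f option is easy to see with echo. This is a sketch in a scratch directory with two invented file names.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1
touch cat cow                    # hypothetical file names

expanded=$(echo c*)              # expansion is on by default

set -f                           # turn filename expansion off
literal=$(echo c*)               # the pattern is passed through unchanged
set +f                           # turn filename expansion back on

restored=$(echo c*)

echo "$expanded"                 # cat cow
echo "$literal"                  # c*
echo "$restored"                 # cat cow

cd / && rm -r "$demo"
```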

Your author lists four characters used in quoting:

  • the single quote, or apostrophe
  • the double quote, or quotation mark
  • the back quote, or accent grave
  • the backslash character (Which is its correct name. How odd.)

Each has its own properties and limitations. To illustrate them, we may use the grep command. This command may be used to search for a specific text string in any number of text files. (It has other uses, but this one is illustrative.)

If we want to search the file called eats for the string pizza, we could enter

	grep pizza eats

to show every line in that file containing the word pizza.

If we want to search for a string of several words, however, we should enclose the phrase in single quotes so that it is passed to grep as a single argument, such as

	grep 'single quotes' unix_file
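To try the quoted search without needing an existing file, the sketch below writes three made-up lines into a temporary file (created with mktemp, which is assumed to be available) and searches it.

```shell
unix_file=$(mktemp)              # stand-in for the file being searched
printf '%s\n' \
    'use single quotes for literal text' \
    'double quotes allow substitution' \
    'single words need no quotes' > "$unix_file"

# The quoted phrase reaches grep as one argument, so only lines that
# contain the two words together, in that order, are printed.
matched=$(grep 'single quotes' "$unix_file")
echo "$matched"                  # use single quotes for literal text

rm -f "$unix_file"
```

Without the quotes, grep would search for the word single in a file named quotes as well as in the temporary file.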

A feature of single quotes is that dollar signs, backslashes, and back quotes inside them are not interpreted as special characters. If you need a dollar sign to act as the value-of operator, you can use double quotes instead of single quotes. The echo command will illustrate this. Suppose the command is:

	echo 'Good morning, $USER'

The message echoed would be exactly what you see here. The shell would not substitute the value of the USER variable. However,

	echo "Good morning, $USER"

would result in the user's id being substituted in the output.
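The same comparison can be run with a variable whose value is known in advance. The variable name who and its value are invented for this sketch, since the value of USER differs from system to system.

```shell
who=pat                               # hypothetical variable for the demo

single='Good morning, $who'           # single quotes: $who is left alone
double="Good morning, $who"           # double quotes: $who is replaced

echo "$single"                        # Good morning, $who
echo "$double"                        # Good morning, pat
```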

The backslash means that the shell should ignore the special meaning of the next character. It is like putting single quotes around just one character. It is useful when you want to print a single or double quote to the screen, since you can use the backslash to tell the shell to treat that character like any other. This is similar to the role the backslash plays in printf statements in the C language, where it also changes the normal interpretation of the next character. You can also display one type of quote by enclosing it in the other type of quote.
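A few one-line cases show both techniques: escaping a single character with a backslash, and wrapping one kind of quote in the other. The phrases are arbitrary examples.

```shell
# Backslash: treat the next character as ordinary text.
escaped_apostrophe=$(echo It\'s here)
escaped_quotes=$(echo "She said \"hi\"")
escaped_dollar=$(echo \$HOME)

# One kind of quote enclosed in the other kind.
mixed_quotes=$(echo 'She said "hi"')

echo "$escaped_apostrophe"       # It's here
echo "$escaped_quotes"           # She said "hi"
echo "$escaped_dollar"           # $HOME
echo "$mixed_quotes"             # She said "hi"
```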

The back quote, or accent grave, may be used to enclose a command that you want executed, usually so that the result of that command may be used by the rest of your command line as an argument. It is difficult to see the back quote in text, and to tell it apart from the apostrophe even when you can see it. This makes it more difficult to debug scripts that use it. As an example, the command

	ls -l `which rm`

means that the shell first runs the which program with rm as its argument, finding the full path of the rm program. Then the result of that command (the path to rm) is handed to the ls command as an argument, and ls displays that file in long form.
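The sketch below runs the same idea two ways. It swaps in command -v for which, on the assumption that which may not be installed everywhere; command -v is built into POSIX shells and also prints the path of a program. The second form, $(...), does the same job as back quotes and is much easier to spot on the page.

```shell
# Back quotes: run the inner command and use its output as an argument.
rm_path=`command -v rm`
ls -l "$rm_path"

# The $(...) form is equivalent and harder to confuse with an apostrophe.
ls -l "$(command -v rm)"
```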

It is a bad idea to use metacharacters in filenames. The book offers the example of a filename starting with a leading ampersand. It is legal to do so, but when the shell encounters an unquoted ampersand, it ends the command there and runs everything before it on the line as a background process. The command before the ampersand never sees the filename as one of its arguments: it has already been sent to the background, and the shell tries to run the rest of the line as a separate command. Don't do it. Metacharacters do not belong in filenames. Like C, UNIX will let you do things that are not good for you. Learn not to do them. (Next, we learn not to smoke, drink, or tell naughty stories. Well, pick a couple, anyway.)
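If you ever inherit a file with such a name, the quoting characters from earlier in the chapter are the way to reach it. This is a sketch only: the file name &notes and its contents are invented, and the work is done in a scratch directory made with mktemp -d.

```shell
demo=$(mktemp -d)
cd "$demo" || exit 1

# Quoting the name keeps the shell from treating & as "run in background".
echo 'odd name' > '&notes'       # hypothetical file name

quoted=$(cat '&notes')           # single quotes protect the ampersand
escaped=$(cat \&notes)           # a backslash does the same for one character

echo "$quoted"                   # odd name
echo "$escaped"                  # odd name

cd / && rm -r "$demo"
```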