LUX 211 - Shell Programming

Lesson 3: Chapter 8, The Bourne Again Shell, part 1

Objectives:

This lesson introduces the concepts from chapter 8. Objectives important to this lesson:

  1. Startup files
  2. Writing and executing a shell script
  3. Job control
  4. Parameters and variables

Concepts:
Chapter 8

Chapter 8 has several pages of background about the bash shell and the special characters that are used in it. The text reminds us that a shell is needed for you to even see a login prompt, which means that a shell is running on your system before you ever ask for one. That first shell is a limited version of the full-featured shells you can invoke once you have logged in and gained access to resources. The text reveals that several environment variables are set in the noninteractive shell that runs before and while you are logging in.

We will begin our discussion of the chapter on page 284, where we are reminded that you can read a simple shell script with the cat command, as shown on page 285, but you probably can't run it until you grant execute permissions to it.

In the image on the right, I used cat to show that the file did not exist for me yet. To create it with the cat command, I used a little redirection:

cat > whoson

This command will create a file with the name you supply, and wait for any number of lines of input. After you enter your lines (such as those in the text), you should generate an end-of-file character on an otherwise blank line by pressing ctrl-d. This will close the file and return you to a normal prompt. I did that, then tried to run the script in the session shown on the right, but it failed for two reasons.
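
Here is a minimal sketch of such a session; the three lines typed into the file are only my guess at the sort of commands the text's script uses (yours may differ):
    cat > whoson
    date
    echo "Users Currently Logged In"
    who
    (press ctrl-d on an otherwise blank line to close the file)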

The next section in the text reminds us to check permissions on the new file with an ls -l command, where we see that execute permissions are not set by default, even for files we create ourselves. Use the chmod command as shown on page 286 (or in the more generous way I used on the right) to grant permissions to yourself. I also tried a syntax variation: entering the permissions last on the command line. That failed, which shows that the chmod command is sensitive to the order of its arguments.

In the sample session on the right, I then tried to execute the script again, and it failed again, because Linux does not look for a script in your current folder by default. It must be told to do so, either by adding your current folder to the search paths (PATH), or by specifying the path to the script in your command. I did the latter, prefixing a ./ to the script name, specifying that it exists in the current folder. Running the script like this worked, as it did for our author.
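
Putting the permission fix and the path fix together, here is a quick sketch, assuming the script is still named whoson:
    ls -l whoson       # shows that execute permission is not set
    chmod u+x whoson   # grant yourself execute permission
    ./whoson           # run the script by pointing at the current folder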

On page 287, the text reminds us that there are often multiple shells available in most Linux distributions, and that there are some syntax differences between them, so it is advisable to call a shell of your choice into existence on the first line of a shell script. You may recall seeing the notation the author uses as a script's first line in a previous text, but this time we are offered options that can be useful while troubleshooting:

  • #!/bin/bash
    This notation tells the operating system to start a bash shell, provided that the shell executable is stored in the /bin folder.
  • #!/bin/bash -eu
    This notation does the same thing as the one above, but adds two options that are useful while troubleshooting: the -e option makes the shell exit as soon as a command fails, and the -u option makes it exit if the script tries to use a variable that has not been set.
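
As a small illustration of why these options help while troubleshooting, here is a sketch; the variable name not_set_yet is made up:
    #!/bin/bash -eu
    echo "$not_set_yet"                # with -u set, bash stops here and reports an unbound variable
    echo "this line is never reached"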

The text also explains that if the script you are calling is written in Perl, or has no particular shell associated with it, there is no need to call a new shell. Just call the script from the shell you are in.

On page 289, the text begins a discussion of control operators, characters in a script that manage how the computer interprets your script.

  • separators - Commands must be separated from each other, in a script or on a command line. The simplest way to separate them is to put them on separate lines. In this case, the line return (newline) character is the separator. Another way to separate commands is to put semi-colons between them. This is more commonly used when you have several short commands that are just as readable on one line as they would be on separate lines. (By the way, whenever we talk about commands, remember that the discussion applies regardless of the command being on a terminal command line or executed in a shell script.)
  • pipelines and lists - You should know that you can use a pipe operator (|) to take the output of one process and send it to the standard input of another process.
    process1 | process2
    This is a common notation, but we may not have remarked that the shell treats the pipeline as a single unit and waits for the whole thing to finish before giving you another prompt. When we are not passing information from one process to another, we may want to use the single ampersand (&), which throws the process whose name it follows into the background. We can list several commands we wish to start on one command line (there is a combined sketch of these operators after this list). Example:
    process1 & process2 & process3 & process4
    This command would run all four processes as quickly as they can be loaded, and the first three would run in the background. The fourth would also run in the background if we added another ampersand at the end of that command line.
    On page 291, the next to last example shows three processes with pipes between them, and one ampersand as the final character on the line. You might think that only the last process would be sent to the background, but the text explains that the system sees all three processes as one pipeline, and the whole thing is sent into the background as one job.
  • Boolean && and || operators - The double ampersand (&&) stands for the Boolean AND operator. The double pipe (||) stands for the Boolean OR operator. The text remarks that Boolean operations always evaluate to true (0) or false (1). This makes sense for test commands, but the other uses on page 292 strike me as another category.
  • && and || as controllers - On page 292, we see a different concept from what George Boole had in mind.

    First example: process1 && process2
    This command would execute the process on the left, then evaluate it as having ended successfully (true) or unsuccessfully (false). If the first process evaluates as true, the second process will be executed. If the first process evaluates as false, the second process will not be executed, because the entire statement will already be false.
    true AND true = true
    false AND true = false
    true AND false = false
    false AND false = false
    For the purpose of writing a script, we can place two processes in this framework when we want the second one done only if the first one succeeds.

    Second example: process1 || process2
    The || has a different effect. First, the process on the left is run. If the first process executes successfully, that means it will be evaluated as true. Since a conditional OR statement only has to be true on one side, the second process will not run in this case. If the first process fails to run, then it will evaluate as false, so the conditional must execute the second process to determine whether the whole thing is true or false.
    true OR true = true
    false OR true = true
    true OR false = true
    false OR false = false
    For the purpose of writing a script, we can place two processes in this framework when we want the second one executed only if the first one fails.

    We do not have to care whether the entire conditional evaluates to true or false in either case. The use of the logic operators can be just to control when and if the second command we place in the structure is actually executed.
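
Here is a short sketch that pulls these control operators together, using ordinary commands; the folder names are made up for illustration:
    sleep 2 ; echo done                       # semicolon: run the two commands one after the other
    sleep 60 &                                # single ampersand: run sleep in the background
    who | sort                                # pipe: send the output of who to the input of sort
    mkdir /tmp/demo && cd /tmp/demo           # && : only change into the folder if mkdir succeeded
    cd /tmp/nosuchplace || echo "no folder"   # || : only run echo if the cd failed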

Moving ahead to page 294, the text begins a new section about jobs. In another chapter a job meant a command line that included pipes, but here the text is less strict about that, telling us that running the date command (and therefore any command) counts as a job. A more precise definition on the same page says that if we have a long command line separated by one or more ampersands, we can consider the portion to the left of each ampersand to be a separate job. The text offers several job-related commands:

Command                        Purpose
jobs                           lists the jobs currently being processed in the background
fg jobnumber (or %jobnumber)   brings the job with the number you supply to the foreground
ctrl-z                         suspends the job in the foreground
bg                             moves the foreground job to the background, if it is currently suspended (a job cannot be moved like this unless it is suspended)
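
A quick sketch of these commands in action, using sleep as a harmless long-running job (the exact messages your shell prints may differ):
    sleep 300 &        # start a job in the background
    jobs               # reports something like: [1]+  Running    sleep 300 &
    fg %1              # bring job number 1 to the foreground
                       # ...press ctrl-z here to suspend it...
    bg                 # resume the suspended job in the background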

Let's move ahead to page 300. The text begins a long section on variables, which it breaks into several topics. Let's see what it wants us to know:

  • parameters - any value that can be accessed by the shell script, especially those that are handed to it as arguments on a command line
  • variables - a named memory space in which a value, character, string, or other data element can be stored and accessed; variable names must start with a letter or an underscore (no numerals), and may only contain letters, underscores, and numerals (no special characters)
  • shell variable - a variable that is created in a shell, and is known (local) only to that shell
  • environment variable - a shell variable that has been exported; using the export command with a variable makes it available to other shells, other scripts, and other processes; an exported variable is also called a global variable

The text explains that there is a rule for assigning a value to a variable in the bash shell. The variable name goes on the left side of an equal sign, the value to be assigned goes on the right side of the equal sign, and there are no spaces in the phrase. Example:
variablename=value
Putting spaces around the equal sign breaks the assignment: bash would treat variablename as the name of a command to run, with = and value as its arguments. The tcsh shell's set syntax does allow spaces, but we are running bash, so keep that in mind.
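
A two-line sketch of the difference, using a made-up variable named color:
    color=blue       # assignment: the variable color now holds blue
    color = blue     # error: bash looks for a command named color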

On page 301, the text points out a feature that is more likely to make a program fail than it is to make one succeed.

  • In the example in the text, the author shows us that he has a script called my_script, which does one thing. It echoes the value of TEMPDIR, a variable that does not exist in that script. The author issues this command in his terminal:
    TEMPDIR=/home/sam/temp ./my_script
  • This line creates a variable called TEMPDIR, gives it a value, then calls my_script from the current directory.
  • The script runs, and echoes the value that was given to the variable on the command line that called the script. Okay, so we set a variable's value as part of the command? Sort of.
  • Now for the weird part. On the next line in the author's terminal session, he enters this:
    echo $TEMPDIR
    and the response is nothing.
  • The shell in which the author created the variable and gave it a value knows nothing about that variable. This would not have happened if the author had created and initialized the variable on a separate line. Because he did it as part of a line that also called a script, the script had access to the variable but the shell did not. This is a very weird example of limiting the scope/visibility of the variable. Avoid doing this unless you have a reason to do it. By the way, the variable would have been available to child processes of the process that was called in the command line, but not to other processes running at the same time or later.
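
Here is a sketch of the whole exchange; I am assuming my_script contains nothing but an echo of the variable, as the description above says:
    TEMPDIR=/home/sam/temp ./my_script   # the script prints /home/sam/temp
    echo $TEMPDIR                        # the calling shell prints nothing; it never saw the variable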

The text warns us that there are system variables that are often set when we log in. They are often set in /etc/profile or /etc/csh.cshrc. You may want to check these locations when you are writing scripts, to avoid changing the values of variables that are used for another purpose.

On pages 302 and 303, the text covers using and avoiding the use of the $ operator. In the discussion of variables, the $ operator means "the value of or stored in". Example:
echo variable1
echo $variable1
echo "$variable1"
echo '$variable1'
echo \$variable1

The first command, echo variable1, would put the literal word variable1 on the screen. You are only telling the system to echo some characters.

The second one would put the thing stored in variable1 on the screen. The $ means to use the value stored in the variable.

The third one, echo "$variable1" would do the same thing as the second one. It accesses the value of the variable, and reports it to us.
If there had been any spaces in the string, we would have needed quotes around it when we assigned it to the variable, and the double quotes here make sure echo receives the stored value exactly as it is, spaces and all.

The fourth one, echo '$variable1' is different. The single quotes tell the shell not to interpret any special characters inside them, so the $ loses its meaning. The result is simply all the characters inside the single quotes: $variable1.

The fifth one, echo \$variable1, uses a backslash to override the meaning of the $ operator, making it act like any other normal keyboard character.
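
Putting the five commands side by side with their output, assuming we had first stored the string good morning in the variable:
    variable1="good morning"
    echo variable1       # variable1
    echo $variable1      # good morning
    echo "$variable1"    # good morning
    echo '$variable1'    # $variable1
    echo \$variable1     # $variable1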

You may have noticed that the tcsh (C-based) shell uses the set keyword to assign a value to a variable. Bash needs no such keyword for an assignment. Knowing this will make the next command, unset, more sensible: we use unset to remove a variable from the shell it is in. On page 304, the text shows us that we can clear the contents of a variable with a null assignment, such as this:
variable1=
The problem would be that the variable still exists, taking up memory space, and potentially accessible by a script or a command. The unset command would remove the variable completely:
unset variable1
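
A small sketch of the difference, using a made-up value:
    variable1=red
    variable1=           # variable1 still exists, but now holds an empty (null) value
    unset variable1      # variable1 no longer exists at all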

The next section gets a little heavier. There are several features possible in bash variables that can make scripts act more like programs written in a "real" programming language. Up to now, simply assigning to a variable has been enough both to create it and to load something into it. You need to know that the declare command can be used to give a variable more definition and several restrictions. Its syntax is a little strange in some cases.

Command variation              Purpose
declare variable1=blue         creates a variable, and gives it an initial value (initializes the variable); the declare command is not needed if this is all you want to do
declare -a variable2           creates the variable as an array, a series of memory locations that typically hold the same kind of data
declare -i variable3           creates the variable as an integer holder; the default data type for variables is string; this makes it easier to do math with the variable's contents
declare -r variable4=crimson   creates the variable as one whose value is not allowed to change (read-only); you probably want to initialize the variable at the same time as the declaration (or before it!)
declare -x variable5           exports the variable to the environment

If you enter the declare command by itself, it returns a list of all your current variables, as well as their contents. If you enter declare with one or more options, you get a list of all variables with those options turned on (as well as their contents).

Are you ready for the weird part? If you need to turn an option off, you issue the command that turned it on again, but this time you use a plus sign instead of the dash/hyphen/minus sign. Really? You use the minus sign to add a feature, and a plus sign to remove it? And note the barely mentioned warning on page 306: you cannot remove the read-only attribute.
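
A short sketch of the attributes and the plus/minus toggle in action; the variable names are made up, and declare -p is just a convenient way to print a variable with its attributes so you can check the result:
    declare -i count         # count now only accepts integers
    count=7+3                # with the integer attribute, the right side is evaluated: count is 10
    declare +i count         # the plus sign removes the integer attribute
    declare -r limit=100     # read-only from here on; limit cannot be changed or unset
    declare -p count limit   # shows each variable with its attributes and value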

The text goes on for a few pages about variables used by the system. Take a look at the list on page 314. Note that all the keywords, as the text calls them, are spelled in all capital letters. Let this be an indication to you that your variables should be spelled in lower case.

Let's stop here for a bit, and continue with page 346 next time. For now, we need a little logic for this week's shell script. To write the program in assignment 2 below, you need to know how to do a basic loop, a test, and input collection from the user.

  1. For the input collection, use the read command, as I recommended last week, to collect the user input in a variable.
  2. For the loop, I recommend a while structure, which may look like this:
    while [ test goes here ]
    do
        command
        command
    done
  3. Remember to ask for another input and to collect it inside the loop, congratulate the user outside the loop, and only run the loop if the first guess is wrong.
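
Here is a minimal sketch of how those pieces fit together; the secret word, the prompts, and the variable names are all my own choices for illustration:
    #!/bin/bash
    secret=blue
    read -p "Guess the color: " guess           # collect the first guess
    while [ "$guess" != "$secret" ]             # run the loop only while the guess is wrong
    do
        read -p "Wrong. Guess again: " guess    # ask for and collect another guess inside the loop
    done
    echo "Congratulations, $secret was the answer!"   # congratulate the user outside the loop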