LUX 205 - Introduction to LINUX/UNIX

Chapter 6, Introduction to Shell Script Programming

Objectives:

This lesson discusses creating a shell script to serve a particular purpose, using common standards for writing programs. Objectives important to this lesson:

  1. The program development life cycle
  2. Comparing shells with regard to writing scripts
  3. Shell variables, operators, and wildcards
  4. Logical structures
  5. Creating a menu
  6. Debugging
  7. Customizing the environment
  8. The trap command
  9. Creating a menu based application

Concepts:

The chapter begins with a few pages of foreshadowing, letting you know that you will be creating another application that will use variables, operators, and control structures. If this sounds mysterious, it actually is at this point in the chapter. All will be discussed in more detail in the pages that follow.

On page 273, the text begins a general discussion of program development. The image on page 264 is a representation of several steps in one version of a program development cycle. The cycle is also frequently called a system development life cycle. There are multiple versions, some having more steps, and most of them are depicted as circles, not vertical flows. Why circles? No matter how good a system is, it should be reexamined from time to time, to determine if it needs updating, changing, improving, or replacing. Take a look at the images behind the link I just gave you. It leads to a Google search for images on the subject. What none of them seem to show is that you should check your results at each stage, and be ready to roll back to a previous stage in the cycle if a current problem requires redoing that earlier stage.

In a short list, we might look at the cycle like this:

  1. What does the requester/user need the system to do?
  2. Let's make a plan to meet those requirements.
  3. Let's write the code. In this case, let's write a script file that we can test.
  4. Test the script, and if it does not make the requester happy, roll back to an earlier step to fix it. If the requester is happy, move to the next step.
  5. Deliver the system, test again, and set a schedule to get feedback from the users.
  6. Periodically, evaluate the system for the need to start at step 1 again.

The text moves on to consider some languages commonly used by programmers. COBOL is an old language, one of the first high level languages. It was designed by a committee, but it drew heavily on the earlier work of Grace Murray Hopper, whose career is too long and impressive to do justice here. Visual Basic is a Microsoft improvement of the BASIC computer language, and you should know by now that C was invented at Bell Labs. Programs written in languages like these begin their lives as source code files, which must go through several transformations before they are rendered into machine language executable files. The description in your text is a bit abbreviated. Take a look at the discussion on this page of my notes about building applications for a more detailed version. The point our author is making is that most programs you run are created this way. Shell scripts function in a different way.

A shell script works by calling (running) operating system utilities that are already compiled, just as you have been doing from the command line in each of the first four chapters in this text. The commands in the script are not compiled. They are run by a command line interpreter, just as they would have been had you entered each of them at a command prompt. This covers most programs you will ever use: they are either compiled (translated into machine language and stored) or interpreted (translated one line at a time each time they are run). It should be no surprise that interpreted programs take longer to run, since each line must be translated into machine language before it is executed.

The text explains on page 275 that, besides writing it and troubleshooting it, there is another necessary step that makes a script file a runnable program: setting permissions. The simplest way to make sure that everyone can run the file is to make sure that all three entity permissions are set to odd numbers. That guarantees that the third bit in each "binary number" is turned on. The text offers three ways to use the chmod command on page 276.

chmod ugo+x filename
chmod a+x filename
chmod 755 filename

Each example above uses a different notation.

  • The first adds the execute permission (+x) to the user who owns the file, to the group that user belongs to, and to everyone else (others) on the system.
  • The second example adds the execute permission (+x) to all users on the system. This is shorter than the first example, and it does the same thing, but we could have chosen to only use some of the letters (u, g, or o) in the first example.
  • The third example uses the numeric (octal) notation that stands for the rights being assigned to each of the three entities. This one is hard to get wrong, if you understand it at all. It is also the most explicit way to set or change rights to all three entities. All three digits are required with this method, but you know exactly what rights are assigned at the end of that command.
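
As a minimal sketch of the second and third notations, the commands below create a throwaway script in a temporary directory and show that both notations leave it with the same permissions. The file name hello.sh and the variable names are hypothetical, chosen only for illustration.

```shell
#!/bin/bash
# Hypothetical demo: make a throwaway script and give everyone execute rights.
demo_dir=$(mktemp -d)
demo_file="$demo_dir/hello.sh"
printf '#!/bin/bash\necho "hello"\n' > "$demo_file"

chmod 644 "$demo_file"    # start from rw-r--r-- (no execute bits)
chmod a+x "$demo_file"    # symbolic notation: add execute for all
perms_symbolic=$(ls -l "$demo_file" | cut -c1-10)

chmod 644 "$demo_file"    # reset, then try the numeric notation
chmod 755 "$demo_file"    # 7 = rwx for the owner, 5 = r-x for group and others
perms_numeric=$(ls -l "$demo_file" | cut -c1-10)

echo "symbolic: $perms_symbolic"
echo "numeric:  $perms_numeric"
```

Note that a+x only adds to whatever rights were already there, while 755 replaces all of them, which is why the numeric form leaves no doubt about the result.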

The text continues with a discussion that is a bit unclear. To understand it, you have to know what the PATH variable does on a system. PATH holds any number of pathnames to directories that are likely to hold executable files. In the example below,

  • I have started Fedora, opened a Terminal session, then entered PATH as a command. PATH is not a command, it is a variable holding search strings, so entering it like a command does no good.



  • I then entered $PATH, which effectively asks the operating system, "what is the value of the variable called PATH?". Notice that this variable holds several search strings. They are separated by colons. Each of these pathnames ends in either bin or sbin. Notice also that there is a bit of garbage at the end about no such file. Not helpful if you do not know that you can ignore it.
  • The proper way to ask for the value of the variable is to enter echo $PATH. This means to print the value of the variable to the screen. This is what I did in the last of the three examples in the image above.

However you learn what this variable holds, you will see that it holds paths to folders that it has been told to search for executable files when they are called from the command line. If you save an executable file to one of those folders, then type the file's name at a command prompt, the file should run. The problem here is that you may not have write permissions for any of those folders. This leads to three ways mentioned in the text to execute a script file you have written.

  • Modify the PATH variable to contain a search string for the folder that holds your file.
  • Move to the directory that holds your script, then precede the name of your script file on the command line with ./, which means to look for it in the current directory. Example: ./scriptname (This will seem odd to those of you who know this variable from Windows or DOS. No, this is not a mistake. Linux may not automatically look in the current folder.)
  • When you enter your command to run the script, prefix it with a relative or absolute pathname that will lead the operating system to your file. Note the special notation on page 276 that tells Linux to start in the home directory of the current user: the tilde (~).
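
The three approaches in the list above can be sketched as follows. This is only an illustration: the script name greet_demo and the temporary directory are hypothetical, and the PATH change is made inside a subshell so that it does not affect your session.

```shell
#!/bin/bash
# Hypothetical demo script named "greet_demo" in a temporary directory.
demo_dir=$(mktemp -d)
printf '#!/bin/bash\necho "greetings"\n' > "$demo_dir/greet_demo"
chmod 755 "$demo_dir/greet_demo"

# 1. Add the directory to PATH (inside a subshell, so the change is discarded).
out_path=$(PATH="$PATH:$demo_dir"; greet_demo)

# 2. Move to the directory and use ./ to name the current directory.
out_dot=$(cd "$demo_dir" && ./greet_demo)

# 3. Give a pathname that leads to the file (an absolute one here).
out_abs=$("$demo_dir/greet_demo")

echo "$out_path / $out_dot / $out_abs"
```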

The text suggests that you consider prototyping a program that you are working on. This really means to write a simplified version of the program first, show it to users, get useful feedback, and change the next iteration to include more features as well as corrections for any errors in the last iteration. The author points out that this can be done easily in a script, while doing so in a compiled language might take a lot more time and effort. When this is true, it will pay off by producing fast turnarounds between versions and easier improvements. Note, however, that you would still have to write the code for the program in a high level language, which introduces the risk of producing untested, error laden code.

The text returns to the idea of using comments as internal documentation of a file. This time we see an example on page 277 that shows one method for creating a comment. Placing a # in the first position on any given line makes that line a comment: the interpreter ignores everything else on that line. The text offers suggestions about documentation that may be useful, but its suggestions may not match your company's point of view.

We are advised, once again, that shell scripts can be written for any of the shells listed on page 278, but our author prefers the programming capacity of the Bash shell, so the examples in this text assume that Bash will be used to run them.

The text begins a longer discussion on page 279 about three types of variables. If you are not a programmer, you should still have encountered variables in math classes. A variable can be thought of as a named location in memory that is meant to hold some numeric or alphanumeric value. The three types of variable that the text introduces:

  • Configuration variables - their names are typically all upper case letters; these hold information about the operating system
  • Environment variables - their names are typically all upper case letters; these hold information about the user environment, such as the location of the user's HOME directory, and the present working directory (which is confusing because the variable is PWD, but so is the command to report what the variable holds)
  • Shell (script) variables - their names are typically all lower case letters; these variables are created in a script and they are assigned values that will be used while the script runs.

The text suggests two ways to view variables. The printenv command, issued without arguments, will present a list of current configuration and environment variables and their values, as shown in the illustration on page 280. Notice that we can run the command with a list of variable names as arguments, which will then present only the variables we asked for. The second way to see variables is to use the set command. This command will report the configuration and environment variables, as well as any script variables that are loaded in memory.
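
A small sketch of the difference between the two commands follows. The variable demo_color is hypothetical; because it is a plain shell variable that has not been exported, printenv should not report it, while set should.

```shell
#!/bin/bash
# printenv with an argument reports just that environment variable.
path_value=$(printenv PATH)

# demo_color is a hypothetical shell variable. It is not exported,
# so printenv does not see it, but set does.
demo_color=blue
from_printenv=$(printenv demo_color || echo "not found")
from_set=$(set | grep '^demo_color=')

echo "printenv: $from_printenv"
echo "set:      $from_set"
```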

The text presents a list of configuration and environment variables on pages 281 through 283. This list also shows the purpose of the variables and whether they can be changed or set by the user. Page 284 offers ten guidelines about naming and using shell variables. Run through these advisories and discuss any that are unclear to you in the discussion board this week.

The remainder of the chapter is a series of lessons to make you a more capable shell script programmer. On page 285, we learn some basic operators and operands in shell scripting. First, the equal sign is used two different ways in shell scripting. The difference between them is whether you put spaces around the equal sign. When we assign a value to a variable, the text says we are defining it. When using an equal sign to do this, the equal sign is an assignment operator, and you must not put spaces before or after it. For example:

name=Steve


This line in a shell script creates a variable called name, then assigns the string "Steve" to it. In this example, the text would call the equal sign the operator (it does the work) and would call name and Steve the operands. Operands are what the operator works with. Note that in the second example on page 285 the text encloses the string it is assigning in quotation marks. Why?

name="Steve Vincent"

This time the string includes a space. In the first example, the assignment operator would stop assigning characters to the string as soon as it sees a space, a tab, or a line return. To include these characters in a string assignment, we have to enclose the entire string in a pair of quotes. The quotes can be single or double quotes. It does not matter which kind of quotes you use, as long as you mark the beginning and end of a string with the same kind. (In another circumstance, it does matter whether you use single or double quotes. Cue foreboding music...)

Look carefully at the third example on page 285. If you are skimming the chapter, you might mistake the back quotes, also called accents grave, for single quotes. They are different and have a different purpose.

list=`ls`

Placing a pair of accents grave around the ls command means to run the ls command, capture the output, and assign that output to the new variable called list. Note the special nature of this idea: run the command, grab its output, and assign the output as the value of the variable.
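
To see the capture at work, here is a minimal sketch. The temporary directory and the file names alpha and beta are hypothetical. The second assignment uses the $( ) form, which Bash treats the same way as back quotes and which is harder to confuse with single quotes.

```shell
#!/bin/bash
# Hypothetical demo directory holding two empty files.
demo_dir=$(mktemp -d)
touch "$demo_dir/alpha" "$demo_dir/beta"

list=`ls "$demo_dir"`          # back quotes: run ls and capture its output
list_modern=$(ls "$demo_dir")  # $( ) is an equivalent form

echo "$list"
```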

This is important to know when you consider the information on page 286. The text calls the $ an evaluating operator, because when it occurs to the left of a variable's name, it means to read and use the contents of the variable. Usually. The examples there show what looks like the same command three ways. One of them is different.

echo $variablename
echo "some other words $variablename"
echo '$variablename'

In the first and second cases, the system would echo (print to the screen) the value of the variable called variablename. The value is whatever is stored in the variable. The reason the second case is shown to us is that we need to know that we could have other words echoed to the screen and that the value of the variable would still appear with them. Of course, those other words could have been put in a quoted string by themselves, but whatever...

In the third case, the command would put the literal string enclosed in the single quotes on the screen. The system would not report the value of the variable. This allows us to put instructions on a screen that include phrases like $variablename.
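
The three cases can be tried side by side. In this sketch the value blue is a hypothetical one, assigned only so that there is something to print.

```shell
#!/bin/bash
variablename="blue"   # hypothetical value for illustration

first=$(echo $variablename)                      # value of the variable
second=$(echo "some other words $variablename")  # double quotes: $ still works
third=$(echo '$variablename')                    # single quotes: literal text

echo "$first"
echo "$second"
echo "$third"
```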

Farther down page 286, we learn that we have not been told the whole truth about the equal sign yet. It has another use: a test of equality. When two phrases (math or otherwise) are meant to be compared, they are put on each side of an equal sign, but they must each be separated from the equal sign by a space.

variable1=variable2
variable1 = variable2

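The first line would assign the contents of variable2 to variable1. The second line would test whether the two variables were equal instead. These lines show only the spacing; in a working script, the contents of a variable are read with the $ operator, and the comparison is written inside a test such as the square brackets used with if later in the chapter.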
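
As a hedged sketch of how the two uses look in a working script, the example below compares two variables, assigns one to the other, then compares them again. The values apple and pear are hypothetical, and the if structure used here is explained later in the chapter.

```shell
#!/bin/bash
variable1=apple
variable2=pear

# Spaces around = inside [ ]: a test of equality.
if [ "$variable1" = "$variable2" ]
then
    before="equal"
else
    before="different"
fi

# No spaces: an assignment. The $ reads the contents of variable2.
variable1=$variable2

if [ "$variable1" = "$variable2" ]
then
    after="equal"
else
    after="different"
fi

echo "before: $before, after: $after"
```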

Despite the appearance of the table of examples on page 286, do not put spaces around math operators if you want the computer to do math. The screen shot on page 287 is more accurate. let x=6+4*2 is a phrase that would assign the value 14 to the variable x. Spaces are not wanted or needed in a phrase like that, except immediately after the command let. Some of you will ask why it assigns 14 instead of 20. You will ask that if you don't know Aunt Sally. Computers typically follow an order of operations that follows the mnemonic phrase "Please excuse my dear Aunt Sally".

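  • Please: Evaluate items in parentheses first.
  • excuse: Evaluate any exponents next.
  • my: Do any multiplication as the third step.
  • dear: Do any division as the fourth step.
  • Aunt: Do any addition as the fifth step.
  • Sally: Finally, do any subtraction.

Strictly speaking, multiplication and division share one level and are worked from left to right as they appear, and the same is true of addition and subtraction.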

Following Aunt Sally's order of operations, the computer would multiply 4 times 2, then add 6 to the result, then assign the answer to x. 14, right? By the way, let is a command that says to do math in the next phrase. Without the let, the shell would have treated 6+4*2 as a string of characters and assigned that string to the variable.
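
A short sketch makes the point concrete. It assumes Bash, since let is a Bash builtin. The expressions are quoted here, which the text's example does not do, so that the shell cannot mistake the * for a wildcard.

```shell
#!/bin/bash
# Quoting the expression keeps the shell from treating * as a wildcard.
let "x=6+4*2"      # multiplication first: 6+8
let "y=(6+4)*2"    # parentheses first: 10*2
z=6+4*2            # no let: z holds the text 6+4*2, not a number

echo "x=$x y=$y z=$z"
```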

Let's move on to page 289, where we are introduced to the export command. The export command overcomes a problem that may not be a problem for you. The "problem" is that variables created in a shell or a script are local to it, and they are not passed along to other scripts or programs that it starts. The export command marks a variable so that it is passed to those child processes. Note the variation of the set command on the previous page that can be used to automatically export all variables. This is a time saver if you need to use export frequently.
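
The effect of export can be checked with a small experiment. In this sketch both variable names are hypothetical, and sh -c is used to start a child shell in much the same way that running another script would.

```shell
#!/bin/bash
# Both names are hypothetical. Only the second is exported.
private_note="local only"
export shared_note="passed along"

# Each sh -c starts a child shell, much as running another script would.
child_private=$(sh -c 'echo "${private_note:-unset}"')
child_shared=$(sh -c 'echo "${shared_note:-unset}"')

echo "child sees private_note: $child_private"
echo "child sees shared_note:  $child_shared"
```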

Page 290 shows us a new wrinkle about the PATH variable. Remember that it holds a list of paths to folders that contain executable files? When you write a shell script, you want to be able to test it easily, so you may want to add a path to the current directory to the PATH variable. This is a quick way to do that:
PATH=$PATH:.
This reads the current value of the PATH variable, appends a colon to it, then appends the path to the current directory, and assigns all that to the PATH variable. The path to the current directory is symbolized by the period at the end of the command. The text cautions us that this is a temporary change to the value of PATH. It would revert to its customary value in your next Linux session.

The text reviews material we have seen about wildcard characters on page 291.

With about twelve pages to go, the text turns to lessons about programming. There are three classic structures used in any program: sequence, selection, and iteration. There are variations in each type, as you will gather from this material.

Sequential logic is a fancy way of saying that a script runs the first line, then the second, then the third, and so on until it ends. That is fine for simple scripts, but sometimes you want the user or the situation to determine that a particular set of statements (lines of instructions) does not need to be run, or another set of statements needs to run in a different order than the sequence that was originally intended. This creates the need for selection, which our text calls decision logic. Making selections typically involves a conditional operator like if in the example on pages 293 and 294. In this example, the decision structure asks the user for a response, compares the response to a known value, then takes one action if the comparison is true and another if it is false. The general structure would be like this:
if [ value1 = value2 ]
then
action to take if they match
else
action to take if they do not match
fi

If begins the structure, and fi ends it. The test condition is enclosed in square brackets, and it is evaluated as being either true or false. This structure would only have two results, based on the condition evaluation only having two possible outcomes. If more possibilities are needed, a different selection structure might be used, or more if structures could be nested in this structure.
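
Here is a minimal, runnable version of that structure. The variable answer is hard-coded to a hypothetical value so that the sketch runs without input; a real script would more likely get it from the user.

```shell
#!/bin/bash
# answer is hard-coded here; a real script might fill it with the read command.
answer="y"

if [ "$answer" = "y" ]
then
    message="You chose yes."
else
    message="You chose something else."
fi

echo "$message"
```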

For some reason, the text does not discuss the other structure it presents, the case structure, until page 298. There we learn that a case structure can present a variety of possible values for a variable, an action to take if one of them has occurred, and a default action to take if there was no match.
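
A case structure might be sketched like this. The menu choices and their actions are hypothetical, not taken from the text; the final pattern, *, matches anything and so serves as the default.

```shell
#!/bin/bash
choice="2"   # hypothetical menu selection

case "$choice" in
    1) action="add a record" ;;
    2) action="search the records" ;;
    3) action="quit" ;;
    *) action="unrecognized choice" ;;   # default when nothing else matches
esac

echo "$action"
```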

On page 295, the text begins a discussion of looping (iteration). If you have written looping code before, the examples in the text may look familiar to you, or they may seem very odd, depending on the syntax you are used to.

The text demonstrates a for loop and a while loop. The major thing that will determine which you want to use is how they run. A for loop will run a specific number of times, based on a known value, a variable's value, or a control string as shown in the first example. A while loop keeps running as long as its test condition is true and stops the first time the test is false. The programmer chooses the comparison, so the loop can be written to continue while two values match or while they do not. In the example on page 297, the user is asked to input a choice, then the loop runs if the choice does not match a control value. Inside the loop, the user is asked for another choice, and the loop continues if the new choice is still incorrect.
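
The two kinds of loop can be compared in a short sketch. The list of color names and the counter are hypothetical; they are not the examples from the text.

```shell
#!/bin/bash
# for loop: runs once for each word in the control list.
visited=""
for name in red green blue
do
    visited="$visited$name "
done

# while loop: repeats as long as the test is true.
count=0
while [ "$count" -lt 3 ]
do
    count=$((count + 1))
done

echo "visited: $visited"
echo "count: $count"
```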

Before the chapter turns you loose on the projects, it describes a few more tools:

  • tput - this command has a mysterious set of parameters that let you position the cursor on the screen, turn bold on and off, and clear the screen. This is meant to be useful when displaying a menu or other material on the user's screen.
  • sh - you can invoke a shell to run a script, but more than that, you can invoke it to check the syntax, display the lines of code, and help with debugging a script
  • trap - this command watches for events that may be a surprise to some of you: signals sent to the running script, such as the interrupt generated when the user presses Ctrl-C, as well as the script's own exit. You may be aware that error codes are issued when processes fail, but you may not know that programs can issue "return values" when they stop running, either successfully or unsuccessfully. Although error traps are most interesting, trapping successes can be nice as well. Note the list of common signal codes that trap may be set to monitor on pages 302 and 303.
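
Two of these tools can be tried without a terminal, so here is a hedged sketch of them. The file broken.sh and its contents are hypothetical. The -n option asks sh to read a script and report syntax errors without running it, and trap with EXIT names a command to run when the shell finishes.

```shell
#!/bin/bash
# A hypothetical script with a deliberate error: the closing fi is missing.
demo_dir=$(mktemp -d)
printf 'if [ 1 = 1 ]\nthen\necho ok\n' > "$demo_dir/broken.sh"

# sh -n reads the script and reports syntax errors without running it.
if sh -n "$demo_dir/broken.sh" 2>/dev/null
then
    syntax="ok"
else
    syntax="error"
fi

# trap ... EXIT runs the quoted command when the shell finishes.
trap_output=$(sh -c 'trap "echo cleaning up" EXIT; echo working')

echo "syntax check: $syntax"
echo "$trap_output"
```

When a script does run but misbehaves, sh -x scriptname prints each command as it is executed, which is often the quickest way to see where things go wrong.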
It is recommended that you practice the material in Projects 6-1 through 6-14 at the end of the chapter to become familiar with the syntax of these features and commands in your working environment.