ITS 2310 - Linux I

Shell Scripting Review (continued)


This lesson continues the review of shell scripting. Objectives important to this lesson:

  1. read vs. tr
  2. wc
  3. expr
  4. local variables
  5. aliases
The course lecture for this week continues examination of more scripting commands.

One of the first assignments I had as a programmer was to scan a data stream to identify what kind of stream it was. If I could identify it, I could parse out the data and present it on a computer screen. It was good that the streams in question all came from a mainframe, and they were the result of transactions that had been written by different programmers. That meant that the streams were all probably different from each other in some detectable way. I had to detect the nature of the stream because I would not know what transaction had been started.

That relates to the first topic for this week. When you need to parse out data for display or processing, it helps to know what to look for.

The first problem this week looks for breaks in the data stream in two similar but different ways. The read command typically reads STDIN until the user presses the Enter key. Go over the article I gave you a link for in the last sentence. No, really do it. I am referring to an example in it in the discussion below.

The article explains some of the ways you can use and modify the default behavior of read. Consider some of the goodies in example 3.1 on that site:

  • IFS - The Internal Field Separator defines which character(s) can separate words in the data stream we are using. It can be set for a specific line if necessary. When you are reading data from ordinary text files, the default setting works well: any whitespace character (space, tab, or newline). When you are reading from a comma-delimited file, like a CSV, you need to set IFS=','. To set it back to the default, you can enter IFS=$' \t\n' (or unset IFS), and read will once again expect that any whitespace character can separate words. In this example, IFS is set on the same line as the read command, so the change applies only to that command and is forgotten as soon as it finishes. Nice.
  • exec - this opens a filehandle to a named file to read it (note the <)
    exec {file_descriptor}<"./file.csv"
    At the end of the code module, he/she closes the file with a redirection operator, pointing to &- which can be read as "file close"
    exec {file_descriptor}>&-
  • declare - The author creates an array with a declare statement.
    declare -a input_array
  • The author opens a while loop, and sets the value of IFS to a comma. He/She then reads each line from the file descriptor (-u) into an array (-a), and while doing so, outputs the first and third fields of each line in the array.
    while IFS="," read -a input_array -u $file_descriptor
    do
                 echo "${input_array[0]},${input_array[2]}"
    done
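
The pieces above can be put together in one short script. This is only a sketch for experimenting, not the article's own code: the temporary file, its sample rows, and the variable names fd, fields, and output are all made up for illustration. It needs bash (version 4.1 or later for the {fd} syntax).

```shell
#!/usr/bin/env bash
# Hypothetical sample data: the file and its contents are assumptions.
csv_file=$(mktemp)
printf '%s\n' 'alice,30,denver' 'bob,25,austin' > "$csv_file"

# Open the file for reading; bash picks a free descriptor and stores its number in fd.
exec {fd}<"$csv_file"

declare -a fields
output=""

# IFS="," applies only to this read command, so the shell's own IFS is untouched.
while IFS="," read -r -a fields -u "$fd"
do
    # Collect the first and third fields of each line.
    output+="${fields[0]},${fields[2]}"$'\n'
done

# Close the descriptor and remove the temporary file.
exec {fd}<&-
rm -f "$csv_file"

printf '%s' "$output"
```

After the loop ends, try echoing "$IFS" through od -c to confirm that it still holds the default space, tab, and newline.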

The use of an array to hold words gives us one method to access them. Another way is shown in our text. The question shows us a use of the tr command, putting a newline character on the screen in place of every space in the input. Does this do the same thing? Test it and see.
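
One way to run that test is to split the same line both ways and compare the results. The sketch below is not the code from the text; the sample sentence and the variable names are assumptions chosen for illustration, and the read -a and <<< features require bash.

```shell
#!/usr/bin/env bash
line="the quick brown fox"

# Method 1: tr replaces every space with a newline, giving one word per line.
tr_words=$(printf '%s\n' "$line" | tr ' ' '\n')

# Method 2: read -a splits the line on IFS (whitespace by default) into an array.
read -r -a words <<< "$line"
read_words=$(printf '%s\n' "${words[@]}")

printf 'tr:\n%s\n' "$tr_words"
printf 'read:\n%s\n' "$read_words"

if [ "$tr_words" = "$read_words" ]; then
    echo "Same result for this input"
fi
```

For single spaces the two agree. Try putting two spaces between words: tr turns each space into its own newline, so a blank line appears, while read treats a run of whitespace as one separator.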

You should also have an idea what the wc command does, since it can be used several different ways. Consult Help in the Linux version of your choice, and you may find there is a syntax you don't know.
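
As a starting point, here is a small sketch of the three most common wc options. The temporary file and its two lines of text are made up for the example; reading from STDIN with < keeps the file name out of the output.

```shell
#!/usr/bin/env bash
# Hypothetical sample file with two lines of text.
sample=$(mktemp)
printf 'one two three\nfour five\n' > "$sample"

lines=$(wc -l < "$sample")   # count newline characters
words=$(wc -w < "$sample")   # count whitespace-separated words
bytes=$(wc -c < "$sample")   # count bytes

echo "lines=$lines words=$words bytes=$bytes"
rm -f "$sample"
```

Running wc with a file name argument instead of < prints the name after the count, which matters if you plan to use the number in a calculation.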

The second problem for this week uses the expr command. Among other things, it can tell the shell to consider a string as a number. You need to do this from time to time for validation of data. If you need to do math with a value in a variable, the math may not happen if that value is treated as a string instead of as a number. Note that expr itself works only with integers, not floating point numbers. Problem 2 asks you to test two lines of code, and to walk through them, explaining what happens when they execute. This is good practice for troubleshooting more complicated programs. The video below describes the general use of expr.
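
The sketch below shows expr doing arithmetic and being used as a simple integer check. These are not the two lines from Problem 2; the values and the helper function is_integer are hypothetical. The check relies on the exit status of expr: 0 or 1 for a valid expression, 2 or higher when an operand is not an integer.

```shell
#!/usr/bin/env bash
value="42"

# Arithmetic: operators and operands must be separate arguments.
sum=$(expr "$value" + 8)
# The * must be escaped so the shell does not expand it as a wildcard.
product=$(expr "$value" \* 2)

# Hypothetical helper: succeeds only if its argument is an integer.
is_integer() {
    expr "$1" + 0 >/dev/null 2>&1
    # Status 0 or 1 means expr accepted the operand; 2 or more means it did not.
    [ $? -lt 2 ]
}

echo "sum=$sum product=$product"
for candidate in 42 0 abc; do
    if is_integer "$candidate"; then
        echo "$candidate is an integer"
    else
        echo "$candidate is not an integer"
    fi
done
```

Note that a result of 0 makes expr exit with status 1 even though nothing went wrong, which is why the helper tests for a status below 2 rather than for success.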

The third problem is easier to deal with if you remember the word local. By default, all variables in a shell script are local to the shell that runs the script. What does that mean? The question to research here is what must be done to make a variable available somewhere else. There is more than one answer.
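
To see what local means in practice, the sketch below tries two possible answers: export and sourcing. It assumes the problem is about making a value visible outside the shell that set it. The variable names and the temporary script are made up for illustration.

```shell
#!/usr/bin/env bash
course_greeting="hello"    # ordinary variable: known only to this shell

# A child shell does not see it...
child_before=$(sh -c 'printf "%s" "${course_greeting:-unset}"')

# ...until the variable is exported into the environment.
export course_greeting
child_after=$(sh -c 'printf "%s" "${course_greeting:-unset}"')

# Hypothetical one-line script that sets a variable.
script=$(mktemp)
echo 'set_by_script="from script"' > "$script"

# Running the script starts a child shell, so the variable is lost when it ends.
sh "$script"
after_run="${set_by_script:-unset}"

# Sourcing the script with . runs it in the current shell, so the variable stays.
. "$script"
after_source="${set_by_script:-unset}"
rm -f "$script"

echo "child before export: $child_before"
echo "child after export:  $child_after"
echo "after running:       $after_run"
echo "after sourcing:      $after_source"
```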

The fourth problem is really just troubleshooting. What does the alias command do in general? What does it do in the example in the text? How does that affect your experience when you test it?
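
For the general behavior, here is a minimal sketch. It is not the example from the text; the alias name greet is hypothetical. One detail that often trips people up when testing: bash expands aliases in interactive sessions but not in scripts, unless the expand_aliases option is turned on.

```shell
#!/usr/bin/env bash
# bash ignores aliases in scripts unless this option is set.
# In shells without shopt, the command fails quietly and is skipped.
shopt -s expand_aliases 2>/dev/null || true

# Define an alias: the name greet now stands for the command on the right.
alias greet='echo hello'

# Use the alias on a later line, after the definition has been read.
greet_out=$(greet)
echo "greet printed: $greet_out"

# Remove the alias again.
unalias greet
if ! alias greet >/dev/null 2>&1; then
    echo "greet is no longer defined"
fi
```

Typing alias with no arguments at a prompt lists every alias currently defined, which is a quick first step when a command does not behave the way you expect.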