ITS 2310 - Linux I


Chapter 9, Perl and CGI Programming

Objectives:

This lesson discusses the Perl language. Objectives important to this lesson:

  1. Perl scripting
  2. Perl data types
  3. Using files in Perl scripts
  4. Using Perl and CGI on a web page

Concepts:

As such things go, Perl is a better acronym than some we have seen. It stands for Practical Extraction and Report Language. Versions of it are available for UNIX, Linux, Windows, and OS X. The text tells us that a Perl interpreter is built into several Linux distros. We would expect to find the interpreter saved as /usr/bin/perl.

The description of features in the text makes Perl sound like as good a scripting language as Bash, so the question is why should we use Perl instead?

The first example in the text looks a lot like C.

#!/usr/bin/perl
# Program name: example1.pl
print("This is a simple\n");
print("Perl program.\n");
$name = "Charlie";
print ("Greetings, $name\n");

  • This example begins with a shebang, like the scripts you have seen, but it calls the Perl interpreter, not Bash.
  • The second line is a remark/comment declaring the name of the script file.
  • The third and fourth lines call a function called print, and pass a string to the function. Each of those strings ends with a \n, which is a standard control character that tells the print function to go to a new line. It is standard to put the arguments you pass to a function inside parentheses, but the author tells us on the next page that they are optional.
  • The third and fourth lines are executable statements, so they end with semicolons. This is also a characteristic of C.
    The next two lines are taken from the author's second example.
  • Line five above assigns the string Charlie as the value of a variable called $name. The $ at the front of the name is a sigil, a character that marks the variable's type. Here it means that this variable is a scalar variable: it holds a single value. Most variables are of this type.
  • In line six, the print function is called, and the string to print is enclosed in double quotation marks again. The text tells us (on another page) that we could have used single quotes instead, but the result is different. If you use single quotes, Perl prints the characters exactly as they are typed in the string, so $name appears as the text $name and \n appears as a backslash followed by the letter n. If you use double quotes, Perl substitutes the values of variables for their names and turns control characters such as \n into the actions they stand for. A list of control characters appears on pages 464 and 465.
    Take a moment to breathe. This is similar to what we saw in Bash scripts.
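The difference between the two kinds of quotes is easier to see side by side. The following is a minimal sketch, not one of the author's examples; the variable names $double and $single are made up for illustration, and the use strict, use warnings, and my keywords are additions that the text's examples do not use.

```perl
#!/usr/bin/perl
# Sketch: single quotes versus double quotes.
# $double and $single are hypothetical names, not from the text.
use strict;
use warnings;

my $name = "Charlie";

# Double quotes: $name is replaced by its value and \n becomes a newline.
my $double = "Greetings, $name\n";

# Single quotes: the characters are kept exactly as typed.
my $single = 'Greetings, $name\n';

print $double;
print $single, "\n";   # add a real newline so the next output starts on a fresh line
```

The first print shows the greeting with the name filled in. The second shows the dollar sign, the variable name, and the backslash-n as literal characters.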

In another example the author shows us a way to get input from the user:

#!/usr/bin/perl
# Program name: example3.pl
print ("Enter a number: ");
$number = <STDIN>;
print ("You entered $number\n");

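  • In the first line, there is a shebang, and the program file name appears in the second line.
  • The first print call is to prompt the user for input.
  • The new variable, $number, is loaded with whatever the user types, up to and including the newline produced when the user presses the Enter key. That trailing newline is why the output shows a blank line after the number. Perl's chomp function removes it.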
  • The second print call tells the user what was typed.
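As a hedged sketch of how the input step might be tidied up, the script below reads one line and removes the trailing newline with chomp. The helper name read_entry is hypothetical, and the sketch reads from an in-memory filehandle that stands in for the keyboard so that it runs without waiting for anyone to type.

```perl
#!/usr/bin/perl
# Sketch: read one line of input and strip the trailing newline.
# read_entry is a hypothetical helper, not something from the text.
use strict;
use warnings;

sub read_entry {
    my ($fh) = @_;
    my $line = <$fh>;              # reads up to and including the newline
    return unless defined $line;   # end of input: nothing was typed
    chomp($line);                  # remove the trailing newline, if present
    return $line;
}

# Simulate a user typing 42 and pressing Enter (assumed input for the demo).
my $typed = "42\n";
open(my $fake_input, '<', \$typed) or die "Cannot open in-memory input: $!";
my $number = read_entry($fake_input);
close($fake_input);

print "You entered $number\n";
```

In a real script the call would be read_entry(\*STDIN), which reads from the keyboard in the same way the author's example does.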

The text continues developing the script, adding an if-else structure on page 460. Page 461 compares numeric and string comparison test operators (numeric operators are math symbols such as == and <, string operators are two-letter abbreviations such as eq and lt). Math operation symbols are the ones you should expect, shown on page 462.
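The choice between the two operator families matters, because the same pair of values can compare differently as numbers and as strings. This is a small sketch under that assumption, not the author's script; the variable names are invented for the illustration.

```perl
#!/usr/bin/perl
# Sketch: numeric versus string comparison in an if-else structure.
# All variable names here are hypothetical.
use strict;
use warnings;

my $first  = "10";
my $second = "9";

# Numeric test: > compares the values as numbers, so 10 is greater than 9.
my $numeric_result;
if ($first > $second) {
    $numeric_result = "$first is greater than $second";
} else {
    $numeric_result = "$first is not greater than $second";
}

# String test: gt compares character by character, and "1" comes before "9".
my $string_result;
if ($first gt $second) {
    $string_result = "$first sorts after $second";
} else {
    $string_result = "$first sorts before $second";
}

print "Numeric: $numeric_result\n";
print "String:  $string_result\n";
```

Using a numeric operator on numbers and a string operator on text avoids this kind of surprise.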

On page 463, the text begins discussing data and variable types. The kind of variable is indicated by the sigil at the start of its name, so it is visible every time the variable is used. Perl does not ask you to reserve memory for a variable; it allocates space as values are assigned.

  • scalar - variables that can hold numbers or strings
  • numeric - can be signed integer (positive or negative) or floating point real numbers; can also be hex numbers if preceded by 0x, or octal numbers if preceded by a leading 0.
  • string - a series of alphanumeric characters
  • array - shown on page 466, typically treated as an array of scalar variables, accessed by offset subscript notation

That last one gets a little heavy, so let's listen to the Urban Penguin for a bit. WARNING: he uses the word "ampersand", which means this character: &. That is not what he means at all. The sigil for an array is @, which is usually just called the at sign. In fact, he misunderstands the use of the word sigil, which means a flag denoting a variable's type. Cut him some slack, he is doing his best.
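The data types listed above can be sketched in a few lines. The values and names below are invented for illustration and do not come from the text. Note that a single element of an array is a scalar, so it is written with the $ sigil and a subscript.

```perl
#!/usr/bin/perl
# Sketch: scalars holding numbers and strings, plus an array.
# All names and values are hypothetical.
use strict;
use warnings;

my $count = -12;          # signed integer
my $price = 3.75;         # floating point number
my $hex   = 0x1F;         # hexadecimal literal, leading 0x
my $octal = 017;          # octal literal, leading 0
my $label = "ITS 2310";   # string

# An array uses the @ sigil; its elements are scalars.
my @students = ("Ann", "Bob", "Cho");

# Subscripts are offsets from the start, so the first element is 0.
my $first_student = $students[0];
my $how_many      = scalar(@students);   # number of elements

print "Hex 0x1F is $hex, octal 017 is $octal\n";
print "First of $how_many students: $first_student\n";
```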


We will skip ahead to page 471, and the discussion of file operations. The text introduces the term filehandle, which means a label that stands for a link to a file on a local or network drive. This is not the same as a hypertext link. It is a link that interfaces with the operating system of your computer to access a file on its hardware. We are told that STDIN, STDOUT, and STDERR are all filehandles, which should not be surprising since I told you that Linux thinks everything is a file, which includes those streams. The author's first example is to open a file whose name is passed to the script on the command line. This example is more general:

#!/usr/bin/perl
# Program name: perlread2.pl
# Purpose: Open disk file. Read and display the records
# in the file. Count the number of records in
# the file.
open (FILEIN, "students") || warn "Could not open students file\n";
while (<FILEIN>)
 { print "$_"; ++$line_count; }
print ("File \"students\" has $line_count lines. \n");
close (FILEIN);

  • This time there are three remark lines to describe what the script will do. The first command tells Perl to open a file, that the handle for the file will be FILEIN, and that the file itself is called students. This instruction is followed by two pipe characters which stand for the logical word OR. The instruction on the right side of the || will not be carried out unless the instruction on the left fails.
  • The author then opens a while loop whose control section is <FILEIN>, the filehandle wrapped in angle brackets. Each time the condition is tested, Perl reads the next line of the file. Because no variable is named to receive the line, Perl stores it in the "default variable", $_, and the print call in the body of the loop prints it. An empty control section would not do this: Perl would treat it as always true and loop forever without reading anything.
  • Still on the line in the body of the while loop, the script then increments the value of a variable called $line_count. The first pass through the loop, the variable has no value, and it is incremented to 1. Each time the loop is run, the ++ operator increments the value. The loop stops running when there are no more lines to read in the file.
  • The script prints again, this time using the control/escape code \" every time the author wants to put a quotation mark on the screen. If the author had not put a backslash to the left of each quotation mark that needed to be printed, the script would have thought that the string to print was closed, as it does when it sees the final quotation mark.
  • Finally, the script closes the file by passing the filehandle to the close function.
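The same read-and-count loop can be written in a style that is common in newer Perl code. This is a sketch under stated assumptions, not the author's method: it uses the three-argument form of open, a filehandle stored in a scalar variable instead of the bare word FILEIN, and die instead of warn. So that it runs anywhere, it first writes a small stand-in for the students file using the core File::Temp module; the file contents are made up.

```perl
#!/usr/bin/perl
# Sketch: open a file, print each record, and count the records.
# The sample data and variable names are hypothetical.
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a temporary stand-in for the "students" file.
my ($out, $path) = tempfile(UNLINK => 1);
print {$out} "Ann\nBob\nCho\n";
close($out) or die "Could not close $path: $!";

# Three-argument open: filehandle, mode, file name.
open(my $in, '<', $path) or die "Could not open $path: $!";

my $line_count = 0;
while (my $line = <$in>) {   # the loop ends when no lines remain
    print $line;
    $line_count++;
}
close($in);

print "File has $line_count lines.\n";
```

To read a file named on the command line, as in the author's first example, $path could be replaced with $ARGV[0]. Starting $line_count at 0 also means an empty file reports 0 lines instead of printing nothing.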

This video takes a different approach to running the loop, but the intent is the same.


The video below is long, but the author promises to teach you a lot about Perl in an hour. If you have not gotten the idea yet, give him a try.


The text gives you a quick introduction to HTML, which is totally inadequate unless you speak HTML already, in which case it is unnecessary. Similarly, the discussion of CGI and Perl as used on web pages is inadequate. The author provides some references for research, but they are a bit dated. Searching on the web, I find no videos or references to Perl and Common Gateway Interface that are not old and outdated. That makes this a good place to stop the lesson.