LUX 211 - Shell Programming
Lesson 2: Chapter 4, The Filesystem; Chapter 5, The Shell
This lesson introduces the concepts from Chapters 4 and 5
of the text. Objectives important to this lesson:
- File system characteristics
- File system management
- Using ACLs
- Hard links and symbolic links
- Command-line syntax
- How the shell interprets a command
- Redirection and pipes
- Background commands
- Stand-alone and built in utilities
Despite the author's desire to type "filesystem" in each of his books,
I believe we can turn that into two words without losing any meaning.
Doing so will also make our spelling checkers happier. The author spends
two pages reminding us that a file system is typically explained and shown
as a hierarchy that starts at a root
and grows branches (containers,
folders, directories), which can contain other
branches (more containers, folders, directories), any of which can also
contain leaf objects (files).
The author also reminds us that the list of all containers
that we pass through from the root
to any given container or leaf
object is the path or pathname
to that object. Remember that a pathname that starts
at the root is an absolute pathname,
and that a relative pathname starts
at some point other than the root
of the file system.
A pathname, for example, may start with a tilde
(~) and a slash, which means that it starts at the current user's home
directory, not the current directory and not the root. Page 89
shows us this usage and a more cryptic one. If the tilde
is immediately followed by a user name,
that means the path starts at the home directory of that
user. A pathname may also use a single
dot, which stands for the path from the root
to the current directory. A double
dot stands for the path from the root to the parent
of the current directory. (Unlike children, directories never have more
than one parent, so the double dot is unambiguous.)
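As a rough illustration of the pathname shortcuts above, the sketch below builds a throwaway directory tree and shows what the single dot, the double dot, and the tilde resolve to. The directory names projects and notes are made up for the example, and it assumes a POSIX-style shell with mktemp available.

```shell
# Build a scratch tree; "projects" and "notes" are hypothetical names.
top=$(cd "$(mktemp -d)" && pwd -P)
mkdir -p "$top/projects/notes"
cd "$top/projects/notes" || exit 1

here=$(cd . && pwd -P)       # single dot: the current directory
parent=$(cd .. && pwd -P)    # double dot: the parent of the current directory
home_dir=$(cd ~ && pwd -P)   # tilde: the current user's home directory

echo "absolute path of .  : $here"
echo "absolute path of .. : $parent"
echo "absolute path of ~  : $home_dir"
```

Each shortcut is a relative way of naming a place that also has an absolute pathname, which is what pwd prints.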
On pages 84 and 85, the text gives
us a short lesson on filenames.
It makes some recommendations we should follow:
- directories need names and
they should follow the same rules as filenames
- use characters that can be entered from the keyboard
- make the filename relevant
to the file's content
- respect the length restrictions
of legacy file systems when you must interface with them
- DOS systems used 8.3: up to 8 characters, followed by a dot, followed
by up to three characters as an extension
- some UNIX systems were limited to 14 characters
- some Macintosh systems were limited to 31 characters
- pick a case or a rule
about case and stick with it, because some file systems ignore case,
some see upper and lower case as different letters, and some store upper
and lower case but ignore the difference when matching names
- don't use spaces in filenames,
because of the way you have to handle the spaces in Linux commands (see
page 85) and because they violate the rules about complying with the
Americans with Disabilities Act, as in item 1.1 in this checklist:
https://www.hhs.gov/web/section-508/making-files-accessible/checklist/multimeda/index.html
- use hidden filenames with
care; in Linux a filename that begins with a dot
is hidden from the ls command
unless the -a switch is used
with it, which means "show all"
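Two of the recommendations above are easy to see in action. This is only a sketch in a scratch directory, with made-up filenames: it shows a dot file being left out of a plain ls, and a filename with a space needing quotes. It uses ls -A, which behaves like -a but leaves out the . and .. entries, so the counts are easier to read.

```shell
# Work in a scratch directory so nothing real is touched.
dir=$(mktemp -d)
cd "$dir" || exit 1

touch report.txt          # an ordinary filename
touch .settings           # leading dot: hidden from a plain ls
touch "my notes.txt"      # a space forces quoting every time the name is used

plain_count=$(( $(ls | wc -l) ))     # the hidden file is not listed
all_count=$(( $(ls -A | wc -l) ))    # -A lists hidden files too, minus . and ..

echo "ls shows $plain_count names, ls -A shows $all_count"

# Without the quotes the shell would pass two arguments, "my" and "notes.txt".
ls -l "my notes.txt"
```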
Starting on page 90, the text
reviews commands to create, remove,
and navigate directories, as well
as the more flexible mv and rm
commands. Note the discussion of the mv command that should make you eager
to use a mouse to move files and folders.
Pages 96 and 97 present a long
list of directories that are compliant with the Linux Filesystem Hierarchy
Standard (FHS), a standard whose
pedigree the book traces through three other standards committees with
long impressive names. (Truly, it must be worthy.) In short, these are
directories whose locations we can hope to be where the text says they
will be in current Linux releases. (Your mileage may vary. Check before
Page 98 reviews using the ls
-l command to view the
permissions that are assigned
to a file (or directory). Page 99
begins a discussion about using the chmod
command to change permissions.
This should be familiar to you. If it is not, review it and learn it,
because it is about to get harder.
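If you want a quick refresher, here is a small sketch that sets permissions on a scratch file both ways, with a three digit number and with the symbolic notation, and reads them back with ls -l. The variable names are just for the example.

```shell
f=$(mktemp)    # scratch file

chmod 750 "$f"                       # owner rwx, group r-x, others nothing
numeric=$(ls -l "$f" | cut -c1-10)   # first ten characters: file type plus permissions

chmod o+x "$f"                       # symbolic form: add execute for others (now 751)
symbolic=$(ls -l "$f" | cut -c1-10)

echo "after chmod 750: $numeric"
echo "after chmod o+x: $symbolic"
rm -f "$f"
```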
Page 101 introduces a new kind
of permission, but it does not introduce the topic in a sensible way.
I showed you how to set permissions on a file with a three
digit number like 750 or 751. You can also set a more important permission
with a four digit number, or with
a +s if you are using that notation
for chmod. The text presents two versions of the technique: setuid
and setgid. It may be easier to
understand with an example of
why you would do so.
When you use the passwd command,
you change your password on the system. That doesn't sound like much,
but the truth is that in order to change a password, you have to access
a folder on the system that only root
has the right to read. Well, you aren't root, so how can that work? It
works through running the process
(passwd) with the rights of the owner
of the process, who happens to be root in this case. At some point, the
passwd command was altered with a command like one of these two variations:
chmod 4755 passwd
chmod u+s passwd
Either of these commands would set the permissions on the passwd command
to be -rwsr-xr-x. Note the s
instead of an x in the set of
permissions for the user who owns
the file. In this case, the owner still gets execute rights to the file,
and so does the group, and so do others. What the 4 in the first example,
or the s in the second example, does is to change the file itself in a
special way. Normally, when you run a file that has the x
permission for you, the process you start affects other files and
objects on the system. It can only affect those other objects if you
have the right to do so. When you run a process that has the s
permission set for its owner, that process runs
with the permissions of the owner instead of your permissions.
That means that anything the command you started needs to do, it does
with all the rights associated with the user who owns
that command. And when the owner of the process is root, that means the
process can do anything at all that the programmer wanted done.
That's how we turn on the setuid
permission, and why we need it. If we wanted to enable a process to use
the rights of the group associated
with it instead of the owner, we would turn on the setgid
permission with a command like one of these two variations:
chmod 2755 passwd
chmod g+s passwd
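You should not experiment on the real passwd command, so here is a sketch that sets the setuid permission on a scratch file instead and shows the s appearing in the owner's permissions. It assumes you own the file, which is the case for a file made with mktemp.

```shell
f=$(mktemp)    # scratch file standing in for a real program

chmod 4755 "$f"                        # 4 = setuid, 755 = rwxr-xr-x
setuid_numeric=$(ls -l "$f" | cut -c1-10)

chmod 755 "$f"                         # clear the special bit again
chmod u+s "$f"                         # symbolic way to set setuid
setuid_symbolic=$(ls -l "$f" | cut -c1-10)

echo "chmod 4755 gives $setuid_numeric"
echo "chmod u+s  gives $setuid_symbolic"
rm -f "$f"
```

The setgid variations work the same way, except that the s shows up in the group's execute position instead of the owner's.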
This discussion could go on, but your text does not discuss the other
related points. Let there be some mysteries we will not explore yet.
On page 103, the text reminds
us that the standard r, w,
and x permissions mean something
different when applied to a directory.
- The read permission means
you are allowed to see what is in
the directory. You can use the ls command to do so.
- The write permission for a
directory means you are allowed to create objects in it.
- The execute permission means
that you are allowed to use the change
directory (cd) command to make it your current
directory. If you only have
the execute permission, you can see the permissions
set for the directory with ls -l, but you must also use the -d
switch (ls -dl). Even then, you will only see data about the directory,
not about its contents. If you use a ls command to look for a specific
file in a directory, you will only see it if you have rights to the
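The ls -dl usage mentioned above can be tried on a scratch directory. This sketch, with made-up names, gives group and others only the execute permission on a directory and then asks ls about the directory itself rather than its contents. Keep in mind that root bypasses these restrictions, so the access rules themselves are best tested from an ordinary account.

```shell
top=$(mktemp -d)
mkdir "$top/vault"
touch "$top/vault/secret.txt"
chmod 711 "$top/vault"     # owner rwx; group and others execute only

# -d reports on the directory itself instead of listing what is inside it.
dir_perms=$(ls -ld "$top/vault" | cut -c1-10)
echo "$dir_perms"
```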
Page 104 begins a discussion
about ACLs, Access Control Lists.
The text mentions that ACLs provide finer control over permissions, but
they present a larger burden on the system since there is more to process.
ACL support can be added to a file system when it is mounted by using
the acl switch, but the default
is not to add it (no_acl). The text recommends not enabling ACLs on file
systems that contain system files. In Linux, you can set (modify) the
rules of a file's ACL with the setfacl
command, and you can view the ACL of a file with the getfacl
command. If the text seems a little unapproachable to you, try reading
an article written by a system administrator.
Each ACL (every file can have one) should contain a list of rules that
refer to the usual user, group, and other, and may contain rules for specific
users and/or specific groups. (There are no specific others,
so you won't see an ACL rule for a named other.) You can also set an effective
rights mask that will override the settings you make for individual
users and groups. (Yes, that seems illogical to me as well.)
Assume we have a file named ACLexample. We can set basic permissions
with the chmod command:
chmod 644 ACLexample
If we ask to see the ACL of this file (without a header), we should do
it like this, and expect this kind of output:
getfacl --omit-header ACLexample
user::rw-
group::r--
other::r--
The user line above has no specific user name
between its two colons, which means that line is for the user who owns
this file. The same thing is true for the group
line: it applies to the owner's group.
If we run the command "ls -l ACLexample",
we should see a plus sign (+) at the end of the permissions list, which
means that the file has an ACL. If we do not see a plus sign at the end
of the permissions, there is no ACL for that file.
To add a line to the ACL for
a user named bob
with full permissions, we can do it like this:
setfacl -m u:bob:rwx ACLexample
That command would modify the ACL to look like this
The four lines above refer to the rights of the owner of the file, the
user named bob, the owner's group, and others. The text shows us on page
107 that we can add lines for multiple
users at once. We just need to separate the user permission phrases with commas:
setfacl -m u:bob:rwx,u:tom:rw- ACLexample
The text also mentions that you can apply a setfacl command to multiple
files at once by listing more than one file as the target of the
command.
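Here is a cautious sketch of the setfacl and getfacl steps on a scratch file. Since a user named bob probably does not exist on your system, it uses your own numeric user ID as a stand-in for a named user. It also assumes nothing about your setup: if the ACL tools are missing, or the file system was mounted without ACL support, it just reports that and moves on.

```shell
f=$(mktemp)
chmod 644 "$f"
uid=$(id -u)                    # stand-in for a named user such as bob

if ! command -v setfacl >/dev/null 2>&1 || ! command -v getfacl >/dev/null 2>&1; then
    acl_status="tools-missing"
elif setfacl -m "u:$uid:rwx" "$f" 2>/dev/null; then
    acl_status="set"
    # -n prints numeric IDs; --omit-header drops the comment lines.
    getfacl -n --omit-header "$f" 2>/dev/null
    ls -l "$f"                  # the permissions should now end with a plus sign
else
    acl_status="unsupported"
fi
echo "ACL status: $acl_status"
rm -f "$f"
```

When the ACL is set, the getfacl output also includes a mask line, which is the effective rights mask mentioned earlier.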
Turn to page 109 to begin the
section on links, which the text tells us are pointers.
Pointers typically hold a memory location, but in this case a link holds
the address of a location on a hard drive. Filenames in directories are
pointers in this respect, although they technically point to the inode
of that file. In order to share a link to a file with someone else, the
text gives us a procedure to follow:
- Grant the read and write
permissions for the file to
the new user. (Use a setfacl command to do so.)
- Make sure the user has read,
write, and execute
permissions for the file's folder.
(Use another setfacl.)
- Use the ln utility as described in the text to make a hard
link or a soft (symbolic) link.
A hard link points to the same inode
that the directory entry for the file points to. They both point to
a place on a hard drive.
A soft/symbolic link holds a path
to a file, so it is resolved each time it is used. A soft link can point
to a file in another file system, but a hard link cannot.
- If you are making a soft link (as the book suggests you always should),
use an absolute pathname to your target file. Soft links that are made
with relative pathnames will break if they are moved from a folder at
one level to a folder at another level in the tree.
If you remove a file, you should probably remove all links to it as well.
If you are only removing a file temporarily, planning to replace it later,
soft links will still work after the file is replaced, but hard links
will not, since the file will probably have the same pathname, but it
will be located in a different block on the hard drive.
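The difference between the two kinds of links is easier to believe when you watch it happen. This sketch, in a scratch directory with made-up filenames, makes one hard link and one soft link to a file, then removes the file and puts a replacement at the same pathname.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1

echo "old" > original.txt
ln original.txt hard.txt                 # hard link: a second name for the same inode
ln -s "$dir/original.txt" soft.txt       # soft link: stores an absolute pathname

inode_original=$(ls -i original.txt | awk '{print $1}')
inode_hard=$(ls -i hard.txt | awk '{print $1}')

# Remove the file and create a replacement with the same pathname.
rm original.txt
echo "new" > original.txt

via_hard=$(cat hard.txt)   # still the old data; hard.txt keeps the old inode alive
via_soft=$(cat soft.txt)   # the path is resolved again, so it finds the replacement
echo "hard link reads: $via_hard, soft link reads: $via_soft"
```

Note that the hard link does not fail outright. It still opens, but it reaches the old data rather than the replacement, which is rarely what you wanted.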
The chapter ends with several pages on esoteric information about links
that will only be confusing at this point. Let's move on to the next chapter.
Chapter 5 begins with a discussion of command syntax. Confusingly,
the author points out that when he uses the word command,
he may mean what you type on the command
line, or he may mean the actual
program that runs when your typed command is processed. Both definitions
are correct. Isn't English fun? On page 126, he starts with an example
of a simple command.
Syntax means grammar,
the rules of putting words together so they make sense. Command syntax
means the specific grammar that is required by whichever command we are
using.
- The syntax for a simple command
begins with the name of the command,
- followed by any options (switches)
we want to use,
- followed by any arguments
that may be required,
- with spaces between tokens.
The text tells us that a token
is a sequence of non-blank characters, sometimes called a word. Calling
it a word helps maintain the similarity to a discussion of the grammar
of a regular human language. Taking that approach for a moment:
- the command name is the verb
of the sentence, telling the computer what to do
- the options are the adverbs
of the sentence, telling the command how to do it
- the arguments may be the objects
of the sentence, the things on which the verb performs its actions
There are dependencies that happen for some arguments. If you choose
particular options, associated arguments may be required, which makes
writing a script that performs this way more challenging. The text points
out that options often begin with a dash
or a double dash, but filenames
typically do not, which leads to the advice that you should never give
files names that start with any number of dashes. You don't want the command
to think that an argument is actually an option, or vice versa. When a
script/command/program runs, it has to parse
the command line. This means that it must receive parameters, correctly
interpret what to do with them, then run, ask for input, present an error
message, or just fail. The text remarks that the script/command/utility
must do its own error trapping and interpreting, that the shell has no
way to do that for us. This should be obvious, if you think about it,
so think about it.
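To see how the shell breaks a command line into tokens before the command ever runs, try a sketch like this one. The function count_args is a made-up helper that only reports how many arguments it was handed. The second half shows why a filename that starts with a dash is trouble, using a scratch directory.

```shell
# A tiny helper (hypothetical name) that reports how many arguments it received.
count_args() { echo "$#"; }

two=$(count_args my notes.txt)      # two tokens, so two arguments
one=$(count_args "my notes.txt")    # quotes make the space part of one token
echo "unquoted: $two argument(s), quoted: $one argument(s)"

# A filename that starts with a dash looks like an option to most commands.
dir=$(mktemp -d)
cd "$dir" || exit 1
echo "data" > ./-l                  # the ./ prefix keeps the dash from leading the token
dash_content=$(cat ./-l)            # a bare -l would be read by cat as an option
echo "$dash_content"
```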
When we want to execute a script, we must give ourselves (and others?)
the execute permission for the file, but that is not the only thing that
would stand in your way in your assignments. The text offers some tips
on page 132 that apply to what we might think of as housekeeping:
- Set appropriate execute permissions for the script.
- Either place the script file in one of the folders that is listed
in the PATH variable, or amend the PATH variable like this:
PATH=$PATH:.
This command would take the current list of paths in the PATH variable
($PATH), and append a colon and a dot (:.), where the dot stands for your
current directory. This assumes that your current directory is where your script
is stored.
- Consider starting the script with something like this:
./script_name
The initial dot stands for the path to the current directory, and the
slash serves as a separator between that path and the name of your script.
This gives the shell all the information it will need to find the script,
so the paths in PATH are not consulted.
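The housekeeping tips above can be walked through in a scratch directory. In this sketch, lux_demo is a made-up script name, and the script body is a single echo. It assumes the scratch directory allows files to be executed, which is normally true.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1

# Create a one-line script; "lux_demo" is a made-up name for this example.
printf '#!/bin/sh\necho "hello from script"\n' > lux_demo
chmod u+x lux_demo                    # without this the shell refuses to run it

by_path_prefix=$(./lux_demo)          # explicit path, so PATH is not consulted

PATH=$PATH:.                          # append the current directory to the search list
by_name=$(lux_demo)                   # now found through PATH

echo "$by_path_prefix"
echo "$by_name"
```

The change to PATH lasts only for the current shell session, so you would have to repeat it (or put it in a startup file) the next time you log in.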
The text also points out that a command line might not begin with the
command itself. That is true, but it is standard practice to do so. The
example at the bottom of page 131 is less helpful than it is remarkable
as a badly written command line that will still work. It does give us
a good excuse to consider the standard
input, standard output,
and standard error streams.
For any hardware setup, there is an assumed standard input or output
device for each of these streams of data. The actual hardware that is
installed affects the assumption, as does the nature of the program being
run. How a data stream is sent to a device may surprise you. Page 134
tells us that Linux treats most devices like files.
We should know that driver files for devices are typically stored in subdirectories
of the /dev directory, and that
when we send data to the operating system, bound for a device, the OS
passes the data through the driver, which acts like a filter that sends
output to the intended device. As far as the operating system is concerned,
it is just writing to a file.
The text reviews redirection
in this part of the chapter. It reminds us that the output redirector
(>) can cause a file to be overwritten, if the target file already
exists. Redirection operators typically pass input from a file to a process,
or from a process to a file.
Operator     Effect
>            sends output to specified target, creating or overwriting it
<            takes input from specified source
>>           sends output to specified target, creating or appending to it
noclobber    when turned on, attempting to overwrite with redirection will cause an error
|            the pipe character is used to pass output from one process directly to another process as its input
In the rare case in which a process creates output
that we want to ignore (or delete),
we can redirect that output to /dev/null,
which the text refers to as a bit bucket
or data sink. Data sent to this
location is not saved.
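Here is a short sketch that exercises each of the redirection features reviewed above in a scratch directory. The filename log.txt is made up. The noclobber test runs in a subshell (the parentheses) so the setting does not stay turned on afterward.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1

echo "first"  >  log.txt        # > creates the file (or overwrites it)
echo "second" >  log.txt        # overwritten: only "second" remains
echo "third"  >> log.txt        # >> appends instead
line_count=$(( $(wc -l < log.txt) ))   # < feeds the file to wc as standard input

# With noclobber on, > refuses to overwrite an existing file.
if ( set -o noclobber; echo "fourth" > log.txt ) 2>/dev/null; then
    clobber_blocked=no
else
    clobber_blocked=yes
fi

echo "noise we do not want" > /dev/null   # sent to the bit bucket, never saved
contents=$(cat log.txt)
echo "$line_count lines; overwrite blocked: $clobber_blocked"
```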
On page 141, the text reviews how a pipe
character works on a command line. You should be aware that it takes output
from one process and passes it as input to another process. Using this
operator removes the need to write to a file, then read from a file, which
you would have to do if you only used the "less than" and "greater than"
redirection operators. The text refers to a command line that uses a pipe
operator as a pipeline.
Page 145 shows an example of using a pipeline
with three processes. First the user runs the who
command to find out which users are logged in. The output of who is passed
with a pipe to a new process, tee,
which passes output to two locations:
to a location you specify (often a temporary file), and to standard output.
The pipeline being demonstrated then passes the standard output of tee
through another pipe, which flows to a grep
process that looks for a specific user name. If the user name is found
in the output from who, that line appears on the screen. The advantage
of using tee in this example is that it can save a copy of its input in
case you need to examine that body of data again.
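The same three-stage pipeline can be sketched without depending on who is actually logged in. In the version below, printf stands in for who, and the user names and terminal names are invented so the result is predictable. The file who.out is a made-up name for the copy that tee saves.

```shell
dir=$(mktemp -d)

# printf stands in for who here, with made-up user names.
match=$(printf 'alice tty1\nbob tty2\ncarol tty3\n' | tee "$dir/who.out" | grep bob)

saved_lines=$(( $(wc -l < "$dir/who.out") ))
echo "grep found: $match"
echo "tee saved $saved_lines lines in $dir/who.out"
```

Only the matching line reaches the screen, but the saved copy holds everything that passed through tee.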
On page 146, we review using an ampersand (&) to move a process into
the background. The previous text did not mention benefits of doing this.
Our current text says that we can take advantage of multitasking with
background processes. The foreground can only run one process at a time,
which is fine for processes that don't take very long to run. The background
can hold several running processes, which may speed up your script a good
deal.
The text illustrates calling a process and sending it to the background
at the bottom of page 146. Note that the system generates
two numbers that appear on the screen. The first is a job
number, shown in square brackets. The shell assigns a job number to any
command line, pipeline or not, that is sent to the background.
The second number is the Process
ID of the last process
in the pipeline (in bash). Both are useful if you want to manipulate a process in
the background.
- The job number can be used to bring a process from the background
to the foreground. The command to do so is simply fg,
if there is only one process in the background, but it is fg job_number
or %job_number, if there are multiple jobs in the background.
If you don't know the job number, enter the jobs command to see
a list of running jobs, including their numbers and the command lines
that started them. There is an example on page 148.
- The process ID can be used with the kill command to
stop a process, whether it is running in the foreground or the background.
The text shows a sensible way to find the process number you need on
page 147. If we know the name of the command that is running, we can
enter ps | grep command_name, which should produce one
line of output for that command, and that one line will start with the
process ID. The syntax for the kill command is just kill process_ID.
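Inside a script, the fg and jobs commands are less useful than they are at the keyboard, so this sketch sticks to the process ID. It starts a harmless sleep in the background, picks up its process ID from the shell's $! variable instead of searching ps output, and then stops it with kill.

```shell
sleep 30 &                 # the trailing & sends the command to the background
pid=$!                     # $! holds the process ID of the most recent background command

# kill -0 sends no signal; it only checks whether the process exists.
if kill -0 "$pid" 2>/dev/null; then alive_before=yes; else alive_before=no; fi

kill "$pid"                        # the default signal asks the process to terminate
wait "$pid" 2>/dev/null || true    # collect the ended process so it is fully gone

if kill -0 "$pid" 2>/dev/null; then alive_after=yes; else alive_after=no; fi
echo "running before kill: $alive_before, after kill: $alive_after"
```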
The chapter makes another odd jump to a new topic on page 148 where it
spends several pages discussing the use of wildcard characters and lists
of characters enclosed in square brackets. If you need review on this
material, read it over. This should not really be new to you in this class.
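If you do want a quick review, this sketch creates a handful of made-up filenames in a scratch directory and counts how many names each kind of pattern matches.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
touch a1 a2 b1 b2 c1

set -- a*        ; star_count=$#       # * matches any run of characters: a1 a2
set -- ?1        ; question_count=$#   # ? matches exactly one character: a1 b1 c1
set -- [ab]2     ; bracket_count=$#    # [ab] matches one character from the list: a2 b2

echo "a*: $star_count, ?1: $question_count, [ab]2: $bracket_count"
```

The shell expands each pattern into a list of matching filenames before the command runs, which is why set sees ordinary arguments here.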