ITS 2310 - Linux I

Chapter 3, Mastering Editors

Objectives:

This lesson discusses editing files in UNIX and Linux. Objectives important to this lesson:

  1. Review of redirection and introduction to streams
  2. UNIX and Linux files
  3. Types of editors
  4. Using the vi editor
  5. Using the Emacs editor

Concepts:

Last time we began our discussion of the Linux file system. The Linux concept of files includes files, directories, and the standard data streams that we can refer to as stdin, stdout and stderr. Operating system commands refer to these files with numbers, a series of integers called file descriptors. The numbers for the standard files are:

  • stdin - 0
  • stdout - 1
  • stderr - 2

Input and output can be redirected from their usual paths by using the redirection operators. When using the greater than and less than signs, think of data as flowing toward the point of the arrow head.

A single greater than sign sends output to the designated file destructively. That is, it creates the file if it does not exist, and it destroys the file's contents if any exist already. In order to preserve the present contents of the destination file, use a double greater than sign, which means to append.

In Linux it is possible to concatenate several files into one destination file. Simply enter the command like this:

	cat file1 file2 > file3

This would concatenate the contents of file1 and file2 and place them into file3. In order to preserve and append to an existing file3, you would use the double greater than sign.
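As a minimal sketch of the difference between the single and double greater than signs, the commands below build two small files and combine them. The scratch directory and the file contents are made up for the illustration; only the file names come from the example above.

```shell
# Work in a scratch directory so no real files are overwritten.
cd "$(mktemp -d)"

printf 'one\n' > file1      # > creates file1, or empties it first if it exists
printf 'two\n' > file2

cat file1 file2 > file3     # file3 now holds the line "one", then "two"
cat file1 >> file3          # >> appends: file3 gains a third line, "one"

cat file3                   # show the result on the screen
```

Running the first cat command a second time with a single greater than sign would throw away the appended line and leave only two lines in file3.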

To redirect input, we can use the less than sign. The book illustrates this with the sort command, which sorts the lines of a file. To sort a file of data, we might use redirection to feed it a file like this:

	sort < filename

This would result in the sorted output being sent to the standard output device (the screen). To save it in a file, we might use both types of redirection:

	sort < file_to_sort > sorted_file
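Here is a short sketch of that command at work. The three fruit names are invented sample data; the file names match the example above.

```shell
# Work in a scratch directory so no real files are overwritten.
cd "$(mktemp -d)"

# Create an unsorted file, one item per line.
printf 'pear\napple\nmango\n' > file_to_sort

# sort reads the file on stdin and writes the sorted lines to sorted_file.
sort < file_to_sort > sorted_file

cat sorted_file
```

The input and output names must be different. The shell empties the output file before sort starts, so naming the same file on both sides would leave sort with nothing to read.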

There is also a double less than sign, used to set a marker at which to stop reading input. If used with cat, it would look like this:

	cat << stop

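it means to read the lines that follow the command as the standard input stream until encountering the word "stop" on a line by itself. This construct is called a here document. If used like this:

	cat << "Magic Word"

it means to cat each following line onto the screen until encountering a line that contains only the text "Magic Word". The quotes are used to mark the complete phrase as the stop value, not just the first word, and the line containing it and only it is not put on the screen, nor is any subsequent line. Note that cat is given no file name here. If a file name is added (as in cat input_file << "Magic Word"), cat displays that entire file and ignores the lines supplied on standard input.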
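The sketch below shows the double less than sign inside a script, where the input lines follow the command directly. The file names note.txt and note2.txt and the sample lines are made up for the illustration, and the output is sent to files only so that it can be displayed afterward.

```shell
# Work in a scratch directory so no real files are overwritten.
cd "$(mktemp -d)"

# cat reads the next lines as its standard input, up to the line "stop".
cat << stop > note.txt
first line
second line
stop

# A quoted marker may contain a space; the whole phrase ends the input.
cat << "Magic Word" > note2.txt
before the marker
Magic Word

cat note.txt     # the two lines typed before "stop"
cat note2.txt    # the one line typed before "Magic Word"
```

In both cases the marker line itself is consumed by the shell and never reaches cat.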

The pipe is used to link two commands, taking the output of the command on the left of the pipe and sending it as input to the command on the right of the pipe. Example:

	ls -s | sort -n

This command illustrates the -s switch for ls. It means to print the size of each file, measured in blocks, in front of its name. The -n switch of the sort command means to sort numerically, as opposed to alphabetically, which is the default. Together, the two commands list the files in order of size. The use of the pipe is recommended when you do not wish to create a file that holds the output of the first command. It sends the text to the second command directly.

The tee command is like an additional option for the pipe. It allows you to take the output from a pipe and send it to a file as well as to the screen. In the example:

	ls -s | sort -n | tee filename

we are reading a directory list that shows file sizes, sorting it numerically, and sending the output of the sort to the designated file and to the screen.
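A small sketch of the pipe and tee working together follows. The two text files and the variable name out are invented for the illustration. The saved copy is written outside the directory being listed so that it cannot show up in its own listing.

```shell
# Work in a scratch directory that holds two small sample files.
cd "$(mktemp -d)"
printf 'a little text\n' > a.txt
printf 'some more text here\n' > b.txt

# A file elsewhere to receive the saved copy of the listing.
out=$(mktemp)

# ls -s puts each file's size in blocks before its name, sort -n orders
# the lines by that number, and tee shows them while also saving them.
ls -s | sort -n | tee "$out"
```

The same lines that appear on the screen can be read back later with cat "$out".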

File descriptor numbers can be used with the redirection symbols to choose which standard stream is sent to a file instead of its usual device. For instance:

	cc newstuff.c 2>errors

This would compile a program from a source code file named newstuff.c. If errors are generated, they are written to stderr (file descriptor 2), so they would be redirected to a file called errors. The video below discusses data streams in more detail.
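Because a C compiler may not be installed on every system, the sketch below swaps in ls as the command that produces an error. This is a substitution made for the illustration, not the book's example. The names file1, no_such_file, listing, and errors are made up.

```shell
# Work in a scratch directory with one real file in it.
cd "$(mktemp -d)"
printf 'hello\n' > file1

# ls writes the name it found to stdout (descriptor 1) and its complaint
# about the missing file to stderr (descriptor 2). Each stream is sent to
# its own file. ls also reports failure, which the echo announces.
ls file1 no_such_file > listing 2> errors || echo "ls reported a problem"

cat listing    # the normal output
cat errors     # the error message (exact wording varies by system)
```

Nothing from ls reaches the screen directly, because both of its output streams were redirected.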


Chapter 3 begins with a reminder that files can be collections of binary code. Well, they all are, really. Some are meant to be readable and editable by common mortals, not just by programmers with integrated development environment tools. Sometimes, binary code stands for machine language, such as the contents of compiled executable files. Sometimes the files are just plain text, and this chapter wants to show us some tools for creating and editing text files. In a computer system, we typically represent all numerals and characters as items in a standard numbered list. Once such a list is created, each numeral and character can be represented simply by its number, which is then converted to a symbol that can be printed on paper or sent to a screen for a user to read.

I think the author has gone a little bonkers on page 112. He probably should have said that some text files are written in ASCII, and some are not. ASCII stands for American Standard Code for Information Interchange. Files written in ASCII (pronounced like ask-key) may be written in just the original 128 characters defined by seven bits of a byte, or in one of the extended ASCII codes that use eight bits. See the table of defined characters on this web page. ASCII was not created by our friends at IEEE. It was developed by the American Standards Association, the organization now known as ANSI, the American National Standards Institute.

The text mentions another character coding system, Unicode, which uses two or four bytes for each character. Another system is the one that was used for years on mainframes. It was called EBCDIC. We can think of the concept as the evolution of character representation:

  • EBCDIC - used on IBM mainframes in the 1960s (8 bits per symbol)
  • ASCII (IA5) - used on personal computers since the 1970s (7 bits per symbol); American Standard Code for Information Interchange is very close to International Alphabet 5: follow the two links for this bullet to note the insignificant differences, and note that there are other similar code sets for different countries
  • Unicode - created in the late 1980s and early 1990s by a consortium of vendors to represent more characters (16 bits per symbol); later it was expanded to use four bytes for some characters, so it has the capacity to store the characters for all current languages and some dead languages; Unicode includes ASCII as the first 128 characters in its system
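To see the numbers behind the characters, the sketch below uses the od command to print the decimal value of each byte in a piece of text. The second line assumes UTF-8, a common way of storing Unicode that the text does not discuss; the octal escapes \303\251 are the two UTF-8 bytes for the letter e with an acute accent.

```shell
# A capital A is stored as a single byte whose value is its ASCII number.
printf 'A' | od -An -tu1            # one value: 65

# An accented e is outside ASCII; in UTF-8 it takes two bytes.
printf '\303\251' | od -An -tu1     # two values: 195 169
```

The exact spacing of the numbers that od prints can differ from one system to another.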

So, an ASCII file is intelligible to an editor that understands Unicode, which means that more systems will be able to read it. UNIX has been around since ASCII was developed, so it features several available file editors that can read and write such files. The chapter introduces us to two text/ASCII editors: vi and Emacs. Both are what the text refers to as screen editors. To understand this point, you need to know that editors can be sorted into two categories:

  • Line editors - these apply changes to one line of text at a time
  • Screen editors - these allow a user to move the cursor about on the page and change anything in any order

The main reason for mentioning this is to console us that vi and Emacs are screen editors, and more user friendly than their ancestors. (The earlier ones must have been a real pain.) Other editors exist, and you may want to consider any of them as alternatives. See this short review of several editors that can be added to most UNIX/Linux systems.

vi is called vi because it is meant to be more visually oriented than older editors. As the text explains, it immediately displays changes that you make. (Rather what we would hope for, isn't it? Other editors did not display their changes to their users. You may wish to gasp in horror.)  The book then tells us that vi has three modes, each of which is used for a different purpose. These modes are called by other names in other documentation, which can lead to a lack of understanding when looking for supplemental training material.

  • insert mode - When vi first starts, you should be in command mode (see below). You can enter the insert mode by typing a lower case i. This is the mode in which you do most of your composing and editing. Key presses made in this mode are assumed to be text, until you press the Escape key to return to command mode.
  • command mode - You can enter this mode from the mode above by pressing the Escape key. This mode allows you to issue commands to vi, like save, delete, and quit. Any key presses made in this mode will be treated as commands to the program.
  • ex mode - You enter this mode by first entering command mode, then pressing the colon key. You can issue commands in this mode that are more complex than those available in command mode. The text explains that this mode emulates the operation of another editor, called ex, which I feel sure you have never used or heard of.

The text presents instructions on a series of topics about using vi (and vim, the version of vi available in Fedora, RHEL, and SUSE Linux).

  • starting vi - Open a command line and enter vi, or vi path-to-file-to-edit
  • saving and naming a new file - If vi is running, and you have made a new file, press Esc to enter command mode, then enter :w filename
  • inserting text - From command mode, enter insert mode by pressing i. Then, enter text as needed.
  • repeating a change - Assuming you are in insert mode and have done some text entry, press Esc to enter command mode, then press the period key. Your last change will be repeated.
  • moving the cursor - Note that the topic is not "moving the mouse pointer", since vi does not understand a mouse. Page 117 provides a chart of key presses that result in movement of the onscreen cursor through a text file. This is necessary when you want to change the cursor's position, so that you can edit at a different point in that file. This is the kind of quick reference you would want to have as a card on your desk when using vi.
  • deleting text - Page 118 provides a short list of command mode commands to perform deletions. The text recommends using them in conjunction with cursor movement commands when deleting large sections of the document being edited.
    • x - deletes the character under the cursor
    • dd - yes, press d twice. Deletes the line the cursor is on. (Note: this command actually cuts the text to a buffer.) To paste it back, position your cursor, and press P to paste above the current line or p to paste below the current line.
    • dw - deletes one word, assuming the cursor is at the beginning of a word, or from the cursor position to the next end of a word
    • d$ - deletes from the cursor position to the end of the current line (compare to dd)
    • d0 - deletes from the cursor position to the start of the current line
  • undo a command - enter command mode and type the letter u
  • searching - You may forward search in vi using the / key, and backward search using the ? key. Press the key for the direction you wish to search from the current cursor position, then type the text to search for, then press Enter. The cursor moves to the first occurrence of the search text, and may be moved to the next occurrence of it by pressing the letter n. This is how to search for an exact, literal string of characters.
    The text goes on to explain that you can search for more complex strings. The chart on page 119 shows the use of search modifiers that can follow the forward slash.
    • /xxx\> - Forward search for the next word that ends in xxx (\> matches the end of a word)
    • /\<xxx - Forward search for the next word that begins with xxx (\< matches the start of a word)
    • . - Use the period as a wildcard for any single character in the position specified in your search string. If you can't remember whether you are searching for principal or principle, you could search with /princip.. (two periods, one for each uncertain letter)
    • [] - These are the square brackets, and they match any one of the characters inside the brackets, at the position specified in your search string. For example, if you are searching for affect and effect, search with /[ae]ffect
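The period and square bracket patterns can be tried outside vi as well. The sketch below uses grep, which is a different tool swapped in for the illustration, but it treats these two patterns the same way. The file words.txt and its contents are made up.

```shell
# Work in a scratch directory with a small sample file.
cd "$(mktemp -d)"
printf 'the principal reason\na matter of principle\nside effect\nit will affect us\nprincess\n' > words.txt

# Two periods stand for any two characters after "princip".
grep 'princip..' words.txt     # finds the principal and principle lines

# The brackets match exactly one character, either a or e.
grep '[ae]ffect' words.txt     # finds the effect and affect lines
```

The line containing princess is not found by either search, because it has no "princip" in it.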

The text finally gives us a reason to know about ex mode on page 119 with the explanation of the Search and Replace command. As noted above, press Esc to enter command mode, then : to enter ex mode. The text explains that the commands we have seen up to this point are screen oriented commands, which take place at the cursor position. An ex mode command is a line oriented command, which will find all lines in the file that meet the command's requirements and make the specified changes. This is not very intuitive terminology, but that is nothing new in UNIX/Linux.

To execute a search and replace, first go to command mode. Then enter a command of this form, where START is the line number where the search will start and END is the line number where it will stop:

:START,ENDs/search-for-this/replace-with-this/g

In the example in the text, the command is :1,$s/insure/ensure/g

1 means line 1. $ means the last line of the file. s means to substitute, insure and ensure are the search and replace strings, and g (global) means to replace every occurrence on each line in the range, not just the first occurrence on each line.
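The sed command accepts the same range and s/search/replace/ syntax, so it offers a quick way to watch the effect of the g at the end without opening vi. Using sed here is a substitution made for the illustration, and the file policy.txt and its contents are made up.

```shell
# Work in a scratch directory with a small sample file.
cd "$(mktemp -d)"
printf 'insure the car, insure the house\nplease insure it\n' > policy.txt

# With g: every occurrence on every line from 1 to the last line changes.
sed '1,$s/insure/ensure/g' policy.txt

# Without g: only the first occurrence on each line changes.
sed '1,$s/insure/ensure/' policy.txt
```

Unlike the vi command, sed prints the changed text to the screen and leaves policy.txt itself untouched.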

Take a deep breath, let it out slowly, and think of something that would make you happy. Do this for a minute. I will wait here. If you have taken a break and need some convincing that rational people in the 21st century would actually choose to use this program, consider this article by a possible convert who learned a few reasons to appreciate vi.

Page 120 continues the lesson with advice about saving your work and exiting vi.

  • To save without exiting vi, enter command mode, then enter :w, which will write (save) the current file.
  • To save and immediately exit vi, enter command mode, then enter :wq, which will write (save) the current file, then quit vi. This can also be done with the command ZZ, and the command :x.
  • If you want to close vi without saving your changes, use the command :q!. This is not usually one's intention, but it is possible you may want to do so some time.

Copying from one file into another is one version of doing a copy and paste operation. In vi, it is not as simple as you are used to. In vi, you open the file (call it the destination file) that is to receive text from another file (call it the source file). Enter command mode, and use the command :r filename. In this command, the filename is the name of your source file. The contents of the source file are inserted below the line the cursor is on.

On page 122, the text describes cutting, copying, and pasting in the same file.

  • dd - deletes one line of text
  • 5dd - deletes five lines of text
  • yy - copies one line of text
  • 5yy - copies five lines of text
  • p - pastes whatever is currently in the copy/cut buffer below the current line

Some people are passionate about their chosen text editor, mainly because it has features that they use a lot. This video is about the vim editor, an improved version of vi. The presenter likes it very much, and that is really the point. If you use something because it does what you need it to do, you will probably like it, or you will find something you like better. And you will only do that if you have a reason to use a text editor.


The text changes topics on page 123, beginning its discussion of Emacs. We are told that unlike vi, Emacs has no modes, and that commands are often issued in it with Alt or Ctrl key combinations. The tutorial for Emacs that is included with it discusses using these key combinations to navigate the screen, and to edit a document. A chart of these keyboard commands appears on pages 124 and 125. This list of commands, while valid, is misleading. Read ahead a bit and you will find that Emacs has a GUI interface that "runs under X Windows", which means that it will partially understand a mouse under whatever GUI your system might be running. Emacs is like a cross between a character based word processor and a GUI based one.

The text only discusses Emacs for a few pages. It makes more sense to me to mention that distributions of Fedora may come with LibreOffice installed. LibreOffice is a free office suite that works a lot like Microsoft Office. It is much more familiar to users and you will be much more productive using it. You will not have permission to install this suite on every Linux machine you may ever use, so don't depend on having it, even though it is free to download, install, and use. Learn vi or Vim and use them when necessary. Learn LibreOffice and use it when you can. Or learn something you don't hate.