CIS 303a - Computer Architecture

Chapter 10, Application Development

Objectives:

This lesson discusses application development and programming languages. Objectives important to this lesson:

  1. Application development processes
  2. Comparing programming languages
  3. Differences between assemblers, compilers, and interpreters
  4. Static and dynamic linking
  5. Application development tools

Concepts:
Chapter 10

This chapter starts an entirely new topic, which seems quite unrelated to the previous chapters. Applications are programs that do specific things like word processing, spreadsheet creation, or web browsing. Application development is the process of determining what a user needs an application to do, how to accomplish that in a program, and creating the program to meet the user's needs. The user in question may be the general public, or one or more people in a company who have a specific need for an application.

The text tries to explain that developing an application can be considered as two translations. This is better understood by looking at the figure on page 365.

  1. We begin with a user making a request for a program. In this example, the user wants a program that will process accounts receivable and accounts payable. This is what the text calls an abstract need. It is abstract because the user has not yet stated what the program must specifically do.
  2. One or more IT professionals, typically system analysts, will consult with the requester to determine the detailed steps the program needs to perform. A common technique is to start with what the requester wants as the output of the program, then to reverse engineer that output to determine what must be the inputs and what processes must take place that will produce the desired outputs. In this step, the analyst is determining what the program will do, but not how it will be done.
  3. The next step in development is to take the system requirements from the previous step, and to write computer code/commands that the program will execute. The text shows a series of steps that might be part of such a program on the far right of figure 10.2, which are actually representative of machine language, not the code a programmer might write. We will see more about this in a few pages.

So, in step 1 a request is generated. In step 2, analysts work with users to determine what a program would have to do to meet their needs. In step 3, analysts or programmers create a program. The text says that step 3 involves translating the user requirements from "natural language" to computer language. This is a bit misleading. When the author says natural language, he really means the language that the requester speaks, or the language that the analyst speaks. Sorry, but no human language is any more "natural" than any programming language. The distinction is more that step 2 generates a description of what the program must do, and step 3 creates the program. It is common for programming texts to refer to the user's spoken or written language as natural language, so you need to understand what that phrase means. It is also common to refer to the output of step 2 as pseudocode, a symbolic expression of what the program does, as opposed to the actual programming language that the programmer will write.
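
To make the difference between step 2 and step 3 more concrete, here is a small sketch in Python. The comments at the top play the role of pseudocode (what the program must do), and the function below them is the code a programmer might write (how it is done). The invoice layout and the name total_receivable are made up for this illustration; they do not come from the text.

```python
# Pseudocode from the analyst (the "what"):
#   for each customer invoice that is unpaid
#       add its amount to the total owed to us
#   report the total

# Program code from the programmer (the "how").
# The invoice layout and the name total_receivable are hypothetical.
def total_receivable(invoices):
    """Add up the amounts of all unpaid invoices."""
    total = 0
    for invoice in invoices:
        if not invoice["paid"]:
            total += invoice["amount"]
    return total


invoices = [
    {"customer": "A", "amount": 120, "paid": False},
    {"customer": "B", "amount": 75, "paid": True},
    {"customer": "C", "amount": 30, "paid": False},
]
print(total_receivable(invoices))  # prints 150
```

Notice that the pseudocode says nothing about loops, dictionaries, or variable names. Those are decisions made in step 3.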

The text refers to the System Development Life Cycle (SDLC), which it covered in chapter 1, a chapter that is not included in the plan for this class. A summary of the steps in the author's preferred version of the SDLC, the Unified Process, appears on page 366. It is the same summary that appears on page 3. If you do not recognize the basic steps in it, skim chapter 1 for a bit then come back here. What should you get out of the Unified Process diagram?

  • This model has six major steps:
    • business modeling - describe what is done by whom, in what order, in IT terms
    • requirements - what must the application do
    • design - how will the application work, and what is needed to make it work
    • implementation - build a version, an iteration, of the application
    • testing - make sure the application does what is desired, accurately
    • deployment - install the application, train users as needed, and transition from the old application, if there was one
  • The relative activity in each step is shown by the height of a curve, each step having its own curve. Note that business modeling (determining what the customer's business does) finishes early in our project, but the other steps can go on for the entire project.
  • The horizontal axis in this diagram shows a series of iterations. You should think of each iteration as an improved version of the project or the program we are creating. We do something, we check with the user to make sure it is right, and we change what is wrong, what was misunderstood, or what must now be different due to outside forces.

On page 366, the text talks about system developers using various models, tools, techniques, and processes to create systems. In this section, the text does not discuss the differences between those terms, or between any of the different models, tools, techniques, or processes that exist. In a few pages, it does discuss three types of tools.

Before we start that section, consider figure 10.5 on page 368. On the left side, it shows graphics for five requirements models, which are systematic ways of presenting what a client wants the system to do. In each example, one requirement is illustrated that flows from a customer placing an order to that order being shipped. None are too thrilling or surprising. Note that each illustration is more understandable once you know what it means. A lesson to take from this is that there are many ways to represent what a customer wants to happen in a system, and they are all good, if you understand them. You need to know which of the many modeling methods is to be used in the environment where you will work, so you can make acceptable and understandable documentation.

On the right side of that figure, we see five illustrations of details from design models. A design model depicts what the system being constructed will do. Note that the design model is less understandable if it is not paired with a requirement that makes it clear what is supposed to be happening. In a system development class, we typically study two or more popular models so you can recognize their notation and symbols, and so you can practice using them to become familiar with them. This chapter is an introduction to system development, however, not a course in it.

The text becomes confusing on page 369. All the author is doing is discussing the fact that there are many products we can call development tools, some of which support multiple models, and some of which are dedicated to specific models or languages. What does language have to do with it? The author is talking about programming languages. Languages can be considered tools all by themselves, while development tools that handle syntax problems for you are more automated than tools that simply produce programs from your typed input.

On the bottom of page 369, the author talks about creating programs. A programmer is a person who writes programs. The actual files created by a programmer can be called code or source code files. The code written by the programmer is called source code because it must be converted into another form for a computer to execute/run the instructions in it.
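
A quick way to see that source code is only text until it is converted is to try it in Python. This is a rough sketch, not the process the text describes: Python converts source into bytecode for its own virtual machine, not into machine language for a processor, so treat the bytecode here as a stand-in. The variable names and sample values are invented.

```python
import dis

# Source code: plain text that a programmer typed.
source = "total = price * quantity + shipping"

# Python's built-in compile() converts the text into a code object
# holding bytecode, a lower-level form that Python's virtual machine runs.
# (Bytecode is not processor machine language; it stands in for it here.)
code = compile(source, "<example>", "exec")

# List the lower-level instructions produced from that one line of source.
instructions = list(dis.get_instructions(code))
for ins in instructions:
    print(ins.opname, ins.argrepr)

# Run the converted form with some sample values.
namespace = {"price": 5, "quantity": 3, "shipping": 2}
exec(code, namespace)
print(namespace["total"])  # prints 17
```

The exact list of instructions depends on your version of Python, but one line of source always turns into several lower-level instructions.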

The O'Reilly publishing company, a technical book publisher, estimates that there are over 2500 different programming languages. It is beyond the scope of this chapter to discuss more than a few of them. Instead, the author discusses some common categories that refer partly to the decade in which a language was developed.

Generational Types for Programming Languages:

  • First-Generation Languages (1GL) - machine language; actual binary instructions for the processor of a computer; programmers wrote code in machine language in the 1940s, when the first electronic computers were built; human error makes it impractical to write a long program of this sort
  • Second-Generation Languages (2GL) - assembly language; short symbolic code is used to represent commands to the processor; associated with the 1950s; the text calls the code symbols "mnemonics", but this word also means any method used to remember something; this kind of language began the use of variables, names for memory addresses that hold data, and labels, names for addresses in memory that hold instructions; must be converted to machine language by a program called an assembler before it can be executed
  • Third-Generation Languages (3GL) - features instruction explosion: one program instruction can stand for many processor instructions; associated with the mid-1950s and later; typically have a character-based interface; the text lists several examples, most of which are quite different from each other in terms of syntax; typically translated into machine language by a compiler, which converts the whole program before the program is run, or an interpreter, which converts one line at a time while the program is being run; programs written in a 3GL can be compiled or interpreted to run on computers with different processors
  • Fourth-Generation Languages (4GL) - associated with the 1970s and later; higher rates of instruction explosion than 3GLs; libraries of standard functions; able to access external databases;
    On page 373, the author compares two code examples that appear on the next page. He remarks that the C example contains many more commands than the SQL example. This is true, but it is not a reason to prefer SQL for anything other than manipulating data files in standard ways. C is a general-purpose programming language, but SQL is not. It is what its name says: Structured Query Language. It can read, write, and report elements from data files, but it is not meant for creating a new program to do something innovative.
  • Fifth-Generation Languages (5GL) - associated with the 1980s and later, but some were developed in the 1960s; higher rates of instruction explosion than in 4GLs; a 5GL may support nonprocedural rules, which support decision making and expert system programs
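
The job of an assembler, described in the 2GL item above, can be sketched in a few lines of Python. This is a toy for an invented processor: the mnemonics, opcode numbers, and memory addresses are all assumptions made for the example, not a real instruction set. It shows mnemonics, variables (names for data addresses), and labels (names for instruction addresses) all being replaced by numbers.

```python
# A toy assembler for an invented processor (not a real instruction set).
OPCODES = {"LOAD": 1, "ADD": 2, "STORE": 3, "JUMP": 4, "HALT": 0}


def assemble(lines, variables):
    """Translate mnemonic lines into a list of (opcode, operand) numbers."""
    # Pass 1: record the instruction address that each label names.
    labels = {}
    instructions = []
    for line in lines:
        if line.endswith(":"):
            labels[line[:-1]] = len(instructions)
        else:
            instructions.append(line.split())

    # Pass 2: replace mnemonics, variable names, and labels with numbers.
    machine_code = []
    for parts in instructions:
        opcode = OPCODES[parts[0]]
        operand = 0
        if len(parts) > 1:
            name = parts[1]
            operand = variables[name] if name in variables else labels[name]
        machine_code.append((opcode, operand))
    return machine_code


variables = {"PRICE": 100, "TAX": 101, "TOTAL": 102}  # data addresses
program = [
    "start:",
    "LOAD PRICE",
    "ADD TAX",
    "STORE TOTAL",
    "JUMP done",
    "done:",
    "HALT",
]
machine_code = assemble(program, variables)
print(machine_code)
# prints [(1, 100), (2, 101), (3, 102), (4, 4), (0, 0)]
```

Each line of assembly becomes exactly one numeric instruction here. That one-to-one relationship is what separates a 2GL from a 3GL, where one line of source can stand for many processor instructions.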

Other Types of Programming Languages

  • Object-Oriented Programming Languages - The text describes this approach as beginning in the 1970s, and distinguished by not seeing programs and data as necessarily separate. This may be true in the author's mind, and it may sound nice, but it does not put any tools in your hands, which is what a programming language should do.
    It may be more helpful to say that an object can contain methods (programs or functions), can contain attributes (data values), can send messages (requests to other objects), and can return responses to requests. A message from one object to another can also be thought of as an event. Read through some of this discussion on Wikipedia, and see if it makes more sense than the text.
  • Scripting Languages - The author seems to forget about shell scripts (in UNIX) and batch files (in DOS) that predate the creation of JavaScript, VBScript, and PHP. His single paragraph about scripting languages might lead us to think that they are not important enough to learn, which is not so. Adding a script to a web page can make it dynamic, or can make it take forever to load and display, which explains the idea that we should learn to use scripts properly.
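
The description of objects in the object-oriented item above (methods, attributes, messages, and responses) may be easier to follow with a small sketch in Python. The class names Account and Teller, and everything inside them, are invented for this example.

```python
class Account:
    """An object bundles attributes (data) with methods (functions)."""

    def __init__(self, owner, balance):
        self.owner = owner        # attribute
        self.balance = balance    # attribute

    def deposit(self, amount):    # method
        self.balance += amount
        return self.balance       # response to the request


class Teller:
    """A second kind of object that makes requests of Account objects."""

    def handle_deposit(self, account, amount):
        # Calling a method on another object is "sending it a message".
        return account.deposit(amount)


account = Account("Pat", 100)
teller = Teller()
print(teller.handle_deposit(account, 50))  # prints 150
```

The Teller object never touches the balance directly. It sends a message to the Account object, and the Account object changes its own data and returns a response.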

On page 378, the text begins a section that addresses the idea of compiling, and adds a related concept: the link editor. In the scenario described in the text, our programmer has written a source file in C++. C++ is one of the variants of the C language developed at Bell Labs many years ago. (If you haven't heard of them, follow the link. We owe them for a lot more than just the telephone.) The point about C in this case is that C is a compiled language, which makes heavy use of header and library files. Library files hold lots of functions (code modules), already converted into machine language, that can be called (used) in any program you write in that language. Header files hold the declarations that tell the compiler what those functions look like.

The way it works is pretty simple. You state which header and library files to compile with your program, and call the functions you need in the program. Call them? That means to write a line of code using the function. The function you are calling is not actually part of your program until you compile the source code and link it with the appropriate library or header file. (Why do I keep saying that? There is a difference, which is not really important now, but I am trying to be accurate.)

So how does the compiler know how to run the function? It doesn't. The compiler actually writes an intermediate file first, called an object file. The object file holds your source code translated into machine language. It also contains placeholders for each call to an external function (a function that's defined in another file). The object file is handed to a program called the link editor, which looks for calls to functions that are not in the object file yet. The link editor searches each library or other type of file named in your code for the missing functions. When the link editor finds a copy of the necessary function, it copies it into the object code file, like a copy and paste operation. (Remember, functions in those external files are already stored in machine language.) After the substitutions are made, the link editor writes out the finished file, and you get an executable program.
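
The copy and paste idea can be sketched as a toy link editor in Python. This is only a model under simple assumptions: the "machine code" is represented by placeholder strings, and the library contents, the CALL_EXTERNAL marker, and the function name link are all invented for the example. A real link editor works with binary files and addresses.

```python
# Library "file": functions already translated (hypothetical names).
LIBRARY = {
    "sqrt": ["<machine code for sqrt>"],
    "print": ["<machine code for print>"],
    "sort": ["<machine code for sort>"],
}

# Object "file" from the compiler: translated code plus placeholders
# for external functions it calls but does not define.
object_file = [
    "<machine code for main, part 1>",
    ("CALL_EXTERNAL", "sqrt"),
    "<machine code for main, part 2>",
    ("CALL_EXTERNAL", "print"),
]


def link(object_code, library):
    """Copy each needed library function into the program (static linking)."""
    executable = []
    needed = []
    for item in object_code:
        if isinstance(item, tuple):
            _, name = item
            if name not in library:
                raise LookupError(f"unresolved external function: {name}")
            if name not in needed:
                needed.append(name)
            # Replace the placeholder with a real call.
            executable.append(f"<call {name}>")
        else:
            executable.append(item)
    # Paste in one copy of every library function that was called.
    for name in needed:
        executable.extend(library[name])
    return executable


executable = link(object_file, LIBRARY)
for line in executable:
    print(line)
```

Only sqrt and print are copied into the result. The sort function stays behind in the library because nothing in the object file calls it.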

The kind of linking described above is called static linking (page 386). The text contrasts this to dynamic linking, which keeps the library files separate from the executable files, and calls functions from those files, which are often dynamic link libraries (DLLs). The text tells us that dynamic linking has two main advantages. The first is smaller executable files, but this is an illusion because you still need the DLL files for the program to run. The other advantage is that you can update the DLL files without updating the entire program. This makes sense, but historically it has caused problems because one program would update a DLL that another made calls to as well, and suddenly the necessary functions were missing or incompatible as far as the second program was concerned.
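
The difference between the two kinds of linking, and the update problem, can be imitated in Python. This is a loose analogy, not how an operating system actually loads a DLL: a dictionary stands in for the library file, and the names tax_rate, build_static, and build_dynamic are made up for the example.

```python
# A stand-in for a shared library file (a DLL); names are hypothetical.
library = {"tax_rate": lambda: 5}


def build_static(lib):
    """Static linking: copy the function into the program at build time."""
    copied = lib["tax_rate"]
    return lambda: copied()


def build_dynamic(lib):
    """Dynamic linking: look the function up in the library at run time."""
    return lambda: lib["tax_rate"]()


static_program = build_static(library)
dynamic_program = build_dynamic(library)

# Someone installs a newer version of the library.
library["tax_rate"] = lambda: 7

static_result = static_program()    # still uses the old copy
dynamic_result = dynamic_program()  # picks up the new version
print(static_result, dynamic_result)  # prints 5 7

# If an update removes the function, the dynamic program breaks.
del library["tax_rate"]
try:
    dynamic_program()
except KeyError:
    print("dynamic program failed: function missing from library")
```

The statically linked program never notices the update, for better or worse. The dynamically linked program gets the update for free, and also gets the breakage when the library changes in a way it did not expect.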

The text talks in a good bit of detail about C and how a good program is structured. This discussion seems quite out of place in this text, since this is not a language class. We will move on to the next related topic.

If a program is written in a language that is not compiled, it is probably meant to be interpreted. Interpreted programs typically run slower than compiled programs because each line of code must be translated to machine language and fed to the processor at run time. In fact, the interpreter reads, translates, and runs each line before it moves on to the next line. If you have ever seen script files or batch files run in Windows, DOS, or UNIX, you have seen examples of interpreted programs. More elaborate interpreted languages can also contain calls to library files, causing them to need link editors as well.
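
Here is a minimal sketch of that read, translate, and run cycle, written in Python. The three-command script language (SET, ADD, PRINT) is invented for the example, and real interpreters are far more elaborate.

```python
def interpret(source, output):
    """Read, translate, and run one line at a time."""
    variables = {}
    for line in source.splitlines():
        parts = line.split()
        if not parts:
            continue                      # skip blank lines
        command = parts[0]
        if command == "SET":              # SET name value
            variables[parts[1]] = int(parts[2])
        elif command == "ADD":            # ADD name value
            variables[parts[1]] += int(parts[2])
        elif command == "PRINT":          # PRINT name
            output.append(variables[parts[1]])
        else:
            raise SyntaxError(f"unknown command: {command}")


script = """SET total 10
ADD total 5
PRINT total
OOPS
PRINT total"""

output = []
try:
    interpret(script, output)
except SyntaxError as error:
    print("stopped:", error)
print(output)  # prints [15]
```

The bad fourth line is not noticed until the interpreter reaches it, so the first three lines have already run. A compiler would translate the whole program before running any of it, and would report the bad line without running anything.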

Moving ahead to page 392, the text describes application development tools, generally sold as integrated suites that include an editor for creating source code, a compiler or interpreter (as needed by the language being supported), a link editor and library files, user interface templates, debugging tools, and a common interface for all the tools in the suite.

The last topic in the chapter is CASE tools, computer-assisted software engineering tools. As the text points out, any application development tool is probably also a CASE tool, but often suites sold as CASE tools also include tools for modeling the user's requirements and the application design, which a more focused language tool might not have.