CIS 251 - System Development Methods

Chapter 11: Managing Systems Implementation
Chapter 12: Managing Systems Support and Security

Objectives:

This lesson discusses material from chapters 11 and 12. Objectives important to this lesson:

  1. Implementation phase
  2. Types of testing
  3. Documentation
  4. Installation
  5. Training
  6. System changeover
  7. Support activities
  8. Managing system performance
  9. Security concerns
  10. Backup and recovery needs
  11. Assignments for the week

Concepts:

Chapter eleven begins with a discussion on page 450 about what can be called quality improvement or quality assurance. It is called by several names in the discussion that follows. The concept is that systems are never perfect, and they must be examined for flaws and improved after they are put in place. This examination of a system begins in the implementation phase of a project.

It is unclear to me why the text moves on to review the system design methodologies it discussed in previous chapters. The discussion moves ahead on page 456, discussing structure charts, which may be useful to programmer/analysts who are charting the functionality of their applications. The text describes them as functional decomposition diagrams for applications. This material applies more to a class that is writing programs in a non-object-oriented language. The text moves on to discuss similar charts that would be used by programmers who are building a system using an object-oriented approach. Finally, the text discusses a system built with an agile approach, and brushes by the documentation that might be used. Let's make note of this material and move ahead.

The text mentions that someone has to take all our plans and designs and turn them into a system. Well, yes, unless our plan was to deliver the band instruments and uniforms, then skip town with the rest of the money. This could be done in an integrated development environment, such as Microsoft's .NET, IBM's WebSphere, or any of the other examples that are mentioned. You should be aware that these exist, and that you may be required to be familiar with them if you are hired for a job that includes development.

After the coding is done, the system will need to be tested in a number of ways.

  • syntax error check - In an integrated development environment, checking for syntax errors should happen with each iteration of the system. A syntax error is an error in the program itself, an error the programmer made that violates some rule of the programming language being used.
  • logic error check - Once we know there are no more syntax errors, we can check for errors in logic. The program can only do what it is told. A logic error happens when the programmer tells it to do the wrong thing. This is also called a semantic error. For example, the program might store the total of several numeric values, when the plan was to have it store the average of those values. The system must be checked to make sure it does what it is supposed to do. This sort of test is often done with a set of values that will produce a known result when processed.
  • structured walkthrough - A formal review of a program's code and logic by other members of the project team (also called a code review), checking each feature against every kind of data that we anticipate being used in the system.
  • design walkthrough - A test with users to determine if the system makes sense, and meets the requirements set at the beginning of the project.
  • unit test - This is a test of one module in the system, assuming that the system has been constructed of modules or subsystems. The kinds of tests described above may be used, but they are restricted to one unit of the system. Each unit of the system should undergo similar testing.
  • integration testing - Some system features require that various modules interact with each other. This test determines whether such dependent modules are functioning correctly. The text cautions that we must test for how the modules react when given improper data. This is true of all of our testing.
  • system testing - Finally, the developers should test the entire system as a whole, to make sure everything is done, and that all the pieces link properly. Users should perform a User Acceptance Test (UAT) of the system as well, as part of the formal completion of a system.
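
The logic error example above (storing a total where the plan called for an average) and the idea of testing with values that produce a known result can be sketched in a few lines of Python. This is only an illustration: the names average and run_unit_tests are made up for this sketch and do not come from the text.

```python
def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    if not values:
        raise ValueError("average() needs at least one value")
    total = 0.0
    for value in values:
        # Reject improper data instead of processing it silently.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError(f"not a number: {value!r}")
        total += value
    # A logic error would be to return total here: the program would still
    # run, but it would store the sum where the plan called for the average.
    return total / len(values)


def run_unit_tests():
    """Check average() against inputs whose correct result is known in advance."""
    # Known values, known result: (10 + 20 + 30) / 3 = 20
    assert average([10, 20, 30]) == 20
    assert average([5]) == 5

    # Improper data: an empty list and a list containing text.
    for bad_input in ([], [1, "two", 3]):
        try:
            average(bad_input)
        except (ValueError, TypeError):
            pass
        else:
            raise AssertionError(f"accepted improper data: {bad_input!r}")
    return True


if __name__ == "__main__":
    run_unit_tests()
    print("all unit tests passed")
```

A syntax error would stop this file from running at all. A logic error, such as returning the total, would only be caught because the first assertion compares the result with a value we worked out by hand.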

On page 469, the text moves on to discuss documentation. Four kinds of documentation are described, which vary by who writes them and who uses them.

  • program documentation - documentation of the decisions made by system analysts as they plan the system; this is handed off to programmers who use it as a construction guide
  • system documentation - documentation of the decisions made by programmers as they write the code for the system; should include updates made to the system since the original requirements were made
  • operations documentation - documentation written for staff who maintain servers that the system must run on; servers, in this sense, includes mainframes and minicomputer systems (a minicomputer is a system that is larger than a typical server, but smaller than a mainframe); includes lists of staff to contact in case of emergency, lists of input and output files, and special requirements and instructions about the system
  • user documentation - as it says, documentation for users, instructions on how to use the system; this is what most people think of when they think of documentation; review the features the text recommends, and ask yourself if you have ever seen documentation this complete

The text warns that good user documentation takes a significant amount of time and effort to create. Bad documentation can take a long time as well, but documentation made in a hurry should be suspect. A classic example is a ridiculous warning label on any product.

Once documentation is complete and management has approved the test system, the project moves ahead to installation of the new system. The text discusses a test environment for the system. This is a strange time to bring up this idea, since our developers and testers must have been using this environment for weeks or months at this point. Every time we created a prototype, we should have been doing it in a test environment.

  • You build and test on a separate system, not on your live system.
  • You don't install the live system until the test system is functional and approved.
  • The live system runs in what is called the production environment (operational environment).

Training for the new system is covered next. The text notes that you typically need to train three kinds of staff: users, managers, and IT support staff. You may also need to provide training for vendors who are allowed/expected to use your system. The training should be tailored to the audience's interests and needs, and it should serve as a transition from the old system to the new one. Note the three sets of topics the text proposes for these audiences in figure 11-30. The text discusses several methods of delivering training to staff, including vendor training, outside training companies, webinars and podcasts, online training, and video-based training.

The text mentions data conversion as a step in activating the new system. This is sometimes as simple as copying data files, but may be more complicated if the new system uses a different file structure or cannot use the old system's data. This explains the four scenarios offered for system changeover. Figure 11-37 provides a graphic version of the differences between them:

  • direct cutover - a clean change; we stop using the old system and start using the new one on a chosen date; dangerous if/when there are unexpected problems
  • parallel operation - both the old and new systems operate for a while; this is useful if we are in doubt about the new one working, if we need to close out operations that depend on the old system, or if we cannot transfer some kinds of data and must complete cases in the old database rather than move them to the new system
  • pilot operation - we begin with only a few users on the new system, and when it is proven successful, we move the rest of the users to the system
  • phased operation - we start with a few users, like a pilot, but we do not move the rest of the users all at once, we move the remainder in bunches; the changeover may be made by organization, by functional org chart unit, or by geography, depending on the needs of the organization or the new system

Figure 11-38 places these four choices on a chart based on cost and risk. Note that cost and risk are inversely related: the lower the risk, the higher the cost.

  • Direct cutover - High risk, lowest cost (unless it fails)
  • Pilot operation - Medium risk, low to medium cost
  • Phased operation - Medium risk, medium to high cost
  • Parallel operation - Low risk, highest cost

Chapter twelve is the last chapter of the text. It discusses systems support and security concerns.

The chapter begins its discussion of support with user support. We have just discussed training, which is important for existing users who transition to the new system, and for new users who will have no knowledge of the old or new system.

Ongoing support for users may be provided by a help desk, either run by your enterprise or by the vendor of your system. The duties of a help desk agent vary, depending on what kind of help desk they work for: an enterprise help desk may have all the duties shown on page 507, but a help desk for a vendor may only deal with issues about their products. Many companies have outsourced their help desk services, purchasing them from companies that specialize in this service.

The next topic is maintenance of the system. The cost of maintenance should be part of the ongoing operational cost of the system. The text lists four types of maintenance:

  • corrective - correcting errors in the system (we may not catch all errors in testing)
  • adaptive - typically, adding enhancements to the system that were not in the original request
  • perfective - changes to the system to improve performance (either system performance or user performance)
  • preventive - precautionary maintenance, like swapping out NICs after a few years, defragging hard drives regularly, system reboots, and virus scans

It should be obvious that such changes to a system will typically involve requests, which must be approved, and must be implemented by technical staff. The text presents a lengthy discussion of these facts. In practice, the process to request and approve changes will vary from one organization to another.

The text moves ahead to performance management on page 519. The first category it discusses is fault management. A fault, in this sense, is an event in which something breaks: hardware failure, software failure, power surge or loss, and user errors are examples. Fault management involves anticipating these events, preparing for them, and having a way to recover from them quickly.

The text talks about measuring system performance. This is a noble goal, but it is difficult to achieve for the reasons stated in the text. To paraphrase John Donne, no system is an island. Everything depends on shared infrastructure, Internet performance, and a dozen variables that we can't get a handle on. If you use one of the automated performance tools, you can get consistent results, but you may not be able to determine what is actually causing a loss of throughput.

On page 526, the text begins its discussion of security. This section of the chapter is brief and hits only the highlights. For those who have not had an introduction to security, the text covers some basics. Three aspects of information are typically protected:

  • confidentiality - information is accessed only by those who are meant to access it
  • integrity - information is correct, and has not been altered except by authorized persons
  • availability - information is accessible when needed

The text moves ahead to discuss risk management. Its presentation is a little different from our security texts:

  • asset identification - what do we care about?
  • threat identification - what are the dangers?
  • vulnerability appraisal - how much could be lost?

The three concerns above are included in the concept of risk identification.

  • risk assessment - how likely is a loss? The text also includes impact on the organization in this concept.
  • risk mitigation - how do we reduce the risk? The text calls this risk control.
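
Risk assessment, as described above, weighs how likely a loss is against its impact on the organization. One common way to combine the two is to rate each on a small scale and multiply them. The sketch below assumes a 1 to 5 scale and made-up example risks; neither the scale, the multiplication, nor the names Risk and rank_risks come from the text.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    asset: str       # what do we care about?
    threat: str      # what is the danger?
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (minor) to 5 (severe)

    @property
    def score(self) -> int:
        """Combine likelihood and impact into one number for ranking."""
        return self.likelihood * self.impact


def rank_risks(risks):
    """Return the risks sorted from highest score to lowest."""
    return sorted(risks, key=lambda risk: risk.score, reverse=True)


# Hypothetical risks, invented for this example.
example_risks = [
    Risk("lobby kiosk", "vandalism", likelihood=2, impact=1),
    Risk("customer database", "stolen credentials", likelihood=3, impact=5),
    Risk("file server", "disk failure", likelihood=4, impact=3),
]

for risk in rank_risks(example_risks):
    print(f"{risk.score:>2}  {risk.asset}: {risk.threat}")
```

The ranked list is a starting point for risk mitigation: the items at the top are the ones to address first, and the items at the bottom may be candidates for simply accepting the risk.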

Basic terminology:

  • assets - devices and information that we care about
  • threat - a potential form of loss or damage; many threats are only potential threats
  • threat agent - a vector for the threat, a way for the threat to occur; could be a person, an event, or a program running an attack
  • vulnerability - a weak spot where an attack is more likely to succeed
  • exploit - a method of attack
  • risk - the probability of a loss

The text discusses three classic methods of addressing the identified risks. These are all forms of risk mitigation:

  • diminish the risk - patch, make and enforce policies, enlighten staff about safer procedures
  • transfer the risk - obtain insurance against loss; possibly subcontract the service that incurs the risk
  • accept the risk - write off losses as the "cost of doing business"

Some categories used to classify attackers:

  • hackers - One of the buzzwords of computer system geeks, this one can mean anything; it is generally accepted to mean someone with more skill than an average user, who may be a white hat (good guy) or black hat (bad guy). A hacker may break into a system for a thrill, to show off, or to cause some kind of damage.
  • script kiddies - Attackers who use hacking tools that they don't really understand.
  • spies - Computer attackers who are looking for specific data from specific systems.
  • employees - Computer security includes the concept of protecting data from people who aren't authorized to access it. What about protecting it from authorized users who want to give or sell it to someone else? What about authorized users who give out their password because someone asks for it? What about users who are no good at protecting their secrets?
  • cybercriminals - They are after some financial gain. This could be data they can sell, actual fund transfers, or theft of financial instruments.
  • cyberterrorists - A cyberterrorist is defined as a system attacker whose motivations are ideological. Do I really care why he does it? No, but a prosecutor or a law enforcement official will.

The rest of the security discussion should be left for security class or security specialists.

On page 540, we turn to backup and recovery. As the text states, a backup is a copy of your data. The word can be used as a verb as well: if I backup (which should really be "back up") the system, I make such a copy of the data. Recovery is the process of restoring lost data. To understand the following, understand this: an archive bit is a bit in a file's attributes that is turned ON when the file is changed; it is used to flag files that have changed since the last backup.

Standard backup schemes:

  • Full - a backup of all files in the target; sets the archive bit of each file to OFF once the backup is made
  • Incremental - a backup of target files that are new or changed since the last backup; depends on the fact that programs that change files typically set the archive bit to ON when a change is made; sets archive bit to OFF for all files it copies
  • Differential - a backup of all files new or changed since the last Full backup; copies all files whose archive bit is set to ON; does not change the archive bit of files it copies because they will be copied again in the next differential backup
  • Copy - like a Full backup, but it does not change the archive bits of files it copies. This is typically not part of a standard backup strategy, but an option to work around the system. It is like the scheme that the text calls continuous backup.
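
The way each scheme treats the archive bit can be simulated with a dictionary that maps file names to their archive bits (True for ON, False for OFF). This is a toy model for study purposes, not how any real backup product is written; the function names and file names are invented for the sketch.

```python
def changed(files, *names):
    """Simulate programs modifying files: each change turns the archive bit ON."""
    for name in names:
        files[name] = True


def full_backup(files):
    """Copy every file, then turn every archive bit OFF."""
    copied = sorted(files)
    for name in copied:
        files[name] = False
    return copied


def incremental_backup(files):
    """Copy files whose archive bit is ON, then turn those bits OFF."""
    copied = sorted(name for name, bit in files.items() if bit)
    for name in copied:
        files[name] = False
    return copied


def differential_backup(files):
    """Copy files whose archive bit is ON, and leave the bits alone."""
    return sorted(name for name, bit in files.items() if bit)


def copy_backup(files):
    """Copy every file, and leave the bits alone."""
    return sorted(files)


def three_days(scheme):
    """Run a Full backup, then the given scheme on each of the next two days."""
    files = {"a.txt": True, "b.txt": True, "c.txt": True}
    log = [full_backup(files)]   # day 0: Full
    changed(files, "a.txt")
    log.append(scheme(files))    # day 1
    changed(files, "b.txt")
    log.append(scheme(files))    # day 2
    return log


if __name__ == "__main__":
    print("incremental: ", three_days(incremental_backup))
    print("differential:", three_days(differential_backup))
```

On day 2 the incremental backup copies only b.txt, because the day 1 backup turned the bit for a.txt OFF. The differential backup copies both a.txt and b.txt, because it never turns any bit OFF, so each differential grows until the next Full backup.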

To keep them straight in your mind, remember these facts:

Backup type   What does it back up?                                           What does it do to the archive bit?
Full          copies everything                                               Resets all archive bits.
Incremental   copies everything different from the last backup                Resets the archive bits of files it copies.
Differential  copies everything "different from Full" (the last Full backup)  Does not reset any archive bits.
Copy          makes a Full backup                                             Does not reset any archive bits.

The time required to create backups should be considered along with the time to restore a backup. When you consider the two concepts as two sides of the answer to a question (What method should I use?), the answer may be the most common choice: Differential. It is the best compromise in terms of backup time versus restore time. Note also that all standard methods require a full backup on a regular cycle. The recommendation is usually to run a Full backup weekly.
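
The restore side of that compromise can be made concrete by listing which backup sets have to be read to recover a system. The sketch below assumes one Full backup followed by one backup per day under the chosen scheme; the function name restore_chain and the labels it produces are invented for illustration.

```python
def restore_chain(scheme, days_since_full):
    """List the backup sets needed to restore, oldest first.

    Assumes a Full backup on day 0 and one backup per day after that.
    """
    if days_since_full < 0:
        raise ValueError("days_since_full cannot be negative")

    chain = ["full"]
    if scheme == "incremental":
        # Every incremental since the Full backup is needed, in order.
        chain += [f"incremental day {day}" for day in range(1, days_since_full + 1)]
    elif scheme == "differential":
        # Only the most recent differential is needed.
        if days_since_full > 0:
            chain.append(f"differential day {days_since_full}")
    else:
        raise ValueError(f"unknown scheme: {scheme!r}")
    return chain


if __name__ == "__main__":
    for scheme in ("incremental", "differential"):
        chain = restore_chain(scheme, days_since_full=5)
        print(f"{scheme}: {len(chain)} sets -> {chain}")
```

Five days after the Full backup, an incremental scheme needs six sets to restore, while a differential scheme needs two. The price is that each differential backup takes longer to make than the matching incremental, since it copies everything changed since the last Full backup.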

The text encourages system developers to include business continuity in their plans. What kind of backups will we use? How will we test the backups? How will we continue to operate if there is a system disaster?

There are a few more concepts in the chapter, but we seem to have covered the most important material.