CIS 2210 - Database Management and Design

Chapter 9: Database Design

Objectives:

This lesson discusses material from chapter 9. Objectives important to this lesson:

  1. Information systems
  2. SDLC
  3. Database Life Cycle
  4. Conceptual design
  5. DBMS selection
  6. Logical design
  7. Physical design
Concepts:

Chapter 9

The chapter begins by telling us that a database is only one part of a larger system that allows data collection, storage, transformation, and retrieval. Transformation means changing the data from one form to another, but it also means processing it to provide information to the users. Concentrating on the database portion of our system, this chapter addresses database development as a part of an SDLC. Development includes the design of a database as well as implementation.

The text introduces the idea of a system to put its topic for this chapter in context. Up to now, we have been talking about taking to users, gathering data, and normalizing tables, all of which is contained in System Analysis. Systems are often created in a series of steps that are collectively called system development. The process of using such a series of steps, typically in a repeatable cycle, is called using a System Development Life Cycle (SDLC). An SDLC process is a little different from one text to another. This one gives us a set of five steps, or phases:

SDLC Phases

  1. Planning
  2. Analysis
  3. Detailed system design
  4. Implementation
  5. Maintenance

This chapter has a section about each of the five phases listed above. Each of the phases from 2 to 5 can fall back to an earlier phase if there is a change in circumstances that requires the system to be reconsidered from that earlier phase. For example, if we discover in phase 4 that there was a critical error in phase 2, we move back to phase 2, correct the error, and move on from there. At any level, there must be approval from the project sponsors (typically upper management) to move ahead to the next phase after review of the completion of the current phase. Also, it is normal to completely reexamine a system after it has been used for a while, starting over at phase 1, and proceeding through the cycle to incorporate whatever changes are needed.

Planning

This phase often starts with a request for a new system. The team conducting this phase should learn about the company making the request, its data, its processes, and its existing systems. The text presents three standard alternatives that may be chosen during this phase:

  • Should the existing system continue to be used? Does it do the job we need it to do?
  • Should the existing system be changed to work better, or to incorporate the new requirements? The text points out that the system may already do what needs to be done, but it might perform better if it was amended to include some things we merely want. Our job is typically to meet the actual needs of the company, to fulfill the actual system requirements.
  • Should the existing system be replaced? Sometimes, this becomes a question of building a new system while maintaining the old system for its original purpose. The text reminds us again to determine what is actually needed to meet the system requirements.

The text changes to a related topic: can we actually build the system that is being requested? The text considers three perspectives that affect the answer to that question.

  • technical perspective - What are the hardware and software requirements for this system? Can it be built with current technology? Do we need new hardware, software, or infrastructure?
  • system cost perspective - Assuming the system can be built, can we afford it? Will the cost of creating/changing the system make sense in terms of money we will gain or money we will stop losing? Does it increase our profits or diminish our losses?
  • operational cost perspective - What will the ongoing costs be to operate and maintain this system? Operational costs may include the cost of training/retraining staff to use the new system, which should be a consideration in the design process.
In this phase we must:
  • Analyze the data and project requirements.
  • Create ERDs and normalize tables.
  • Verify that our model is accurate.
  • Complete a design that includes any new features that have become requirements.

Analysis

The previous phase involves a lot of interviewing, fact finding, and questioning. It leads to this phase in which logical technical requirements for the system must be determined. We must determine what the system has to do first, then we must build a model/plan for a system that would meet those requirements. In the process of developing that model, we translate the system requirements into technical terms. What are the inputs? What are the outputs? What processing must be done to create those outputs?

Creation of the logical design includes making data flow diagrams (DFDs), hierarchical input process output diagrams (HIPOs), entity relationship diagrams (ERDs), and normalized tables for databases.

Detailed Systems Design

The analyst completing the design establishes the requirements for software functionality, for menus and reports, for devices that must be allowed to attach to the system, and for any other technical capabilities the system must have. A migration plan to move from the old system to the new system is created, and training plans for users and for staff who will maintain the system are created.

Implementation

Hardware and software are installed, tested, and demonstrated to the new users and owners. During the installation, the old data is converted for use in the new system (if necessary and possible) or the old system remains available to provide access to the old data (if needed).

Maintenance

The new system is maintained by its new administrators/owners who assess its performance at regular intervals.

The text presents another spin on development, the Database Life Cycle, in which an SDLC is rewritten as a DBLC:

DBLC Phases

  1. Database Initial Study
  2. Database Design
  3. Implementation and Loading
  4. Testing and Evaluation
  5. Operation
  6. Maintenance and Evolution

As you can see, there is a lot similarity from one system to the other.

Database Initial Study

The text lists some steps in this phase that are just as appropriate to the SDLC. The company and its problems are examined, constraints and objectives are defined, and scope and boundaries of the project are determined. In any project, you must know what must be done, what cannot be changed, and what the limits of the project's authority will be.

Database Design

The analysts working on the project design the pieces that will make this database system work: its tables, its relationships, its keys, its reports, and its functions. The text warns us that a database project cannot be designed like a sequential program. It is more like an event driven program that must provide whatever its users want next. The code written for the system may have a great deal of iteration in it, but it must also provide an interface that allow the users and administrators to select which functions they need to run.

The text also warns us about something else that is true of database projects and most IT projects. The business expert staff providing the analysts with their business rules and processes see the system from their own point of view, which is often a human interaction view or a view based on their own job. The analyst receiving this data must turn it into a system view that can be used to specify hardware, software, and human interfaces in the language of IT. It is as if the analyst between the business experts and the designers must see the entire elephant, because the analyst must be able to describe the part that interests the person at each end of the elephant, armed only with the data given to him/her by a person at the other end of the elephant. Only, you aren't sure if its an elephant, a horse, or something you have never seen before. Do you have a little more respect for that person in the middle now?

Implementation and Loading

The structure of the relations should be known, and sources of data identified, so someone has to build the database and load data. Like most software, the database can be created in a virtual machine, which gives you an advantage regarding backups. Most virtual machines can generate incremental snapshots of themselves, allowing an administrator to examine what the system looked like at various points in its history.

The text describes creation of a database in several steps, using DB2 as its example system.

  • Note that before creating any file objects, the system administrator had to create storage areas that would hold files and their data by creating a storage group. This is a required step in some database systems.
  • Inside the storage group, the system administrator creates a database space, giving the database a name. A database administrator ID is created as well, and it is assigned rights to the database space.
  • The database administrator (DBA) logs in to the system.
  • Inside the database space, the database administrator creates one or more table spaces, each of which can hold tables. Tables are created and populated.
  • The DBA also creates user IDs and assigns rights to table spaces and tables.
  • Integrity constraints are added to the tables, and other programming as needed.
  • Someone adds initial data to the tables, then the database can be tested.

Testing and Evaluation

Each of the features of the database should be tested, to make sure that it works as desired. The text provides a list of features that should exist in the system, and whose functions should be tested at this time.

  • Physical security
  • Password security
  • Access rights
  • Audit trails
  • Data encryption
  • Compatibility with diskless workstations, if needed. An example would be a kiosk station in a store that allows a customer to check product prices or availability.

Ongoing testing of the system will be needed as your tables grow. Indexing tables may be useful when searching them takes too long.

The text mentions setting a regular schedule for backups. It describes three methods that can be all be used at different intervals to make a more recoverable system than any of the methods would provide is used alone.

  • full backup - This is just like it sounds, a backup of all the database objects and files.
  • differential backup - This is used more frequently than a full backup. It only copies items that have changed since the last full backup. In case of data loss, you would restore the most recent full backup, then restore the most recent differential backup to catch up to changes since that full backup was made.
  • transaction log backup - In some systems, changes made to the database are not immediate. Changes may be stored in a transaction log, with each change being tagged with the time it was done. Changes in this log are made part of the database at regular intervals, which may be at the end of a business day, or more frequently for more dynamic data. If your system has this feature, complete restoration requires that a backup of this file be processed against the database, but only if the transaction log contains changes that do not appear in the most recent differential backup.

Table 9.1 in the text reminds us that software can be attacked, hardware can fail, transactions can be lost to deadlocks, and the unexpected will eventually happen. Backups and restores should be scheduled and practiced. A deadlock, by the way, is a condition that occurs when two transactions each lock a record they are updating, but their updates also require a change to the record that the other transaction has already locked. This prevents either transaction from completing unless the system is programmed to drop one of the transactions so it can process the other one.

Operation

Some problems will not show up in testing. Some will only show up once there are a critical number of users on the system, performing enough transactions to stress the system to an unexpected limit. If the system cannot keep up, hardware and software design must be reconsidered. Perhaps a cluster of servers will be needed, featuring load balancing as the system bogs down. Changes will be requested regardless of the quality of operation, leading to the next phase.

Maintenance and Evolution

Once the system is running, there will be requests for changes. There will be requests for corrections to data and to features, new tables and reports, and reexamination of security. These are common requests that someone who is maintaining the system Changes should be submitted through a process that will examine them, then reject them or schedule development and application of a solution.

Conceptual Design

The text returns to the design phases discussed above to expand on them. Conceptual design is the first part of three stages of design. They are conceptual, logical, and physical design. A schematic way to consider the stages might look like this:

  • Conceptual design - What do the users want?
  • Logical design - What would a system have to do to meet the requirements?
  • Physical design - How is the system going to work, and how are we going to build it?

The text presents a motto that it calls the minimal data rule: all that is needed is there, and all that is there is needed. This corresponds to a different statement about the subject that I have heard more often: don't store any data you are not going to use. This version only implies that you are storing everything you need, but it does not say so. The minimal data rule does say so. The author leaves himself an out regarding this rule. He tells us not to be shortsighted when we choose our data. The customer may decide later that some data elements are more important than they seemed at first.

The text expands on each of the bullet point in the design process for several pages. Read through the details of this section to get an idea of the work that must be done.

DBMS Software Selection

This phase of design is one that might be done sooner in the process, basing your decision on requirements that the company/customer/requester must comply with, such as already owning a DBMS product. Assuming the decision must still be made the text suggest we consider several DBMS characteristics, some less obvious than others:

  • cost
  • features and tools
  • model: hierarchical, network, relational, object oriented
  • portability: does it allow us to use it on multiple platforms and with multiple languages?
  • hardware requirements does the product require better computers than we have or can buy?

Logical Design

The text seems bent on setting a record for number of pages spent on the same concept. The point of this stage of the process is to turn the concept of the database into a model based on the chosen DBMS, which we already know is most likely to be a relational DBMS. The text seems to know this as well, since the example it gives us is a relational DBMS, and we are told to construct relations, relationships, constraints, data definitions, and views needed by users.

The text provides examples of ERDs and older text diagrams of entities, including their attributes, primary keys, and foreign keys.

In section 9-6c, the text shows us a valid notation for attribute constraints for three attributes that must conform to particular rules. In the first column we see the name of an attribute, and in the second column we see rules that apply to that attribute, each rule on a separate line. This is easily constructed using table. In the example on the right, you see two attributes and the constraints that must be applied to them in the database that is being constructed.

The next page shows you a notation for creating a custom view of some data, which will make no sense at this time, unless you have already read chapter 7 or are familiar with SQL. We will skip over that this week.

Physical Design

Remember that the physical design in a project has to do with the way that features will actually be implemented. This includes hardware and software specifications where they are appropriate.

The remainder of the chapter goes into some very muddy water. Navigating it will not assist you in making good choices about your projects in or outside of this class, so we will stop here.