CIS 331: Database Management Using SQL

Chapter 1: Introduction to Database Management


This chapter introduces the student to the basic concepts of the course. The objectives important to this chapter are on page 1:

  1. Introduction to the case study for the lessons
  2. Basic database terms
  3. Description of database management systems
  4. Advantages and disadvantages of database processing
  5. Introduction to the case study for the assignments

The chapter begins with a description of a company keeping business records in a spreadsheet. They are keeping all the information they seem to need about each sale, but the text points out that their methods are repetitive and wasteful, as well as potentially prone to error.

The simple flat file database that is their spreadsheet stores many information items redundantly. For instance, each order placed by a given customer features that customer's name and ID number. Needed information, of course, but it would take much less storage space to use a database instead. Each order presents the need to enter this same information, presenting the opportunity for error, which would cause data to be left out of reports that might be generated based on matching any one variation of the customer's name. A database allows us to store the customer's information in one place. This information is then accessed each time it is needed, providing for consistency.

The existing spreadsheet database is described more fully. Shortcomings of spreadsheets are described:

  • spreadsheets have limited security
  • spreadsheets have limited ability to allow multiple users to access or update the data
  • spreadsheets become hard to manage when very large

Before replacing the spreadsheet with a database management system, the company examines what information it wants to track.

  • Sales reps - The company needs to know a sales representative's number, last name, first name, address, total commission, and commission rate.
  • Customers - The company needs to know a customer's number, name, address, current balance, credit limit, and assigned sales rep.
  • Parts - The company needs to track each part's number, description, units on hand, class, the number of the warehouse we store them in, and price per unit.
Basic Database Terms

A common systems analysis technique is to determine what kind of reports you want from a system, as a aid to designing what must be stored in that system. An example of such a report that our company uses is a customer order. The sample order we are shown has three parts: a heading, a body, and a footing. Each part points out more data we must store, based on orders:

  • Heading - This will show the order number, date placed, customer number, customer name, customer address, sales rep number, and sales rep name.
  • Body - This contains multiple lines. Each line is an order line, also called a detail line, or a line item. Each line shows a part number, part description, number of units ordered, unit price, and the order line price which is obtained by multiplying the last two items.
  • Footing - On this report, the footing shows only the total price of the order. This is also a calculated item, since it is the sum of the line totals.

As noted above, examining this report tells us several more information items we need to store and track that were not described in the first three lists of information.

The text pauses here to introduce some common terms used in describing databases. This text defines a database as "a structure that contains information about many different categories of information and about the relationships between the categories". (Page 5) Not all databases are relational, but the most useful ones are. Another definition is on page 6, but we need to define some key terms first:

  • entity - something that you want to store data about, such as customers, accounts, employees, inventory, places, or events
  • attribute - a characteristic of an entity; also called a field, column, or property
  • relationship - an association between entities in your database. Customers are associated with their sales reps, in that each customer has one and only one sales rep. This is a relationship. From the point of view of the sales rep, this is a one to many relationship, since each sales rep has many customers, but each customer has only one sales rep. Relationships can be one to one, one to many, many to one, or many to many.
  • data file - all data in a database is stored in files, including lists of entities, their attributes, and the relationships between entities

So, now we are ready for the definition on page 6: "a database is a structure that can store information about multiple types of entities, the attributes of those entities, and the relationships among the entities".

The text now examines the entities that our company is tracking, and tells us that we will create a table for each one. Each table describes on kind of entity (e.g. Sales Representatives). Rows in the table are instances of that entity (individual Sales Reps) and columns hold the attributes (e.g. LastName, FirstName, Commission). The first few tables are obvious: Rep, Customer, Orders, and Part should all be intuitive at this point. We are also shown a table called OrderLine, whose purpose is not obvious yet. It exists to satisfy a database design requirement that your book does not describe for another hundred thirty pages. The author's argument in this chapter is simply this: if we constructed a table of Orders, some orders would have more than one item in a given cell. This would cause problems later, in determining which items the customer ordered how many of. This is not allowed in standard databases. So, a rule is implied:

In a standard table, we will enter only one data element at the intersection of any row and column.

It is possible to read the various tables to answer the questions posed at this point in the chapter. Students should do so, for practice.

Description of Database Management Systems

Next, the text introduces the idea of a database management system (DBMS). This is any software solution that allows users to store and retrieve data, and produces reports needed by the users. Users interact with the DBMS directly or through an interface of some sort. Users typically can add, delete, and modify records through a DBMS. They should also be able to ask the DBMS to retrieve all orders for a specific part, or all commission payments to specific sales people. Making such a request is called running a query. (Query is another word for a question.)

Some examples of DBMSs are listed: Access, Oracle, DB2, and SQL Server. In the example in the text, the DBA, or Database Administrator (person who manages the database) designs the tables to be used and creates several forms that will be used in the user interface to the DBMS. Using a form, the users can add a new part to the Part table without knowing anything about database design themselves.

By creating a series of forms the DBA will create a switchboard system for the users. A switchboard system is a set of forms that users call up from a menu, each form allowing the users to perform a common function with the database.

Advantages and Disadvantages of Database Processing

'The text presents a standard list of virtues of database systems:

  1. Getting more information from the same amount of data - You need to understand what the author means by those two terms. Data may be considered to be raw facts. Information is what we get when we process data to gain knowledge. A DBMS may help us see a relation between data elements that we did not think were related.
  2. Sharing data - A DBMS will allow more users to access the stored data than a spreadsheet system can.
  3. Balancing conflicting requirements - One of the jobs of a DBA is to impose structure on the database so that the company can get the most use out of it. Serving the needs of the greatest number of users (the company as a whole) is usually the goal.
  4. Controlling redundancy - Redundancy means saving the same thing in several places. Saving a data element in only one place conserves storage space, and makes updating the system much simpler. Redundancy cannot be eliminated, but it can be reduced.
  5. Facilitating consistency - Consistency is a direct result of reducing redundancy. When a data element is stored in fewer places, it is less likely to be stored incorrectly, or fail to be updated.
  6. Improving integrity - Integrity constraints are rules for databases, such as a rule that a certain field must contain only numeric data, or a field used in a query must contain a data element that is in the database (think of a drop down list). DBMSs allow integrity constraints.
  7. Expanding security - Security can be the assignment of passwords for DBMS users, or the assignment of different rights to various users or groups.
  8. Increasing productivity - In the old days, a programmer would have to write a program for every query a user wanted to run on a database. A DBMS will allow that programmer to work on something else, once the users understand how to use the DBMS.
  9. Providing data independence - This means that we can change the structure of the database, and add functions to the DBMS, without having to reenter all the data.

A short list of disadvantages to using a DBMS:

  1. Larger file size - The DBMS itself is probably a larger program file than the simpler system we are converting from. This may mean upgrading some hardware for a company that is changing to a DBMS.
  2. Increased complexity - Since the DBMS is capable of doing a great deal, it has many features that might be confusing to new users. (Maybe they should read this book.)
  3. Greater impact of failure - If we centralize all of the company's data in one system, the impact of the failure of this system is greater than the failure of a single system in a company where each user has part of the data. This means we need to practice making backups on a regular basis.
  4. More difficult recovery - Restoration after a data disaster is more difficult than with a simple spreadsheet system. It is not impossible, just harder. Think of it like this: If a storm blows down a house and a tent, it is more difficult to rebuild the house than it is to put up a new tent. However, when it is done the house will be a house again, and the tent is still only a tent.
Introduction to the case study for the assignments

The assignments for this chapter are about a bookstore whose owner has created several tables of information: Branch, Publisher, Author, Book, Wrote, and Inventory. To practice reading tables and relating data, we will do several exercises in class based on these tables, and those in the body of the chapter.