NET 101 - Networking Essentials I

Chapter 11, Information and Data Management

Objectives:

This lesson discusses databases and leads into an introduction to SQL. Objectives important to this lesson:

  1. Database concerns
  2. Database terms
  3. Database maintenance functions
  4. Database management systems
  5. DBMS types

    Concepts related to Student Learning Objectives
  6. SQL
  7. Common SQL commands
Concepts:
Introduction

The chapter begins with a description of a database as a collection of data that can be accessed by users to create meaningful information. The simplest kind of a database is just a list, but a static list has obvious limitations. For many databases, a spreadsheet program like Excel can provide all the storage and organization that is needed. Such databases are called flat files, because they are only two dimensional.

A simple flat file database has limitations as well. For example, it can store many information items redundantly. Each order placed by a given customer might feature that customer's name and ID number. This problem is multiplied by the number of customers in the database and the number of orders each makes. This is necessary information, of course, but it would take much less storage space to use a database that referenced a single copy of the customer's data instead. Each order in a flat file presents the need to enter this same information, presenting the opportunity for error, which would cause data to be left out of reports that might be generated based on matching any one variation of the customer's name. A database allows us to store most of the customer's information in one place. This information is then accessed each time it is needed, providing consistency.

Shortcomings of spreadsheets:

  • spreadsheets have limited security
  • spreadsheets have limited ability to allow multiple users to access or update the data
  • spreadsheets become hard to manage when they are very large
This may be a good place to reference three aspects of computer system security:
  • confidentiality - information is accessed only by those who are meant to access it
  • integrity - information is correct, and has not been altered except by authorized persons
  • availability - information is accessible when needed
We should consider these three areas of concern when we think about any computer system.
Advantages and Disadvantages of Database Processing

'The text presents a list of virtues of valuable information on page 467. Some of the important items:

  1. Information is accurate - This relates to the concept of integrity above.
  2. Information is verifiable - The example given is more about data that is used for authentication: relying on data in the system to prevent unauthorized access to other data. This relates to confidentiality.
  3. Information is timely - There is no point to providing information that is out of date, such as listing a catalog item item we no longer sell, or a price for an item that is no longer valid.
  4. Information is useful - This can include ready access, as well as a format that has meaning to the staff who access it.
Basic Database Terms

The text pauses here to introduce some common terms used in describing databases. It leaves out some basic terms as well. This list is more complete.

  • entity - something that you want to store data about, such as customers, accounts, employees, inventory, places, or events
  • attribute - a characteristic of an entity; also called a field, column, characteristic, or property
  • record - all the data about one instance in your data file

    This is easier to explain with a familiar example. Assume the college has a database about students. There must be a table about the students themselves. This file is the Student table. (Student is the entity the table is about.)
    The table has columns with names like ID, last name, first name, phone, etc. The columns are the attributes, the kinds of data we are tracking.
    Each line in the table is information about an instance of the entity: a particular student. The data stored in that line is all the information about that student. It is a record in the table, the record for that student.

    Student ID Last Name First Name Phone
    1100123 Jones Shane 810.555.1212
    1100136 Smith Kim 810.555.2345

  • data types - On page 469, there is a list of common data types. In a database, one way to impose a certain kind of integrity is to only allow a particular kind of data in a given field. It is much harder to enter the wrong kind of data when the system won't allow it.
  • relationship - an association between entities in your database. Customers might be associated with their sales reps by saying that each customer has one and only one sales rep. This is a relationship. From the point of view of the sales rep, this is a one to many relationship, since each sales rep has many customers, but each customer has only one sales rep. Relationships can be one to one, one to many, many to one, or many to many.
  • data file - all data in a database is stored in files, including lists of entities, their attributes, and the relationships between entities; often files are organized as tables, as shown above. Like a spreadsheet, we can refer to its rows, columns, and cells.

On page 473, there is a bulleted list of validation methods that might be used on different kinds of data. It is not possible to rely on a database to check all the data, but it is possible to program checks of these types to improve our data's accuracy and reliability. Consider the book's discussion of these types of validation:

  • range check - values must fall within the range of a minimum and a maximum value
  • consistency check - cross check values to make sure they make sense
  • completeness check - do we have all the required values?

So, now we are ready for another definition: "a database is a structure that can store information about multiple types of entities, the attributes of those entities, and the relationships among the entities".

We must consider what entities our company is tracking in order to create tables about those entities. We will create a table for each one. Each table will describe one kind of entity (e.g. Students). Rows in the table are instances of that entity (individual Students) and columns hold the attributes (e.g. LastName, FirstName, Major). If we construct tables this way, we might eventually find ourselves wanting to enter more than one item in a given cell. This would cause problems later, so this is not allowed in standard databases. So, a rule is imposed:

In a standard table, we will enter only one data element at the intersection of any row and column.

Database Maintenance Functions
There are several standard functions any database must support:
  • creating tables
  • deleting tables
  • modifying tables
  • adding records
  • deleting records
  • modifying records
The text spends a few pages describing these general ideas. They leave out the concepts about tables, but it should be obvious that the concepts are needed.
Description of Database Management Systems

Next, the text introduces the idea of a database management system (DBMS). This is any software solution that allows users to store and retrieve data, and produces reports needed by the users. Users interact with the DBMS directly or through an interface of some sort. Users typically can add, delete, and modify records through a DBMS. They should also be able to ask the DBMS to retrieve all orders for a specific part, or all commission payments to specific sales people. Making such a request is called running a query. (Query is another word for a question.)

Some examples of DBMSs are: Access, Oracle, DB2, and SQL Server. In most cases, the DBA, or Database Administrator (person who manages the database) designs the tables to be used and creates several forms that will be used in the user interface to the DBMS. Using a form, the users can add a new macguffin record to the Macguffin table without knowing anything about database design themselves (and precious little about Macguffins).

Relational Model

The chapter begins with a description of relational databases. A relational database can be considered as a collection of tables. For some reason, the tables are called relations, hence the name relational database. The information about each entity in this kind of database is stored in a table dedicated to that entity. The attributes of that entity are represented as columns in the table, and the rows represent unique instances (cases) of the entity. In other words, in a table about students, each column represents a characteristic common to students, and each row represents an individual student.
Student Number Last Name First Name Grade
1100123 Jones Shane B
1100136 Smith Kim A

Rows can also be called records or tuples. (No one likes that last word.) Columns represent characteristics, also called attributes or fields.

Relationships between entities are shown by common columns in two or more tables. Our student table could be linked to a table of majors by having a column from the majors table in each table. In the majors table, each major would appear once. In the student table, each major might appear any number of times. This puts majors in a one-to-many relationship to students.
StudentNum LastName FirstName Grade Major
1100123 Jones Shane B 1
1100136 Smith Kim A 3

Major Description
1 Accounting
2 Computer Science
3 Nursing

The text presents a formal definition of a relation, which has several parts. A relation is a two dimensional table that meets these rules:

  1. Each entry in the table contains one and only one value. No cell may contain more than one value.
  2. Each column represents one attribute of the entity, and must have a unique name. Each column name is used only once in the table.
  3. All values in a column are values of the attribute that column is about.
  4. The order of columns in the table is not important.
  5. Each row is unique in some way.
  6. The order of the rows is not important.

This set of rules is very important. Without them, we cannot create tables that will be useful in databases.

A formal notation for tables is presented. Start with the name of the table, follow it with a list of the columns in parentheses, separated by commas. This notation helps analyze the structure of the table without being distracted by the data elements.

Student (StudentNum, LastName, FirstName, Major, Address, City, State, Zip)

Tables may have any number of columns. In most cases, the entries in one of the columns will uniquely identify each row. That column is the primary key of that table. Sometimes the primary key is not just one column, but a combination of two or more. In the notation above, primary keys would be shown by underlining them.

Student (StudentNum, LastName, FirstName, Major, Address, City, State, Zip)

In the example tables above, StudentNum is the primary key for the Student table. The Major column is the primary key for the Major table.

The Major column appears in both tables, providing the relationship between the two tables. In the Major table, the Major column is the primary key. In the Students table, the Major column is a foreign key, the primary key of another table.

The chapter presents a short discussion of SQL. SQL commands can be used to CREATE tables, DROP tables, INSERT data into tables, etc. (You can practice doing this on the SQLCourse web site. There is a link on the menu for this course.)

For example, this command will create a table called Customer:

CREATE TABLE Customer
(CustomerNum CHAR(3) PRIMARY KEY,
CustomerName CHAR(35) NOT NULL,
Street CHAR(15),
City CHAR(15),
State CHAR(2),
Zip CHAR(5),
Balance DECIMAL(8,2),
CreditLimit DECIMAL(8,2),
RepNum CHAR(2) )
;

The words shown above in capital letters need not be entered capitalized. I have capitalized them to show that they are actual SQL command words.

This example would insert a record into the Customer table:

INSERT INTO Customer
VALUES
('524','Kline''s','838 Ridgeland','Fillmore','FL','33336',12762.00,15000.00,'20');

To erase a table, use the drop command:

DROP TABLE Customer;

This example would remove this table from the database.

You can use the select command to see the data in a table. To see the whole table use a command like this:

SELECT * FROM Customer;

The asterisk is a wildcard character. It appears in the command above where I might have inserted a list of columns. Since the asterisk is there, the command means "show me all the columns of data in this table". To see selected fields, specify them in a comma delimited list, instead of using an asterisk.

Assignment 1: Since there was a test in week 7, let's relax a bit, and do the multiple choice questions and the matching questions on pages 500 and 501.

This is an individual assignment.

SQL commands can be simple or complex. Let's consider some of the basics and do a few exercises with them.

To pull specific data from a table, you have to know the table's name and the names of the columns that hold the data. The general form of the command is like this:
SELECT columns FROM table WHERE conditions;

Again, the words in CAPS are command words in the SQL language. The italic words are placeholders for words that relate to the specific command you are running. It is not necessary to set conditions, but it is necessary to state the name of a table, and to either state the names of columns or use a wildcard standing for all columns. The wildcard for "all" is the asterisk in MySQL, which is often used at Baker College. In that version of SQL, the command may be entered on multiple lines to make it easier to read, but it must end with a semi-colon.

Assuming we have a table called Students, similar to the examples above, what would be a valid SQL command to report the names of all students in the table?

This is a data retrieval command, so it begins with the word SELECT. We only want data from two columns so we need their names: LastName and FirstName. This is the first phrase of the command.
We have to specify the name of the table after the word FROM. This is the second phrase of the command.
The third phrase is the WHERE phrase, which is not needed this time, because we want all students. The command would be this:

SELECT LastName, FirstName
FROM Students;

This leads to two observations about selecting all of something.

  • If you use an asterisk instead of your column list, you get data from all columns in the file.
  • If you leave off the WHERE phrase of the command, you get data from all rows in the file.


Assignment 2: This assignment is for Week 8. We will work on basic SQL commands.
Using the notes above, write SQL commands to accomplish the following needs:

  1. Write a command to create a new table for data about pets. The table must have character fields for
    name of the pet, species of the pet, and owner of the pet.
  2. Assume that someone has entered a hundred records in the table you created. Write a command to
    tell us all the pets' names and their owners' names.
  3. Write a command to delete the table created in part 1.

This is an individual assignment. You may practice this on the SQLCourse web site to make sure your answers work.