This chapter introduces the student to the relations, relational algebra, and several functions common to queries. The objectives important to this chapter are:
Concepts:IntroductionRelational ModelThe chapter begins with a description of relational databases. A relational database can be considered as a collection of tables. For some reason, the tables are called relations, hence the name relational database. The information about each entity in this kind of database is stored in a table dedicated to that entity. The attributes of that entity are represented as columns in the table, and the rows represent unique instances (cases) of the entity. In other words, in a table about students, each column represents a characteristic common to students, and each row represents an individual student.
Rows can also be called records or tuples. (No one likes that last word.) Columns represent characteristics, also called attributes or fields. Relationships between entities are shown by common columns in two or more tables. Our student table could be linked to a table of majors by having a column from the majors table in each table. In the majors table, each major would appear once. In the student table, each major might appear any number of times. This puts majors in a onetomany relationship to students.
The text presents a formal definition of a relation, which has several parts. A relation is a two dimensional table that meets these rules:
The statements about row and column order being unimportant do not mean that you can't sort the table. They mean that sorting the table does not affect the data stored in the table. If you sort a table, you sort entire rows, so the information in any row stays with the rest of the information in that row. Think of it like this: a row is a record. You want each record to stay intact when the table is sorted. Would you want information from someone else's record swapped with your record? (No, you wouldn't.) If a relation does not meet the first rule, that relation is unnormalized. The text does not explain at this point that we will be putting relations into normal form. That comes up soon. A formal notation for tables is presented. Start with the name of the table, follow it with a list of the attributes in parentheses, separated by commas. This notation helps analyze the structure of the table without being distracted by the data elements. Student (StudentNum, LastName, FirstName, Major, Address, City, State, Zip) When writing about columns in a database, it is possible to make a mistake about what is meant. Since relations are linked by having the same column in two relations, it is necessary to have a notation that tells us which column in which table is meant. The notation is the table name, followed by a period, followed by the column name. (To a programmer, this notation means object.attribute.) Student.Major Tables may have any number of columns. In most cases, the entries in one column will uniquely identify each row. That column is the primary key of that table. Sometimes the primary key is not just one column, but a combination of two or more. In the notation above, primary keys would be shown by underlining them. Student (StudentNum, LastName, FirstName, Major, Address, City, State, Zip) QBEThe text presents a series of examples of using Query by Example with an Access database. In general, a query is a question that you ask the DBMS. Whatever DBMS you use, there will be a structured way to make a query. QBE allows you to make a query by filling in a grid or template. The variations presented in the text:
Relational AlgebraRelational algebra, as stated in the text, is not used in current DBMS systems. It is a theoretical system of getting data from databases. It is presented here to aid your understanding of SQL. Commands and syntax in relational algebra:
Relational algebra follows set theory (from mathematics). A union of two tables is the result of taking each unique row
from each table, and each row that matches from each table. (Note that
this is not possible unless the tables have the same number of columns,
and the columns contain the same kind of data. The columns in each original
table do not have to have the same names.) In the diagram here, the union of tables A and B is represented by the red, blue, and yellow areas:
the data that is unique to each table, and the data that is common to
both. The intersection of two tables is the set of all rows that appear
in both tables. This is the result of the INTERSECT command in relational
algebra. In the illustration here, the intersection of tables A and B
is represented by the blue area: the data that appears in both tables. The difference of two tables is the set of all rows that are unique to the first table named in the equation. This is the result of the SUBTRACT command in relational algebra. In the illustration here, if we execute the command table A SUBTRACT table B, the result is the red area: the portion of table A that does not intersect with table B.
