CIS 331: Database Management Using SQL

A Guide to MySQL, Chapter 5: Multiple-Table Queries


This chapter introduces the student to the the join command, and introduces several new functions. The objectives important to this chapter are:

  1. Retrieving data from more than one table
  2. Using IN and EXISTS
  3. Nesting subqueries
  4. Aliases
  5. Set operations
  6. Using ANY and ALL
  7. Special JOIN operations

Retreiving data from multiple tables also begins with the SELECT command. A query on mulitple tables may come in three parts:

SELECT column list
data tables
relationship between the tables ;

  • SELECT is followed by a list of columns, which may be any columns in the tables being accessed, separated by commas. For columns that are unique to one table, the column name is sufficient. For columns that appear in both tables, use the table.column notation.
  • FROM is followed by the name of the tables to be read, separated by commas.
  • WHERE is not optional. It must name the columns that form the relationship between the tables, in table.column notation, with an equal sign between them. Example: Customer.RepNum = Rep.RepNum ;

The WHERE clause must contain a statement of the table relationship, but it may also contain restrictions. If it does, the statement of relationship is followed by AND, which is followed by your list of restrictions. Format:

SELECT column list
data tables
relationship between the tables
AND column_name operator comparison_value ;

The IN operator can be used to match against any of a set of values, as shown in chapter 3. The text shows an example that is a variation of the above syntax. It is not as clear, but it does about the same thing. The difference is that we are running a nested query. Format:

SELECT column list for second table
second data table
column in second table that relates to first table IN
(SELECT column in first table that relates to second table
FROM first data table
WHERE column_name operator comparison_value )

This query can be done with the EXISTS operator. Format:

SELECT column list for second table
second data table
FROM first data table
WHERE second_table.column_name = first_table.column_name
AND column_name operator comparison_value )

This runs the inner query on both tables, looking for rows that meet the condition in the final line. The outer query then runs, selecting the columns desired from the second table, retrieving only the rows that match the output of the first query.

You can run a subquery in a subquery, as an extension of the format already shown for one nested query. Note that the innermost query runs first, then the query that it was inside, and finally the outermost query. This follows the standard computer principle of evaluating the expressions inside parentheses first. When you have nested parentheses, the innermost phrases evaluate first, followed by the next innermost, working our way to the outside.

An alias can be used to serve as shorthand notation for the name of a table. The syntax is odd, in that we can use the alias before we define it in the command. Example:

SELECT CustomerNum, CustomerName, R.RepNum
FROM Customer C, Rep R
WHERE C.RepNum = R.RepNum ;

In this example, C was made an alias for Customer, and R was made an alias for Rep in the FROM clause. The aliases can be used in the SELECT clause and in the WHERE clause. Note that the alias stands for the table, not the series of letters in the table name.

This comparison operator in this example has a field name on its left, as is typical, but the comparison value on its right is also a field name. This is unlike most comparisons. Usually, the comparison value is a value, enclosed in what some people call single quotes. Other people call these apostrophes. The purpose of the quotes is twofold: to show that the value is not the name of a field, and to enclose the exact value to compare.

Compound conditions are just two or more conditions linked by logical operators: AND, OR, and NOT. These are also called Boolean operators. One way to use the NOT operator is to compose a comparison, and to precede it with NOT. Example:

SELECT CustomerNum, CustomerName
FROM Customer
WHERE NOT (Balance = '3000') ;

In this example, the query is looking for the customers whose balances are not equal to 3000. It is not required to put the condition in parentheses, but it helps make the meaning of the operation clear.

Computed columns are created by using math operators with the names of fields. The standard math operators are used: + for addition, - for subtraction, * for multiplication, and / for division. The example in the text suggests that we create a column that computes the available balance for a customer by subtracting their balance from their credit limit.

SELECT CustomerNum, CustomerName, (CreditLimit - Balance) AS AvailableCredit
FROM Customer;

This command would create a computed column, and display the results under the heading "AvailableCredit".

A useful operator for times when you are searching for a partial match is the LIKE operator. When searching for a character string that occurs somewhere in a field, use the LIKE operator with a comparison value enclosed in percent signs. Example:

SELECT CustomerNum, CustomerName
FROM Customer
WHERE CustomerName LIKE ('%Sport%') ;

This command would retrieve the selected fields from all records having Sport anywhere in their names. The % character placed like this will match any number of characters before or after the comparison value. Placing it only before or only after the comparison value would match any characters at that position. (In some systems, you will use the * instead of the %.)

Sorting is accomplished with the ORDER BY clause. ORDER BY is followed by the names of the fields on which to sort the output. If sorting on more than one field, the first listed is the primary key, the second is the secondary key, and so on. Example:

SELECT CustomerNum, CustomerName, CreditLimit
FROM Customer
ORDER BY CreditLimit, CustomerName;

A sort will display results sorted in ascending order, unless the DESC switch is used. This switch follows the name of field to be sorted in descending order. Example:

SELECT CustomerNum, CustomerName, CreditLimit
FROM Customer
ORDER BY CreditLimit DESC, CustomerName;

A list of functions common to spreadsheets is offered, called aggregate functions. They give us the ability to provide a summary report from a query, rather than a detail report.

Function Result
AVG Average (mean) of a column
COUNT The number of rows matching retrieval criteria.
MAX Greatest value in a column
MIN Smallest value in a column
SUM The total of the values in a column

These functions can be combined with WHERE criteria, in order to apply them to subsets of the data. Example:

SELECT COUNT(*) FROM Customer WHERE CustomerName LIKE ('%Sport%');

This query would reply with the number 2, which is the number of records matching the criteria. MySQL works with this syntax, and with the alternate syntax in the text, replacing the * with the name of any field.

It is possible to run a query on the results of another query by nesting the first within the second. Figure out what you want to do for each query. The first one you need to run must be enclosed in parentheses, it must follow the IN operator, and that operator comes at the end of the second query to run. Example (colors are added for reference):

SELECT OrderNum FROM OrderLine

WHERE Class = 'AP')

In this example the blue query runs first, and results in a list of part numbers that fall in the AP class. The red query then runs, matching records in the OrderLine table against the list of part numbers from the first query. I have made the semi-colon red because the command does not end until the second query runs.

The text continues with an example of linking multiple tables in a SELECT command. Although lengthy, the examples show the same syntax as other examples.

SELECT column list
data tables
relationship between two tables
AND relationship between two other tables
AND another relationship if needed ;

The idea is to link the tables in the WHERE clause, continuing with as many linking statements as there are relationships between the tables you are using.

Set operations relate to math concepts. Three set operations are discussed.

  • As stated in another chapter, an intersection of two tables is the set of all rows that appear in both of those tables. The SQL command for intersection is INTERSECT.
  • The difference between two tables is the set of rows in the first table, minus any row that appears in the second table. The SQL command for difference is MINUS.
  • The union of two tables is the set of all rows that exist in either or both tables. It is not the same thing as tacking one table on the end of the other. The SQL command for union is UNION.

Set operations can only be carried out on tables with the same columns. One way to make these commands available to you is to use queries to draw out the required columns from as many tables as needed, then use the set operator on the results. The general syntax is:

query set_operator query ;

The ANY and ALL operators are used to match a query against the results of another query. The inner query produces a set of data, and the outer query matches against the result. If the match is like this:
WHERE field < ANY
the outer query will match if the value being compared is less than any of the values in the subquery. If the match is like this:
WHERE field < ALL
the outer query will match if the value being compared is less than every value in the subquery.

Three more types of queries are defined in the text.

  • An inner join is one that shows only the rows of data that match the conditions given. Most students will wonder at this statement, since it describes every query we have discussed.
  • An outer join shows all the rows from one or both tables in a join. There are three types. To understand them, assume that we are running a query on two tables. The first one listed can be called the left table, and the second one listed can be called the right table.
    • A left outer join will include all rows from the left table, and only the rows from the right table that match the query. This could be a query that seeks customers who ordered a certain product, but the result shows all customers, so we can see who ordered the product and who did not.
    • A right outer join will include all the rows from the right table, and only the rows form the left table that match the query.
    • A full outer join includes all rows from both tables.
  • The product of two tables is the combination of all rows in the first table and all rows in the second table. It can be created as an error by leaving out a necessary WHERE clause. In other words, you use the SELECT and FROM commands, but have no rule for the selection, so you get everything too many times. If you selected fields from the Customer and Orders tables, and did not use a WHERE clause, the result would look like every customer ordered every single order. As noted, this is probably the result of an error in constructing your query.