Concepts:
Retrieving data begins with the SELECT command. A query using this command may have several variations. This is a classic form: SELECT column list FROM data table WHERE conditional test ;
Remember that the command must end with a semicolon, and that you must space between words. Other than that, the syntax for the command is free form, which means that you can put line returns in it to make it easier to read, and the command will perform the same way. The WHERE clause can be a simple condition, such as testing
for a particular value in one field. If so, you can use the standard
comparison operators listed in the text. You are probably familiar with
most of them:
Example from the bookstore database: SELECT BOOK_CODE, TITLE, PRICE Note that this comparison operator has a field name on its left, and a comparison value on its right. The comparison value is enclosed in what some people call single quotes. Other people call these apostrophes. The purpose of the quotes is twofold: to show that the value is not the name of a field, and to enclose the exact value to compare. Compound conditions are just two or more conditions linked by logical operators: AND and OR. These are also called Boolean operators. There is also a Boolean NOT operator. One way to use the NOT operator is to compose a comparison, and to precede it with NOT. Example: SELECT BOOK_CODE, TITLE, PRICE In this example, the query is looking for the books whose prices are not greater than 20.00. It is not required to put the condition in parentheses, but it helps clarify the meaning of the operation. Computed columns are usually created by using math operators with the names of fields. The standard math operators used are + for addition, - for subtraction, * for multiplication, and / for division. In this example, we will create a column that computes the price of a book including the local sales tax. SELECT BOOK_CODE, TITLE, PRICE,
(PRICE * 1.06) AS WithTax This command would create a computed column, and display the results under the heading "WithTax", which does not exist as a field name, but is used as a column heading in the output of the command. A useful operator if you are searching for a partial match is the LIKE operator. When searching for a character string that occurs somewhere in a field, use the LIKE operator with a comparison value enclosed in percent signs. Example: SELECT BOOK_CODE, TITLE, PRICE This command would retrieve the selected fields from all records having of anywhere in their names.
The IN operator is used to match against any of a set of values. For example, we can ask for customers having any of several specific credit limits: SELECT BOOK_CODE, TITLE, PRICE This query will report all books whose prices match
any of the three listed values. Note that these values are not markers
for ranges: a book's price must match
one of those three numbers, or that record is not retrieved. Can you rewrite
that query to pull books whose prices are inside a range? (We will add
that to the assignment this week.) Sorting is accomplished with the ORDER BY clause. ORDER BY is followed by the names of the fields on which to sort the output. If sorting on more than one field, the first listed is the key for the primary sort, the second is the key for the secondary sort, and so on. Example: SELECT BOOK_CODE, TITLE, PRICE A sort will display results sorted in ascending order, unless the DESC switch is used. This switch follows the name of field to be sorted in descending order. Example: SELECT BOOK_CODE, TITLE, PRICE Remember that the ORDER BY command will sort the rows in your output based on a specified column. It does NOT sort only the named column, which would disrupt the integrity of the data records. A list of functions common to spreadsheets is offered, called aggregate
functions. They give us the ability to provide a summary report
from a query, rather than a detail report.
These functions can be combined with WHERE criteria, in order to apply them to subsets of the data. Example: SELECT COUNT(*) FROM book WHERE TITLE LIKE ('%of%'); This query would reply with the number of records matching the criteria. MySQL works with this syntax, and with an alternate syntax, replacing the * with the name of any field. It is possible to run a query on the results of another query by nesting the one whose results you need inside the one that reads those results. Figure out what you want to do for each query. The first one you need to run must be enclosed in parentheses, it must follow the IN operator, and that operator comes at the end of the second query to run. Example (colors are added for reference): SELECT DISTINCT AUTHOR_NUM
FROM wrote
WHERE BOOK_CODE IN (SELECT BOOK_CODE FROM book WHERE TYPE = 'FIC') ;
In this example the blue query runs first, and results in a list of book numbers that fall in the FIC type. The orange query then runs, matching records in the wrote table against the list of book numbers from the first query. I have made the semicolon orange because the command does not end until the second query runs. I have used the DISTINCT operator to avoid repetition of author numbers. Can you adapt this to give us the names of the authors who wrote fiction books, sorted by last name? SELECT AUTHOR_FIRST, AUTHOR_LAST from author In this example, I have added a new query. The blue
query runs first, generating the list of book codes for fiction books.
The orange query runs next, generating
the list of author numbers for fiction authors. Then the black
query runs, giving us the names of the fiction writing authors, and ordering
the list by last name. The trick was to determine how to get what I actually
wanted. I pulled the book codes for all fiction books, then pulled the
author numbers for the authors who wrote those books, then reported the
first and last names for the authors whose numbers I just pulled. In the
online interface for a bookstore, the SQL would be hidden from the customer
who would have no need to understand it. When you are troubleshooting commands, it is useful
to ask for a bit more information, to make sure you are
getting the results you want. If I were trying to troubleshoot the orange
section of the command above, I might remove the DISTINCT operator to
see how many separate book hits we get for each author. I might ask for
more fields in each query to make sure I was getting the right data, such
as getting book titles as well as codes in the blue query.
(SELECT AUTHOR_NUM
FROM wrote
WHERE BOOK_CODE IN (SELECT BOOK_CODE, TITLE FROM book WHERE TYPE = 'FIC'));
I have asked SQL to do the same thing, except that this time I want to see the book titles that match the book numbers in the result. This way, I can read the output and make sure that only the records I want to see are being reported. (Assuming I know anything about the books in my store.) After I was sure that the query worked, I would change it so that the output only contained what I (or my customer) really wanted in the result. That might not tell me enough, so I would also run the blue section by itself, with a similar change: SELECT BOOK_CODE, TITLE, TYPE FROM
book
WHERE TYPE = 'FIC';
This
command would show me the output of the inner query, along with the
type for each book selected. That output would be easy to check
visually, to verify that I am only getting book numbers for the correct
type. Again, I would only do this to check the command. If I ran the
nested query with this change in it, I might get results that I do not
want. The purpose of the blue query is just to generate a control list
for the orange query. It is more important to remove these visual
checks from the inner
query than from the outer one, because we are feeding the result of the
inner query to the outer one. We don't want the outer query to choke on
more data than it expects. This applies to each of the nested query
pairs, orange to blue, and blue to black.
|