Data as an Asset
The chapter begins with a not very convincing argument that data is valuable because it can provide your company with competitive advantages. To have value, there must be something you can do with the information that can be derived from the data. In the graphic on page 723, the text illustrates a concept it calls the data-information-decision cycle.
On the next page, the text explores some causes of dirty data, data that is incorrect.
Some of the errors above result in problems across the enterprise. Standardization
becomes more important the larger the organization becomes. You can't
just start out with good intentions and trust that everything will go
well. Data assets need to be examined
and changed to meet standards,
which includes keeping them current.
The text recommends setting up master copies of tables that are used in
multiple system. Updates should be made to the master copy, then the other
necessary copies should update from the master. This controls redundancy
and sets a standard that other tables should follow.
Establishing a Role for Your DBMS
Even though we use a DBMS in each department of our organization, we
can be using different data spaces with different rules and different
versions of what should be the same tables. This makes our database look
more like an electronic version of an outdated paper file system. Central
control and management can fix this problem.
The text stresses that an enterprise
database is a planning tool. This is a special database that tracks
the activity of the company that uses it. It is there to support
managers making decisions, and to support the three standards goals
of security: Confidentiality, Integrity,
The text presents a common model of management:
The text presents several bullets for each category above, detailing the planning concerns at that level. It should be remarked that management at any level may need to access data for concerns at a higher or lower level as well. Sometimes everyone is concerned with the emergency of the day. The general idea is that the enterprise database will contain information that concerned parties in the enterprise will use to carry out their jobs. It is unlikely that any single database will contain all the information or tools a person will need, but it should provide access to everything it has that a person needs.
The text also discusses three problem areas that are typically encountered when an organization installs a management tool/enterprise database such as it has been proposing:
Note that the third area concerns what we usually call soft skills. People need time to accept change. They often need to be convinced that there are good reasons for that change. This aspect of the text is not pertinent to your skills in designing tables or linking them to each other. We will move on.
The text provides some history about:
The text explains that database administrators (DBAs) often fall into one of two organization chart locations. They may be in staff positions. In that case, they advise other IT staff, they design database structures, but they don't do general policy creation or enforcement. A DBA in a line position does all of those things, including the enforcement of policies. The text goes on to say that these two organization chart examples are not the only possibilities, but they are common. A DBA may be attached to any part of an IT chart that maintains services on servers.
On page 729, the text presents another view of a Database Life Cycle
(DBLC). It tells us that the DBA needs to be involved in all the steps
listed there. I am a bit confused by the next illustration, which shows
each of the functions in this version of the DBLC as a functional organization
chart for the DBA position. They are all duties the DBA would perform.
I don't think putting each of them on a branch of the DBA's activity tree
helps us much. The second org chart fragment on page 730 is clearer. It
shows a specialist DBA for each
of five kinds of systems, and all of them work for a systems
administrator, who oversees all of their work. At the bottom of
that page, the author adds another job title, data
administrator (DA). This role is mentioned again in the next section.
It is a higher management position that reports to the upper layers of
the organization. Think of it as a vice president over all the database
functions of the organization.
This section has an odd start. It begins with the assumption that your organization has DBAs and at least one DA over them. If this is so, the chart on page 731 will probably apply, with the higher level, managerial functions performed by the DA and the lower level, more technical functions performed by one or more DBAs. The part that feels odd to me is the idea that the DBA may have to do all of these functions is there is no DA involved. This is not entirely so. There may be some form of governance over IT functions that would have to be satisfied before new systems are brought up, new policies are enabled, and new job functions are added to a person or to a team.
On page 733, there is a table of skills that a good DBA should have,
divided into two categories: managerial
and technical. The managerial
skills are needed by any IT person working as a liaison between a system
and its users, whether building or maintaining that system. The text continues
the managerial section for a few pages (to page 738), and it seems that
the author is assuming that the DBA is either a team leader or a manager
with regard to the allocation of human resources mentioned on pages 733
As an aside, I will note that these definitions of standard and procedure
are not universal. Some organizations might consider a standard to be
a modification of a procedure to be used in particular circumstances which
vary from one location to another. The example in the text is simply a
more detailed instruction, not linked to a "problem" environment. It should
also be noted that managers are not the only ones who write procedures,
but they are typically the source/conduit of rules. Most people will resist
a rule unless they hear about it from their immediate manager, the person
who approves or disapproves their work. Some IT departments, and some
upper management people, forget about this, thinking they can just make
a rule that people will blindly follow.
This section of the chapter goes on to consider other duties that the
DBA or analysts working with the DBA would probably carry out. Look them
over and discuss any you have questions about.
On page 736, the text returns to the idea of security, which is more integral to the work a DBA does. Security policies that apply to a database may exist in the DBMS or in the network software, or in both. I think the author is padding this chapter, adding the section that follows about backup, recovery, and disaster management. These are topics that apply no more to a particular database than to any other IT asset of the organization. The system that the DBA manages should be part of an overall backup and recovery plan. It you have never read a discussion of these concepts, please read through them. If you have, scan this page and the next.
On page 738, the text mentions data
distribution as the last management concern of a DBA. The discussion
is pretty theoretical, except for the caution that administrators must
take more precautions to protect data that can be accessed outside their
own networks. On the same page, the text begins a list of DBA duties that
are more technical than administrative.
Page 745 takes us to the discussion of security.
We usually have this topic in several courses, so it should be no surprise
to find it in this text. The author begins with the classic three elements
of computer security: Confidentiality,
Integrity, and Availability.
The system should only open itself for authorized
users, it should not be changed
except by those allowed to make
changes, and it should be available
to authorized users when it is needed.
Table 16.4 on page 747 presents a list of system components, vulnerabilities that may apply to them, and security measures that should be taken to reduce vulnerability. You should review this chart to learn about vulnerabilities you don't already know about.
Security for Data, the Database, and the Organization
The text makes some recommendations about the security of databases. Several apply to networks, servers, and computer equipment in general.
Tools and Strategies
This section recommends obtaining and using several administration tools. It also recommends reading the information in the data dictionary, which is also called the information resource dictionary on page 751. The text recommends querying the SYSTABLES table, which is part of the data dictionary in DB2. In MySQL, you would examine several different, but similarly named tables. The online manual lists them on this page.
You may recall that the InnoDB database engine is the current default
engine for MySQL. Follow the link above for a description of what is stored
in each of those tables.
The text mentions CASE (Computer Aided Systems Engineering) tools, and
recommends them for building large systems. They often provide documentation
tools, help with complying to the flow of SDLC phases, and a known interface
once you learn it. It goes off on a tangent to discuss creating an ERD
in Visio, which is a fine tool for that purpose, but it has nothing to
do with creating a database system.
The chapter moves on to discuss creating a strategy for database management on page 755. In addition to to database concerns, it lists issues that pertain to any IT project, so making them standard concerns in all of your projects is a good idea.
On page 756, the text begins a section on the DBA's role in a cloud-based system. The DBA still has to manage the database, just not so much of the physical server. The server is a virtual server on someone else's computer, so some of the headaches are simplified. As the text points out, the connection to the remote server becomes more critical in this situation. No connectivity to the Internet, and specifically to your vendor's cloud means no value from your database while that is so. The text makes an argument against going with this option. The vendor on the cloud is benefited by the system running slower and using more processing power, which will cost you more in the long run than you might be saving by not owning the hardware. This should be monitored to determine what your best option is.
The chapter ends with a section on using Oracle and the management tools that are part of it.
Okay, that is enough theory from chapter 16. Let's take a look at some problems using the bookstore database. First off, let's assume you want to want to list the title of each book written by a particular author. For this problem, we will assume it is author 20. What tables do you need? Take a look at this unfinished ERD for that database.
The titles only occur in the book table, so we need that. There are no author numbers in the book table, so we need to relate to another table. How about the author table? It has author numbers, but it does not relate to books. The book table also has book codes for each book, and those codes are linked to author numbers in the wrote table.
So, let's assume we need the from clause to say:
The requirement for the problem says we only need the book title, but
that doesn't lead to good troubleshooting. How about outputting the title,
the book code, and the author number? Why would I want to do that? To
make sure I am getting the right selections. So the select clause
might say this:
Why did I pull two fields from the wrote table? It is a bit easier to do that since we are sure we can get both in that table, and that they will be matching pairs.
That takes us to a where clause, which is more complex for multiple
table queries. How can we select the right rows from the wrote
table? That's pretty easy, since we already know which author number we
want. The where clause can start like this:
That's not enough. We also need to pull the right book titles. The phrase above tells the database to read the rows for the right author. Now, we need to match the book numbers in those particular rows against titles in the book table.
and wrote.book_code = book.book_code
To make it nicer, let's sort the output by title.
Here is a screen capture of doing it in the MySQL Workbench:
We can walk through more in class if we can spark some interest in doing that.