Integrity Constraints (Part 2)
This is the second of a short series of installments of this regular column on the important topic of integrity constraints. Just to remind you, an integrity constraint (or just constraint for short) is basically a conditional expression that must evaluate to true. Last time, I introduced a classification scheme for such constraints, dividing them into domain (or type) constraints, column constraints, table constraints, and database constraints, and I gave a number of examples of type constraints specifically. Now I want to consider some examples of other kinds of constraints.
The examples that follow (in this installment and in remaining installments in this short series) are all based on the well-known suppliers and parts database. Here's an SQL definition for that database (in outline):
CREATE TABLE S --------------------------------- suppliers |
As you can see, first of all, there's a suppliers table S
,
with supplier number {S#
} as primary key and some other items (supplier
name, status, and city or location). Second, there's a parts table
P
, with part number {P#
} as primary key and some other
items (part name, color, weight, and city or location). And third, there's
a shipments table SP
, which shows which suppliers are shipping,
or supplying, which parts in which quantities. The primary key for shipments
is the combination {S#,P#
} of supplier number and part number (because
there's an assumption that there can't be more than one shipment of any given part
by any given supplier at any given time). Note that shipments can be thought
of as a relationship between suppliers and parts: a many-to-many relationship, to
be more specific. Note also the foreign key specifications.
Column Constraints
Now I want to focus on column constraints. Like domain or type constraints,
column constraints are fairly simple; the column constraint for some given column
C is essentially just a statement to the effect that values of that column
are drawn from some given domain or type D. Here are some examples
-- to be specific, the column constraints for the columns of the suppliers table
S
:
S.S# IN S#
S.SNAME IN NAME
S.STATUS IN INTEGER
S.CITY IN CHAR
The symbol "S
" in these constraints denotes an arbitrary
suppliers row. For simplicity (though perhaps a little unrealistically), I'm assuming
that columns STATUS
and CITY
are defined on some simple
system-defined types (INTEGER
and CHAR
, respectively),
instead of on user-defined types as the other two columns are (see the previous installment
in this series). Note that I'm using "S#
" as the name
of a domain as well as the name of a column.
Column constraints are effectively checked "immediately," or in other
words at the end of each update statement, loosely speaking. For example,
whenever the user tries to place a value somewhere in column SNAME
(say),
the system immediately checks to see whether that value is a legal NAME
value; if it isn't, the statement is rejected, and no update is done.
By the way, note that column constraints don't apply just to columns in base tables! -- because (as we saw a couple of installments back) it must be possible to update views as well as base tables.
Table Constraints
Now let's turn to table constraints. A table constraint can be as complicated as you like, provided only that it refers just to the individual table in question and no others. Here's an example:
S.STATUS > 0 AND S.STATUS
£100
("table S can never have a value in which some row has a status value that's
outside the range 0 to 100 inclusive"). Note that this constraint certainly
does refer to just the relevant table (namely, table S
). Note
too that it can be regarded as shorthand for a more formal version that looks something
like this:
FORALL s# IN S#, sn IN NAME, st IN INTEGER, sc IN CHAR
£
( IF { S# s#, SNAME sn, STATUS st, CITY sc } IN S
THEN st > 0 AND st100 )
("for all S#
values s#, all NAME
values
sn, all INTEGER
values st, and all CHAR
values
sc, if a row with S#
component s#, SNAME
component sn, STATUS
component st, and CITY
component sc appears in table S
, then st must be in the
range 0 to 100 inclusive"). For obvious reasons, I'm not going to worry
too much about such more formal expressions in this series; I merely remark that
we'd have to worry about them if we were trying to develop this theory formally (which
isn't what I'm trying to do here, of course).
By the way, please understand that the example is indeed an example of a table constraint, not a column constraint, even though it mentions just one column. Column constraints, to repeat, are constraints of the form "column C takes its values from domain D." The example is also, obviously, not a domain constraint either (domain constraints have to do with the legal values in a domain, not in a column or a table).
Note: Like column constraints, table constraints don't apply just to base tables, because (to repeat) we must be able to update views as well as base tables.
Here's another example of a table constraint:
IF S.CITY = 'London' THEN S.STATUS = 20
("suppliers in London must have status 20"). This example refers to two distinct columns (in the same table, of course).
And one more example:
IF SX.S# = SY.S# THEN SX.SNAME = SY.SNAME
AND SX.STATUS = SY.STATUS
AND SX.CITY = SY.CITY
("if supplier rows SX
and SY
have the same supplier
number, then they also have the same name, status, and city; hence they're really
the same row, and {S#
} is a superkey"). Note:
In case you're not familiar with the term "superkey," let me explain:
- First, a candidate key is a set of columns that can serve as a unique
identifier for the table in question and includes no column that's unnecessary for
unique identification purposes. For example, {
S#
} is a candidate key for suppliers, but {S#,CITY
} isn't (because of theCITY
column, which isn't needed to identify suppliers).
- Second, a superkey is a superset of a candidate key (so a candidate key
is always a superkey, but the converse is false). For example, {
S#,CITY
} is a superkey for suppliers. So is {S#
}, of course.
- Finally, a primary key is a candidate key that's chosen for some reason to be "more equal than the others." (If indeed there are any others, that is. If there's just one candidate key, no harm is done by calling it primary. But if there are two or more, then which if any is to be regarded as primary is essentially a matter of psychology, not logic, and I certainly don't want to get into that kind of issue here.)
Like column constraints, table constraints are checked immediately (where, to
say it again, "immediately" means at the end of each update statement,
loosely speaking). Thus, whenever the user tries to update table S
,
for example, the system immediately checks to see whether the new table value is
a legal value for S
; if it isn't, the statement is rejected, and no
update is done.
(To be continued.)
# # #
About our Contributor:
Online Interactive Training Series
In response to a great many requests, Business Rule Solutions now offers at-a-distance learning options. No travel, no backlogs, no hassles. Same great instructors, but with schedules, content and pricing designed to meet the special needs of busy professionals.