COP 5725 Fall 2012
Database Management
Systems
University of Florida, CISE Department Prof. Daisy Zhe Wang
Adapted Slides from Prof. Jeff Ullman
and from R. Ramakrishnan and J. Gehrke
SQL: Queries, Constraints, Triggers
SQL Constraints & Triggers DML & Views
Why SQL?
• SQL is a very-high-level language.
– Say “what to do” rather than “how to do it.”
– Avoid a lot of data-manipulation details needed in procedural languages like C++ or Java.
• Database management system figures
out “best” way to execute query.
Basic SQL Query
• relation-list: A list of relation names (possibly with a range-variable after each name).
• target-list: A list of attributes of relations in relation-list.
• qualification: Comparisons (Attr op const or Attr1 op Attr2, where op is one of <, >, =, ≤, ≥, ≠) combined using AND, OR and NOT.
• DISTINCT is an optional keyword indicating that the answer should not contain duplicates. Default is that duplicates are not eliminated (bag semantics)!
SELECT [DISTINCT] target-list
FROM relation-list
WHERE qualification
Conceptual Evaluation Strategy
• Semantics of an SQL query defined in terms of the following conceptual evaluation
strategy:
– Compute the cross-product of relation-list.
– Discard resulting tuples if they fail qualifications.
– Delete attributes that are not in target-list.
– If DISTINCT is specified, eliminate duplicate rows.
• This strategy is probably the least efficient way to compute a query! The optimizer will find more efficient strategies to compute the same answers.
Example Instances
Sailors
sid  sname   rating  age
22   dustin  7       45.0
31   lubber  8       55.5
58   rusty   10      35.0
Reserves
sid  bid  day
22   101  10/10/96
58   103  11/12/96
Example of Conceptual Evaluation
SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid=R.sid AND R.bid=103
(sid) sname rating age (sid) bid day
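The conceptual strategy can be traced by hand on the example instances. The sketch below uses plain Python rather than SQL; it illustrates the definition only and says nothing about how a real DBMS would execute the query.

```python
from itertools import product

# Instances from the "Example Instances" slide.
sailors = [(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)]
reserves = [(22, 101, "10/10/96"), (58, 103, "11/12/96")]

# Step 1: cross-product of the relation-list.
cross = list(product(sailors, reserves))

# Step 2: discard tuples that fail S.sid=R.sid AND R.bid=103.
qualified = [(s, r) for s, r in cross if s[0] == r[0] and r[1] == 103]

# Step 3: keep only the target-list attribute S.sname.
result = [s[1] for s, r in qualified]

print(len(cross), result)
```

The cross-product has 3 x 2 tuples, of which only the (rusty, boat 103) pairing survives the qualification.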
A Note on Range Variables
• Really needed only if the same relation appears twice in the FROM clause. The
previous query can also be written as:
SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid=R.sid AND bid=103
OR
SELECT sname
FROM Sailors, Reserves
WHERE Sailors.sid=Reserves.sid AND bid=103
Find sailors who’ve reserved at least one boat
• Would adding DISTINCT to this query make a difference?
• What is the effect of replacing S.sid by S.sname in the SELECT clause? Would adding DISTINCT to this variant of the query make a difference?
SELECT S.sid
Expressions and Strings
• Illustrates use of arithmetic expressions and string pattern matching:
• AS and = are two ways to name fields in result.
• LIKE is used for string matching. ‘_’ stands for any one character and ‘%’ stands for 0 or more arbitrary characters.
SELECT S.age, age1=S.age-5, 2*S.age AS age2
FROM Sailors S
WHERE S.sname LIKE 'B_%B'
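A minimal sketch of the pattern 'B_%B', run through Python's sqlite3 module. The rows are hypothetical and chosen only to exercise the pattern. SQLite does not accept the `age1=...` naming form, so AS is used for both computed fields.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
# Hypothetical rows (not from the slides).
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [(1, "BOB", 7, 30.0), (2, "BB", 8, 40.0), (3, "BARB", 9, 50.0), (4, "DUSTIN", 7, 45.0)],
)

# 'B_%B': starts with B, then exactly one character, then zero or more, then ends with B.
rows = conn.execute(
    "SELECT S.sname, S.age - 5 AS age1, 2 * S.age AS age2 "
    "FROM Sailors S WHERE S.sname LIKE 'B_%B' ORDER BY S.sid"
).fetchall()
print(rows)
```

BB is rejected because the pattern needs at least three characters; BOB and BARB match.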
Find sid’s of sailors who’ve reserved a red or a green boat
• UNION: Can be used to compute the union of any two union-compatible sets of tuples (which are themselves the result of SQL queries).
• If we replace OR by AND in
Find sid’s of sailors who’ve reserved a red and a green boat
• INTERSECT: Can be used to compute the intersection of any two union-compatible sets of tuples.
• Included in the SQL/92 standard, but some systems don’t support it.
SELECT S.sid
FROM Sailors S, Boats B1, Reserves R1, Boats B2, Reserves R2
WHERE S.sid=R1.sid AND R1.bid=B1.bid
  AND S.sid=R2.sid AND R2.bid=B2.bid
  AND (B1.color='red' AND B2.color='green')
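As a rough check that the five-way join and an INTERSECT formulation agree, the sketch below runs both in SQLite. The Boats and Reserves rows are hypothetical, and the INTERSECT query is one plausible phrasing rather than the one from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER, sname TEXT);
CREATE TABLE Boats (bid INTEGER, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
-- Hypothetical rows: only sailor 22 reserved both a red and a green boat.
INSERT INTO Sailors VALUES (22, 'dustin'), (31, 'lubber'), (58, 'rusty');
INSERT INTO Boats VALUES (101, 'red'), (102, 'green');
INSERT INTO Reserves VALUES (22, 101), (22, 102), (31, 102), (58, 101);
""")

join_rows = conn.execute("""
SELECT S.sid
FROM Sailors S, Boats B1, Reserves R1, Boats B2, Reserves R2
WHERE S.sid=R1.sid AND R1.bid=B1.bid
  AND S.sid=R2.sid AND R2.bid=B2.bid
  AND B1.color='red' AND B2.color='green'
""").fetchall()

intersect_rows = conn.execute("""
SELECT R.sid FROM Boats B, Reserves R WHERE R.bid=B.bid AND B.color='red'
INTERSECT
SELECT R.sid FROM Boats B, Reserves R WHERE R.bid=B.bid AND B.color='green'
""").fetchall()

print(join_rows, intersect_rows)
```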
Nested Queries
• A very powerful feature of SQL: a WHERE clause can itself contain an SQL query! (Actually, so can FROM and HAVING clauses.)
• To find sailors who’ve not reserved #103, use NOT IN.
• To understand semantics of nested queries, think of a nested loops evaluation: For each Sailors tuple, check the qualification by computing the subquery.
SELECT S.sname
FROM Sailors S
WHERE S.sid IN (SELECT R.sid
                FROM Reserves R
                WHERE R.bid=103)
Nested Queries with Correlation
• EXISTS is another set comparison operator, like IN.
• If UNIQUE is used, and * is replaced by R.bid, finds sailors with at most one reservation for boat #103. (UNIQUE checks for duplicate tuples; * denotes all attributes. Why do we have to replace * by R.bid?)
• Illustrates why, in general, subquery must be re-computed for each Sailors tuple.
Find names of sailors who’ve reserved boat #103:
SELECT S.sname
FROM Sailors S
WHERE EXISTS (SELECT *
              FROM Reserves R
              WHERE R.bid=103 AND S.sid=R.sid)
More on Set-Comparison Operators
• We’ve already seen IN, EXISTS and UNIQUE. Can also use NOT IN, NOT EXISTS and NOT UNIQUE.
• Also available: op ANY, op ALL, op IN (used as <value> op ANY <expr>), where op is one of <, >, =, ≤, ≥, ≠
• Find sailors whose rating is greater than that of some sailor called Horatio:
SELECT *
FROM Sailors S
WHERE S.rating > ANY (SELECT S2.rating
                      FROM Sailors S2
                      WHERE S2.sname='Horatio')
Rewriting INTERSECT Queries Using IN
• Similarly, EXCEPT queries re-written using NOT IN.
Find sid’s of sailors who’ve reserved both a red and a green boat:
SELECT S.sid
FROM Sailors S, Boats B, Reserves R
WHERE S.sid=R.sid AND R.bid=B.bid AND B.color='red'
  AND S.sid IN (SELECT S2.sid
                FROM Sailors S2, Boats B2, Reserves R2
                WHERE S2.sid=R2.sid AND R2.bid=B2.bid
                  AND B2.color='green')
Division in SQL
a Reserves tuple showing S reserved B
Find sailors who’ve reserved all boats.
(1)
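The query for this slide did not survive extraction. One common way to express division is a double NOT EXISTS: select sailors S such that there is no boat B without a Reserves tuple showing S reserved B. The sketch below assumes that formulation; the Boats rows and the extra reservation (22, 103) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER, sname TEXT);
CREATE TABLE Boats (bid INTEGER);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
INSERT INTO Sailors VALUES (22, 'dustin'), (31, 'lubber'), (58, 'rusty');
-- Hypothetical: the club owns exactly two boats.
INSERT INTO Boats VALUES (101), (103);
-- (22,101) and (58,103) are from the slides; (22,103) is added for illustration.
INSERT INTO Reserves VALUES (22, 101), (58, 103), (22, 103);
""")

rows = conn.execute("""
SELECT S.sname
FROM Sailors S
WHERE NOT EXISTS (SELECT B.bid
                  FROM Boats B
                  WHERE NOT EXISTS (SELECT R.bid
                                    FROM Reserves R
                                    WHERE R.bid = B.bid AND R.sid = S.sid))
""").fetchall()
print(rows)
```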
Bag Semantics
• Although the SELECT-FROM-WHERE
statement uses bag semantics, the default for union, intersection, and difference is set semantics.
– That is, duplicates are eliminated as the operation is applied.
Motivation: Efficiency
• When doing projection, it is easier to avoid eliminating duplicates.
– Just work tuple-at-a-time.
• For intersection or difference, it is most efficient to sort the relations first.
Controlling Duplicate Elimination
• Force the result to be a set by SELECT DISTINCT . . .
• Force the result to be a bag (i.e., don’t eliminate duplicates) by ALL, as in . . . UNION ALL . . .
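A small sketch of the defaults and the two overrides, using hypothetical single-column relations R and S in SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE R (a INTEGER);
CREATE TABLE S (a INTEGER);
INSERT INTO R VALUES (1), (1), (2);
INSERT INTO S VALUES (2), (3);
""")

def column(sql):
    """Run a query and return its single column as a list."""
    return [row[0] for row in conn.execute(sql)]

bag = column("SELECT a FROM R ORDER BY a")                # SELECT keeps duplicates
as_set = column("SELECT DISTINCT a FROM R ORDER BY a")    # DISTINCT removes them
union_set = column("SELECT a FROM R UNION SELECT a FROM S ORDER BY a")
union_bag = column("SELECT a FROM R UNION ALL SELECT a FROM S ORDER BY a")

print(bag, as_set, union_set, union_bag)
```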
Join Expressions
• SQL provides several versions of (bag) joins.
Products and Natural Joins
• Natural join:
R NATURAL JOIN S;
• Product:
R CROSS JOIN S;
• Example:
Sailors NATURAL JOIN Reserves;
• Relations can be parenthesized subqueries, as well.
Theta Join
• R JOIN S ON <condition>
• Example:
Outerjoins
• R OUTER JOIN S is the core of an outerjoin expression. It is modified by:
1. Optional NATURAL in front of OUTER.
2. Optional ON <condition> after JOIN.
3. Optional LEFT, RIGHT, or FULL before OUTER.
– LEFT = pad dangling tuples of R only.
– RIGHT = pad dangling tuples of S only.
– FULL = pad both; this choice is the default.
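To make "padding dangling tuples" concrete, here is a sketch of a LEFT OUTER JOIN on the example instances, run in SQLite. Only the LEFT variant is shown because it is the one every SQLite version supports.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
INSERT INTO Sailors VALUES (22, 'dustin', 7, 45.0), (31, 'lubber', 8, 55.5), (58, 'rusty', 10, 35.0);
INSERT INTO Reserves VALUES (22, 101, '10/10/96'), (58, 103, '11/12/96');
""")

# lubber (sid 31) has no reservation, so his row is padded with NULL (None in Python).
rows = conn.execute("""
SELECT S.sid, R.bid
FROM Sailors S LEFT OUTER JOIN Reserves R ON S.sid = R.sid
ORDER BY S.sid
""").fetchall()
print(rows)
```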
Aggregate Operators
• Significant extension of relational algebra
Find name and age of the oldest sailor(s)
• The first query is illegal! (We’ll look into the reason a bit later, when we discuss GROUP BY.)
• The third query is equivalent to the second query, and is allowed in the SQL/92 standard, but is not supported in some systems.
Motivation for Grouping
• So far, we’ve applied aggregate operators to all (qualifying) tuples. Sometimes, we want to apply them to each of several groups of tuples.
• Consider: Find the age of the youngest sailor for each rating level.
– In general, we don’t know how many rating levels exist, and what the rating values for these levels are!
– Suppose we know that rating values go from 1 to 10; we can write 10 queries that look like this (!), one for each i = 1, 2, ..., 10:
SELECT MIN (S.age)
FROM Sailors S
WHERE S.rating = i
Queries With GROUP BY and HAVING
• The target-list contains (i) attribute names (ii) terms with aggregate operations (e.g., MIN (S.age)).
• The attribute list (i) must be a subset of grouping-list.
• Intuitively, each answer tuple corresponds to a group, and these attributes must have a single value per group.
SELECT [DISTINCT] target-list
FROM relation-list
WHERE qualification
GROUP BY grouping-list
HAVING group-qualification
Restriction on SELECT Lists With Aggregation
• If any aggregation is used, then each element of the SELECT list must be either:
1. Aggregated, or
2. An attribute on the GROUP BY list.
Requirements on HAVING Conditions
• These conditions may refer to any relation or tuple-variable in the FROM clause.
• They may refer to attributes of those relations, as long as the attribute makes sense within a group; i.e., it is either:
1. A grouping attribute, or
2. Aggregated.
Conceptual Evaluation
• The cross-product of relation-list is computed, tuples that fail qualification are discarded, ‘unnecessary’ fields are deleted, and the remaining tuples are partitioned into groups by the value of attributes in grouping-list.
• The group-qualification is then applied to eliminate some groups. Expressions in group-qualification must have a single value per group!
Find age of the youngest sailor with age >= 18, for each rating with at least 2 such sailors
[Query and answer table not recovered; answer columns: rating, minage]
Find age of the youngest sailor with age >= 18, for each rating with at least 2 such sailors and with every sailor under 60.
[Query and answer table not recovered; answer columns: rating, minage]
• ... changing EVERY to ANY?
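The slide's own queries are missing from this extraction. As a sketch, the first task (youngest sailor aged 18 or more, per rating with at least two such sailors) can be written with WHERE, GROUP BY and HAVING as below. The rows for horatio and zorba are hypothetical additions to the example instance.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [
        (22, "dustin", 7, 45.0),
        (31, "lubber", 8, 55.5),
        (58, "rusty", 10, 35.0),
        (64, "horatio", 7, 35.0),   # hypothetical row
        (71, "zorba", 10, 16.0),    # hypothetical row, removed by the WHERE clause
    ],
)

# WHERE filters tuples first; HAVING then filters whole groups.
rows = conn.execute("""
SELECT S.rating, MIN(S.age) AS minage
FROM Sailors S
WHERE S.age >= 18
GROUP BY S.rating
HAVING COUNT(*) > 1
""").fetchall()
print(rows)
```

Rating 10 drops out because only one of its two sailors passes the age filter, so its group has a single member when HAVING is applied.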
For each red boat, find the number of reservations for this boat
• What do we get if we remove B.color='red' from the WHERE clause and add a HAVING clause with this condition?
SELECT B.bid, COUNT (*) AS scount
FROM Boats B, Reserves R
Find age of the youngest sailor with age > 18, for each rating with at least 2 sailors (of any age)
• Shows HAVING clause can also contain a subquery.
• Compare this with the query where we considered only ratings with 2 sailors over 18!
• What if HAVING clause is replaced by:
Find those ratings for which the average age is the minimum over all ratings
• Aggregate operations cannot be nested! WRONG:
SELECT S.rating
FROM Sailors S
WHERE S.age = (SELECT MIN (AVG (S2.age)) FROM Sailors S2)
• Correct solution (in SQL/92):
SELECT Temp.rating, Temp.avgage
FROM (SELECT S.rating, AVG (S.age) AS avgage
      FROM Sailors S
      GROUP BY S.rating) AS Temp
WHERE Temp.avgage = (SELECT MIN (Temp.avgage)
                     FROM Temp)
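A sketch of both points in SQLite, on the example Sailors instance. SQLite does not let a subquery refer to a FROM-clause alias such as Temp as if it were a table, so this version names the intermediate result with a WITH clause instead; that substitution is an adaptation for SQLite, not the slide's SQL/92 text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [(22, "dustin", 7, 45.0), (31, "lubber", 8, 55.5), (58, "rusty", 10, 35.0)],
)

# Nesting one aggregate inside another is rejected when the statement is compiled.
try:
    conn.execute("SELECT MIN(AVG(S2.age)) FROM Sailors S2").fetchall()
    nested_rejected = False
except sqlite3.Error:
    nested_rejected = True

# Compute per-rating averages once, then pick the rating(s) with the smallest average.
rows = conn.execute("""
WITH Temp AS (SELECT S.rating AS rating, AVG(S.age) AS avgage
              FROM Sailors S
              GROUP BY S.rating)
SELECT rating, avgage
FROM Temp
WHERE avgage = (SELECT MIN(avgage) FROM Temp)
""").fetchall()
print(nested_rejected, rows)
```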
Null Values
• Field values in a tuple are sometimes unknown (e.g., a rating has not been assigned) or inapplicable (e.g., no spouse’s name).
– SQL provides a special value null for such situations.
• The presence of null complicates many issues. E.g.:
– Special operators needed to check if value is/is not null.
– Is rating>8 true or false when rating is equal to null? What about AND, OR and NOT connectives?
– We need a 3-valued logic (true, false and unknown).
– Meaning of constructs must be defined carefully. (e.g., WHERE clause eliminates rows that don’t evaluate to true.)
– New operators (in particular, outer joins) possible/needed.
Three-Valued Logic
• To understand how AND, OR, and NOT work in 3-valued logic, think of TRUE = 1, FALSE = 0, and UNKNOWN = ½.
• AND = MIN; OR = MAX; NOT(x) = 1-x.
• Example: TRUE AND (FALSE OR NOT(UNKNOWN)) = MIN(1, MAX(0, (1 - ½))) = MIN(1, MAX(0, ½)) = MIN(1, ½) = ½ = UNKNOWN.
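The numeric encoding can be checked directly. This is a minimal Python sketch of the MIN/MAX/1-x reading of the connectives; the function names are illustrative only.

```python
# Encode the three truth values as numbers.
TRUE, FALSE, UNKNOWN = 1.0, 0.0, 0.5

def AND(x, y):
    return min(x, y)

def OR(x, y):
    return max(x, y)

def NOT(x):
    return 1 - x

# TRUE AND (FALSE OR NOT(UNKNOWN))
result = AND(TRUE, OR(FALSE, NOT(UNKNOWN)))
print(result == UNKNOWN)
```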
Surprising Example
• From the following Sailors relation:
sid  sname   rating  age
22   Dustin  7
58   Lubber  8
SELECT sid
FROM Sailors
WHERE age < 20 OR age >= 20;
40
UNKNOWN UNKNOWN
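A sketch of the effect in SQLite. The age values are assumptions, since the extracted table does not show them clearly: Dustin's age is taken to be NULL and Lubber's to be 40.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL)")
# Assumed ages: NULL (None) for Dustin, 40 for Lubber.
conn.executemany(
    "INSERT INTO Sailors VALUES (?, ?, ?, ?)",
    [(22, "Dustin", 7, None), (58, "Lubber", 8, 40.0)],
)

# For a NULL age both comparisons are UNKNOWN, so the OR is UNKNOWN and the row is dropped.
rows = conn.execute(
    "SELECT sid FROM Sailors WHERE age < 20 OR age >= 20 ORDER BY sid"
).fetchall()

# Testing for NULL explicitly brings the row back.
all_rows = conn.execute(
    "SELECT sid FROM Sailors WHERE age < 20 OR age >= 20 OR age IS NULL ORDER BY sid"
).fetchall()
print(rows, all_rows)
```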
Constraints and Triggers
• A constraint is a relationship among data elements that the DBMS is required to enforce.
– Example: key, foreign key constraints.
• Triggers: event, condition, action.
– More flexible than simple constraints.
Kinds of Constraints
• Keys.
• Foreign-key, or referential-integrity.
• Value-based constraints.
– Constrain values of a particular attribute.
• Tuple-based constraints.
– Relationship among components.
Foreign Keys (Review)
• Consider Relation Sells(bar, beer, price).
• A constraint that requires a beer in Sells to be a beer in Beers is called a foreign-key constraint.
• 2 possible violation events/conditions:
– insert/update on Sells
– delete/update on Beers
• 4 possible actions
Attribute-Based Checks
• Constraints on the value of a particular attribute.
• Add: CHECK( <condition> ) to the declaration for the attribute.
• The condition may use the name of the attribute, but any other relation or
Example
CREATE TABLE Sells (
    bar CHAR(20),
    beer CHAR(20) CHECK ( beer IN
        (SELECT name FROM Beers)),
    price REAL CHECK ( price <= 5.00 )
);
Timing of Checks
• Attribute-based checks performed only when a value for that attribute is inserted or updated.
– Example: CHECK (price <= 5.00) checks every new price and rejects the modification (for that tuple) if the price is more than $5.
– Example: CHECK (beer IN (SELECT name FROM Beers)) not checked if a beer
Tuple-Based Checks
• CHECK ( <condition> ) may be added as a relation-schema element.
• The condition may refer to any attribute of the relation.
– But any other attributes or relations require a subquery.
Example: Tuple-Based Check
• Only Joe’s Bar can sell beer for more than $5:
CREATE TABLE Sells (
    bar CHAR(20),
    beer CHAR(20),
    price REAL,
    CHECK (bar = 'Joe''s Bar' OR
           price <= 5.00)
);
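The tuple-based check can be tried out in SQLite, which enforces CHECK constraints on insert and update. This is a sketch only; "Sue's Bar" is a hypothetical second bar. (SQLite does not allow subqueries inside CHECK, so the attribute-based check on beer from the earlier example cannot be reproduced this way.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Sells (
    bar   CHAR(20),
    beer  CHAR(20),
    price REAL,
    CHECK (bar = 'Joe''s Bar' OR price <= 5.00)
)""")

# Joe's Bar is exempt from the $5 limit.
conn.execute("INSERT INTO Sells VALUES (?, ?, ?)", ("Joe's Bar", "Bud", 6.00))

# Any other bar charging more than $5 violates the check and the insert is rejected.
try:
    conn.execute("INSERT INTO Sells VALUES (?, ?, ?)", ("Sue's Bar", "Bud", 6.00))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

stored = conn.execute("SELECT bar, price FROM Sells").fetchall()
print(rejected, stored)
```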
Assertions
• These are database-schema elements, like relations or views.
• Defined by:
CREATE ASSERTION <name> CHECK ( <condition> );
Example: Assertion
• In Sells(bar, beer, price), no bar may charge an average of more than $5.
CREATE ASSERTION NoRipoffBars CHECK ( NOT EXISTS (
SELECT bar FROM Sells
GROUP BY bar
HAVING 5.00 < AVG(price) ));
Example: Assertion
• In Drinkers(name, addr, phone) and
Bars(name, addr, license), there cannot be more bars than drinkers.
CREATE ASSERTION FewBar CHECK (
    (SELECT COUNT(*) FROM Bars) <=
    (SELECT COUNT(*) FROM Drinkers)
);
Timing of Assertion Checks
• In principle, we must check every
assertion after every modification to any relation of the database.
• A clever system can observe that only certain changes could cause a given assertion to be violated.
– Example: No change to Beers can affect
Triggers: Motivation
• Assertions are powerful, but the DBMS often can’t tell when they need to be checked.
• Attribute- and tuple-based checks are checked at known times, but are not powerful.
Event-Condition-Action Rules
• Another name for “trigger” is ECA rule, or event-condition-action rule.
• Event: typically a type of database modification, e.g., “insert on Sells.”
• Condition: Any SQL boolean-valued expression.
Preliminary Example: A Trigger
• Instead of using a foreign-key
constraint and rejecting insertions into
Example: Trigger Definition
CREATE TRIGGER BeerTrig
    AFTER INSERT ON Sells                                   -- the event
    REFERENCING NEW ROW AS NewTuple
    FOR EACH ROW
    WHEN (NewTuple.beer NOT IN (SELECT name FROM Beers))    -- the condition
    INSERT INTO Beers(name)
        VALUES(NewTuple.beer);
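For experimentation, a trigger with the same intent can be written in SQLite's dialect. This is an adaptation, not the standard syntax above: SQLite has no REFERENCING clause, exposes the inserted row as NEW, and requires BEGIN ... END around the action. The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Beers (name TEXT, manf TEXT);
CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL);

CREATE TRIGGER BeerTrig
AFTER INSERT ON Sells
FOR EACH ROW
WHEN (NEW.beer NOT IN (SELECT name FROM Beers))
BEGIN
    INSERT INTO Beers(name) VALUES (NEW.beer);
END;
""")

# The first insert mentions an unknown beer, so the trigger adds it to Beers (manf is NULL).
conn.execute("INSERT INTO Sells VALUES (?, ?, ?)", ("Joe's Bar", "Bud", 2.50))
# The second insert mentions a beer that is now known, so the condition is false.
conn.execute("INSERT INTO Sells VALUES (?, ?, ?)", ("Sue's Bar", "Bud", 3.00))

beers = conn.execute("SELECT name, manf FROM Beers").fetchall()
print(beers)
```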
Options: CREATE TRIGGER
• CREATE TRIGGER <name>
• Option:
CREATE OR REPLACE TRIGGER <name>
Options: The Event
• AFTER can be BEFORE.
– Also, INSTEAD OF, if the relation is a view.
• A great way to execute view modifications: have triggers translate them to appropriate modifications on the base tables.
• INSERT can be DELETE or UPDATE.
Options: FOR EACH ROW
• Triggers are either “row-level” or “statement-level.”
• FOR EACH ROW indicates row-level; its absence indicates statement-level.
• Row level triggers: execute once for each modified tuple.
• Statement-level triggers: execute once for an SQL statement, regardless of
Options: REFERENCING
• INSERT statements imply a new tuple (for row-level) or new table (for
statement-level).
– The “table” is the set of inserted tuples.
• DELETE implies an old tuple or table.
• UPDATE implies both.
• Refer to these by
Options: The Condition
• Any boolean-valued condition is appropriate.
• It is evaluated before or after the triggering event, depending on whether BEFORE or AFTER is used in the event.
• Access the new/old tuple or set of
Options: The Action
• There can be more than one SQL statement in the action.
– Surround by BEGIN . . . END if there is more than one.
• But queries make no sense in an action, so we are really limited to
Another Example
• Using Sells(bar, beer, price) and a
The Trigger
CREATE TRIGGER PriceTrig
    AFTER UPDATE OF price ON Sells         -- the event: only changes to prices
    REFERENCING
        OLD ROW AS ooo                     -- updates let us talk about old and new tuples
        NEW ROW AS nnn
    FOR EACH ROW                           -- we need to consider each price change
    WHEN (nnn.price > ooo.price + 1.00)    -- condition: a raise in price > $1
    INSERT INTO RipoffBars
        VALUES(nnn.bar);
Triggers on Views
• Generally, it is impossible to modify a view, because it doesn’t exist.
• But an INSTEAD OF trigger lets us
interpret view modifications in a way that makes sense.
Example: The View
CREATE VIEW Synergy AS
    SELECT Likes.drinker, Likes.beer, Sells.bar
    FROM Likes, Sells, Frequents
    WHERE Likes.drinker = Frequents.drinker
      AND Likes.beer = Sells.beer
      AND Sells.bar = Frequents.bar;
(Natural join of Likes, Sells, and Frequents.)
Interpreting a View Insertion
• We cannot insert into Synergy --- it is a view.
• But we can use an INSTEAD OF trigger to turn a (drinker, beer, bar) triple into three insertions of projected pairs, one for each of Likes, Sells, and Frequents.
The Trigger
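CREATE TRIGGER ViewTrig
    INSTEAD OF INSERT ON Synergy
    REFERENCING NEW ROW AS n
    FOR EACH ROW
    BEGIN
        INSERT INTO LIKES VALUES(n.drinker, n.beer);
        INSERT INTO SELLS(bar, beer) VALUES(n.bar, n.beer);
        INSERT INTO FREQUENTS VALUES(n.drinker, n.bar);
    END;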
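SQLite supports INSTEAD OF triggers on views, so the idea can be sketched end to end. The syntax below is SQLite's (NEW instead of a REFERENCING alias), and the three projected inserts follow the description of turning a (drinker, beer, bar) triple into one insertion each for Likes, Sells, and Frequents. The inserted triple is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Likes (drinker TEXT, beer TEXT);
CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL);
CREATE TABLE Frequents (drinker TEXT, bar TEXT);

CREATE VIEW Synergy AS
    SELECT Likes.drinker AS drinker, Likes.beer AS beer, Sells.bar AS bar
    FROM Likes, Sells, Frequents
    WHERE Likes.drinker = Frequents.drinker
      AND Likes.beer = Sells.beer
      AND Sells.bar = Frequents.bar;

CREATE TRIGGER ViewTrig
INSTEAD OF INSERT ON Synergy
FOR EACH ROW
BEGIN
    INSERT INTO Likes VALUES (NEW.drinker, NEW.beer);
    INSERT INTO Sells(bar, beer) VALUES (NEW.bar, NEW.beer);
    INSERT INTO Frequents VALUES (NEW.drinker, NEW.bar);
END;
""")

# The insert targets the view; the trigger rewrites it into base-table inserts.
conn.execute(
    "INSERT INTO Synergy(drinker, beer, bar) VALUES (?, ?, ?)",
    ("Sally", "Bud", "Joe's Bar"),
)

view_rows = conn.execute("SELECT drinker, beer, bar FROM Synergy").fetchall()
sells_rows = conn.execute("SELECT bar, beer, price FROM Sells").fetchall()
print(view_rows, sells_rows)
```

The price column in Sells is left NULL, since the view carries no price information.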
Trigger Differences in Oracle
• Compared with SQL standard triggers, Oracle has the following differences:
1. Action is a PL/SQL statement.
2. New/old tuples referenced automatically.
3. Strong constraints on trigger actions
Database Modifications
• A modification command does not return a result (as a query does), but changes the database in some way.
• Three kinds of modifications:
1. Insert a tuple or tuples.
2. Delete a tuple or tuples.
3. Update the value(s) of an existing tuple or tuples.
Insertion
• To insert a single tuple:
INSERT INTO <relation>
VALUES ( <list of values> );
• Example: add to Likes(drinker, beer) the fact that Sally likes Bud.
INSERT INTO Likes
VALUES('Sally', 'Bud');
Specifying Attributes in INSERT
• We may add to the relation name a list of attributes.
• Two reasons to do so:
1. We forget the standard order of attributes for the relation.
2. We don’t have values for all attributes, and we want the system to fill in missing
components with NULL or a default value.
Example: Specifying Attributes
• Another way to add the fact that Sally likes Bud to Likes(drinker, beer):
INSERT INTO Likes(beer, drinker)
VALUES('Bud', 'Sally');
Inserting Many Tuples
• We may insert the entire result of a query into a relation, using the form:
INSERT INTO <relation>
( <subquery> );
Example: Insert a Subquery
• Using Frequents(drinker, bar), enter
into the new relation PotBuddies(name)
Solution
INSERT INTO PotBuddies
(SELECT d2.drinker
 FROM Frequents d1, Frequents d2
 WHERE d1.drinker = 'Sally' AND
       d2.drinker <> 'Sally' AND
       d1.bar = d2.bar
);
Deletion
• To delete tuples satisfying a condition from some relation:
Example: Deletion
• Delete from Likes(drinker, beer) the fact that Sally likes Bud:
DELETE FROM Likes
WHERE drinker = 'Sally' AND
      beer = 'Bud';
Example: Delete all Tuples
• Make the relation Likes empty:
DELETE FROM Likes;
Example: Delete Many Tuples
• Delete from Beers(name, manf) all
beers for which there is another beer by the same manufacturer.
DELETE FROM Beers b
WHERE EXISTS (
    SELECT name FROM Beers
    WHERE manf = b.manf AND
          name <> b.name);
Semantics of Deletion --- (1)
• Suppose Anheuser-Busch makes only Bud and Bud Lite.
• Suppose we come to the tuple b for Bud first.
• The subquery is nonempty, because of the Bud Lite tuple, so we delete Bud.
• Now, when b is the tuple for Bud Lite,
Semantics of Deletion --- (2)
• Answer: we do delete Bud Lite as well.
• The reason is that deletion proceeds in two stages:
1. Mark all tuples for which the WHERE condition is satisfied.
2. Delete the marked tuples.
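The difference between mark-then-delete and deleting tuples as they are visited can be simulated in a few lines of plain Python. This is an illustration of the two-stage rule, not of any particular DBMS; the Summerbrew row is a hypothetical extra tuple.

```python
# Beers(name, manf); Anheuser-Busch makes only Bud and Bud Lite.
beers = [("Bud", "Anheuser-Busch"), ("Bud Lite", "Anheuser-Busch"), ("Summerbrew", "Pete's")]

def has_sibling(b, relation):
    """True if the relation holds another beer by the same manufacturer."""
    return any(o[1] == b[1] and o[0] != b[0] for o in relation)

# Two-stage semantics: evaluate the condition for every tuple against the
# unchanged relation (mark), then remove all marked tuples.
marked = [b for b in beers if has_sibling(b, beers)]
two_stage = [b for b in beers if b not in marked]

# Naive one-at-a-time deletion: once Bud is gone, Bud Lite no longer has a sibling.
naive = list(beers)
for b in list(naive):
    if has_sibling(b, naive):
        naive.remove(b)

print(two_stage, naive)
```

Under the two-stage rule both Anheuser-Busch beers disappear; the naive loop would wrongly keep Bud Lite.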
Updates
• To change certain attributes in certain tuples of a relation:
UPDATE <relation>
Example: Update
• Change drinker Fred’s phone number to 555-1212:
UPDATE Drinkers
SET phone = '555-1212'
WHERE name = 'Fred';
Example: Update Several Tuples
• Make $4 the maximum price for beer:
UPDATE Sells
SET price = 4.00
WHERE price > 4.00;
Views (Review)
• A view is a “virtual table” = a relation defined in terms of the contents of other tables and views.
• Declare by:
CREATE VIEW <name> AS <query>;
• Antonym: a relation whose value is really stored in the database is called a base table.
Example: View Definition
• CanDrink(drinker, beer) is a view “containing”
the drinker-beer pairs such that the drinker
frequents at least one bar that serves the beer:
CREATE VIEW CanDrink AS
    SELECT drinker, beer
    FROM Frequents, Sells
    WHERE Frequents.bar = Sells.bar;
Example: Accessing a View
• Query a view as if it were a base table.
– Also: a limited ability to modify views if it makes sense as a modification of one
underlying base table.
• Example query:
SELECT beer FROM CanDrink
WHERE drinker = 'Sally';
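A sketch of defining and querying CanDrink in SQLite. It assumes the view joins Frequents and Sells on bar, as the description "frequents at least one bar that serves the beer" implies; all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Frequents (drinker TEXT, bar TEXT);
CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL);
-- Hypothetical rows.
INSERT INTO Frequents VALUES ('Sally', 'Joe''s Bar'), ('Fred', 'Sue''s Bar');
INSERT INTO Sells VALUES ('Joe''s Bar', 'Bud', 2.50),
                         ('Joe''s Bar', 'Miller', 3.00),
                         ('Sue''s Bar', 'Coors', 3.00);

CREATE VIEW CanDrink AS
    SELECT drinker, beer
    FROM Frequents, Sells
    WHERE Frequents.bar = Sells.bar;
""")

# The view is queried exactly like a base table.
rows = conn.execute(
    "SELECT beer FROM CanDrink WHERE drinker = 'Sally' ORDER BY beer"
).fetchall()
print(rows)
```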
What Happens When a View Is
Used?
• The DBMS starts by interpreting the
query as if the view were a base table.
– Typical DBMS turns the query into something like relational algebra.
• The definitions of any views used by the query are also replaced by their
Example: View Expansion
[Expression tree: PROJ_beer ( SELECT_drinker='Sally' ( CanDrink ) ), where CanDrink is expanded to PROJ_drinker,beer ( JOIN ... )]
DBMS Optimization
• It is interesting to observe that the typical DBMS will then “optimize” the query by transforming the algebraic expression to one that can be executed faster.
• Key optimizations:
1. Push selections down the tree.
Example: Optimization
[Expression tree: PROJ_beer ( JOIN ( SELECT_drinker='Sally' ( Frequents ), Sells ) )]
Notice how most tuples are eliminated from Frequents before the
Summary
• SQL was an important factor in the early acceptance of the relational model; more natural than earlier, procedural query languages.
• Relationally complete; in fact, significantly more expressive power than relational algebra.
• Even queries that can be expressed in RA can often be expressed more naturally in SQL.
• Many alternative ways to write a query; optimizer should look for most efficient evaluation plan.
Summary (Contd.)
• NULL for unknown field values brings many complications
• SQL allows specification of rich integrity constraints
• Triggers respond to changes in the database
• More advanced DML
• View definition, execution and optimization
Exercise:
• Constraints can be named.
CREATE TABLE Reserves
( sname CHAR(10),
bid INTEGER,
day DATE,
PRIMARY KEY (bid,day),
CONSTRAINT noInterlakeRes
CHECK ('Interlake' <>
( SELECT B.bname FROM Boats B
Exercise: Constraints Over Multiple
number of Boats tuples can be anything!
• ASSERTION is the right solution; not associated with either table.
CREATE TABLE Sailors
CREATE ASSERTION smallClub
CHECK
( (SELECT COUNT (S.sid) FROM Sailors S)
+ (SELECT COUNT (B.bid) FROM Boats B) < 100 )
More Example: Trigger
CREATE TRIGGER youngSailorUpdate
    AFTER INSERT ON SAILORS
    REFERENCING NEW TABLE NewSailors
    FOR EACH STATEMENT
    INSERT
        INTO YoungSailors(sid, name, age, rating)
        SELECT sid, name, age, rating
Group-by Aggregation Exercise
• From Sells(bar, beer, price) and
Beers(name, manf), find the average price of those beers that are either served in at least three bars or are manufactured by Pete’s.
Solution
SELECT beer, AVG(price)
FROM Sells
GROUP BY beer
HAVING COUNT(bar) >= 3 OR
       beer IN (SELECT name
                FROM Beers
                WHERE manf = 'Pete''s');
(The subquery: beers manufactured by Pete’s.)
Beer groups with at least 3 non-NULL bars and also beer groups where the
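The solution can be exercised in SQLite on a small instance. All rows below are hypothetical: Bud is sold in three bars, Summerbrew is made by Pete's, and Coors qualifies on neither count. An ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Beers (name TEXT, manf TEXT);
CREATE TABLE Sells (bar TEXT, beer TEXT, price REAL);
-- Hypothetical rows.
INSERT INTO Beers VALUES ('Bud', 'Anheuser-Busch'), ('Summerbrew', 'Pete''s'), ('Coors', 'Coors');
INSERT INTO Sells VALUES ('Bar A', 'Bud', 2.00), ('Bar B', 'Bud', 3.00), ('Bar C', 'Bud', 4.00),
                         ('Bar A', 'Summerbrew', 5.00),
                         ('Bar B', 'Coors', 3.50);
""")

rows = conn.execute("""
SELECT beer, AVG(price)
FROM Sells
GROUP BY beer
HAVING COUNT(bar) >= 3 OR
       beer IN (SELECT name FROM Beers WHERE manf = 'Pete''s')
ORDER BY beer
""").fetchall()
print(rows)
```

The condition `beer IN (...)` is legal in HAVING because beer is a grouping attribute and so has a single value per group.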