• Integrity constraints
Example:
create table instructor (
ID char(5),
name varchar(20) not null,
dept_name varchar(20),
salary numeric(8,2),
primary key (ID),
foreign key (dept_name) references department);
Section - Relation
Set Operations
• Find courses that ran in Fall 2009 or in Spring 2010
(select course_id from section where semester = 'Fall' and year = 2009)
union
(select course_id from section where semester = 'Spring' and year = 2010)
• Find courses that ran in Fall 2009 but not in Spring 2010
(select course_id from section where semester = 'Fall' and year = 2009)
except
(select course_id from section where semester = 'Spring' and year = 2010)
(Figure: results of the union, intersect, and except operations)
Set Operations (Cont.)
• Find the salaries of all instructors that are less than the
largest salary.
• Query 1
select distinct T.salary
from instructor as T, instructor as S
where T.salary < S.salary
• Query 2: find all distinct salaries of instructors
select distinct salary
from instructor
• Find the largest salary of all instructors by taking
Query 2 except Query 1:
(Query 2)
except
(Query 1)
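The "Query 2 except Query 1" idea can be tried out directly. The following is a minimal sketch using Python's built-in sqlite3 module; the in-memory database and the four sample rows are assumptions made for illustration, and SQLite does not accept parentheses around the operands of except, so they are omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table instructor (ID text primary key, name text, salary numeric)")
# Hypothetical sample data, not taken from the slides.
conn.executemany(
    "insert into instructor values (?, ?, ?)",
    [("1", "A", 60000), ("2", "B", 80000), ("3", "C", 95000), ("4", "D", 80000)],
)

largest = conn.execute(
    """
    select distinct salary from instructor      -- Query 2: every salary
    except
    select distinct T.salary                    -- Query 1: salaries below some other salary
    from instructor as T, instructor as S
    where T.salary < S.salary
    """
).fetchall()

print(largest)
```

Only the salary that is not smaller than any other salary survives the except, which is the largest one.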
Set Operations (Cont.)
• Set operations union, intersect, and except
– Each of the above operations automatically
eliminates duplicates
– To retain all duplicates, use the
corresponding multiset versions union all,
intersect all and except all.
• Suppose a tuple occurs m times in r and n
times in s; then it occurs:
– m + n times in r union all s
– min(m,n) times in r intersect all s
– max(0, m – n) times in r except all s
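The three multiplicity rules can be checked with a small sketch. It models the multisets r and s with Python's collections.Counter rather than with SQL tables (an assumption made so the example stays self-contained); the Counter operators +, & and - happen to follow the same m + n, min(m, n) and max(0, m - n) rules.

```python
from collections import Counter

# Hypothetical multisets: tuple "a" occurs m = 3 times in r and n = 2 times in s.
r = Counter({"a": 3, "b": 1})
s = Counter({"a": 2, "c": 4})

union_all = r + s        # m + n occurrences of each tuple
intersect_all = r & s    # min(m, n) occurrences
except_all = r - s       # max(0, m - n) occurrences; non-positive counts are dropped

print(union_all["a"], intersect_all["a"], except_all["a"])
```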
Nested Queries
Nested Subqueries
• SQL provides a mechanism for the nesting of subqueries.
A subquery is a select-from-where expression that is
nested within another query.
• The nesting can be done in the following SQL query
select A1, A2, ..., An
from r1, r2, ..., rm
where P as follows:
– Ai can be replaced by a subquery that generates a single
value; ri can be replaced by any valid subquery
– P can be replaced with an expression of the form:
B <operation> (subquery)
where B is an attribute and <operation> is one of in, not in, exists, ...
Nested Queries
• SQL provides a mechanism for nesting subqueries.
• A subquery is a select-from-where expression that is
nested within another query.
• A common use of subqueries is to perform tests for
– set membership
– set comparisons
– set cardinality
• These tests are done by nesting subqueries in the where
clause.
Set Membership
• SQL allows testing tuples for membership in a
relation.
Definition of “some” Clause
• F <comp> some r ⇔ ∃ t ∈ r such that (F <comp> t)
(5 < some {0, 5, 6}) = true
(read: 5 < some tuple in the relation)
(5 < some {0, 5}) = false
(5 = some {0, 5}) = true
(5 ≠ some {0, 5}) = true (since 0 ≠ 5)
(= some) ≡ in
However, (≠ some) ≢ not in
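The semantics of some can be sketched in a few lines of Python. The helper name some and the use of plain lists as relations are assumptions for illustration; the comparison is passed in as a function from the operator module.

```python
import operator as op

def some(F, comp, r):
    """True if F <comp> t holds for at least one tuple t in r."""
    return any(comp(F, t) for t in r)

lt_a = some(5, op.lt, [0, 5, 6])   # 5 < 6, so true
lt_b = some(5, op.lt, [0, 5])      # 5 is less than neither 0 nor 5
eq_c = some(5, op.eq, [0, 5])      # same result as "5 in r"
ne_d = some(5, op.ne, [0, 5])      # true because 5 != 0 ...
not_in = 5 not in [0, 5]           # ... although "5 not in r" is false

print(lt_a, lt_b, eq_c, ne_d, not_in)
```

The last two values differ, which is why a not-equal comparison with some is not the same test as not in.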
Set Comparison – “all” Clause
• Find the names of all instructors whose
salary is greater than the salary of all
instructors in the Biology department.
select name
from instructor
where salary > all (select salary
from instructor
where dept_name = 'Biology');
Definition of “all” Clause
• F <comp> all r ⇔ ∀ t ∈ r (F <comp> t)
(5 < all {0, 5, 6}) = false
(5 < all {6, 10}) = true
(5 = all {4, 5}) = false
(5 ≠ all {4, 6}) = true (since 5 ≠ 4 and 5 ≠ 6)
(≠ all) ≡ not in
However, (= all) ≢ in
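The all comparison can be sketched the same way. The helper name all_of and the list-based relations are assumptions for illustration only.

```python
import operator as op

def all_of(F, comp, r):
    """True if F <comp> t holds for every tuple t in r."""
    return all(comp(F, t) for t in r)

a = all_of(5, op.lt, [0, 5, 6])    # fails for 0 and for 5
b = all_of(5, op.lt, [6, 10])      # 5 is below both values
c = all_of(5, op.eq, [4, 5])       # fails for 4
d = all_of(5, op.ne, [4, 6])       # same result as "5 not in r"
e = all_of(5, op.lt, [])           # an empty relation satisfies any "all" test

print(a, b, c, d, e)
```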
Test for Empty Relations
• or equivalently
insert into course (course_id, title, dept_name, credits)
values ('CS-437', 'Database Systems', 'Comp. Sci.', 4);
• Add a new tuple to student with tot_creds set to null
insert into student
values ('3003', 'Green', 'Finance', null);
Insertion (Cont.)
• Add all instructors to the student relation with
tot_creds set to 0
insert into student
select ID, name, dept_name, 0
from instructor
• The select-from-where statement is evaluated fully
before any of its results are inserted into the relation.
Otherwise, queries like
insert into table1 select * from table1
would cause problems.
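A small sketch with Python's sqlite3 module shows both forms of insert-from-select. The table definitions omit keys and the two instructor rows are hypothetical, so that the self-referencing insert can run without violating a uniqueness constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table instructor (ID text, name text, dept_name text, salary numeric)")
conn.execute("create table student (ID text, name text, dept_name text, tot_creds numeric)")
conn.executemany(
    "insert into instructor values (?, ?, ?, ?)",
    [("10101", "Srinivasan", "Comp. Sci.", 65000), ("15151", "Mozart", "Music", 40000)],
)

# Add all instructors to student with tot_creds set to 0.
conn.execute("insert into student select ID, name, dept_name, 0 from instructor")
after_first = conn.execute("select count(*) from student").fetchone()[0]

# The select is evaluated fully first, so this doubles the table once and then stops.
conn.execute("insert into student select * from student")
after_second = conn.execute("select count(*) from student").fetchone()[0]

print(after_first, after_second)
```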
Deletion
• Delete all instructors
delete from instructor
• Delete all instructors from the Finance department
delete from instructor
where dept_name = 'Finance';
• Delete all tuples in the instructor relation for those
instructors associated with a department located in
the Watson building.
delete from instructor
where dept_name in (select dept_name
from department where building = 'Watson');
Deletion (Cont.)
• Delete all instructors whose salary is less than
the average salary of instructors
delete from instructor
where salary < (select avg (salary) from instructor);
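The deletion of below-average salaries can be sketched with Python's sqlite3 module. The four salaries are hypothetical; their average is 70000, and the point of the example is that the average is computed once, before any tuple is removed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table instructor (ID text primary key, salary numeric)")
# Hypothetical salaries with an average of 70000.
conn.executemany(
    "insert into instructor values (?, ?)",
    [("1", 40000), ("2", 60000), ("3", 80000), ("4", 100000)],
)

# The subquery is evaluated against the original relation; the average is not
# recomputed as tuples disappear.
conn.execute("delete from instructor where salary < (select avg(salary) from instructor)")

remaining = [row[0] for row in conn.execute("select salary from instructor order by salary")]
print(remaining)
```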
Aggregation (Cont.)
• Attributes in select clause outside of aggregate
functions must appear in group by list
• Find the number of instructors in each department
who teach a course in the Spring 2010 semester
– select dept_name, count (distinct ID) as instr_count
from instructor natural join teaches
where semester = 'Spring' and year = 2010
group by dept_name;
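The grouped count can be run as a sketch with Python's sqlite3 module. The simplified instructor and teaches schemas and all rows are assumptions; the two tables share only the ID column, so the natural join matches on ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table instructor (ID text, name text, dept_name text, salary numeric)")
conn.execute("create table teaches (ID text, course_id text, sec_id text, semester text, year numeric)")
conn.executemany("insert into instructor values (?, ?, ?, ?)", [
    ("1", "A", "Comp. Sci.", 65000),
    ("2", "B", "Comp. Sci.", 75000),
    ("3", "C", "Biology", 72000),
])
conn.executemany("insert into teaches values (?, ?, ?, ?, ?)", [
    ("1", "CS-101", "1", "Spring", 2010),
    ("1", "CS-315", "1", "Spring", 2010),   # same instructor, second course
    ("2", "CS-319", "1", "Spring", 2010),
    ("3", "BIO-101", "1", "Summer", 2009),  # filtered out by the where clause
])

counts = conn.execute(
    """
    select dept_name, count(distinct ID) as instr_count
    from instructor natural join teaches
    where semester = 'Spring' and year = 2010
    group by dept_name
    """
).fetchall()
print(counts)
```

Because of distinct, instructor 1 is counted once even though two of that instructor's sections match.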
Aggregate Functions – Having Clause
• Find the names and average salaries of all departments
whose average salary is greater than 42000
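A query of this kind can be sketched with a having clause, which filters groups after aggregation. The example below uses Python's sqlite3 module; the instructor rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table instructor (ID text, name text, dept_name text, salary numeric)")
conn.executemany("insert into instructor values (?, ?, ?, ?)", [
    ("1", "A", "Comp. Sci.", 65000),
    ("2", "B", "Comp. Sci.", 75000),
    ("3", "C", "Music", 40000),
    ("4", "D", "Finance", 90000),
    ("5", "E", "Finance", 80000),
])

rows = conn.execute(
    """
    select dept_name, avg(salary) as avg_salary
    from instructor
    group by dept_name
    having avg(salary) > 42000      -- applied to each group, after grouping
    order by dept_name
    """
).fetchall()
print(rows)
```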
– Relation prereq
• Observe that:
– prereq information is missing for CS-315 and
– course information is missing for CS-437
Inner Join
• An inner join is a Cartesian Product that must
specify some conditions under which the two
relations are joined.
– The on condition allows a general predicate over
the relations being joined.
– The using clause specifies which attributes are
to be used to “join” the two relations.
• In SQL one can use either one of the following:
– join
– inner join
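The difference between on and using can be seen in a short sketch with Python's sqlite3 module. The course and prereq rows are hypothetical sample data chosen so that one course has no prerequisite row and one prerequisite row has no course.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table course (course_id text, title text, dept_name text, credits numeric)")
conn.execute("create table prereq (course_id text, prereq_id text)")
conn.executemany("insert into course values (?, ?, ?, ?)", [
    ("BIO-301", "Genetics", "Biology", 4),
    ("CS-190", "Game Design", "Comp. Sci.", 4),
    ("CS-315", "Robotics", "Comp. Sci.", 3),      # no prereq row
])
conn.executemany("insert into prereq values (?, ?)", [
    ("BIO-301", "BIO-101"),
    ("CS-190", "CS-101"),
    ("CS-437", "CS-101"),                          # no course row
])

with_on = conn.execute(
    "select * from course join prereq on course.course_id = prereq.course_id"
).fetchall()
with_using = conn.execute(
    "select * from course inner join prereq using (course_id)"
).fetchall()

print(len(with_on), len(with_on[0]))        # on keeps both course_id columns
print(len(with_using), len(with_using[0]))  # using merges them into one
```

Both forms return only the two matching pairs; the unmatched rows on either side are dropped by an inner join.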
Inner Join -- On Condition
course relation
prereq relation
• Types of IC’s:
– Fundamental: Domain constraints, primary key
constraints, foreign key constraints
– General constraints : Check Constraints and Assertions.
Integrity Constraints on a Single Relation
• Four different constraints
– not null
– primary key
– unique
– check (P)
• where P is a predicate
Not Null and Unique Constraints
• not null
– Declare name and budget to be not null
name varchar(20) not null
budget numeric(12,2) not null
• unique ( A1, A2, …, Am)
– The unique specification states that the
attributes
A1, A2, … Am
form a candidate key.
– Candidate keys are permitted to be null
(in contrast to primary keys).
The check clause
• check (P)
where P is a predicate
• Example: ensure that semester value is one of
the following {Fall, Winter, Spring, Summer}:
create table section (
course_id varchar (8),
sec_id varchar (8),
semester varchar (6),
year numeric (4,0),
building varchar (15),
room_number varchar (7),
time_slot_id varchar (4),
primary key (course_id, sec_id, semester, year),
check (semester in ('Fall', 'Winter', 'Spring',
'Summer')));
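The effect of the check clause can be observed with a short sketch in Python's sqlite3 module. The table is a trimmed version of section (fewer columns is an assumption made for brevity), and the inserted rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    create table section (
        course_id varchar(8),
        sec_id    varchar(8),
        semester  varchar(6),
        year      numeric(4,0),
        primary key (course_id, sec_id, semester, year),
        check (semester in ('Fall', 'Winter', 'Spring', 'Summer'))
    )
    """
)

conn.execute("insert into section values ('CS-101', '1', 'Fall', 2009)")  # accepted

try:
    conn.execute("insert into section values ('CS-101', '1', 'Autumn', 2009)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True   # 'Autumn' violates the check predicate

stored = conn.execute("select count(*) from section").fetchone()[0]
print(rejected, stored)
```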
Referential Integrity
• Ensures that a value that appears in one relation for a
given set of attributes also appears for a certain set of
attributes in another relation.
– Example: If “Perryridge” is a branch name appearing in
one of the tuples in the account relation, then there exists
a tuple in the branch relation for branch “Perryridge”.
• Formal Definition
– Let r1(R1) and r2(R2) be relations with primary keys K1 and
K2 respectively.
– The subset α of R2 is a foreign key referencing K1 in
relation r1, if for every t2 in r2 there must be a tuple t1 in r1
such that t1[K1] = t2[α].
– A referential integrity constraint is also called a subset
dependency, since it can be written as
$\Pi_{\alpha}(r_2) \subseteq \Pi_{K_1}(r_1)$
Checking Referential Integrity on
Database Modification
• The following tests must be made in order to preserve
the following referential integrity constraint:
$\Pi_{\alpha}(r_2) \subseteq \Pi_{K}(r_1)$
• Insert. If a tuple t2 is inserted into r2, the system must
ensure that there is a tuple t1 in r1 such that t1[K] =
t2[α]. That is,
$t_2[\alpha] \in \Pi_{K}(r_1)$
• Delete. If a tuple t1 is deleted from r1, the system
must compute the set of tuples in r2 that reference t1:
$\sigma_{\alpha = t_1[K]}(r_2)$
If this set is not empty
– either the delete command is rejected as an error, or
– the tuples that reference t1 must themselves be deleted
(cascading deletions are possible).
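The insert and delete tests can be watched in action with Python's sqlite3 module. This is a sketch under assumptions: department plays the role of r1, instructor the role of r2, the rows are hypothetical, and SQLite only enforces foreign keys after the foreign_keys pragma is switched on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite checks foreign keys only when enabled
conn.execute("create table department (dept_name varchar(20) primary key)")
conn.execute(
    """
    create table instructor (
        ID        char(5) primary key,
        name      varchar(20) not null,
        dept_name varchar(20),
        foreign key (dept_name) references department
    )
    """
)
conn.execute("insert into department values ('Biology')")

def attempt(sql):
    """Run one statement; report whether the integrity checks accepted it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

insert_ok = attempt("insert into instructor values ('10101', 'Crick', 'Biology')")
insert_bad = attempt("insert into instructor values ('22222', 'Einstein', 'Physics')")
delete_bad = attempt("delete from department where dept_name = 'Biology'")

print(insert_ok, insert_bad, delete_bad)
```

The second insert fails the insert test (no matching department), and the delete is rejected because the set of referencing instructor tuples is not empty.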
Database Modification (Cont.)
• Update. There are two cases:
– If a tuple t2 is updated in relation r2 and the update modifies values
for the foreign key α, then a test similar to the insert case is made:
• Let t2’ denote the new value of tuple t2. The system must ensure
that
$t_2'[\alpha] \in \Pi_{K}(r_1)$
– If a tuple t1 is updated in r1, and the update modifies values for the
primary key (K), then a test similar to the delete case is made:
1. The system must compute
$\sigma_{\alpha = t_1[K]}(r_2)$
using the old value of t1 (the value before the update is applied).
2. If this set is not empty
1. the update may be rejected as an error, or
2. the update may be cascaded to the tuples in the set, or
3. the tuples in the set may be deleted.
Referential Integrity in SQL
• Primary and candidate keys and foreign keys can be specified as part of
the SQL create table statement:
– The primary key clause lists attributes that comprise the primary
key.
– The unique key clause lists attributes that comprise a candidate
key.
– The foreign key clause lists the attributes that comprise the foreign
key and the name of the relation referenced by the foreign key.
• By default, a foreign key references the primary key attributes of the
referenced table
foreign key (account_number) references account
• Short form for specifying a single column as foreign key
account_number char (10) references account
• Reference columns in the referenced table can be explicitly specified
– but must be declared as primary/candidate keys
foreign key (account_number) references account(account_number)
Referential Integrity in SQL – Example
create table customer
(customer_name char(20),
customer_street char(30),
customer_city char(30),
primary key (customer_name));
create table branch
(branch_name char(15),
branch_city char(30),
assets integer,
primary key (branch_name));
Referential Integrity in SQL – Example (Cont.)
create table account
(account_number char(10),
branch_name char(15),
balance integer,
primary key (account_number),
foreign key (branch_name) references branch);
create table depositor
(customer_name char(20),
account_number char(10),
primary key (customer_name, account_number),
foreign key (account_number) references account,
foreign key (customer_name) references customer);
Cascading Actions in SQL
create table account
...
foreign key (branch_name) references branch
on delete cascade
on update cascade
...)
• Due to the on delete cascade clause, if deleting a
tuple in branch would violate the referential-integrity
constraint, the delete “cascades” to the
account relation, deleting the tuples that refer to the
branch that was deleted.
• Cascading updates are similar.
Cascading Actions in SQL (Cont.)
• If there is a chain of foreign-key dependencies across
multiple relations, with on delete cascade specified for each
dependency, a deletion or update at one end of the chain
can propagate across the entire chain.
• If a cascading update or delete causes a constraint violation
that cannot be handled by a further cascading operation,
the system aborts the transaction.
– As a result, all the changes caused by the transaction and its
cascading actions are undone.
• Referential integrity can be checked at the end of a
transaction instead of after each statement (in standard SQL,
by declaring the constraint initially deferred)
– Intermediate steps are allowed to violate referential integrity
provided later steps remove the violation
– Otherwise it would be impossible to create some database
states, e.g. insert two tuples whose foreign keys point to each
other
• E.g. spouse attribute of relation
marriedperson(name, address, spouse)
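The marriedperson case can be sketched with Python's sqlite3 module, which accepts a deferrable initially deferred foreign key. The names and addresses are hypothetical, and the sketch relies on sqlite3's default behaviour of opening a transaction before the first insert.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    """
    create table marriedperson (
        name    text primary key,
        address text,
        spouse  text references marriedperson(name)
                deferrable initially deferred   -- checked at commit time
    )
    """
)

# Inside one transaction the first insert temporarily violates the constraint.
conn.execute("insert into marriedperson values ('Ann', '1 Elm St', 'Bob')")
conn.execute("insert into marriedperson values ('Bob', '1 Elm St', 'Ann')")
conn.commit()   # both references now resolve, so the commit succeeds

# A violation that is never repaired is caught when the transaction ends.
conn.execute("insert into marriedperson values ('Cy', '2 Oak St', 'Dee')")
try:
    conn.commit()
    commit_failed = False
except sqlite3.IntegrityError:
    commit_failed = True
    conn.rollback()   # undo the offending insert

people = conn.execute("select count(*) from marriedperson").fetchone()[0]
print(commit_failed, people)
```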
Referential Integrity in SQL (Cont.)
• Alternative to cascading:
– on delete set null
– on delete set default
• Null values in foreign key attributes
complicate SQL referential integrity semantics,
and are best prevented using not null
– if any attribute of a foreign key is null, the tuple is
defined to satisfy the foreign key constraint!
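Both points, the set null action and the fact that a null foreign key satisfies the constraint, can be sketched with Python's sqlite3 module. The department and instructor tables and their rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("create table department (dept_name varchar(20) primary key)")
conn.execute(
    """
    create table instructor (
        ID        char(5) primary key,
        name      varchar(20) not null,
        dept_name varchar(20),
        foreign key (dept_name) references department
            on delete set null
    )
    """
)
conn.execute("insert into department values ('Biology')")
conn.execute("insert into instructor values ('10101', 'Crick', 'Biology')")

# A null foreign key value is accepted even though no department matches it.
conn.execute("insert into instructor values ('22222', 'Einstein', null)")

# Deleting the department nulls out the references instead of rejecting the delete.
conn.execute("delete from department where dept_name = 'Biology'")

rows = conn.execute("select ID, dept_name from instructor order by ID").fetchall()
print(rows)
```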
Referential Integrity
• Ensures that a value that appears in one relation for a given
set of attributes also appears for a certain set of attributes
in another relation.
– Example: If “Biology” is a department name appearing
in one of the tuples in the instructor relation, then there
exists a tuple in the department relation for “Biology”.
• Let A be a set of attributes. Let R and S be two relations
that contain attributes A and let A be the primary key of S.
A is said to be a foreign key of R if for any values of A
appearing in R these values also appear in S.
– In the above example, R is the instructor table, S is the
department table, and A is dept_name
Referential Integrity Example
• create table course (
course_id char(5),
title varchar(20),
dept_name varchar(20),
primary key (course_id),
foreign key (dept_name) references
department);
Cascading Actions in Referential Integrity
• create table course (
…
dept_name varchar(20),
foreign key (dept_name) references department
on delete cascade
on update cascade,
. . . );
• delete cascade -- if a department (say Biology) is
deleted from the department relation, then all tuples in
the course relation that refer to Biology are deleted.
• update cascade -- if a department (say Biology) is
changed to (say Life-Science) in the department
relation, then all tuples in the course relation that refer
to Biology are updated to refer to Life-Science.
• alternative actions to cascade: set null, set default
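The two cascading actions can be tried out with Python's sqlite3 module. This is a sketch with hypothetical rows; the course table is reduced to the columns needed for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("create table department (dept_name varchar(20) primary key)")
conn.execute(
    """
    create table course (
        course_id varchar(8) primary key,
        dept_name varchar(20),
        foreign key (dept_name) references department
            on delete cascade
            on update cascade
    )
    """
)
conn.executemany("insert into department values (?)", [("Biology",), ("Comp. Sci.",)])
conn.executemany("insert into course values (?, ?)", [
    ("BIO-101", "Biology"),
    ("BIO-301", "Biology"),
    ("CS-101", "Comp. Sci."),
])

# update cascade: renaming the department rewrites the referencing course tuples.
conn.execute("update department set dept_name = 'Life-Science' where dept_name = 'Biology'")

# delete cascade: removing the department removes its courses as well.
conn.execute("delete from department where dept_name = 'Comp. Sci.'")

courses = conn.execute("select course_id, dept_name from course order by course_id").fetchall()
print(courses)
```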
Domain Constraints
• Integrity constraints guard against accidental damage to
the database, by ensuring that authorized changes to the
database do not result in a loss of data consistency.
• Domain constraints are the most elementary form of
integrity constraint.
• They test values inserted in the database, and test queries
to ensure that the comparisons make sense.
• New domains can be created from existing data types
– E.g. create domain Dollars numeric(12, 2)
create domain Pounds numeric(12,2)
• We cannot assign or compare a value of type Dollars to a
value of type Pounds.
– However, we can convert type as below
cast (r.A as Pounds)
(Should also multiply by the dollar-to-pound conversion-rate)
Domain Constraints (Cont.)
• The check clause in SQL-92 permits domains to be restricted:
– Use check clause to ensure that an hourly_wage domain
allows only values of at least a specified value.
create domain hourly_wage numeric(5,2)
constraint value_test check (value >= 4.00)
– The domain has a constraint that ensures that the hourly
wage is at least 4.00
– The clause constraint value_test is optional; useful to
indicate which constraint an update violated.
• Can have complex conditions in domain check
– create domain AccountType char(10)
constraint account_type_test
check (value in ('Checking', 'Saving'))
– check (branch_name in (select branch_name from branch))
Assertions
• An assertion is a predicate expressing a condition that
we wish the database always to satisfy.
• An assertion in SQL takes the form
create assertion <assertion-name> check <predicate>
• When an assertion is made, the system tests it for
validity, and tests it again on every update that may
violate the assertion
– This testing may introduce a significant amount of
overhead; hence assertions should be used with great care.
• Asserting
for all X, P(X)
is achieved in a round-about fashion using
not exists X such that not P(X)
Assertion Example
• The sum of all loan amounts for each branch must
be less than the sum of all account balances at the
branch.
create assertion sum_constraint check
(not exists (select * from branch
where (select sum(amount) from
loan
where loan.branch_name =
branch.branch_name)
>= (select sum(balance) from
account
where account.branch_name =
branch.branch_name)))
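SQLite, used here through Python's sqlite3 module, does not support create assertion, so the sketch below only evaluates the assertion's predicate as an ordinary query. The helper name assertion_holds, the reduced schemas and the sample amounts are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table branch (branch_name text primary key)")
conn.execute("create table loan (loan_number text, branch_name text, amount numeric)")
conn.execute("create table account (account_number text, branch_name text, balance numeric)")
conn.execute("insert into branch values ('Perryridge')")
conn.execute("insert into account values ('A-1', 'Perryridge', 1000)")
conn.execute("insert into loan values ('L-1', 'Perryridge', 500)")

def assertion_holds():
    """Evaluate the predicate: no branch has loans totalling at least its balances."""
    (ok,) = conn.execute(
        """
        select not exists (
            select * from branch
            where (select sum(amount) from loan
                   where loan.branch_name = branch.branch_name)
               >= (select sum(balance) from account
                   where account.branch_name = branch.branch_name))
        """
    ).fetchone()
    return bool(ok)

before = assertion_holds()      # loans 500 < balances 1000
conn.execute("insert into loan values ('L-2', 'Perryridge', 700)")
after = assertion_holds()       # loans 1200 >= balances 1000

print(before, after)
```

A system that enforced the assertion would have to rerun a test like this on every update to loan or account, which is where the overhead comes from.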
Assertion Example
• Every loan has at least one borrower who maintains an
account with a minimum balance of $1000.00
create assertion balance_constraint check
(not exists (
select * from loan
where not exists (
select *
from borrower, depositor, account
where loan.loan_number = borrower.loan_number
and borrower.customer_name =
depositor.customer_name
and depositor.account_number =
account.account_number
and account.balance >= 1000)))
Embedded SQL
• The SQL standard defines embeddings of SQL in a variety of
programming languages such as C, C++, Java, Fortran, and
PL/1.
• A language in which SQL queries are embedded is referred
to as a host language, and the SQL structures permitted in
the host language comprise embedded SQL.
• The basic form of these languages follows that of the
System R embedding of SQL into PL/1.
• The EXEC SQL statement is used to identify an embedded SQL
request to the preprocessor
EXEC SQL <embedded SQL statement >;
Note: this varies by language:
– In some languages, like COBOL, the semicolon is replaced with
END-EXEC
– In Java, the embedding uses #sql { … };
Embedded SQL (Cont.)
• Before executing any SQL statements, the program must
first connect to the database. This is done using:
EXEC SQL connect to server user user-name using
password;
Here, server identifies the server to which a connection is
to be established.
• Variables of the host language can be used within
embedded SQL statements. They are preceded by a colon
(:) to distinguish them from SQL variables (e.g., :credit_amount)
• Variables used as above must be declared within a DECLARE
section, as illustrated below. The syntax for declaring the
variables, however, follows the usual host language syntax.
EXEC SQL BEGIN DECLARE SECTION;
int credit_amount;
EXEC SQL END DECLARE SECTION;
Embedded SQL (Cont.)
• To write an embedded SQL query, we use the
statement:
declare c cursor for <SQL query>
The variable c is used to identify the query
• Example:
– From within a host language, find the ID and name
of students who have completed more than the
number of credits stored in variable credit_amount
in the host language
– Specify the query in SQL as follows:
EXEC SQL
declare c cursor for
select ID, name
from student
where tot_creds > :credit_amount;
Embedded SQL (Cont.)
• To execute an embedded SQL query, we use the
open statement, which causes the database system to
execute the query and to save the results within a
temporary relation
• The open statement for our example is as follows:
EXEC SQL open c ;
The query uses the value of the host-language
variable credit_amount at the time the open
statement is executed.
• The fetch statement causes the values of one tuple
in the query result (i.e., ID and name of a student)
to be placed in host language variables :si and :sn
EXEC SQL fetch c into :si, :sn;
Repeated calls to fetch get successive tuples in the
query result
Embedded SQL (Cont.)
• A variable called SQLSTATE in the SQL
communication area (SQLCA) is set to the value
‘02000’ when there is no more data available
to fetch.
• The close statement causes the database
system to delete the temporary relation that
holds the result of the query.
EXEC SQL close c ;
Note: the above details vary with language. For
example, the Java embedding defines
Java iterators to step through result tuples.
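The declare / open / fetch / close cycle has a close analogue in Python's sqlite3 module, sketched below. This is not embedded SQL with a preprocessor; it is a library API used as a stand-in, and the student rows are hypothetical. The named placeholder :credit_amount plays the role of the host variable, and a fetchone result of None plays the role of SQLSTATE '02000'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table student (ID text, name text, dept_name text, tot_creds numeric)")
conn.executemany("insert into student values (?, ?, ?, ?)", [
    ("00128", "Zhang", "Comp. Sci.", 102),
    ("12345", "Shankar", "Comp. Sci.", 32),
    ("19991", "Brandt", "History", 80),
    ("98988", "Tanaka", "Biology", 120),
])

credit_amount = 100                     # host-language variable

c = conn.cursor()                       # roughly: declare c cursor for ...
c.execute(                              # roughly: open c (the value is bound now)
    "select ID, name from student where tot_creds > :credit_amount order by ID",
    {"credit_amount": credit_amount},
)

results = []
while True:
    row = c.fetchone()                  # roughly: fetch c into :si, :sn
    if row is None:                     # no more data available
        break
    si, sn = row
    results.append((si, sn))

c.close()                               # roughly: close c
print(results)
```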
Updates Through Embedded SQL
Embedded SQL provides mechanisms to modify database relations using
update, insert, and delete
Tuples fetched through a cursor can be updated by declaring that the cursor is for update
EXEC SQL
declare c cursor for
select *
from instructor
where dept_name = 'Music'
for update;
Iterate through the tuples by performing fetch operations on the cursor, and
after fetching each tuple the following code can be executed:
update instructor
set salary = salary + 1000
where current of c;
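SQLite has no where current of clause, so the sketch below only approximates the fetch-then-update loop in Python's sqlite3 module: it first collects the keys of the selected tuples and then updates each one by its ID. The instructor rows are hypothetical, and identifying the current tuple by ID is an assumption that replaces the cursor position.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table instructor (ID text primary key, name text, dept_name text, salary numeric)")
conn.executemany("insert into instructor values (?, ?, ?, ?)", [
    ("15151", "Mozart", "Music", 40000),
    ("10101", "Srinivasan", "Comp. Sci.", 65000),
])

# Step 1: fetch the tuples the cursor would range over (only their keys are needed).
music_ids = [row[0] for row in conn.execute(
    "select ID from instructor where dept_name = 'Music'"
)]

# Step 2: for each fetched tuple, apply the update to exactly that tuple.
for instructor_id in music_ids:
    conn.execute(
        "update instructor set salary = salary + 1000 where ID = ?",
        (instructor_id,),
    )

salaries = dict(conn.execute("select name, salary from instructor"))
print(salaries)
```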