
Unit 3

Structured Query Language


Overview of SQL query language
• IBM Sequel language developed as part of System R project
at the IBM San Jose Research Laboratory
• Renamed Structured Query Language (SQL)
• ANSI and ISO standard SQL:
– SQL-86
– SQL-89
– SQL-92
– SQL:1999 (language name became Y2K compliant!)
– SQL:2003
• Commercial systems offer most, if not all, SQL-92 features,
plus varying feature sets from later standards and special
proprietary features.
– Not all examples here may work on your particular system.
SQL Language has several parts
• Data-definition language (DDL).
– The SQL DDL provides commands for defining relation schemas, deleting
relations, and modifying relation schemas.
• Data-manipulation language (DML).
– The SQL DML provides the ability to query information from the database and
to insert tuples into, delete tuples from, and modify tuples in the database.
• Integrity.
– The SQL DDL includes commands for specifying integrity constraints that the
data stored in the database must satisfy. Updates that violate integrity
constraints are disallowed.
• View definition.
– The SQL DDL includes commands for defining views.
• Transaction control.
– SQL includes commands for specifying the beginning and ending of
transactions.
• Embedded SQL and dynamic SQL.
– Embedded and dynamic SQL define how SQL statements can be embedded
within general-purpose programming languages, such as C, C++, and Java.
• Authorization.
– The SQL DDL includes commands for specifying access rights to relations and
views.
Data Definition Language
The SQL data-definition language (DDL) allows
the specification of information about relations,
including:
• The schema for each relation.

• The domain of values associated with each attribute.

• Integrity constraints

• And also other information such as


– The set of indices to be maintained for each relation.

– Security and authorization information for each relation.

– The physical storage structure of each relation on disk.


Standard Domain Types in SQL
• char(n).
– Fixed length character string, with user-specified length n.
• varchar(n).
– Variable length character strings, with user-specified maximum length
n.
• int.
– Integer (a finite subset of the integers that is machine-dependent).
• smallint.
– Small integer (a machine-dependent subset of the integer domain type).
• numeric(p,d).
– Fixed point number, with user-specified precision of p digits, with d
digits to the right of the decimal point. (For example, numeric(3,1) allows
44.5 to be stored exactly, but not 444.5 or 0.32.)
• real, double precision.
– Floating point and double-precision floating point numbers, with
machine-dependent precision.
• float(n).
– Floating point number, with user-specified precision of at least n digits.

SQL Plus supports char(n), varchar(n), number(n), numeric(n,d)


Basic Schema Definition
• An SQL relation is defined using the create table
command:
create table r (A1 D1, A2 D2, ..., An Dn,
(integrity-constraint1),
...,
(integrity-constraintk))
– r is the name of the relation
– each Ai is an attribute name in the schema of relation r
– Di is the data type of values in the domain of attribute Ai
• Example:
create table instructor (
ID char(5),
name varchar(20),
dept_name varchar(20),
salary numeric(8,2));
Integrity Constraints in Create Table
• not null
• primary key (A1, ..., An ) – not null and Unique by default
• foreign key (Am, ..., An ) references r

Example:
create table instructor (
ID char(5),
name varchar(20) not null,
dept_name varchar(20),
salary numeric(8,2),
primary key (ID),
foreign key (dept_name) references department);

primary key declaration on an attribute automatically ensures not null
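The effect of these constraints can be tried out with a small script. The following is a minimal sketch using Python's built-in sqlite3 module; the sample rows and the helper name `rejected` are illustrative assumptions, and SQLite only checks foreign keys once `PRAGMA foreign_keys = ON` has been issued, so other systems may behave differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only after this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    create table department (
        dept_name varchar(20),
        building  varchar(15),
        budget    numeric(12,2),
        primary key (dept_name));
    create table instructor (
        ID        char(5),
        name      varchar(20) not null,
        dept_name varchar(20),
        salary    numeric(8,2),
        primary key (ID),
        foreign key (dept_name) references department);
    insert into department values ('Biology', 'Watson', 90000);
    insert into instructor values ('10211', 'Smith', 'Biology', 66000);
""")

def rejected(statement):
    """Return True if the statement is refused as a constraint violation."""
    try:
        conn.execute(statement)
    except sqlite3.IntegrityError:
        return True
    return False

# not null: the name attribute may not be null.
null_name = rejected(
    "insert into instructor values ('10212', null, 'Biology', 50000)")
# primary key: ID 10211 already exists.
duplicate_id = rejected(
    "insert into instructor values ('10211', 'Jones', 'Biology', 50000)")
# foreign key: there is no Physics tuple in department.
unknown_dept = rejected(
    "insert into instructor values ('10213', 'Lee', 'Physics', 50000)")

print(null_name, duplicate_id, unknown_dept)
```

Each of the three inserts violates exactly one constraint, so it is easy to see which declaration rejects which update.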


Create the schema definition for University
• Department having attributes
– dept_name with domain type character of size 20,
– building with domain type character of length 15,
– budget as number with 12 digits in total, 2 of which are after the
decimal point.
– Set the dept_name attribute as primary key of the department
relation

• Student having attributes


– ID with domain type character of size 5,
– name with domain type character of length 20,
– dept_name with domain type character of size 20,
– total credits as number with 3 digits in total, 0 of which are after
the decimal point.
– Set the ID as primary key and dept_name attribute as foreign key
referencing department relation and name as not null
Relation Schema Definition for University
• create table department (
    dept_name varchar(20),
    building varchar(15),
    budget numeric(12,2),
    primary key (dept_name));
• create table student (
    ID varchar(5),
    name varchar(20) not null,
    dept_name varchar(20),
    tot_cred numeric(3,0),
    primary key (ID),
    foreign key (dept_name) references department);
Create the schema definition for University
• Course having attributes
– course_id with domain type character of size 8,
– title with domain type character of length 50,
– dept_name with domain type character of size 20,
– credits as number with 2 digits in total, 0 of which are after
the decimal point.
– Set the course_id as primary key and dept_name attribute as
foreign key referencing department relation
• create table course (
course_id varchar(8),
title varchar(50),
dept_name varchar(20),
credits numeric(2,0),
primary key (course_id),
foreign key (dept_name) references department);
Updates to tables
• Insert
– insert into instructor values ('10211', 'Smith', 'Biology', 66000);
• Delete
– Remove all tuples from the student relation
• delete from student;
• Drop Table
– drop table r;
• Alter
– alter table r add A D
• where A is the name of the attribute to be added to relation r
and D is the domain of A.
• All existing tuples in the relation are assigned null as the value
for the new attribute.
– alter table r drop A
• where A is the name of an attribute of relation r
• Dropping of attributes not supported by many databases.
The form of Basic Query Structure
– A typical SQL query has the form:
select A1, A2, ..., An
from r1, r2, ..., rm
where P
– Ai represents an attribute
– ri represents a relation
– P is a predicate (over attributes, constants, strings, etc.)
• The result of an SQL query is a relation.
The select Clause
• The select clause lists the attributes desired in the
result of a query
– corresponds to the projection operation of the relational
algebra
• Example: find the names of all instructors:
select name
from instructor
• NOTE: SQL names are case insensitive (i.e., you
may use upper- or lower-case letters.)
– E.g., Name ≡ NAME ≡ name
– Some people use upper case wherever we use bold font.
The select Clause (Cont.)
• SQL allows duplicates in relations as well as in query results.
• To force the elimination of duplicates, insert the keyword distinct
after select.
• Find the department names of all instructors, and remove duplicates
select distinct dept_name
from instructor
• The keyword all specifies that duplicates should not be removed.
select all dept_name
from instructor
The select Clause (Cont.)
• The select clause can contain arithmetic
expressions involving the operators +, –, *, and /
operating on constants or attributes of tuples.
– The query:
select ID, name, salary/12
from instructor
returns a relation with the ID and name of every
instructor, together with the value of the attribute
salary divided by 12.
– Can rename “salary/12” using the as clause:
select ID, name, salary/12 as monthly_salary
from instructor
The where Clause
• The where clause specifies conditions that the result must satisfy
– Corresponds to the selection predicate of the relational algebra.

• To find all instructors in Comp. Sci. dept
select name
from instructor
where dept_name = 'Comp. Sci.'
• Comparison results can be combined using the logical connectives
and, or, and not
– To find all instructors in Comp. Sci. dept with salary > 80000
select name
from instructor
where dept_name = 'Comp. Sci.' and salary > 80000
• Comparisons can be applied to results of arithmetic expressions.
The from Clause
• The from clause lists the relations involved in the query
– Corresponds to the Cartesian product operation of the relational
algebra.
• Find the Cartesian product instructor × teaches
select * from instructor, teaches
– generates every possible instructor – teaches pair, with all
attributes from both relations.
– For common attributes (e.g., ID), the attributes in the resulting
table are renamed using the relation name (e.g., instructor.ID)

• Cartesian product is not very useful directly, but is useful
combined with a where-clause condition (selection operation
in relational algebra).
[Figure: Cartesian product of the instructor and teaches relations]
Examples
• Find the names of all instructors who have taught
some course and the course_id
– select name, course_id
from instructor, teaches
where instructor.ID = teaches.ID
• Find the names of all instructors in the Art
department who have taught some course and the
course_id
– select name, course_id
from instructor, teaches
where instructor.ID = teaches.ID and
instructor.dept_name = 'Art'
Overview of SQL query
• In general, the meaning of an SQL query can
be understood as follows:
1. Generate a Cartesian product of the relations listed
in the from clause.
2. Apply the predicates specified in the where clause
on the result of Step 1.
3. For each tuple in the result of Step 2, output the
attributes (or results of expressions) specified in the
select clause.
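The three steps can be traced by hand on a tiny example. The sketch below uses plain Python lists as stand-ins for relations; the sample tuples are made up for illustration, and a real DBMS is free to evaluate the query in any way that produces the same result.

```python
from itertools import product

# Hypothetical sample data: instructor(ID, name, dept_name), teaches(ID, course_id).
instructor = [("10101", "Srinivasan", "Comp. Sci."),
              ("15151", "Mozart", "Music"),
              ("33456", "Gold", "Physics")]
teaches = [("10101", "CS-101"), ("10101", "CS-315"), ("15151", "MU-199")]

# Step 1: Cartesian product of the relations listed in the from clause.
pairs = list(product(instructor, teaches))

# Step 2: keep the pairs that satisfy the where predicate
# (instructor.ID = teaches.ID).
matching = [(i, t) for i, t in pairs if i[0] == t[0]]

# Step 3: output the attributes named in the select clause (name, course_id).
result = [(i[1], t[1]) for i, t in matching]
print(len(pairs), result)
```

Gold teaches nothing in this sample, so no pair involving Gold survives Step 2.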
Natural Join
• The natural join operation operates on two
relations and produces a relation as the result
– Unlike the Cartesian product of two relations, which
concatenates each tuple of the first relation with
every tuple of the second,
– natural join considers only those pairs of tuples
with the same value on those attributes that appear
in the schemas of both relations.
Natural Join
• The query “For all instructors in the university
who have taught some course, find their names
and the course ID of all courses they taught”
select name, course_id
from instructor, teaches
where instructor.ID = teaches.ID;
can be written more concisely as
select name, course_id
from instructor natural join teaches;
Natural Join
• A from clause in an SQL query can have multiple relations
combined using natural join, as shown here:
select A1, A2, . . . , An
from r1 natural join r2 natural join . . . natural
join rm
where P;
• More generally, a from clause can be of the form from E1,
E2, . . . , En
where each Ei can be a single relation or an expression
involving natural joins.
• For example, the query “List the names of instructors along
with the titles of courses that they teach” can be
written in SQL as follows:
select name, title
from instructor natural join teaches, course
where teaches.course_id = course.course_id;
Natural Join
• Consider again the query “List the names of instructors along
with the titles of courses that they teach.” The following query
is not equivalent to the one above:
select name, title
from instructor natural join teaches
natural join course
– The natural join of instructor natural join teaches with course
equates dept_name as well as course_id, so it omits every
(name, title) pair where the instructor teaches a course in a
department other than the instructor's own
• SQL provides a form of the natural join construct that
allows you to specify exactly which columns should be
equated
select name, title
from (instructor natural join teaches) join course using
(course_id);
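The difference between chaining natural joins and naming the join column explicitly shows up as soon as an instructor teaches a course outside their own department. The sketch below runs both forms in SQLite through Python's sqlite3 module; the simplified schemas and the sample rows are assumptions made for illustration, and the parentheses around the first join are dropped because joins associate left to right anyway.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table instructor (ID char(5), name varchar(20),
                             dept_name varchar(20), salary numeric(8,2));
    create table teaches (ID char(5), course_id varchar(8));
    create table course (course_id varchar(8), title varchar(50),
                         dept_name varchar(20), credits numeric(2,0));
    insert into instructor values ('10101', 'Srinivasan', 'Comp. Sci.', 65000);
    -- Srinivasan teaches one course in each of two departments.
    insert into teaches values ('10101', 'CS-101');
    insert into teaches values ('10101', 'BIO-101');
    insert into course values ('CS-101', 'Intro. to Computer Science', 'Comp. Sci.', 4);
    insert into course values ('BIO-101', 'Intro. to Biology', 'Biology', 4);
""")

# Chained natural joins equate course_id AND dept_name.
chained = conn.execute("""
    select name, title
    from instructor natural join teaches natural join course
    order by title""").fetchall()

# join ... using (course_id) equates only course_id.
using_course_id = conn.execute("""
    select name, title
    from instructor natural join teaches join course using (course_id)
    order by title""").fetchall()

print(chained)
print(using_course_id)
```

Only the using form reports the Biology course, because the chained form also requires the course's dept_name to equal the instructor's.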
Additional Basic Operations
The Rename Operation
• SQL allows renaming relations and attributes
using the as clause:
old-name as new-name
• For example, to replace the attribute name “name”
with the name instructor_name:
select name as instructor_name, course_id
from instructor, teaches
where instructor.ID = teaches.ID;
• Keyword as is optional and may be omitted
instructor as T ≡ instructor T
• An identifier, such as T, that is used to rename a relation is
referred to as a correlation name in the SQL standard,
but is also commonly referred to as a table alias, a
correlation variable, or a tuple variable.
The Rename Operation
• The query “For all instructors in the university
who have taught some course, find their names
and the course ID of all courses they taught.”
select T.name, S.course_id
from instructor as T, teaches as S
where T.ID = S.ID;
• Find the names of all instructors who have a
higher salary than some instructor in 'Comp. Sci.'
select distinct T.name
from instructor as T, instructor as S
where T.salary > S.salary and S.dept_name = 'Comp. Sci.'
String Operations
• SQL includes a string-matching operator for comparisons
on character strings. The operator like uses patterns that
are described using two special characters:
– percent ( % ). The % character matches any substring.
– underscore ( _ ). The _ character matches any single character.
• Find the names of all instructors whose name includes the
substring “dar”.
select name
from instructor
where name like '%dar%'
• Match the string “100%”
like '100\%' escape '\'
where the backslash (\) is declared as the escape character.
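To see what the escape clause changes, the same pattern can be run with and without it. This is a minimal sketch in Python with sqlite3; the table `promo` and its rows are hypothetical. Note that SQLite's like is case-insensitive for ASCII letters by default, which differs from the standard behaviour, so the patterns here are chosen so that case does not matter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table promo (label varchar(20))")  # hypothetical table
conn.executemany("insert into promo values (?)",
                 [("100%",), ("1000",), ("100 percent",), ("Sardar",)])

def labels(pattern_sql):
    """Return the labels matching the given SQL like pattern, sorted."""
    rows = conn.execute(
        f"select label from promo where label like {pattern_sql} order by label")
    return [row[0] for row in rows]

# % matches any substring, so '%dar%' finds labels containing "dar".
contains_dar = labels("'%dar%'")
# Without an escape character the trailing % is still a wildcard.
unescaped = labels("'100%'")
# With escape '\' the sequence \% stands for a literal percent sign.
escaped = labels(r"'100\%' escape '\'")

print(contains_dar, unescaped, escaped)
```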
String Operations (Cont.)
• Patterns are case sensitive.
• Pattern matching examples:
– 'Intro%' matches any string beginning with “Intro”.
– '%Comp%' matches any string containing “Comp” as a
substring.
– '___' matches any string of exactly three characters.
– '___%' matches any string of at least three characters.
• SQL supports a variety of string operations such as
– concatenation (using “||”)
– converting from upper to lower case (and vice versa)
– finding string length, extracting substrings, etc.
String Operations (Cont.)
• The query “Find the names of all departments
whose building name includes the substring
‘Watson’.”
select dept_name
from department
where building like '%Watson%';
• SQL allows us to search for mismatches instead of
matches by using the not like comparison operator.
Attribute Specification in Select Clause
• The asterisk symbol “ * ” can be used in the select
clause to denote “all attributes.”
• Thus, the use of instructor.* in the select clause
of the query:
select instructor.*
from instructor, teaches
where instructor.ID= teaches.ID;
• indicates that all attributes of instructor are to be
selected.
• A select clause of the form select * indicates that
all attributes of the result relation of the from
clause are selected.
Ordering the Display of Tuples
• List in alphabetic order the names of all instructors
select distinct name
from instructor
order by name
• We may specify desc for descending order or asc
for ascending order, for each attribute; ascending
order is the default.
– Example: order by name desc
• Can sort on multiple attributes
– Example: order by dept_name, name
Ordering the Display of Tuples
• To list in alphabetic order all instructors in the
Physics department
– select name
from instructor
where dept_name = 'Physics'
order by name;
• To list the entire instructor relation in
descending order of salary, with instructors who have
the same salary listed in alphabetical order of name
– select *
from instructor
order by salary desc, name asc;
Where Clause Predicates
• SQL includes a between comparison operator
• Example: Find the names of all instructors with
salary between $90,000 and $100,000 (that is,
≥$90,000 and ≤ $100,000)
– select name
from instructor
where salary between 90000 and 100000
• Similar to
– select name
from instructor
where salary <= 100000 and salary >= 90000;
Where Clause Predicates
• Tuple comparison
– select name, course_id
from instructor, teaches
where (instructor.ID, dept_name) = (teaches.ID, 'Biology');
• Similar to
– select name, course_id
from instructor, teaches
where instructor.ID = teaches.ID and dept_name = 'Biology';
Duplicates
• In relations with duplicates, SQL can define how
many copies of tuples appear in the result.
• Multiset versions of some of the relational algebra
operators, given multiset relations r1 and r2:
1. $\sigma_\theta(r_1)$: If there are c1 copies of tuple t1 in r1, and t1
satisfies selection $\sigma_\theta$, then there are c1 copies of t1 in
$\sigma_\theta(r_1)$.
2. $\Pi_A(r_1)$: For each copy of tuple t1 in r1, there is a copy of
tuple $\Pi_A(t_1)$ in $\Pi_A(r_1)$, where $\Pi_A(t_1)$ denotes the
projection of the single tuple t1.
3. $r_1 \times r_2$: If there are c1 copies of tuple t1 in r1 and c2
copies of tuple t2 in r2, there are c1 × c2 copies of the tuple
t1.t2 in $r_1 \times r_2$.
Duplicates (Cont.)
• Example: Suppose multiset relations r1 (A, B) and r2
(C) are as follows:
r1 = {(1, a), (2, a)} r2 = {(2), (3), (3)}
• Then $\Pi_B(r_1)$ would be {(a), (a)}, while $\Pi_B(r_1) \times r_2$
would be
{(a,2), (a,2), (a,3), (a,3), (a,3), (a,3)}
• SQL duplicate semantics:
select A1, A2, ..., An
from r1, r2, ..., rm
where P
is equivalent to the multiset version of the expression:
$\Pi_{A_1, A_2, \ldots, A_n}(\sigma_P(r_1 \times r_2 \times \cdots \times r_m))$
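The copy counts in the example can be checked mechanically. The sketch below models a multiset relation as a Python Counter that maps each tuple to its number of copies; the function names `project` and `cross` are illustrative and not part of SQL.

```python
from collections import Counter

# Multiset relations from the example: each tuple maps to its number of copies.
r1 = Counter({(1, "a"): 1, (2, "a"): 1})
r2 = Counter({(2,): 1, (3,): 2})

def project(relation, positions):
    """Multiset projection: every copy of a tuple yields one projected copy."""
    out = Counter()
    for t, copies in relation.items():
        out[tuple(t[p] for p in positions)] += copies
    return out

def cross(left, right):
    """Multiset Cartesian product: c1 * c2 copies of each concatenated tuple."""
    out = Counter()
    for t1, c1 in left.items():
        for t2, c2 in right.items():
            out[t1 + t2] += c1 * c2
    return out

proj_b = project(r1, [1])        # projection of r1 on attribute B
prod = cross(proj_b, r2)         # projection crossed with r2
print(dict(proj_b), dict(prod))
```

The two copies of (a) combine with one copy of (2) and two copies of (3), giving two copies of (a,2) and four copies of (a,3), as listed in the example.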
Set Operations
Union, Intersect and Except
[Figure: the section relation]
Set Operations
• Find courses that ran in Fall 2009 or in Spring 2010
(select course_id from section where semester = 'Fall' and year = 2009)
union
(select course_id from section where semester = 'Spring' and year = 2010)
• Find courses that ran in Fall 2009 and in Spring 2010
(select course_id from section where semester = 'Fall' and year = 2009)
intersect
(select course_id from section where semester = 'Spring' and year = 2010)
• Find courses that ran in Fall 2009 but not in Spring 2010
(select course_id from section where semester = 'Fall' and year = 2009)
except
(select course_id from section where semester = 'Spring' and year = 2010)
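The three set operations can be compared side by side on a small section relation. This is a sketch using Python's sqlite3 module; the sample rows are invented, the column is assumed to be called semester, and the parentheses around each select are omitted because SQLite does not accept them in compound queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""create table section (course_id varchar(8), sec_id varchar(8),
                                      semester varchar(6), year numeric(4,0))""")
conn.executemany("insert into section values (?, ?, ?, ?)", [
    ("CS-101", "1", "Fall", 2009),
    ("CS-101", "2", "Fall", 2009),     # second section of the same course
    ("CS-347", "1", "Fall", 2009),
    ("CS-101", "1", "Spring", 2010),
    ("FIN-201", "1", "Spring", 2010),
])

fall = "select course_id from section where semester = 'Fall' and year = 2009"
spring = "select course_id from section where semester = 'Spring' and year = 2010"

def run(operator):
    """Combine the two queries with the given set operator."""
    sql = f"{fall} {operator} {spring} order by course_id"
    return [row[0] for row in conn.execute(sql)]

either = run("union")          # Fall 2009 or Spring 2010
both = run("intersect")        # Fall 2009 and Spring 2010
fall_only = run("except")      # Fall 2009 but not Spring 2010
with_duplicates = run("union all")

print(either, both, fall_only, with_duplicates)
```

CS-101 appears only once in the union result even though three matching sections exist, while union all keeps every copy.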
[Figures: results of the union, intersect, and except queries]
Set Operations (Cont.)
• Find the salaries of all instructors that are less than the
largest salary.
• Query 1
select distinct T.salary
from instructor as T, instructor as S
where T.salary < S.salary
• Query 2: find all the salaries of all instructors
select distinct salary
from instructor
• Find the largest salary of all instructors: take Query 2
except Query 1
(Query 2)
except
(Query 1)
Set Operations (Cont.)
• Set operations union, intersect, and except
– Each of the above operations automatically
eliminates duplicates
• To retain all duplicates use the
corresponding multiset versions union all,
intersect all and except all.
• Suppose a tuple occurs m times in r and n
times in s, then, it occurs:
– m + n times in r union all s
– min(m,n) times in r intersect all s
– max(0, m – n) times in r except all s
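These three counting rules match the arithmetic of Python's Counter type, which makes them easy to check. In the sketch below the course IDs and their multiplicities are made-up sample values.

```python
from collections import Counter

# A tuple's count is the number of times it occurs in the relation.
r = Counter({"CS-101": 3, "CS-347": 1})   # CS-101 occurs m = 3 times in r
s = Counter({"CS-101": 2, "FIN-201": 1})  # CS-101 occurs n = 2 times in s

union_all = r + s        # m + n copies
intersect_all = r & s    # min(m, n) copies
except_all = r - s       # max(0, m - n) copies; zero counts are dropped

print(dict(union_all), dict(intersect_all), dict(except_all))
```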
Nested Queries
Nested Subqueries
• SQL provides a mechanism for the nesting of subqueries.
A subquery is a select-from-where expression that is
nested within another query.
• Nesting can be done in the following SQL query
select A1, A2, ..., An
from r1, r2, ..., rm
where P
as follows:
– Ai can be replaced by a subquery that generates a single value
– ri can be replaced by any valid subquery
– P can be replaced with an expression of the form:
B <operation> (subquery)
where B is an attribute and <operation> is in, not in, exists, etc.
• A common use of subqueries is to perform tests for
– set membership
– set comparisons
– set cardinality
• These tests are done by nesting subqueries in the where
clause.
Set Membership
• SQL allows testing tuples for membership in a
relation.
• The in connective tests for set membership, where
the set is a collection of values produced by a
select clause.
• The not in connective tests for the absence of set
membership.
Set Membership – using “in” connectives
• Find courses offered in Fall 2009 and in Spring 2010
• The subquery can be written as
(select course_id
from section
where semester = 'Spring' and year = 2010)
• The final query using the “in” connective
select distinct course_id
from section
where semester = 'Fall' and year = 2009 and
course_id in (select course_id
from section
where semester = 'Spring' and
year = 2010);
Set Membership – using “not in” connectives
• Find courses offered in Fall 2009 but not in Spring 2010
select distinct course_id
from section
where semester = 'Fall' and year = 2009 and
course_id not in (select course_id
from section
where semester = 'Spring'
and year = 2010);
Set Membership (Cont.)
• Find the total number of (distinct)
students who have taken course sections
taught by the instructor with ID 10101
select count (distinct ID)
from takes
where (course_id, sec_id, semester, year) in
(select course_id, sec_id,
semester, year
from teaches
where teaches.ID = '10101');
Set Comparison – “some” Clause
• Find names of instructors with salary greater than
that of some (at least one) instructor in the Biology
department.
select distinct T.name
from instructor as T, instructor as S
where T.salary > S.salary and S.dept_name = 'Biology';
• Alternative style for writing the preceding query: the phrase
“greater than at least one” is represented in SQL by > some.
• Same query using the > some clause
select name
from instructor
where salary > some (select salary
from instructor
where dept_name = 'Biology');
Definition of “some” Clause
• F <comp> some r $\Leftrightarrow \exists\, t \in r$ such that (F <comp> t)
where <comp> can be: $<, \leq, >, \geq, =, \neq$
– (5 < some $\{0, 5, 6\}$) = true
(read: 5 < some tuple in the relation)
– (5 < some $\{0, 5\}$) = false
– (5 = some $\{0, 5\}$) = true
– (5 $\neq$ some $\{0, 5\}$) = true (since $0 \neq 5$)
• (= some) $\equiv$ in
• However, ($\neq$ some) $\not\equiv$ not in
Set Comparison – “all” Clause
• Find the names of all instructors whose
salary is greater than the salary of all
instructors in the Biology department.
select name
from instructor
where salary > all (select salary
from instructor
where dept_name = 'Biology');
Definition of “all” Clause
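• F <comp> all r $\Leftrightarrow \forall\, t \in r$ (F <comp> t)
– (5 < all $\{0, 5, 6\}$) = false
– (5 < all $\{6, 10\}$) = true
– (5 = all $\{4, 5\}$) = false
– (5 $\neq$ all $\{4, 6\}$) = true (since $5 \neq 4$ and $5 \neq 6$)
• ($\neq$ all) $\equiv$ not in
• However, (= all) $\not\equiv$ in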
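The definitions of some and all correspond to the existential and universal quantifiers, which Python exposes as any() and all(). The sketch below is an illustration over plain lists of numbers; the helper names `some` and `all_` are assumptions, and null values are left out of the picture.

```python
import operator

def some(value, comp, relation):
    """F <comp> some r: true if the comparison holds for at least one tuple."""
    return any(comp(value, t) for t in relation)

def all_(value, comp, relation):
    """F <comp> all r: true if the comparison holds for every tuple."""
    return all(comp(value, t) for t in relation)

# some: one matching tuple is enough.
some_lt_056 = some(5, operator.lt, [0, 5, 6])   # 5 < 6
some_lt_05 = some(5, operator.lt, [0, 5])       # nothing is larger than 5
some_ne_05 = some(5, operator.ne, [0, 5])       # 0 differs from 5
not_in_05 = 5 not in [0, 5]                     # but 5 IS in the relation

# all: every tuple has to match.
all_lt_056 = all_(5, operator.lt, [0, 5, 6])
all_lt_610 = all_(5, operator.lt, [6, 10])
all_eq_45 = all_(5, operator.eq, [4, 5])
all_ne_46 = all_(5, operator.ne, [4, 6])

print(some_lt_056, some_lt_05, some_ne_05, not_in_05)
print(all_lt_056, all_lt_610, all_eq_45, all_ne_46)
```

The pair `some_ne_05` and `not_in_05` shows why "not equal to some" is not the same test as not in: the first is true and the second is false for the same relation.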
Test for Empty Relations

• The exists construct returns the value true if
the argument subquery is nonempty.
• exists r $\Leftrightarrow r \neq \emptyset$
• not exists r $\Leftrightarrow r = \emptyset$
– The not exists construct can be used to simulate set containment
– “Relation A contains relation B” can be written as “not
exists (B except A)”
Use of “exists” Clause
• Yet another way of specifying the query “Find all courses
taught in both the Fall 2009 semester and in the Spring 2010
semester”
select course_id
from section as S
where semester = 'Fall' and year = 2009 and
exists (select *
from section as T
where semester = 'Spring' and year = 2010
and S.course_id = T.course_id);
Use of “not exists” Clause
• Find all students who have taken all courses offered in
the Biology department.
select distinct S.ID, S.name
from student as S
where not exists ( (select course_id
from course
where dept_name = 'Biology')
except
(select T.course_id
from takes as T
where S.ID = T.ID));
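The query reads as "there is no Biology course that the student has not taken". A minimal sketch of it running in SQLite through Python's sqlite3 module follows; the schemas are cut down to the attributes the query needs, the rows are invented, and the inner parentheses around each select are removed because SQLite does not accept them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table student (ID varchar(5), name varchar(20));
    create table course (course_id varchar(8), dept_name varchar(20));
    create table takes (ID varchar(5), course_id varchar(8));
    insert into student values ('00128', 'Zhang');
    insert into student values ('12345', 'Shankar');
    insert into course values ('BIO-101', 'Biology');
    insert into course values ('BIO-301', 'Biology');
    insert into course values ('CS-101', 'Comp. Sci.');
    -- Zhang has taken both Biology courses, Shankar only one of them.
    insert into takes values ('00128', 'BIO-101');
    insert into takes values ('00128', 'BIO-301');
    insert into takes values ('00128', 'CS-101');
    insert into takes values ('12345', 'BIO-101');
""")

took_all_biology = conn.execute("""
    select distinct S.ID, S.name
    from student as S
    where not exists (select course_id
                      from course
                      where dept_name = 'Biology'
                      except
                      select T.course_id
                      from takes as T
                      where S.ID = T.ID)
""").fetchall()

print(took_all_biology)
```

For Shankar the except leaves BIO-301, so the subquery is nonempty and not exists is false.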
Test for Absence of Duplicate Tuples
• The unique construct tests whether a subquery has any
duplicate tuples in its result.
• The unique construct evaluates to “true” if a given
subquery contains no duplicates .
• Find all courses that were offered at most once in 2009
select T.course_id
from course as T
where unique (select R.course_id
from section as R
where T.course_id= R.course_id
and R.year = 2009);
Subqueries in the From Clause
• SQL allows a subquery expression to be used in the from
clause
• Find the average instructors' salaries of those departments
where the average salary is greater than $42,000.
select dept_name, avg_salary
from (select dept_name, avg (salary) as avg_salary
from instructor
group by dept_name)
where avg_salary > 42000;
• Note that we do not need to use the having clause
• Another way to write above query
select dept_name, avg_salary
from (select dept_name, avg (salary)
from instructor
group by dept_name) as dept_avg (dept_name, avg_salary)
where avg_salary > 42000;
With Clause
• The with clause provides a way of defining a
temporary relation whose definition is available
only to the query in which the with clause occurs.
• Find all departments with the maximum budget
with max_budget (value) as
(select max(budget)
from department)
select department.dept_name
from department, max_budget
where department.budget = max_budget.value;
Complex Queries using With Clause
• Find all departments where the total salary is greater
than the average of the total salary at all departments
with dept_total (dept_name, value) as
(select dept_name, sum(salary)
from instructor
group by dept_name),
dept_total_avg(value) as
(select avg(value)
from dept_total)
select dept_name
from dept_total, dept_total_avg
where dept_total.value > dept_total_avg.value;
Scalar Subquery
• Scalar subquery is one which is used where a single value is
expected
• List all departments along with the number of instructors in
each department
select dept_name,
(select count(*)
from instructor
where department.dept_name = instructor.dept_name)
as num_instructors
from department;
• Runtime error if subquery returns more than one result tuple
Modification of the Database
• Insertion of new tuples into a given relation
• Deletion of tuples from a given relation
• Updating of values in some tuples in a
given relation
Insertion
• Add a new tuple to course
insert into course
values ('CS-437', 'Database Systems', 'Comp. Sci.', 4);
• or equivalently
insert into course (course_id, title, dept_name, credits)
values ('CS-437', 'Database Systems', 'Comp. Sci.', 4);
• Add a new tuple to student with tot_cred set to null
insert into student
values ('3003', 'Green', 'Finance', null);
Insertion (Cont.)
• Add all instructors to the student relation with
tot_cred set to 0
insert into student
select ID, name, dept_name, 0
from instructor
• The select from where statement is evaluated fully
before any of its results are inserted into the relation.
Otherwise queries like
insert into table1 select * from table1
would cause problems
Deletion
• Delete all instructors
delete from instructor
• Delete all instructors from the Finance department
delete from instructor
where dept_name= ’Finance’;
• Delete all tuples in the instructor relation for those
instructors associated with a department located in
the Watson building.
delete from instructor
where dept_name in (select dept_name
from department where building = 'Watson');
Deletion (Cont.)
• Delete all instructors whose salary is less than
the average salary of instructors
delete from instructor
where salary < (select avg (salary)
from instructor);
Update
• Increase salaries of instructors whose salary is over
$100,000 by 3%, and all others by 5%
– Write two update statements:
update instructor
set salary = salary * 1.03
where salary > 100000;
update instructor
set salary = salary * 1.05
where salary <= 100000;
– The order is important: if the 5% raise ran first, an instructor
just under $100,000 could cross the threshold and then receive
the 3% raise as well
– Can be done better using the case statement (next
slide)
Case Statement for Conditional Update

• Same query as before but with case statement
update instructor
set salary = case
when salary <= 100000 then salary * 1.05
else salary * 1.03
end
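Why the order of the two updates matters, and why the case form avoids the issue, can be checked on two sample salaries. The sketch below uses Python's sqlite3 module and assumes that a raise of 3% or 5% means multiplying the salary by 1.03 or 1.05; the table is reduced to two columns and the rows are invented.

```python
import sqlite3

def fresh_instructor():
    """Create a new in-memory instructor table with two sample salaries."""
    conn = sqlite3.connect(":memory:")
    conn.execute("create table instructor (ID char(5), salary numeric(8,2))")
    conn.executemany("insert into instructor values (?, ?)",
                     [("A", 98000), ("B", 120000)])
    return conn

def salaries(conn):
    return dict(conn.execute("select ID, salary from instructor"))

raise_high = "update instructor set salary = salary * 1.03 where salary > 100000"
raise_low = "update instructor set salary = salary * 1.05 where salary <= 100000"

# Correct order: the 3% raise for high salaries runs first.
conn = fresh_instructor()
conn.execute(raise_high)
conn.execute(raise_low)
right_order = salaries(conn)

# Wrong order: A is raised to 102900 and then raised again by 3%.
conn = fresh_instructor()
conn.execute(raise_low)
conn.execute(raise_high)
wrong_order = salaries(conn)

# A single update with case looks at each original salary exactly once.
conn = fresh_instructor()
conn.execute("""update instructor
                set salary = case
                                 when salary <= 100000 then salary * 1.05
                                 else salary * 1.03
                             end""")
with_case = salaries(conn)

print(right_order, wrong_order, with_case)
```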
Updates with Scalar Subqueries
• Recompute and update the tot_cred value for all students
update student S
set tot_cred = (select sum(credits)
from takes, course
where takes.course_id = course.course_id and
S.ID = takes.ID and
takes.grade <> 'F' and
takes.grade is not null);
• This sets tot_cred to null for students who have not successfully
completed any course
• Those nulls could be set to 0 with a second update statement
• Alternatively, replace “select sum(credits)” in the subquery with:
select case
when sum(credits) is not null then sum(credits)
else 0
end
Aggregate Functions and Null values
Aggregate Functions
• These functions operate on the multiset of values of
a column of a relation, and return a value
– avg: average value
– min: minimum value
– max: maximum value
– sum: sum of values
– count: number of values
Aggregate Functions (Cont.)
• Find the average salary of instructors in the Computer
Science department
– select avg (salary)
from instructor
where dept_name = 'Comp. Sci.';
• Find the total number of instructors who teach a course in
the Spring 2010 semester
– select count (distinct ID)
from teaches
where semester = 'Spring' and year = 2010;
• Find the number of tuples in the course relation
– select count (*)
from course;
Aggregate Functions – Group By
• Find the average salary of instructors in each department
– select dept_name, avg (salary) as avg_salary
from instructor
group by dept_name;

Aggregation (Cont.)
• Attributes in select clause outside of aggregate
functions must appear in group by list
• Find the number of instructors in each department
who teach a course in the Spring 2010 semester
– select dept_name, count (distinct ID) as instr_count
from instructor natural join teaches
where semester = 'Spring' and year = 2010
group by dept_name;
Aggregate Functions – Having Clause
• Find the names and average salaries of all departments
whose average salary is greater than 42000
select dept_name, avg (salary)
from instructor
group by dept_name
having avg (salary) > 42000;
• Note: predicates in the having clause are applied after the
formation of groups whereas predicates in the where
clause are applied before forming groups
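The difference between filtering before and after grouping is visible when the same threshold idea is applied in each clause. This is a sketch in Python with sqlite3; the instructor rows are sample data chosen so that the two queries give different averages for one department.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""create table instructor (ID char(5), name varchar(20),
                                         dept_name varchar(20), salary numeric(8,2))""")
conn.executemany("insert into instructor values (?, ?, ?, ?)", [
    ("10101", "Srinivasan", "Comp. Sci.", 65000),
    ("45565", "Katz", "Comp. Sci.", 75000),
    ("15151", "Mozart", "Music", 40000),
    ("22222", "Einstein", "Physics", 95000),
    ("33456", "Gold", "Physics", 87000),
])

# having: groups are formed first, then whole groups are filtered.
having_result = conn.execute("""
    select dept_name, avg(salary)
    from instructor
    group by dept_name
    having avg(salary) > 42000
    order by dept_name""").fetchall()

# where: individual tuples are filtered first, then groups are formed.
where_result = conn.execute("""
    select dept_name, avg(salary)
    from instructor
    where salary > 70000
    group by dept_name
    order by dept_name""").fetchall()

print(having_result)
print(where_result)
```

In the where version Srinivasan's tuple is discarded before grouping, so the Comp. Sci. average is computed over Katz alone.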
Null Values
• It is possible for tuples to have a null value, denoted by
null, for some of their attributes
• null signifies an unknown value or that a value does
not exist.
• The result of any arithmetic expression involving null
is null
– Example: 5 + null returns null
• The predicate is null can be used to check for null
values.
– Example: Find all instructors whose salary is null.
select name
from instructor
where salary is null
Null Values and Three Valued Logic
• Three values – true, false, unknown
• Any comparison with null returns unknown
– Example: 5 < null or null <> null or null = null
• Three-valued logic using the value unknown:
– OR: (unknown or true) = true,
(unknown or false) = unknown
(unknown or unknown) = unknown
– AND: (true and unknown) = unknown,
(false and unknown) = false,
(unknown and unknown) = unknown
– NOT: (not unknown) = unknown
• If the where clause predicate evaluates to either
false or unknown for a tuple, that tuple is not added
to the result
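The truth tables above can be written down directly. The sketch below models unknown as Python's None; the function names are illustrative, and the final lines mimic how a where clause keeps only the tuples whose predicate is true.

```python
# unknown is represented by None; true and false by Python booleans.

def not3(a):
    return None if a is None else not a

def and3(a, b):
    if a is False or b is False:
        return False          # false dominates and
    if a is None or b is None:
        return None
    return True

def or3(a, b):
    if a is True or b is True:
        return True           # true dominates or
    if a is None or b is None:
        return None
    return False

def greater3(x, y):
    """Comparison that returns unknown when either operand is null."""
    return None if x is None or y is None else x > y

# A where clause keeps a tuple only if the predicate is true.
salaries = [90000, None, 50000]
kept = [s for s in salaries if greater3(s, 60000) is True]
kept_negated = [s for s in salaries if not3(greater3(s, 60000)) is True]

print(kept, kept_negated)
```

The tuple with the null salary appears in neither result, because both the predicate and its negation evaluate to unknown for it.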
Null Values and Aggregates
• Total all salaries
select sum (salary )
from instructor
– The above statement ignores null values in its input
• All aggregate operations except count(*) ignore
tuples with null values on the aggregated attribute
• If the input contains no non-null value, count returns 0
and all other aggregates return null
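A short sketch of how aggregates treat nulls, again using Python's sqlite3 module as an assumed test bed with hypothetical rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table instructor (name text, salary numeric)")
# Hypothetical rows; one salary is null.
conn.executemany(
    "insert into instructor values (?, ?)",
    [("Kim", 50000), ("Lee", None), ("Diaz", 70000)],
)

# sum and avg skip the null salary; count(*) counts every tuple,
# count(salary) counts only tuples with a non-null salary.
total, average, count_star, count_salary = conn.execute(
    "select sum(salary), avg(salary), count(*), count(salary) from instructor"
).fetchone()

# When every input value is null, sum is null and count is 0.
only_nulls = conn.execute(
    "select sum(salary), count(salary) from instructor where name = 'Lee'"
).fetchone()
```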
Joined Relations
• Join operations take two relations and return as a result
another relation.
• A join operation is a Cartesian product which requires
that tuples in the two relations match (under some
condition). It also specifies the attributes that are
present in the result of the join
• The join operations are typically used as subquery
expressions in the from clause
• Three types of joins:
– Natural join
– Inner join
– Outer join
Natural Join in SQL
• Natural join matches tuples with the same values
for all common attributes, and retains only one
copy of each common column.
• List the names of instructors along with the course
ID of the courses that they taught
– select name, course_id
from instructor, teaches
where instructor.ID = teaches.ID;
• Same query in SQL with “natural join” construct
– select name, course_id
from instructor natural join teaches;
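The two formulations above can be compared directly. This is a minimal sketch using Python's sqlite3 module (an assumption for runnability); the tables are cut down to the columns the query needs and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table instructor (ID text primary key, name text)")
conn.execute("create table teaches (ID text, course_id text)")
# Hypothetical sample rows, not taken from the slides.
conn.executemany(
    "insert into instructor values (?, ?)",
    [("10101", "Kim"), ("20202", "Lee"), ("30303", "Diaz")],
)
conn.executemany(
    "insert into teaches values (?, ?)",
    [("10101", "CS-101"), ("10101", "CS-315"), ("20202", "BIO-301")],
)

# Cartesian product plus an explicit equality on the common attribute.
explicit = conn.execute(
    "select name, course_id from instructor, teaches "
    "where instructor.ID = teaches.ID order by name, course_id"
).fetchall()

# The same query written with natural join.
natural = conn.execute(
    "select name, course_id from instructor natural join teaches "
    "order by name, course_id"
).fetchall()

# select * shows that the common column ID is retained only once.
star = conn.execute("select * from instructor natural join teaches")
natural_columns = [col[0] for col in star.description]
```

The instructor with no teaches tuple (Diaz here) appears in neither result, which is the information loss that outer joins address.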
Inner and Outer joins
• An outer join works in a manner similar to the
join operations above, but preserves those tuples
that would be lost in the join, by creating tuples
in the result containing null values.
• An inner join keeps only the matching tuples.
• SQL provides constructs for inner and
outer join
Inner and Outer joins Running Example
• Running examples used to illustrate inner/outer join
– Relation course

– Relation prereq

• Observe that:
– prereq information is missing for CS-315 and
– course information is missing for CS-437
Inner Join
• An inner join is a Cartesian Product that must
specify some conditions under which the two
relations are joined.
– The on condition allows a general predicate over
the relations being joined.
– The using clause specifies which attributes are
to be used to “join” the two relations.
• In SQL one can use either one of the following:
– join
– inner join
Inner Join -- On Condition
• course relation
• prereq relation
• course inner join prereq on
course.course_id = prereq.course_id
• What is the difference between the above and a natural join?
• course inner join prereq on true
– same as Cartesian product
Inner Join -- Using Condition
• The using clause specifies which attributes are to be
used to “join” the two relations
• course inner join prereq using (course_id )
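One visible difference between the on and using forms is the set of result columns. The sketch below counts them with Python's sqlite3 module (an assumed test bed); the course and prereq tables are reduced to two columns each and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table course (course_id text, title text)")
conn.execute("create table prereq (course_id text, prereq_id text)")
# Hypothetical rows chosen to match the running example's missing entries.
conn.executemany(
    "insert into course values (?, ?)",
    [("BIO-301", "Genetics"), ("CS-190", "Game Design"), ("CS-315", "Robotics")],
)
conn.executemany(
    "insert into prereq values (?, ?)",
    [("BIO-301", "BIO-101"), ("CS-190", "CS-101"), ("CS-437", "CS-101")],
)


def columns(query):
    """Return the result column names of a query."""
    return [col[0] for col in conn.execute(query).description]


# on: both copies of course_id are kept in the result.
on_columns = columns(
    "select * from course inner join prereq "
    "on course.course_id = prereq.course_id"
)
# using and natural join: one copy of course_id is kept.
using_columns = columns("select * from course inner join prereq using (course_id)")
natural_columns = columns("select * from course natural join prereq")

# An always-true on condition degenerates to the Cartesian product.
cartesian_rows = conn.execute(
    "select count(*) from course inner join prereq on 1 = 1"
).fetchone()[0]
```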
Using Condition (Cont.)
• List the names of instructors along with the titles of
courses that they teach
– Version 1 – without the use of “using”
select name, title
from instructor natural join teaches, course
where teaches.course_id = course.course_id;
– Version 2 – with the use of “using”
select name, title
from (instructor natural join teaches)
join course using (course_id);
Outer Join
• An extension of the join operation that avoids loss of
information.
• Computes the join and then adds tuples from one
relation that do not match tuples in the other relation
to the result of the join.
• Uses null values.
• Specifies:
– left or right or full
• Can be used with the clauses:
– natural
– on
– using
• If none of the clauses is used it is equivalent to Cartesian
Product
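A left outer join can be sketched as follows with Python's sqlite3 module (an assumed test bed). The rows are hypothetical but follow the running example: no prereq tuple for CS-315 and no course tuple for CS-437. Because right and full outer joins are only available in newer SQLite releases, the right outer join is obtained here by swapping the operands of a left outer join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table course (course_id text, title text)")
conn.execute("create table prereq (course_id text, prereq_id text)")
# Hypothetical rows, not the figures from the original slides.
conn.executemany(
    "insert into course values (?, ?)",
    [("BIO-301", "Genetics"), ("CS-190", "Game Design"), ("CS-315", "Robotics")],
)
conn.executemany(
    "insert into prereq values (?, ?)",
    [("BIO-301", "BIO-101"), ("CS-190", "CS-101"), ("CS-437", "CS-101")],
)

# Plain natural join: unmatched tuples on both sides are lost.
inner = conn.execute(
    "select course_id, title, prereq_id "
    "from course natural join prereq order by course_id"
).fetchall()

# Left outer join: every course tuple survives, padded with null.
left = conn.execute(
    "select course_id, title, prereq_id "
    "from course natural left outer join prereq order by course_id"
).fetchall()

# Swapping the operands preserves every prereq tuple instead,
# which is what course natural right outer join prereq computes.
right = conn.execute(
    "select course_id, title, prereq_id "
    "from prereq natural left outer join course order by course_id"
).fetchall()
```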
Left Outer Join
• course relation
• prereq relation
• course natural left outer join prereq
Right Outer Join
• course relation
• prereq relation
• course natural right outer join prereq
Full Outer Join
• course relation
• prereq relation
• course natural full outer join prereq
Left Outer Join with on
• course relation
• prereq relation
• course left outer join prereq on
course.course_id = prereq.course_id
Outer Join -- Using Condition
• The course relation
• The prereq relation
• course full outer join prereq using (course_id)
• There is no difference between the above and natural
full outer join
Integrity Constraints
• Integrity constraints guard against accidental damage to
the database, by ensuring that authorized changes to the
database do not result in a loss of data consistency.
– A checking account must have a balance greater than
$10,000.00
– A salary of a bank employee must be at least $4.00 an
hour
– A customer must have a (non-null) phone number
Integrity Constraints (Review)
• Constraint describes conditions that every legal
instance of a relation must satisfy.
– Inserts/deletes/updates that violate ICs are disallowed.
– Can be used to :
• ensure application semantics (e.g., sid is a key), or
• prevent inconsistencies (e.g., sname has to be a string, age must
be < 200)

• Types of IC’s:
– Fundamental: Domain constraints, primary key
constraints, foreign key constraints
– General constraints : Check Constraints and Assertions.
Integrity Constraints on a Single Relation
• Four different constraints
– not null
– primary key
– unique
– check (P)
• where P is a predicate
Not Null and Unique Constraints
• not null
– Declare name and budget to be not null
name varchar(20) not null
budget numeric(12,2) not null
• unique ( A1, A2, …, Am)
– The unique specification states that the
attributes
A1, A2, … Am
form a candidate key.
– Candidate keys are permitted to be null
(in contrast to primary keys).
The check clause
• check (P), where P is a predicate
• Example: ensure that semester value is one of
the following {Fall, Winter, Spring, Summer}:
create table section (
course_id varchar (8),
sec_id varchar (8),
semester varchar (6),
year numeric (4,0),
building varchar (15),
room_number varchar (7),
time_slot_id varchar (4),
primary key (course_id, sec_id, semester, year),
check (semester in ('Fall', 'Winter', 'Spring',
'Summer')));
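The single-relation constraints can be exercised with a small sketch in Python's sqlite3 module (an assumed test bed). The section table is shortened to the key columns, and department_demo with its phone column is a hypothetical table invented for the not null and unique cases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table section (
        course_id varchar(8),
        sec_id varchar(8),
        semester varchar(6),
        year numeric(4,0),
        primary key (course_id, sec_id, semester, year),
        check (semester in ('Fall', 'Winter', 'Spring', 'Summer')))
""")
# Hypothetical table used only to show not null and unique.
conn.execute("""
    create table department_demo (
        name varchar(20) not null,
        budget numeric(12,2) not null,
        phone varchar(12),
        unique (phone))
""")


def accepted(statement, params):
    """Return True if the insert succeeds, False if a constraint rejects it."""
    try:
        conn.execute(statement, params)
        return True
    except sqlite3.IntegrityError:
        return False


SECTION = "insert into section values (?, ?, ?, ?)"
DEPT = "insert into department_demo values (?, ?, ?)"

good_semester = accepted(SECTION, ("CS-101", "1", "Fall", 2010))
bad_semester = accepted(SECTION, ("CS-101", "2", "Autumn", 2010))   # check fails
missing_budget = accepted(DEPT, ("Physics", None, "555-0100"))      # not null fails
first_phone = accepted(DEPT, ("Physics", 70000, "555-0100"))
duplicate_phone = accepted(DEPT, ("History", 50000, "555-0100"))    # unique fails
first_null = accepted(DEPT, ("Biology", 90000, None))
second_null = accepted(DEPT, ("Music", 80000, None))                # nulls may repeat
```

SQLite accepts several null values in a unique column, in line with the slide's remark that candidate keys may be null; some other systems behave differently on this point.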
Referential Integrity
• Ensures that a value that appears in one relation for a
given set of attributes also appears for a certain set of
attributes in another relation.
– Example: If “Perryridge” is a branch name appearing in
one of the tuples in the account relation, then there exists
a tuple in the branch relation for branch “Perryridge”.
• Formal Definition
– Let r1(R1) and r2(R2) be relations with primary keys K1 and
K2 respectively.
– The subset $\alpha$ of R2 is a foreign key referencing K1 in
relation r1, if for every t2 in r2 there must be a tuple t1 in r1
such that $t_1[K_1] = t_2[\alpha]$.
– A referential integrity constraint is also called a subset
dependency since it can be written as
$\Pi_{\alpha}(r_2) \subseteq \Pi_{K_1}(r_1)$
Checking Referential Integrity on
Database Modification
• The following tests must be made in order to preserve
the following referential integrity constraint:
$\Pi_{\alpha}(r_2) \subseteq \Pi_{K}(r_1)$
• Insert. If a tuple t2 is inserted into r2, the system must
ensure that there is a tuple t1 in r1 such that
$t_1[K] = t_2[\alpha]$. That is
$t_2[\alpha] \in \Pi_{K}(r_1)$
• Delete. If a tuple t1 is deleted from r1, the system
must compute the set of tuples in r2 that reference t1:
$\sigma_{\alpha = t_1[K]}(r_2)$
If this set is not empty
– either the delete command is rejected as an error, or
– the tuples that reference t1 must themselves be deleted
(cascading deletions are possible).
Database Modification (Cont.)
• Update. There are two cases:
– If a tuple t2 is updated in relation r2 and the update modifies values
for foreign key $\alpha$, then a test similar to the insert case is made:
• Let t2' denote the new value of tuple t2. The system must ensure
that
$t_2'[\alpha] \in \Pi_{K}(r_1)$
– If a tuple t1 is updated in r1, and the update modifies values for the
primary key (K), then a test similar to the delete case is made:
1. The system must compute
$\sigma_{\alpha = t_1[K]}(r_2)$
using the old value of t1 (the value before the update is applied).
2. If this set is not empty
1. the update may be rejected as an error, or
2. the update may be cascaded to the tuples in the set, or
3. the tuples in the set may be deleted.
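The insert, delete and update tests can be written out as plain functions. This is a minimal sketch in Python, assuming relations are represented as lists of dictionaries; the function names and the sample branch and account tuples are hypothetical.

```python
def project(relation, attrs):
    """Projection: the set of value tuples of relation on attrs."""
    return {tuple(t[a] for a in attrs) for t in relation}


def insert_allowed(t2, r1, alpha, K):
    """Insert test: t2[alpha] must belong to the projection of r1 on K."""
    return tuple(t2[a] for a in alpha) in project(r1, K)


def referencing(t1, r2, alpha, K):
    """Delete/update test: select the tuples of r2 whose alpha equals t1[K]."""
    key = tuple(t1[k] for k in K)
    return [t for t in r2 if tuple(t[a] for a in alpha) == key]


# Hypothetical instance: account.branch_name references branch.branch_name.
branch = [{"branch_name": "Perryridge"}, {"branch_name": "Downtown"}]
account = [{"account_number": "A-102", "branch_name": "Perryridge"}]
alpha, K = ["branch_name"], ["branch_name"]

ok_insert = insert_allowed(
    {"account_number": "A-201", "branch_name": "Downtown"}, branch, alpha, K
)
bad_insert = insert_allowed(
    {"account_number": "A-305", "branch_name": "Nowhere"}, branch, alpha, K
)

# Deleting Perryridge must be rejected or cascaded: one account references it.
blocking = referencing(branch[0], account, alpha, K)
# Deleting Downtown is safe: no account references it.
free = referencing(branch[1], account, alpha, K)
```

An update of alpha in r2 reuses insert_allowed on the new tuple, and an update of K in r1 reuses referencing on the old tuple, matching the two update cases above.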
Referential Integrity in SQL
• Primary and candidate keys and foreign keys can be specified as part of
the SQL create table statement:
– The primary key clause lists attributes that comprise the primary
key.
– The unique key clause lists attributes that comprise a candidate
key.
– The foreign key clause lists the attributes that comprise the foreign
key and the name of the relation referenced by the foreign key.
• By default, a foreign key references the primary key attributes of the
referenced table
foreign key (account-number) references account
• Short form for specifying a single column as foreign key
account-number char (10) references account
• Reference columns in the referenced table can be explicitly specified
– but must be declared as primary/candidate keys
foreign key (account-number) references account(account-number)
Referential Integrity in SQL – Example
create table customer
(customer-name char(20),
customer-street char(30),
customer-city char(30),
primary key (customer-name))
create table branch
(branch-name char(15),
branch-city char(30),
assets integer,
primary key (branch-name))
Referential Integrity in SQL – Example
(Cont.)
create table account
(account-number char(10),
branch-name char(15),
balance integer,
primary key (account-number),
foreign key (branch-name) references branch)
create table depositor
(customer-name char(20),
account-number char(10),
primary key (customer-name, account-number),
foreign key (account-number) references account,
foreign key (customer-name) references customer)
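The foreign key declarations above can be tried in Python's sqlite3 module (an assumed test bed). Two adaptations are made for the sketch: hyphens in identifiers are replaced by underscores so the statements parse, and SQLite needs foreign key enforcement switched on explicitly. The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite-specific: foreign keys are not enforced unless this is set.
conn.execute("pragma foreign_keys = on")
conn.execute("""
    create table branch (
        branch_name char(15),
        branch_city char(30),
        assets integer,
        primary key (branch_name))
""")
conn.execute("""
    create table account (
        account_number char(10),
        branch_name char(15),
        balance integer,
        primary key (account_number),
        foreign key (branch_name) references branch)
""")
conn.execute("insert into branch values ('Perryridge', 'Horseneck', 1700000)")


def attempt(statement):
    """Return True if the statement succeeds, False if a constraint rejects it."""
    try:
        conn.execute(statement)
        return True
    except sqlite3.IntegrityError:
        return False


valid_insert = attempt("insert into account values ('A-102', 'Perryridge', 400)")
# No branch named Nowhere exists, so the insert test fails.
dangling_insert = attempt("insert into account values ('A-999', 'Nowhere', 100)")
# The branch is still referenced by A-102, so the delete is rejected.
blocked_delete = attempt("delete from branch where branch_name = 'Perryridge'")
# Once the referencing account is gone the same delete succeeds.
conn.execute("delete from account where account_number = 'A-102'")
allowed_delete = attempt("delete from branch where branch_name = 'Perryridge'")
```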
Cascading Actions in SQL
create table account
...
foreign key(branch-name) references branch
on delete cascade
on update cascade
...)
• Due to the on delete cascade clauses, if a delete of a
tuple in branch results in referential-integrity
constraint violation, the delete “cascades” to the
account relation, deleting the tuple that refers to the
branch that was deleted.
• Cascading updates are similar.

Cascading Actions in SQL (Cont.)
• If there is a chain of foreign-key dependencies across
multiple relations, with on delete cascade specified for each
dependency, a deletion or update at one end of the chain
can propagate across the entire chain.
• If a cascading update or delete causes a constraint violation
that cannot be handled by a further cascading operation,
the system aborts the transaction.
– As a result, all the changes caused by the transaction and its
cascading actions are undone.
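• Referential integrity checking can be deferred to the end of a
transaction (in standard SQL by declaring the constraint
initially deferred; otherwise it is checked after each statement)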
– Intermediate steps are allowed to violate referential integrity
provided later steps remove the violation
– Otherwise it would be impossible to create some database
states, e.g. insert two tuples whose foreign keys point to each
other
• E.g. spouse attribute of relation
marriedperson(name, address, spouse)
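The marriedperson case can be sketched with a deferred foreign key. The example uses Python's sqlite3 module (an assumed test bed) and declares the constraint deferrable initially deferred so that it is checked at commit; the names and addresses are hypothetical.

```python
import sqlite3

# isolation_level=None gives explicit control over begin/commit/rollback.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("pragma foreign_keys = on")  # SQLite-specific switch
conn.execute("""
    create table marriedperson (
        name text primary key,
        address text,
        spouse text references marriedperson (name)
            deferrable initially deferred)
""")


def run_transaction(rows):
    """Insert rows in one transaction; return True if the commit succeeds."""
    conn.execute("begin")
    try:
        conn.executemany("insert into marriedperson values (?, ?, ?)", rows)
        conn.execute("commit")  # deferred foreign keys are checked here
        return True
    except sqlite3.IntegrityError:
        conn.execute("rollback")  # undo every change of the transaction
        return False


# The first tuple violates the constraint until the second one arrives.
mutual = run_transaction([
    ("Ann", "1 Main St", "Bob"),
    ("Bob", "1 Main St", "Ann"),
])
# This violation is never repaired, so the whole transaction is undone.
dangling = run_transaction([("Cy", "2 Oak St", "Nobody")])

stored = conn.execute("select count(*) from marriedperson").fetchone()[0]
```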
Referential Integrity in SQL (Cont.)
• Alternative to cascading:
– on delete set null
– on delete set default
• Null values in foreign key attributes
complicate SQL referential integrity semantics,
and are best prevented using not null
– if any attribute of a foreign key is null, the tuple is
defined to satisfy the foreign key constraint!
Referential Integrity
• Ensures that a value that appears in one relation for a given
set of attributes also appears for a certain set of attributes
in another relation.
– Example: If “Biology” is a department name appearing
in one of the tuples in the instructor relation, then there
exists a tuple in the department relation for “Biology”.
• Let A be a set of attributes. Let R and S be two relations
that contain attributes A and let A be the primary key of S.
A is said to be a foreign key of R if for any values of A
appearing in R these values also appear in S.
– In the above example, R is the instructor table, S is the
department table, and A is dept_name
Referential Integrity Example
• create table course (
course_id char(5),
title varchar(20),
dept_name varchar(20),
primary key (course_id),
foreign key (dept_name) references department);
Cascading Actions in Referential Integrity
• create table course (

dept_name varchar(20),
foreign key (dept_name) references department
on delete cascade
on update cascade,
. . . );
• delete cascade -- if a department (say Biology) is
deleted from the department relation, then all tuples in
the course relation that refer to Biology are deleted.
• update cascade -- if a department (say Biology) is
changed to (say Life-Science) in the department
relation, then all tuples in the course relation that refer
to Biology are updated to refer to Life-Science.
• alternative actions to cascade: set null, set default
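The cascade and set null actions can be observed with a small sketch in Python's sqlite3 module (an assumed test bed). The tables are reduced to the columns involved, the instructor table with on delete set null is added only to show the alternative action, and all rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("pragma foreign_keys = on")  # SQLite-specific switch
conn.execute("create table department (dept_name varchar(20) primary key)")
conn.execute("""
    create table course (
        course_id varchar(8) primary key,
        dept_name varchar(20),
        foreign key (dept_name) references department
            on delete cascade
            on update cascade)
""")
conn.execute("""
    create table instructor (
        name varchar(20) primary key,
        dept_name varchar(20),
        foreign key (dept_name) references department
            on delete set null)
""")
conn.executemany(
    "insert into department values (?)", [("Biology",), ("Physics",)]
)
conn.executemany(
    "insert into course values (?, ?)",
    [("BIO-101", "Biology"), ("PHY-101", "Physics")],
)
conn.execute("insert into instructor values ('Kim', 'Physics')")

# update cascade: renaming the department rewrites the referencing courses.
conn.execute(
    "update department set dept_name = 'Life-Science' "
    "where dept_name = 'Biology'"
)
# delete cascade removes the Physics course; set null keeps the
# instructor but clears the reference.
conn.execute("delete from department where dept_name = 'Physics'")

courses = conn.execute(
    "select course_id, dept_name from course order by course_id"
).fetchall()
instructors = conn.execute("select name, dept_name from instructor").fetchall()
```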
Domain Constraints
• Integrity constraints guard against accidental damage to
the database, by ensuring that authorized changes to the
database do not result in a loss of data consistency.
• Domain constraints are the most elementary form of
integrity constraint.
• They test values inserted in the database, and test queries
to ensure that the comparisons make sense.
• New domains can be created from existing data types
– E.g. create domain Dollars numeric(12, 2)
create domain Pounds numeric(12,2)
• We cannot assign or compare a value of type Dollars to a
value of type Pounds.
– However, we can convert type as below
cast (r.A as Pounds)
(Should also multiply by the dollar-to-pound conversion-rate)
Domain Constraints (Cont.)
• The check clause in SQL-92 permits domains to be restricted:
– Use check clause to ensure that an hourly-wage domain
allows only values greater than a specified value.
create domain hourly-wage numeric(5,2)
constraint value-test check (value >= 4.00)
– The domain has a constraint that ensures that the hourly-wage
is at least 4.00
– The clause constraint value-test is optional; useful to
indicate which constraint an update violated.
• Can have complex conditions in domain check
– create domain AccountType char(10)
constraint account-type-test
check (value in ('Checking', 'Saving'))
– check (branch-name in (select branch-name from branch))
Assertions
• An assertion is a predicate expressing a condition that
we wish the database always to satisfy.
• An assertion in SQL takes the form
create assertion <assertion-name> check <predicate>
• When an assertion is made, the system tests it for
validity, and tests it again on every update that may
violate the assertion
– This testing may introduce a significant amount of
overhead; hence assertions should be used with great care.
• Asserting
for all X, P(X)
is achieved in a round-about fashion using
not exists X such that not P(X)
Assertion Example
• The sum of all loan amounts for each branch must
be less than the sum of all account balances at the
branch.
create assertion sum-constraint check
(not exists (select * from branch
where (select sum(amount) from loan
where loan.branch-name = branch.branch-name)
>= (select sum(balance) from account
where account.branch-name = branch.branch-name)))
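Few systems implement create assertion, but the predicate of an assertion can still be evaluated as an ordinary query. The sketch below does this with Python's sqlite3 module (an assumed test bed), comparing the loan total against sum(balance) over account for each branch. Identifiers use underscores so the statement parses, and the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table branch (branch_name text primary key)")
conn.execute("create table loan (loan_number text, branch_name text, amount integer)")
conn.execute("create table account (account_number text, branch_name text, balance integer)")
conn.execute("insert into branch values ('Perryridge')")
conn.execute("insert into loan values ('L-15', 'Perryridge', 1000)")
conn.execute("insert into account values ('A-102', 'Perryridge', 5000)")

# "for all branches, loans < balances" written as
# "there is no branch with loans >= balances".
PREDICATE = """
    select not exists (
        select * from branch
        where (select sum(amount) from loan
               where loan.branch_name = branch.branch_name)
           >= (select sum(balance) from account
               where account.branch_name = branch.branch_name))
"""


def assertion_holds():
    """Evaluate the assertion predicate against the current database state."""
    return bool(conn.execute(PREDICATE).fetchone()[0])


before = assertion_holds()
# A further loan pushes the loan total past the account total.
conn.execute("insert into loan values ('L-16', 'Perryridge', 6000)")
after = assertion_holds()
```

A system that supported the assertion would run this test on every update that could violate it, which is the overhead the slides warn about.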
Assertion Example
• Every loan has at least one borrower who maintains an
account with a minimum balance of $1000.00
create assertion balance-constraint check
(not exists (
select * from loan
where not exists (
select *
from borrower, depositor, account
where loan.loan-number = borrower.loan-number
and borrower.customer-name = depositor.customer-name
and depositor.account-number = account.account-number
and account.balance >= 1000)))
Embedded SQL
• The SQL standard defines embeddings of SQL in a variety of
programming languages such as C, C++, Java, Fortran, and
PL/1.
• A language in which SQL queries are embedded is referred
to as a host language, and the SQL structures permitted in
the host language comprise embedded SQL.
• The basic form of these languages follows that of the
System R embedding of SQL into PL/1.
• EXEC SQL statement is used to identify embedded SQL
request to the preprocessor
EXEC SQL <embedded SQL statement >;
Note: this varies by language:
– In some languages, like COBOL, the semicolon is replaced with
END-EXEC
– In Java, the embedding uses #sql { …. };
Embedded SQL (Cont.)
• Before executing any SQL statements, the program must
first connect to the database. This is done using:
EXEC SQL connect to server user user-name using
password;
Here, server identifies the server to which a connection is
to be established.
• Variables of the host language can be used within
embedded SQL statements. They are preceded by a colon
(:) to distinguish from SQL variables (e.g., :credit_amount )
• Variables used as above must be declared within DECLARE
section, as illustrated below. The syntax for declaring the
variables, however, follows the usual host language syntax.
EXEC SQL BEGIN DECLARE SECTION;
int credit_amount;
EXEC SQL END DECLARE SECTION;
Embedded SQL (Cont.)
• To write an embedded SQL query, we use the
statement:
declare c cursor for <SQL query>
The variable c is used to identify the query
• Example:
– From within a host language, find the ID and name
of students who have completed more than the
number of credits stored in variable credit_amount
in the host language
– Specify the query in SQL as follows:
EXEC SQL
declare c cursor for
select ID, name
from student
where tot_cred > :credit_amount;
Embedded SQL (Cont.)
• To execute an embedded SQL query we use the
open statement, which causes the database system to
execute the query and to save the results within a
temporary relation
• The open statement for our example is as follows:
EXEC SQL open c ;
The query uses the value of the host-language
variable credit_amount at the time the open
statement is executed.
• The fetch statement causes the values of one tuple
in the query result (i.e., ID and name of a student)
to be placed in host language variables -- :si, :sn
EXEC SQL fetch c into :si, :sn;
Repeated calls to fetch get successive tuples in the
query result
Repeated calls to fetch get successive tuples in the
query result
Embedded SQL (Cont.)
• A variable called SQLSTATE in the SQL
communication area (SQLCA) is set to the value
‘02000’ when there is no more data available
to fetch.
• The close statement causes the database
system to delete the temporary relation that
holds the result of the query.
EXEC SQL close c ;
Note: above details vary with language. For
example, the Java embedding defines
Java iterators to step through result tuples.
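The open, fetch and close cycle has a close analogue in call-level database interfaces. The sketch below uses Python's sqlite3 module, which is not embedded SQL and has no preprocessor; it is offered only as an analogy, with a named parameter standing in for the host variable and hypothetical student rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table student (ID text, name text, tot_cred integer)")
# Hypothetical sample rows.
conn.executemany(
    "insert into student values (?, ?, ?)",
    [("001", "Ana", 120), ("002", "Raj", 45), ("003", "Mei", 90)],
)

credit_amount = 80  # plays the role of the host variable :credit_amount

# "open": the query runs with the value credit_amount has at this moment.
c = conn.execute(
    "select ID, name from student "
    "where tot_cred > :credit_amount order by ID",
    {"credit_amount": credit_amount},
)

fetched = []
while True:
    row = c.fetchone()   # "fetch": one tuple of the result per call
    if row is None:      # no more data, comparable to SQLSTATE '02000'
        break
    si, sn = row         # host-language variables receiving the values
    fetched.append((si, sn))

c.close()                # "close": release the result of the query
```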
Updates Through Embedded SQL
• Embedded SQL provides mechanisms to modify the database relations using
update, insert, and delete
• Can update tuples fetched by declaring that the cursor is for update
EXEC SQL
declare c cursor for
select *
from instructor
where dept_name = 'Music'
for update;
• Iterate through the tuples by performing fetch operations on the cursor, and
after fetching each tuple the following code can be executed:
update instructor
set salary = salary + 1000
where current of c;
