MORE SQL

Textbook: 4.3, 5.1, 5.3

Overview
More complex SQL:
Dates
Nested queries
Ordering results
Aggregation and grouping
Set operations
Insert, delete, and update
SQL views

How to solve simple SQL problems


1. Identify the columns that the question is specifically asking for and place them in the SELECT
2. Identify the table the columns in the SELECT are from and include it in the FROM
3. Identify any other columns in the question and include the tables they are from in the FROM
4. If there is more than one table in the FROM then join them in the WHERE by equating foreign keys with primary keys and using AND
5. Include any other conditions in WHERE

Example
Find employee names working in department 'Research'
1. Identify columns to return and place in SELECT:
SELECT name

2. Identify the table the columns in the SELECT
are from and include in the FROM:
SELECT name
FROM employee

3. Identify any other columns in the question and
include the tables they are from in the FROM:
SELECT name
FROM employee, department

4. If there is more than one table in the FROM then
join them in the WHERE by equating foreign keys
with primary keys and using AND:
SELECT name
FROM employee, department
WHERE dno = dnumber

5. Include any other conditions in WHERE:
SELECT name
FROM employee, department
WHERE dno = dnumber
AND dname = 'Research'
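
The five steps can be checked end to end with a small sketch. It runs the final query through Python's built-in sqlite3 module rather than MySQL; the sample rows and the reduced column set are invented for illustration, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# In-memory database with a hypothetical, cut-down version of the schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dnumber INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE employee (ssn INTEGER PRIMARY KEY, name TEXT,
                           dno INTEGER REFERENCES department(dnumber));
    INSERT INTO department VALUES (4, 'Administration'), (5, 'Research');
    INSERT INTO employee VALUES (1, 'Alice', 5), (2, 'Bob', 4), (3, 'Carol', 5);
""")

# Steps 1-5 combined: SELECT column, FROM both tables,
# join foreign key to primary key, then the extra condition.
rows = conn.execute("""
    SELECT name
    FROM employee, department
    WHERE dno = dnumber
      AND dname = 'Research'
    ORDER BY name
""").fetchall()

research_names = [name for (name,) in rows]
print(research_names)
```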

Dates
Getting the current date and/or time in MySQL:
NOW(), SYSDATE() Returns current DATETIME
CURDATE() Returns current DATE

Converting DATETIME to DATE:


DATE() Returns DATE portion of DATETIME
e.g.:
DATE(SYSDATE()) is equivalent to CURDATE()

Date arithmetic
Addition and subtraction:
Adding/subtracting from a DATETIME changes seconds
Adding/subtracting from a DATE changes days

Example:
SELECT SYSDATE(), SYSDATE() + 1, CURDATE(), CURDATE() + 1;

Date arithmetic problem


Addition and subtraction don't understand dates:
SELECT DATE('2016-08-01') - 1;
The result is the number 20160800, which is not a valid date

Better to use DATE_ADD()/DATE_SUB() functions
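
Why plain arithmetic goes wrong can be sketched outside the database. In arithmetic context MySQL treats a DATE as a number of the form YYYYMMDD; the Python sketch below imitates that for subtracting one day from 2016-08-01 and contrasts it with calendar-aware arithmetic, which is what DATE_SUB() provides. The variable names are illustrative only.

```python
from datetime import date, timedelta

d = date(2016, 8, 1)

# Numeric-context arithmetic: the date is handled as the number 20160801.
as_number = int(d.strftime("%Y%m%d"))
naive_result = as_number - 1  # no longer corresponds to a real calendar day

# Calendar-aware arithmetic, analogous to DATE_SUB(d, INTERVAL 1 DAY).
calendar_result = d - timedelta(days=1)

print(naive_result, calendar_result.isoformat())
```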

DATE_ADD(), DATE_SUB()
DATE_ADD() and DATE_SUB() properly perform date/time arithmetic
The interval type must be specified:

MICROSECOND
SECOND
MINUTE
HOUR
DAY
WEEK
MONTH
QUARTER
YEAR
SECOND_MICROSECOND
MINUTE_MICROSECOND
MINUTE_SECOND
HOUR_MICROSECOND
HOUR_SECOND
HOUR_MINUTE
DAY_MICROSECOND
DAY_SECOND
DAY_MINUTE
DAY_HOUR
YEAR_MONTH

DATE_ADD(), DATE_SUB()
DATE_ADD() syntax:
DATE_ADD(date, INTERVAL expr type)

Examples:
DATE_ADD(SYSDATE(), INTERVAL 5 MINUTE)
DATE_SUB(CURDATE(), INTERVAL 3 DAY)
DATE_ADD(CURDATE(), INTERVAL 5 MONTH)
DATE_SUB(SYSDATE(), INTERVAL 5 HOUR)

Nested queries
Select the name and address of each employee who works for a department called Research

Using a join:
SELECT E.name, E.address
FROM employee E, department D
WHERE D.dname = 'Research' AND
      E.dno = D.dnumber;

Using a nested query:
SELECT E.name, E.address
FROM employee E
WHERE E.dno IN
      (SELECT D.dnumber
       FROM department D
       WHERE D.dname = 'Research');
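
As a quick check that the join form and the nested form return the same rows, here is a minimal sketch using Python's sqlite3 module. The sample data is made up, and only the columns the queries need are declared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dnumber INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE employee (ssn INTEGER PRIMARY KEY, name TEXT,
                           address TEXT, dno INTEGER);
    INSERT INTO department VALUES (4, 'Administration'), (5, 'Research');
    INSERT INTO employee VALUES
        (1, 'Alice', '1 Elm St', 5),
        (2, 'Bob',   '2 Oak St', 4),
        (3, 'Carol', '3 Ash St', 5);
""")

# Join formulation.
join_rows = conn.execute("""
    SELECT E.name, E.address
    FROM employee E, department D
    WHERE D.dname = 'Research' AND E.dno = D.dnumber
""").fetchall()

# Nested formulation with IN.
nested_rows = conn.execute("""
    SELECT E.name, E.address
    FROM employee E
    WHERE E.dno IN (SELECT D.dnumber
                    FROM department D
                    WHERE D.dname = 'Research')
""").fetchall()

print(sorted(join_rows) == sorted(nested_rows))
```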

Testing for empty set


List the name and ssn of employees who have a dependent with the same name as the employee

SELECT E.name, E.ssn
FROM employee E
WHERE EXISTS
      (SELECT * FROM dependent D
       WHERE E.ssn = D.essn AND
             E.name = D.dependent_name);
Note that for each employee E the nested query returns the set of dependents with the desired property; EXISTS is true when that set is not empty

Testing for empty set (2)


List the names of those employees who have no dependent
SELECT E.name
FROM employee E
WHERE NOT EXISTS
      (SELECT * FROM dependent D
       WHERE E.ssn = D.essn);
Here, the nested query must be evaluated by the DBMS as many times as there are employees.
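
Both emptiness tests can be tried on a few rows. The sketch below uses Python's sqlite3 module with invented employees and dependents: one employee has a dependent sharing her name, one has only other dependents, and one has none.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (ssn INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dependent (essn INTEGER, dependent_name TEXT);
    INSERT INTO employee VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol');
    INSERT INTO dependent VALUES (1, 'Alice'), (1, 'Dan'), (2, 'Eve');
""")

# EXISTS: true when the correlated subquery returns at least one row.
same_name = conn.execute("""
    SELECT E.name FROM employee E
    WHERE EXISTS (SELECT * FROM dependent D
                  WHERE E.ssn = D.essn AND E.name = D.dependent_name)
""").fetchall()

# NOT EXISTS: true when the correlated subquery returns no rows.
no_dependents = conn.execute("""
    SELECT E.name FROM employee E
    WHERE NOT EXISTS (SELECT * FROM dependent D WHERE E.ssn = D.essn)
""").fetchall()

print(same_name, no_dependents)
```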

Use of ANY/SOME and ALL


ANY and ALL may be used with subqueries that produce a single column of numbers.
With ALL, the condition will only be true if it is satisfied by all values produced by the subquery.
With ANY, the condition will be true if it is satisfied by any value produced by the subquery.
If the subquery is empty, ALL returns true, ANY returns false.

Example
SELECT *
FROM employee E
WHERE E.salary >= ALL (SELECT F.salary
                       FROM employee F
                       WHERE F.dno = 5);

SELECT *
FROM employee E
WHERE E.salary >= ANY (SELECT F.salary
                       FROM employee F
                       WHERE F.dno = 5);
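
The truth conditions of ALL and ANY, including the empty-subquery case, can be modelled in a few lines of Python. This is only a sketch of the semantics, not how a DBMS implements them: the helper names ge_all and ge_any are hypothetical, the salary list stands in for the subquery result, and NULL handling is ignored.

```python
def ge_all(value, subquery_values):
    """Model of: value >= ALL (subquery). Vacuously true for an empty result."""
    return all(value >= v for v in subquery_values)


def ge_any(value, subquery_values):
    """Model of: value >= ANY (subquery). False for an empty result."""
    return any(value >= v for v in subquery_values)


# Stand-in for SELECT F.salary FROM employee F WHERE F.dno = 5
dept5_salaries = [30000, 40000, 38000]

print(ge_all(40000, dept5_salaries))  # at least as high as every salary
print(ge_any(35000, dept5_salaries))  # at least as high as some salary
print(ge_all(0, []), ge_any(0, []))   # empty subquery
```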

How are nested queries evaluated?


The outer query is evaluated
For each tuple that the outer query produces, the WHERE clause is evaluated
If the WHERE clause contains a test that involves a nested query, then the nested query is evaluated to be able to perform the test
If the nested query's result is independent of the tuple variables of the outer query, then the nested query needs to be evaluated only once
If the nested query's result depends on at least one of the tuple variables of the outer query, then the nested query must be re-evaluated for each tested tuple

Set operations
Union (also Intersect and Minus; MySQL has no MINUS, and it supports INTERSECT and EXCEPT only from version 8.0.31)
NB Tables must be union compatible!
Duplicates are eliminated in set operations (unless ALL is specified, as in UNION ALL)
E.g.,

SELECT W.pno
FROM works_on W, employee E
WHERE W.essn = E.ssn and E.name = 'John Smith'
UNION
SELECT P.pnumber
FROM department D, employee E, project P
WHERE P.dnum = D.dnumber and
D.mgrssn = E.ssn and E.name = 'John Smith'
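
Duplicate elimination can be seen by running the query on sample data. The sketch below uses Python's sqlite3 module; the rows are invented so that project 2 is both worked on by John Smith and controlled by the department he manages. UNION ALL appears only for contrast, since it keeps duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (ssn INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE department (dnumber INTEGER PRIMARY KEY, mgrssn INTEGER);
    CREATE TABLE project (pnumber INTEGER PRIMARY KEY, dnum INTEGER);
    CREATE TABLE works_on (essn INTEGER, pno INTEGER);
    INSERT INTO employee VALUES (1, 'John Smith'), (2, 'Jane Doe');
    INSERT INTO department VALUES (4, 2), (5, 1);
    INSERT INTO project VALUES (1, 4), (2, 5), (3, 5);
    INSERT INTO works_on VALUES (1, 1), (1, 2), (2, 3);
""")

# The two SELECTs are union compatible: one numeric column each.
query = """
    SELECT W.pno
    FROM works_on W, employee E
    WHERE W.essn = E.ssn AND E.name = 'John Smith'
    {op}
    SELECT P.pnumber
    FROM department D, employee E, project P
    WHERE P.dnum = D.dnumber AND
          D.mgrssn = E.ssn AND E.name = 'John Smith'
"""

union_rows = conn.execute(query.format(op="UNION")).fetchall()
union_all_rows = conn.execute(query.format(op="UNION ALL")).fetchall()

print(sorted(union_rows), len(union_all_rows))
```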

Ordering results
Retrieve a list of employees and the projects they are working on

SELECT D.dname, E.name, P.pname
FROM department D, employee E, works_on W,
     project P
WHERE D.dnumber=E.dno AND E.ssn=W.essn
      AND W.pno=P.pnumber;

Ordering results
Retrieve a list of employees and the projects they are working on, ordered by department and, within each department, ordered alphabetically by the employee name, then project name.
SELECT D.dname, E.name, P.pname
FROM department D, employee E, works_on W,
project P
WHERE D.dnumber=E.dno AND E.ssn=W.essn
AND W.pno=P.pnumber
ORDER BY D.dname, E.name, P.pname;

Ordering results
Retrieve a list of employees and the projects they are working on, ordered by department in descending order and, within each department, ordered alphabetically by the employee name, then project name.
SELECT D.dname, E.name, P.pname
FROM department D, employee E, works_on W,
     project P
WHERE D.dnumber=E.dno AND E.ssn=W.essn
      AND W.pno=P.pnumber
ORDER BY D.dname DESC, E.name ASC,
         P.pname ASC;
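
The effect of mixing DESC and ASC keys can be checked on a few rows. A minimal sketch with Python's sqlite3 module and invented data follows: departments come out in reverse alphabetical order, while employees and projects within each department are ascending.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dnumber INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE employee (ssn INTEGER PRIMARY KEY, name TEXT, dno INTEGER);
    CREATE TABLE project (pnumber INTEGER PRIMARY KEY, pname TEXT);
    CREATE TABLE works_on (essn INTEGER, pno INTEGER);
    INSERT INTO department VALUES (1, 'Administration'), (2, 'Research');
    INSERT INTO employee VALUES (10, 'Bob', 2), (11, 'Alice', 2), (12, 'Carl', 1);
    INSERT INTO project VALUES (100, 'Alpha'), (101, 'Beta');
    INSERT INTO works_on VALUES (10, 101), (10, 100), (11, 100), (12, 101);
""")

ordered_rows = conn.execute("""
    SELECT D.dname, E.name, P.pname
    FROM department D, employee E, works_on W, project P
    WHERE D.dnumber = E.dno AND E.ssn = W.essn
          AND W.pno = P.pnumber
    ORDER BY D.dname DESC, E.name ASC, P.pname ASC
""").fetchall()

for row in ordered_rows:
    print(row)
```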

SELECT expression
SELECT expression can involve column names, constants, functions and operators.
E.g.,
SELECT name, salary / 1000
FROM employee
WHERE salary > 50000;
Arithmetic operators for numbers: +, -, *, /

Aggregate functions
List the sum, average, minimum and maximum of salary values of employees
SELECT SUM(E.salary), AVG(E.salary),
       MIN(E.salary), MAX(E.salary)
FROM employee E;
Note that the result contains only one tuple! E.g.,

SUM(sal)   AVG(sal)   MIN(sal)   MAX(sal)
272,500    68,125     47,500     83,000

The following query is wrong, because E.ssn is neither aggregated nor listed in a GROUP BY clause:

SELECT E.ssn, SUM(E.salary), AVG(E.salary),
       MIN(E.salary), MAX(E.salary)
FROM employee E;

Evaluate aggregate functions

Count
SELECT COUNT(E.salary) FROM employee E;
It does not count rows that have a NULL value for salary
Count rows that have distinct values

SELECT COUNT(DISTINCT E.salary)
FROM employee E;
COUNT(*)

SELECT COUNT(*)
FROM employee
WHERE salary>70000;
It counts rows even if they contain NULL values
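
The three COUNT variants differ only in how they treat NULLs and duplicates, which a small sketch makes visible. It uses Python's sqlite3 module and four invented rows: two equal salaries, one different salary, and one NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (ssn INTEGER PRIMARY KEY, salary INTEGER);
    INSERT INTO employee VALUES (1, 80000), (2, 80000), (3, NULL), (4, 60000);
""")

# COUNT(col) skips NULLs, COUNT(DISTINCT col) also merges duplicates,
# COUNT(*) counts every row.
non_null, distinct, all_rows = conn.execute("""
    SELECT COUNT(E.salary), COUNT(DISTINCT E.salary), COUNT(*)
    FROM employee E
""").fetchone()

(high_paid,) = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE salary > 70000"
).fetchone()

print(non_null, distinct, all_rows, high_paid)
```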

Example of aggregate functions


List the names and salaries of those employees whose salaries are above average.


SELECT E.name, E.salary
FROM employee E
WHERE E.salary >
(SELECT AVG(F.salary)
FROM employee F);

Aggregate of groups
List the department number and the average salary for each department
SELECT D.dnumber, AVG(E.salary)
FROM department D, employee E
WHERE E.dno = D.dnumber
GROUP BY D.dnumber;
Only GROUP BY and aggregate attributes can be present in the SELECT clause.
The WHERE condition cannot refer to aggregate attributes, since the aggregation has not been carried out yet
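
Both points can be tried out: grouping yields one row per department, and an aggregate inside WHERE is refused. The sketch uses Python's sqlite3 module with invented data; the ORDER BY is added only for a stable output order, and the exact error raised for the misplaced aggregate is SQLite's, so other systems may word it differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dnumber INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE employee (ssn INTEGER PRIMARY KEY, salary INTEGER, dno INTEGER);
    INSERT INTO department VALUES (4, 'Administration'), (5, 'Research');
    INSERT INTO employee VALUES
        (1, 40000, 4), (2, 60000, 4), (3, 80000, 5), (4, 90000, 5);
""")

avg_rows = conn.execute("""
    SELECT D.dnumber, AVG(E.salary)
    FROM department D, employee E
    WHERE E.dno = D.dnumber
    GROUP BY D.dnumber
    ORDER BY D.dnumber
""").fetchall()

# An aggregate in WHERE is rejected: WHERE runs before groups exist.
try:
    conn.execute("""
        SELECT E.dno FROM employee E
        WHERE AVG(E.salary) > 50000
        GROUP BY E.dno
    """)
    where_rejected = False
except sqlite3.Error:
    where_rejected = True

print(avg_rows, where_rejected)
```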

Example
For each project, retrieve the project number, the project name, and the number of employees working on the project.
SELECT P.pnumber, P.pname, COUNT(*)
FROM project P, works_on W
WHERE P.pnumber = W.pno
GROUP BY P.pnumber, P.pname;

Conditions on grouped aggregate


List the average salary of those departments where the average salary is greater than 70,000

SELECT D.dnumber, AVG(E.salary)
FROM department D, employee E
WHERE E.dno = D.dnumber
GROUP BY D.dnumber
HAVING AVG(E.salary) > 70000;
The WHERE clause eliminates tuples from the cross product (of tables in the FROM clause) before the GROUP BY clause is applied
The HAVING clause eliminates groups produced after the GROUP BY clause is applied

Example
List department names and the number of employees who earn >30,000 in each department, provided that the department has > 3 employees
SELECT D.dname, COUNT(*)
FROM department D, employee E
WHERE D.dnumber = E.dno AND E.salary > 30000
GROUP BY D.dname HAVING COUNT(*) > 3;
COUNT(*) counts only the employees in the department who earn >30,000, so the HAVING clause eliminates every department in which the number of employees earning >30,000 is not >3, regardless of how many employees the department has in total. Thus this query is WRONG

Example
List department names and the number of employees who earn >30,000 in each department, provided that the department has > 3 employees
SELECT D.dname, count(*)
FROM department D, employee E
WHERE D.dnumber = E.dno AND E.salary >30000
AND E.dno IN (SELECT EM.dno
FROM employee EM
GROUP BY EM.dno
HAVING count(*) > 3)
GROUP BY D.dname;
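
The difference between the two formulations shows up as soon as a department has more than 3 employees but only a few high earners. A minimal sketch with Python's sqlite3 module follows; the data is invented so that Research has four employees, two of whom earn more than 30,000.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (dnumber INTEGER PRIMARY KEY, dname TEXT);
    CREATE TABLE employee (ssn INTEGER PRIMARY KEY, salary INTEGER, dno INTEGER);
    INSERT INTO department VALUES (4, 'Administration'), (5, 'Research');
    INSERT INTO employee VALUES
        (1, 40000, 5), (2, 35000, 5), (3, 25000, 5), (4, 20000, 5),
        (5, 50000, 4), (6, 50000, 4);
""")

# HAVING sees only the rows that survived WHERE, so it tests the wrong count.
wrong_rows = conn.execute("""
    SELECT D.dname, COUNT(*)
    FROM department D, employee E
    WHERE D.dnumber = E.dno AND E.salary > 30000
    GROUP BY D.dname HAVING COUNT(*) > 3
""").fetchall()

# The nested query tests department size on the unfiltered employee table.
correct_rows = conn.execute("""
    SELECT D.dname, COUNT(*)
    FROM department D, employee E
    WHERE D.dnumber = E.dno AND E.salary > 30000
      AND E.dno IN (SELECT EM.dno
                    FROM employee EM
                    GROUP BY EM.dno
                    HAVING COUNT(*) > 3)
    GROUP BY D.dname
""").fetchall()

print(wrong_rows, correct_rows)
```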

Example: equivalent?
SELECT W.pno, COUNT(*)
FROM works_on W
WHERE W.hours >= 10
GROUP BY W.pno HAVING W.pno=2;

SELECT W.pno, COUNT(*)
FROM works_on W
WHERE W.hours >= 10 AND W.pno=2
GROUP BY W.pno;

SELECT W.pno, COUNT(*)
FROM works_on W
WHERE W.hours >= 10 AND W.pno=2;

Example
List employees (by ssn and name) who work in at least two distinct projects.


SELECT E.ssn, E.name
FROM employee E, works_on W
WHERE E.ssn = W.essn
GROUP BY E.ssn, E.name
HAVING COUNT(DISTINCT W.pno)>=2;

Example: using a nested query


List employees (by ssn and name) who work in at least two distinct projects.
SELECT E.ssn, E.name
FROM employee E
WHERE E.ssn IN
(SELECT W.essn
FROM works_on W
GROUP BY W.essn
HAVING COUNT(DISTINCT W.pno)>=2);

Summary of SQL query syntax


SELECT [DISTINCT | ALL]
  {* | exp1 [[AS] newName1], ..., expK [[AS] newNameK]}
FROM table1 [alias1], ..., tableN [aliasN]
[WHERE condition]
[GROUP BY columnList]
[HAVING condition]
[ORDER BY columnList]

Summary of SQL query evaluation


1. Create the cross product of the tables in the FROM clause
2. Evaluate the WHERE clause
3. On the remaining relation create groups according to the GROUP BY clause
4. Evaluate aggregates
5. Test whether the groups in the result satisfy the HAVING clause
6. Eliminate duplicates if prescribed by DISTINCT, and print the result according to the SELECT clause
7. Order the result according to the ORDER BY clause

Insert a tuple
Example: Insert a new employee into the Employee table:

INSERT INTO employee VALUES
  ('John Smith', 12345678, NULL, NULL, NULL, NULL);
The values must be listed in the same order as the attributes are defined in the schema

INSERT INTO employee (name, ssn) VALUES
  ('John Smith', 12345678);
Attributes not supplied are set to NULL

Deletion
Delete all employees of department number 5

DELETE FROM employee
WHERE dno = 5;
Delete the employee with name 'J.S' and ssn 1234567

DELETE FROM employee
WHERE name = 'J.S' AND ssn = 1234567;

Update
Give a salary raise to all employees

UPDATE employee
SET salary = salary*1.1;
Give a salary raise to employees in department 5, and transfer them to department 6

UPDATE employee
SET salary = salary*1.1, dno = 6
WHERE dno = 5;
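
In a SET clause, several assignments are separated by commas. Writing AND between them is still accepted by the parser, but it turns everything after the first = into a single Boolean expression that is assigned to salary. The sketch below compares the two on one invented row, using Python's sqlite3 module; fresh_db is a hypothetical helper that only builds the sample table.

```python
import sqlite3


def fresh_db():
    """Build a tiny in-memory employee table (hypothetical sample data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE employee (ssn INTEGER PRIMARY KEY, salary REAL, dno INTEGER);
        INSERT INTO employee VALUES (1, 50000, 5), (2, 60000, 4);
    """)
    return conn


# Comma: two separate assignments.
right = fresh_db()
right.execute("UPDATE employee SET salary = salary * 1.1, dno = 6 WHERE dno = 5")
right_row = right.execute(
    "SELECT salary, dno FROM employee WHERE ssn = 1").fetchone()

# AND: salary receives the truth value of (salary * 1.1) AND (dno = 6).
wrong = fresh_db()
wrong.execute("UPDATE employee SET salary = salary * 1.1 AND dno = 6 WHERE dno = 5")
wrong_row = wrong.execute(
    "SELECT salary, dno FROM employee WHERE ssn = 1").fetchone()

print(right_row, wrong_row)
```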

Update vs. Delete + Insert


It is possible to simulate the effects of an update statement with a deletion (delete the old tuples) followed by an insertion (add tuples with the new values)
The two solutions are not entirely equivalent, because the database state between the delete and the insert may be visible to others and/or may be inconsistent. In contrast, the update is always done in one transaction.

Insert tuples using queries


Suppose we have a schema:

Department_sal (dno, salary_average, date)

We can insert into this table the result of a query:

INSERT INTO Department_sal
SELECT E.dno, AVG(E.salary), '2005-02-02'
FROM Employee E
GROUP BY E.dno;

Create a table using queries

CREATE TABLE Employee_dep5 AS
SELECT * FROM Employee E WHERE E.dno=5;

Alias
Useful in creating new tables from another table and in nested queries
CREATE TABLE employee_name AS
SELECT E.ssn AS emp_ssn, E.name AS emp_name
FROM employee E;
SELECT *
FROM (SELECT E.dno, AVG(E.salary) AS dep_sal
FROM employee E
GROUP BY E.dno) av -- Derived table needs alias
WHERE dep_sal >=30000;
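
The derived-table query can be run as is on sample data. This sketch uses Python's sqlite3 module with invented salaries; the column alias dep_sal is what lets the outer WHERE refer to the aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (ssn INTEGER PRIMARY KEY, salary INTEGER, dno INTEGER);
    INSERT INTO employee VALUES
        (1, 20000, 4), (2, 30000, 4), (3, 40000, 5), (4, 50000, 5);
""")

# The inner SELECT is a derived table named av; dep_sal names its aggregate.
derived_rows = conn.execute("""
    SELECT *
    FROM (SELECT E.dno, AVG(E.salary) AS dep_sal
          FROM employee E
          GROUP BY E.dno) av
    WHERE dep_sal >= 30000
""").fetchall()

print(derived_rows)
```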

SQL Views
An SQL view is a form of external schema
An SQL view does not describe a real table, it is a virtual table, or derived relation
For the user of the database (at least for querying purposes) a view is just like a table. (Views can be used to define external schemas for applications)
The data in a view may or may not explicitly exist in the database (e.g. aggregate values)

Define and delete a view


CREATE VIEW
dep_sal_view (dno, salary_average) AS
SELECT E.dno, AVG(E.salary)
FROM employee E
GROUP BY E.dno;
Note: A view definition may refer to other, already defined views
Delete a view definition

DROP VIEW dep_sal_view;

Execute a query on a view


SELECT * FROM dep_sal_view;
The DBMS translates this query, based on the view definition, into an equivalent query that only mentions base tables
SELECT E.dno, AVG(E.salary)
FROM employee E
GROUP BY E.dno;
If the base tables change, the results of queries on views change automatically
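
That a view is re-evaluated against the current base tables can be observed directly. The sketch uses Python's sqlite3 module with invented rows. The view's column names are given with AS aliases in the SELECT, a choice made here for portability rather than anything the slides prescribe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (ssn INTEGER PRIMARY KEY, salary INTEGER, dno INTEGER);
    INSERT INTO employee VALUES (1, 40000, 5), (2, 60000, 5);

    CREATE VIEW dep_sal_view AS
        SELECT E.dno AS dno, AVG(E.salary) AS salary_average
        FROM employee E
        GROUP BY E.dno;
""")

before = conn.execute("SELECT * FROM dep_sal_view").fetchall()

# Change the base table; the view itself is not touched.
conn.execute("INSERT INTO employee VALUES (3, 80000, 5)")

after = conn.execute("SELECT * FROM dep_sal_view").fetchall()

print(before, after)
```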

Update views
Normally difficult, only special cases are possible (meaningful). For example, let us define a view:

CREATE VIEW works_on_view (emp, proj, hrs) AS
SELECT E.name, P.pname, W.hours
FROM employee E, project P, works_on W
WHERE E.ssn = W.essn AND P.pnumber = W.pno;
Suppose that at this moment John Smith works on Project X. Try to execute:

UPDATE works_on_view W
SET W.proj = 'Project Y'
WHERE W.emp = 'John Smith';

The view update problem


What could be the intended meaning of this?
UPDATE works_on_view W
SET W.proj = 'Project Y'
WHERE W.emp = 'John Smith';
Potential meaning #1: We want John Smith to work on Project Y instead of Project X.
Potential meaning #2: We want the name of the project Project X to be changed to Project Y
We have to update the underlying base relations,
#1: Works_on
#2: Project

Ambiguity: we cannot decide what the intended meaning was

The view update problem (2)


Similar ambiguity arises if the view contains aggregate attributes (e.g. average salary).
For example: What does it mean that we increase the average salary by 10,000? Give an across-the-board raise or give a raise to some employees?
Answer: it is not decidable.

Solution 1 to the view updates


We don't allow it, unless
1. The view is defined on a single base table
2. The view does not contain aggregate attributes
Theoretically, the class of permissible (unambiguous) updates on views is somewhat larger: it includes every case where it is possible to prove that the update can only affect at most one base relation tuple in each base relation involved.

Solution 2 to view updates


If we want to allow view updates where there is ambiguity, then we must define what the intended operation on the base tables is
It is possible to do this in SQL, using triggers
CREATE TRIGGER works_on_view_update_trigger
INSTEAD OF UPDATE ON works_on_view
FOR EACH ROW
BEGIN
  <SQL update statement comes here to update the underlying works_on table>
END
The treatment of triggers is not part of this introductory course
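
For readers who want to see the skeleton filled in, here is one possible sketch. It runs on SQLite through Python's sqlite3 module, because SQLite accepts INSTEAD OF triggers on views. The trigger body is an assumption: it implements meaning #1 (move the employee to the other project) and relies on employee and project names being unique in the invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (ssn INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE project (pnumber INTEGER PRIMARY KEY, pname TEXT);
    CREATE TABLE works_on (essn INTEGER, pno INTEGER, hours REAL);
    INSERT INTO employee VALUES (1, 'John Smith');
    INSERT INTO project VALUES (10, 'Project X'), (20, 'Project Y');
    INSERT INTO works_on VALUES (1, 10, 12.5);

    CREATE VIEW works_on_view AS
        SELECT E.name AS emp, P.pname AS proj, W.hours AS hrs
        FROM employee E, project P, works_on W
        WHERE E.ssn = W.essn AND P.pnumber = W.pno;

    -- Meaning #1: reassign the employee; the project table is left alone.
    CREATE TRIGGER works_on_view_update_trigger
    INSTEAD OF UPDATE ON works_on_view
    FOR EACH ROW
    BEGIN
        UPDATE works_on
        SET pno = (SELECT pnumber FROM project WHERE pname = NEW.proj)
        WHERE essn = (SELECT ssn FROM employee WHERE name = OLD.emp)
          AND pno = (SELECT pnumber FROM project WHERE pname = OLD.proj);
    END;
""")

conn.execute("UPDATE works_on_view SET proj = 'Project Y' WHERE emp = 'John Smith'")

view_rows = conn.execute("SELECT emp, proj, hrs FROM works_on_view").fetchall()
project_names = conn.execute(
    "SELECT pname FROM project ORDER BY pnumber").fetchall()

print(view_rows, project_names)
```

Choosing meaning #2 instead would put an UPDATE on the project table in the trigger body; the point is that the trigger author, not the DBMS, resolves the ambiguity.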