Anda di halaman 1dari 142

2

Introduction to
Database Management Systems
(DBMS)
Database Management
System (DBMS)
Definitions:

 Data: Known facts that can be


recorded and that have implicit
meaning
 Database: Collection of related
data
 Ex. the names, telephone numbers
and addresses of all the people you
know
 Database Management System: 4
DBMS (Contd.)
 Goals of a Database
Management System:
 To provide an efficient as well as
convenient environment for
accessing data in a database
 Enforce information security:
database security, concurrence
control, crash recovery
 It is a general purpose facility
5
Benefits of database
approach
 Redundancy can be reduced
 Inconsistency can be avoided
 Data can be shared
 Standards can be enforced
 Security restrictions can be applied
 Integrity can be maintained
 Data independence can be provided

6
DBMS Functions
 Data Definition
 Data Manipulation
 Data Security and Integrity
 Data Recovery and Concurrency
 Data Dictionary
 Performance

7
Database System
Users

DATABASE Application Programs/Queries


SYSTEM
DBMS
Software Software to process queries/programs

Software to access stored data

Stored Data Defn.


(META-DATA). Stored Database

8
Database System

Application user Database


program query
Q2 query Q1 scheme

Query processor DDL compiler

Compiled query
Q2 Database
Database manager description

File manager

Physical
database

9
Data Model
 A set of concepts used to describe the
structure of a database
 By structure, we mean the data types,
relationships, and constraints that
should holds for theofdata
Categories Data
Models

Conceptua Physical Representation


l al
10
Database Architecture
External level
(individual user
views)

Conceptual level
(community user view)

Internal level
(storage view)

Database

11
An example of the three
levels
SNo FName LName Age Salary BranchNo
Conceptual View
struct STAFF { Internal
int staffNo;
int branchNo;
View
SNo FName LName Age Salary char fName[15];
char lName[15];
struct date dateOfBirth;
External View1 float salary;
struct STAFF *next;
/* pointer to next Staff record
*/
SNo LName BranchNo };
index staffNo; index branchNo;
External View2 /* define indexes for staff */

12
Schema
 Schema: Description of data in terms
of a data model
 Three-level DB Architecture defines
following schemas:
 External Schema (or sub-schema)
 Written using external DDL
 Conceptual Schema (or schema)
 Written using conceptual DDL
 Internal Schema
 Written using internal DDL or storage
structure definition 13
Data Independence
 Change the schema at one level of a
database system without a need to
change the schema at the next
higher level
 Logical data independence: Refers to
the immunity of the external schemas
to changes in the conceptual schema
e.g., add new record or field
 Physical data independence: Refers to
the immunity of the conceptual schema 14
TYPES OF DATABASE
MODELS
HIERARCHICAL

NETWORK

COLUMN

TABLE
ROW VALUE

RELATIONAL

15
DATABASE DESIGN
PHASES
DATA ANALYSIS
Entities - Attributes - Relationships - Integrity Rules

LOGICAL DESIGN
Tables - Columns - Primary Keys - Foreign Keys

PHYSICAL DESIGN
DDL for Tablespaces, Tables, Indexes

16
Introduction to
Relational
Databases:
RDBMS
Definition : RDBMS
 It is a system in which, at a minimum
:
 The data is perceived by the user as
tables ( and nothing but tables ); and
 The operators at the user’s disposal -
e.g., for data retrieval - are operators
that generate new tables from old, and
those include at least SELECT, PROJECT,18
Features of an RDBMS

 The ability to create multiple


relations (tables) and enter data into
them
 An interactive query language
 Retrieval of information stored in
more than one table
 Provides a Catalog or Dictionary,
which itself consists of tables ( called
system tables ) 19
Some Important Terms
 Relation : a table
 Tuple : a row in a table
 Attribute : a Column in a table
 Degree : number of attributes
 Cardinality : number of tuples
 Primary Key : a unique identifier for the
table
 Domain : a pool of values from which 20
Properties of Relations
(Tables)
 There are no duplicate rows (tuples)
 Tuples are unordered, top to bottom
 Attributes are unordered, left to right
 All attribute values are atomic ( or
scalar )
 Relational databases do not allow
repeating groups
21
Keys
 Key
 Super Key
 Candidate Keys
 Primary Key
 Alternate Key
 Secondary Keys

22
Keys and Referential
Integrity
Enrolled Student

sid cid grade sid name login age gpa


53666 carnatic101 C
53666 Jones Jones@cs 18 3.4
53688 reggae203 B
53688 Smith Smith@eecs 18 3.2
53650 topology112 A
53650 Smith Smith@math 19 3.8
53666 history105 B

Primary key
Foreign key referring to
sid of STUDENT relation

23
24
Relational
Algebra
Relational Query Languages
 Query languages: Allow manipulation
and retrieval of data from a
database.
 Relational model supports simple,
powerful QLs:
 Strong formal foundation based on
logic.
 Allows for much optimization.

 Query Languages != programming


languages! 26
Example Instances
sid bid day

R1 22 101 10/10/99
58 103 11/12/99

sid sname rating age


S1 22 Deepa 7 45.0
31 Laxmi 8 55.5
58 Roopa 10 35.0

sid sname rating age


28 Yamuna 9 35.0
S2
31 Laxmi 8 55.5
44 Geeta 5 35.0
58 Roopa 10 35.0

27
Relational Algebra
 Basic operations:
 Selection (σ )
 Projection (π)
 Cross- product ( Χ)
 Set- difference ( –)
 Union (∪ )

28
Projection

sname rating
Yamuna 9 π sname, rating(S2)
Laxmi 8
Geeta 5
Roopa 10

age
35.0
55.5 π age(S2)

29
Selection

sid sname rating age


28 Yamuna 9 35.0 σ rating > 8(S2)
58 Roopa 10 35.0

sname rating
Yamuna 9
Roopa 10 π sname, rating(S2) (σ rating > 8(S2))

30
Union, Intersection, Set
Difference
sid sname rating age
22 Deepa 7 45.0 S1 ∪ S2
31 Laxmi 8 55.5
58 Roopa 10 35.0
44 Geeta 5 35.0
28 Yamuna 9 35.0

sid sname rating age


31 Laxmi 8 55.5 S1 ∩ S2
58 Roopa 10 35.0

sid sname rating age


22 Deepa 7 45.0 S1 − S2

31
Cross- Product
(sid) sname rating age (sid) bid day
22 Deepa 7 45.0 22 101 10/10/99
22 Deepa 7 45.0 58 103 11/12/99
31 Laxmi 8 55.5 22 101 10/10/99
31 Laxmi 8 55.5 58 103 11/12/99
58 Roopa 10 35.0 22 101 10/10/99
58 Roopa 10 35.0 58 103 11/12/99

32
Joins

Condition Join :

(sid) sname rating age (sid) bid day


22 Deepa 7 45.0 22 101 10/10/99
31 Laxmi 8 55.5 58 103 11/12/99

33
Equi-Join

(sid) sname rating age bid day


22 Deepa 7 45.0 101 10/10/99
58 Roopa 10 35.0 103 11/12/99

34
Division
•Not supported as a primitive operator, but useful for
expressing queries like:
•Find sailors who have reserved all boats .
sno pno
pno
s1 p1 pno pno
A s1 p2 p2 p1
p2
s1 p3 p4 p2
B1
s1 p4 B2 p4
s2 p1 sno
s2 p2 s1 B3
sno
s2
s3 p2 s1 sno
s3
s4 p2
s4 s4 s1
s4 p4
A/B1 A/B2 A/B3

35
36
Introduction to
Query
Optimization
Processing A High-
level Query
Query in a high level language

SCANING, PARSING AND VALIDATING

Intermediate form of query

QUERY OPTIMIZER
Typical steps
Execution plan when processing
a high level
QUERY CODE GENERATOR
query.
Code to execute the query

RUNTIME DATABASE PROCESSOR

Result of query

38
Two Main Techniques for
Query
Optimization
 Heuristic Rules: A heuristic is a rule that
works well in most of cases, but not
always. General Idea:
 Many different relational algebra
expressions (and thus query trees) are
equivalent.
 Transform the initial query tree of a query
into an equivalent final query tree that is
efficient to execute.
 Cost based query optimization
39
Motivating Example
select *
from R1, R2, R3
where R1.r2no=R2.r2no NLJ
and R2.r3no=R3.r3no
and R1.a=5000
NLJ SS(R1, “a=5000”)

SS(R2) SS(R3)

40
Alternative Plans 1(No
Indexes)
select *
from R1, R2, R3
where R1.r2no=R2.r2no NLJ
and R2.r3no=R3.r3no
and R1.a=5000
NLJ SS(R3)

SS(R1, “a=5000”) SS(R2)

41
Alternative Plans 2
(With Indexes)
select *
from R1, R2, R3
where R1.r2no=R2.r2no NLJ
and R2.r3no=R3.r3no
and R1.a=5000
NLJ SS(R3)

IS(R1, “a=5000”) SS(R2)

42
43
Conceptual Design
Using the
Entity-
Relationship
Model
Overview of Database
Design
 Conceptual design : (ER Model is
used at this stage.)

 Schema Refinement : (Normalization)

 Physical Database Design and Tuning

45
E R Modeling
 Conceptual Schema Design
 Relational Calculus
- Formal Language for Relational
D/B.
Relational Calculus

Predicate Calculus Domain Calculus


SQL / Tuple Based Query By Examples
46
Design Phases…
Requirements Collection
& Analysis
Data Requirements

Functional Requirements Conceptual Design

User Defined Operations Entity Types, Constraints , Relationships


Data Flow Diagrams No Implementation Details.
Sequence Diagrams, Scenarios
Ensures Requirements
Logical Design Meets the Design
Data Model Mapping – Type of Database is identified

Physical Design
Internal Storage Structures / Access Path / File Organizations
47
E-R Modeling
 Entity
 is anything that exists and is
distinguishable
 Entity Set
 a group of similar entities
 Attribute
 properties that describe an entity
 Relationship
 an association between entities 48
Notations
ENTITY TYPE ( REGULAR )

WEAK ENTITY TYPE

RELATIONSHIP TYPE

WEAK RELATIONSHIP TYPE

49
Entity

Attributes
SSN NAME LOT
ssn name lot
123- 22- 3666Attishoo 48
231- 31- 5368Smiley 22
Employee 131- 24- 3650Smethurst 35
Entity Set
CREATE TABLE Employees
(ssn CHAR (11),
name CHAR (20),
lot INTEGER,
PRIMARY KEY (ssn))

50
Types of Relationships
1 1
1:1 student Is issued ID card

1:M students M 1 course


enrols in

M:M students M M tests


take

51
ER Model

ssn name lot since did dname budget

Employee Works_in Department


supervisor Sub-
ordinate

Reports_To

52
ER Model (Contd.)
Works_ In

SSN DID SINCE


123-22-3666 51 1/1/91 CREATE TABLE Works_ In(
ssn CHAR (11),
123-22-3666 56 3/3/93
did INTEGER,
231-31-5368 51 2/2/92 since DATE,
PRIMARY KEY (ssn, did),
FOREIGN KEY (ssn)
REFERENCES Employees,
FOREIGN KEY (did)
REFERENCES Departments)

53
Key Constraints

ssn name lot since did dname budget

Employee Department
Manages

54
Key Constraints for Ternary Relationships

ssn name lot since did dname budget

Employee Works_in Department

Location
capacity
address

55
Participation Constraints

ssn name lot since did dname budget

Employee Manages Department

Works_in

since

56
Weak Entities

ssn name lot cost pname


age

Employee Dependent
policy

57
ISA (‘is a’) Hierarchies

ssn name lot

Employee

Hrly_wages
Hrs_worked IsA
contractid

Hourly_Emp Contract_Emp

58
Aggregation

ssn name lot

Employee

monitors until

pid pbudget Started on did dname budget

project sponsors department

59
Entity vs. Attribute

Works_ In does not allow an employee to work in a department


for two or more periods (why?)

ssn name lot from to did dname budget

Employee Department
Works_in

60
Entity vs. Attribute (Contd.)

ssn name lot did dname budget

Employee Department
Works_in

from Duration to

61
Entity vs. Relationship

ssn name lot since DB did dname budget

Employee Department
manages

DB - Dbudget

62
Entity vs. Relationship

ssn name lot did dname budget

Employee Department
manages

since

Appt num Mgr_appt


DBudget

63
Binary vs. Ternary Relationships

ssn name lot pname age

Employee Dependent
covers

Policy

policyid cost

64
Binary vs. Ternary Relationships
Better Design
ssn name lot pname
age

Employee Dependent

purchaser Beneficiary

Policy
policyid cost

65
Constraints Beyond the ER Model

• Some constraints cannot be captured in ER diagrams:

• Functional dependencies

• Inclusion dependencies

• General constraints

66
E-R Diagram

DEPARTMENT
1
SUPPLIER
DEPT_
M
EMP
PROJ_ M
M M
M WORK M SUPP_
EMPLOYEE PROJECT PART_
M PROJ
1 1 PROJ_ SUPP_
MGR M PART
EMP_
DEP PART M

M M M
DEPENDENT
PART_
STRUC
TURE

67
Example to Start with ….
 An Example Database Application
called COMPANY which serves to
illustrate the ER Model concepts and
their schema design.

The following are collection from the


Client.

68
Analysis…
 Company :
Organized into Departments, Each
Department has a name, no and
manager who manages the
department. The Company keeps
track of the date that employee
managing the department. A
Department may have a Several
locations.
69
Analysis…
 Department :
A Department controls a number of
Projects each of which has a unique name
, no and a single Location.
 Employee :
Name, Age, Gender, BirthDate, SSN,
Address, Salary. An Employee is assigned
to one department, may work on several
projects which are not controlled by the
department. Track of the number of hours
per week is also controlled.
70
Analysis….
 Keep track of the dependents of each
employee for insurance policies : We
keep each dependant first name,
gender, Date of birth and relationship
to the employee.

71
Now to our Company…
DEPARTMENT
( Name , Number , { Locations } , Manager, Start Date )

PROJECT
( Name, Number, Location , Controlling Department )

EMPLOYEE
(Name (Fname, Lname) , SSN , Gender, Address, Salary
Birthdate, Department , Supervisor , (Workson ( Project , Hrs))

DEPENDENT
( Employee, Name, Gender, Birthdate , Relationship )

72
Example …
 Manage:
 Department and Employee
 Partial Participation
 Relation Attribute : StartDate.
 Works For:
 Department and Employee
 Total Participation

73
Example…
 Control :
 Department , Project
 Partial Participation from Department

 Total Participation from Project

 Control Department is a RKA.

 Supervisor :
 Employee, Employee
 Partial and Recursive

74
Example …
 Works – On :
 Project , Employee
 Total Participation

 Hours Worked is a RKA.

 Dependants of:
 Employee , Dependant
 Dependant is a Weaker

 Dependant is Total , Employee is Partial.

75
One Possible mapping of the
Problem StatementName No Loc
Lname
Fname Sal Work
s For Department
Sex
SSN
Name
Sdate
Address Control
Employee s

Bdate manage
s
Hours
Project
Work
Supe sOn
rvise
s Name No Loc

Depend On

Dependent Relationship

Name Sex Bdate 76


77
78
79
80
Schema
Refinement and
Normalization
Normalization and
Normal Forms
 Normalization:
 Decomposing a larger, complex table
into several smaller, simpler ones.
 Move from a lower normal form to a
higher Normal form.
 Normal Forms:
 First Normal Form (1NF)
 Second Normal Form (2NF)
 Third Normal Form (3NF)
 *Higher Normal Forms (BCNF, 4NF, 5NF
....)
 In practice, 3NF is often good enough. 82
Why Normal Forms
 The first question to ask is whether
any refinement is needed!
 If a relation is in a certain normal
form (BCNF, 3NF etc.), it is known
that certain kinds of problems are
avoided/ minimized. This can be used
to help us decide whether
decomposing the relation will help.
83
The Evils of Redundancy
 Redundancy is at the root of several
problems associated with relational
schemas
 More seriously, data redundancy
causes several anomalies: insert,
update, delete
 Wastage of storage.
 Main refinement technique:
decomposition (replacing ABCD with,
say, AB and BCD, or ACD and ABD). 84
Refining an ER Diagram -
Before
ssn name lot since did dname budget

Employee Works_in Department

85
Refining an ER Diagram -
After

ssn name since did dname budget


lot
Employee Department
Works_in

86
First Normal Form
 A table is in 1NF, if every row contains
exactly one value for each attribute.
 Disallow multivalued attributes,
composite attributes and their
combinations.
 1NF states that :
 domains of attributes must include only
atomic (simple, indivisible) values and that
value of any attribute in a tuple must be a
single value from the domain of that
87
Functional Dependencies
(FDs)
 Provide a formal mechanism to
express constraints between
attributes
 Given a relation R, attribute Y of R is
functionally dependent on the
attribute X of R if & only if each X-
value in R has associated with it
precisely one Y-value in R.
88
Full Dependency
 Concept of full functional
dependency
 A FD x → y is a full functional
dependency if removal of any attribute
A from X means that the dependency
does not hold any more.

89
Partial Dependency
 An F.D. x → y is a partial dependency
if there is some attribute A ∈ X that
can be removed from X and the
dependency will still hold.

90
Example: Constraints on
Entity Set
S N L R W H
123- 22- 3666 Attishoo 48 8 10 40
231- 31- 5368 Smiley 22 8 10 30
131- 24- 3650 Smethurst 35 5 7 30
434- 26- 3751 Guldu 35 5 7 32
612- 67- 4134 Madayan 35 8 10 40

S N L H R
123- 22- 3666 Attishoo 48 40 8
8 R W
231- 31- 5368 Smiley 22 30
5 5 7
131- 24- 3650 Smethurst 35 30
8 10
434- 26- 3751 Guldu 35 32 5
612- 67- 4134 Madayan 35 40 8
91
Second Normal Form
(2NF)
 A relation schema R is in 2NF if:
 it is in 1NF and
 every non-prime attribute A in R is fully
functionally dependent on the primary
key of R.
 2NF prohibits partial dependencies.

92
2NF: An Example
 Emp{Eno, Dept, ProjCode, Hours}
 Primary key: {Eno, ProjCode}
 {Eno} -> {Dept}, {Eno, ProjCode} ->
{Hours}
 Test of 2NF
 {Eno} -> {Dept}: partial dependency.
 Emp is in 1NF, but not in 2NF.
 Decomposition:
 Emp {Eno, Dept} 93
Transitive Dependency
 An FD X → Y in a relation schema R is
a transitive dependency if
 there is a set of attributes Z that is not a
subset of any key of R, and
 both X → Z and Z → Y hold.

94
Third Normal Form
 A relation schema R is in 3NF if
 It is in 2NF and
 No nonprime attribute of R is
transitively dependent on the primary
key.
 3NF means that each non-key attribute value
in any tuple is truly dependent on the Primary
Key and not even partially on other attributes.
 3NF prohibits transitive 95
3NF: An Example
 Emp{Eno, Dept, Dept_Head}
 Primary key: {Eno}
 {Eno} -> {Dept}, {Dept} -> {Dept_Head}

 Test of 3NF
 {Eno} -> {Dept} -> {Dept_Head}:
Transitive dependency.
 Emp is in 2NF, but not in 3NF.

 Decomposition:
 Emp {Eno, Dept}
 Dept {Dept, Dept_Head} 96
Boyce –Codd Normal
Form
 The intention of BCNF is that- 3NF
does not satisfactorily handle the
case of a relation processing two or
more composite or overlapping
candidate keys

97
BCNF ( Boyce Codd
Normal Form)
 A Relation is said to be in Boyce
Codd Normal Form (BCNF) if and only
if every determinant is a candidate
key.

98
Decomposition of a
Relation Scheme
 Suppose that relation R contains
attributes A1 ... An. A decomposition
of R consists of replacing R by two or
more relations such that:
 Each new relation scheme contains a
subset of the attributes of R (and no
attributes that do not appear in R), and
 Every attribute of R appears as an
attribute of one of the new relations.

99
100
101
102
103
104
105
106
Transaction,
Concurrency
Control and
Recovery
Transaction
 A sequence of many actions which
are considered to be one atomic unit
of work.
 Read, write, commit, abort
 Governed by four ACID properties:
 Atomicity, Consistency, Isolation,
Durability
 Has a unique starting point, some
actions and one end point
108
The ACID Properties
 A tomicity: All actions in the
transaction happen, or none happen.
 C onsistency: If each transaction is
consistent, and the DB starts
consistent, it ends up consistent.
 I solation: Execution of one
transaction is isolated from that of
other transactions.
 D urability: If a transaction commits,
its effects persist.
109
Automicity
 All-or-nothing, no partial results. An event either
happens and is committed or fails and is rolled
back.
 e.g. in a money transfer, debit one account,
credit the other. Either both debiting and
crediting operations succeed, or neither of
them do.
 Transaction failure is called Abort
 Commit and abort are irrevocable actions. There
is no undo for these actions.
 An Abort undoes operations that have already
been executed
 For database operations, restore the data’s
previous value from before the transaction
(Rollback-it); a Rollback command will undo
all actions taken since the last commit for that110
Consistency
 Every transaction should maintain DB
consistency
 Referential integrity - e.g. each order
references an existing customer number
and existing part numbers
 The books balance (debits = credits,
assets = liabilities)
 Consistency preservation is a property of a
transaction, not of the database
mechanisms for controlling it (unlike the A,
I, and D of ACID)
 If each transaction maintains consistency,111
Isolation
Intuitively, the effect of a set of transactions
should
be the same as if they ran independently.
 Formally, an interleaved execution of
transactions is serializable if its effect is
equivalent to a serial one.
 Implies a user view where the system runs
each user’s transaction stand-alone.
 Of course, transactions in fact run with lots
of concurrency, to use device parallelism –
this will be covered later.
 Transactions can use common data (shared
data)
 They can use the same data processing
112
Durability
 When a transaction commits, its results
will survive failures (e.g. of the application,
OS, DB system … even of the disk).
 Makes it possible for a transaction to be a
legal contract.
 Implementation is usually via a log
 DB system writes all transaction updates
to a log file
 to commit, it adds a record “commit(Ti)”
to the log
 when the commit record is on disk, the
transaction is committed. 113
Transaction processing
Can be automatic (controlled by the
RDBMS) or programmatic
(programmed using SQL or other
supported programming languages,
like PL/SQL)

114
Why Have Concurrent
Processes?
 Better transaction throughput
 Improved response time
 Done via better utilization of
resources:
 While one processes is doing a disk
read, another can be using the CPU or
reading another disk.

115
Typical situations requiring
concurrency control
 Exclusive access to an external
device or shared service (e.g.,
managing printer queues)
 Coordination of applications which
process parallel data (e.g. parallel DB
servers)
 Disabling or enabling execution of
the client programs in a specific
moment (typically for database
administration - e.g. database
backups, enforcing resource
occupation, etc.) 116
Problems with Concurrency (in
absence of locking)
 Lost Update problem - losing values
due to intervention of write operation
from other overlapping transactions
 Temporary Update problem -
discarding previous changes made
by overlapping transaction after
rollback
 Incorrect Summary problem -
overwriting of certain
values used for calculation by write
operations from other transactions
117
Lost Update Problem

Time Transaction Value Transaction


A B
T0 Start A 6

T1 Read Value 6 Start B


(6)
T2 Add 2 (6+2=8) 6 Read Value
(6)
T3 Write Value 8 Add 3
(8) (6+3=9)
T4 End A 9 Write Value
(9)
T5 9 End B
 What should the final Order Value be?
 Which Update has been lost?
118
Temporary Update Problem
Time Transaction A Value Transaction
B
T0 Start A 6

T1 Read Value (6) 6

T2 Add 2 (8) 6

T3 Write Value 8 Start B


(8)
T4 Failure: 8 Read Value
Rollback! (8)
T5 Write Value (6) 6 Add 3
(8+3=11)
T6 End A 11 Write Value
(11)
T5 11 End B
 What should the final Order Value be?
 Where is the temporary update?
119
Incorrect Summary Problem
Time Transaction Values Transaction B
A 6
T0 Read 1 Value
st

(6) 3

T1 6
Add 2 (6+2=8) 3
T2 Write 1st Value 8
(8) 3

T3 Read 2nd Value 8 Read 1st Value


(3) 3 (8)
8 Read 2nd Value (3)
T4 Add 2 (3+2 = 5) 3
8 Total Sum = 11
T5 Write 2 Value
nd 5
 should(5)
What should the total Order Value be?
the total
 Which order was accumulated before update,
and which after?
120
t1 t2
3.1 Database State and Changes

State D1 State D2

T
D1, D2 - Logically consistent states of the database data
T- Transaction for changing the database
t1, t2 - Absolute time before and after the transaction

121
Progress
A transaction reaches its commit point when all
operations accessing the database are completed
and the result has been recorded in the log. It then
writes a [commit, <transaction-id>] and terminates.
BEGIN END COMMIT
active partially committed
committed

ROLLBACK
READ , WRITE ROLLBACK

aborted terminated

When a system failure occurs, search the log file for entries
[start, <transaction-id>]
and if there are no logged entries [commit, <transaction-id>]
then undo all operations that have logged entries
[write, <transaction-id>, X, old_value, new_value]

122
Schedules
• Schedule: Actions of transactions as seen by the DBMS

T1 T2
R(A)
W(A)
R(B)
W(B)
R(C)
W(C)

123
Serializable Schedule
 A schedule whose effect on the DB
“state” is the same as that of some
serial schedule
 All serial schedules are serializable
 But the reverse may not be true

124
Serializability Violations
Transfer Add 6%
Rs.10,000 interest to
from A to B A&B
T1 T2
R(A)
W(A)
R(A)
W(A)
Database is R(B)
inconsistent! W(B)
commit
R(B)
W(B)
commit

125
Cascading Aborts

T1 T2
R(A)
W(A)
R(A)
W(A)
abort

126
Recoverable Schedules
Unrecoverable Schedule Recoverable Schedule

T1 T2 T1 T2
R(A) R(A)
W(A) W(A)
R(A) R(A)
W(A) W(A)
commit commit
abort commit

127
Locking
 The concept of locking data items is one of the
main techniques for controlling the concurrent
execution of transactions.
 A lock is a variable associated with a data item in
the database.
 Generally there is a lock for each data item in
the database.
 A lock describes the status of the data item with
respect to possible operations that can be applied
to that item
 used for synchronising the access by
concurrent transactions to the database items.
 A transaction locks an object before using it
 When an object is locked by another transaction,128
Locking Granularity
 A database item which can be locked could be
 a database record
 a field value of a database record
 the whole database
 Trade-offs
 coarse granularity
 the larger the data item size, the lower the
degree of concurrency
 fine granularity
 the smaller the data item size, the more
locks to be managed and stored, and the
more lock/unlock operations needed.

129
Locking: A Technique for
Concurrency Control
•Locks are automatically obtained by DBMS.
•Guarantees serializability!

Compatibility matrix for lock types X and S


-- S X
-- √ √ √ S: Shared lock
S √ √ X: Exclusive lock
-- No lock
X √

130
Two- Phase Locking (2PL)
Strict 2PL:
– If T wants to read an object, first obtains an S lock.
– If T wants to modify an object, first obtains X lock.
– Hold all locks until end of transaction.
– Guarantees serializability, and recoverable schedule, too!
also avoids WW problems!
2PL:
– Slight variant of strict 2PL
– transactions can release locks before the end (commit or
abort)
But after releasing any lock it can acquire no new locks
– Guarantees serializability
131
Handling a Lock Request

Lock Request (XID, OID, Mode)

Mode==X Mode==S

Currently Locked? Empty Wait Queue?

Yes No Yes

Currently X-locked?
No Yes
Put on Queue

No
Grant Lock

132
133
Recovery
 Occurs in case of transaction failures.

 Database (DB) is restored to the most


recent consistent state just before the time
of failure.

 To do this, the DB system needs


information about changes applied by
various transactions. It is the system log.
134
Recovery: Motivation
crash
T1
T2
T3
T4
T5
•Atomicity: Undoing actions of transaction that do
not commit
•Durability: Making sure all actions of committed
transactions survive system crashes
•The Recovery Manager guarantees Atomicity &
Durability.
135
Recovery Outline
 Restore to most recent “consistent” state
just before time of failure
 Use data in the log file
 Catastrophic Failure
 Restore database from backup
 Replay transactions from log file

 Database becomes inconsistent (non-


catastrophic errors)
 Undo or Redo last transactions until
consistent state is restored 136
Logging
 Record REDO and UNDO information,
for every update, in a log.
– Sequential writes to log (put it on a
separate disk).
– Minimal info (diff) written to log, so
multiple updates fit in a single log page.

137
Handling the Buffer Pool
• When is buffer written back to disk?
• Steal/No-steal
Can it be written before commit? (steal)
Or does it have to wait till after commit? (no-steal)
• Force/No-force
Is it written “immediately” after commit? (force)
Or can it remain in memory? (no-force)

NoSteal Steal

Force Trivial

NoForce Desired

138
Write- Ahead Logging
(WAL)
 The Write- Ahead Logging Protocol:
 Must force the log record for an update
before the corresponding data page gets
to disk.
 Must write all log records for a
transaction before commit .
 What goes into log:
 BFIM needed for UNDO type algorithms
 AFIM needed for REDO type algorithms 139
Checkpoints in the System
Log
 Checkpoint record written in log when all
updated DB buffers written out to disk
 Any committed transaction occurring before
checkpoint in log can be considered
permanent (won’t have to be redone after
crash)
 Actions
 suspend execution of all transactions
 force-write all modified buffers to disk
 write checkpoint entry in log and force
write log
 resume transactions
 Fuzzy checkpointing 140
141
142

Anda mungkin juga menyukai