DBMS Module-2

GANDHI INSTITUTE FOR EDUCATION & TECHNOLOGY
DEPARTMENT OF COMPUTER SCIENCE

PREPARED BY Asst. Prof. SANTOSH KUMAR RATH
DATABASE MANAGEMENT SYSTEM

(FOR BOTH CSE 4TH SEM / MECH 3RD SEM DEPARTMENT)
MODULE-II

Query Language: a language in which a user requests information from the database.
Categories of languages
1. Procedural language.
2. Nonprocedural language.

Procedural Language: in a procedural language the user instructs the system
to perform a sequence of operations on the database to compute the
desired result. Example: Relational Algebra.

Non-Procedural Language: in a non-procedural language the user describes
the information desired without giving a specific procedure for obtaining that
information. Examples: Tuple and Domain Relational Calculus.

Relational Algebra:
- It is a procedural language.
- It consists of a set of operations that take one or two relations as input and
produce a new relation as output.
- Six basic operators are used in relational algebra.
1. select: σ
2. project: π
3. union: ∪
4. set difference: −
5. Cartesian product: ×
6. rename: ρ
1. Selection Operation:
- It is a unary operation.
- It is represented by the lowercase Greek letter sigma (σ).
- Syntax: σ_<selection-condition>(<Relation>)
- Notation: σ_p(r)
- p is called the selection predicate.
- Defined as: σ_p(r) = {t | t ∈ r and p(t)}
Where p is a formula in propositional calculus consisting of terms
connected by: ∧ (and), ∨ (or), ¬ (not)
Each term is one of: <attribute> op <attribute> or <attribute> op <constant>
Where op is one of: =, ≠, >, ≥, <, ≤
- Example of selection:
σ_{branch_name = "Perryridge"}(account)
2. Project Operation
- Notation: π_{A1, A2, ..., Ak}(r), where A1, A2, ..., Ak are attribute names and r is a relation name.
- The result is defined as the relation of k columns obtained by erasing the
columns that are not listed.
- Duplicate rows are removed from the result, since relations are sets.
- Example: to eliminate the branch_name attribute of account:
π_{account_number, balance}(account)
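The selection and projection operators can be illustrated with a minimal Python sketch. It assumes a relation is modelled as a list of dictionaries; the function names select and project and the sample account rows are hypothetical and used only for illustration.

```python
# Hypothetical sample rows; attribute names follow the account schema in the text.
account = [
    {"account_number": "A-101", "branch_name": "Downtown", "balance": 500},
    {"account_number": "A-102", "branch_name": "Perryridge", "balance": 400},
    {"account_number": "A-201", "branch_name": "Perryridge", "balance": 900},
]

def select(relation, predicate):
    """Selection: keep the tuples t of the relation for which predicate(t) is true."""
    return [t for t in relation if predicate(t)]

def project(relation, attributes):
    """Projection: keep only the listed attributes and drop duplicate rows."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:          # relations are sets, so duplicates are removed
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result

perryridge = select(account, lambda t: t["branch_name"] == "Perryridge")
branches = project(account, ["branch_name"])
print(perryridge)
print(branches)
```

Note that project returns two rows for three input rows, because the duplicate branch name is removed.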
3. Union Operation
- Notation: r ∪ s
- Defined as: r ∪ s = {t | t ∈ r or t ∈ s}
- For r ∪ s to be valid:
1. r, s must have the same arity (same number of attributes).
2. The attribute domains must be compatible (example: the 2nd column
of r deals with the same type of values as does the 2nd
column of s).
- Example: to find all customers with either an account or a loan:
π_{customer_name}(depositor) ∪ π_{customer_name}(borrower)
Union Operation Example (figure: relations r, s and the result r ∪ s)

4. Set Difference Operation
- Notation: r − s
- Defined as:
r − s = {t | t ∈ r and t ∉ s}
- Set differences must be taken between compatible relations.
- r and s must have the same arity.
- Attribute domains of r and s must be compatible.

Set Difference Operation Example (figure: relations r, s and the result r − s)
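Union and set difference map directly onto Python's built-in set operations. The sketch below assumes each relation is a set of tuples of the same arity; the customer names are hypothetical sample data.

```python
# Each relation is a set of 1-tuples (arity 1), so the two are compatible.
depositor_names = {("Hayes",), ("Johnson",), ("Smith",)}
borrower_names = {("Jones",), ("Smith",), ("Hayes",)}

def union(r, s):
    """r ∪ s: tuples that are in r or in s."""
    return r | s

def difference(r, s):
    """r − s: tuples that are in r but not in s."""
    return r - s

either = union(depositor_names, borrower_names)
only_borrowers = difference(borrower_names, depositor_names)
print(sorted(either))
print(sorted(only_borrowers))
```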
5. Cartesian Product Operation:
- Notation: r × s
- Defined as: r × s = {t q | t ∈ r and q ∈ s}
- Assume that the attributes of r(R) and s(S) are disjoint (that is, R ∩ S = ∅).
- If the attributes of r(R) and s(S) are not disjoint, then renaming must be used.

Cartesian Product Operation Example (figure: relations r, s and the result r × s)
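A minimal sketch of the Cartesian product follows, assuming relations are lists of dictionaries. The helper name cartesian_product and the sample rows are hypothetical. Attribute names are prefixed with the relation name, which plays the role of the renaming needed when the two schemas share an attribute such as loan_number.

```python
from itertools import product

def cartesian_product(r, r_name, s, s_name):
    """r × s: pair every tuple of r with every tuple of s.

    Attributes are prefixed with the relation name so the schemas stay disjoint.
    """
    result = []
    for t, q in product(r, s):
        row = {f"{r_name}.{k}": v for k, v in t.items()}
        row.update({f"{s_name}.{k}": v for k, v in q.items()})
        result.append(row)
    return result

borrower = [
    {"customer_name": "Jones", "loan_number": "L-170"},
    {"customer_name": "Smith", "loan_number": "L-230"},
]
loan = [
    {"loan_number": "L-170", "branch_name": "Downtown", "amount": 3000},
    {"loan_number": "L-230", "branch_name": "Redwood", "amount": 4000},
    {"loan_number": "L-260", "branch_name": "Perryridge", "amount": 1700},
]

pairs = cartesian_product(borrower, "borrower", loan, "loan")
print(len(pairs))  # one row per (borrower tuple, loan tuple) pair
```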

Example Queries:
Q1. Find all loans of over $1200.
Ans: σ_{amount > 1200}(loan)

Q2. Find the loan number for each loan of an amount greater than $1200.
Ans: π_{loan_number}(σ_{amount > 1200}(loan))

Q3. Find the names of all customers who have a loan, an account, or both,
from the bank.
Ans: π_{customer_name}(borrower) ∪ π_{customer_name}(depositor)
Q4. Find the names of all customers who have a loan at the Perryridge
branch.
Ans: π_{customer_name}(σ_{branch_name = "Perryridge"}
(σ_{borrower.loan_number = loan.loan_number}(borrower × loan)))
Q5. Find the names of all customers who have a loan at the
Perryridge branch but do not have an account at any branch of
the bank.
Ans: π_{customer_name}(σ_{branch_name = "Perryridge"}
(σ_{borrower.loan_number = loan.loan_number}(borrower × loan)))
− π_{customer_name}(depositor)
Q6. Find the names of all customers who have a loan at the Perryridge
branch.
Ans: π_{customer_name}(σ_{branch_name = "Perryridge"}(
σ_{borrower.loan_number = loan.loan_number}(borrower × loan)))
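The way Q4 and Q5 compose product, selection, projection and set difference can be sketched in Python. The tuples below are hypothetical sample data, and relations are modelled as lists of plain tuples for brevity.

```python
# Hypothetical sample data for borrower, loan and depositor.
borrower = [("Jones", "L-170"), ("Hayes", "L-260"), ("Adams", "L-261")]
loan = [
    ("L-170", "Downtown", 3000),
    ("L-260", "Perryridge", 1700),
    ("L-261", "Perryridge", 1500),
]
depositor = [("Hayes", "A-102"), ("Johnson", "A-101")]

# Q4: the two nested loops form borrower × loan, the "if" applies both
# selections, and collecting only the name is the projection.
perryridge_borrowers = {
    name
    for (name, b_loan) in borrower
    for (l_loan, branch, amount) in loan
    if b_loan == l_loan and branch == "Perryridge"
}

# Q5: subtract the customers who appear in depositor.
no_account = perryridge_borrowers - {name for (name, _) in depositor}

print(sorted(perryridge_borrowers))
print(sorted(no_account))
```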
Formal Definition:
A basic expression in the relational algebra consists of one of the
following:
- A relation in the database
- A constant relation
Let E1 and E2 be relational algebra expressions; the following are all
relational algebra expressions:
- E1 ∪ E2
- E1 − E2
- E1 × E2
- σ_p(E1), where p is a predicate on attributes in E1
- π_s(E1), where s is a list consisting of some of the attributes in E1
- ρ_x(E1), where x is the new name for the result of E1
Additional Operations
We define additional operations that do not add any power to the
relational algebra, but that simplify common queries.
- Set intersection
- Natural join
- Division
- Assignment
Set Intersection Operation:
- Notation: r ∩ s
- Defined as: r ∩ s = {t | t ∈ r and t ∈ s}
- Assume: r and s have the same arity, and the attributes of r and s are compatible.
- Note: r ∩ s = r − (r − s)

Set Intersection Operation Example (figure: relations r, s and the result r ∩ s)
Natural Join Operation:
- Notation: r ⋈ s
- Let r and s be relations on schemas R and S respectively.
Then r ⋈ s is a relation on schema R ∪ S obtained as follows:
- Consider each pair of tuples tr from r and ts from s.
- If tr and ts have the same value on each of the attributes in R ∩ S, add a
tuple t to the result, where
t has the same value as tr on R
t has the same value as ts on S
- Example:
R = (A, B, C, D)
S = (E, B, D)
- Result schema = (A, B, C, D, E)
- r ⋈ s is defined as:
π_{r.A, r.B, r.C, r.D, s.E}(σ_{r.B = s.B ∧ r.D = s.D}(r × s))
Natural Join Operation Example (figure: relations r, s and the result r ⋈ s)
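The natural join procedure described above can be sketched in Python as follows. It assumes relations are non-empty lists of dictionaries with a fixed schema; the function name natural_join and the sample values are hypothetical, while the schemas R = (A, B, C, D) and S = (E, B, D) follow the example in the text.

```python
def natural_join(r, s):
    """Natural join of two relations given as lists of dicts."""
    if not r or not s:
        return []
    common = [a for a in r[0] if a in s[0]]      # attributes in R ∩ S
    result = []
    for tr in r:
        for ts in s:
            # keep the pair only if it agrees on every common attribute
            if all(tr[a] == ts[a] for a in common):
                result.append({**tr, **ts})      # common attributes appear once
    return result

r = [
    {"A": 1, "B": "x", "C": 10, "D": "p"},
    {"A": 2, "B": "y", "C": 20, "D": "q"},
]
s = [
    {"E": "e1", "B": "x", "D": "p"},
    {"E": "e2", "B": "y", "D": "p"},
]

joined = natural_join(r, s)
print(joined)
```

Only the first tuple of r finds a partner, because the second tuple agrees with s on B but not on D.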

Outer Join

An extension of the join operation that avoids loss of information.
- Computes the join and then adds to the result the tuples from one relation that do not
match any tuple in the other relation.
- Uses null values:
- null signifies that the value is unknown or does not exist.
- All comparisons involving null are (roughly speaking) false by definition.

Outer Join Example (figure: relation loan, relation borrower and their outer joins)

Left Outer Join: takes all the tuples from the left relation. Tuples that have no
match in the right relation are kept and padded with null values for the attributes of the right relation.

Right Outer Join: takes all the tuples from the right relation. Tuples that have no
match in the left relation are kept and padded with null values for the attributes of the left relation.
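A left outer join can be sketched in Python with None standing in for null. The function name left_outer_join, the single-attribute join key, and the sample loan and borrower rows are assumptions made for illustration.

```python
def left_outer_join(left, right, key):
    """Join on one shared attribute, keeping unmatched left tuples padded with None."""
    right_attrs = [a for a in right[0] if a != key] if right else []
    result = []
    for l in left:
        matches = [r for r in right if r[key] == l[key]]
        if matches:
            result.extend({**l, **r} for r in matches)
        else:
            # no partner in the right relation: pad its attributes with null
            result.append({**l, **{a: None for a in right_attrs}})
    return result

loan = [
    {"loan_number": "L-170", "branch_name": "Downtown", "amount": 3000},
    {"loan_number": "L-230", "branch_name": "Redwood", "amount": 4000},
    {"loan_number": "L-260", "branch_name": "Perryridge", "amount": 1700},
]
borrower = [
    {"customer_name": "Jones", "loan_number": "L-170"},
    {"customer_name": "Smith", "loan_number": "L-230"},
    {"customer_name": "Hayes", "loan_number": "L-155"},
]

left_rows = left_outer_join(loan, borrower, "loan_number")
# A right outer join is the same operation with the arguments swapped.
right_rows = left_outer_join(borrower, loan, "loan_number")
print(left_rows)
print(right_rows)
```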

Relational Calculus Languages:
- Tuple Relational Calculus
- Domain Relational Calculus
- Query-by-Example (QBE)

Tuple Relational Calculus:
- A nonprocedural query language, where each query is of the form
{t | P(t)}
- It is the set of all tuples t such that predicate P is true for t.
- t is a tuple variable; t[A] denotes the value of tuple t on attribute A.
- t ∈ r denotes that tuple t is in relation r.
- P is a formula similar to that of the predicate calculus.
Banking Example
Branch (branch_name, branch_city, assets )
Customer (customer_name, customer_street, customer_city )
Account (account_number, branch_name, balance )
Loan (loan_number, branch_name, amount )

Depositor (customer_name, account_number )
Borrower (customer_name, loan_number )

Example Queries
Q1. Find the loan_number, branch_name, and amount for loans of over
$1200.
Ans: {t | t ∈ loan ∧ t[amount] > 1200}

Q2. Find the loan number for each loan of an amount greater than $1200.
Ans: {t | ∃ s ∈ loan (t[loan_number] = s[loan_number] ∧ s[amount] > 1200)}
Notice that a relation on schema [loan_number] is implicitly
defined by the query.
Q3. Find the names of all customers having a loan, an account, or both at
the bank.
Ans: {t | ∃ s ∈ borrower (t[customer_name] = s[customer_name])
∨ ∃ u ∈ depositor (t[customer_name] = u[customer_name])}

Q4. Find the names of all customers who have a loan and an account
at the bank.
Ans: {t | ∃ s ∈ borrower (t[customer_name] = s[customer_name])
∧ ∃ u ∈ depositor (t[customer_name] = u[customer_name])}

Q5. Find the names of all customers having a loan at the Perryridge branch.
Ans: {t | ∃ s ∈ borrower (t[customer_name] = s[customer_name]
∧ ∃ u ∈ loan (u[branch_name] = "Perryridge" ∧ u[loan_number] = s[loan_number]))}

Q6. Find the names of all customers who have a loan at the
Perryridge branch, but no account at any branch of the bank.
Ans: {t | ∃ s ∈ borrower (t[customer_name] = s[customer_name]
∧ ∃ u ∈ loan (u[branch_name] = "Perryridge"
∧ u[loan_number] = s[loan_number])) ∧ ¬ ∃ v ∈ depositor
(v[customer_name] = t[customer_name])}
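Tuple relational calculus reads naturally as a set comprehension: each existential quantifier becomes a call to any, and negation becomes not. The sketch below expresses Q6 in that style. The rows are hypothetical sample data and relations are assumed to be lists of dictionaries.

```python
borrower = [
    {"customer_name": "Hayes", "loan_number": "L-260"},
    {"customer_name": "Adams", "loan_number": "L-261"},
    {"customer_name": "Jones", "loan_number": "L-170"},
]
loan = [
    {"loan_number": "L-260", "branch_name": "Perryridge", "amount": 1700},
    {"loan_number": "L-261", "branch_name": "Perryridge", "amount": 1500},
    {"loan_number": "L-170", "branch_name": "Downtown", "amount": 3000},
]
depositor = [
    {"customer_name": "Hayes", "account_number": "A-102"},
]

# Q6: a loan at Perryridge exists for the customer, and no depositor tuple does.
result = {
    s["customer_name"]
    for s in borrower
    if any(
        u["branch_name"] == "Perryridge" and u["loan_number"] == s["loan_number"]
        for u in loan
    )
    and not any(v["customer_name"] == s["customer_name"] for v in depositor)
}
print(result)
```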

Domain Relational Calculus:
- A nonprocedural query language equivalent in power to the tuple
relational calculus.
- Each query is an expression of the form:

{<x1, x2, ..., xn> | P(x1, x2, ..., xn)}
- x1, x2, ..., xn represent domain variables.
- P represents a formula similar to that of the predicate calculus.

Example Queries
Q1. Find the loan_number, branch_name, and amount for loans of over
$1200.
Ans: {<l, b, a> | <l, b, a> ∈ loan ∧ a > 1200}
Q2. Find the names of all customers who have a loan of over $1200.
Ans: {<c> | ∃ l, b, a (<c, l> ∈ borrower ∧ <l, b, a> ∈ loan ∧ a > 1200)}

Q3. Find the names of all customers who have a loan from the Perryridge
branch and the loan amount.
Ans: {<c, a> | ∃ l (<c, l> ∈ borrower ∧ ∃ b (<l, b, a> ∈ loan ∧ b =
"Perryridge"))}
or, equivalently,
{<c, a> | ∃ l (<c, l> ∈ borrower ∧ <l, "Perryridge", a> ∈ loan)}
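In the domain relational calculus the variables range over attribute values rather than whole tuples. A Python comprehension that unpacks tuples into separate variables gives a rough feel for this. The sketch below finds customer names and loan amounts for loans at the Perryridge branch; the data is hypothetical.

```python
# Relations as sets of plain tuples: borrower(c, l) and loan(l, b, a).
borrower = {("Hayes", "L-260"), ("Jones", "L-170")}
loan = {("L-260", "Perryridge", 1700), ("L-170", "Downtown", 3000)}

# c and a are the free variables of the query; l and b are bound by the
# two loops, which play the role of the existential quantifiers.
answer = {
    (c, a)
    for (c, l) in borrower
    for (l2, b, a) in loan
    if l == l2 and b == "Perryridge"
}
print(answer)
```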

Database Design Life Cycle:
Database design means designing the logical and physical
structure of the data stored in a database so that it meets the
information needs of different applications.
- The DDLC consists of the following phases.
Requirements collection and analysis:
- It is the first step in database design.
- During this step the database designers gather detailed requirements by
interacting with potential users to identify their particular
needs based on the problems.
Conceptual database design:
- It is the second step in database design.
- In this step a conceptual schema is created for the database
that is independent of any specific database management
system (DBMS).
- The conceptual schema includes a detailed description of the
users' data requirements, entity types, relationships and constraints.
- It is expressed using a high-level data model such as the
Entity-Relationship model.
Data Model mapping:
- It is also called logical database design.
- During this phase the conceptual schema is transformed
into the data model of the DBMS in which the database will be
implemented.
- The implementation may be carried out in a DBMS package such as Oracle,
MySQL or SQL Server.
Physical database design:
- In this phase we design the specification for storing the database
in terms of physical storage structures, access paths and file
organizations.
- This design corresponds to designing the internal schema of
the three-level DBMS architecture.

Functional Dependency:
- Functional dependencies are constraints on the set of legal
relations.
- They play an important role in database design.
- A functional dependency is denoted by α → β, between
two sets of attributes α and β.
- Let R be a relation schema with α ⊆ R and β ⊆ R. The
functional dependency α → β holds on R if, in any legal
relation r(R), for all pairs of tuples t1 and t2 in r such that
t1[α] = t2[α], it must also be the case that t1[β] = t2[β].
- An FD represents an interrelationship among attributes of an
entity represented by a relation.
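The definition can be checked mechanically on one relation instance: compare every pair of tuples and look for a pair that agrees on the left-hand side but differs on the right-hand side. The sketch below assumes a relation is a list of dictionaries; the function name holds and the sample rows are hypothetical. Passing this check on one instance does not prove that the dependency holds on every legal relation.

```python
from itertools import combinations

def holds(relation, alpha, beta):
    """Return True if the FD alpha -> beta is satisfied by this relation instance."""
    for t1, t2 in combinations(relation, 2):
        same_alpha = all(t1[a] == t2[a] for a in alpha)
        same_beta = all(t1[b] == t2[b] for b in beta)
        if same_alpha and not same_beta:
            return False                 # the pair (t1, t2) violates the FD
    return True

loan = [
    {"loan_number": "L-170", "branch_name": "Downtown", "amount": 3000},
    {"loan_number": "L-260", "branch_name": "Perryridge", "amount": 1700},
    {"loan_number": "L-261", "branch_name": "Perryridge", "amount": 1500},
]

print(holds(loan, ["loan_number"], ["amount"]))   # each loan number has one amount
print(holds(loan, ["branch_name"], ["amount"]))   # Perryridge has two different amounts
```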

Armstrong Axioms

- Let F be a set of functional dependencies. The closure of F is the
set of all functional dependencies logically implied by F.
- We denote the closure of F by F+.
- We can find F+ from F by using the following rules:
- A) Reflexive rule
- B) Augmentation rule
- C) Transitive rule
- These three rules are sound because they do not generate any
incorrect functional dependency.
- The three rules are complete because, for a given set F of FDs,
they allow us to generate all of F+.
- These three rules are known as Armstrong's axioms.
I. Reflexive Rule:
- If α is a set of attributes and β ⊆ α, then α → β holds.
- Proof: let t1 and t2 be two tuples of a relation r(R) such that
t1[α] = t2[α].
This means that t1 and t2 agree on all the attributes of α.
β is another set of attributes which is a subset of α,
so t1[β] = t2[β].
So α → β holds.
II. Augmentation Rule:
If α → β holds and γ is a set of attributes, then γα → γβ holds.
Proof: let r(R) be a relation on which α → β is satisfied.
Let us assume that γα → γβ does not hold on R. Then there are two tuples t1, t2 of r with
t1[γα] = t2[γα] -------- (1)
t1[γβ] ≠ t2[γβ] -------- (2)
From (1) we get
t1[γ] = t2[γ] -------- (3)
t1[α] = t2[α] -------- (4)
Since α → β is satisfied on R, from (4) we get
t1[β] = t2[β] -------- (5)
From (3) and (5) we get
t1[γβ] = t2[γβ] -------- (6)
(6) contradicts (2), so γα → γβ holds (proved).

III. Transitive Rule:

If α → β and β → γ hold, then α → γ holds.
Proof: let t1 and t2 be two tuples of a relation r(R) such that
t1[α] = t2[α] -------- (1)
Since α → β is satisfied on R, from (1) we get
t1[β] = t2[β] -------- (2)
Since β → γ is satisfied on R, from (2) we get
t1[γ] = t2[γ] -------- (3)
From (1) and (3), any two tuples that agree on α also agree on γ, so
α → γ holds on R (proved).

Closure of attribute sets:
Let R be a relation schema with a set of functional dependencies F, and
let α be a set of attributes. The closure of α under F, denoted by α+, is
the set of all attributes that are functionally determined by α under F.
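A common way to compute the closure of an attribute set is to keep applying the FDs until nothing new is added. The sketch below is one such implementation under stated assumptions: each FD is a pair (left-hand side, right-hand side) of strings whose characters are single-letter attribute names, and the function name closure and the small FD set are hypothetical.

```python
def closure(attrs, fds):
    """Return the closure of the attribute set attrs under the FDs in fds."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # if the left-hand side is already determined, so is the right-hand side
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Illustrative FDs: A -> B, B -> C, D -> E
fds = [("A", "B"), ("B", "C"), ("D", "E")]

print(sorted(closure("A", fds)))
print(sorted(closure("AD", fds)))
```

The loop stops because each pass either adds at least one attribute or leaves the set unchanged.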

Compute the candidate key:
1) Determine each set of attributes X that contains the left-hand side of
an FD in F.
2) Compute X+ under the given set of FDs in F.
3) If X+ = R and no proper subset of X has the same property, then X is a candidate key of R.
Example: let us consider R = {A, B, C, D, E, F, G, H, I, J}
and the set of FDs {AB->C, A->DE, B->F, F->GH, D->IJ}.
Find the candidate key of R.
Ans: A+ under F = {A, D, E, I, J}
B+ under F = {B, F, G, H}
D+ under F = {D, I, J}
F+ under F = {F, G, H}
AB+ under F = {A, B, C, D, E, F, G, H, I, J} = R
Neither A+ nor B+ equals R, so AB is minimal.
Hence AB is the candidate key of R.
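The search for candidate keys can be sketched as a brute-force loop over attribute subsets, smallest first, using attribute closure as the test. The sketch assumes that R also contains J, since J appears on the right-hand side of D->IJ. The function names closure and candidate_keys are hypothetical, and the approach is only practical for small schemas because it enumerates all subsets.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, where each FD is a (lhs, rhs) pair of strings."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """Return all minimal attribute sets whose closure is the whole schema."""
    schema = set(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            x = set(combo)
            if any(k <= x for k in keys):
                continue                 # contains a smaller key, so not minimal
            if closure(x, fds) == schema:
                keys.append(x)
    return keys

schema = "ABCDEFGHIJ"                    # assumption: J belongs to R
fds = [("AB", "C"), ("A", "DE"), ("B", "F"), ("F", "GH"), ("D", "IJ")]

keys = candidate_keys(schema, fds)
print(keys)
```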

Prime Attributes & Non-Prime Attributes:
An attribute A in a relation schema R is a prime attribute if A is
part of any candidate key of R.
If A is not part of any candidate key of R, A is called a non-prime
attribute.

Database normalization

Normalization is the process of efficiently organizing data in a
database. There are two goals of the normalization process:
eliminating redundant data (for example, storing the same data in
more than one table) and ensuring data dependencies make sense
(only storing related data in a table). Both of these are worthy goals
as they reduce the amount of space a database consumes and
ensure that data is logically stored.

First Normal Form - 1NF:
First normal form (1NF) sets the very basic rules for an organized
database:
- Eliminate duplicative columns from the same table.
- Create separate tables for each group of related data and identify
each row with a unique column or set of columns (the primary
key).
The first normal form only says that the table should include only
atomic values, i.e. one value per box. For example, in
Table 1 we cannot put both Volvo and SAAB in the same box even if
we buy cars from both suppliers. We must use two different rows for
storing that. Most RDBMSs do not allow more than
one value to be assigned to a box, with the result that all tables are in first
normal form.

Second Normal Form - 2NF

The second normal form says that a table, besides being in 1NF, is
not allowed to contain any partial functional dependencies, that is,
dependencies of a non-key attribute on only some components of the primary key.
- A relation schema R is in 2NF if every non-prime attribute A in R
is fully functionally dependent on the primary key of R.
- Functional dependency: functional dependencies are
constraints on the set of legal relations, and they play an
important role in database design.
- A functional dependency is denoted by α → β, between two
sets of attributes α and β.
- Let R be a relation schema with α ⊆ R and β ⊆ R. The
functional dependency α → β holds on R if, in any legal relation
r(R), for all pairs of tuples t1 and t2 in r such that
t1[α] = t2[α], it must also be the case that t1[β] = t2[β].
- Full functional dependency: α → β is a full functional dependency
if β does not depend on any proper subset of α.
- An FD represents an interrelationship among attributes of an
entity represented by a relation.
- A better definition of 2NF: to fulfil 2NF a table should fulfil
1NF and, in addition, every non-key attribute should be fully functionally dependent on
every candidate key.
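A 2NF check looks for partial dependencies: non-prime attributes that are already determined by a proper subset of a key. The sketch below tests this with attribute closure. The attribute names employee_id, skill and employee_address, the FD, and the function names are assumptions chosen for illustration.

```python
from itertools import combinations

def closure(attrs, fds):
    """Closure of attrs under fds, where each FD is a (lhs_set, rhs_set) pair."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def partial_dependencies(key, non_prime, fds):
    """List (key subset, attribute) pairs where a non-prime attribute
    depends on a proper subset of the key."""
    found = []
    key = sorted(key)
    for size in range(1, len(key)):                  # proper, non-empty subsets only
        for subset in combinations(key, size):
            for attr in sorted(closure(subset, fds) & set(non_prime)):
                found.append((set(subset), attr))
    return found

# Hypothetical table with key (employee_id, skill).
fds = [({"employee_id"}, {"employee_address"})]
key = {"employee_id", "skill"}

violations = partial_dependencies(key, {"employee_address"}, fds)
print(violations)
```

A non-empty result means the table is not in 2NF with respect to that key.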

Third Normal Form - 3NF
A table is said to be in third normal form if all the non-key fields of the
table are independent of the other non-key fields of the table.
- 3NF is based on the concept of transitive dependency.
- A functional dependency X->Y in a relation schema R is a
transitive dependency if there is a set of attributes Z that is
neither a candidate key nor a subset of any key, and both X->Z and
Z->Y hold.
- When a non-key attribute depends on other non-key
attributes, this is called a transitive dependency.
Example table: (Sl.NO, ORIGIN, DESTINATION, DISTANCE)

Here the non-key attribute Distance depends on the other non-key
attributes Origin and Destination.
(Origin, Destination) -> Distance therefore makes Distance transitively dependent on the key Sl.NO.
Boyce Codd Normal Form - BCNF

Every non-trivial functional dependency in the table is a
dependency on a super key.
- Trivial functional dependency:
A trivial functional dependency is a functional dependency of
an attribute on a superset of itself. {Employee ID, Employee
Address} → {Employee Address} is trivial, as is {Employee
Address} → {Employee Address}.
- BCNF is a simpler form of 3NF, but it is much stricter than 3NF. This
means that every BCNF relation is also in 3NF, but a relation
in 3NF is not necessarily in BCNF.
- A relation schema R is in BCNF with respect to a set F of FDs if,
for all FDs in F+ of the form α → β, where α ⊆ R and β ⊆ R, at least
one of the following rules holds.
- A) α → β is a trivial functional dependency (that is, β ⊆ α).
- B) α is a super key of schema R.

Example: PLOT(PID, C_NAME, PLOT_NO, AREA, PRICE, TAX_RATE)
(figure: dependencies FD1 to FD5 drawn over the schema)
FD1 and FD2 are the dependencies on the keys PID and (C_NAME, PLOT_NO).
FD3: C_Name → Tax_Rate
FD4: Area → Price
FD5: Area → C_Name
The PLOT schema is not in 3NF, since FD3 and FD4 violate
3NF. Hence it is decomposed as follows.
R1(Area, Price) with Area → Price
R2(C_Name, Tax_Rate) with C_Name → Tax_Rate
R(PID, C_NAME, PLOT_NO, AREA) with the remaining dependencies
The relation R(PID, C_NAME, PLOT_NO, AREA) is in 3NF but it is not in BCNF, since Area → C_Name
violates BCNF because Area is not a super key.
So R is decomposed into R3 and R4, in which PID is a super key.
The BCNF relations are:
R1(Area, Price)
R2(C_Name, Tax_Rate)
R3(Area, C_Name)
R4(PID, Plot_no, Area)
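The BCNF test can be sketched with attribute closure: a dependency violates BCNF when it is non-trivial and its left-hand side is not a super key. The sketch checks only the FDs that are listed, not all of F+. It assumes that PID and (C_NAME, PLOT_NO) are keys of the reduced relation, which the surviving text suggests but does not spell out; the function names are hypothetical.

```python
def closure(attrs, fds):
    """Closure of attrs under fds, where each FD is a (lhs_set, rhs_set) pair."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(schema, fds):
    """Return the listed FDs that are non-trivial and whose lhs is not a super key."""
    schema = set(schema)
    bad = []
    for lhs, rhs in fds:
        lhs, rhs = set(lhs), set(rhs)
        trivial = rhs <= lhs
        superkey = closure(lhs, fds) >= schema
        if not trivial and not superkey:
            bad.append((lhs, rhs))
    return bad

schema = {"PID", "C_NAME", "PLOT_NO", "AREA"}
fds = [
    ({"PID"}, {"C_NAME", "PLOT_NO", "AREA"}),   # assumed: PID is a key
    ({"C_NAME", "PLOT_NO"}, {"PID", "AREA"}),   # assumed: (C_NAME, PLOT_NO) is a key
    ({"AREA"}, {"C_NAME"}),                     # Area -> C_Name, from the text
]

violations = bcnf_violations(schema, fds)
print(violations)
```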
Fourth Normal Form - 4NF: A relation schema R is in 4NF with
respect to a set D of functional and multivalued dependencies if, for all
multivalued dependencies in D+ of the form α →→ β, where α ⊆ R and β ⊆ R, at
least one of the following rules holds.
1) α →→ β is a trivial multivalued dependency.
2) α is a super key of schema R.
A multivalued dependency α →→ β on schema R is
trivial if β ⊆ α or α ∪ β = R.

Table: loan_info
Loan_no   Cust_Name   Street    City
110       Ram         G.Nagar   B.Patana
110       Ram         I.Nagar   B.Patana
120       Hari        K.Nagar   Rkl
Here we find that cust_name →→ street, city is an MVD and
cust_name is not a super key of R. We replace loan_info with two
schemas.
Table: Borrower
Cust_Name   Loan_no
Ram         110
Hari        120
Table: Customer
Cust_Name   Street    City
Ram         G.Nagar   B.Patana
Ram         I.Nagar   B.Patana
Hari        K.Nagar   Rkl
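One way to sanity-check this decomposition is to project loan_info onto the two new schemas and join the projections back on cust_name. If the join reproduces exactly the original rows, no information was lost for this instance. The sketch below uses the rows from the tables above and models relations as sets of tuples, which is an assumption of the sketch.

```python
# Rows of loan_info as (loan_no, cust_name, street, city).
loan_info = {
    (110, "Ram", "G.Nagar", "B.Patana"),
    (110, "Ram", "I.Nagar", "B.Patana"),
    (120, "Hari", "K.Nagar", "Rkl"),
}

# Projections onto the two smaller schemas (sets remove duplicates).
borrower = {(name, loan_no) for (loan_no, name, street, city) in loan_info}
customer = {(name, street, city) for (loan_no, name, street, city) in loan_info}

# Natural join of the projections on cust_name.
rejoined = {
    (loan_no, name, street, city)
    for (name, loan_no) in borrower
    for (name2, street, city) in customer
    if name == name2
}

print(sorted(borrower))
print(rejoined == loan_info)
```

The loan number for Ram is now stored once in Borrower instead of once per address.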

Fifth Normal Form - 5NF: it is based on join dependencies and is also called
project-join normal form (PJNF).
- A relation schema R is in PJNF with respect to a set D of functional,
multivalued and join dependencies if, for all join dependencies
in D+ of the form *(R1, R2, R3, ..., Rn), where each Ri ⊆ R and
R = R1 ∪ R2 ∪ R3 ∪ ... ∪ Rn, at least one of the following rules
holds.
1) *(R1, R2, R3, ..., Rn) is a trivial join dependency.
2) Every Ri is a super key of R.
Join dependency: A table T is subject to a join dependency if T can
always be recreated by joining multiple tables each having a subset
of the attributes of T.
If R = R1 ∪ R2 ∪ R3 ∪ ... ∪ Rn, we say that a relation r(R) satisfies
the join dependency *(R1, R2, R3, ..., Rn) if
r = π_{R1}(r) ⋈ π_{R2}(r) ⋈ ... ⋈ π_{Rn}(r). The dependency
requires this to hold for all legal r(R).

Important Points
Functional dependency
In a given table, an attribute Y is said to have a functional
dependency on a set of attributes X (written X → Y) if and only if
each X value is associated with precisely one Y value. For example,
in an "Employee" table that includes the attributes "Employee ID"
and "Employee Date of Birth", the functional dependency {Employee
ID} → {Employee Date of Birth} would hold. It follows from the
previous two sentences that each {Employee ID} is associated with
precisely one {Employee Date of Birth}.
Trivial functional dependency
- Let R be a relation schema. An FD α → β is a trivial dependency if β ⊆ α.
- Trivial functional dependencies are
satisfied by all relations.
- Examples: A → A, AB → A
A trivial functional dependency is a functional dependency of an
attribute on a superset of itself. {Employee ID, Employee Address}
→ {Employee Address} is trivial, as is {Employee Address} → {Employee
Address}.
Full functional dependency
An attribute is fully functionally dependent on a set of attributes X
if it is:
- functionally dependent on X, and
- not functionally dependent on any proper subset of X.
{Employee Address} has a functional dependency on
{Employee ID, Skill}, but not a full functional dependency,
because it is also dependent on {Employee ID}.

Transitive dependency
A transitive dependency is an indirect functional dependency, one
in which X → Z only by virtue of X → Y and Y → Z.
Multivalued dependency
A multivalued dependency is a constraint according to which the
presence of certain rows in a table implies the presence of certain
other rows.
Join dependency
A table T is subject to a join dependency if T can always be
recreated by joining multiple tables each having a subset of the
attributes of T.
Superkey
A superkey is a combination of attributes that can be used to
uniquely identify a database record. A table might have many
superkeys.
Candidate key
A candidate key is a minimal superkey: a superkey from which no
attribute can be removed without losing the ability to identify a record uniquely.
Non-prime attribute
A non-prime attribute is an attribute that does not occur in any
candidate key. Employee Address would be a non-prime attribute in
the "Employees' Skills" table.
Prime attribute
A prime attribute, conversely, is an attribute that
does occur in some candidate key.

Primary key
Most DBMSs require a table to be defined as having a single unique
key, rather than a number of possible unique keys. A primary key is
a key which the database designer has designated for this purpose.
Query Processing
Query processing refers to the set of activities involved in retrieving or
extracting data from a database.
- Translate the query from a high-level language (SQL) into a low-level
language (relational algebra).
- Execute the query to retrieve the data.
Query Optimization: query optimization is the process of selecting the
most efficient query evaluation plan from among the many strategies that are usually possible for
processing a query.
- Query optimization reduces the execution time of a query.
Steps in query processing (figure): Scanner, Parser and Validator → Internal representation of query → Plan generation and cost estimation → Query code generator → Code to execute the query → Run-time database processor → Result of query. The system catalogue manager supports these steps.
- A query expressed in a high-level language such as SQL must first be
scanned, parsed and validated.
- Scanner: identifies the language tokens, such as SQL keywords,
attribute names and relation names.
- Parser: checks the query syntax to determine whether it
is formulated according to the syntax rules.
- Validator: checks that all the attribute and relation names are
valid and semantically meaningful names that exist in the database.
- Internal representation of query: usually a tree data
structure called a query tree. It can also be represented as a query
graph.
- Query Optimizer: is responsible for identifying an efficient
execution plan for evaluating the query.
- I) The optimizer generates alternative plans and chooses the plan with
the least estimated cost.
- II) To estimate the cost of a plan, the optimizer uses the system
catalog/data dictionary.
- Code generator: generates the code to execute the plan
chosen by the query optimizer.
- Run-time Database Processor: has the task of
running the query code to produce the query result.
****************************End of Module II*********************
Possible Questions: (2 Marks)
Difference between natural join and inner join:
Natural Join:
- It is a binary operation that allows us to combine certain
selections and a Cartesian product into one operation.
- It is denoted as ⋈.
- It generates a Cartesian product of its two arguments and
performs a selection forcing equality on those attributes that
appear in both relations.
- It removes duplicate attributes.

Inner Join:
- It is a binary operation.
- It is represented by the command inner join.
- It generates a new relation that contains the tuples of
both relations that satisfy the join condition.
- It does not remove the duplicate attributes.
Update anomalies:
- Redundancy causes problems with storage, retrieval, and
updating of data.
- Redundancy can lead to update anomalies: inserting,
modifying, and deleting data may cause
inconsistency.
Multivalued dependency
- A multivalued dependency is a constraint according to which
the presence of certain rows in a table implies the presence of
certain other rows.
- Let R be a relation schema with α ⊆ R and
β ⊆ R. The multivalued dependency is written α →→ β in R.
Semi join:
- It reduces the number of tuples in a relation before transferring it
to another relation.
- It follows that the resulting relation will have two attributes
with non-identical values in every tuple.
- One of these attributes is projected away and the other is
renamed (if necessary).
- After applying the semi join, the resulting relation has exactly
the same set of tuples, but a different name and a different
schema.
