
DEPARTMENT - INFORMATION TECHNOLOGY
SUBHAM ROY (91)
SUDIP MAJUMDER (88)
Paper Name: Database Management System Paper Code: IT-601

1. MAKAUT Syllabus

Paper name: Database Management System


Code: IT601
Contacts: 3L
Credits: 3

Pre-requisites: CS302 (Data Structure & Algorithm), M101 & M201 (Mathematics),
IT401 (Object Oriented Programming & UML)

Detailed Syllabus:

Introduction [4L]
Concept & Overview of DBMS, Data Models, Database Languages, Database Administrator,
Database Users, Three Schema architecture of DBMS.
Entity-Relationship Model [6L]
Basic concepts, Design Issues, Mapping Constraints, Keys, Entity-Relationship Diagram,
Weak Entity Sets, Extended E-R features.
Relational Model [5L]
Structure of relational Databases, Relational Algebra, Relational Calculus, Extended
Relational Algebra Operations, Views, Modifications of the Database.
SQL and Integrity Constraints [8L]
Concept of DDL, DML, DCL. Basic Structure, Set operations, Aggregate Functions, Null
Values, Domain Constraints, Referential Integrity Constraints, assertions, views, Nested
Subqueries, Database security application development using SQL, Stored procedures and
triggers.
Relational Database Design [9L]
Functional Dependency, Different anomalies in designing a Database, Normalization using
functional dependencies, Decomposition, Boyce-Codd Normal Form, 3NF, Normalization
using multi-valued dependencies, 4NF, 5NF
Internals of RDBMS [7L]
Physical data structures, Query optimization: join algorithm, statistics and cost based
optimization. Transaction processing, Concurrency control and Recovery Management:
transaction model properties, state serializability, lock-based protocols, two-phase locking.
File Organization & Index Structures [6L]
File & Record Concept, Placing file records on Disk, Fixed and Variable sized Records,
Types of Single-Level Index (primary, secondary, and clustering), Multilevel Indexes,
Dynamic Multilevel Indexes using B-tree and B+-tree.
2. Recommended Books:
a. Henry F. Korth and Abraham Silberschatz, “Database System Concepts”, McGraw-Hill.
b. C. J. Date, “Introduction to Database Management”, Vols. I, II, III, Addison-Wesley.
c. J. D. Ullman, “Principles of Database Systems”, Galgotia Publications.

3. Course Outcomes:
IT601.1 Develop a good description of the data, its relationships and constraints.
IT601.2 Use functional dependencies to express relational schemas in a well-normalized form.
IT601.3 Maintain the quality of data in the database.
IT601.4 Identify bad database designs.

4. Day wise Lesson Plan with book reference: Times New Roman 12 (format given
below). Note that the video link in the lesson plan is optional.

Sl. No | Day | Module | Topic
1 | 1 | 5 | Introduction to DBMS
2 | 2 | 5 | Functional Dependency
3 | 3 | 5 | Different anomalies in designing a Database
4 | 4 | 5 | Normalization using functional dependencies
5 | 5 | 5 | Decomposition
6 | 6 | 5 | Boyce-Codd Normal Form
7 | 7 | 5 | 3NF
8 | 8 | 5 | Normalization using multi-valued dependencies
9 | 9 | 5 | 4NF, 5NF
10 | 10 | 6 | Physical data structures
11 | 11 |   | Query optimization: join algorithm, statistics
12 | 12 | 6 | Cost-based optimization
13 | 13 | 6 | Relational Calculus
14 | 14 | 3 | Practice of Relational Calculus
15 | 15 | 3 | File & Record Concept, Placing file records on Disk, Fixed and Variable sized Records
16 | 16 | 7 | Types of Single-Level Index (primary, secondary, and clustering)
17 | 17 | 7 | Multilevel Indexes
18 | 18 | 7 | Dynamic Multilevel Indexes using B-tree and B+-tree

No video links were given (the column is optional). Recommended book for every topic:
Henry F. Korth and Abraham Silberschatz, “Database System Concepts”, McGraw-Hill.

5. Course Information
PROGRAMME: IT    DEGREE: BTech

COURSE: Database Management System    SEMESTER: 6    CREDITS: 3

COURSE CODE: IT601    COURSE TYPE: Theory

CORRESPONDING LAB COURSE CODE (IF ANY): IT691

CONTACT HOURS:
DAY 1
Course: Database Management System IT601

Relevant MAKAUT syllabus portion: Functional Dependency, Different anomalies in
designing a Database, Normalization using functional dependencies, Decomposition,
Boyce-Codd Normal Form, 3NF, Normalization using multi-valued dependencies, 4NF, 5NF
Course Outcomes: Basic Knowledge about DBMS

Lecture 1 (1 hr)

Topics Covered: Introduction to Database and DBMS

Prerequisites: Have you read

● Set theory
● Entity Relationship
● Data Structure

Objectives: Impart basic knowledge about the basics of Database and DBMS

Notes:
Database: A collection of logically related information stored in a consistent fashion.

E.g.: phone book, bank records (checking statements, etc.)

The storage format typically appears to users as some kind of tabular list (table, spreadsheet).

Jobs of a Database:

✔ Stores information in a highly organized manner
✔ Manipulates information in various ways, some of which are not available in other
applications or are easier to accomplish with a database
✔ Models some real-world process or activity through electronic means
o Often called modeling a business process
o Often replicates the process only in appearance or end result

DBMS: A database-management system (DBMS) is a collection of interrelated data and a set of
programs to access those data.

Relational Database:

Relational database was proposed by Edgar Codd (of IBM Research) around 1969.

✔ A relational database organizes data in tables (or relations).


✔ A table is made up of rows and columns.
✔ A row is also called a record (or tuple).
✔ A column is also called a field (or attribute).
✔ A database table is similar to a spreadsheet.
✔ However, the relationships that can be created among the tables enable a relational
database to efficiently store huge amounts of data, and effectively retrieve selected data.

Primary Key

✔ In the relational model, a table cannot contain duplicate rows, because that would create
ambiguities in retrieval.
✔ To ensure uniqueness, each table should have a column (or a set of columns), called the
primary key, that uniquely identifies every record of the table.
✔ A primary key is called a simple key if it is a single column; it is called a composite key if it
is made up of several columns.

The most important logical criteria in database design are the reduction/elimination of
redundancy and the maintenance of database consistency.

Normal Relations: The relations that store each fact (tuple) only once in the database and that
remain consistent following database operations (updates, insertions and deletions).

Normalization: Process of decomposing unsatisfactory “bad” relations by breaking up their
attributes into smaller relations.

First Normal Form (1NF)

✔ A table is in first normal form if there are no repeating groups.
✔ Repeating groups: a set of logically related fields or values that occur multiple times in
one record
1. a non-atomic value, or multiple values, stored in a field
2. multiple fields in the same table that hold logically similar values

Let’s learn by doing:

1. What is the difference between a database and a table?



2. State significant difference between file system and DBMS.

3. What are the basic components of DBMS?

4.  Why is the following table NOT in first normal form (1NF)?

5. How can the following table be changed to first normal form?



DAY 2
Course: Database Management System IT601

Relevant MAKAUT syllabus portion: Functional Dependency, Different anomalies in
designing a Database, Normalization using functional dependencies, Decomposition,
Boyce-Codd Normal Form, 3NF, Normalization using multi-valued dependencies, 4NF, 5NF
Course Outcomes: Functional Dependency

Lecture 1 (1 hr)

Topics Covered: Functional Dependency

Prerequisites: Have you read

● Set theory
● Entity Relationship
● Data Structure

Objectives: Impart basic knowledge about Functional Dependency

Notes:

Functional Dependency

A set of attributes Y is functionally dependent on a set of attributes X if a given set of values for
each attribute in X determines unique values for the set of attributes in Y.
We use the notation: X →Y to denote that Y is functionally dependent on X. The set of attributes
X is known as the determinant of the FD, X →Y.

Rules of Functional Dependency:

The splitting/combining rule of FDs

Attributes on the right are independent of each other:
– Consider a,b,c → d,e,f
– “Attributes a, b, and c functionally determine d, e, and f”
⇨ No mention of d relating to e or f directly

Splitting rule (useful to split up the right side of an FD):
– abc → def becomes abc → d, abc → e and abc → f

There is no safe way to split the left side:
– abc → def is NOT the same as ab → def and c → def!

Combining rule (useful to combine right sides):
– if abc → d, abc → e, abc → f hold, then abc → def holds

Trivial FDs
Not all functional dependencies are useful:
– A → A always holds
– abc → a also always holds (the right side is a subset of the left side)

An FD with an attribute on both sides is “trivial”:
– Simplify by removing L ∩ R from R: abc → ad becomes abc → d
– Or, in singleton form, delete trivial FDs: abc → a and abc → d becomes just abc → d

Transitive rule
• The transitive rule holds for FDs:
– Consider the FDs a → b and b → c; then a → c holds
– Consider the FDs ad → b and b → cd; then ad → cd holds, or just ad → c (because of the
trivial dependency rule)

Cyclic functional dependencies:

• Attributes on the right side of one FD may appear on the left side of another!
– Simple example: assume relation (A, B) & FDs: A → B, B → A
– What does this say about A and B?
• Example:
– studentID → email, email → studentID

Geometric view of FDs

▪ Let D be the domain of tuples in R
♣ Every possible tuple is a point in D
▪ An FD X on R restricts the tuples in R to a subset of D
♣ Points in D which violate X cannot be in R
▪ Example: D(x,y,z)
♣ xy → z
▪ e.g. z = abs(x) + abs(y)
♣ z → x,y
▪ e.g. x = y = abs(z)/2
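Whether a given FD actually holds in a concrete relation instance can be checked mechanically. A Python sketch (relation and data invented, echoing the studentID/email example above):

```python
def fd_holds(rows, lhs, rhs):
    """True iff the FD lhs -> rhs holds in this instance: equal
    lhs-values always come with equal rhs-values."""
    seen = {}
    for t in rows:
        key = tuple(t[a] for a in lhs)
        val = tuple(t[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False
    return True

# Cyclic FDs, as in the studentID/email example (made-up rows):
r = [
    {"studentID": 1, "email": "a@uni.edu", "dept": "IT"},
    {"studentID": 2, "email": "b@uni.edu", "dept": "IT"},
]
print(fd_holds(r, ["studentID"], ["email"]))  # True
print(fd_holds(r, ["email"], ["studentID"]))  # True  (the cycle)
print(fd_holds(r, ["dept"], ["studentID"]))   # False
```

Note that an instance can only refute an FD; holding on one instance does not prove the FD is a constraint of the schema.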

Let’s learn by doing:

1. List all functional dependencies satisfied by the relation in the following table

2. Use the definition of functional dependency to argue that each of Armstrong’s axioms
(reflexivity, augmentation, and transitivity) is sound.

3. Consider the following proposed rule for functional dependencies: If α → β and γ → β, then
α → γ. Prove that this rule is not sound by showing a relation r that satisfies α → β and γ →
β, but does not satisfy α → γ.

DAY 3

Course: Database Management System IT601

Relevant MAKAUT syllabus portion: Functional Dependency, Different anomalies in
designing a Database, Normalization using functional dependencies, Decomposition,
Boyce-Codd Normal Form, 3NF, Normalization using multi-valued dependencies, 4NF, 5NF
Course Outcomes: Functional Dependency

Lecture 1 (1 hr)

Topics Covered: Different anomalies in designing a Database

Prerequisites: Have you read

● Set theory
● Entity Relationship
● Data Structure

Objectives: Impart basic knowledge about tables and databases.

Notes:
Anomalies:
♣ Insertion anomaly: occurs when a row cannot be added to a relation, because not all data
are available (or one has to invent “dummy” data)
♣ Deletion anomaly: occurs when data is deleted from a relation, and other critical data are
unintentionally lost
♣ Update anomaly: occurs when one must make many changes to reflect the modification of
a single datum

Anomalies are primarily caused by:

▪ data redundancy: replication of the same field in multiple tables, other than foreign keys
▪ functional dependencies whose determinants are not candidate keys, including
● partial dependency
● transitive dependency
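A toy relation that mixes module and lecturer facts shows all three anomalies; a Python sketch (data and field names invented):

```python
# One relation mixing module and lecturer facts (made-up data).
rows = [
    {"module": "M1", "lecturer": "L1", "text": "T1"},
    {"module": "M1", "lecturer": "L1", "text": "T2"},
    {"module": "M3", "lecturer": "L2", "text": "T3"},
]

# Update anomaly: changing the lecturer of M1 means touching two rows.
touched = sum(1 for r in rows if r["module"] == "M1")
print(touched)  # 2

# Deletion anomaly: removing module M3 silently loses lecturer L2.
rows = [r for r in rows if r["module"] != "M3"]
print(any(r["lecturer"] == "L2" for r in rows))  # False

# Insertion anomaly: a new lecturer with no module yet has no row to live in.
```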

Let’s learn by doing:

Find the three types of anomaly in the following table:

StudentNum | CourseNum | StudentName | Address | Course
S21 | 9201 | Malti | Etawa | Accounts
S21 | 9267 | Malti | Etawa | Accounts
S24 | 9267 | Smitha | Golsi | Physics
S30 | 9201 | Rabi | Malancha | Computing
S30 | 9322 | Rabi | Malancha | Maths
DAY 4

Course: Database Management System IT601

Relevant MAKAUT syllabus portion: Functional Dependency, Different anomalies in
designing a Database, Normalization using functional dependencies, Decomposition,
Boyce-Codd Normal Form, 3NF, Normalization using multi-valued dependencies, 4NF, 5NF
Course Outcomes: Normalization

Lecture 1 (1 hr)

Topics Covered: Normalization using functional dependencies

Prerequisites: Have you read

● Set theory
● Entity Relationship
● Data Structure

Objectives: Impart basic knowledge about Functional Dependency

Notes:
Removing FDs

❖ Suppose we have a relation R with scheme S and the FD A → B, where A ∩ B = {}
❖ Let C = S – (A ∪ B)
❖ In other words:
♣ A – attributes on the left-hand side of the FD
♣ B – attributes on the right-hand side of the FD
♣ C – all other attributes
❖ It turns out that we can split R into two parts:
♣ R1, with scheme C ∪ A
♣ R2, with scheme A ∪ B
❖ The original relation can be recovered as the natural join of R1 and R2:
❖ R = R1 NATURAL JOIN R2

Problems Resolved in 2NF
♣ Problems in 1NF:
♣ INSERT – Can’t add a module with no texts
♣ UPDATE – To change the lecturer for M1, we have to change two rows
♣ DELETE – If we remove M3, we remove L2 as well
♣ In 2NF the first two are resolved

Closure of a Set of FDs

♣ The set of all FDs logically implied by F is called the closure of F.
♣ The closure of F is denoted by F+.
♣ Given a set F, we can find all FDs in F+ by applying Armstrong’s Axioms.
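Rather than enumerating F+ directly, one usually computes attribute closures with the standard fixpoint algorithm. A Python sketch (the FD set is the one from exercise 1 below):

```python
def closure(attrs, fds):
    """Attribute closure attrs+ : repeatedly fire FDs (lhs, rhs),
    given as pairs of attribute sets, until nothing new is added."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# F = {A -> BC, CD -> E, B -> D, E -> A} on R = (A, B, C, D, E)
F = [({"A"}, {"B", "C"}),
     ({"C", "D"}, {"E"}),
     ({"B"}, {"D"}),
     ({"E"}, {"A"})]
print(sorted(closure({"A"}, F)))  # ['A', 'B', 'C', 'D', 'E'] -> A is a candidate key
print(sorted(closure({"B"}, F)))  # ['B', 'D']
```

An FD X → Y is in F+ exactly when Y is a subset of the closure of X, so this one routine answers membership questions without materializing F+.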

Let’s learn by doing:

1. Compute the closure of the following set F of functional dependencies for the relation
schema R = (A, B, C, D, E):
A → BC, CD → E, B → D, E → A. List the candidate keys for R.

2. Give an example with explanation where a database is in 1NF and not in 2NF.
DAY 5

Course: Database Management System IT601

Relevant MAKAUT syllabus portion: Functional Dependency, Different anomalies in
designing a Database, Normalization using functional dependencies, Decomposition,
Boyce-Codd Normal Form, 3NF, Normalization using multi-valued dependencies, 4NF, 5NF
Course Outcomes: Learn Normalization in Relational database

Lecture 1 (1 hr)

Topics Covered: Decomposition

Prerequisites: Have you read

● Set theory
● Entity Relationship
● Data Structure

Objectives: Impart basic knowledge about Functional Dependency

Notes:
Decomposition
Redundancy causes problems. Solution => decompose the schema so that each piece of
information is represented only once.
Definition: Let R be a relation scheme. {R1, ..., Rn} is a decomposition of R if R = R1 ∪ ... ∪
Rn (i.e., all of R’s attributes are represented).
We will deal mostly with binary decomposition:
R into {R1, R2} where R = R1 ∪ R2
E.g.: student(s_id, name, dept_id, dept_head, dept_phone, grade)
⇨ student(s_id, name, dept_id, grade), dept(dept_id, dept_head, dept_phone)
✔ Lossless: data should not be lost or created when splitting relations up
✔ Dependency preservation: it is desirable that FDs are preserved when splitting relations up
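For a binary decomposition there is a simple lossless-join test: {R1, R2} is lossless iff the common attributes functionally determine all of R1 or all of R2. A Python sketch reusing the attribute-closure idea (FD set taken from exercise 1 below):

```python
def closure(attrs, fds):
    """Attribute closure under FDs given as (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def lossless_binary(r1, r2, fds):
    """{r1, r2} is a lossless-join decomposition iff the common
    attributes determine all of r1 or all of r2."""
    c = closure(r1 & r2, fds)
    return r1 <= c or r2 <= c

F = [({"A"}, {"B", "C"}), ({"C", "D"}, {"E"}),
     ({"B"}, {"D"}), ({"E"}, {"A"})]
print(lossless_binary({"A", "B", "C"}, {"A", "D", "E"}, F))  # True: A -> ABC
```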

Let’s learn by doing:

1. Suppose that we decompose the schema R = (A, B, C, D, E) into (A, B, C) & (A, D, E).
Show that this decomposition is a lossless-join decomposition if the following set F of
functional dependencies holds:
A → BC, CD → E, B → D, E → A

2. Consider the relation R ( A, B, C, D, E ) with the set of F = { A → C, B → C, C → D, DC → C,


CE → A }. Suppose the relation has been decomposed by the relations R1 ( A, D ) R2 ( A, B )
R3 ( B, E ) R4 ( C, D, E ), R5 ( A, E ). Is this decomposition lossy or lossless?

DAY 6

Course: Database Management System IT601

Relevant MAKAUT syllabus portion: Functional Dependency, Different anomalies in
designing a Database, Normalization using functional dependencies, Decomposition,
Boyce-Codd Normal Form, 3NF, Normalization using multi-valued dependencies, 4NF, 5NF
Course Outcomes: Learn Normalization in Relational database

Lecture 1 (1 hr)

Topics Covered: Boyce-Codd Normal Form

Prerequisites: Have you read

● Set theory
● Entity Relationship
● Data Structure

Objectives: Impart basic knowledge about Functional Dependency

Notes:
Boyce-Codd Normal Form

A relation schema R is in BCNF with respect to a set F of functional dependencies if for all
functional dependencies in F+ of the form α → β, where α ⊆ R and β ⊆ R, at least one of
the following holds:

♣ α → β is trivial (i.e., β ⊆ α)
♣ α is a superkey for R

Example

R = (A, B, C), F = {A → B; B → C}, Key = {A}

♣ R is not in BCNF (B → C holds, but B is not a superkey)
♣ Decompose into R1 = (A, B), R2 = (B, C)
❖ R1 and R2 are in BCNF
❖ Lossless-join decomposition
❖ Dependency preserving
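The BCNF test on this example can be mechanized with attribute closures; checking the given FDs in F (rather than all of F+) is a standard simplification that suffices for BCNF. A Python sketch:

```python
def closure(attrs, fds):
    """Attribute closure under FDs given as (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def bcnf_violations(R, fds):
    """Nontrivial FDs in F whose left side is not a superkey of R."""
    return [(lhs, rhs) for lhs, rhs in fds
            if not rhs <= lhs and not closure(lhs, fds) >= R]

R = {"A", "B", "C"}
F = [({"A"}, {"B"}), ({"B"}, {"C"})]
print(bcnf_violations(R, F))  # [({'B'}, {'C'})]: B -> C breaks BCNF, as above
```

Each violating FD α → β also tells you how to decompose: split off (α ∪ β) and keep R minus (β − α), which is exactly the R1/R2 split shown in the example.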

Let’s learn by doing:


1. Give a lossless-join decomposition into BCNF of schema R = (A, B, C, D, E) with FDs
A → BC, CD → E, B → D, E → A.

2. Consider the following collection of relations and dependencies. For each relation,
(a) determine the candidate keys, and (b) if a relation is not in BCNF, then
decompose it into a collection of BCNF relations.
a. R1(A, C, B, D, E), A → B, C → D

b. R2(A, B, F), AB → F, B → F

c. R(A, B, C, D, E) with functional dependencies D → B, CE → A



DAY 7

Course: Database Management System IT601

Relevant MAKAUT syllabus portion: Functional Dependency, Different anomalies in
designing a Database, Normalization using functional dependencies, Decomposition,
Boyce-Codd Normal Form, 3NF, Normalization using multi-valued dependencies, 4NF, 5NF
Course Outcomes: Learn Normalization in Relational database

Lecture 1 (1 hr)

Topics Covered: 3NF

Prerequisites: Have you read

● Set theory
● Entity Relationship
● Data Structure

Objectives: Impart basic knowledge about Functional Dependency

Notes:
Third Normal Form
❖ Allows some redundancy (with resultant problems)
❖ But FDs can be checked on individual relations without computing a join
❖ There is always a lossless-join, dependency-preserving decomposition into 3NF

A relation schema R is in third normal form (3NF) if for all α → β in F+ at least one of the
following holds:
❖ α → β is trivial (i.e., β ⊆ α)
❖ α is a superkey for R
❖ Each attribute A in β – α is contained in a candidate key for R.
(NOTE: each attribute may be in a different candidate key)
❖ If a relation is in BCNF it is in 3NF (since in BCNF one of the first two conditions above
must hold).
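For small schemas the candidate keys, and hence the prime attributes needed by the third condition, can be found by brute force. A Python sketch using the schema of exercise 1 below (exponential search and checking only the FDs in F, the usual practical test; for illustration only):

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure under FDs given as (lhs, rhs) set pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def candidate_keys(R, fds):
    """Minimal attribute sets whose closure is all of R (brute force)."""
    keys = []
    for n in range(1, len(R) + 1):
        for combo in combinations(sorted(R), n):
            s = set(combo)
            if closure(s, fds) >= R and not any(k < s for k in keys):
                keys.append(s)
    return keys

def is_3nf(R, fds):
    """Every nontrivial FD must have a superkey left side, or all its
    extra right-side attributes prime (in some candidate key)."""
    prime = set().union(*candidate_keys(R, fds))
    for lhs, rhs in fds:
        if rhs <= lhs or closure(lhs, fds) >= R:
            continue
        if not (rhs - lhs) <= prime:
            return False
    return True

# Exercise 1: R(ABCD) with ABC -> D and D -> A
R = {"A", "B", "C", "D"}
F = [({"A", "B", "C"}, {"D"}), ({"D"}, {"A"})]
print([tuple(sorted(k)) for k in candidate_keys(R, F)])  # [('A', 'B', 'C'), ('B', 'C', 'D')]
print(is_3nf(R, F))  # True: A is prime, so D -> A satisfies the third condition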

Let’s learn by doing:

1. R(ABCD), ABC → D, D → A, is R in 3NF?



2. Compare BCNF with 3NF

3. The relation schema Student_Performance (name, courseNo, rollNo, grade) has the
following FDs:
{name, courseNo} → grade, {rollNo, courseNo} → grade, name → rollNo, rollNo → name
Find the highest normal form of this relation scheme.

DAY 8

Course: Database Management System IT601

Relevant MAKAUT syllabus portion: Functional Dependency, Different anomalies in
designing a Database, Normalization using functional dependencies, Decomposition,
Boyce-Codd Normal Form, 3NF, Normalization using multi-valued dependencies, 4NF, 5NF
Course Outcomes: Learn Normalization in Relational database

Lecture 1 (1 hr)

Topics Covered: Normalization using multi-valued dependencies

Prerequisites: Have you read

● Set theory
● Entity Relationship
● Data Structure

Objectives: Impart basic knowledge about Multi-valued Dependency

Notes:
Multi-valued Dependency
❖ R = XYZ: relation scheme. An MVD X →→ Y holds iff each X-value in R is associated
with a set of Y-values in a way that does not depend on the Z-values.
❖ Formally, for any pair of tuples t1, t2 of r(R) such that t1[X] = t2[X],
❖ there exist t3, t4 in r such that
o t1[X] = t2[X] = t3[X] = t4[X]
o t3[Y] = t1[Y]
o t3[Z] = t2[Z]
o t4[Y] = t2[Y]
o t4[Z] = t1[Z]
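This t1..t4 definition can be transcribed almost literally into a checker for a concrete instance. A Python sketch (relation and data invented; the t4 witness for the pair (t1, t2) is exactly the t3 witness for (t2, t1), so checking t3 over all ordered pairs suffices):

```python
def mvd_holds(rows, X, Y, Z):
    """Check the MVD X ->-> Y on an instance (list of dicts), where
    Z = R - X - Y. Direct transcription of the t1..t4 definition."""
    rel = {tuple(sorted(t.items())) for t in rows}
    for t1 in rows:
        for t2 in rows:
            if any(t1[a] != t2[a] for a in X):
                continue
            # Required witness t3: X and Y values from t1, Z values from t2.
            t3 = {**{a: t1[a] for a in X + Y}, **{a: t2[a] for a in Z}}
            if tuple(sorted(t3.items())) not in rel:
                return False
    return True

# Made-up instance: for each course, teachers and books vary independently.
r = [{"course": "DB", "teacher": t, "book": b}
     for t in ("T1", "T2") for b in ("B1", "B2")]
print(mvd_holds(r, ["course"], ["teacher"], ["book"]))       # True
print(mvd_holds(r[:3], ["course"], ["teacher"], ["book"]))   # False
```

Dropping one tuple from the full teacher x book combination breaks the independence, which is why the truncated instance fails.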

Let’s learn by doing:

1. List all nontrivial MVDs in

2. Add tuples to the following table so that it will satisfy X →→ Y.

3. Prove that α → β implies α →→ β.



DAY 9

Course: Database Management System IT601

Relevant MAKAUT syllabus portion: Functional Dependency, Different anomalies in
designing a Database, Normalization using functional dependencies, Decomposition,
Boyce-Codd Normal Form, 3NF, Normalization using multi-valued dependencies, 4NF, 5NF
Course Outcomes: Learn Normalization in Relational database

Lecture 1 (1 hr)

Topics Covered: 4NF and 5NF

Prerequisites: Have you read

● Set theory
● Entity Relationship
● Data Structure

Objectives: Impart basic knowledge about 4NF and 5NF

Notes:
Fourth Normal Form (4NF)
Relation schema R is in 4NF
❖ w.r.t. a set of dependencies D (FDs & MVDs)
❖ if for all MVDs α →→ β in D+,
α →→ β is a trivial MVD, OR
α is a superkey for R
❖ Effect: any nontrivial MVD must then have a superkey as its determinant, i.e. it must also
behave as an FD

Lemma: If R is in 4NF then it is in BCNF

Example
R = (A, B, C, G, H, I), F = {A →→ B, B →→ HI, CG →→ H}
R is not in 4NF since A →→ B holds and A is not a superkey for R

Join Dependencies
Let R be a relation schema and R1, R2, . . . , Rn be a decomposition of R. The join dependency
*(R1, R2, . . . , Rn) is used to restrict the set of legal relations to those for which R1, R2, . . . , Rn
is a lossless-join decomposition of R. Formally, if R = R1 ∪ R2 ∪ . . . ∪ Rn, we say that a
relation r(R) satisfies the join dependency *(R1, R2, . . . , Rn) if

r = ΠR1(r) ⋈ ΠR2(r) ⋈ . . . ⋈ ΠRn(r)

Project-join normal form (PJNF) is defined in the same way as BCNF and 4NF, except that join
dependencies are used.
A relation schema R is in PJNF with respect to a set D of functional, multi-valued, and join
dependencies if, for all join dependencies in D+ of the form *(R1, R2, . . . , Rn), where each
Ri ⊆ R and R = R1 ∪ R2 ∪ . . . ∪ Rn, at least one of the following holds:
✔ *(R1, R2, . . . , Rn) is a trivial join dependency.
✔ Every Ri is a superkey for R.
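The membership test implied by this definition can be sketched directly: project r onto each Ri, natural-join the projections back, and compare with r. A Python illustration on made-up tuples:

```python
def natural_join(r1, r2):
    """Natural join of relations given as lists of dicts."""
    out = []
    for t1 in r1:
        for t2 in r2:
            if all(t1[a] == t2[a] for a in set(t1) & set(t2)):
                out.append({**t1, **t2})
    return out

def satisfies_jd(rows, schemes):
    """r satisfies *(R1, ..., Rn) iff joining its projections yields r."""
    parts = [[{a: t[a] for a in s} for t in rows] for s in schemes]
    joined = parts[0]
    for p in parts[1:]:
        joined = natural_join(joined, p)
    canon = lambda rel: {tuple(sorted(t.items())) for t in rel}
    return canon(joined) == canon(rows)

r = [{"A": 1, "B": 1, "C": 1},
     {"A": 1, "B": 2, "C": 2}]
print(satisfies_jd(r, [{"A", "B"}, {"A", "C"}]))  # False: the join adds spurious tuples
print(satisfies_jd(r, [{"A", "B", "C"}]))         # True: trivial join dependency
```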

Let’s learn by doing:

1. Consider the following schema.

R = (A, B, C, D, E), S = (G, H, I, J), F = {A → B, B → C, B → E, B → D, G → H, G → I, I → J}
Normalize the above schema with the given constraints to 4NF.
DAY 10
Course: Database Management System IT601

Relevant MAKAUT syllabus portion: Physical data structures, Query optimization: join
algorithm, statistics and cost based optimization.
Course Outcomes: Knowledge on Query Optimization

Lecture 1 (1 hr)

Topics Covered: Query processing and optimization

Prerequisites: Have you read

● Set theory
● Entity Relationship
● RDBMS

Objectives: Understanding of Query Optimization

Notes:
Queries

A query is a language expression that describes data to be retrieved from a database.

Steps in query processing:

Query Optimization:
Query optimization is the process of selecting an efficient execution plan for evaluating a query.
After parsing, the parsed query is passed to the query optimizer, which generates different
execution plans for evaluating it and selects the plan with the least estimated cost.
Different scenarios call for different execution plans, as the following cases show.

Scenario 1: when both data files are small.

Load both F1 and F2 into main memory and then do the cross product: pick one record at a
time from F1, take its cross product with every record of F2, and repeat for every record of
F1. As both data files are small, this does not require a lot of memory.

Scenario 2: when F1 is very large and F2 is small.

In this case we cannot load F1 into main memory, as it would require a large amount of space
and would not be an efficient plan.
Solution: load only F2 into main memory, read the records of F1 directly from secondary
memory, and do the cross product. Since we do not need to write records back into F1, this
plan has the least amount of I/O and no large space overhead.
Amount of I/O = size of F1 (no. of pages in F1) + size of F2 (no. of pages in F2) = m + n
This is the best execution plan for the given scenario, as each file must be read at least once.

Scenario 3: when a limited, constant amount of main memory is available.

Case 1: Let two pages, p1 and p2, of main memory be available to perform the cross product
of F1 and F2. We read a page of F1 into p1 and a page of F2 into p2; keeping p1 fixed, we
reload p2 with the pages of F2 one after another (n times), performing the cross product each
time. We then load the next page of F1 into p1 and repeat, m times in total.
Amount of I/O = m x n
Case 2: Find the best way to perform the cross product when 4 pages of main memory are
available (p1, p2, p3, p4).
We have two solutions for this:
Solution 1: read 2 pages of F1 at a time and 2 pages of F2, perform the cross product between
them, then replace the two pages of F2 with the next two and repeat until F2 is exhausted
(n/2 times). Then read the next 2 pages of F1 and repeat the whole process, m/2 times.
Hence amount of I/O = (m/2) x (n/2)
Solution 2: read 3 pages of F1 at a time and 1 page of F2, perform the cross product between
them, then replace the page of F2 with the next one and repeat until F2 is exhausted (n
times). Then read the next 3 pages of F1 and repeat, m/3 times.
Hence amount of I/O = (m/3) x n
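The formulas above count each multi-page fetch as one read operation. Under that accounting (a sketch; the function and variable names are mine), the plans compare as follows for m = n = 100 pages:

```python
import math

def read_ops(m, n, b1, b2):
    """Read operations for a block-wise cross product: every b1-page
    group of F1 forces one pass over F2 fetched in b2-page chunks."""
    return math.ceil(m / b1) * math.ceil(n / b2)

m, n = 100, 100
print(read_ops(m, n, 1, 1))  # 10000 (Case 1: m x n)
print(read_ops(m, n, 2, 2))  # 2500  (Solution 1: (m/2) x (n/2))
print(read_ops(m, n, 3, 1))  # 3400  (Solution 2: (m/3) x n, rounded up)
```

Note the accounting matters: counting individual pages read instead of read operations would rank the 3+1 split ahead of the 2+2 split, since each pass over F2 then costs n pages regardless of chunk size.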

Scenario 4: when the amount of memory available is not constant but a fraction of the data files.

Read the first half of the pages of F1 (p1) and the first half of the pages of F2 (p2) and do the
cross product; then read the other half of F2 (p3) and do the cross product with the same part
of F1 (p1); then read the other half of F1 (p4) and do the cross product first with the most
recently read part of F2 (p3) and then with the first part of F2 (p2).
Amount of I/O = (m/2) + (n/2) + (n/2) + (m/2) + (n/2)

Natural join (|X|):

Scenario 1: if the files are sorted on the common attribute.

Take a record from F1, then one from F2; if they match, store the result in a temporary file,
and repeat this process for all records of F1. Amount of I/O = size of F1 + size of F2.
As with the cross product, we need different sorting methods depending on the scenario, e.g.
the sizes of the data files. The best sorting method here does not require the whole file to fit
in main memory.

Scenario X: if both F1 and F2 are very large, we should use (external) merge sort.
Scenario Y: if the data files are very large and the available memory is some fraction of a data file:
Suppose main memory equal to 5% of a data file of 100 pages is available. We read 5 pages
of F1, sort them, and write them out as one sorted run; then the next 5 pages, and so on, so
that we end up with several sorted runs. We then keep merging these sorted runs until we
get one final sorted file.
Amount of I/O = one whole scan of the file to create the 20 sorted runs + the cost of merging.

Comparison between the above types of memory availability (constant and a fraction of the
data file):

1. In the first, sufficient memory is available, whereas in the second we have only some
fraction of the data file.
2. In Scenario X the number of sorted runs decreases as we go: we start with many sorted
runs, which are reduced to one file.
3. In Scenario Y we start with a fixed number of sorted runs, which after merging are
reduced to one file.

Selection operation (σ):
If the given data file is large, then instead of loading it into main memory we take a
temporary page in main memory and read records from the data file (in secondary memory)
one by one; if a record satisfies the required condition, we store it.

Projection (Π):
In the given data files we scan table by table as above, and if we find the required attribute
in a particular table, we store that attribute and its values.

Let’s learn by doing:

How will Oracle (with the rule-based optimizer) evaluate this query?
SELECT E.ENAME
FROM EMP E
WHERE DEPTNO = 20
AND SAL >= 2000
AND ENAME LIKE 'F%'

DAY 11
Course: Database Management System IT601

Relevant MAKAUT syllabus portion: Physical data structures, Query optimization: join
algorithm, statistics and cost based optimization.
Course Outcomes: Knowledge on Query Optimization

Lecture 1 (1 hr)

Topics Covered: Statistics and cost-based optimization

Prerequisites: Have you read

● Set theory
● Entity Relationship
● RDBMS

Objectives: Understanding of statistics and cost-based optimization

Notes:
Distributed Cost Model
Two different types of cost functions can be used
Reduce total time
✔ Reduce each cost component (in terms of time) individually, i.e., do as little for
each cost component as possible
✔ Optimize the utilization of the resources (i.e., increase system throughput)
Reduce response time
✔ Do as many things in parallel as possible
✔ May increase total time because of increased total activity

Total time: Sum of the time of all individual components


❖ Local processing time: CPU time + I/O time
❖ Communication time: fixed time to initiate a message + time to transmit the data

The individual components of the total cost have different weights:

● Wide area networks
❖ Message initiation and transmission costs are high
❖ Local processing cost is low (fast mainframes or minicomputers)
❖ Ratio of communication to I/O costs is 20:1
● Local area networks
❖ Communication and local processing costs are more or less equal
❖ Ratio of communication to I/O costs is 1:1.6 (10 Mb/s network)

Response time: elapsed time between the initiation and the completion of a query

Response time = TCPU × seq#insts + TI/O × seq#I/Os + TMSG × seq#msgs + TTR × seq#bytes

– where seq#x (x in instructions, I/Os, messages, bytes) is the maximum number of x which
must be done sequentially.
• Any processing and communication done in parallel is ignored.
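As a rough illustration of the two objectives, a toy calculator with invented coefficients and counts (nothing here comes from a real system):

```python
def total_time(t, counts):
    """Sum of per-unit cost times total count over all components."""
    return sum(t[k] * counts[k] for k in t)

def response_time(t, seq_counts):
    """Same formula applied to the sequential (critical-path) counts."""
    return sum(t[k] * seq_counts[k] for k in t)

# Hypothetical coefficients (seconds per unit) and a query where two
# sites each send one 1000-byte message in parallel:
t      = {"msg": 0.1, "byte": 1e-6}
totals = {"msg": 2, "byte": 2000}   # all work, wherever it happens
seq    = {"msg": 1, "byte": 1000}   # longest sequential chain
print(total_time(t, totals))    # ~0.202
print(response_time(t, seq))    # ~0.101: parallelism halves response time
```

This makes the trade-off in the notes concrete: a plan can lower response time by doing more work in parallel even though its total time (and hence resource consumption) stays the same or grows.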

Database Statistics
✔ The primary cost factor is the size of intermediate relations
that are produced during the execution and
must be transmitted over the network, if a subsequent operation is located on a different site
✔ It is costly to compute the size of the intermediate relations precisely.
✔ Instead global statistics of relations and fragments are computed and used to provide
approximations

♣ Let R(A1, A2, . . . , Ak) be a relation fragmented into R1, R2, . . . , Rr.


♣ Relation statistics
❖ min and max values of each attribute: min{Ai}, max{Ai}
❖ length of each attribute: length(Ai)
❖ number of distinct values of each attribute (its domain cardinality): card(dom(Ai))
♣ Fragment statistics
❖ cardinality of the fragment: card(Ri)
❖ cardinality of each attribute of each fragment: card(ΠAi(Rj))
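One way these statistics are used is to estimate the size of intermediate relations under the usual uniform-distribution assumption. A sketch, with invented relation sizes:

```python
# Size estimates for intermediate relations, computed from global statistics
# under a uniformity assumption. card_R stands for card(R), and card_dom_A
# for card(dom(A)) of the attribute A involved in the operation.

def card_selection(card_R, card_dom_A):
    # Estimated card of sigma_{A = value}(R): each value equally likely.
    return card_R / card_dom_A

def card_join(card_R, card_S, card_dom_A):
    # Rough estimate of card(R join S) on common attribute A.
    return (card_R * card_S) / card_dom_A

# 10,000 employees, 50 departments, 50 distinct department ids:
print(card_selection(10_000, 50))   # 200.0 employees expected per department
print(card_join(10_000, 50, 50))    # 10000.0 rows expected in the join
```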
Let’s learn by doing:
1. Consider the SQL query
SELECT *
FROM employee, department
WHERE employee.dept_id = department.dept_id
What evaluation plan would a query optimizer likely choose to get the least estimated cost?
DAY 12
Course: Database Management System IT601

Relevant MAKAUT syllabus portion: File & Record Concept, Placing file records on Disk,
Fixed and Variable sized Records, Types of Single-
Level Index (primary, secondary, and clustering),
Multilevel Indexes, Dynamic Multilevel Indexes using
B tree and B+ tree
Course Outcomes: File & Record Concept, Placing file records on Disk

Lecture 1 (1 hr)

Topics Covered:​ ​File system and Memory

Prerequisites​: Have you Read

● Memory
● Entity Relationship
● File System

Objectives:​ Understanding File & Record Concept in Disk.

Notes:
Physical Storage
Storage media are classified by:
o speed with which data can be accessed
o cost per unit of data
o reliability
o data loss on power failure or system crash
o physical failure of the storage device
Can differentiate storage into:
o volatile storage: loses contents when power is switched off
o non-volatile storage: contents persist even when power is switched off.
Includes secondary and tertiary storage, as well as battery-backed-up
main memory.
Let’s learn by doing:

1. Do a comparative study on various levels of RAID.


DAY 13
Course: Database Management System IT601

Relevant MAKAUT syllabus portion: File & Record Concept, Placing file records on Disk,
Fixed and Variable sized Records, Types of Single-
Level Index (primary, secondary, and clustering),
Multilevel Indexes, Dynamic Multilevel Indexes using
B tree and B+ tree
Course Outcomes: Knowledge of Fixed and Variable sized Records

Topics Covered:​ ​Fixed and Variable sized Records


Prerequisites​: Have you Read

● Memory
● Entity Relationship
● File System

Objectives:​ Learn about Fixed and Variable sized Records and use them in real problem.

Notes:
File organization
⇨​ The database is stored as a collection of files. Each file is a sequence of records. A
record is a sequence of fields.
⇨​ One approach:
o assume record size is fixed
o each file has records of one particular type only
o different files are used for different relations
⇨​ This case is easiest to implement; will consider variable length records later.
Fixed Length Records
Simple approach:
Store record i starting from byte n * (i – 1), where n is the size of each record.
Record access is simple, but records may cross blocks
o Modification: do not allow records to cross block boundaries
Deletion of record i:
alternatives:
move records i + 1, . . ., n to i, . . . , n – 1
move record n to i
do not move records, but link all free records on a free list
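The byte-offset arithmetic and the "move record n to i" deletion alternative can be sketched in a few lines. This is an in-memory toy, not a real file manager, and the 16-byte record size is an arbitrary assumption:

```python
# A minimal in-memory fixed-length record file. Each record occupies exactly
# RECORD_SIZE bytes, so with 0-based numbering record i starts at byte
# RECORD_SIZE * i. Deletion moves the last record into the freed slot.

RECORD_SIZE = 16  # bytes per record (arbitrary for this sketch)

class FixedLengthFile:
    def __init__(self):
        self.data = bytearray()

    def num_records(self):
        return len(self.data) // RECORD_SIZE

    def insert(self, record: bytes):
        assert len(record) == RECORD_SIZE
        self.data += record

    def read(self, i: int) -> bytes:
        off = RECORD_SIZE * i
        return bytes(self.data[off:off + RECORD_SIZE])

    def delete(self, i: int):
        # Move the last record into slot i, then shrink the file.
        last = self.num_records() - 1
        if i != last:
            self.data[RECORD_SIZE * i:RECORD_SIZE * (i + 1)] = self.read(last)
        del self.data[RECORD_SIZE * last:]

f = FixedLengthFile()
f.insert(b"A".ljust(RECORD_SIZE))
f.insert(b"B".ljust(RECORD_SIZE))
f.insert(b"C".ljust(RECORD_SIZE))
f.delete(0)                  # record C moves into slot 0
print(f.read(0)[:1])         # b'C'
print(f.num_records())       # 2
```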
Variable Length Record

Variable-length records arise in database systems in several ways:


Storage of multiple record types in a file.
Record types that allow variable lengths for one or more fields.
Record types that allow repeating fields (used in some older data models).

Slotted page structure
In the slotted page structure, a block header stores the number of record entries, the end of
free space in the block, and the location and size of each record; the records themselves are
stored contiguously starting from the end of the block, so they can be moved within the page
without changing their record identifiers, which refer to the slot entries.
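A toy version of a slotted page, assuming the layout just described. The page size and the header-size arithmetic are simplifications for illustration:

```python
# A toy slotted page: a header holding the entry count and free-space
# pointer plus an (offset, size) slot per record, with record bodies packed
# from the end of the page toward the header.

PAGE_SIZE = 256  # assumed page size for this sketch

class SlottedPage:
    def __init__(self):
        self.slots = []            # one (offset, size) entry per record
        self.free_end = PAGE_SIZE  # records grow downward from the page end
        self.data = bytearray(PAGE_SIZE)

    def insert(self, record: bytes) -> int:
        # Rough header-size estimate: count + pointer + one slot per record.
        header = 4 + 4 * (len(self.slots) + 1)
        if self.free_end - len(record) < header:
            raise ValueError("page full")
        self.free_end -= len(record)
        self.data[self.free_end:self.free_end + len(record)] = record
        self.slots.append((self.free_end, len(record)))
        return len(self.slots) - 1          # slot number acts as record id

    def read(self, slot: int) -> bytes:
        off, size = self.slots[slot]
        return bytes(self.data[off:off + size])

page = SlottedPage()
rid = page.insert(b"variable-length record")
print(page.read(rid))   # b'variable-length record'
```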

Let’s learn by doing:


1. A block can hold either 3 records or 10 key pointers. If a database contains n
records, how many blocks do we need to hold the data file?

DAY 14
Course: Database Management System IT601

Relevant MAKAUT syllabus portion: File & Record Concept, Placing file records on Disk,
Fixed and Variable sized Records, Types of Single-
Level Index (primary, secondary, and clustering),
Multilevel Indexes, Dynamic Multilevel Indexes using
B tree and B+ tree
Course Outcomes: Knowledge of Index

Topics Covered:​ ​Types of Single-Level Index (primary, secondary, and clustering),

Prerequisites​: Have you Read

● Memory
● Entity Relationship
● File System
Objectives:​ Impart knowledge of Indexing

Notes:
Basic Concepts
Indexing mechanisms are used to speed up access to desired data.
o E.g., author catalog in a library
Search key: an attribute or set of attributes used to look up records in a
file. An index file consists of records (called index entries) of the form
⟨search-key, pointer⟩

Index files are typically much smaller than the original file
Two basic kinds of indices:
Ordered indices: search keys are stored in sorted order
Hash indices: search keys are distributed uniformly across “buckets” using a “hash
function”.
Indexing techniques are evaluated on the basis of: access types supported efficiently,
access time, insertion time, deletion time, and space overhead.

In an ordered index, index entries are stored sorted on the search key value. E.g., author catalog
in library.
Primary index​: in a sequentially ordered file, the index whose search key specifies the
sequential order of the file.
o Also called clustering index
o The search key of a primary index is usually but not necessarily the primary key.
Secondary index: ​an index whose search key specifies an order different from the sequential
order of the file. Also called non-clustering index.
Index-sequential file: ​ordered sequential file with a primary index.

Dense index: an index record appears for every search-key value in the file.
Sparse index: contains index records for only some search-key values.
o Applicable when records are sequentially ordered on the search key
To locate a record with search-key value K we:
⇨ Find the index record with the largest search-key value ≤ K
⇨ Search the file sequentially starting at the record to which the index record points
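The sparse-index lookup rule can be sketched over an in-memory "file"; the block size of 3 and the key values below are invented:

```python
# Sparse-index lookup over a sequentially ordered file: one index entry per
# block (the first key in the block). Find the largest indexed key <= K,
# then scan forward within that block.

import bisect

# Sorted data file, grouped into blocks of 3 (key, payload) records.
blocks = [
    [(1, "a"), (3, "b"), (5, "c")],
    [(7, "d"), (9, "e"), (11, "f")],
    [(13, "g"), (15, "h"), (17, "i")],
]
# Sparse index: first search key of every block.
index_keys = [blk[0][0] for blk in blocks]   # [1, 7, 13]

def lookup(K):
    # Largest index entry with key <= K points at the block to scan.
    b = bisect.bisect_right(index_keys, K) - 1
    if b < 0:
        return None
    for key, payload in blocks[b]:
        if key == K:
            return payload
    return None

print(lookup(9))    # 'e' -- found via the second index entry
print(lookup(10))   # None -- key not present in the file
```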
Clustering Index
A clustering index is defined on an ordered data file, usually on a non-key column whose
values are not unique for each record. Records that share the same value of the indexed
column are stored (clustered) together, and one index entry points to each such group. Where
a unique value is needed to identify a record faster, two or more columns may be combined
and the index built on the combination.

Let’s learn by doing:


Suppose we have a relation R(a, b, c, d, e) and there are at least 1000 distinct values for each of the
attributes. Consider each of the following query workloads, independently of each other. If it is
possible to speed it up significantly by adding up to two additional indexes to relation R, specify for
each index (1) which attribute or set of attributes form the search key of the index, and (2) whether
the index should be clustered or unclustered.

DAY 15
Course: Database Management System IT601

Relevant MAKAUT syllabus portion: File & Record Concept, Placing file records on Disk,
Fixed and Variable sized Records, Types of Single-
Level Index (primary, secondary, and clustering),
Multilevel Indexes, Dynamic Multilevel Indexes using
B tree and B+ tree
Course Outcomes: Knowledge of Multilevel Indexes

Topics Covered:​ ​Multilevel Indexes

Prerequisites​: Have you Read

● Memory
● Entity Relationship
● File System

Objectives:​ Working with Multilevel Indexes

Notes:
Multilevel Index
★ If primary index does not fit in memory, access becomes expensive.
★ To reduce number of disk accesses to index records, treat primary index kept on disk as a
sequential file and construct a sparse index on it.
outer index – a sparse index of primary index
inner index – the primary index file
★ If even outer index is too large to fit in main memory, yet another level of index can
be created, and so on.
★ Indices at all levels must be updated on insertion or deletion from the file.
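The level count follows from repeatedly building a sparse index on the previous level. A sketch of the arithmetic; the entry count and blocking factor are illustrative, not the exact figures of the exercise below:

```python
# Number of levels in a multilevel index: each level is a sparse (primary)
# index on the level below, so the block count shrinks by the blocking
# factor bfr per level until one block (the top level) remains.

import math

def multilevel_index(first_level_entries, bfr):
    levels, total_blocks = 0, 0
    blocks = math.ceil(first_level_entries / bfr)
    while True:
        levels += 1
        total_blocks += blocks
        if blocks == 1:
            return levels, total_blocks
        blocks = math.ceil(blocks / bfr)

# Illustrative figures: 50,000 first-level entries, 25 entries per block.
print(multilevel_index(50_000, 25))   # (4, 2085): 2000 + 80 + 4 + 1 blocks
```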

Let’s learn by doing:


1. How does the multilevel indexing structure improve the efficiency of searching an index file?

2. Consider a disk with block size B = 512 bytes. A block pointer is P = 8 bytes long, and a
record pointer is Pr = 9 bytes long. A file has r = 50,000 STUDENT records of fixed-size R =
147 bytes. The key field ID# has a length V = 12 bytes. Answer the following questions:

Suppose the key field ID# is the ordering field, and a primary index has been constructed.
Now if we want to make it into a multilevel index, what is the number of levels needed and
what is the total number of blocks required by the multilevel index?

Suppose the key field ID# is NOT the ordering field, and a secondary index has been built.
Now if we want to make it into a multilevel index, what is the number of levels needed and
what is the total number of blocks required by the multilevel index?

DAY 16
Course: Database Management System IT601

Relevant MAKAUT syllabus portion: File & Record Concept, Placing file records on Disk,
Fixed and Variable sized Records, Types of Single-
Level Index (primary, secondary, and clustering),
Multilevel Indexes, Dynamic Multilevel Indexes using
B tree and B+ tree
Course Outcomes: Learn of Dynamic Multilevel Indexes using B tree and B+ tree
Topics Covered: ​Dynamic Multilevel Indexes using B tree and B+ tree
Prerequisites​: Have you Read

● Data Structure
● Entity Relationship
● File System

Objectives:​ Handling Dynamic Multilevel Indexes using B tree and B+ tree


Notes:
B+-Tree Index Files
B+-tree indices are an alternative to indexed-sequential files.
★ Disadvantage of indexed-sequential files: performance degrades as file grows, since many
overflow blocks get created. Periodic reorganization of entire file is required.
★ Advantage of B+-tree index files: automatically reorganizes itself with small, local, changes,
in the face of insertions and deletions. Reorganization of entire file is not required to maintain
performance.
★ Disadvantage of B+-trees: extra insertion and deletion overhead, space overhead.
★ Advantages of B+-trees outweigh disadvantages, and they are used extensively.
A B+-tree is a rooted tree satisfying the following properties:
✔ All paths from root to leaf are of the same length
✔ Each node that is not a root or a leaf has between ⌈n/2⌉ and n children.
✔ A leaf node has between ⌈(n–1)/2⌉ and n–1 values
✔ Special cases:
✔ If the root is not a leaf, it has at least 2 children.
✔ If the root is a leaf (that is, there are no other nodes in the tree), it can have between 0 and (n–1)
values.
Structure of B+ Tree
✔ Every leaf node is at equal distance from the root node. A B+ tree is of order n, where n is
fixed for every B+ tree.
Internal nodes −
● Internal (non-leaf) nodes contain at least ⌈n/2⌉ pointers, except the root node.
● At most, an internal node can contain n pointers.
Leaf nodes −
● Leaf nodes contain at least ⌈n/2⌉ record pointers and ⌈n/2⌉ key values.
● At most, a leaf node can contain n record pointers and n key values.
● Every leaf node contains one block pointer P to point to the next leaf node, forming a
linked list.
B+ Tree Insertion

● B+ trees are filled from the bottom, and each entry is made at a leaf node.
● If a leaf node overflows −
o Split the node into two parts.
o Partition at i = ⌊(m+1)/2⌋.
o The first i entries are stored in one node.
o The rest of the entries (i+1 onwards) are moved to a new node.
o The i-th key is duplicated at the parent of the leaf.
● If a non-leaf node overflows −
o Split the node into two parts.
o Partition the node at i = ⌈(m+1)/2⌉.
o Entries up to i are kept in one node.
o The rest of the entries are moved to a new node.
B+ Tree Deletion
● B+ tree entries are deleted at the leaf nodes.
● The target entry is searched and deleted.
o If it is in an internal node, delete and replace it with the entry from the left position.
● After deletion, underflow is tested.
o If underflow occurs, distribute the entries from the node to its left.
● If distribution is not possible from the left, then
o distribute from the node to its right.
● If distribution is not possible from left or right, then
o merge the node with the nodes to its left and right.
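For capacity questions like the exercise below, the order p is the largest node size that fits in one block. A sketch assuming the Elmasri-style B-tree node layout: p block pointers of size P plus p − 1 entries of key size V and record pointer size Pr:

```python
# Largest B-tree order p such that a node fits in one block of B bytes,
# assuming each node holds p block pointers (P bytes each) and p - 1
# <key, record pointer> entries (V + Pr bytes each):
#     p*P + (p-1)*(V + Pr) <= B

def btree_order(B, P, V, Pr):
    p = 1
    while (p + 1) * P + p * (V + Pr) <= B:   # does a node of order p+1 fit?
        p += 1
    return p

# Parameters as in the exercise below: B = 512, P = 8, V = 12, Pr = 9.
print(btree_order(512, 8, 12, 9))   # 18, since 18*8 + 17*21 = 501 <= 512
```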

Let’s learn by doing:


Consider the disk with block size B = 512 bytes. A block pointer is P = 8 bytes long, and a record
pointer is Pr = 9 bytes long. A file has r = 50,000 STUDENT records of fixed-size R = 147 bytes.
The key field is ID# whose length is V = 12 bytes. (This is the same disk file as in previous
exercises. Some of the early results should be utilised.) Suppose that the file is NOT sorted by the
key field ID# and we want to construct a B-tree access structure (index) on ID#. Answer the
following questions:
1. What is an appropriate order p of this B-tree?
2. How many levels are there in the B-tree if nodes are approximately 69% full?
3. What is the total number of blocks needed by the B-tree if they are approximately 69% full?
4. How many block accesses are required to search for and retrieve a record from the data file,
given an ID#, using the B-tree?
DAY 17
Course: Database Management System IT601

Relevant MAKAUT syllabus portion: Relational Calculus


Course Outcomes: To learn Relational Calculus

Topics Covered:​ ​Relational Calculus

Prerequisites​: Have you Read

● Memory
● Entity Relationship
● File System

Objectives:​ Learn to use Relational Calculus in DBMS

Notes:
RELATIONAL CALCULUS

⇨​ Relational Algebra is a PROCEDURAL LANGUAGE


⇨​ we must explicitly provide a sequence of operations to generate a desired output result
⇨​ Relational Calculus is a DECLARATIVE LANGUAGE
⇨​ we specify what to retrieve, not how to retrieve it
⇨​ Declarative ~ Non-Procedural

If a retrieval can be specified in the relational calculus, it can also be specified in the relational
algebra, and vice versa
→ the expressive power of the two languages is identical
A query language L is relationally complete if L can express any query that can be expressed in
the relational calculus
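As a worked illustration of this equivalence, here is one query ("the A-values of tuples in r whose B-value is 10") in all three notations; the variable names are arbitrary:

```latex
% Relational algebra (procedural: first select, then project):
\Pi_A(\sigma_{B=10}(r))

% Tuple relational calculus (declarative):
\{\, t \mid \exists s \in r \,(t[A] = s[A] \land s[B] = 10) \,\}

% Domain relational calculus (declarative):
\{\, \langle a \rangle \mid \exists b \,(\langle a, b \rangle \in r \land b = 10) \,\}
```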
Let’s learn by doing:
1. Let R = (A, B) and S = (A, C), and let r(R) and s(S) be relations. The relational
algebra expression ΠA(σB=10(r)) is equivalent to the following domain relational
calculus expression:
{<a> | ∃ b (<a, b> ∈ r ∧ b = 10)}
Give an expression in the domain relational calculus that is equivalent to each of the
following:
a) r ⋈ s
b) Πr.A((r ⋈ s) ⋈C=r2.A ∧ r.B>r2.B (ρr2(r)))
2. Consider the following relational schema.
Students(rollno: integer, sname: string)
Courses(courseno: integer, cname: string)
Registration(rollno: integer, courseno: integer, percent: real)
Express in TRC "Find the distinct names of all students who score more than 90% in the
course numbered 107"
