Paper Name: Database Management System Paper Code: IT-601
1. MAKAUT Syllabus
Pre-requisites: CS302 (Data Structure & Algorithm) , M101 & M201 (Mathematics),
IT401(Object Oriented Programming & UML)
Detailed Syllabus:
Introduction [4L]
Concept & Overview of DBMS, Data Models, Database Languages, Database Administrator,
Database Users, Three Schema architecture of DBMS.
Entity-Relationship Model [6L]
Basic concepts, Design Issues, Mapping Constraints, Keys, Entity-Relationship Diagram,
Weak Entity Sets, Extended E-R features.
Relational Model [5L]
Structure of relational Databases, Relational Algebra, Relational Calculus, Extended
Relational Algebra Operations, Views, Modifications of the Database.
SQL and Integrity Constraints [8L]
Concept of DDL, DML, DCL. Basic Structure, Set operations, Aggregate Functions, Null
Values, Domain Constraints, Referential Integrity Constraints, assertions, views, Nested
Subqueries, Database security application development using SQL, Stored procedures and
triggers.
Relational Database Design [9L]
Functional Dependency, Different anomalies in designing a Database., Normalization using
functional dependencies, Decomposition, Boyce-Codd Normal Form, 3NF, Normalization
using multi-valued dependencies, 4NF, 5NF
Internals of RDBMS [7L]
Physical data structures, Query optimization: join algorithm, statistics and cost based
optimization. Transaction processing, Concurrency control and Recovery Management:
transaction model properties, state serializability, lock base protocols, two phase locking.
File Organization & Index Structures [6L]
File & Record Concept, Placing file records on Disk, Fixed and Variable sized Records,
Types of Single-Level Index (primary, secondary, and clustering), Multilevel Indexes,
Dynamic Multilevel Indexes using B tree and B+ tree .
2. Recommended Books:
a. Henry F. Korth and Abraham Silberschatz, “Database System Concepts”,
McGraw-Hill.
b. Date C. J., “Introduction to Database Management”, Vol. I, II, III,
Addison-Wesley.
c. Ullman J. D., “Principles of Database Systems”, Galgotia Publications.
3. Course Outcomes:
IT601.1 Develop a good description of the data, its relationships and constraints.
IT601.2 Use functional dependencies to express a relational schema in a well-
normalized form.
IT601.3 Maintain the quality of data in the database.
IT601.4 Identify bad database designs.
4. Day wise Lesson Plan with book reference: Times New Roman 12 (format given
below). Note that the video link in the lesson plan is optional.
5. Course Information
PROGRAMME: IT DEGREE: BTech
Lecture 1 (1 hr)
● Set theory
● Entity Relationship
● Data Structure
Notes:
Database: A collection of logically related information stored in a consistent fashion.
The storage format typically appears to users as some kind of tabular list (table, spreadsheet).
Jobs of Database:
Relational Database:
Relational database was proposed by Edgar Codd (of IBM Research) around 1969.
Primary Key
✔ In the relational model, a table cannot contain duplicate rows, because that would create
ambiguities in retrieval.
✔ To ensure uniqueness, each table should have a column (or a set of columns), called the
primary key, that uniquely identifies every record of the table.
✔ A primary key is called a simple key if it is a single column; it is called a composite key if it
is made up of several columns.
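The uniqueness guarantee described above is exactly what a PRIMARY KEY constraint enforces in SQL. A minimal sketch in SQLite (the table and column names are made up for illustration):

```python
import sqlite3

# Hypothetical student table; roll_no is the primary key.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Asha')")

duplicate_rejected = False
try:
    # A second row with the same primary key value is rejected,
    # so the table can never contain two rows with roll_no = 1.
    conn.execute("INSERT INTO student VALUES (1, 'Bina')")
except sqlite3.IntegrityError:
    duplicate_rejected = True
```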
The most important logical criteria in database design are the reduction/elimination of
redundancy and the maintenance of database consistency.
Normal Relations: The relations that store each fact (tuple) only once in the database and that
remain consistent following database operations (updates, insertions and deletions).
DAY 2
Course: Database Management System IT601
Lecture 1 (1 hr)
● Set theory
● Entity Relationship
● Data Structure
Notes:
Functional Dependency
A set of attributes Y is functionally dependent on a set of attributes X if a given set of values for
each attribute in X determines unique values for the set of attributes in Y.
We use the notation: X →Y to denote that Y is functionally dependent on X. The set of attributes
X is known as the determinant of the FD, X →Y.
– Consider a,b,c → d,e,f
– “Attributes a, b, and c functionally determine d, e, and f”
⇨ No mention of d relating to e or f directly
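The definition of functional dependency above can be checked directly on a concrete relation: equal X-values must force equal Y-values. A small sketch (the rows and attribute names are made up):

```python
# Check whether the FD X -> Y holds in a relation, straight from the
# definition: no two rows may agree on X but disagree on Y.
def fd_holds(rows, X, Y):
    seen = {}
    for row in rows:                     # each row is a dict: attribute -> value
        x = tuple(row[a] for a in X)
        y = tuple(row[a] for a in Y)
        if x in seen and seen[x] != y:   # same X-values, different Y-values
            return False
        seen[x] = y
    return True

r = [
    {"a": 1, "b": 1, "c": 1, "d": 10},
    {"a": 1, "b": 1, "c": 1, "d": 10},
    {"a": 2, "b": 1, "c": 1, "d": 20},
]
```

Here `fd_holds(r, ["a", "b", "c"], ["d"])` is true, but adding a row with the same a, b, c and a different d would break the FD abc → d.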
Splitting rule (useful to split up the right side of an FD):
– if abc → def holds, then abc → d, abc → e, and abc → f all hold
Trivial FDs
Not all functional dependencies are useful
– A → A always holds
– abc → a also always holds (right side is a subset of left side)
Transitive rule
• The transitive rule holds for FDs
– Consider the FDs: a → b and b → c; then a → c holds
– Consider the FDs: ad → b and b → cd; then ad → cd holds, or just ad → c (because of the
trivial dependency rule)
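The rules above (trivial FDs, splitting, transitivity) all fall out of the attribute-closure computation, which is also the workhorse for the exercises below. A sketch, assuming FDs are written as (lhs, rhs) strings of single-letter attributes:

```python
# Compute the attribute closure X+ of a set of attributes under a set of FDs:
# repeatedly add the right side of any FD whose left side is already covered.
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Transitivity falls out automatically: from a -> b and b -> c we get c in a+.
fds = [("a", "b"), ("b", "c")]
```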
2. Use the definition of functional dependency to argue that each of Armstrong’s axioms
(reflexivity, augmentation, and transitivity) is sound.
3. Consider the following proposed rule for functional dependencies: If α → β and γ → β, then
α → γ. Prove that this rule is not sound by showing a relation r that satisfies α → β and γ →
β, but does not satisfy α → γ.
DAY 3
Lecture 1 (1 hr)
● Set theory
● Entity Relationship
● Data Structure
Notes:
Anomalies:
♣ Insertion anomaly: occurs when a row cannot be added to a relation, because not all data
are available (or one has to invent “dummy” data)
♣ Deletion anomaly: occurs when data is deleted from a relation, and other critical data are
unintentionally lost
♣ Update anomaly: occurs when one must make many changes to reflect the modification of a
single datum
DAY 4
Lecture 1 (1 hr)
● Set theory
● Entity Relationship
● Data Structure
Notes:
Removing FDs
❖ Suppose we have a relation R with scheme S and the FD A → B, where A ∩ B = {}
❖ Let C = S – (A U B)
❖ In other words:
♣ A – attributes on the left hand side of the FD
♣ B – attributes on the right hand side of the FD
♣ C – all other attributes
❖ It turns out that we can split R into two parts:
♣ R1, with scheme C U A
♣ R2, with scheme A U B
❖ The original relation can be recovered as the natural join of R1 and R2:
❖ R = R1 NATURAL JOIN R2
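The split-and-recover idea above can be demonstrated on a toy relation. In this sketch the FD is dept_id → dept_head, so R1 gets C ∪ A and R2 gets A ∪ B, and the natural join recovers R exactly (table and attribute names are illustrative):

```python
# Project a relation (list of dict rows) onto some attributes, removing duplicates.
def project(rows, attrs):
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, vals)) for vals in seen]

# Natural join on the shared attributes: combine rows that agree on them.
def natural_join(rows1, rows2, common):
    return [{**t1, **t2}
            for t1 in rows1
            for t2 in rows2
            if all(t1[a] == t2[a] for a in common)]

students = [
    {"s_id": 1, "dept_id": "D1", "dept_head": "H1"},
    {"s_id": 2, "dept_id": "D1", "dept_head": "H1"},
    {"s_id": 3, "dept_id": "D2", "dept_head": "H2"},
]
r1 = project(students, ["s_id", "dept_id"])        # scheme C ∪ A
r2 = project(students, ["dept_id", "dept_head"])   # scheme A ∪ B
recovered = natural_join(r1, r2, ["dept_id"])      # R = R1 NATURAL JOIN R2
```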
Problems Resolved in 2NF
♣ Problems in 1NF
♣ INSERT – Can't add a module with no texts
♣ UPDATE – To change lecturer for M1, we have to change two rows
♣ DELETE – If we remove M3, we remove L2 as well
♣ In 2NF the first two are resolved
1. Compute the closure of the following set F of functional dependencies for relation
schema R = (A, B, C, D, E).
A →BC, CD →E, B→ D, E→ A. List the candidate keys for R.
2. Give an example with explanation where a database is in 1NF and not in 2NF.
DAY 5
Lecture 1 (1 hr)
● Set theory
● Entity Relationship
● Data Structure
Notes:
Decomposition
Redundancy causes problems: Solution => decompose schema so that each information content
is represented only once
Definition: Let R be a relation scheme. {R1, ..., Rn} is a decomposition of R if R = R1 ∪ ... ∪
Rn (i.e., all of R’s attributes are represented)
We will deal mostly with binary decomposition:
R into {R1, R2} where R = R1 ∪ R2
Eg: student(s_id, name, dept_id, dept_head, dept_phone, grade)
⇨ student(s_id, name, dept_id, grade) dept(dept_id, dept_head, dept_phone)
✔ Lossless: Data should not be lost or created when splitting relations up
✔ Dependency preservation: It is desirable that FDs are preserved when splitting relations up
1. Suppose that we decompose the schema R = (A, B, C, D, E) into (A, B, C) & (A, D, E).
Show that this decomposition is a lossless-join decomposition if the following set F of
functional dependencies holds:
A → BC, CD → E, B → D, E → A
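One way to work this exercise: a binary decomposition {R1, R2} is lossless iff the closure of R1 ∩ R2 contains all of R1 or all of R2. A sketch, assuming single-letter attributes and FDs written as (lhs, rhs) strings:

```python
# Attribute closure X+ under a set of FDs.
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Lossless-join test for a binary decomposition:
# lossless iff (R1 ∩ R2)+ covers R1 or covers R2.
def lossless_binary(R1, R2, fds):
    c = closure(set(R1) & set(R2), fds)
    return set(R1) <= c or set(R2) <= c

F = [("A", "BC"), ("CD", "E"), ("B", "D"), ("E", "A")]
```

For the exercise, R1 ∩ R2 = {A}, and A+ = {A, B, C, D, E} covers both parts, so the decomposition is lossless.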
DAY 6
Lecture 1 (1 hr)
● Set theory
● Entity Relationship
● Data Structure
Notes:
Boyce-Codd Normal Form
A relation schema R is in BCNF with respect to a set F of functional dependencies if for all
functional dependencies in F+ of the form α → β, where α ⊆ R and β ⊆ R, at least one of
the following holds:
❖ α → β is a trivial functional dependency (i.e., β ⊆ α)
❖ α is a superkey for R
Example
♣ R = (A, B, C) with F = {A → B, B → C}; B → C violates BCNF since B is not a superkey
♣ R is not in BCNF
♣ Decompose into R1 = (A, B), R2 = (B, C)
❖ R1 and R2 in BCNF
❖ Lossless-join decomposition
❖ Dependency preserving
2. Consider the following collection of relations and dependencies. For each relation, please
(a) determine the candidate keys, and (b) if a relation is not in BCNF then
decompose it into a collection of BCNF relations.
a. R1(A,C,B,D,E), A → B, C→ D
b. R2(A,B,F), AB→ F, B→ F
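A BCNF check for exercises like this can be sketched with attribute closure: every nontrivial FD whose left side is not a superkey is a violation (single-letter attributes and (lhs, rhs) FD strings are assumptions of the sketch):

```python
# Attribute closure X+ under a set of FDs.
def closure(attrs, fds):
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# Return the FDs that violate BCNF: nontrivial, with a non-superkey left side.
def bcnf_violations(R, fds):
    return [(lhs, rhs) for lhs, rhs in fds
            if not set(rhs) <= set(lhs)                 # skip trivial FDs
            and not set(R) <= closure(set(lhs), fds)]   # lhs not a superkey
```

For R1, both A → B and C → D violate BCNF (neither A+ nor C+ covers the scheme); for R2, AB is a key, so only B → F violates.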
DAY 7
Lecture 1 (1 hr)
● Set theory
● Entity Relationship
● Data Structure
Notes:
Third Normal Form
❖ Allows some redundancy (with resultant problems)
❖ But FDs can be checked on individual relations without computing a join
❖ There is always a lossless-join, dependency-preserving decomposition into 3NF
A relation schema R is in third normal form (3NF) if for all α → β in F+ at least one of the
following holds:
❖ α → β is trivial (i.e., β ⊆ α)
❖ α is a superkey for R
❖ Each attribute A in β – α is contained in a candidate key for R.
(NOTE: each attribute may be in a different candidate key)
❖ If a relation is in BCNF it is in 3NF (since in BCNF one of the first two conditions above
must hold).
3. The relation schema Student_Performance (name, courseNo, rollNo, grade) has the
following FDs:
name, courseNo → grade; rollNo, courseNo → grade; name → rollNo; rollNo → name
Find the highest normal form of this relation scheme.
DAY 8
Lecture 1 (1 hr)
● Set theory
● Entity Relationship
● Data Structure
Notes:
Multi-valued Dependency
❖ R=XYZ: relation scheme. An MVD X →→ Y holds iff each X-value in R is associated
with a set of Y-values in a way that does not depend on Z-values.
❖ Formally, for any pair of tuples t1, t2 of r(R) such that t1[X] = t2[X], there exist tuples
t3 and t4 in r(R) with t3[X] = t4[X] = t1[X], t3[Y] = t1[Y], t3[Z] = t2[Z], t4[Y] = t2[Y],
and t4[Z] = t1[Z].
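The tuple-swap condition above can be checked mechanically. A sketch on the classic course/teacher/book example (sample rows and attribute names are made up):

```python
# Check the MVD X ->-> Y on a relation (list of dict rows): for every pair of
# tuples agreeing on X, the tuple taking Y from t1 and Z from t2 must exist.
def mvd_holds(rows, X, Y, Z):
    def proj(t, attrs):
        return tuple(t[a] for a in attrs)
    pool = {(proj(t, X), proj(t, Y), proj(t, Z)) for t in rows}
    return all((proj(t1, X), proj(t1, Y), proj(t2, Z)) in pool
               for t1 in rows for t2 in rows
               if proj(t1, X) == proj(t2, X))

teaches = [
    {"course": "DB", "teacher": "T1", "book": "B1"},
    {"course": "DB", "teacher": "T1", "book": "B2"},
    {"course": "DB", "teacher": "T2", "book": "B1"},
    {"course": "DB", "teacher": "T2", "book": "B2"},
]
```

With all four teacher/book combinations present, course →→ teacher holds; drop one row and the swap condition fails.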
DAY 9
Lecture 1 (1 hr)
● Set theory
● Entity Relationship
● Data Structure
Notes:
Fourth Normal Form (4NF)
Relation schema R is in 4NF
❖ w.r.t. a set of dependencies D (FDs & MVDs)
❖ if for all MVD α →→ β in D+
α →→ β is a trivial MVD, OR
α is a superkey for R
❖ Effect: any nontrivial MVD must have a superkey on its left side, i.e., it is in effect an FD
Join Dependencies
Let R be a relation schema and R1, R2, . . . , Rn be a decomposition of R. The join dependency
*(R1, R2, . . . , Rn) is used to restrict the set of legal relations to those for which R1, R2, . . . , Rn
is a lossless-join decomposition of R. Formally, if R = R1 ∪ R2 ∪ . . . ∪ Rn, we say that a
relation r(R) satisfies the join dependency *(R1, R2, . . . , Rn) if
r = ΠR1(r) ⋈ ΠR2(r) ⋈ . . . ⋈ ΠRn(r).
Project-join normal form (PJNF) is defined in the same way as BCNF and 4NF, except that join
dependencies are used.
A relation schema R is in PJNF with respect to a set D of functional, multi-valued, and join
dependencies if, for all join dependencies in D+ of the form *(R1, R2, . . . , Rn), where each Ri⊆R and
R = R1 ∪ R2 ∪ . . . ∪ Rn, at least one of the following holds:
✔ *(R1, R2, . . . , Rn) is a trivial join dependency.
✔ Every Ri is a superkey for R.
DAY 10
Relevant MAKAUT syllabus portion: Physical data structures, Query optimization: join
algorithm, statistics and cost based optimization.
Course Outcomes: Knowledge on Query Optimization
Lecture 1 (1 hr)
● Set theory
● Entity Relationship
● RDBMS
Notes:
Queries
A query is a language expression that describes data to be retrieved from a database.
Query Optimization:
Query optimization is the process of selecting an efficient execution plan for evaluating a query.
After a query is parsed, the parsed query is passed to the query optimizer, which generates different
execution plans for evaluating it and selects the plan with the least estimated cost.
We may face different scenarios, and based on the scenario we choose a different execution
plan for the query.
First take a record from F1, then one from F2; if they match, store the result in a temporary file,
and repeat this process for all records of F1. Amount of I/O = size of F1 + size of F2.
Like the cross product, sorting also requires different methods depending on the scenario, e.g.
the size of the data files. The best sorting methods do not require the whole file to fit in main memory.
Scenario_X: if both F1 and F2 are very large, we should use merge sort.
Scenario_Y: if the data files are very large and main memory is only a fraction of the file size:
Suppose main memory can hold 5% of a data file of 100 pages, i.e. 5 pages. We read 5
pages from F1, sort them, and write them out as one sorted run; then read the next 5
pages, sort them, and write them out; and so on, until the whole file is divided into
sorted runs. We then keep merging these sorted runs until we get a final sorted
file.
Amount of I/O = one full scan of the file to create the 20 sorted runs + the cost of merging
Comparison between the two scenarios (constant memory vs. a fraction of the data
file):
1. In the first, we have sufficient memory available; in the second, only a
fraction of the data file fits in memory.
2. In Scenario_X, the number of sorted runs decreases as merging proceeds: we start with
many sorted runs, which are gradually reduced to one file.
3. In Scenario_Y, we start with a fixed number of sorted runs, which after merging reduce to
one file.
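The run-generation plus merge idea above can be sketched compactly; here "pages" are simulated by slicing the input, and the memory size is illustrative:

```python
import heapq

# External merge sort sketch: sort what fits in "memory" into sorted runs,
# then do a k-way merge of all the runs (heapq.merge streams them lazily).
def external_sort(records, memory_size):
    runs = [sorted(records[i:i + memory_size])
            for i in range(0, len(records), memory_size)]  # run generation
    return list(heapq.merge(*runs))                        # merge phase
```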
Selection operation (σ):
If the given data file is large, then instead of loading it into main memory we keep a temporary
page in main memory and fetch records from the data file (secondary memory) one by one; if a
particular record satisfies the required condition, we store it.
Projection (Π):
In the given data files we search table by table, as above, and if we find the required attribute in
a particular table, we store that attribute and its values.
How will Oracle (with the rule-based optimizer) evaluate this query?
SELECT E.ENAME
FROM EMP E
WHERE DEPTNO = 20
AND SAL >= 2000
AND ENAME LIKE 'F%'
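Oracle's rule-based optimizer is not freely scriptable here, but the same question can be explored in SQLite as an approachable stand-in (an assumption of this sketch, not the course's target system): EXPLAIN QUERY PLAN shows which access path the optimizer picks.

```python
import sqlite3

# Same schema and query shape as above; with an index on DEPTNO the
# optimizer can choose an index search instead of a full table scan.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EMP (ENAME TEXT, DEPTNO INTEGER, SAL INTEGER)")
conn.execute("CREATE INDEX emp_deptno ON EMP(DEPTNO)")
plan = conn.execute("""
    EXPLAIN QUERY PLAN
    SELECT E.ENAME FROM EMP E
    WHERE DEPTNO = 20 AND SAL >= 2000 AND ENAME LIKE 'F%'
""").fetchall()
print(plan)  # each row's detail column describes a SEARCH or SCAN step
```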
DAY 11
Course: Database Management System IT601
Relevant MAKAUT syllabus portion: Physical data structures, Query optimization: join
algorithm, statistics and cost based optimization.
Course Outcomes: Knowledge on Query Optimization
Lecture 1 (1 hr)
● Set theory
● Entity Relationship
● RDBMS
Notes:
Distributed Cost Model
Two different types of cost functions can be used
Reduce total time
✔ Reduce each cost component (in terms of time) individually, i.e., do as little for
each cost component as possible
✔ Optimize the utilization of the resources (i.e., increase system throughput)
Reduce response time
✔ Do as many things in parallel as possible
✔ May increase total time because of increased total activity
Response time: Elapsed time between the initiation and the completion of a query
– Response time = #seq_insts × unit_instruction_time + #seq_I/Os × unit_I/O_time +
#seq_msgs × unit_message_time + #seq_bytes × unit_byte_transfer_time
– where #seq_x (x in instructions, I/Os, messages, bytes) is the maximum number of x which must
be done sequentially.
• Any processing and communication done in parallel is ignored
Database Statistics
✔ The primary cost factor is the size of the intermediate relations that are produced during
execution and must be transmitted over the network if a subsequent operation is located
at a different site
✔ It is costly to compute the size of the intermediate relations precisely.
✔ Instead global statistics of relations and fragments are computed and used to provide
approximations
DAY 12
Relevant MAKAUT syllabus portion: File & Record Concept, Placing file records on Disk,
Fixed and Variable sized Records, Types of Single-
Level Index (primary, secondary, and clustering),
Multilevel Indexes, Dynamic Multilevel Indexes using
B tree and B+ tree
Course Outcomes: File & Record Concept, Placing file records on Disk
Lecture 1 (1 hr)
● Memory
● Entity Relationship
● File System
Notes:
Physical Storage
Speed with which data can be accessed
Cost per unit of data
Reliability
o data loss on power failure or system crash
o physical failure of the storage device
Can differentiate storage into:
o volatile storage: loses contents when power is switched off
o non-volatile storage:
Contents persist even when power is switched off.
Includes secondary and tertiary storage, as well as battery-backed up
main-memory.
Let’s learn by doing:
DAY 13
Relevant MAKAUT syllabus portion: File & Record Concept, Placing file records on Disk,
Fixed and Variable sized Records, Types of Single-
Level Index (primary, secondary, and clustering),
Multilevel Indexes, Dynamic Multilevel Indexes using
B tree and B+ tree
Course Outcomes: Knowledge of Fixed and Variable sized Records
● Memory
● Entity Relationship
● File System
Objectives: Learn about Fixed and Variable sized Records and use them in real problem.
Notes:
File organization
⇨ The database is stored as a collection of files. Each file is a sequence of records. A
record is a sequence of fields.
⇨ One approach:
o assume record size is fixed
o each file has records of one particular type only
o different files are used for different relations
⇨ This case is easiest to implement; will consider variable length records later.
Fixed Length Record
Simple approach:
Store record i starting from byte n * (i – 1), where n is the size of each record.
Record access is simple, but records may cross blocks
o Modification: do not allow records to cross block boundaries
Deletion of record i: alternatives:
move records i + 1, . . ., n to i, . . . , n – 1
move record n to i
do not move records, but link all free records on a free list
Variable Length Record
DAY 14
Course: Database Management System IT601
Relevant MAKAUT syllabus portion: File & Record Concept, Placing file records on Disk,
Fixed and Variable sized Records, Types of Single-
Level Index (primary, secondary, and clustering),
Multilevel Indexes, Dynamic Multilevel Indexes using
B tree and B+ tree
Course Outcomes: Knowledge of Index
● Memory
● Entity Relationship
● File System
Objectives: Impart knowledge of Indexing
Notes:
Basic Concepts
Indexing mechanisms used to speed up access to desired data.
o E.g., author catalog in library
Search Key - attribute or set of attributes used to look up records in a
file. An index file consists of records (called index entries) of the form
(search-key value, pointer to the record).
Index files are typically much smaller than the original file
Two basic kinds of indices:
Ordered indices: search keys are stored in sorted order
Hash indices: search keys are distributed uniformly across “buckets” using a “hash
function”.
Indexing techniques are evaluated on the basis of: access types supported, access time,
insertion time, deletion time, and space overhead.
In an ordered index, index entries are stored sorted on the search key value. E.g., author catalog
in library.
Primary index: in a sequentially ordered file, the index whose search key specifies the
sequential order of the file.
o Also called clustering index
o The search key of a primary index is usually but not necessarily the primary key.
Secondary index: an index whose search key specifies an order different from the sequential
order of the file. Also called non-clustering index.
Index-sequential file: ordered sequential file with a primary index.
Dense index — Index record appears for every search-key value in the file.
Sparse Index: contains index records for only some search-key values.
o Applicable when records are sequentially ordered on search-key
To locate a record with search-key value K we:
⇨ Find the index record with the largest search-key value ≤ K
⇨ Search file sequentially starting at the record to which the index record points
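The two lookup steps above can be sketched on a toy sorted file; the keys, block layout, and block size are made up for illustration:

```python
import bisect

# Sparse index: one entry per block, holding the first search key of the block.
index_keys = [10, 30, 50, 70]
file_blocks = [[10, 17, 24], [30, 38, 42], [50, 58, 63], [70, 81, 95]]

def lookup(K):
    # Step 1: find the index entry with the largest key <= K.
    i = bisect.bisect_right(index_keys, K) - 1
    if i < 0:
        return None                      # K precedes every indexed key
    # Step 2: scan the block the index entry points to.
    return K if K in file_blocks[i] else None
```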
Clustering Index
A clustering index is defined on an ordered data file. Sometimes the index is created on non-
primary-key columns, which may not be unique for each record.
In this case, to identify the records faster, we group two or more columns to obtain a unique
value and create the index on them. This method is called a clustering index.
Records that have similar characteristics are grouped together, and indexes are created for
these groups.
DAY 15
Course: Database Management System IT601
Relevant MAKAUT syllabus portion: File & Record Concept, Placing file records on Disk,
Fixed and Variable sized Records, Types of Single-
Level Index (primary, secondary, and clustering),
Multilevel Indexes, Dynamic Multilevel Indexes using
B tree and B+ tree
Course Outcomes: Knowledge of Multilevel Indexes
● Memory
● Entity Relationship
● File System
Notes:
Multilevel Index
★ If primary index does not fit in memory, access becomes expensive.
★ To reduce number of disk accesses to index records, treat primary index kept on disk as a
sequential file and construct a sparse index on it.
outer index – a sparse index of primary index
inner index – the primary index file
★ If even outer index is too large to fit in main memory, yet another level of index can
be created, and so on.
★ Indices at all levels must be updated on insertion or deletion from the file.
2. Consider a disk with block size B = 512 bytes. A block pointer is P = 8 bytes long, and a
record pointer is Pr = 9 bytes long. A file has r = 50,000 STUDENT records of fixed-size R =
147 bytes. The key field ID# has a length V = 12 bytes. Answer the following questions:
Suppose the key field ID# is the ordering field, and a primary index has been constructed.
Now if we want to make it into a multilevel index, what is the number of levels needed and
what is the total number of blocks required by the multilevel index?
Suppose the key field ID# is NOT the ordering field, and a secondary index has been built.
Now if we want to make it into a multilevel index, what is the number of levels needed and
what is the total number of blocks required by the multilevel index?
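One way to work this exercise is sketched below, following the usual textbook method. It assumes index entries are (key, block pointer) pairs of size V + P for both index types, and that the single-block top level counts as a level; verify these conventions against your course's.

```python
import math

B, P, Pr = 512, 8, 9            # block size, block pointer, record pointer (bytes)
r, R, V = 50_000, 147, 12       # records, record size, key size

bfr = B // R                            # records per data block
data_blocks = math.ceil(r / bfr)        # blocks in the data file
fo = B // (V + P)                       # index fan-out: entries per index block
                                        # (Pr unused under this assumption)

# Build index levels bottom-up until one level fits in a single block.
def multilevel(first_level_entries):
    levels, total, entries = 0, 0, first_level_entries
    while True:
        blocks = math.ceil(entries / fo)
        levels += 1
        total += blocks
        if blocks == 1:
            return levels, total
        entries = blocks

primary = multilevel(data_blocks)   # primary index: one entry per data block
secondary = multilevel(r)           # secondary index: one entry per record
```

Under these assumptions the primary case needs 4 levels and 697 index blocks, and the secondary case 4 levels and 2085 index blocks.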
DAY 16
Course: Database Management System IT601
Relevant MAKAUT syllabus portion: File & Record Concept, Placing file records on Disk,
Fixed and Variable sized Records, Types of Single-
Level Index (primary, secondary, and clustering),
Multilevel Indexes, Dynamic Multilevel Indexes using
B tree and B+ tree
Course Outcomes: Learn of Dynamic Multilevel Indexes using B tree and B+ tree
Topics Covered: Dynamic Multilevel Indexes using B tree and B+ tree
Prerequisites: Have you Read
● Data Structure
● Entity Relationship
● File System
B+ Tree Insertion
● B+ trees are filled from the bottom and each entry is done at the leaf node.
● If a leaf node overflows −
o Split the node into two parts.
o Partition at i = ⌊(m+1)/2⌋.
o The first i entries are stored in one node.
o The rest of the entries (i+1 onwards) are moved to a new node.
o The ith key is duplicated at the parent of the leaf.
● If a non-leaf node overflows −
o Split the node into two parts.
o Partition the node at i = ⌈(m+1)/2⌉.
o Entries up to i are kept in one node.
o The rest of the entries are moved to a new node.
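The leaf-overflow rule can be shown in isolation. This sketch follows the convention stated above, where the i-th key is duplicated into the parent (some texts instead copy up the first key of the new right node); the key values are made up:

```python
# Split an overflowing leaf of a B+ tree of order m at i = ⌊(m+1)/2⌋.
def split_leaf(keys, m):
    i = (m + 1) // 2                    # partition point ⌊(m+1)/2⌋
    left, right = keys[:i], keys[i:]    # first i stay; i+1 onwards move
    return left, right, keys[i - 1]     # i-th key is duplicated at the parent
```

For example, with m = 3 an overflowing leaf [5, 10, 15, 20] splits at i = 2 into [5, 10] and [15, 20], and the key 10 is duplicated into the parent.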
B+ Tree Deletion
● B+ tree entries are deleted at the leaf nodes.
● The target entry is searched and deleted.
o If it is an internal node, delete and replace with the entry from the left position.
● After deletion, underflow is tested,
o If underflow occurs, distribute the entries from the nodes left to it.
● If distribution is not possible from left, then
o Distribute from the nodes right to it.
● If distribution is not possible from left or from right, then
o Merge the node with left and right to it.
● Memory
● Entity Relationship
● File System
Notes:
RELATIONAL CALCULUS
If a retrieval can be specified in the relational calculus, it can be specified in the relational algebra, and
vice versa
→ the expressive power of the two languages is identical
A query language L is relationally complete if L can express any query that can be expressed in
the relational calculus
Let’s learn by doing:
1. Let R = (A, B) and S = (A, C), and let r(R) and s(S) be relations. The relational
algebra expression ΠA(σB=10(r)) is equivalent to the following domain relational
calculus expression:
{<a> | ∃ b ( <a, b> ∈ r ∧ b = 10)}
Give an expression in the domain relational calculus that is equivalent to each of the
following:
a) r ⋈ s
b) Πr.A((r ⋈ s) ⋈C = r2.A ∧ r.B > r2.B (ρr2(r)))
2. Consider the following relational schema.
Students(rollno: integer, sname: string)
Courses(courseno: integer, cname: string)
Registration(rollno: integer, courseno: integer, percent: real)
Express in TRC "Find the distinct names of all students who score more than 90% in the
course numbered 107"
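One possible TRC answer (a sketch): {t | ∃ s ∈ Students ∃ m ∈ Registration (s.rollno = m.rollno ∧ m.courseno = 107 ∧ m.percent > 90 ∧ t.sname = s.sname)}. The equivalent SQL can be run as a sanity check (the sample rows are made up; the equivalence is my reading of the TRC query):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Students(rollno INTEGER, sname TEXT);
    CREATE TABLE Registration(rollno INTEGER, courseno INTEGER, percent REAL);
    INSERT INTO Students VALUES (1, 'Asha'), (2, 'Bina');
    INSERT INTO Registration VALUES (1, 107, 95.0), (2, 107, 80.0), (2, 108, 99.0);
""")
# Distinct names of students scoring more than 90% in course 107.
names = [row[0] for row in conn.execute("""
    SELECT DISTINCT S.sname
    FROM Students S JOIN Registration R ON S.rollno = R.rollno
    WHERE R.courseno = 107 AND R.percent > 90
""")]
```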