Unit 1 - Introduction
Vijaykumar Mantri, Assoc. Prof. in IT, BVRIT 1
Distributed Databases
• A distributed database is a collection of data
which belong logically to the same system but
are spread over the sites of a computer network.
1. Distribution :‐ The fact that the data are not
resident at the same site.
2. Logical correlation :‐ The fact that the data have
some properties which tie them together.
[Figure: A distributed database on a geographically dispersed network. Branch 1, Branch 2 and Branch 3 each have a computer (Comp 1, Comp 2, Comp 3), a database (DB 1, DB 2, DB 3) and terminals (T); the three computers are connected by a communication network.]
[Figure: A distributed database on a local network. Comp 1, Comp 2 and Comp 3, with databases DB1, DB2 and DB3, are placed in one computer center and connected by a local network; terminals (T) are located at Branch 1, Branch 2 and Branch 3.]
Features of Distributed VS Centralized Databases
• Centralized Control
• Data independence
• Reduction of redundancy
• Complex physical structures and efficient access
• Integrity, Recovery and Concurrency Control
• Privacy & Security
[Figure: record types SUPPLIER and PART, linked by the set SUPPLIER‐PART.]
Find SUPPLIER record with SUP#=S1;
Repeat until “No more members in the set”
Find next PART record in SUPPLIER‐PART set;
Output PART record;
1) At site 1
Send sites 2 and 3 the supplier number SN
2) At sites 2 and 3
execute in parallel, upon receipt of the supplier
number, the following program:
Find all PARTS records having SUP#=SN;
Send result to site 1.
3) At site 1
Merge results from sites 2 and 3;
Output the result.
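The three steps above can be sketched in Python as a small scatter and gather program. This is only an illustration under stated assumptions: the per-site data in SITE_PARTS and the function names are hypothetical, and threads stand in for the remote sites that a real system would reach over the network.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical local PART data at sites 2 and 3, as (PNUM, SUP#) pairs.
SITE_PARTS = {
    2: [("P1", "S1"), ("P2", "S2")],
    3: [("P3", "S1"), ("P4", "S3")],
}

def find_parts_at_site(site, sn):
    """Step 2: runs at one site and returns its PART records with SUP# = sn."""
    return [part for part in SITE_PARTS[site] if part[1] == sn]

def parts_of_supplier(sn):
    """Steps 1 and 3: runs at site 1."""
    sites = sorted(SITE_PARTS)
    # Step 1: send SN to sites 2 and 3, which execute in parallel.
    with ThreadPoolExecutor(max_workers=len(sites)) as pool:
        partial_results = list(
            pool.map(lambda site: find_parts_at_site(site, sn), sites)
        )
    # Step 3: merge the partial results received from the sites.
    merged = []
    for result in partial_results:
        merged.extend(result)
    return merged

print(parts_of_supplier("S1"))  # [('P1', 'S1'), ('P3', 'S1')]
```

Only the supplier number travels to the remote sites and only matching records travel back, which is the point of the distributed access plan.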
Why Distributed Databases
• Organizational and economic reasons.
• Interconnection of existing databases.
• Incremental Growth.
• Reduced communication overhead.
• Performance considerations.
• Reliability and Availability.
Reference Architecture for D.D.
Distribution Transparency :‐
The reference architecture which we are going to see is not
explicitly implemented in all distributed databases, but it is
helpful for understanding the organization of any DD.
[Figure: Reference Architecture for Distributed Databases. The site independent schemas are, from top to bottom, the Global Schema, the Fragmentation Schema and the Allocation Schema. Below them each site has its own Local Mapping Schema (1 and 2), its DBMS (DBMS of Site 1, DBMS of Site 2) and its local database (Local DB at site 1, Local DB at site 2).]
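[Figure: Fragments and physical images for a global relation. The global relation R is divided into the fragments R1, R2, R3 and R4; the physical images of the fragments, labelled (R11), (R12), (R21), (R22), (R32), (R33) and (R34), are stored at Site 1, Site 2 and Site 3.]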
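The mapping from a global relation to its fragments, and from fragments to physical images at sites, can be sketched as two lookup tables. This is a minimal illustration: the assignment of fragments to sites below is a made-up assumption and is not claimed to match the allocation drawn in the figure.

```python
# Fragmentation schema: global relation -> its fragments.
FRAGMENTATION_SCHEMA = {"R": ["R1", "R2", "R3", "R4"]}

# Allocation schema: fragment -> sites holding a physical image of it.
# The site numbers are hypothetical, chosen only for illustration.
ALLOCATION_SCHEMA = {
    "R1": [1, 2],
    "R2": [1, 2, 3],
    "R3": [3],
    "R4": [3],
}

def physical_images(relation):
    """List (fragment, site) pairs, one per physical image of the relation."""
    return [
        (fragment, site)
        for fragment in FRAGMENTATION_SCHEMA[relation]
        for site in ALLOCATION_SCHEMA[fragment]
    ]

def fragments_at(site):
    """List the fragments that have a physical image at the given site."""
    return [f for f, sites in ALLOCATION_SCHEMA.items() if site in sites]

print(physical_images("R"))
print(fragments_at(3))  # ['R2', 'R3', 'R4']
```

A fragment listed at more than one site is replicated; a fragment listed at exactly one site is not.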
Types of Fragmentation
• The decomposition of a global relation into
fragments can be performed by applying two
different types of fragmentation
– Horizontal Fragmentation
– Vertical Fragmentation
• In all types of fragmentation, a fragment can be
defined by an expression in a relational language
(Relational Algebra) which takes global relations
as operands and produces the fragment as
result.
Following are the rules which must be followed
• Completeness Condition :‐ All the data of the
global relation must be mapped into fragments.
• Reconstruction Condition :‐ It must always be
possible to reconstruct each global relation
from its fragments.
• Disjointness Condition :‐ It is convenient that
fragments be disjoint, so that the replication of
data can be controlled at the allocation level.
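For a horizontal fragmentation, the three conditions above can be checked mechanically. The sketch below is an illustration, not a prescribed procedure: it assumes a relation is represented as a Python set of tuples and that reconstruction is done by union; the function name and sample data are hypothetical.

```python
def check_fragmentation(global_rows, fragments):
    """Check the three conditions for a horizontal fragmentation.

    global_rows: set of tuples of the global relation.
    fragments:   list of sets of tuples, one set per fragment.
    Returns (complete, reconstructible, disjoint) as booleans.
    """
    union = set().union(*fragments)
    # Completeness: every tuple of the relation is in some fragment.
    complete = global_rows <= union
    # Reconstruction: the union gives back exactly the global relation.
    reconstructible = union == global_rows
    # Disjointness: no tuple is counted in more than one fragment.
    disjoint = sum(len(f) for f in fragments) == len(union)
    return complete, reconstructible, disjoint

supplier = {(1, "A", "NSP"), (2, "B", "HYD")}
good = [{(1, "A", "NSP")}, {(2, "B", "HYD")}]
overlapping = [{(1, "A", "NSP"), (2, "B", "HYD")}, {(2, "B", "HYD")}]

print(check_fragmentation(supplier, good))         # (True, True, True)
print(check_fragmentation(supplier, overlapping))  # (True, True, False)
```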
Horizontal Fragmentation
• Horizontal Fragmentation :‐ Horizontal
Fragmentation consists of partitioning the
tuples of a global relation into subsets, where
each subset can contain data which have
common geographical properties.
• This can be defined by using the selection
operation on the global relation.
• Eg :‐ Consider the global relation
SUPPLIER (SNUM, NAME, CITY)
Then Horizontal Fragmentation can be defined as
SUPPLIER1 = SL CITY=“NSP” SUPPLIER
SUPPLIER2 = SL CITY=“HYD” SUPPLIER
• The reconstruction can be done using Union as
SUPPLIER = SUPPLIER1 UN SUPPLIER2
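The selection and union in this example can be tried out with a short Python sketch. It is only an illustration: the sample tuples are invented, and the helper functions select and union are hypothetical stand-ins for SL and UN.

```python
# Hypothetical sample tuples of SUPPLIER(SNUM, NAME, CITY).
SUPPLIER = [
    {"SNUM": 1, "NAME": "Asha", "CITY": "NSP"},
    {"SNUM": 2, "NAME": "Ravi", "CITY": "HYD"},
    {"SNUM": 3, "NAME": "Kiran", "CITY": "NSP"},
]

def select(relation, predicate):
    """SL: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def union(r, s):
    """UN: all tuples of r and s, without duplicates."""
    result = list(r)
    for t in s:
        if t not in result:
            result.append(t)
    return result

SUPPLIER1 = select(SUPPLIER, lambda t: t["CITY"] == "NSP")
SUPPLIER2 = select(SUPPLIER, lambda t: t["CITY"] == "HYD")

reconstructed = union(SUPPLIER1, SUPPLIER2)
# Compare independently of tuple order.
print(sorted(reconstructed, key=lambda t: t["SNUM"]) == SUPPLIER)  # True
```

The reconstruction succeeds here because every sample tuple has CITY equal to “NSP” or “HYD”; a supplier in any other city would belong to neither fragment, which would violate the completeness condition.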
Derived Horizontal Fragmentation
• Sometimes Horizontal Fragmentation may be
derived from Horizontal Fragmentation of
another relation
Eg:‐ Consider the global relation
SUPPLY(SNUM, PNUM, DEPTNUM, QUAN)
We can fragment it so that each fragment contains the
tuples for suppliers which are in a given city. The
derived fragmentation can be defined as
SUPPLY1 = SUPPLY SJ SNUM = SNUM SUPPLIER1
SUPPLY2 = SUPPLY SJ SNUM = SNUM SUPPLIER2
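The semi-join (SJ) used here keeps the SUPPLY tuples whose SNUM appears in the given SUPPLIER fragment. A minimal Python sketch follows, assuming invented sample tuples and a hypothetical helper named semijoin.

```python
# Hypothetical horizontal fragments of SUPPLIER(SNUM, NAME, CITY).
SUPPLIER1 = [{"SNUM": 1, "NAME": "Asha", "CITY": "NSP"}]
SUPPLIER2 = [{"SNUM": 2, "NAME": "Ravi", "CITY": "HYD"}]

# Hypothetical tuples of SUPPLY(SNUM, PNUM, DEPTNUM, QUAN).
SUPPLY = [
    {"SNUM": 1, "PNUM": 10, "DEPTNUM": 5, "QUAN": 100},
    {"SNUM": 2, "PNUM": 11, "DEPTNUM": 7, "QUAN": 50},
    {"SNUM": 1, "PNUM": 12, "DEPTNUM": 7, "QUAN": 20},
]

def semijoin(r, s, attr):
    """SJ on attr = attr: tuples of r that match at least one tuple of s."""
    keys = {t[attr] for t in s}
    return [t for t in r if t[attr] in keys]

SUPPLY1 = semijoin(SUPPLY, SUPPLIER1, "SNUM")
SUPPLY2 = semijoin(SUPPLY, SUPPLIER2, "SNUM")

print([t["PNUM"] for t in SUPPLY1])  # [10, 12]
print([t["PNUM"] for t in SUPPLY2])  # [11]
```

Only attributes of SUPPLY appear in the result, so each SUPPLY fragment has the same schema as SUPPLY itself.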
Vertical Fragmentation
• The Vertical Fragmentation of a global relation is
the subdivision of its attributes into groups.
• The fragmentation is correct if each attribute is
mapped into at least one attribute of the
fragments.
• It must be possible to reconstruct the original
relation by joining fragments.
• Eg:‐ Consider the global schema.
EMP(EMPNUM, NAME, SAL, TAX, MGRNUM, DEPTNUM)
The Vertical Fragmentation of this relation can be
defined as
EMP1 = PJ EMPNUM, NAME, MGRNUM, DEPTNUM EMP
EMP2 = PJ EMPNUM, SAL, TAX EMP
The reconstruction of relation EMP can be obtained as
EMP = EMP1 JN EMPNUM = EMPNUM EMP2
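The projection and join in this example can be checked with a short Python sketch. It is an illustration under assumptions: the sample tuples are invented, and project and join are hypothetical helpers standing in for PJ and JN.

```python
# Hypothetical tuples of EMP(EMPNUM, NAME, SAL, TAX, MGRNUM, DEPTNUM).
EMP = [
    {"EMPNUM": 1, "NAME": "Asha", "SAL": 500, "TAX": 50,
     "MGRNUM": 3, "DEPTNUM": 5},
    {"EMPNUM": 2, "NAME": "Ravi", "SAL": 700, "TAX": 70,
     "MGRNUM": 3, "DEPTNUM": 15},
]

def project(relation, attrs):
    """PJ: keep only the listed attributes and drop duplicate tuples."""
    result = []
    for t in relation:
        row = {a: t[a] for a in attrs}
        if row not in result:
            result.append(row)
    return result

def join(r, s, key):
    """JN on key = key; assumes key values are unique in s."""
    index = {t[key]: t for t in s}
    return [{**t, **index[t[key]]} for t in r if t[key] in index]

EMP1 = project(EMP, ["EMPNUM", "NAME", "MGRNUM", "DEPTNUM"])
EMP2 = project(EMP, ["EMPNUM", "SAL", "TAX"])

reconstructed = join(EMP1, EMP2, "EMPNUM")
print(reconstructed == EMP)  # True
```

Both fragments keep the key EMPNUM, which is what makes the join reconstruct the original tuples.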
Mixed Fragmentation
Eg:‐ Consider the global schema.
EMP(EMPNUM, NAME, SAL, TAX, MGRNUM, DEPTNUM)
• The following is a mixed fragmentation obtained
by vertical fragmentation followed by horizontal
fragmentation
• EMP1 = SL DEPTNUM<=10 PJ EMPNUM, NAME, MGRNUM, DEPTNUM EMP
• EMP2 = SL 10<DEPTNUM<=20 PJ EMPNUM, NAME, MGRNUM, DEPTNUM EMP
• EMP3 = SL DEPTNUM>20 PJ EMPNUM, NAME, MGRNUM, DEPTNUM EMP
• EMP4 = PJ EMPNUM, NAME, SAL, TAX EMP
• The reconstruction of the relation EMP is
defined as
EMP = UN(EMP1, EMP2, EMP3) JN EMPNUM = EMPNUM
PJ EMPNUM, NAME, SAL, TAX EMP4
• Mixed fragmentation can be represented by a
Fragmentation Tree as follows :
[Figure: fragmentation tree. The root EMP is split vertically (v) into EMP4 and a subtree, which is split horizontally (h) into EMP1, EMP2 and EMP3.]
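The mixed fragmentation of EMP and its reconstruction can be traced with a small Python sketch. This is a sketch under assumptions: the sample tuples are invented, the helpers project, select and join are hypothetical stand-ins for PJ, SL and JN, and the union is written as list concatenation because the three horizontal fragments are disjoint.

```python
# Hypothetical tuples of EMP(EMPNUM, NAME, SAL, TAX, MGRNUM, DEPTNUM).
EMP = [
    {"EMPNUM": 1, "NAME": "Asha", "SAL": 500, "TAX": 50,
     "MGRNUM": 9, "DEPTNUM": 5},
    {"EMPNUM": 2, "NAME": "Ravi", "SAL": 700, "TAX": 70,
     "MGRNUM": 9, "DEPTNUM": 15},
    {"EMPNUM": 3, "NAME": "Kiran", "SAL": 900, "TAX": 90,
     "MGRNUM": 9, "DEPTNUM": 25},
]

def project(relation, attrs):
    """PJ: keep the listed attributes (EMPNUM is kept, so no duplicates)."""
    return [{a: t[a] for a in attrs} for t in relation]

def select(relation, predicate):
    """SL: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def join(r, s, key):
    """JN on key = key; assumes key values are unique in s."""
    index = {t[key]: t for t in s}
    return [{**t, **index[t[key]]} for t in r if t[key] in index]

# Vertical step first, then horizontal step on DEPTNUM.
vertical = project(EMP, ["EMPNUM", "NAME", "MGRNUM", "DEPTNUM"])
EMP1 = select(vertical, lambda t: t["DEPTNUM"] <= 10)
EMP2 = select(vertical, lambda t: 10 < t["DEPTNUM"] <= 20)
EMP3 = select(vertical, lambda t: t["DEPTNUM"] > 20)
EMP4 = project(EMP, ["EMPNUM", "NAME", "SAL", "TAX"])

# UN(EMP1, EMP2, EMP3) joined with EMP4 on EMPNUM.
reconstructed = join(EMP1 + EMP2 + EMP3, EMP4, "EMPNUM")
print(sorted(reconstructed, key=lambda t: t["EMPNUM"]) == EMP)  # True
```

NAME is stored both in EMP1 to EMP3 and in EMP4; the join keeps a single copy because both sides carry the same value for a given EMPNUM.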
• Fragmentation Schema
– EMP1 = SL DEPTNUM<=10 PJ EMPNUM, NAME, MGRNUM, DEPTNUM (EMP)
– EMP2 = SL 10<DEPTNUM<=20 PJ EMPNUM, NAME, MGRNUM, DEPTNUM (EMP)
– EMP3 = SL DEPTNUM>20 PJ EMPNUM, NAME, MGRNUM, DEPTNUM (EMP)
– EMP4 = PJ EMPNUM, NAME, SAL, TAX (EMP)
– DEPT1 = SL DEPTNUM<=10 (DEPT)
– DEPT2 = SL 10<DEPTNUM<=20 (DEPT)
– DEPT3 = SL DEPTNUM>20 (DEPT)
– SUPPLIER1 = SL CITY=“NSP” (SUPPLIER)
– SUPPLIER2 = SL CITY=“HYD” (SUPPLIER)
– SUPPLY1 = SUPPLY SJ SNUM = SNUM SUPPLIER1
– SUPPLY2 = SUPPLY SJ SNUM = SNUM SUPPLIER2
Integrity Constraints in D.D.
• Integrity Constraints may indicate which data
values are allowed or which transactions are
allowed.
• They may involve one or more relations.
• When an update performed by a database
application violates an Integrity Constraint, the
application is rejected and thus the correctness of
data is preserved.
• Integrity Constraints can be enforced
automatically by adding to application programs
some code for testing whether the constraint is
violated.
• If violated, the program execution is suspended &
all the actions already performed by it are
cancelled.
• Eg:‐ Consider deletion of tuple from SUPPLIER
DELETE * FROM SUPPLIER WHERE SNUM = $SNUM
• In order to verify that the constraint is not
violated, we may write the program as follows:
SELECT $SNUM FROM SUPPLY
WHERE SNUM = $SNUM;
IF NOT #FOUND THEN
DELETE * FROM SUPPLIER
WHERE SNUM = $SNUM
• The main disadvantage of integrity constraints is
the loss of performance due to the integrity test.
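The check-then-delete program above can be written as a runnable sketch using Python's standard sqlite3 module. This is an illustration only: it uses a single in-memory database rather than a distributed one, the function name delete_supplier and the sample rows are hypothetical, and the table layouts are assumed from the SUPPLIER and SUPPLY relations used earlier.

```python
import sqlite3

def delete_supplier(conn, snum):
    """Delete a supplier only if no SUPPLY tuple refers to it.

    Returns True if the delete was issued, False if it was refused.
    """
    with conn:  # commits on success, rolls back if an exception is raised
        found = conn.execute(
            "SELECT 1 FROM SUPPLY WHERE SNUM = ? LIMIT 1", (snum,)
        ).fetchone()
        if found is not None:
            return False  # constraint would be violated: refuse the delete
        conn.execute("DELETE FROM SUPPLIER WHERE SNUM = ?", (snum,))
        return True

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE SUPPLIER (SNUM INTEGER PRIMARY KEY, NAME TEXT, CITY TEXT);
CREATE TABLE SUPPLY (SNUM INTEGER, PNUM INTEGER, DEPTNUM INTEGER, QUAN INTEGER);
INSERT INTO SUPPLIER VALUES (1, 'Asha', 'NSP');
INSERT INTO SUPPLIER VALUES (2, 'Ravi', 'HYD');
INSERT INTO SUPPLY VALUES (1, 10, 5, 100);
""")

print(delete_supplier(conn, 1))  # False: supplier 1 still has a SUPPLY tuple
print(delete_supplier(conn, 2))  # True: supplier 2 is not referenced
```

The extra SELECT issued before every delete is the integrity test whose cost the last bullet refers to.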