Fall 2005
AVL Trees / Slide 2
Dictionary for Secondary Storage
- The AVL tree is an excellent dictionary structure when the entire structure fits into main memory.
  - Following or updating a pointer requires only a memory cycle.
- When the data becomes so large that it cannot fit into main memory, the performance of the AVL tree may deteriorate rapidly.
  - Following or updating a pointer requires one disk access.
  - Traversing from the root to a leaf may require log_2 n disk accesses.
    - When n = 1048576 = 2^20, we need 20 disk accesses. For a disk spinning at 7200 rpm (about 8.3 ms per revolution, assuming roughly one revolution per access), this takes roughly 0.166 seconds. 10 searches take more than 1 second. This is far too slow.
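
The arithmetic above can be reproduced with a short sketch. It assumes, as a simplification, that each disk access costs one full platter revolution; the function names `levels` and `search_seconds` are hypothetical and chosen only for this illustration.

```python
def levels(n, fanout):
    """Smallest h with fanout**h >= n, i.e. root-to-leaf steps in a full tree."""
    h, capacity = 0, 1
    while capacity < n:
        capacity *= fanout
        h += 1
    return h


def search_seconds(n, fanout=2, rpm=7200):
    """Estimated time for one search when every step is one disk access."""
    seconds_per_access = 60.0 / rpm  # assumption: one revolution per access
    return levels(n, fanout) * seconds_per_access


accesses = levels(2**20, 2)          # binary tree over 1048576 keys
one_search = search_seconds(2**20)   # seconds for a single search
ten_searches = 10 * one_search
```

Under these assumptions a single search over 2^20 keys costs 20 accesses, which is where the figure of roughly 0.166 seconds comes from.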
- Since the processor is much faster than the disk, it is more important to minimize the number of disk accesses, even at the cost of performing more CPU instructions.
- In practice, it is impossible to keep the same number of children per internal node.
- A B+-tree of order M ≥ 3 is an M-ary tree with the following properties:
  - Each internal node has at most M children.
  - Each internal node, except the root, has between ⌈M/2⌉ - 1 and M - 1 keys.
    - This guarantees that the tree does not degenerate into a binary tree.
  - The keys at each node are ordered.
  - The root is either a leaf or has between 1 and M - 1 keys.
  - The data items are stored at the leaves. All leaves are at the same depth. Each leaf has between ⌈L/2⌉ - 1 and L - 1 data items, for some L (usually L << M, but we will assume M = L in most examples).
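
As a quick check of the occupancy rules above, the following sketch computes the allowed key and item counts for a given order. The function name `key_bounds` and the dictionary layout are assumptions made for illustration only.

```python
def key_bounds(M, L=None):
    """Return (min, max) counts per node kind, following the listed properties."""
    if L is None:
        L = M  # the slides assume M = L in most examples
    half_M = (M + 1) // 2  # integer form of ceil(M / 2)
    half_L = (L + 1) // 2  # integer form of ceil(L / 2)
    return {
        "internal": (half_M - 1, M - 1),  # keys in a non-root internal node
        "root": (1, M - 1),               # keys in a root that is not a leaf
        "leaf": (half_L - 1, L - 1),      # data items in a leaf
    }


bounds = key_bounds(5)  # the M = L = 5 case
```

For M = L = 5 this gives between 2 and 4 keys per non-root internal node and between 2 and 4 items per leaf.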

Example
- Here, M = L = 5.
- Records are stored at the leaves, but we only show the keys here.
- At the internal nodes, only keys (and pointers to children) are stored (also called separating keys).
- Which keys are stored at the internal nodes?
- Each internal node or leaf is designed to fit into one I/O block of data. An I/O block can usually hold quite a lot of data. Hence, an internal node can keep a lot of keys, i.e., M is large. This implies that the tree has only a few levels, so only a few disk accesses are needed to accomplish a search, insertion, or deletion.
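
To make "only a few levels" concrete, here is a rough sketch. The block size (4096 bytes) and the key and pointer sizes (8 bytes each) are hypothetical values, not taken from the slides, and the estimate treats leaves like internal nodes for simplicity. The names `order_for_block` and `levels` are likewise illustrative.

```python
def order_for_block(block_bytes, key_bytes, ptr_bytes):
    """Largest M such that (M - 1) keys and M child pointers fit in one block."""
    return (block_bytes + key_bytes) // (key_bytes + ptr_bytes)


def levels(n, fanout):
    """Smallest h with fanout**h >= n."""
    h, capacity = 0, 1
    while capacity < n:
        capacity *= fanout
        h += 1
    return h


M = order_for_block(4096, 8, 8)     # hypothetical block, key, pointer sizes
min_fanout = (M + 1) // 2           # each non-root node has at least ceil(M/2) children
worst_case_levels = levels(2**20, min_fanout)
binary_levels = levels(2**20, 2)
```

With these assumed sizes, M comes out as 256, and even with every node only half full the tree over 2^20 keys needs about 3 levels instead of 20.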

Searching
- Suppose that we want to search for the key K.
  - The path traversed is shown in bold.
- Let x be the input search key.
- Start the search at the root.
- If we encounter an internal node v, search (by linear or binary search) for x among the keys stored at v:
  - If x < K_min at v, follow the left child pointer of K_min.
  - If K_i ≤ x < K_{i+1} for two consecutive keys K_i and K_{i+1} at v, follow the left child pointer of K_{i+1}.
  - If x ≥ K_max at v, follow the right child pointer of K_max.
- If we encounter a leaf v, search (by linear or binary search) for x among the keys stored at v. If x is found, return the entire record; otherwise, report that x was not found.
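
The search rules above can be sketched as follows. This is a minimal in-memory illustration, not the slides' own code: the `Leaf` and `Internal` classes and the `search` function are hypothetical names, and binary search is done with Python's `bisect` module.

```python
from bisect import bisect_left, bisect_right


class Leaf:
    def __init__(self, keys, records):
        self.keys, self.records = keys, records


class Internal:
    def __init__(self, keys, children):
        # children[i] holds keys < keys[i]; children[-1] holds keys >= keys[-1]
        self.keys, self.children = keys, children


def search(node, x):
    """Return the record stored under key x, or None if x is absent."""
    while isinstance(node, Internal):
        # Number of keys <= x selects the child: 0 means x < K_min,
        # len(keys) means x >= K_max, otherwise K_i <= x < K_{i+1}.
        node = node.children[bisect_right(node.keys, x)]
    i = bisect_left(node.keys, x)
    if i < len(node.keys) and node.keys[i] == x:
        return node.records[i]
    return None


# A small hand-built tree for illustration.
root = Internal(
    [10, 20],
    [Leaf([2, 5], ["r2", "r5"]),
     Leaf([10, 12], ["r10", "r12"]),
     Leaf([20, 25], ["r20", "r25"])],
)
```

A key equal to a separating key, such as 10 here, is routed to the right of that separator, which matches the rule K_i ≤ x < K_{i+1}.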

Insertion
- Suppose that we want to insert a key K and its associated record.
- Search for the key K using the search procedure.
- This will bring us to a leaf x.
- Insert K into x.
  - Splitting of nodes (instead of the rotations used in AVL trees) maintains the properties of B+-trees [next slide].

Insertion into a Leaf
- If leaf x contains fewer than M - 1 keys, insert K into x (at the correct position in node x).
- If x is already full (i.e., it contains M - 1 keys), split x:
  - Cut x off from its parent.
  - Insert K into x, pretending x has space for K. Now x has M keys.
  - After inserting K, split x into 2 new leaves x_L and x_R, with x_L containing the ⌊M/2⌋ smallest keys and x_R containing the remaining ⌈M/2⌉ keys. Let J be the minimum key in x_R.
  - Make a copy of J to be the parent of x_L and x_R, and insert the copy together with its child pointers into the old parent of x.
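
The leaf split can be illustrated with a small sketch that works on plain sorted lists of keys. The function name `split_leaf` and its return shape are assumptions for this example; records and parent updates are left out.

```python
from bisect import insort


def split_leaf(keys, K, M):
    """Insert K into a full leaf (M - 1 sorted keys) and split it.

    Returns (x_L, J, x_R), where J is the key copied up to the parent.
    """
    assert len(keys) == M - 1, "this sketch only handles a full leaf"
    overfull = list(keys)
    insort(overfull, K)          # pretend there is room: now M keys
    cut = M // 2                 # floor(M/2) smallest keys go to x_L
    x_L, x_R = overfull[:cut], overfull[cut:]
    J = x_R[0]                   # minimum key of x_R; it also stays in x_R
    return x_L, J, x_R


x_L, J, x_R = split_leaf([2, 5, 8, 11], 6, M=5)
```

Note that J remains in x_R after the split: only a copy goes to the parent, so every key is still present at the leaf level.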

[Figure: inserting into a non-full leaf]
[Figure: splitting a leaf during insertion, continued over two slides]
  - A leaf split costs three disk accesses: two to write the two leaves and one to update the parent.
  - For L = 32, the overfull leaf holds 32 items and splits into two leaves of 16 items each, so 15 more insertions can go into either leaf before it must split again.

[Figure: another example, continued over two slides]

Splitting an Internal Node
To insert a key K into a full internal node x:
- Cut x off from its parent.
- Insert K and its left and right child pointers into x, pretending there is space. Now x has M keys.
- Split x into 2 new internal nodes x_L and x_R, with x_L containing the (⌈M/2⌉ - 1) smallest keys and x_R containing the ⌊M/2⌋ largest keys. Note that the ⌈M/2⌉-th key J is not placed in x_L or x_R.
- Make J the parent of x_L and x_R, and insert J together with its child pointers into the old parent of x.
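
A sketch of the internal split follows, again on plain lists. The function name `split_internal` and the convention that the new key's left and right pointers replace the single old pointer at that position are assumptions made for this illustration; child pointers are represented by strings.

```python
from bisect import bisect_left


def split_internal(keys, children, K, left, right, M):
    """Insert K (with its two child pointers) into a full internal node and split.

    Returns ((keys_L, children_L), J, (keys_R, children_R)); J moves up and
    appears in neither half.
    """
    assert len(keys) == M - 1 and len(children) == M
    keys, children = list(keys), list(children)
    i = bisect_left(keys, K)
    keys.insert(i, K)                   # pretend there is room: now M keys
    children[i:i + 1] = [left, right]   # the old pointer is replaced by two
    mid = (M + 1) // 2 - 1              # 0-based index of the ceil(M/2)-th key
    J = keys[mid]
    x_L = (keys[:mid], children[:mid + 1])
    x_R = (keys[mid + 1:], children[mid + 1:])
    return x_L, J, x_R


x_L, J, x_R = split_internal(
    [10, 20, 30, 40], ["c0", "c1", "c2", "c3", "c4"],
    25, "c2a", "c2b", M=5,
)
```

Unlike the leaf case, J is moved rather than copied, which is why the two halves together hold only M - 1 keys.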

[Figure: example of splitting an internal node, continued over two slides]

Termination
- Splitting continues as long as we encounter full internal nodes.
- If the split internal node x does not have a parent (i.e., x is the root), then create a new root containing the key J and its two children.
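
Putting the pieces together, the sketch below inserts keys into an in-memory tree, splitting upward until a node is no longer overfull or a new root is created. It is a simplified illustration under stated assumptions: it stores keys only (no records), assumes M = L and distinct keys, and ignores disk I/O. The names `Node`, `insert`, and `leaf_keys` are hypothetical.

```python
from bisect import bisect_right, insort


class Node:
    def __init__(self, keys, children=None):
        self.keys = keys
        self.children = children  # None marks a leaf


def insert(root, K, M):
    """Insert key K and return the (possibly new) root."""
    path, node = [], root
    while node.children is not None:        # descend, remembering the path
        i = bisect_right(node.keys, K)
        path.append((node, i))
        node = node.children[i]
    insort(node.keys, K)
    while len(node.keys) == M:              # overfull: split and go up
        if node.children is None:           # leaf: J is copied up
            cut = M // 2
            right = Node(node.keys[cut:])
            node.keys = node.keys[:cut]
            J = right.keys[0]
        else:                               # internal node: J is moved up
            mid = (M + 1) // 2 - 1
            J = node.keys[mid]
            right = Node(node.keys[mid + 1:], node.children[mid + 1:])
            node.keys = node.keys[:mid]
            node.children = node.children[:mid + 1]
        if not path:                        # the split node was the root
            return Node([J], [node, right])
        parent, i = path.pop()
        parent.keys.insert(i, J)
        parent.children.insert(i + 1, right)
        node = parent
    return root


def leaf_keys(node):
    """Collect all keys stored at the leaves, left to right."""
    if node.children is None:
        return list(node.keys)
    return [k for child in node.children for k in leaf_keys(child)]


root = Node([])
for k in range(1, 11):
    root = insert(root, k, M=4)
```

With M = 4 and the keys 1 through 10 inserted in order, the tenth insertion splits a leaf and then the full root, so the tree gains a level and the new root holds the single key 5.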