
Review: Binary Search Tree (BST)

Structure property (binary tree)
  Each node has at most 2 children
  Result: keeps operations simple

Order property
  All keys in the left subtree are smaller than the node's key
  All keys in the right subtree are larger than the node's key
  Result: easy to find any given key

[Figure: example BST with root 8 and keys 2, 4, 5, 7, 8, 10, 11, 12, 13, 14]
BST: Efficiency of Operations?

Problem: operations may be inefficient if the BST is unbalanced.

Find, insert, delete
  O(n) in the worst case

BuildTree
  O(n²) in the worst case
How can we make a BST efficient?

Observation
  BST: the shallower the better!

Solution: Require and maintain a Balance Condition that
  1. Ensures depth is always O(log n)      (strong enough!)
  2. Is efficient to maintain              (not too strong!)

When we build the tree, make sure it's balanced.
BUT... balancing a tree only at build time is insufficient, because a
sequence of operations can eventually transform our carefully balanced
tree into the dreaded list.
So we also need to keep the tree balanced as we perform operations.

Potential Balance Conditions

1. Left and right subtrees of the root have equal number of nodes
   Too weak! Height mismatch example:

2. Left and right subtrees of the root have equal height
   Too weak! Double chain example:

Potential Balance Conditions

3. Left and right subtrees of every node have equal number of nodes
   Too strong! Only perfect trees (2ⁿ − 1 nodes)

4. Left and right subtrees of every node have equal height
   Too strong! Only perfect trees (2ⁿ − 1 nodes)

The AVL Balance Condition

Left and right subtrees of every node have heights differing by at most 1

Definition: balance(node) = height(node.left) − height(node.right)
AVL property: for every node x, −1 ≤ balance(x) ≤ 1

Ensures small depth
  This is because an AVL tree of height h must have a number of nodes
  exponential in h
  Thus the height must be O(log(number of nodes))

Efficient to maintain
  Using single and double rotations
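The balance definition above translates directly into code. A minimal sketch in Java (a hypothetical helper, not from the slides), using the common convention that an empty subtree has height −1:

```java
// Minimal sketch of the balance definition (hypothetical helper class).
// Convention assumed here: an empty subtree has height -1, a leaf has height 0.
class BalanceCheck {
    static int balance(int leftHeight, int rightHeight) {
        return leftHeight - rightHeight;
    }

    // AVL property at a node x: -1 <= balance(x) <= 1
    static boolean satisfiesAvl(int leftHeight, int rightHeight) {
        int b = balance(leftHeight, rightHeight);
        return -1 <= b && b <= 1;
    }
}
```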

The AVL Tree Data Structure

An AVL tree is a self-balancing binary search tree.

Structural properties
  1. Binary tree property (same as BST)
  2. Order property (same as for BST)
  3. Balance property: balance of every node is between −1 and 1

Result: Worst-case depth is O(log n)

Named after inventors Adelson-Velskii and Landis (AVL)
  First invented in 1962

Is this an AVL tree?

[Figure: a BST with keys 6, 4, 8, 1, 7, 11, 10, 12, with each node's height annotated]

Yes! Because the left and right subtrees of every node have heights
differing by at most 1

Is this an AVL tree?

[Figure: a BST containing keys 1, 4, 6, 11 (among others), with each node's height annotated]

Nope! The left and right subtrees of some nodes (e.g. 1, 4, 6) have
heights that differ by more than 1

What do AVL trees give us?

If we have an AVL tree, then the number of nodes is an exponential
function of the height.
  Thus the height is a log function of the number of nodes!
  And thus find is O(log n)

But as we insert and delete elements, we need to:
  1. Track balance
  2. Detect imbalance
  3. Restore balance

An AVL Tree

[Figure: an AVL tree with keys 2, 5, 7, 9, 10, 15, 17, 20, 30 (root 10); each node is annotated with its height]

Node object fields:
  key
  value
  height
  children (left, right)

Track height at all times!
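A minimal sketch of such a node object in Java (class, field, and helper names are hypothetical; the slides only list the fields), with the height field recomputed whenever a child subtree changes:

```java
// Hypothetical AVL node sketch: key, value, height, and two children.
class AvlNode<K extends Comparable<K>, V> {
    K key;
    V value;
    int height;                 // tracked at all times
    AvlNode<K, V> left, right;

    AvlNode(K key, V value) {
        this.key = key;
        this.value = value;
        this.height = 0;        // a new leaf has height 0
    }
}

class AvlTreeSketch {
    // Empty subtrees are treated as having height -1.
    static int height(AvlNode<?, ?> n) {
        return (n == null) ? -1 : n.height;
    }

    // Recompute a node's height from its children after a change below it.
    static void updateHeight(AvlNode<?, ?> n) {
        n.height = 1 + Math.max(height(n.left), height(n.right));
    }
}
```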

AVL tree operations

AVL find:
  Same as BST find
AVL insert:
  First BST insert, then check balance and potentially fix the AVL tree
  Four different imbalance cases
AVL delete:
  The easy way is lazy deletion
  Otherwise, do the deletion and then check for several imbalance cases
  (we will skip this)

Insert: detect potential imbalance

1. Insert the new node as in a BST (a new leaf)
2. For each node on the path from the root to the new leaf, the insertion
   may (or may not) have changed the node's height
3. So after insertion in a subtree, detect height imbalance and perform a
   rotation to restore balance at that node

All the action is in defining the correct rotations to restore balance

Fact that an implementation can ignore:
  There must be a deepest element that is imbalanced after the insert
  (all descendants still balanced)
  After rebalancing this deepest node, every node is balanced
  So at most one node needs to be rebalanced

Case #1: Example

Insert(6)
Insert(3)
Insert(1)

Third insertion violates balance property
  happens to be at the root

What is the only way to fix this?

[Figure: the three inserts produce the left chain 6 - 3 - 1, which is unbalanced at the root]

Fix: Apply Single Rotation

Single rotation: The basic operation we'll use to rebalance
  Move child of unbalanced node into parent position
  Parent becomes the "other" child (always okay in a BST!)
  Other subtrees move in the only way BST allows (next slide)

AVL Property violated at node 6

[Figure: rotating at 6 makes 3 the root of the subtree, with children 1 and 6 (each now height 0)]

Child's new height = old height before insert

The example generalized: Left of Left

Insertion into the left-left grandchild causes an imbalance
  1 of 4 possible imbalance causes (other 3 coming up!)
  Creates an imbalance in the AVL tree (specifically, a is imbalanced)

[Figure: inserting into subtree X under node b raises b's subtree to height h+2 while a's other subtree Z stays at height h, so a (now height h+3) is imbalanced]


The general left-left case

So we rotate at a
  Move the left child of the unbalanced node into the parent position
  Parent becomes the right child
  Other subtrees move in the only way BST allows,
  using BST facts: X < b < Y < a < Z

[Figure: before the rotation, a (height h+3) has left child b (height h+2) with subtrees X (height h+1) and Y (height h), and right subtree Z (height h); after rotating, b (height h+2) has left subtree X and right child a (height h+1) over Y and Z]

A single rotation restores balance at the node
  To the same height as before the insertion, so ancestors are now balanced
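A minimal Java sketch of this single rotation, assuming the hypothetical AvlNode class and height helpers sketched earlier (it returns the new subtree root so the caller can re-attach it where a used to be):

```java
// Single rotation for the left-left case (hypothetical helper).
// 'a' is the unbalanced node; its left child 'b' becomes the new subtree root.
static <K extends Comparable<K>, V> AvlNode<K, V> rotateWithLeftChild(AvlNode<K, V> a) {
    AvlNode<K, V> b = a.left;
    a.left = b.right;        // subtree Y (keys between b and a) moves under a
    b.right = a;             // a becomes b's right child
    updateHeight(a);         // recompute heights bottom-up
    updateHeight(b);
    return b;                // caller re-attaches b where a used to be
}
```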

Another example: insert(16)

[Figure: 16 is inserted below 17 in a tree rooted at 15 (keys 3, 4, 6, 8, 10, 15, 17, 19, 20, 22, 24); this unbalances node 22 (a left-left case), and a single rotation at 22 promotes 19 and restores balance]

The general right-right case

Mirror image of the left-left case, so you rotate the other way
  Exact same concept, but needs different code

[Figure: the unbalanced node's right-right grandchild subtree has grown; the mirror-image single rotation moves the right child b up, restoring the subtree to height h+2]
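The mirror rotation for the right-right case, as another hedged sketch using the same hypothetical node class and helpers:

```java
// Single rotation for the right-right case (mirror of the left-left rotation).
static <K extends Comparable<K>, V> AvlNode<K, V> rotateWithRightChild(AvlNode<K, V> a) {
    AvlNode<K, V> b = a.right;
    a.right = b.left;        // the middle subtree moves under a
    b.left = a;              // a becomes b's left child
    updateHeight(a);
    updateHeight(b);
    return b;
}
```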

Two cases to go

Unfortunately, single rotations are not enough for insertions in the
left-right subtree or the right-left subtree

Simple example: insert(1), insert(6), insert(3)
  First wrong idea: single rotation like we did for left-left

[Figure: after the inserts, 1 has right child 6, which has left child 3; the attempted single rotation produces a tree that violates the order property!]

Two cases to go

Unfortunately, single rotations are not enough for insertions in the
left-right subtree or the right-left subtree

Simple example: insert(1), insert(6), insert(3)
  Second wrong idea: single rotation on the child of the unbalanced node

[Figure: rotating 6 with its child 3 leaves the chain 1 - 3 - 6: still unbalanced!]

Sometimes two wrongs make a right

First idea violated the order property
Second idea didn't fix balance
But if we do both single rotations, starting with the second, it works!
(And not just for this example.)

Double rotation:
  1. Rotate problematic child and grandchild
  2. Then rotate between self and new child

[Figure: for the chain 1 - 6 - 3, rotating 6 with 3 and then rotating 1 with 3 makes 3 the root, with children 1 and 6]
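A minimal sketch of the double rotation for the right-left case, written as exactly the two single rotations described above (same hypothetical helpers; the left-right case is the mirror image):

```java
// Double rotation for the right-left case:
// 1. rotate the problematic child (a.right) with its left child (the grandchild),
// 2. then rotate a with its new right child.
static <K extends Comparable<K>, V> AvlNode<K, V> doubleWithRightChild(AvlNode<K, V> a) {
    a.right = rotateWithLeftChild(a.right);   // step 1: fix the grandchild
    return rotateWithRightChild(a);           // step 2: rotate at the unbalanced node
}
```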

The general right-left case

[Figure: the unbalanced node a (height h+3) has left subtree X (height h) and right child b (height h+2); b has right subtree Z (height h) and left child c (height h+1) over subtrees U and V; after the double rotation, c (height h+2) is the subtree root, with left child a (over X and U) and right child b (over V and Z)]

Comments

Like in the left-left and right-right cases, the height of the subtree
after rebalancing is the same as before the insert
  So no ancestor in the tree will need rebalancing

Does not have to be implemented as two rotations; can just do:

[Figure: the same right-left double rotation drawn as a single step, moving c directly to the grandparent's position with a, b, X, U, V, and Z reattached around it]

Easier to remember than you may think:
  Move c to the grandparent's position
  Put a, b, X, U, V, and Z in the only legal positions for a BST

The last case: left-right

Mirror image of right-left
  Again, no new concepts, only new code to write

[Figure: the mirror diagram: an insertion in the left child's right subtree is fixed by the mirror-image double rotation]

Insert, summarized

Insert as in a BST
Check back up the path for imbalance, which will be 1 of 4 cases:
  Node's left-left grandchild is too tall
  Node's left-right grandchild is too tall
  Node's right-left grandchild is too tall
  Node's right-right grandchild is too tall
Only one case occurs because the tree was balanced before the insert
After the appropriate single or double rotation, the smallest unbalanced
subtree has the same height as before the insertion
  So all ancestors are now balanced
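Putting the pieces together, here is a hedged sketch of a recursive AVL insert that dispatches on the four cases above. It assumes the hypothetical AvlNode class, height helpers, and rotation helpers sketched earlier, and simply overwrites the value on a duplicate key:

```java
// Recursive BST insert followed by a rebalance check on the way back up.
static <K extends Comparable<K>, V> AvlNode<K, V> insert(AvlNode<K, V> node, K key, V value) {
    if (node == null) return new AvlNode<>(key, value);   // new leaf, height 0

    int cmp = key.compareTo(node.key);
    if (cmp < 0)      node.left  = insert(node.left,  key, value);
    else if (cmp > 0) node.right = insert(node.right, key, value);
    else { node.value = value; return node; }             // duplicate key: just update

    updateHeight(node);
    return rebalance(node);
}

// At most one rotation (single or double) is needed per insert.
static <K extends Comparable<K>, V> AvlNode<K, V> rebalance(AvlNode<K, V> node) {
    int balance = height(node.left) - height(node.right);
    if (balance > 1) {                                     // left side too tall
        if (height(node.left.left) >= height(node.left.right))
            return rotateWithLeftChild(node);              // left-left: single rotation
        node.left = rotateWithRightChild(node.left);       // left-right: double rotation
        return rotateWithLeftChild(node);
    }
    if (balance < -1) {                                    // right side too tall
        if (height(node.right.right) >= height(node.right.left))
            return rotateWithRightChild(node);             // right-right: single rotation
        node.right = rotateWithLeftChild(node.right);      // right-left: double rotation
        return rotateWithRightChild(node);
    }
    return node;                                           // already balanced
}
```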

Example

[Figure: an AVL tree with keys 2, 5, 7, 9, 10, 15, 17, 20, 30 (root 10); each node is annotated with its height]

Insert a 6

What's the deepest node that is unbalanced?
What's the case?  left-left
What do we do?

[Figure: 6 is inserted below 7; heights along the insertion path increase and the tree is no longer balanced]

Insert a 6

[Figure: after a single rotation, 7 is the parent of 6 and 9, and the tree is balanced again]

Insert a 13

[Figure: 13 is inserted as the left child of 15; no heights above it change, so the tree stays balanced]

Insert a 14

What is the deepest unbalanced node?

[Figure: 14 is inserted as the right child of 13; heights along the insertion path increase]

Insert a 14

What is the deepest unbalanced node?
Which of the four cases is this?
  Still left-left!
  Single rotation

[Figure: the same tree, highlighting that the inserted 14 lies in the left-left grandchild subtree of the unbalanced node 20]

Insert a 14

[Figure: after a single rotation at 20, 15 becomes the right child of 10, with children 13 (over 14) and 20 (over 17 and 30)]

Now, efficiency

Worst-case complexity of find: O(log n)
  Tree is balanced
Worst-case complexity of insert: O(log n)
  Tree starts balanced
  A rotation is O(1) and there's an O(log n) path to the root
  Tree ends balanced
Worst-case complexity of buildTree: O(n log n)

Takes some more rotation action to handle delete

Pros and Cons of AVL Trees

Arguments for AVL trees:
  1. All operations logarithmic worst-case because trees are always balanced
  2. Height balancing adds no more than a constant factor to the speed of
     insert and delete

Arguments against AVL trees:
  1. More difficult to program and debug [but done once in a library!]
  2. More space for the height field
  3. Asymptotically faster but rebalancing takes a little time
  4. If amortized (later) logarithmic time is enough, use splay trees (also
     in the text)

A new ADT: Priority Queue

A priority queue holds comparable data
  Like dictionaries, we need to compare items
    Given x and y, is x less than, equal to, or greater than y
    Meaning of the ordering can depend on your data
  Integers are comparable, so we will use them in examples
    But the priority queue ADT is much more general
    Typically two fields, the priority and the data

A new ADT: Priority Queue

Each item has a "priority"
  In our examples, the lesser item is the one with the greater priority
  So priority 1 is more important than priority 4
  (Just a convention; think "first is best")

Operations:
  insert
  deleteMin
  is_empty

Key property: deleteMin returns and deletes the item with the greatest
priority (lowest priority value)
  Can resolve ties arbitrarily
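This ADT can be written down directly as a small Java interface. The sketch below uses hypothetical names (java.util ships its own PriorityQueue class; the interface here is just a sketch of the ADT, not that class), with compareTo providing the priority ordering:

```java
// Hypothetical min-priority-queue ADT matching the three operations above.
interface MinPriorityQueue<E extends Comparable<E>> {
    void insert(E item);   // add an item; its priority comes from compareTo
    E deleteMin();         // remove and return the most important (smallest) item
    boolean isEmpty();     // true if the queue holds no items
}
```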

A new ADT: Priority Queue


insert x1 with priority 5
insert x2 with priority 3
insert x3 with priority 4
a = deleteMin // x2
b = deleteMin // x3
insert x4 with priority 2
insert x5 with priority 6
c = deleteMin // x4
d = deleteMin // x1
Analogy: insert is like enqueue, deleteMin is like dequeue
But the whole point is to use priorities instead of FIFO


A new ADT: Priority Queue


Like all good ADTs, the priority queue arises often
Sometimes blatant, sometimes less obvious
Run multiple programs in the operating system
critical before interactive before compute-intensive
Maybe let users set priority level
Treat hospital patients in order of severity (or triage)
Select print jobs in order of decreasing length?
Forward network packets in order of urgency
Select most frequent symbols for data compression
Sort (first insert all, then repeatedly deleteMin)


A new ADT: Priority Queue

Will show an efficient, non-obvious data structure for this ADT
  But first let's analyze some obvious ideas for n data items
  All times worst-case; assume arrays have room

  data                    insert algorithm / time        deleteMin algorithm / time
  unsorted array          add at end           O(1)      search               O(n)
  unsorted linked list    add at front         O(1)      search               O(n)
  sorted circular array   search / shift       O(n)      move front           O(1)
  sorted linked list      put in right place   O(n)      remove at front      O(1)
  binary search tree      put in right place   O(n)      leftmost             O(n)
  AVL tree                put in right place   O(log n)  leftmost             O(log n)

Our data structure: the Binary Heap

A binary min-heap (or just binary heap, or just heap) has:
  Structure property: a complete binary tree
  Heap property: the priority of every (non-root) node is less important
  than the priority of its parent
Not a binary search tree!

[Figure: two example trees, one labeled "not a heap" and one labeled "a heap"; the heap has 10 at the root, children 20 and 80, and larger values below]

So:
  Where is the most important item?
  What is the height of a heap with n items?
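The slides treat the heap purely as a tree. A very common implementation choice (assumed in the sketches below, not stated on the slides) is to store the complete tree in a plain array, where the children of the node at index i sit at indices 2i+1 and 2i+2. A minimal skeleton:

```java
import java.util.ArrayList;

// Hedged sketch of a binary min-heap of ints backed by an array.
// The complete-tree structure property corresponds to the array having no gaps.
class BinaryMinHeap {
    private final ArrayList<Integer> a = new ArrayList<>();

    boolean isEmpty() { return a.isEmpty(); }

    // The most important item sits at the root, i.e. index 0.
    int findMin() { return a.get(0); }   // kept simple: throws if the heap is empty

    // Index arithmetic for the complete tree stored in the array (0-based).
    private static int parent(int i)     { return (i - 1) / 2; }
    private static int leftChild(int i)  { return 2 * i + 1; }
    private static int rightChild(int i) { return 2 * i + 2; }

    // insert and deleteMin are sketched after the corresponding slides below.
}
```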

Operations: basic idea

findMin: return root.data
deleteMin:
  1. answer = root.data
  2. Move the right-most node in the last row to the root to restore the
     structure property
  3. Percolate down to restore the heap property
insert:
  1. Put the new node in the next position on the bottom row to restore
     the structure property
  2. Percolate up to restore the heap property

[Figure: the example heap with 10 at the root, children 20 and 80, then 40, 60, 85, 99, with 700 and 50 on the bottom row]

Overall strategy:
  Preserve the structure property
  Break and restore the heap property

DeleteMin

Delete (and later return) the value at the root node

[Figure: an example min-heap with 1 at the root]

DeleteMin: Keep the Structure Property

We now have a "hole" at the root
  Need to fill the hole with another value
Keep the structure property: when we are done, the tree will have one
less node and must still be complete

Pick the last node on the bottom row of the tree and move it to the "hole"

[Figure: the last value on the bottom row is moved up into the hole at the root]

DeleteMin: Restore the Heap Property

Percolate down:
  Keep comparing the priority of the item with both children
  If the priority is less important (>) than either, swap with the most
  important (smaller) child and go down one level
  Done if both children are less important (>) than the item, or we've
  reached a leaf node

[Figure: the moved value percolates down, swapping with its smaller child at each level until the heap property holds]

Why is this correct?
What is the run time?
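A hedged sketch of deleteMin with percolate-down, written as methods of the hypothetical array-backed BinaryMinHeap skeleton shown earlier:

```java
// deleteMin: save the root, move the last item into the hole at the root,
// then percolate it down until the heap property is restored.
int deleteMin() {
    int answer = a.get(0);                      // 1. answer = root.data
    int last = a.remove(a.size() - 1);          // 2. last node on the bottom row...
    if (!a.isEmpty()) {
        a.set(0, last);                         //    ...moves into the hole at the root
        percolateDown(0);                       // 3. restore the heap property
    }
    return answer;
}

private void percolateDown(int i) {
    int n = a.size();
    while (leftChild(i) < n) {
        int child = leftChild(i);
        // pick the more important (smaller) of the two children
        if (rightChild(i) < n && a.get(rightChild(i)) < a.get(child)) {
            child = rightChild(i);
        }
        if (a.get(i) <= a.get(child)) break;    // both children less important: done
        int tmp = a.get(i);                     // otherwise swap and go down one level
        a.set(i, a.get(child));
        a.set(child, tmp);
        i = child;
    }
}
```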

DeleteMin: Run Time Analysis

Run time is O(height of heap)
A heap is a complete binary tree
Height of a complete binary tree of n nodes?
  height = ⌊log₂ n⌋
Run time of deleteMin is O(log n)

Insert

Add a value to the tree

Afterwards, the structure and heap properties must still be correct

[Figure: the example min-heap (root 1) before the new value is added]

Insert: Maintain the Structure Property

There is only one valid tree shape after we add one more node

So put our new data there and then focus on restoring the heap property

[Figure: the new node goes into the next open position on the bottom row]

Insert: Restore the heap property

Percolate up:
  Put the new data in the new location
  If the parent is less important (>), swap with the parent, and continue
  Done if the parent is more important (<) than the item, or we have
  reached the root

[Figure: the new value percolates up, swapping with its parent at each level until the heap property holds]

What is the running time?
  Like deleteMin, worst-case time is proportional to the tree height: O(log n)
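And the matching insert with percolate-up, again as methods of the hypothetical BinaryMinHeap skeleton:

```java
// insert: put the new value in the next position on the bottom row
// (the end of the array), then percolate it up toward the root.
void insert(int value) {
    a.add(value);                                    // the only valid spot for the new node
    percolateUp(a.size() - 1);                       // restore the heap property
}

private void percolateUp(int i) {
    while (i > 0 && a.get(parent(i)) > a.get(i)) {   // parent is less important: swap
        int tmp = a.get(parent(i));
        a.set(parent(i), a.get(i));
        a.set(i, tmp);
        i = parent(i);
    }
    // done: the parent is more important, or we have reached the root
}
```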

Summary

Priority Queue ADT:
  insert comparable object
  deleteMin

Binary heap data structure:
  Complete binary tree
  Each node has a less important priority value than its parent

[Figure: insert and deleteMin illustrated on the example heap rooted at 10]

insert and deleteMin operations = O(height-of-tree) = O(log n)
  insert: put at the new last position in the tree and percolate up
  deleteMin: remove the root, put the last element at the root, and
  percolate down
