
Advanced Analysis of

Algorithms
Chapter 1
Dr. M. Sikander Hayat Khiyal
Chairperson
Department of Computer Science/Software
Engineering,
Fatima Jinnah Women University,
Rawalpindi, PAKISTAN
m.sikandarhayat@yahoo.com
EVERY CASE TIME
COMPLEXITY
For a given algorithm, T(n) is defined as the
number of times the algorithm does its
basic operation for any instance of size n.
T(n) is called the every case time
complexity of the algorithm, and the
determination of T(n) is called an every
case time complexity analysis.
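The definition can be illustrated with a small Python sketch (the function `array_sum` and its operation counter are illustrative, not from the slides): summing n numbers does its basic operation exactly n times for every instance of size n, so T(n) = n.

```python
# Every-case example: summing n numbers always does exactly n additions,
# so T(n) = n for every instance of size n, whatever the values are.
def array_sum(S):
    total, ops = 0, 0
    for x in S:
        total += x        # the basic operation
        ops += 1
    return total, ops

_, ops_a = array_sum(list(range(100)))       # sorted input of size 100
_, ops_b = array_sum([5, -2, 9] * 33 + [0])  # arbitrary input, same size
print(ops_a, ops_b)  # 100 100
```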
WORST CASE TIME
COMPLEXITY
For a given algorithm, W(n) is defined as the
maximum number of times the algorithm will ever
do its basic operations for an input of size n.
W(n) is called the worst case time complexity of
the algorithm, and the determination of W(n) is
called the worst case time complexity analysis. If
T(n) exists then clearly:
W(n) = T(n)




AVERAGE CASE TIME
COMPLEXITY
For a given algorithm, A(n) is defined as the
average (or expected value) number of times the
algorithm does its basic operations for an input of
size n.
A(n) is called the average case time complexity of
the algorithm, and the determination of A(n) is
called the average case time complexity analysis.
If T(n) exists then clearly:
A(n) = T(n)




BEST CASE TIME
COMPLEXITY
For a given algorithm, B(n) is defined as the
minimum number of times the algorithm will ever
do its basic operations for an input of size n.
B(n) is called the best case time complexity of the
algorithm, and the determination of B(n) is called
the best case time complexity analysis. If T(n)
exists then clearly:
B(n) = T(n)
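The three definitions can be contrasted with a sketch of sequential search (illustrative code; the comparison with the search key is taken as the basic operation): the key at the first slot gives B(n) = 1, the key at the last slot gives W(n) = n, and when each slot is equally likely, A(n) = (n+1)/2.

```python
def seq_search_comparisons(S, x):
    """Count the key comparisons sequential search does to find x."""
    count = 0
    for item in S:
        count += 1            # basic operation: comparison with x
        if item == x:
            break
    return count

S = list(range(1, 11))        # n = 10 distinct keys
best = seq_search_comparisons(S, 1)      # B(n) = 1
worst = seq_search_comparisons(S, 10)    # W(n) = n
avg = sum(seq_search_comparisons(S, k) for k in S) / len(S)  # A(n) = (n+1)/2
print(best, worst, avg)  # 1 10 5.5
```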




ANALYSIS SUMMARY OF SORTING ALGORITHMS

Algorithm          Comparisons of keys              Assignments of records         Extra space usage
Exchange Sort      T(n) = n²/2                      W(n) = 3n²/2, A(n) = 3n²/4     In place
Insertion Sort     W(n) = n²/2, A(n) = n²/4         W(n) = n²/2, A(n) = n²/4       In place
Selection Sort     T(n) = n²/2                      T(n) = 3n                      In place
Merge Sort         W(n) = n lg n, A(n) = n lg n     T(n) = 2n lg n                 Θ(n) records
Merge Sort (D.P.)  W(n) = n lg n, A(n) = n lg n     T(n) = 0                       Θ(n) links
Quick Sort         W(n) = n²/2, A(n) = 1.38 n lg n  A(n) = 0.69 n lg n             Θ(lg n) indices
Heap Sort          W(n) = 2n lg n, A(n) = 2n lg n   W(n) = n lg n, A(n) = n lg n   In place
COMPUTATIONAL COMPLEXITY
Computational complexity is the study of
all possible algorithms that can solve a
given problem.
A computational complexity analysis tries
to determine a lower bound on the
efficiency of all algorithms for a given
problem. It is a field that runs hand in
hand with algorithm design and analysis.
LOWER BOUND FOR SORTING ALGORITHMS
Insertion sort:
In general, we are concerned with sorting n distinct
keys that come from any ordered set. However, without
loss of generality, we can assume that the keys to be
sorted are simply the positive integers 1, 2, …, n, because
we can substitute 1 for the smallest key, 2 for the 2nd
smallest key, and so on. For example, for the input
[Ralph, Clyde, Dave] we have [3, 1, 2], if 1 is associated
with Clyde, 2 with Dave and 3 with Ralph. Any algorithm
that sorts these integers only by comparison of keys
would have to do the same number of comparisons to
sort the three names.
A permutation of the first n positive integers can
be thought of as an ordering of these integers;
because there are n! permutations of the first n
positive integers, there are n! different orderings.
For example, for the first three positive integers
there are six permutations (orderings):
[1,2,3] [1,3,2] [2,1,3] [2,3,1] [3,1,2] [3,2,1]
This means that there are n! different inputs
containing n distinct keys. These six permutations
are the different inputs of size 3. A permutation is
denoted by [k1, k2, …, kn].
An inversion in a permutation is a pair (ki, kj)
such that i < j and ki > kj.

For example, the permutation [3,2,4,1,6,5] contains the
inversions (3,2), (3,1), (2,1), (4,1) and (6,5). Clearly a
permutation contains no inversions if and only if it is the
sorted ordering [1, 2, …, n]. This means that the task of
sorting n distinct keys is the removal of all inversions
in a permutation.
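The definition can be sketched directly in Python (illustrative helper, not part of the slides), listing the inverted value pairs of a permutation:

```python
def inversions(perm):
    """All pairs (perm[i], perm[j]) with i < j and perm[i] > perm[j]."""
    return [(perm[i], perm[j])
            for i in range(len(perm))
            for j in range(i + 1, len(perm))
            if perm[i] > perm[j]]

print(inversions([3, 2, 4, 1, 6, 5]))
# [(3, 2), (3, 1), (2, 1), (4, 1), (6, 5)]
print(inversions([1, 2, 3, 4]))  # [] -- the sorted order has no inversions
```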




Theorem 1:
Any algorithm that sorts n distinct keys only by
comparison of keys and removes at most one inversion after
each comparison must in the worst case do at least
n(n-1)/2 comparisons of keys
and on the average do at least
n(n-1)/4 comparisons of keys.

Proof:
For the worst case, the permutation [n, n-1, …, 2, 1] has
every pair inverted, so it contains n(n-1)/2 inversions
(the arithmetic sum 1 + 2 + … + (n-1)); therefore at least
n(n-1)/2 comparisons of keys are required.
For the average case, pair each permutation [kn, …, k2, k1]
with the permutation [k1, k2, …, kn]. The first is called the
transpose of the second.
Let r and s be integers between 1 and n such that s > r.
Given a permutation, the pair (s, r) is an inversion
in either the permutation or its transpose, and not in
both. There are n(n-1)/2 such pairs of integers between
1 and n. This means that a permutation and its transpose
together contain exactly n(n-1)/2 inversions. Thus the
average number of inversions in a permutation and its
transpose is
(1/2)(n(n-1)/2) = n(n-1)/4.
Therefore, if we consider all permutations equally
probable for the input, the average number of
inversions in the input is also n(n-1)/4. Because
we assumed that the algorithm removes at most
one inversion after each comparison, on the
average it must do at least this many comparisons
to remove all inversions and thereby sort the
input.
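The transpose-pairing step can be checked numerically (a small sketch; `inv_count` is an illustrative helper): every permutation and its transpose together contain exactly n(n-1)/2 inversions.

```python
from itertools import permutations

def inv_count(p):
    """Number of inversions in permutation p."""
    return sum(1 for i in range(len(p))
                 for j in range(i + 1, len(p)) if p[i] > p[j])

n = 4
for p in permutations(range(1, n + 1)):
    transpose = p[::-1]   # reversing flips the order of every value pair
    assert inv_count(p) + inv_count(transpose) == n * (n - 1) // 2
print("each permutation plus its transpose has", n * (n - 1) // 2, "inversions")
```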

Lower bounds for sorting only by comparison of keys
Decision tree for sorting algorithm:
We can associate a binary tree with a procedure that
sorts three keys a, b, c by placing the comparison of a
and b at the root, taking the left branch if a < b and the
right branch otherwise, and continuing likewise for the
subsequent comparisons.

[Fig. 1: Decision tree for sorting three keys a, b, c. The
root compares a < b (Yes/No branches); internal nodes
compare b < c and a < c, and the six leaves give the sorted
orderings a,b,c; a,c,b; c,a,b; b,a,c; b,c,a; c,b,a.]
This tree is called a decision tree, because at each
node a decision must be made as to which node to
visit next.
A decision tree is called valid for sorting n keys
if, for each permutation of the n keys, there is a path
from the root to a leaf that sorts that permutation;
that is, it can sort every input of size n. The above
tree is valid, but it would no longer be valid if we
removed any branch from it.
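The tree of Fig. 1 can be written directly as nested comparisons (an illustrative sketch, assuming three distinct keys); every root-to-leaf path does at most three comparisons, and each of the 3! = 6 inputs reaches the leaf with its sorted ordering.

```python
from itertools import permutations

def sort_three(a, b, c):
    """Sort three distinct keys with the decision tree of Fig. 1."""
    if a < b:
        if b < c:
            return [a, b, c]
        elif a < c:
            return [a, c, b]
        else:
            return [c, a, b]
    else:
        if b < c:
            if a < c:
                return [b, a, c]
            else:
                return [b, c, a]
        else:
            return [c, b, a]

# Every input permutation of size 3 is sorted correctly.
assert all(sort_three(*p) == [1, 2, 3] for p in permutations([1, 2, 3]))
```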
Now draw a decision tree for exchange sort
when sorting three keys.
[Fig. 2: Decision tree for exchange sort when sorting
three keys. The root compares b < a; further comparisons
(c < b, c < a, a < b) lead to leaves giving the six
orderings.]

A decision tree is pruned if every leaf can be reached
from the root by making a consistent sequence of
decisions. Note that for exchange sort the comparison
c < b means that the current value of c is compared with
the current value of b, not S[3] with S[2].

LEMMA 1:
To every deterministic algorithm for sorting n distinct
keys there corresponds a pruned, valid, binary decision
tree containing exactly n! leaves.
PROOF: There is a pruned, valid decision tree
corresponding to any deterministic algorithm for sorting
n distinct keys. When all the keys are distinct, the result
of a comparison is always < or >. Therefore, each node in
that tree has at most two children, which means that it is
a binary tree. Because there are n! different inputs that
contain n distinct keys, and because a decision tree is
valid for sorting n distinct keys only if it has a leaf for
every input, the tree has at least n! leaves. Because there
is a unique path in the tree for each of the n! different
inputs, and because in a pruned decision tree every leaf
must be reachable, the tree can have no more than n!
leaves. Therefore, the tree has exactly n! leaves.
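Lemma 1 can be checked empirically for a concrete algorithm (an illustrative sketch using insertion sort; recording each comparison outcome traces the root-to-leaf path in the corresponding decision tree): across all n! input permutations, the n! outcome sequences are distinct, so there is one leaf per input.

```python
from itertools import permutations

def insertion_sort_path(keys):
    """Sort keys by insertion sort, recording comparison outcomes.
    The tuple of outcomes is the path taken in the decision tree."""
    s, path = list(keys), []
    for i in range(1, len(s)):
        x, j = s[i], i - 1
        while j >= 0:
            path.append(x < s[j])     # outcome of one key comparison
            if x < s[j]:
                s[j + 1] = s[j]       # shift the larger key right
                j -= 1
            else:
                break
        s[j + 1] = x
    return tuple(path)

n = 4
paths = {insertion_sort_path(p) for p in permutations(range(n))}
print(len(paths))  # 24 distinct paths = 4! leaves
```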

LOWER BOUND FOR WORST CASE BEHAVIOUR
LEMMA 2: The worst case number of comparisons done
by a decision tree is equal to its depth.
PROOF: Given some input, the number of comparisons
done by a decision tree is the number of internal nodes
on the path followed for that input. The number of
internal nodes is the same as the length of the path.
Therefore, the worst case number of comparisons done
by a decision tree is the length of the longest path to a
leaf, which is the depth of the decision tree.
LEMMA 3: If m is the number of leaves in a binary tree
and d is the depth, then d ≥ ⌈lg m⌉.
PROOF: By induction, we show that 2^d ≥ m.
Induction base: A binary tree with depth 0 has one
node that is both the root and the only leaf. Therefore,
for such a tree, the number of leaves m equals 1, and
2^0 ≥ 1.
Induction hypothesis: Assume that for every binary tree
with depth d, 2^d ≥ m,
where m is the number of leaves.
Induction step: We need to show that, for a binary tree
with depth d+1,
2^(d+1) ≥ m,

where m is the number of leaves.
If we remove all the leaves from such a tree, we have a
tree with depth d whose leaves are the parents of the
leaves in the original tree. If m' is the number of these
parents, then by the induction hypothesis
2^d ≥ m'.
Because each parent can have at most two children,
2m' ≥ m.
Thus,
2^(d+1) ≥ 2m' ≥ m. Proof completed.
Now taking lg of 2^d ≥ m gives d ≥ lg m; because d is an
integer, d ≥ ⌈lg m⌉.

THEOREM 2: Any deterministic algorithm that sorts n
distinct keys only by comparisons of keys must in the
worst case do at least ⌈lg(n!)⌉ comparisons of keys.
PROOF: By Lemma 1, any pruned, valid binary decision
tree for sorting n distinct keys has n! leaves, and by
Lemma 3 its depth d ≥ ⌈lg(n!)⌉. The theorem then follows
by Lemma 2, since the worst case number of comparisons
done by any decision tree is given by its depth.
LEMMA 4: For any positive integer n,
lg(n!) ≥ n lg n − 1.45n.
PROOF:
lg(n!) = lg[n(n-1)(n-2)⋯(2)(1)] = Σ_{i=2}^{n} lg i     (since lg 1 = 0)
Bounding the sum from below by an integral,
lg(n!) ≥ ∫_{1}^{n} lg x dx = (1/ln 2)[n ln n − n + 1]
       ≥ n lg n − 1.45n. Proof completed.
(The constant comes from 1/ln 2 ≈ 1.4427.)

THEOREM 3: Any deterministic algorithm that sorts n
distinct keys only by comparison of keys must in the
worst case do at least
⌈n lg n − 1.45n⌉ comparisons of keys.

PROOF: By Theorem 2 and Lemma 4.
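A quick numeric check of Lemma 4's bound (illustrative sketch): lg(n!) computed as a sum of logarithms stays above n lg n − 1.45n.

```python
import math

# Lemma 4 sanity check: lg(n!) >= n lg n - 1.45 n for several n.
for n in (8, 64, 1024):
    lg_fact = sum(math.log2(i) for i in range(2, n + 1))  # lg(n!)
    bound = n * math.log2(n) - 1.45 * n
    assert lg_fact >= bound
    print(n, round(lg_fact, 1), round(bound, 1))
```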
LOWER BOUNDS FOR AVERAGE CASE BEHAVIOUR
Definition: A binary tree in which every nonleaf
contains exactly two children is called a 2-tree.
LEMMA 5: To every pruned, valid binary decision tree
for sorting n distinct keys, there corresponds a pruned,
valid decision 2-tree that is at least as efficient as the
original tree.
PROOF: If the pruned, valid binary decision tree
corresponding to a deterministic algorithm for sorting
n distinct keys contains any comparison nodes with only
one child, we can replace each such node by its child and
prune the child to obtain a decision tree that sorts using
no more comparisons than did the original (Fig. 2). Every
nonleaf in the new tree will contain exactly two children.
Definition: The external path length (EPL) of a tree is the
total length of all paths from the root to the leaves. The
EPL of Figure 1 is
EPL = 2+3+3+3+3+2 = 16.
The EPL is the total number of comparisons the decision
tree does over all n! different inputs of size n. Therefore,
the average number of comparisons is EPL/n!.
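The EPL computation for Fig. 1 can be sketched as follows (the nested-tuple encoding of the tree is an assumption of this sketch, not the slides' notation):

```python
# The Fig. 1 decision tree as nested tuples (left, right) for comparison
# nodes, with strings for leaves.
fig1 = (("abc", ("acb", "cab")),          # a < b side
        (("bac", "bca"), "cba"))          # b < a side

def epl(node, depth=0):
    """External path length: sum of root-to-leaf path lengths."""
    if isinstance(node, str):             # a leaf contributes its depth
        return depth
    left, right = node
    return epl(left, depth + 1) + epl(right, depth + 1)

print(epl(fig1))  # 2+3+3+3+3+2 = 16
```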
LEMMA 6: Any deterministic algorithm that sorts n
distinct keys only by comparison of keys must on the
average do at least
(min EPL(n!)) / n!
comparisons of keys, where min EPL(m) denotes the
minimum EPL over all 2-trees with m leaves.
PROOF: By Lemma 1, to every deterministic algorithm for
sorting n distinct keys there corresponds a pruned, valid
binary decision tree containing n! leaves.
By Lemma 5, we can convert this tree to a 2-tree that is
at least as efficient. Because the original tree has n!
leaves, so must the 2-tree we obtain from it. Hence
proved.
LEMMA 7: Any 2-tree that has m leaves and whose
EPL equals min EPL(m) must have all of its leaves on
the bottom two levels.
PROOF: Suppose that some 2-tree does not have all its
leaves on the bottom two levels. Let d be the depth of
the tree, let A be a leaf in the tree that is not on one of
the bottom two levels, and let k be the depth of A.
Because nodes at the bottom level have depth d, k ≤ d−2.
We now show that the tree cannot minimize the EPL
among trees with the same number of leaves, by
constructing a 2-tree with the same number of leaves
and a lower EPL. Choose a nonleaf B at level d−1 in the
original tree, remove its two children, and give two
children to A. Clearly the new tree has the same number
of leaves as the original tree.
[Figure: The original 2-tree with m leaves (leaf A at
level k, nonleaf B at level d−1, B's children at level d),
and the new 2-tree with m leaves and decreased EPL
(A gains two children at level k+1, B becomes a leaf at
level d−1).]
In the new tree neither A nor the children of B are leaves,
but they are leaves in the old tree. Therefore, we have
decreased the EPL by the length of the path to A and by
the lengths of the two paths to B's children. That is, we
have decreased the EPL by
k + d + d = k + 2d.
In the new tree, B and the two new children of A are
leaves, but they are not leaves in the old tree. Therefore,
we have increased the EPL by the length of the path to B
and the lengths of the two paths to A's new children.
That is, we have increased the EPL by
(d−1) + (k+1) + (k+1) = d + 2k + 1.
Because k ≤ d−2, the net change in the EPL is
(d + 2k + 1) − (k + 2d) = k − d + 1 ≤ (d−2) − d + 1 = −1.
As the net change in the EPL is negative, the new tree
has a smaller EPL. Thus the old tree cannot minimize the
EPL among trees with the same number of leaves.
LEMMA 8: Any 2-tree that has m leaves and whose EPL
equals min EPL(m) must have
2^d − m leaves at level d−1 and
2m − 2^d leaves at level d,
and have no other leaves, where d is the depth of the tree.
PROOF: By Lemma 7, all leaves are on the bottom two
levels, and because nonleaves in a 2-tree must have two
children, there must be 2^(d−1) nodes at level d−1.
Therefore, if r is the number of leaves at level d−1, the
number of nonleaves at that level is 2^(d−1) − r. Because
nonleaves in a 2-tree have exactly two children, for every
nonleaf at level d−1 there are two leaves at level d.
Because there are only leaves at level d, the number of
leaves at level d equals 2(2^(d−1) − r). Because Lemma 7
says that all leaves are at level d or d−1,
r + 2(2^(d−1) − r) = m,
which gives
r = 2^d − m leaves at level d−1.
Therefore, the number of leaves at level d is
m − r = m − 2^d + m = 2m − 2^d.
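Lemma 8's leaf counts can be sanity-checked numerically (illustrative sketch): for every m, the counts at the two bottom levels are nonnegative and add up to m.

```python
import math

# Lemma 8 leaf counts for a minimizing 2-tree with m leaves:
# 2**d - m leaves at level d-1 and 2*m - 2**d at level d, d = ceil(lg m).
for m in range(2, 33):
    d = math.ceil(math.log2(m))
    at_d_minus_1 = 2 ** d - m
    at_d = 2 * m - 2 ** d
    assert at_d_minus_1 >= 0 and at_d >= 1
    assert at_d_minus_1 + at_d == m
print("leaf counts consistent for m = 2..32")
```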



LEMMA 9: For any 2-tree that has m leaves and
whose EPL equals min EPL(m), the depth d is given
by d = ⌈lg m⌉.
PROOF: Consider the case that m is a power of 2;
then m = 2^k for some integer k. Let d be the depth of a
minimizing tree. By Lemma 8, if r is the number of
leaves at level d−1,
r = 2^d − m = 2^d − 2^k.
Because r ≥ 0, we must have d ≥ k. Assuming that d > k
leads to a contradiction:
If d > k, then
r = 2^d − 2^k ≥ 2^(k+1) − 2^k = 2^k(2−1) = 2^k = m.
Because r ≤ m, this means r = m, and all leaves are at
level d−1. But there must be some leaves at level d;
this contradiction implies that d = k, which means r = 0.
Thus,
2^d − m = 0, so 2^d = m and d = lg m.
Because ⌈lg m⌉ = lg m when m is a power of 2,
d = ⌈lg m⌉. (A similar argument handles m not a power
of 2.) Proof completed.

LEMMA 10:
For all integers m > 1, min EPL(m) ≥ m⌊lg m⌋.
PROOF: By Lemma 8, any 2-tree that minimizes the EPL
must have 2^d − m leaves at level d−1, have 2m − 2^d
leaves at level d, and have no other leaves. Therefore,
min EPL(m) = (2^d − m)(d−1) + (2m − 2^d)d
           = md + m − 2^d.
By Lemma 9, d = ⌈lg m⌉, so
min EPL(m) = m⌈lg m⌉ + m − 2^⌈lg m⌉.
If m is a power of 2, ⌈lg m⌉ = ⌊lg m⌋ = lg m and
2^⌈lg m⌉ = m, so min EPL(m) = m lg m = m⌊lg m⌋.
If m is not a power of 2, ⌈lg m⌉ = ⌊lg m⌋ + 1, so
min EPL(m) = m(⌊lg m⌋ + 1) + m − 2^⌈lg m⌉
           = m⌊lg m⌋ + 2m − 2^⌈lg m⌉
           > m⌊lg m⌋
because 2m > 2^⌈lg m⌉.
Therefore min EPL(m) ≥ m⌊lg m⌋.
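Combining Lemmas 8 and 9, min EPL(m) = m⌈lg m⌉ + m − 2^⌈lg m⌉, which can be checked against the bound of Lemma 10 (illustrative sketch):

```python
import math

def min_epl(m):
    """min EPL(m) for a 2-tree with m leaves (Lemmas 8-9):
    m*d + m - 2**d with d = ceil(lg m)."""
    d = math.ceil(math.log2(m))
    return m * d + m - 2 ** d

# Lemma 10: min EPL(m) >= m * floor(lg m).
for m in range(2, 65):
    assert min_epl(m) >= m * math.floor(math.log2(m))

print(min_epl(6))  # 16, matching the EPL of the Fig. 1 tree (6 leaves)
```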
THEOREM 4: Any deterministic algorithm that sorts n
distinct keys only by comparisons of keys must on the
average do at least approximately
n lg n − 1.45n comparisons of keys.
PROOF: By Lemma 6, any such algorithm must on the
average do at least
min EPL(n!) / n! comparisons of keys.
By Lemma 10, this is greater than or equal to
n!⌊lg(n!)⌋ / n! = ⌊lg(n!)⌋ ≈ lg(n!).
By Lemma 4,
lg(n!) ≥ n lg n − 1.45n.
Proved.

Lower bounds for searching only by comparison of
keys
The problem of searching for a key can be
described as follows: given an array S containing
n keys and a key x, find an index i such that x = S[i]
if x equals one of the keys; if x does not equal one
of the keys, report failure.
For a lower bound, we can associate a decision tree
with every deterministic algorithm that searches
for a key x in an array of n keys. Figure 1 shows a
decision tree corresponding to binary search when
searching seven keys, and Figure 2 shows a decision
tree corresponding to the sequential search algorithm. In
these trees, each large node represents a comparison of
an array item with the search key x, and each small node
(leaf) contains a result that is reported. When x is in the
array, we report an index of the item that equals x, and
when x is not in the array, we report an F for failure. In
the figures we write S[1] = S1, S[2] = S2, …, S[7] = S7.
Each leaf in a decision tree for searching n keys for a
key x represents a point at which the algorithm stops
and reports either an index i such that x = Si or failure.
Every internal node represents a comparison.


[Figure 1: The decision tree corresponding to binary
search when searching seven keys. The root compares x
with S4; each internal node compares x with some Si and
branches on <, =, >; the leaves report an index 1-7 or F
for failure.]
[Figure 2: The decision tree corresponding to sequential
search when searching seven keys: a chain of comparisons
x = S1, x = S2, …, x = S7, with leaves reporting 1-7 or F.]
A decision tree is called valid for searching n keys for a
key x if, for each possible outcome, there is a path from
the root to a leaf that reports that outcome. That is, there
must be a path for x = Si for each 1 ≤ i ≤ n and a path
that leads to failure.
A decision tree is called pruned if every leaf is
reachable. Every algorithm that searches for a key x in
an array of n keys has a corresponding pruned, valid
decision tree.


Lower bounds for worst case behavior

LEMMA 11: If n is the number of nodes in a binary
tree and d is the depth, then
d ≥ ⌊lg n⌋.
PROOF: We have
n ≤ 1 + 2 + 2² + 2³ + … + 2^d
because there can be only one root, at most two nodes
with depth 1, at most 2² nodes with depth 2, …, and at
most 2^d nodes with depth d. Applying the geometric
series,
n ≤ 2^(d+1) − 1,
which means that n < 2^(d+1), so lg n < d + 1; because
d is an integer, d ≥ ⌊lg n⌋.

LEMMA 12: To be a pruned, valid decision tree
for searching n distinct keys for a key x, the
binary tree consisting of the comparison
nodes must contain at least n nodes.
PROOF: Let Si for i = 1, 2, …, n be the values of the n
keys. First we show that every Si must be in at least one
comparison node. Suppose that for some i this is not
the case. Take two inputs that are identical for all keys
except the ith key, and different for the ith key. Let
x have the value of Si in one of the inputs. Because Si is
not involved in any comparison and all the other keys
are the same in both inputs, the decision tree must
behave the same for both inputs. However, it must
report i for one of the inputs and must not report i for
the other. This contradiction shows that every Si must
be in at least one comparison node.
Because every Si must be in at least one comparison
node, the only way we could have fewer than n
comparison nodes would be to have at least one key Si
involved only in comparisons with other keys; that is,
one Si that is never compared with x. Suppose we do
have such a key.
Take two inputs that are equal everywhere except for Si,
with Si being the smallest key in both inputs. Let x be
the ith key in one of the inputs. A comparison at a node
containing Si must go in the same direction for both
inputs, and all other keys are the same in both inputs.
Therefore the decision tree must behave the same for
the two inputs. However, it must report i for one of
them and must not report i for the other. This
contradiction proves the lemma.
THEOREM 5: Any deterministic algorithm that
searches for a key x in an array of n distinct keys only
by comparison of keys must in the worst case do at least
⌊lg n⌋ + 1 comparisons of keys.
PROOF: Corresponding to the algorithm, there is a
pruned, valid decision tree for searching n distinct keys
for a key x. The worst case number of comparisons is
the number of nodes in the longest path from the root to
a leaf in the binary tree consisting of the comparison
nodes in that decision tree. This number is the depth of
that binary tree plus 1. Lemma 12 says that this binary
tree has at least n nodes. Therefore, by Lemma 11, its
depth is greater than or equal to ⌊lg n⌋. This proves the
theorem.
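Theorem 5's bound is matched by binary search, which can be checked by counting comparisons (an illustrative sketch; each three-way probe of an array item counts as one comparison, as in the decision tree of Figure 1):

```python
import math

def binary_search_comparisons(S, x):
    """Comparisons of x with array items until x is found (or not)."""
    low, high, count = 0, len(S) - 1, 0
    while low <= high:
        mid = (low + high) // 2
        count += 1                    # one comparison node visited
        if x == S[mid]:
            return count
        elif x < S[mid]:
            high = mid - 1
        else:
            low = mid + 1
    return count                      # failure path

n = 100
S = list(range(n))
worst = max(binary_search_comparisons(S, x) for x in S)
print(worst, math.floor(math.log2(n)) + 1)  # 7 7
```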
Lower bounds for average case behavior
A binary tree of depth d is called a nearly complete
binary tree if it is complete down to depth d−1. Every
essentially complete binary tree is nearly complete, but
not every nearly complete binary tree is essentially
complete.
Fig(a): An Essentially Complete
Binary Tree
Fig(b): A Nearly Complete Binary Tree
but not Essentially Complete BT.
LEMMA 13: The tree consisting of the comparison
nodes in the pruned, valid decision tree corresponding to
binary search is a nearly complete binary tree.
LEMMA 14: The total node distance (TND) of a binary
tree containing n nodes is equal to min TND(n) if and
only if the tree is nearly complete.
PROOF: First we show that if a tree's TND equals
min TND(n), the tree is nearly complete. Suppose that
some binary tree is not nearly complete. Then there
must be some node B, not at one of the bottom two
levels, that has at most one child. We can remove any
node A from the bottom level and make it a child of B.
The resulting tree is a binary tree containing n nodes.
The number of nodes in the path to A in that tree is at
least 1 less than the number of nodes in the path to A in
the original tree, while the number of nodes in the path
to every other node is the same. Therefore, we have
created a binary tree containing n nodes with a TND
smaller than that of our original tree, which means that
our original tree did not have the minimum TND.
Second, the TND is the same for all nearly complete
binary trees containing n nodes. Therefore, every such
tree must have the minimum TND.
LEMMA 15: Suppose that we are searching n keys, the
search key x is in the array, and all array slots are
equally probable. Then the average case time
complexity for binary search is given by
min TND(n)/n
PROOF: The proof follows from lemma 13 and 14.
LEMMA 16: If we assume that x is in the array and that
all array slots are equally probable, the average case
time complexity of any deterministic algorithm that
searches for a key x in an array of n distinct keys is
bounded below by
min TND(n)/n
PROOF: By Lemma 12, every array item Si must be
compared with x at least once in the decision tree
corresponding to the algorithm. Let Ci be the number of
nodes in the shortest path to a node containing a
comparison of Si with x. Because each key has the same
probability 1/n of being the search key x, a lower bound
on the average case time complexity is given by
C1(1/n) + C2(1/n) + … + Cn(1/n) = (1/n) Σ_{i=1}^{n} Ci.
Because the Ci are node distances in a binary tree with
at least n nodes,
Σ_{i=1}^{n} Ci ≥ min TND(n),
which gives the bound min TND(n)/n.
THEOREM 6:
Among deterministic algorithms that search for a key x
in an array of n distinct keys only by comparison of
keys, binary search is optimal in its average case
performance if we assume that x is in the array and that
all array slots are equally probable. Therefore, under
these assumptions, any such algorithm must on the
average do at least approximately lg n − 1 comparisons
of keys.
PROOF:
The proof follows from Lemmas 15 and 16.
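Under the theorem's assumptions, the measured average comparison count of binary search can be compared with min TND(n)/n (an illustrative sketch; `min_tnd` assumes the nearly complete tree of Lemmas 13-14, in which the kth node in level order sits at depth ⌊lg k⌋):

```python
import math

def bs_count(S, x):
    """Number of comparisons binary search does to find x in S."""
    low, high, count = 0, len(S) - 1, 0
    while low <= high:
        mid = (low + high) // 2
        count += 1
        if x == S[mid]:
            return count
        elif x < S[mid]:
            high = mid - 1
        else:
            low = mid + 1
    return count

def min_tnd(n):
    # In a nearly complete binary tree, node k (level order) has depth
    # floor(lg k), so the path to it contains floor(lg k) + 1 nodes.
    return sum(math.floor(math.log2(k)) + 1 for k in range(1, n + 1))

n = 128
S = list(range(n))
avg = sum(bs_count(S, x) for x in S) / n
print(avg, min_tnd(n) / n)  # both 6.0703125, close to lg 128 - 1 = 6
```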
