
MATH 482 SESSION NOTES

Session 12
Quicksort
Quicksort tends to beat other sorting routines in practice, especially when we soup
it up with various modifications. Randomizing the choice of pivot element gives an
expected time of O(n lg n), the same as for Merge Sort. Another modification is to break
away from recursion when the array passed is small enough; it turns out to be faster to use
Insertion Sort to solve these smaller subproblems.
The key subroutine of Merge Sort was the Merge routine, and the key subroutine
for Heapsort was Max-Heapify. For Quicksort, the key routine is called Partition.
There are two common versions of Partition, namely, the original one due to Hoare
(1961) and another one due to Lomuto. The Lomuto version makes the mathematical
analysis easier, at least the analysis that we will do when we need to use a partition
routine. However, for completeness we supply both.
Hoare-Partition(A, p, r)
    x ← A[p]
    i ← p - 1
    j ← r + 1
    while TRUE
        repeat j ← j - 1
            until A[j] ≤ x
        repeat i ← i + 1
            until A[i] ≥ x
        if i < j
            then exchange A[i] ↔ A[j]
            else return j
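As a sanity check, the routine above can be transcribed into Python. This is a sketch; the function name and the use of 0-based inclusive indices are mine, not from the text:

```python
def hoare_partition(a, p, r):
    """Hoare's partition on a[p..r] (inclusive), pivoting on a[p].

    Returns an index q with p <= q < r such that every element of
    a[p..q] is <= every element of a[q+1..r].
    """
    x = a[p]                  # pivot
    i, j = p - 1, r + 1
    while True:
        j -= 1                # repeat j <- j - 1 until a[j] <= x
        while a[j] > x:
            j -= 1
        i += 1                # repeat i <- i + 1 until a[i] >= x
        while a[i] < x:
            i += 1
        if i < j:
            a[i], a[j] = a[j], a[i]
        else:
            return j
```

Note that the pivot itself need not end up at the returned index; Hoare's routine only guarantees that the two sides straddle it.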
Lomuto-Partition(A, p, r)
    x ← A[r]
    i ← p - 1
    for j ← p to r - 1
        if A[j] ≤ x
            then i ← i + 1
                 exchange A[i] ↔ A[j]
    exchange A[i+1] ↔ A[r]
    return i + 1
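A Python transcription of Lomuto's version (again a sketch with illustrative names):

```python
def lomuto_partition(a, p, r):
    """Lomuto's partition on a[p..r] (inclusive), pivoting on a[r].

    Returns the final index of the pivot: when the elements are distinct,
    everything before that index is < x and everything after it is > x.
    """
    x = a[r]                          # pivot
    i = p - 1                         # right edge of the "<= x" region
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]   # drop the pivot between the regions
    return i + 1
```

On the text's example, `lomuto_partition([2, 8, 7, 1, 3, 5, 6, 4], 0, 7)` rearranges the array to (2, 1, 3, 4, 7, 5, 6, 8) and returns index 3.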

page 1

2/22/10

In either version of Partition, we refer to x as the pivot. In Lomuto, if the numbers
are distinct, then the final result will be a rearrangement of the numbers in the array so
that the numbers smaller than x come before x, the next number (in order) is x, and
finally the numbers greater than x come after x. For example, if our array has elements in
the order (2, 8, 7, 1, 3, 5, 6, 4), then the output (of Lomuto) will be (2, 1, 3, 4, 7, 5, 6, 8).
We often use the visual device of writing the output on three lines, as follows:
          7  5  6  8
      4
2  1  3

Because of this, the portion of the array to the left of the pivot (in this case, 4) is often
called the low side, and the portion to the right of the pivot is called the high side.
Careful reading of the pseudocode shows that at the beginning of each iteration of the
for loop (equivalently, at the end of each iteration), at any time during execution of the
routine, elements in the array with index less than or equal to i are ≤ x, and elements with
index greater than i but less than j are > x. The text refers to this constant relationship as
a loop invariant. (It is vacuously true at the start, when i = p - 1 and j = p.)
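The invariant can be checked mechanically. Here is a sketch of Lomuto's routine instrumented with assertions; the assertions are mine, added for illustration:

```python
def lomuto_partition_checked(a, p, r):
    # Lomuto's partition with the loop invariant asserted at the top of
    # every iteration: a[p..i] <= x and a[i+1..j-1] > x.
    x = a[r]
    i = p - 1
    for j in range(p, r):
        assert all(a[k] <= x for k in range(p, i + 1))  # indices <= i
        assert all(a[k] > x for k in range(i + 1, j))   # indices in (i, j)
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1
```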
Quicksort uses successive (recursively called) partitions to sort an array of numbers.
Viz.:

Quicksort(A, p, r)
    if p < r
        then q ← Partition(A, p, r)
             Quicksort(A, p, q - 1)
             Quicksort(A, q + 1, r)
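In Python, with the Lomuto version of Partition written out (a sketch; names are illustrative):

```python
def partition(a, p, r):
    # Lomuto partition: pivot on x = a[r], return its final index.
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1

def quicksort(a, p, r):
    # Sort a[p..r] in place by partitioning and recursing on each side.
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)
```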

Now try this: If Hoare is used in Quicksort, and we pivot on the last element instead of
the first, it is possible for the algorithm to fail. Give an example of this.
Now try this: If Quicksort (as just described) is applied to an already sorted (in
increasing order) list of 1,000 distinct numbers, exactly how many comparisons are
made? Exactly how many comparisons would Insertion Sort make on the same list of
1,000 numbers?
The answers are (999 · 1000)/2 = 499,500 and 999, respectively.
In worst-case partitioning, that is, when either the low side or the high side is
empty, the operation of the algorithm can be described by the recurrence

    T(n) = T(n - 1) + T(0) + Θ(n).

This recurrence has solution Θ(n²).
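Both exercise answers and the quadratic worst case can be confirmed empirically. Below is a sketch that counts the comparisons A[j] ≤ x made by Lomuto-based Quicksort; it uses an explicit stack rather than recursion so that a sorted input of 1,000 elements does not overflow Python's recursion limit (the function name is mine):

```python
def quicksort_comparisons(a):
    """Count the comparisons a[j] <= x made while quicksorting a copy of a."""
    a = a[:]
    count = 0
    stack = [(0, len(a) - 1)]          # subarrays still to be sorted
    while stack:
        p, r = stack.pop()
        if p < r:
            x = a[r]                   # Lomuto partition, inlined
            i = p - 1
            for j in range(p, r):
                count += 1             # the comparison a[j] <= x
                if a[j] <= x:
                    i += 1
                    a[i], a[j] = a[j], a[i]
            a[i + 1], a[r] = a[r], a[i + 1]
            stack.append((p, i))       # low side
            stack.append((i + 2, r))   # high side
    return count
```

On the sorted list 1, 2, ..., 1000 every high side is empty, so this returns 999 + 998 + ... + 1 = 499,500, matching the answer above.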


It is an interesting fact, which can be verified through appeal to a recursion tree, that
if the ratio of the number of elements in the low side to the number of elements in the
high side is bounded away from zero, and bounded above, then the recurrence has
solution Θ(n lg n). In fact, it can be shown using probabilistic reasoning that if the pivot
at every step is chosen randomly (from all the elements in the array), then the expected
time for Quicksort is also optimal, namely, Θ(n lg n). Randomization has an added
bonus: if this technique is used, then no particular input will elicit, repeatedly, the worst
case. In this light one might reconsider the problem immediately above.
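A sketch of the randomized variant in Python, choosing the pivot uniformly at random and swapping it into the position Lomuto expects (names are illustrative):

```python
import random

def partition(a, p, r):
    # Lomuto partition around x = a[r].
    x = a[r]
    i = p - 1
    for j in range(p, r):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[i + 1], a[r] = a[r], a[i + 1]
    return i + 1

def randomized_quicksort(a, p, r):
    # No fixed input can repeatedly force the worst case, because the
    # pivot index is drawn uniformly from [p, r] on every call.
    if p < r:
        k = random.randint(p, r)
        a[k], a[r] = a[r], a[k]     # move the random pivot to the end
        q = partition(a, p, r)
        randomized_quicksort(a, p, q - 1)
        randomized_quicksort(a, q + 1, r)
```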
One may well ask if there is an efficient way for determining the pivot, so that
neither the low side nor the high side will be too small. For instance, if the pivot is
chosen to be the median of the array, then we will be guaranteed to have practically equal
sides. This nice split will give us worst-case time of O(n lg n). But most (elementary)
books instruct students to find the median by first sorting all the numbers in the list, then
selecting the middle number, if there is one, or returning the average of the two middle
numbers, if the list has an even number of numbers in it. Since we are trying to sort the
numbers in the first place, this type of thinking looks like a dead end. But it is not. Next
session we will discuss a deviously clever algorithm for determining the median in linear
time.
We conclude this session with a sketch of the sensational proof given in the text
of the O( n lg n ) -expected time of Quicksort. The proof is amazing because it does not
follow the execution of the algorithm, recursive call by recursive call, and figure out on
average, how much work is spent between calls. This staging method was used in the
balls and bins proof we considered earlier. But it is possible to take the time, or
evolutionary, aspect of the algorithm completely out of the analysis, as we shall see
below. I mention in passing that this approach presages the use of potential functions to
calculate efficiency in more complicated algorithms, like the Relabel-To-Front algorithm
for computing maximum flow in a flow network.
As the text points out, the work of Quicksort is dominated by what happens in the
Partition procedure. For each call to Partition, a pivot is returned, and this pivot never
appears again in any call to Quicksort or Partition (Lomuto's version makes this analysis
much easier than Hoare's version). So there are at most n calls to Partition. Also, the
work done by the Partition procedure is proportional to the total number of comparisons
performed throughout the Quicksort algorithm.
If the elements in the array were in sorted order, smallest to largest, then they
could be written as (z_1, z_2, z_3, ..., z_n). Define Z_ij = {z_i, z_{i+1}, ..., z_j}, the
elements in the sorted list from z_i to z_j.
Now comes the crux. When does the Quicksort algorithm compare z_i to z_j?
Well, if any other element in Z_ij is chosen as a pivot before z_i or z_j, then one of these
two lands in the low side produced by the Partition procedure, and the other lands in the
high side. These two disjoint sides are handled separately by Quicksort, and these two
elements are therefore never compared. Consider then the indicator random variable


    X_ij = 1 if z_i is compared to z_j, and 0 otherwise.

What is Pr{X_ij = 1}? The answer is not hard to find. We are simply asking what the
probability is of choosing z_i or z_j before any other elements in the Z_ij block. Pivots
are assumed to be chosen randomly, so the answer is 2/(j - i + 1). We have used the fact
that there are a total of j - i + 1 elements in the block.
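The 2/(j - i + 1) formula can be checked by simulation. The sketch below runs a randomized, instrumented Quicksort on a shuffled list of 1..n and records which pairs of values the partition step compares (all names here are mine, for illustration):

```python
import random

def quicksort_recording(a, p, r, compared):
    # Randomized Lomuto Quicksort that records, as frozensets of values,
    # every pair compared during partitioning.
    if p < r:
        k = random.randint(p, r)
        a[k], a[r] = a[r], a[k]
        x = a[r]
        i = p - 1
        for j in range(p, r):
            compared.add(frozenset((x, a[j])))   # a[j] is compared with pivot x
            if a[j] <= x:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[i + 1], a[r] = a[r], a[i + 1]
        quicksort_recording(a, p, i, compared)
        quicksort_recording(a, i + 2, r, compared)

def estimate_compare_probability(n, zi, zj, trials=5000):
    # Empirical Pr{ X_ij = 1 } for the values zi < zj drawn from 1..n.
    hits = 0
    for _ in range(trials):
        a = list(range(1, n + 1))
        random.shuffle(a)
        compared = set()
        quicksort_recording(a, 0, n - 1, compared)
        if frozenset((zi, zj)) in compared:
            hits += 1
    return hits / trials
```

With n = 10, z_i = 3 and z_j = 7, the estimate hovers around 2/(7 - 3 + 1) = 0.4, while adjacent values such as 4 and 5 are compared in every run, matching 2/2 = 1.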

Now we can turn our attention to

    E[X] = Σ_{1≤i<j≤n} E[X_ij]
         = Σ_{1≤i<j≤n} 2/(j - i + 1)
         = Σ_{1≤i≤n} Σ_{i<j≤n} 2/(j - i + 1)
         ≤ Σ_{1≤i≤n} 2 H_n
         = 2n H_n = O(n lg n),

where H_n = 1 + 1/2 + ... + 1/n = O(lg n) is the nth harmonic number. This
completes the argument.
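The chain of inequalities can be checked numerically. Here is a sketch (function names mine) computing the exact double sum alongside the 2nH_n bound used in the last step:

```python
def expected_comparisons(n):
    # Exact value of  sum over 1 <= i < j <= n  of  2 / (j - i + 1).
    return sum(2.0 / (j - i + 1)
               for i in range(1, n)
               for j in range(i + 1, n + 1))

def two_n_harmonic(n):
    # The bound  sum_{1 <= i <= n} 2 * H_n  =  2 * n * H_n.
    h_n = sum(1.0 / k for k in range(1, n + 1))
    return 2 * n * h_n
```

For small n the sum can be done by hand (n = 3 gives 1 + 1 + 2/3 = 8/3), and for moderate n the exact value sits visibly below the 2nH_n bound.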

