
Linear Sorting

Sorting in O(n)
Previously
Previous sorts had the same property: they are
based only on comparisons
Any comparison-based sorting algorithm must
take Ω(n log n) comparisons in the worst case
Thus, MERGESORT and HEAPSORT are
asymptotically optimal comparison sorts
This lower bound does not apply to sorts that are
not based on comparisons, so they can run in linear time
Lower Bound for Sorting
Assume (for simplicity) that all elements in an input sequence
(a_1, a_2, ..., a_n) are distinct
We can view comparison-based sorting using a decision tree
Each internal node in the decision tree corresponds to one of the
comparisons in the algorithm.
start at the root and do first comparison:
- if ≤ take left branch, if > take right branch, etc.
each leaf represents one possible ordering of the input
one leaf for each of n! possible orderings
Each permutation must appear as a leaf in the tree
One decision tree exists for each algorithm and input size
The Decision Tree
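[Figure: decision tree for sorting three elements a_1, a_2, a_3]

                     a_1 : a_2
                ≤ /             \ >
          a_2 : a_3             a_1 : a_3
         ≤ /     \ >           ≤ /     \ >
    (1,2,3)   a_1 : a_3   (2,1,3)   a_2 : a_3
              ≤ /   \ >             ≤ /   \ >
         (1,3,2)   (3,1,2)     (2,3,1)   (3,2,1)

Example input: (a_1 = 6, a_2 = 8, a_3 = 5), which reaches the leaf (3,1,2)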
Note: The length of the longest root to leaf path in this tree
= worst-case number of comparisons
≤ worst-case number of operations of algorithm
The Ω(n lg n) Lower Bound
Theorem: Any decision tree T for sorting n elements has height
Ω(n lg n) (therefore, any comparison sort algorithm requires Ω(n lg n)
compares in the worst case).
Proof: Let h be the height of T. Then we know
T has at least n! leaves
T is binary, so it has at most 2^h leaves
n! ≤ #leaves ≤ 2^h
2^h ≥ n!
lg(2^h) ≥ lg(n!)
h ≥ lg(n!) = Ω(n lg n) (Eq. 3.18)
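The last step relies on lg(n!) = Ω(n lg n). One common way to see this, sketched here without relying on Eq. 3.18, is to keep only the larger half of the terms in the sum:

```latex
\lg(n!) \;=\; \sum_{i=1}^{n} \lg i
        \;\ge\; \sum_{i=\lceil n/2 \rceil}^{n} \lg i
        \;\ge\; \frac{n}{2}\,\lg\frac{n}{2}
        \;=\; \Omega(n \lg n)
```

The second inequality holds because the remaining sum has at least n/2 terms, each at least lg(n/2).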
Proof
Consider a decision tree of height h with l
reachable leaves
Every permutation must appear as a leaf, so n! ≤ l
Since a binary tree of height h has no more than
2^h leaves:
n! ≤ l ≤ 2^h
Taking the log of both sides:
h ≥ lg(n!) = Ω(n lg n)
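To get a feel for the bound, the smallest height h with 2^h ≥ n! can be computed directly. This is only an illustrative sketch; `min_comparisons` is a hypothetical helper name, not something from the slides.

```python
import math

def min_comparisons(n: int) -> int:
    """Smallest h with 2**h >= n!, i.e. ceil(lg(n!)).

    Any decision tree that sorts n distinct elements has at least this height.
    """
    # For an integer m >= 1, ceil(log2(m)) equals (m - 1).bit_length().
    return (math.factorial(n) - 1).bit_length()

if __name__ == "__main__":
    # Compare the exact lower bound with n * lg(n) for a few sizes.
    for n in (3, 4, 5, 10, 100):
        print(n, min_comparisons(n), round(n * math.log2(n), 1))
```

For n = 3 the bound is 3 comparisons, which matches the height of the three-element decision tree.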
Counting Sort
Each element is an integer in the range from 1
to k
For each element, determine the number of
elements less than it
Works well on small ranges
Does not sort in place
The Code
COUNTING-SORT (A, B, k)
  for i ← 1 to k                  // Init C
    do C[i] ← 0                   // O(k)
  for j ← 1 to length[A]          // Build C
    do C[A[j]] ← C[A[j]] + 1      // O(n)
  for i ← 2 to k                  // Make C'
    do C[i] ← C[i] + C[i-1]       // O(k)
  for j ← length[A] downto 1      // Copy info
    do B[C[A[j]]] ← A[j]          // O(n)
       C[A[j]] ← C[A[j]] - 1
Requirement: input elements are integers in a known range [1..k] for
some constant k.
Idea: for each input element x, find the number of elements less than x (say
this number = m) and put x in the (m+1)st spot in the output array.
Further copies of x fill the spots that follow.
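The pseudocode can be sketched in Python as follows. This is one possible transcription, assuming integer keys in 1..k as in the slides; it returns a new list instead of filling a caller-supplied B, and the name `counting_sort` is chosen here for illustration.

```python
def counting_sort(a, k):
    """Stable counting sort of integers in the range 1..k; returns a new list."""
    count = [0] * (k + 1)            # count[0] stays unused so indices match the slides
    for x in a:                      # build C: count[v] = number of elements equal to v
        count[x] += 1
    for v in range(2, k + 1):        # make C': count[v] = number of elements <= v
        count[v] += count[v - 1]
    out = [None] * len(a)
    for x in reversed(a):            # walk A backwards so equal keys keep their order
        out[count[x] - 1] = x        # -1 because Python lists are 0-based
        count[x] -= 1
    return out

if __name__ == "__main__":
    print(counting_sort([3, 6, 4, 1, 3, 4, 1, 4], 6))
```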
Original array A and count array C (number of 1's, 2's, ...):
index  1 2 3 4 5 6 7 8
A      3 6 4 1 3 4 1 4
index  1 2 3 4 5 6
C      2 0 2 3 0 1
How do you choose the size of C?
How do you fill C's data?
Modify C
Such that C now tells how many elements are less than or equal to i
index  1 2 3 4 5 6
C      2 0 2 3 0 1    (number of 1's, 2's, ...)
How do you construct C'?
index  1 2 3 4 5 6
C'     2 2 4 7 7 8    (number of slots)
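The step from C to C' is a running (prefix) sum. A small sketch using the example array, with Python's 0-based lists standing in for the 1-based arrays on the slides:

```python
from itertools import accumulate

A = [3, 6, 4, 1, 3, 4, 1, 4]
k = 6

# C[v - 1] = number of occurrences of the value v.
# list.count is used for brevity only; it costs O(n) per value,
# whereas the real algorithm builds C in a single O(n) pass.
C = [A.count(v) for v in range(1, k + 1)]

# C_prime[v - 1] = number of elements less than or equal to v.
C_prime = list(accumulate(C))

print(C)        # [2, 0, 2, 3, 0, 1]
print(C_prime)  # [2, 2, 4, 7, 7, 8]
```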
Move into B from A
A (original), indices 1..8:      3 6 4 1 3 4 1 4
C' before the loop, indices 1..6: 2 2 4 7 7 8

j   A[j]   slot C[A[j]]   B after the move (1..8)   C after the decrement (1..6)
8   4      7              _ _ _ _ _ _ 4 _           2 2 4 6 7 8
7   1      2              _ 1 _ _ _ _ 4 _           1 2 4 6 7 8
6   4      6              _ 1 _ _ _ 4 4 _           1 2 4 5 7 8
5   3      4              _ 1 _ 3 _ 4 4 _           1 2 3 5 7 8
4   1      1              1 1 _ 3 _ 4 4 _           0 2 3 5 7 8
3   4      5              1 1 _ 3 4 4 4 _           0 2 3 4 7 8
2   6      8              1 1 _ 3 4 4 4 6           0 2 3 4 7 7
1   3      3              1 1 3 3 4 4 4 6           0 2 2 4 7 7
Running Time of Counting Sort
The for loop in lines 1-2 takes Θ(k) time.
The for loop in lines 3-4 takes Θ(n) time.
The for loop in lines 5-6 takes Θ(k) time.
The for loop in lines 7-9 takes Θ(n) time.
Overall time is Θ(k + n).

In practice, use counting sort when we have k = O(n),
so running time is Θ(n).

Counting sort has the important property of stability:

A sorting algorithm is stable when numbers with the same values
appear in the output array in the same order as they do in the input array.

Important when satellite data is stored with elements being sorted and
when counting sort is used as a subroutine for radix sort.
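Stability is easiest to see when each key carries satellite data. The sketch below sorts (key, label) records by key only. The function name, the `key` parameter, and the sample records are assumptions made for illustration, and keys are taken to lie in 0..k.

```python
def counting_sort_by_key(records, k, key):
    """Stable counting sort of records whose key(record) is an integer in 0..k."""
    count = [0] * (k + 1)
    for r in records:
        count[key(r)] += 1
    for v in range(1, k + 1):        # count[v] = number of records with key <= v
        count[v] += count[v - 1]
    out = [None] * len(records)
    for r in reversed(records):      # backwards scan keeps equal keys in input order
        count[key(r)] -= 1
        out[count[key(r)]] = r
    return out

records = [(3, "a"), (1, "b"), (3, "c"), (1, "d"), (2, "e")]
result = counting_sort_by_key(records, 3, key=lambda r: r[0])
print(result)  # [(1, 'b'), (1, 'd'), (2, 'e'), (3, 'a'), (3, 'c')]
```

Records with equal keys ("b" before "d", "a" before "c") keep their input order, which is exactly what radix sort needs from its subroutine.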
Radix Sort
Sorting by "column"
A d-digit number would create d columns
Start with the least-significant (rightmost) column
Usually uses counting sort, or another stable sort, on each
column

Example
(by hand)
Input   After 1s digit   After 10s digit   After 100s digit
331     190              318               127
429     331              127               190
190     982              429               318
127     784              331               331
982     127              982               429
784     318              784               784
318     429              190               982
The Code
RADIX-SORT (A, d)
  for i ← 1 to d
    do use a stable sort to sort A on digit i
Note:
radix sort sorts the least significant digit first.
correctness can be shown by induction on the digit being sorted.
often counting sort is used as the internal sort in line 2.
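A possible Python sketch of RADIX-SORT for non-negative decimal integers, using counting sort on one digit per pass. The helper name `stable_sort_on_digit` and the `base` parameter are assumptions introduced here, not part of the slides.

```python
def stable_sort_on_digit(a, divisor, base):
    """Counting sort of a on the digit (x // divisor) % base; stable."""
    count = [0] * base
    for x in a:
        count[(x // divisor) % base] += 1
    for v in range(1, base):             # count[v] = number of keys with digit <= v
        count[v] += count[v - 1]
    out = [0] * len(a)
    for x in reversed(a):                # backwards scan preserves earlier passes
        digit = (x // divisor) % base
        count[digit] -= 1
        out[count[digit]] = x
    return out

def radix_sort(a, d, base=10):
    """Sort non-negative integers with at most d digits, least significant digit first."""
    for i in range(d):
        a = stable_sort_on_digit(a, base ** i, base)
    return a

if __name__ == "__main__":
    print(radix_sort([331, 429, 190, 127, 982, 784, 318], d=3))
```

Running one pass at a time on the hand example reproduces its intermediate orderings.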
Given n d-digit numbers in which each digit can take on up
to k possible values, RADIX-SORT correctly sorts these
numbers in Θ(d(n + k)) time.
Proof:
Each digit is in the range 0 to k-1
Takes Θ(k) time to construct C'
Each pass over n d-digit numbers takes
Θ(n + k)
Thus, the total running time is Θ(d(n + k))


Given n b-bit numbers and any positive integer r ≤ b,
RADIX-SORT correctly sorts these numbers in
Θ((b/r)(n + 2^r)) time.
Proof: For a value r ≤ b, we view each key as having d =
⌈b/r⌉ digits of r bits each. Each digit is an integer in the range 0
to 2^r - 1, so that we can use counting sort with k = 2^r - 1. (For
example, we can view a 32-bit word as having four 8-bit digits,
so that b = 32, r = 8, k = 2^r - 1 = 255, and d = b/r = 4.) Each
pass of counting sort takes time Θ(n + k) = Θ(n + 2^r) and there
are d passes, for a total running time of Θ(d(n + 2^r)) =
Θ((b/r)(n + 2^r)).
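The same idea can be sketched for b-bit keys by extracting r-bit digits with shifts and masks. This is an illustrative variant, not the slides' code: it uses a forward scan with exclusive prefix sums (also stable) instead of the backward scan of COUNTING-SORT, and the name `radix_sort_bits` is made up here.

```python
def radix_sort_bits(a, b=32, r=8):
    """Sort non-negative integers below 2**b using ceil(b / r) passes over r-bit digits."""
    mask = (1 << r) - 1
    passes = -(-b // r)                      # ceil(b / r) with integer arithmetic
    for p in range(passes):
        shift = p * r
        count = [0] * (1 << r)               # one counter per possible digit value
        for x in a:
            count[(x >> shift) & mask] += 1
        total = 0
        for v in range(1 << r):              # exclusive prefix sums:
            count[v], total = total, total + count[v]   # count[v] = first slot for digit v
        out = [0] * len(a)
        for x in a:                          # forward scan keeps equal digits in order
            digit = (x >> shift) & mask
            out[count[digit]] = x
            count[digit] += 1
        a = out
    return a

if __name__ == "__main__":
    print(radix_sort_bits([0xDEADBEEF, 0, 255, 256, 65536, 1, 0xFFFFFFFF]))
```

Each pass costs Θ(n + 2^r), so a larger r means fewer passes but a bigger count array per pass.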
Bucket Sort
Bucket sort runs in linear time when the input is
drawn from a uniform distribution.
Like counting sort, bucket sort is fast because it
assumes something about the input.
Whereas counting sort assumes that the input consists
of integers in a small range, bucket sort assumes that
the input is generated by a random process that
distributes elements uniformly over the interval [0, 1).
The idea of bucket sort is to divide the interval [0, 1)
into n equal-sized subintervals, or buckets, and then
distribute the n input numbers into the buckets.
To produce the output, we simply sort the numbers in
each bucket and then go through the buckets in order,
listing the elements in each.
Example
Input A: .06 .43 .48 .37 .91 .44 .98
Buckets B[0..9] (each bucket is a linked list; / marks an empty bucket):
0: .06
1: /
2: /
3: .37
4: .43 .44 .48
5: /
6: /
7: /
8: /
9: .91 .98
The Code
The bucket sort assumes that the input is an n-element
array A and that each element A[i] in the array satisfies
0 ≤ A[i] < 1. The code requires an auxiliary array
B[0..n - 1] of linked lists (buckets) and assumes there is a
mechanism for maintaining such lists.
BUCKET-SORT (A)
  n ← length[A]
  for i ← 1 to n
    do insert A[i] into list B[⌊n·A[i]⌋]
  for i ← 0 to n - 1
    do sort list B[i] with insertion sort
  concatenate the lists B[0], B[1], ..., B[n - 1] together in order
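A minimal Python sketch of BUCKET-SORT, assuming inputs in [0, 1). Python lists stand in for the linked lists, and the `min(..., n - 1)` guard against floating-point rounding is an addition that is not in the pseudocode.

```python
def insertion_sort(bucket):
    """Sort a list in place with insertion sort."""
    for i in range(1, len(bucket)):
        x = bucket[i]
        j = i - 1
        while j >= 0 and bucket[j] > x:
            bucket[j + 1] = bucket[j]
            j -= 1
        bucket[j + 1] = x

def bucket_sort(a):
    """Sort floats in [0, 1) by distributing them into len(a) equal-width buckets."""
    n = len(a)
    buckets = [[] for _ in range(n)]
    for x in a:
        index = min(int(n * x), n - 1)   # floor(n * x), clamped in case of rounding
        buckets[index].append(x)
    out = []
    for bucket in buckets:               # sort each bucket, then concatenate in order
        insertion_sort(bucket)
        out.extend(bucket)
    return out

if __name__ == "__main__":
    print(bucket_sort([.06, .43, .48, .37, .91, .44, .98]))
```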
Proof of Bucket Sort
Consider two elements A[i] and A[j]
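Assume A[i] ≤ A[j]
Then ⌊n·A[i]⌋ ≤ ⌊n·A[j]⌋, so A[i] is placed into either the same bucket
as A[j], or a bucket with a lower index.
Same bucket: the insertion sort of that bucket orders A[i] and A[j] correctly
Lower bucket: concatenating the buckets in order puts A[i] before A[j]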
Of Interest
(only to me)
As long as the input has the property that the
sum of the squares of the bucket sizes is linear
in the total number of elements... bucket sort
will run in linear time
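In other words, most buckets must stay small: the bound still holds if
a constant number of buckets each get about the square root of n
elements, as long as the remaining buckets hold a constant number each.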
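The quantity in question can be measured directly. The sketch below computes the sum of squared bucket sizes for a given input. The function name is hypothetical, and the two sample inputs are chosen only to show the extremes.

```python
def sum_of_squared_bucket_sizes(a):
    """Sum of (bucket size)**2 when len(a) floats in [0, 1) go into len(a) buckets."""
    n = len(a)
    sizes = [0] * n
    for x in a:
        sizes[min(int(n * x), n - 1)] += 1   # same bucket rule as bucket sort
    return sum(s * s for s in sizes)

n = 1000
spread = [(i + 0.5) / n for i in range(n)]   # exactly one element per bucket
clumped = [0.0] * n                          # every element lands in bucket 0

print(sum_of_squared_bucket_sizes(spread))   # 1000, linear in n
print(sum_of_squared_bucket_sizes(clumped))  # 1000000, quadratic in n
```

With insertion sort inside each bucket, the total sorting work is proportional to this sum, which is why a linear sum gives linear time overall.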
