K-th Smallest Number
Shadab Salam
13/IT/06
Problem Statement:-
Given an array A of n numbers and an integer i (1 ≤ i ≤ n), find the i-th smallest number in A.
Example:-
Input: A = {7, 10, 4, 3, 20, 15}
i=3
Output: 7
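The example can be checked with the straightforward sort-based approach. This is only a minimal sketch: the function name `ith_smallest_by_sorting` is hypothetical, and i is assumed to be 1-based, as in the example.

```python
def ith_smallest_by_sorting(values, i):
    """Return the i-th smallest element (1-based) by sorting a copy of the input."""
    if not 1 <= i <= len(values):
        raise ValueError("i must be between 1 and len(values)")
    # Sorting costs O(n log n) even though only one position is needed.
    return sorted(values)[i - 1]


A = [7, 10, 4, 3, 20, 15]
print(ith_smallest_by_sorting(A, 3))  # prints 7
```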
CAN WE DO BETTER?
Obviously we can do better. The problem with our previous solution is that we
are overcalculating: although we only need to return the i-th smallest number,
we are in fact ordering all the values.
Modifying Quick Sort
Randomized Selection
Randomized-Select(A, p, r, i)
1. if p == r
2.     return A[p]
3. q = Randomized-Partition(A, p, r)
4. k = q - p + 1
5. if i == k
6.     return A[q]
7. elseif i < k
8.     return Randomized-Select(A, p, q-1, i)
9. else return Randomized-Select(A, q+1, r, i-k)

Randomized-Partition(A, p, r)
1. n = r - p + 1
2. pivot = rand() % n
3. swap(A[p + pivot], A[r])
4. return Partition(A, p, r)
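The pseudocode above can be sketched in Python as follows. The slides do not define Partition, so the Lomuto scheme in `partition` below is an assumption; indices are 0-based, while the rank i stays 1-based.

```python
import random


def partition(a, p, r):
    """Lomuto partition of a[p..r] around a[r] (assumed scheme); returns the pivot's final index."""
    pivot = a[r]
    store = p
    for j in range(p, r):
        if a[j] <= pivot:
            a[store], a[j] = a[j], a[store]
            store += 1
    a[store], a[r] = a[r], a[store]
    return store


def randomized_partition(a, p, r):
    """Move a uniformly random element of a[p..r] to position r, then partition."""
    k = random.randint(p, r)  # both ends inclusive
    a[k], a[r] = a[r], a[k]
    return partition(a, p, r)


def randomized_select(a, p, r, i):
    """Return the i-th smallest (1-based) element of a[p..r]; rearranges a in place."""
    if p == r:
        return a[p]
    q = randomized_partition(a, p, r)
    k = q - p + 1  # rank of the pivot within a[p..r]
    if i == k:
        return a[q]
    elif i < k:
        return randomized_select(a, p, q - 1, i)
    else:
        return randomized_select(a, q + 1, r, i - k)


A = [7, 10, 4, 3, 20, 15]
print(randomized_select(list(A), 0, len(A) - 1, 3))  # prints 7
```

Passing a copy (`list(A)`) keeps the caller's array unchanged, since the selection reorders its input.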
Randomized Select
When the random pivot splits the array into reasonably balanced parts, the time
complexity is O(n). So the average-case complexity of the algorithm is O(n).
Analysis of Randomized Selection
But at the same time, it may happen that the pivot is always chosen in such a
way that it ends up as the first element of the range, and then we have to
recurse on the remaining n-1 elements.
So in the worst case the recurrence would look something like this:
T(n) = T(n-1) + O(n)
Solving this recurrence gives a time complexity of O(n^2).
So in the worst case the complexity of this algorithm is even worse than that of
our first approach (sorting the whole array).
However, this situation is very unlikely, and the algorithm mostly works fine.
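The worst-case recurrence can be unrolled as follows. This is a sketch in which the partition cost on n elements is assumed to be at most cn for some constant c:

```latex
\begin{align*}
T(n) &= T(n-1) + cn \\
     &= T(n-2) + c(n-1) + cn \\
     &\;\;\vdots \\
     &= T(1) + c\sum_{j=2}^{n} j \\
     &= T(1) + c\left(\frac{n(n+1)}{2} - 1\right) = O(n^2).
\end{align*}
```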
So Again, Can We Do Better?
Though the previous algorithm works fine in most cases, there is still some
amount of uncertainty in it.
The real cause of that uncertainty is the random choice of the pivot.
So if somehow we are able to deterministically partition the list so that every
time a good partition is ensured then we can come up with an algorithm that
can get desired result in O(n) even in the worst case.
Median of Medians Method
1. Divide the elements into ⌊n/5⌋ groups of 5 elements, plus at most one group with
(n mod 5) elements.
2. Find the median of each group:
Insertion sort: O(1) time per group (at most 5 elements).
Take the middle element (the larger one if there are two medians).
3. Use Select recursively to find the median x of the medians.
4. Partition the input array around the median-of-medians x. Let k be the number of
elements on the low side, n-k on the high side.
a_1, a_2, ..., a_k | a_(k+1), a_(k+2), ..., a_n
a_i < a_j, for 1 ≤ i ≤ k, k+1 ≤ j ≤ n.
5. Use Select recursively to:
Find the i-th smallest element on the low side, if i ≤ k.
Find the (i-k)-th smallest element on the high side, if i > k.
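The five steps above can be sketched in Python as follows. This is an illustrative version under stated assumptions, not the in-place algorithm: it builds new lists instead of partitioning the array in place, uses the built-in `sorted` on each group of at most 5 elements in place of insertion sort, and splits into three parts (`low`, `equal`, `high`) so that duplicate values are handled. The name `select` and the base-case cutoff of 5 are choices made for this sketch.

```python
def select(values, i):
    """Return the i-th smallest (1-based) element of values in worst-case linear time."""
    if not 1 <= i <= len(values):
        raise ValueError("i must be between 1 and len(values)")
    if len(values) <= 5:
        # Base case: a constant-size input is simply sorted.
        return sorted(values)[i - 1]

    # Steps 1 and 2: split into groups of 5 and take each group's median
    # (the larger of the two middle elements when the group size is even).
    medians = []
    for start in range(0, len(values), 5):
        group = sorted(values[start:start + 5])
        medians.append(group[len(group) // 2])

    # Step 3: recursively find the median x of the medians.
    x = select(medians, (len(medians) + 1) // 2)

    # Step 4: partition around x (three-way, to cope with duplicates).
    low = [v for v in values if v < x]
    equal = [v for v in values if v == x]
    high = [v for v in values if v > x]

    # Step 5: recurse only into the side that contains the answer.
    if i <= len(low):
        return select(low, i)
    if i <= len(low) + len(equal):
        return x
    return select(high, i - len(low) - len(equal))


print(select([7, 10, 4, 3, 20, 15], 3))  # prints 7
```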
Analysis:-
Find a lower bound on the number of elements greater than x.
At least half of the medians found in step 2 are greater than or equal to x. Then,
at least half of the groups contribute at least 3 elements that are greater
than x, except:
Last group (if it has fewer than 5 elements);
x's own group.
Discard those two groups:
The number of elements greater than x is at least 3(⌈⌈n/5⌉/2⌉ - 2) ≥ 3n/10 - 6.
Similarly, the number of elements smaller than x is at least 3n/10 - 6.
Then, in the worst case, Select is called recursively in Step 5
on at most n - (3n/10 - 6) = 7n/10 + 6 elements (upper bound).
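As a quick numeric check of this bound, take n = 100 (a value chosen here only for illustration):

```latex
\begin{align*}
\lceil n/5 \rceil &= 20 \text{ groups}, \qquad
\left\lceil \tfrac{1}{2}\lceil n/5 \rceil \right\rceil = 10 \text{ groups with median} \ge x, \\
3\,(10 - 2) &= 24 = \frac{3 \cdot 100}{10} - 6 \text{ elements guaranteed greater than } x, \\
100 - 24 &= 76 = \frac{7 \cdot 100}{10} + 6 \text{ elements at most in the recursive call of Step 5.}
\end{align*}
```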
Analysis:-
Steps 1, 2 and 4: O(n) time.
Step 3: T(⌈n/5⌉)
Step 5: at most T(7n/10 + 6)
7n/10 + 6 < n for n > 20.
So T(n) ≤ T(⌈n/5⌉) + T(7n/10 + 6) + O(n), which solves to T(n) = O(n).
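The step costs above combine into a recurrence that can be solved by substitution. A sketch of the argument, assuming the O(n) work of steps 1, 2 and 4 is at most an for a constant a, and guessing T(n) ≤ cn:

```latex
\begin{align*}
T(n) &\le T(\lceil n/5 \rceil) + T(7n/10 + 6) + an \\
     &\le c\lceil n/5 \rceil + c\,(7n/10 + 6) + an \\
     &\le cn/5 + c + 7cn/10 + 6c + an \\
     &= 9cn/10 + 7c + an \\
     &= cn - \left(cn/10 - 7c - an\right) \\
     &\le cn \quad \text{whenever } cn/10 - 7c - an \ge 0,
       \text{ i.e. } c \ge 10a\,\frac{n}{n-70} \text{ for } n > 70.
\end{align*}
```

For n ≥ 140 we have n/(n-70) ≤ 2, so any c ≥ 20a works; inputs smaller than 140 can be handled in constant time. Hence T(n) = O(n) under these assumptions.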
Acknowledgements
Thanks to Dr. Sajal Mukhopadhaya Sir and Rupkabon Rongpipi ma'am.