
Medians and Order Statistics

Kth Smallest Number

Shadab Salam
13/IT/06
Problem Statement:-

INPUT: A set A of n (distinct) numbers and an integer i, with 1 ≤ i ≤ n.

OUTPUT: The element x ∈ A that is larger than exactly i − 1 other elements of A.

Example : --
Input: A = {7, 10, 4, 3, 20, 15}
i=3
Output: 7

Input: A = {7, 10, 4, 3, 20, 15}
i=4
Output: 10
SOLUTIONS:-
NAÏVE Method
This problem can be solved simply by sorting the given set of numbers in ascending order
and returning the element at the i-th position (index i − 1 in a zero-based array).
Many sorting algorithms can be applied here, such as Insertion Sort, Merge Sort,
Quick Sort and Heap Sort.
Among these, Merge Sort and Heap Sort sort the given array or set in
O(n log n) worst-case time.
CAN WE DO BETTER?

So now we ask ourselves the most fundamental question in algorithms:
can we do better?
Obviously we can. The problem with our previous solution is that it does more
work than necessary: we are required to return only the i-th smallest number,
yet sorting puts every value into its final position.
Modifying Quick Sort
Randomized Selection

The algorithm RANDOMIZED-SELECT is modeled after the quicksort algorithm.

As in quicksort, we partition the input array recursively.
But unlike quicksort, which recursively processes both sides of the partition,
RANDOMIZED-SELECT works on only one side of the partition.
Algorithm of Randomized-Select

Randomized-Select(A, p, r, i)
1. if p == r
2.     return A[p]
3. q = Randomized-Partition(A, p, r)
4. k = q - p + 1
5. if i == k
6.     return A[q]
7. elseif i < k
8.     return Randomized-Select(A, p, q-1, i)
9. else return Randomized-Select(A, q+1, r, i-k)

Randomized-Partition(A, p, r)
1. n = r - p + 1
2. pivot = rand() % n
3. swap(&A[p + pivot], &A[r])
4. return partition(A, p, r)
Randomized Select

Lines 1-2 handle the base case of the recursion: a subarray with a single element.
Line 3 calls the function that randomly chooses a pivot and partitions the array around it.
After partitioning, all the elements to the right of the pivot are larger than the pivot,
whereas all the elements to its left are smaller. Line 4 computes k, the rank of the pivot within A[p..r].
If the pivot is the i-th smallest element (i == k), it is returned at line 6.
Otherwise Randomized-Select is called recursively on the side of the partition
that contains the desired element (lines 7-9).
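
The pseudocode can be transcribed into Python roughly as follows. This is a minimal sketch, not the presenter's code: the slides do not define `partition`, so a Lomuto-style partition around the last element is assumed here, indices are zero-based, and the function names are chosen for illustration.

```python
import random


def partition(a, p, r):
    """Partition a[p..r] around the pivot a[r]; return the pivot's final index.

    Assumed Lomuto scheme: the slides call partition() without defining it.
    """
    pivot = a[r]
    s = p - 1  # last index of the "smaller than pivot" region
    for j in range(p, r):
        if a[j] < pivot:
            s += 1
            a[s], a[j] = a[j], a[s]
    a[s + 1], a[r] = a[r], a[s + 1]
    return s + 1


def randomized_partition(a, p, r):
    """Move a randomly chosen element of a[p..r] to the end, then partition."""
    k = random.randint(p, r)  # randint is inclusive on both ends
    a[k], a[r] = a[r], a[k]
    return partition(a, p, r)


def randomized_select(a, p, r, i):
    """Return the i-th smallest (1-based) element of a[p..r]; reorders a in place."""
    if p == r:
        return a[p]
    q = randomized_partition(a, p, r)
    k = q - p + 1  # rank of the pivot within a[p..r]
    if i == k:
        return a[q]
    elif i < k:
        return randomized_select(a, p, q - 1, i)
    else:
        return randomized_select(a, q + 1, r, i - k)


data = [7, 10, 4, 3, 20, 15]
print(randomized_select(list(data), 0, len(data) - 1, 3))  # 7
print(randomized_select(list(data), 0, len(data) - 1, 4))  # 10
```

The list is copied before each call because the selection reorders its argument.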
Analysis of Randomized Selection
It is a probabilistic algorithm, since a random function is used to choose the pivot.
The time complexity therefore depends on the position of the pivot element.
Suppose every partition is reasonably balanced, say a 90% to 10% split, and we always
recurse into the larger side. Then the recurrence looks like this:
T(n) = T(9n/10) + O(n)
By the master theorem (a = 1, b = 10/9, f(n) = O(n), so n^(log_b a) = n^0 = 1
and case 3 applies), this solves to T(n) = O(n).

This suggests that the average-case (expected) time complexity of the algorithm
is O(n).
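
As a cross-check that does not rely on the master theorem, the recurrence T(n) = T(9n/10) + O(n) can be unrolled directly. The constant c below is an assumed name for the constant hidden in the O(n) term.

```latex
\begin{aligned}
T(n) &\le cn + T\!\left(\tfrac{9}{10}n\right) \\
     &\le cn + c\,\tfrac{9}{10}\,n + c\left(\tfrac{9}{10}\right)^{2} n + \cdots \\
     &\le cn \sum_{k=0}^{\infty} \left(\tfrac{9}{10}\right)^{k}
      = 10\,cn = O(n).
\end{aligned}
```

The work shrinks geometrically from level to level, so the total is a constant multiple of the work done at the top level.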
Analysis of Randomized Selection

But it may also happen that the pivot chosen is always the smallest (or largest)
remaining element, so that we have to recurse on all the remaining elements,
i.e. n − 1 elements.
So in the worst case the recurrence looks like this:
T(n) = T(n − 1) + O(n)
Solving this recurrence gives a time complexity of O(n²).
So in the worst case this algorithm is even worse than our
first approach (sorting the whole array).
This situation is very unlikely, though, and the algorithm usually performs well.
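
The quadratic worst case can be observed by forcing the bad pivot choice. The sketch below is an illustration under stated assumptions: it replaces the random pivot with a fixed "always the last element" rule (an unlucky random choice could behave the same way), runs on already sorted input, and counts element comparisons. The function names and the counter are hypothetical helpers.

```python
def select_last_pivot(a, p, r, i, counter):
    """Selection that always uses a[r] as the pivot (no randomness).

    counter is a one-element list that accumulates the number of comparisons.
    Written iteratively so that large inputs do not hit the recursion limit.
    """
    while p < r:
        pivot = a[r]
        s = p - 1
        for j in range(p, r):
            counter[0] += 1
            if a[j] < pivot:
                s += 1
                a[s], a[j] = a[j], a[s]
        a[s + 1], a[r] = a[r], a[s + 1]
        q = s + 1
        k = q - p + 1  # rank of the pivot within a[p..r]
        if i == k:
            return a[q]
        if i < k:
            r = q - 1
        else:
            p, i = q + 1, i - k
    return a[p]


def comparisons_on_sorted_input(n):
    """Count comparisons when asking for the minimum of the sorted list 1..n."""
    counter = [0]
    smallest = select_last_pivot(list(range(1, n + 1)), 0, n - 1, 1, counter)
    assert smallest == 1
    return counter[0]


for n in (10, 100, 1000):
    print(n, comparisons_on_sorted_input(n), n * (n - 1) // 2)
```

Each round removes only the pivot, so the comparison count is (n − 1) + (n − 2) + … + 1 = n(n − 1)/2, matching T(n) = T(n − 1) + O(n).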
So Again, Can We Do Better?

Though the previous algorithm works well in most cases, there is still some
uncertainty in its running time.
The real cause of that uncertainty is the random choice of the pivot.
If we can choose the pivot deterministically so that a good partition is
guaranteed every time, we get an algorithm that
produces the desired result in O(n) time even in the worst case.
Medians of Medians Method

This method was published in 1973 by Blum, Floyd, Pratt, Rivest and Tarjan.

This algorithm guarantees the desired result in linear time, even in the
worst-case scenario.
Algorithm :--

1. Divide the n elements into ⌊n/5⌋ groups of 5 elements, plus at most one group with
   the remaining (n mod 5) elements.
2. Find the median of each group:
   Insertion sort: O(1) time per group (at most 5 elements).
   Take the middle element (the larger one if there are two medians).
3. Use Select recursively to find the median x of the medians.
4. Partition the input array around the median-of-medians x. Let k be the number of
   elements on the low side, n − k on the high side.
   a1, a2, …, ak | ak+1, ak+2, …, an
   ai < aj, for 1 ≤ i ≤ k, k+1 ≤ j ≤ n.
5. Use Select recursively to:
   Find the i-th smallest element on the low side, if i ≤ k.
   Find the (i − k)-th smallest element on the high side, if i > k.
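
The five steps can be sketched in Python as follows. This is an illustrative version under a few assumptions, not the presenter's code: it builds new lists instead of partitioning in place, it keeps x out of both sides (so x is returned directly when its rank equals i, a small deviation from step 4, where x sits on the low side), and it relies on the elements being distinct, as in the problem statement. The name `select` follows the slides; everything else is chosen for illustration.

```python
def select(a, i):
    """Return the i-th smallest (1-based) element of a list of distinct numbers."""
    # Small inputs are handled directly; this is the base case of the recursion.
    if len(a) <= 5:
        return sorted(a)[i - 1]

    # Steps 1-2: split into groups of at most 5 and take each group's median
    # (index len(g) // 2 picks the larger one when a group has two medians).
    groups = [sorted(a[j:j + 5]) for j in range(0, len(a), 5)]
    medians = [g[len(g) // 2] for g in groups]

    # Step 3: recursively find the median x of the medians.
    x = select(medians, (len(medians) + 1) // 2)

    # Step 4: partition around x.
    low = [v for v in a if v < x]
    high = [v for v in a if v > x]
    k = len(low) + 1  # rank of x in a

    # Step 5: recurse into the side that contains the answer.
    if i == k:
        return x
    if i < k:
        return select(low, i)
    return select(high, i - k)


data = [7, 10, 4, 3, 20, 15]
print(select(data, 3))  # 7
print(select(data, 4))  # 10
```

Sorting each group of at most five elements costs O(1) per group, so steps 1, 2 and 4 together remain linear in the input size.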
Analysis :-
Find a lower bound on the number of elements greater than x.
At least half of the medians found in step 2 are greater than or equal to x. Then,
at least half of the ⌈n/5⌉ groups contribute 3 elements that are greater
than x, except:
   the last group (if it has fewer than 5 elements);
   x's own group.
Discard those two groups:
the number of elements greater than x is at least 3(⌈(1/2)⌈n/5⌉⌉ − 2) ≥ 3n/10 − 6.
Similarly, the number of elements smaller than x is at least 3n/10 − 6.
Then, in the worst case, Select is called recursively in step 5
on at most n − (3n/10 − 6) = 7n/10 + 6 elements (upper bound).
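
The counting argument can be checked numerically. The short sketch below only evaluates the formula for a few values of n; `guaranteed_greater` is a hypothetical helper name and is not part of the selection algorithm itself.

```python
def guaranteed_greater(n):
    """Lower bound on the number of elements greater than x for input size n."""
    groups = (n + 4) // 5             # ceil(n / 5) groups in total
    half = (groups + 1) // 2          # ceil(groups / 2) groups with median >= x
    return 3 * (half - 2)             # drop the last group and x's own group


for n in (50, 100, 1000):
    g = guaranteed_greater(n)
    # columns: n, guaranteed greater, 3n/10 - 6, largest recursive call, 7n/10 + 6
    print(n, g, 3 * n / 10 - 6, n - g, 7 * n / 10 + 6)
```

For n = 100, for example, there are 20 groups, at least 10 of them have a median no smaller than x, and after discarding two groups at least 3 · 8 = 24 elements are known to be greater than x, which leaves at most 76 = 7 · 100/10 + 6 elements for the recursive call.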
Analysis: -
Steps 1, 2 and 4: O(n) time.
Step 3: T(⌈n/5⌉)
Step 5: at most T(7n/10 + 6)
7n/10 + 6 < n for n > 20.

T(n) ≤ T(⌈n/5⌉) + T(7n/10 + 6) + O(n), for n > n1.

Use substitution to solve:
Assume T(n) ≤ cn, for n > n1; find n1 and c.

T(n) ≤ c⌈n/5⌉ + c(7n/10 + 6) + O(n)
     ≤ cn/5 + c + 7cn/10 + 6c + O(n)
     = 9cn/10 + 7c + O(n)
Want T(n) ≤ cn:
Pick c such that c(n/10 − 7) ≥ c1·n, where c1 is the constant from the O(n) term above (n1 = 80).

So the time complexity of this method is O(n).
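
The choice of c can be made explicit. This is a sketch under the assumption that the O(n) term is bounded by c_1 n; the concrete value 80 c_1 is one valid choice for n_1 = 80, not the only one.

```latex
\begin{aligned}
\tfrac{9}{10}\,cn + 7c + c_1 n \le cn
  &\iff c\left(\tfrac{n}{10} - 7\right) \ge c_1 n \\
  &\iff c \ge \frac{10\,c_1\,n}{n - 70} \qquad (n > 70).
\end{aligned}
% n/(n-70) is decreasing for n > 70 and equals 8 at n = 80,
% so any c >= 80 c_1 satisfies the inequality for all n >= 80.
```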


Code :

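n = int(input("Enter no. of elements: "))

arr = []  # start with an empty list and fill it from the input
for _ in range(n):
    x = int(input())
    arr.append(x)
k = int(input("Enter k: "))
arr.sort()  # naive method: sort the list in ascending order
print(arr[k - 1])  # k-th smallest element (k is 1-based)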
Medians of Medians
Reference :
Introduction to Algorithms, by Thomas H. Cormen, Charles E. Leiserson,
Ronald L. Rivest and Clifford Stein
MIT Videos from YouTube

Acknowledgements
Thanks to Dr. Sajal Mukhopadhaya Sir and Rupkabon Rongpipi ma'am.
