Find Kth minimum element in a unsorted array

Some days ago while interviewing a candidate, I asked the candidate a very easy by the book interview question.

Design an efficient algorithm to find the 2nd smallest integer in a unsorted integer array of length N, which should run in constant space.

This question is quite easy to solve, and google search returns tons of results for the question.
In brief one way in which the above problem can be solved is, to store both the smallest element and the 2nd smallest element while traversing the array.This approach will require array to be traversed only once .

After the interview was finished I thought of the generalized variant of the above problem, which is  –

Design an efficient algorithm to find the Kth smallest integer in a unsorted integer array of length N which should run in constant space.

As one can see using the same approach we used for k=2, while will be a valid approach(Space complexity – O(k)), It will not be efficient for cases in which K ~ N/2 (Think!!!)

I thought whether using a appropriate data structure might work here –

Heap : O(nlg(n)), Building the Heap for n elements will take O( nlg(n) ) time.
Binary search tree: Building the tree will take O(n lg(n)) time.

So, as you can see I could as well sort the array in O(n lg(n) ) time and return the element at Kth index.
Lets see if can do better.

Hint: We can make use of partition step of QuickSort, to find the kth minimum element.
Think about it, or read further for the details/implementation code.

Step1: Partition the Array.
So if you can remember partitioning step in quicksort algorithm (if not, read below for a quick overview)

From the input array select a random element, and call it the pivot element.
Partition the array in such a way that
All the array elements which are less than or equal to pivot are on left side of the pivot.
All the array elements which are more than pivot are on right side of the pivot.

And Now the meat of the algorithm:

lets say that pivot_idx is the final position of the pivot after partitioning.
That means that all elements values less than pivot are placed before it
Which means that pivot is Kth smallest element for the array for k = pivot_idx.

Step 2: Repeat Step 1 based on some conditions(Divide array into smaller sub-array)
It comprises of 3 simple steps.

If k = pivot_idx .return pivot value as the answer.
if k < pivot_idx. the answer lies in left partition and so partition existing left partition.
if k > pivot_idx. the answer lies in right partition and so partition existing right partition.

Algorithm complexity analysis
Though this approach have a worst case complexity as Quick Sort, because it reuses its partition (Divide) step.
On the positive side, it also gives good average case time complexity, similar to quicksort.
I can also replace existing naïve partitioning algorithm with a more efficient version, which most of the times can result in partition divide by half.
In that case, the number of comparisons I have to do will be:
n + n /2 + n/4 + n/8 … (A geometric progression with ratio 0.5) = 2n .
So the algorithm is not so bad after all 🙂

And as always, Algorithm’s implementation in python (Download Code)

#!/usr/bin/env python
from random import randrange

def findKMin(arr,k,start=0,end=None):
    '''
    Find kth minimum element in a array (in-place randomized algorithm, similar to quicksort)
    assumption: Input will only contain unique elements'''
    if k > len(arr):
        raise Exception("k should be less than length of the input array")
    if not end: end = len(arr) -1 #Get last index value
    pivot_ridx = randrange(start,end)     #Get a random array element as pivot value
    pivot = arr[pivot_ridx]
    pivot_idx = partition(arr,start,end,pivot_ridx) #partition to partition array around the pivot value in place
    if pivot_idx+1 == k:
        return pivot #Well, there is your answer
    elif pivot_idx+1 > k:
        return findKMin(arr,k,start,pivot_idx) #lies somewhere in the first partition
    else:
        return findKMin(arr,k,pivot_idx,end) #lies somewhere in the second Partiton

def partition(arr,start,end,pivot_idx):
    '''
    Partitions array in-place around the given pivot value
    '''
    pivot = arr[pivot_idx]
    arr[end],arr[pivot_idx] = arr[pivot_idx],arr[end]
    inc_idx = start
    for i in xrange(start,end):
        if arr[i] <= pivot:
            arr[inc_idx],arr[i] = arr[i],arr[inc_idx]
            inc_idx+=1
    arr[end],arr[inc_idx] = arr[inc_idx],arr[end]
    return inc_idx

if __name__ == '__main__':
    from random import shuffle
    test_input = range(100000)
    shuffle(test_input)
    assert findKMin(test_input,50000) == 49999
Advertisements

About Ashish Yadav

Hey there!!! I am Ashish Yadav, a Geek | Music Lover | Open Web enthusiast | Wannabe entrepreneur. I currently live in Chennai, India and share my thoughts, nifty hacks rather infrequently here with rest of the world. If you like my posts, may I suggest subscribing to the blog to read them as they are written :) . As a part of my day job I enjoy writing ubercool python code, and have loads of fun while at it. Opinions presented here are mine and not to be associated with my employer or anyone else. Opinions presented here are mine and not to be associated with my employer or anyone else.
This entry was posted in Algorithm and tagged , , , , , , . Bookmark the permalink.

4 Responses to Find Kth minimum element in a unsorted array

  1. Alan P. Bartokomous says:

    What is the purpose of the random element selection, if the input array is already essentially random? Wouldn’t pivot = arr[0] effectively be the same, without the need for the PRNG functionality?

    • Ashish Yadav says:

      Yeah, valid point. Though at the time of writing the code I didn’t think about this, I used PRNG to get the pivot element just out of habit.
      As the answer to your question, take a look at the Wikipedia article on QuickSelect(It’s a well-known algorithm, as I found out some time back)

  2. Nitin Garg says:

    Building a heap takes O(n) time
    void buildheap(int * A, int n)
    {for(int i=n-1;i>0;i++)
    {
    if(A[i]<A[(i-1)/2]) {
    int temp = A[(i-1)/2];
    A[(i-1)/2] = A[i];
    A[i] = temp;
    }

    }
    }

    • Ashish Yadav says:

      Yeah, that’s true. However, a solution using any Heap based data structure will still have O(nlgn) complexity, where one needs to invoke CreateHeap, delete-min * k times, extractMin to find the kth minimum array. There are several Heap implementations available (http://en.wikipedia.org/wiki/Heap_(data_structure)), they just trade-off complexity of one type of operation over the other.
      Although, I understand your comment is regarding the following line, and it’s possible to build a Heap in linear time.

      Building the Heap for n elements will take O( nlg(n) ) time.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s