A few days ago, while interviewing a candidate, I asked a very easy, by-the-book interview question.

Design an efficient algorithm to find the 2nd smallest integer in an unsorted integer array of length N, using constant space.

This question is quite easy to solve, and a Google search returns tons of results for it.

In brief, one way to solve the problem is to track both the smallest element and the 2nd smallest element while traversing the array. This approach requires only a single pass over the array.
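The single-pass idea can be sketched as follows (a minimal sketch; the function name and the assumption of at least two distinct elements are mine, not from the original post):

```python
def second_smallest(arr):
    """Single pass, constant space: track the smallest and the
    2nd smallest values seen so far. Assumes arr contains at
    least two distinct elements."""
    smallest, second = float('inf'), float('inf')
    for x in arr:
        if x < smallest:
            smallest, second = x, smallest   # old smallest becomes 2nd smallest
        elif smallest < x < second:
            second = x                       # new candidate for 2nd smallest
    return second
```

For example, `second_smallest([5, 2, 8, 1, 9])` returns `2`.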

After the interview finished, I thought about a generalized variant of the problem:

Design an efficient algorithm to find the Kth smallest integer in an unsorted integer array of length N, using constant space.

As one can see, the same approach we used for k = 2 is still valid (space complexity O(k)), but it will not be efficient when K ~ N/2 (think about why!).
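For reference, the O(k)-space tracking approach can be sketched with a bounded max-heap (Python's heapq is a min-heap, so values are stored negated; the helper name is hypothetical). This makes the inefficiency concrete: for K ~ N/2 the running time grows to O(n lg n).

```python
import heapq

def kth_smallest_tracking(arr, k):
    """Keep the k smallest elements seen so far in a max-heap
    of size k (stored negated): O(k) space, O(n lg k) time."""
    heap = []                       # holds -x for the k smallest x seen
    for x in arr:
        if len(heap) < k:
            heapq.heappush(heap, -x)
        elif x < -heap[0]:          # smaller than the current kth smallest
            heapq.heapreplace(heap, -x)
    return -heap[0]                 # the kth smallest element
```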

I wondered whether an appropriate data structure might work here:

Heap: O(n lg n). Building the heap for n elements will take O(n lg n) time.

Binary search tree: building the tree will take O(n lg n) time.

So, as you can see, I could just as well sort the array in O(n lg n) time and return the element at the Kth index.

Let's see if we can do better.

**Hint:** *We can make use of the partition step of QuickSort to find the kth minimum element. Think about it, or read on for the details and implementation code.*

**Step1: Partition the Array.**

So, if you remember the partitioning step of the quicksort algorithm (if not, read below for a quick overview):

Select a random element from the input array and call it the pivot element.

Partition the array in such a way that

All array elements that are less than or equal to the pivot are on its left.

All array elements that are greater than the pivot are on its right.

And now, the meat of the algorithm:

Let's say that pivot_idx is the final position of the pivot after partitioning.

That means all elements with values less than the pivot are placed before it,

which means the pivot is the Kth smallest element of the array for K = pivot_idx + 1 (with 0-based indexing).
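A tiny sketch illustrating that claim, using a Lomuto-style partition on a copy of the input (the helper name is mine; the post's own in-place implementation appears further down):

```python
def partition_demo(arr, pivot_idx):
    """Partition a copy of arr around arr[pivot_idx] and return
    (partitioned_copy, final_pivot_index). The pivot's final
    0-based index equals its rank minus 1."""
    arr = list(arr)                                      # demo works on a copy
    pivot = arr[pivot_idx]
    arr[pivot_idx], arr[-1] = arr[-1], arr[pivot_idx]    # move pivot to the end
    store = 0
    for i in range(len(arr) - 1):
        if arr[i] <= pivot:                              # keep small values left
            arr[store], arr[i] = arr[i], arr[store]
            store += 1
    arr[store], arr[-1] = arr[-1], arr[store]            # pivot into final place
    return arr, store
```

For instance, partitioning `[7, 2, 9, 4, 5]` around the pivot 4 leaves the pivot at index 1, so 4 is the 2nd smallest element.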

**Step 2: Repeat Step 1 based on some conditions (divide the array into smaller sub-arrays)**

It comprises 3 simple cases.

If k = pivot_idx + 1, return the pivot value as the answer.

If k < pivot_idx + 1, the answer lies in the left partition, so partition the existing left partition.

If k > pivot_idx + 1, the answer lies in the right partition, so partition the existing right partition.

**Algorithm complexity analysis**

This approach has the same worst-case complexity as quicksort, O(n²), because it reuses quicksort's partition (divide) step.

On the positive side, it also gives a good average-case time complexity, similar to quicksort's.

I could also replace the existing naïve partitioning scheme with a more careful pivot selection, which most of the time splits the array roughly in half.

In that case, the number of comparisons I have to do will be:

**n + n /2 + n/4 + n/8 … (A geometric progression with ratio 0.5) = 2n .**
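That bound is easy to sanity-check numerically (a hypothetical helper, not part of the post's code):

```python
def comparison_bound(n, terms=60):
    """Sum the geometric series n + n/2 + n/4 + ... (ratio 1/2).
    After enough terms it converges to 2n."""
    return sum(n / 2 ** i for i in range(terms))
```

For example, `comparison_bound(1000)` is very close to 2000.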

So the algorithm is not so bad after all 🙂

And as always, the algorithm's implementation in Python (Download Code):

```python
#!/usr/bin/env python
from random import randrange

def findKMin(arr, k, start=0, end=None):
    '''Find the kth minimum element in an array (in-place randomized
    algorithm, similar to quicksort).
    Assumption: the input contains only unique elements.'''
    if k > len(arr):
        raise ValueError("k should not exceed the length of the input array")
    if end is None:
        end = len(arr) - 1                              # last index
    pivot_ridx = randrange(start, end + 1)              # pick a random pivot position
    pivot_idx = partition(arr, start, end, pivot_ridx)  # partition around it in place
    if pivot_idx + 1 == k:
        return arr[pivot_idx]                           # well, there is your answer
    elif pivot_idx + 1 > k:
        return findKMin(arr, k, start, pivot_idx - 1)   # lies in the first partition
    else:
        return findKMin(arr, k, pivot_idx + 1, end)     # lies in the second partition

def partition(arr, start, end, pivot_idx):
    '''Partitions arr[start:end+1] in place around the given pivot value.'''
    pivot = arr[pivot_idx]
    arr[end], arr[pivot_idx] = arr[pivot_idx], arr[end]  # move pivot to the end
    inc_idx = start
    for i in range(start, end):
        if arr[i] <= pivot:
            arr[inc_idx], arr[i] = arr[i], arr[inc_idx]
            inc_idx += 1
    arr[end], arr[inc_idx] = arr[inc_idx], arr[end]      # put pivot in its final place
    return inc_idx

if __name__ == '__main__':
    from random import shuffle
    test_input = list(range(100000))
    shuffle(test_input)
    assert findKMin(test_input, 50000) == 49999
```

What is the purpose of the random element selection, if the input array is already essentially random? Wouldn’t pivot = arr[0] effectively be the same, without the need for the PRNG functionality?

Yeah, valid point. Though at the time of writing the code I didn’t think about this, I used PRNG to get the pivot element just out of habit.

As an answer to your question, take a look at the Wikipedia article on Quickselect (it's a well-known algorithm, as I found out some time back).

Building a heap takes O(n) time

```c
void buildheap(int *A, int n)
{
    /* Floyd's bottom-up construction of a min-heap: sift every
       internal node down, starting from the last one. O(n) total. */
    for (int i = n / 2 - 1; i >= 0; i--) {
        int j = i;
        while (2 * j + 1 < n) {                /* while j has a child */
            int c = 2 * j + 1;
            if (c + 1 < n && A[c + 1] < A[c])
                c++;                           /* pick the smaller child */
            if (A[j] <= A[c])
                break;                         /* heap property holds here */
            int temp = A[j];                   /* swap A[j] and A[c] */
            A[j] = A[c];
            A[c] = temp;
            j = c;
        }
    }
}
```

Yeah, that’s true. However, a solution using any heap-based data structure will still have O(n lg n) complexity: one needs to invoke create-heap once and then extract-min k times to find the kth minimum element. There are several heap implementations available (http://en.wikipedia.org/wiki/Heap_(data_structure)); they just trade off the complexity of one type of operation against another.
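As a concrete sketch of that heap-based route, here is a hypothetical version using Python's heapq module, whose heapify is documented as linear-time; the k pops then cost O(k lg n):

```python
import heapq

def kth_smallest_heap(arr, k):
    """Find the kth smallest element (1-based k) via a min-heap:
    heapify is O(n), then k - 1 pops cost O(k lg n)."""
    heap = list(arr)            # copy so the input is not modified
    heapq.heapify(heap)         # O(n) bottom-up heap construction
    for _ in range(k - 1):      # discard the k - 1 smallest elements
        heapq.heappop(heap)
    return heap[0]              # the kth smallest is now at the root
```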

Although, I understand your comment is regarding the line above where I claimed building the heap takes O(n lg n), and you're right that it's possible to build a heap in linear time.