Sorting algorithms: Technical interviews must-know

Sorting algorithms are a must-know for any technical interview, be it a startup or a FANG company like Facebook, Amazon. If you fail to explain the sorting algorithm you are using and why are you using it, you can kiss goodbye to your interview. However, the most important aspect of sorting algorithms is their complexity. We will be going through 5 most known algorithms and group them by their complexity. Another aspect that is good know is the stability of the sorting algorithm.

Stability does the sorting algorithm maintain the relative order of elements that are equal?

The first group is of polynomial complexity like O(n2) and second group is linear logarithmic complexity like (n log n). Algorithms like Selection Sort, Bubble Sort, and Insertion sort fall in the first group. Quicksort, Merge sort and heap sort they fall in the second group.

Worst-case complexity of quicksort is O(n2), however, that is limited to very specific cases like already sorted input, on all other cases, if the pivot is selected properly, it will give O(nlogn)

Selection sort

The fundamental of this sort is: Find the minimum element in remaining unsorted array and put it at the current position. We start with index 0 and go all the way to the last index, find the minimum in the remaining array, and put it to the current index.

  public static void selectionSort(int[] arr){  
      for (int i = 0; i < arr.length - 1; i++){  
            int index = i;
            //Select minimum in the remaining array  
            for (int j = i + 1; j < arr.length; j++){  
                if (arr[j] < arr[index]){  
                    index = j;
                }  
            }  
            //Swap the smallest number and current index number
            int smallestNumber = arr[index];   
            arr[index] = arr[i];  
            arr[i] = smallerNumber;  
        }  
    }  

If you are wondering why complexity of selection sort is O(n2) even though inner loop goes from i+1 to end, I would suggest reading this post on calculation of time complexity in for loop. The space complexity is O(1).

Bubble sort

The fundamental of Bubble sort is: Compare two adjacent indices, if they are out of order, put them in order. We start with index 0, compare it with every new neighbor till the end of the array. After each pass, we have sorted one element.

   void bubbleSort(int arr[]) 
    { 
        int n = arr.length; 
        for (int i=0; i < n; i++){
            for (int j = 1; j < n-i; j++){
                if (arr[j-1] > arr[j]){ 
                    // swap arr[j-1] and arr[j] 
                    int temp = arr[j]; 
                    arr[j] = arr[j-1]; 
                    arr[j-1] = temp; 
                } 
             }
        }
    }

The time complexity is obviously O(n2) and the space complexity is O(1). Bubble sort is a stable sorting algorithm.

Insertion sort

The third algorithm in the polynomial complexity class is insertion sort, fundamental of it is: Taken element and put it in its correct position in the array by shifting all the indices on right by 1. The idea to keep the part of the array sorted, examine all the indices, and put them at the correct position in the sorted part of the array.

    void insertionSort(int arr[]){ 
        int n = arr.length; 
        for (int i = 1; i < n; ++i) { 
            int key = arr[i]; 
            int j = i - 1; 
  
            /* Move elements of arr[0..i-1], that are 
               greater than key, to one position ahead 
               of their current position */
            while (j >= 0 && arr[j] > key) { 
                arr[j + 1] = arr[j]; 
                j = j - 1; 
            } 
            arr[j + 1] = key; 
        } 
    } 

Time complexity is O(n2) and space complexity is O(1). Insertion sort is stable sort as well.

Quicksort

The entire idea of quicksort revolves around a pivot. Pivot is an element in input around which input is arranged in such a way that all elements on the left side are smaller and all elements on the right side are greater than the pivot. The question is how to find or select pivot and put it into the correct position.
To put this pivot at the correct position in input, start with the next element of pivot in input space, and find an element that is greater than the pivot. Let that be ith position.

At the same time, start from the end of the array and find an element that is smaller than pivot. Let it be jth position.

If i and j have not crossed each other i.e i < j, then swap element at ith and jth positions, and continue moving right on input to find element greater than pivot and moving left to find element smaller than pivot. Once i and j cross each other, swap pivot with element at jth position.
After this step, the pivot will be at its correct position and the array will be divided into two parts. All elements on the left side will be less than the pivot and all elements on the right side will be greater than the pivot.

Recursively apply quicksort on left and right subarray, until there is only one element to sort.

    private void swap(int [] a, int i, int j){
        int temp = a[i];
        a[i] = a[j];
        a[j] = temp;
    }

    private int partition(int [] a, int start, int end){
        int pivot = a[start];
        int i  = start;
        int j  = end;

        while(i < j){
            while(i <= end && a[i] <= pivot) i++;
            while(j >= start && a[j] > pivot) j--;
            
            if(i < j) {
                swap(a, i, j);
            }
        }

        swap(a, start, j);
        return j;
    }

    public void quickSort(int [] a, int start, int end){
        if(start < end){
            int p = partition(a, start, end);
            quickSort(a,start, p-1);
            quickSort(a, p+1, end);
        }
    }

Time complexity of the quicksort implementation depends on the selection of pivot. If pivot splits the original array into two equal parts (which is the intention), the complexity of quicksort is O(n log n) (Master Theorem).

The worst-case complexity of quick sort happens when array is sorted or reverse sorted. Array is partitioned into two subarrays, one with size 1 and other with size n-1. Subarray with n-1 elements, it again is divided into two subarrays of size 1 and n-2. In order to completely sort array, it will split for n-1 times and each time it requires to traverse n element to find the correct position of pivot. Hence overall complexity of quick sort comes out as O(n2).

What is space complexity of this algorithm? There is no apparent memory is used. However, recursive implementation internally puts stack frames on the stack for partitioned indices and function call return addresses and so on. In the worst case, there can be n stack frames, so, the worst-case complexity of quicksort will be O(n).

Merge sort

Merge sort algorithm falls in divide and conquer class of algorithms where input space is divided into subproblems and these subproblems are then solved individually and combined together to solve the original problems.
A question arises is what is the smallest subproblem? The smallest subproblem is where input cannot be further divided, a subproblem with one item to sorted. Once the split is done to the smallest size, we start merging. Going from the bottom, start with two one-item subproblems, combine that to create a two-item subproblem, then combine two two-items subproblems to create a four-item subproblem. Go on till you get a problem with the original size.
In merge operation, scan both sorted arrays one by one element and based on the output of comparison, put one of the two items into output array, till both arrays are scanned.

int merge(int[] a, int start, int mid, int end){
 
    int i,j,k;
    int temp[end-start+1];
 
    i = start; j = mid+1; k=0;
    /* Compare all elements of two sorted arrays and store
      them in sorted array. */
    while(i <= mid && j <= end){
        if(a[i]< a[j]){
            temp[k++]= a[i++];
        }
        else {
            temp[k++]  = a[j++];
        }
    }
    while(i <= mid){
        temp[k++] = a[i++]; 
    }
    while(j <= end){
        temp[k++] = a[j++]; 
    }
    //Copy the temporary array into original array
    for(i=0; i<k; i++){
        a[i+start] = temp[i];
    }
}
int mergeSort(int a[], int start, int end ){
 
    int mid  = (start + end)/2;
    if(start<end){
        //sort the left part
        mergeSort(a, start, mid);
        //sort right part
        mergeSort(a, mid+1, end);
 
        //merge them together
        merge(a, start, mid, end);
    }
}

For n elements in an array, there will be O(logn) such steps. Hence, the complexity of merge sort algorithm is O(nlogn) Refer Master theorem.

Every time we merge, two sub-arrays an auxiliary array is needed to temporarily store array elements and that incurs O(n) space complexity on merge sort algorithm.

This in addition to overhead caused by recursion.

Heap Sort

This one depends on the concept of heaps, which is covered on this website here: Heaps Fundamental is that we put all the elements in a priority queue or heap, min-heap to be precise. Then take element one by one and add back into the array. Using Java this is one of the simplest sorting algorithms, given that we get an inbuilt implementation of the priority queue in java.

    public void heapSort (int[] a){

        PriorityQueue<Integer> pq = Arrays.stream(a)
                .boxed()
                .collect(Collectors.toCollection(PriorityQueue::new));

        int i = 0;
        while(!pq.isEmpty()){
            a[i] = pq.poll();
            i++;
        }

        Arrays.stream(a)
                .forEach(e -> System.out.print(e + " "));
   }

The time complexity of heap sort is O(long) and space complexity is O(n).

There is another class of sorting algorithms that sort input in linear time complexity, but those algorithms are dependent on some conditions to be present in the input, for example for Radix sort, the input must be an integer.

Prepare your sorting algorithm basics and understand time and space complexity and stability. It is very good to know the fundamentals of them, even if you are using the inbuild library or function to do the sorting.

Always account for complexity for sorting algorithm in your solution, most the time that dominates the complexity.

If you are preparing for an interview and need some help with preparation, please reach out to us or book a session with us to understand more.

Median of two sorted arrays

Before we find the median of two sorted arrays, let’s understand what is the median?

Median is the middle value in a list of numbers.

For example,

Input:
A = [2,4,5,6,7,8,9].
Output:
6

To find the median, the input should be sorted. If it is not sorted, then first sort it and return the middle of that list. The question arises is what if the number of elements in the list is even? In that case, the median is the average of two middle elements.

Median of two sorted arrays

There are two sorted arrays nums1 and nums2 of size m and n respectively.
Find the median of the two sorted arrays. The overall run time complexity should be O(log (m+n)).

median of two sorted arrays

Before going into the post, find a pen and paper and try to work out an example. And as I tell in our posts, come up with a method to solve this considering, you have all the time and resources to solve this problem. I mean think of most brute force solutions.

Let’s simplify the question first and then work it upwards. If question was to find the median of one sorted array, how would you solve it?
If array has odd number of elements in it, return A[mid], where mid = (start + end)/2; if array has even number of elements, return average of A[mid] + A[mid+1].
For example for array A = [1,5,9,12,15], median is 9.
Complexity of this operation is O(1).

Focus back on 2 sorted arrays. To find a median of 2 sorted arrays in no more simple and definitely not O(1) operation. For example,

A = [ 1,5,9,12,15] and B = [ 3,5,7,10,17], median is 8.

How about merging these two sorted arrays into one, the problem is reduced to find the median of one array.

Although to find median in a sorted array is O(1), merge step takes O(n) operations. Hence, overall complexity would be O(n).
Reuse the merge part of Merge sort algorithm to merge two sorted arrays.

Start from the beginning of two arrays and advance the pointer of the array whose current element is smaller than the current element of the other. This smaller element is put on to output array which is sorted merged array. Merge will use an additional space to store N elements (Note that N is here sum of the size of both sorted arrays). The best part of this method is that it does not consider if the size of the two arrays is the same or different.

This can be optimized, by counting number of elements n, in two arrays in advance. Then we need to merge only n/2 + 1 elements if n is even and n/2 if n is odd. This saves us O(n/2) space.

There is another optimization: do not store all n/2 or n/2 + 1 elements while merging, keep track of last two elements in sorted array, and count how many elements are sorted. When n/2 + 1 elements are sorted return average of last two elements if n is even, else return n/2 element as the median. With these optimizations, time complexity remains O(n), however, space complexity reduces to O(1).

Impementation with merge function

package com.company;

/**
 * Created by sangar on 18.4.18.
 */
public class Median {

    public static double findMedian(int[] A, int[] B){
        int[] temp = new int[A.length + B.length];

        int i = 0;
        int j = 0;
        int k = 0;
        int lenA = A.length;
        int lenB = B.length;

        while(i<lenA && j<lenB){
            if(A[i] <= B[j]){
                temp[k++] = A[i++];
            }else{
                temp[k++] = B[j++];
            }
        }
        while(i<lenA){
            temp[k++] = A[i++];
        }
        while(j<lenB){
            temp[k++] = B[j++];
        }

        int lenTemp = temp.length;

        if((lenTemp)%2 == 0){
            return ( temp[lenTemp/2-1] + temp[lenTemp/2] )/2.0;
        }
        return temp[lenTemp/2];
    }

    public static void main(String[] args){
        int[] a = {1,3,5,6,7,8,9,11};
        int[] b = {1,4,6,8,12,14,15,17};

        double median = findMedian(a,b);
        System.out.println("Median is " + median);
    }
}

Optimized version to median of 2 sorted arrays

package com.company;

/**
 * Created by sangar on 18.4.18.
 */
public class Median {

    public  static int findMedianOptimized(int[] A, int[] B){
        int i = 0;
        int j = 0;
        int k = 0;
        int lenA = A.length;
        int lenB = B.length;

        int mid = (lenA + lenB)/2;
        int midElement = -1;
        int midMinusOneElement = -1;

        while(i<lenA && j<lenB){
            if(A[i] <= B[j]){
                if(k == mid-1){
                    midMinusOneElement = A[i];
                }
                if(k == mid){
                    midElement = A[i];
                    break;
                }
                k++;
                i++;
            }else{
                if(k == mid-1){
                    midMinusOneElement = B[j];
                }
                if(k == mid){
                    midElement = B[j];
                    break;
                }
                k++;
                j++;
            }
        }
        while(i<lenA){
            if(k == mid-1){
                midMinusOneElement = A[i];
            }
            if(k == mid){
                midElement = A[i];
                break;
            }
            k++;
            i++;
        }
        while(j<lenB){
            if(k == mid-1){
                midMinusOneElement = B[j];
            }
            if(k == mid){
                midElement = B[j];
                break;
            }
            k++;
            j++;
        }

        if((lenA+lenB)%2 == 0){
            return (midElement + midMinusOneElement)/2;
        }
        return midElement;
    }

    public static void main(String[] args){
        int[] a = {1,3,5,6,7,8,9,11};
        int[] b = {1,4,6,8,12,14,15,17};

        double median = findMedianOptimized(a,b);
        System.out.println("Median is " + median);
    }
}

Binary search approach

One of the properties which lead us to think about binary search is that two arrays are sorted. Before going deep into how binary search algorithm can solve this problem, first find out mathematical conditions which should hold true for a median of two sorted arrays.

As explained above, median divides input into two equal parts, so first condition median index m satisfy is a[start..m] and a[m+1..end] are equal size. We have two arrays A and B, let’s split them into two. The first array A is of size m, and it can be split into m+1 ways at 0 to m.

If we split at i, len(Aleft) – iand len(Aright) = m-i.
When i=0, len(Aleft) = 0 and when i=m, len(Aright) = 0.

median of two sorted arrays

Similarly, for array B, we can split it into n+1 way, j being from 0 to n.

find median of sorted arrays

After splitting at specific indices i and j, how can we derive the condition for the median: left part of the array should be equal to the right part of the array?

If len(Aleft) + len(Bleft) == len(Aright) + len(Bleft) , it satisfies our condition. As we already know these values for split at i and j, equation becomes

i+j = m-i + n-j

median of two sorted array

But is this the only condition to satisfy for the median? As we know, the median is middle of the sorted list, we have to guarantee that all elements on the left array should be less than elements in the right array.

It is must that max of left part is less than min of right part. What is max of left part? It can be either A[i-1] or B[j-1]. What can be min of right part? It can be either A[i] or B[j].

We already know that, A[i-1] < A[i] and B[j-1] < B[j] as arrays A and B are sorted. All we need to check if A[i-1] <= B[j] and B[j-1] <= A[i], if index i and j satisfy this conditions, then median will be average of max of left part and min of right part if n+m is even and max(A[i-1], B[j-1]) if n+m is odd.

Let’s make an assumption that n>=m, then j = (n+m+1)/2 -i, it will always lead to j as a positive integer for possible values of i (o ~m) and avoid array out of bound errors and automatically makes the first condition true.

Now, problem reduces to find index i such that A[i-1] <= B[j] and B[j-1]<=A[i] is true.

This is where binary search comes into the picture. We can start as mid of array A, j = (n+m+1)/2-i, and see if this i satisfies the condition. There can be three possible outcomes for the condition.
1. A[i-1] <= B[j] and B[j-1]<=A[i] is true, we return the index i.
2. If B[j-1] > A[i], in this case, A[i] is too small. How can we increase it? by moving towards right. If i is increased, value A[i] is bound to increase, and also it will decrease j. In this case, B[j-1] will decrease and A[i] will increase which will make B[j-1]<=A[i] true. So, limit search space for i to mid+1 to mand go to step 1.
3. A[i-1] > B[j], means A[i-1] is too big. And we must decrease i to get A[i-1]<=B[j]. Limit search space for i to 0 mid-1 and go to step 1

Let’s take an example and see how this works. Out initial two arrays as follows.

median of two sorted arrays leetcode

The index i is mid of array A and corresponding j will as shown

median of two sorted arrays leetcode solution

Since condition B[j-1] <= A[i] is not met, we discard left of A and right of B and find new i and j based on remaining array elements.

get median of two sorted arrays

Finally, our condition that A[i-1]<= B[j] and B[j-1] <=A[i] is satisfied, find the max of left and min of right and based on even or odd length of two arrays, return average of the max of left and min of right or return a max of left.

This algorithm has dangerous implementation caveat, what if i or j is 0, in that case, i-1 and j-1 will be invalid indices. When can j be zero, when i==m. Till i<m, no need to worry about j being zero. So be sure to check i<m and i>0, when we are checking j-1 and i-1 respectively.

Implementation

package com.company;

/**
 * Created by sangar on 18.4.18.
 */
public class Median {

    public static double findMedianWithBinarySearch(int[] A, int[] B){

        int[] temp;

        int lenA = A.length;
        int lenB = B.length;

        /*We want array A to be always smaller than B
          so that j is always greater than zero
         */
        if(lenA > lenB){
            temp = A;
            A = B;
            B = temp;
        }

        int iMin = 0;
        int iMax = A.length;
        int midLength =  ( A.length + B.length + 1 )/2;

        int i = 0;
        int j = 0;

        while (iMin <= iMax) {
            i = (iMin + iMax) / 2;
            j = midLength - i;
            if (i < A.length && B[j - 1] > A[i]) {
                // i is too small, must increase it
                iMin = i + 1;
            } else if (i > 0 && A[i - 1] > B[j]) {
                // i is too big, must decrease it
                iMax = i - 1;
            } else {
                // i is perfect
                int maxLeft = 0;
                //If there we are at the first element on array A
                if (i == 0) maxLeft = B[j - 1];
                //If we are at te first element of array B
                else if (j == 0) maxLeft = A[i - 1];
                //We are in middle somewhere, we have to find max
                else maxLeft = Integer.max(A[i - 1], B[j - 1]);

                //If length of two arrays is odd, return max of left
                if ((A.length + B.length) % 2 == 1)
                    return maxLeft;

                int minRight = 0;
                if (i == A.length) minRight = B[j];
                else if (j == B.length) minRight = A[i];
                else minRight = Integer.min(A[i], B[j]);

                return (maxLeft + minRight) / 2.0;
            }
        }
        return -1;
    }

    public static void main(String[] args){
        int[] a = {1,3,5,6,7,8,9,11};
        int[] b = {1,4,6,8,12,14,15,17};

        double median = findMedian(a,b);
        System.out.println("Median is " + median);
    }
}

The complexity of this algorithm to find the median of two sorted arrays is log(max(m,n)) where m and n are the size of two arrays.

Please share your views and suggestions.

Inversions in array

Inversions in array

Let A[0…n – 1] be an array of n distinct positive integers. If i < j and A[i] > A[j] then the pair (i, j) is called an inversion of A. Given n and an array A, find the number of inversions in array A. For example: First array has two inversions (2,1) and (5,1) where as second array has 3 inversions, (2,1), (4,1) and (4,3)

inversions in array

How many inversion can be in a sorted array? There is no inversion in sorted array and nC2 inversions in completely inverted array.

Inversions in array : Thought process

What first thing which comes to mind? For each index i, check all j where j > i and see if A[j] < A[i]?
If A[j] is greater than current element A[i], increase inversion count. Implementation is given below.

package com.company;

/**
 * Created by sangar on 6.4.18.
 */
public class Inversions {
    public static int findInversions(int[] a){
        int count = 0;
        for(int i=0; i<a.length; i++){
            for(int j=i+1;  j<a.length; j++){
                if(a[i] > a[j]) count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int a[] = new int[]{90, 20, 30, 40, 50};
        int inversions = findInversions(a);
        System.out.print("Inversions : " + inversions);
    }
}

Worst case complexity of this method to find inversions in array is O(n2).

Can we use the information that a sorted array does not have any inversion? Let’s ask this question again with a tweak, how many inversions will there be if only one element is out of place for a completely sorted array? There will one. What if there are two elements out of place? Of course, there will be 2 inversion. Are we getting somewhere? Effectively, we have to count, how many swaps we need to completely sort array from it’s original state. If you noticed, the brute force algorithm, is nothing but selection sort, instead of swapping elements, count inversions.

What if we use any other sort algorithm like Merge sort algorithm? Complexity of merge sort is much better than selection sort, then it is possible that merge sort identifies inversions much efficiently than selection sort.
Let’s use merge step of merge sort algorithm with a bit modification. Divide part will remains the same. While merging two sorted subarrays, count number of times element on right part is put on result array before element on left side.
Every time A[i] is appended to the output of merge sort, no new inversions are encountered, since A[i] is smaller than everything left in array B.  If B[j] is appended to the output, then it is smaller than all the remaining items in A, we increase the number of count of inversions by the number of elements remaining in A.
Overall inversions will be inversion in left part + inversions of right part and inversion which are in merge step.

Let’s see how inversions in merge steps are counted and then we will see the overall algorithm.

inversion in array

 

inversions in array

Total number of inversions is 6.

Overall, algorithm looks like below.

inversions in array using merge sort

Algorithm to count inversions in array

First, let’s write an algorithm to count inversions in merge step. When we are at merge steps, there are two sorted arrays with us:  A and B

  1. Initialize i as start position of A and j as start position of B. These pointers will reference to currently compared indices in both arrays.
  2. While there are elements in both arrays, i.e. i < length(A) && j < length(B)
    1.  If B[j] < A[i], all elements from i to length(A) are greater than B[j],
      count += number of elements remaining in A. Increment j
    2. Else increment i
  3. Return count

Replace merge part of merge sort with this piece of algorithm and return sum of inversions in left + inversions in right + inversions in merge from function.

MergeSortAndCount(L):

  1. If L has one element return 0
  2. Divide L into A, B
    1. inversionsLeft = MergeSortAndCount(A)
    2. inversionRight = MergeSortAndCount(B)
    3. inversionMerge = MergeAndCount(A,B)
  3. return inversionLeft + inversionRight + inversionMerge

Inversions in array implementation

package com.company;

/**
 * Created by sangar on 6.4.18.
 */
public class Inversions {
    public  static int mergeAndCount(int[] a, int low, int mid, int high){
        int count  =  0;
        int[] temp = new int[high-low+1];

        int i = low;
        int j = mid+1;
        int k = 0;
        /*
            There are elements on both side of array
        */
        while(i<=mid && j<=high){

            if(a[i] > a[j]){
                //Number of elements remaining on left side.
                count+= (mid - i + 1);
                temp[k++] = a[j++];
            }
            else{
                temp[k++] = a[i++];
            }
        }
        while(i<=mid){
            temp[k++] = a[i++];
        }
        while(j<=high) {
            temp[k++] = a[j++];
        }

        for(i=low; i<k+low; i++){
            a[i] = temp[i-low];
        }

        return count;
    }

    public static int countInversions(int[] a, int low, int high){
        if(low >= high) return 0;

        int mid = low + (high - low) / 2;

        int inversionsLeft = countInversions(a, low, mid);
        int inversionsRight = countInversions(a, mid+1, high);
        int inversionsMerge = mergeAndCount(a, low, mid, high);

        return inversionsLeft + inversionsRight + inversionsMerge;
    }


    public static void main(String[] args) {
        int a[] = new int[]{90, 20, 30, 40, 50};
        int inversions = countInversions(a, 0, a.length-1);
        System.out.print("Inversions : " + inversions);
    }
}

Complexity of finding inversions in arrays using merge sort method is  O(n log n).

Please share if there is something wrong or missing. If you are interested in contributing to website and share your knowledge with learners, please contact us at [email protected]

References: Lecture 8

Merge k sorted arrays

Given k sorted arrays of varying lengths, merge these k sorted arrays into one sorted array.

For example, given 3 arrays:
merge k sorted arrays

The resulting array should be like

merge k sorted array

Companies this problem is asked in
Microsoft, Amazon, Facebook, Salesforce, Indeed

Merge k sorted arrays divide and conquer

Since all the input arrays are sorted, the first element in the output sorted array will be one of these first elements of input arrays. How can we find the minimum among all the elements plucked from the first index of each array? Easy, take those k elements (there are k arrays, so k first elements) and build a min-heap. The root of the min-heap will be the least element among each of the first elements of the given k sorted arrays, i.e.

result[0] = min(arr1[0], arr2[0], arr3[0]…arrK[0])

merging k sorted arrays

The initial root above will be the first element in the result array. Now the second element for the result array can be found from the set of first elements of all input arrays except the array from which the first element of result array was taken. For example, if arr1 had the least of all first elements while finding the initial root, then:

result[1] = min(arr1[1], arr2[0], arr3[0] … arrK[0])

merge k sorted arrays java

Next iteration, 5 will be picked and put into the result array and index of arr 3 will be increased.
merging k sorted arrays

After putting 5 in the result array, we will move the next element in array 3 to the min-heap.
merge n sorted arrays

In order to know which array gave the minimum element at a particular time, we will store additional information of about array and index at which minimum element was.

If i represents the array number, and j represents the index of the minimum number currently in the heap from the ith array, then we add (j+1)th element to the min-heap next and re-heapify.
If we have put all the element from the ith array in the heap then we need to reduce the size of min-heap to k-1.

Follow the procedure for (n-1)*k times. When all array elements are processed the result array will be the sorted array for all nk element.

Algorithm

  • Build min heap with the first element of all k arrays.
  • Pick the root of min element and put it in the result array.
  • If there are remaining elements in the array,  put next element at the root of min heap and heapify again
  • If all elements are already of an array are processed, reduce the size of min heap by 1.
  • Repeat step 2, 3 and 4 till min heap is empty.

Show me the implementation

package com.company;
import java.util.PriorityQueue;

/**
 * Created by sangar on 2.12.18.
 */
public class MergeKSortedArrays {
    private class HeapNode {
        public int arrayNum;
        public int index;
        public int value;

        public HeapNode(int arrayNum, int index, int value) {
            this.arrayNum = arrayNum;
            this.index = index;
            this.value = value;
        }
    }

    public int[] mergeKSortedArrays(int[][] arrays) {

        if (arrays == null) return null;

        PriorityQueue<HeapNode> minHeap =
                new PriorityQueue<>(arrays.length,
                        (HeapNode a, HeapNode b) -> a.value - b.value);

        int size = 0;
        for (int i = 0; i < arrays.length; i++) {
            size += arrays[i].length;
        }
        int[] result = new int[size]; // k * n

        //add first elements in the array to this heap
        for (int i = 0; i < arrays.length; i++) {
            minHeap.add(new HeapNode(i, 0, arrays[i][0]));
        }

        //Complexity O(n * k * log k)
        for (int i = 0; i < size; i++) {
            //Take the minimum value and put into result
            HeapNode node = minHeap.poll();

            if (node != null) {
                result[i] = node.value;
                if (node.index + 1 < arrays[node.arrayNum].length) {
                    //Complexity of O(log k)
                    minHeap.add(new HeapNode(node.arrayNum,
                            node.index + 1,
                            arrays[node.arrayNum][node.index + 1]));
                }
            }
        }
        return result;
    }
}

The complexity of the code to merge k sorted arrays is O(nklogk) along with space complexity of O(k).

Please share if there is something wrong or missing. If you are preparing for an interview, please sign up to receive interview preparation kit for free.