Median of two sorted arrays


Before going any further, let’s understand what a median is. The median is the middle value in a list of numbers. To find it, the input must be sorted from smallest to largest; if the input is not sorted, we first sort it and then return the middle of the list. What if the number of elements in the list is even? In that case, the median is the average of the two middle elements. The problem here is to find the median of two sorted arrays.
For example:

[Figure: median of two sorted arrays]

Before reading on, find a pen and paper and try to work out an example. As in our other posts, first come up with a method assuming you have all the time and resources you need; that is, think of the most brute-force solution.
Let’s simplify the question first and then work upwards. If the question was to find the median of one sorted array, how would you solve it?
If the array has an odd number of elements, return A[mid], where mid = (start + end)/2; if it has an even number of elements, return the average of A[mid] and A[mid+1]. For example, for array A = [1,5,9,12,15], the median is 9. The complexity of this operation is O(1).
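The single-array case can be sketched in a few lines. This is a minimal illustration, not part of the original listings; the class and method names are made up for this example, and the array is assumed to be sorted and non-empty.

```java
final class SingleArrayMedian {
    // Median of an already sorted, non-empty array in O(1).
    static double median(int[] sorted) {
        int n = sorted.length;
        if (n == 0) {
            throw new IllegalArgumentException("array must not be empty");
        }
        // Same as (start + end) / 2 with start = 0 and end = n - 1.
        int mid = (n - 1) / 2;
        if (n % 2 == 1) {
            return sorted[mid];
        }
        // Widen to long so the sum of two large ints cannot overflow.
        return ((long) sorted[mid] + sorted[mid + 1]) / 2.0;
    }

    public static void main(String[] args) {
        System.out.println(median(new int[]{1, 5, 9, 12, 15})); // prints 9.0
        System.out.println(median(new int[]{1, 5, 9, 12}));     // prints 7.0
    }
}
```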

Focus back on two sorted arrays. Finding the median of two sorted arrays is no longer a simple O(1) operation. For example, for A = [1,5,9,12,15] and B = [3,5,7,10,17], the median is 8. How about merging the two sorted arrays into one? The problem is then reduced to finding the median of one array. In the above example, the merged array is C = [1,3,5,5,7,9,10,12,15,17]. Although finding the median of a sorted array is O(1), the merge step takes O(N) operations, so the overall complexity is O(N). We can reuse the merge part of the merge sort algorithm to merge the two sorted arrays.
Start from the beginning of both arrays and advance the pointer of the array whose current element is smaller than the current element of the other. The smaller element is put into the output array, which is the sorted merged array. The merge uses additional space to store N elements (note that N here is the sum of the sizes of both sorted arrays). The best part of this method is that it does not care whether the two arrays have the same size or not; it works for arrays of any size.

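This can be optimized by counting the number of elements, N, in the two arrays in advance. With integer division, the median needs the element at index N/2 of the merged order (and also the one at index N/2-1 if N is even), so we need to merge only the first N/2+1 elements in either case. This saves roughly half of the space.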

There is another optimization: do not store the merged elements at all. Keep track of only the last two elements of the merged order and count how many elements have been merged. When N/2+1 elements have been merged, return the average of the last two elements if N is even, else return the last element (the one at index N/2) as the median. With this optimization, the time complexity remains O(N), but the space complexity reduces to O(1).

Median of two sorted arrays implementation

package com.company;

/**
 * Created by sangar on 18.4.18.
 */
public class Median {

    public static double findMedian(int[] A, int[] B){
        int[] temp = new int[A.length + B.length];

        int i = 0;
        int j = 0;
        int k = 0;
        int lenA = A.length;
        int lenB = B.length;

        while(i<lenA && j<lenB){
            if(A[i] <= B[j]){
                temp[k++] = A[i++];
            }else{
                temp[k++] = B[j++];
            }
        }
        while(i<lenA){
            temp[k++] = A[i++];
        }
        while(j<lenB){
            temp[k++] = B[j++];
        }

        int lenTemp = temp.length;

        if((lenTemp)%2 == 0){
            return ( temp[lenTemp/2-1] + temp[lenTemp/2] )/2.0;
        }
        return temp[lenTemp/2];
    }

    public static void main(String[] args){
        int[] a = {1,3,5,6,7,8,9,11};
        int[] b = {1,4,6,8,12,14,15,17};

        double median = findMedian(a,b);
        System.out.println("Median is " + median);
    }
}

Finding the median of two sorted arrays using the merge operation takes O(N) time and O(N) extra space.
Optimized version to find median of two sorted arrays

package com.company;

/**
 * Created by sangar on 18.4.18.
 */
public class Median {

    public static double findMedianOptimized(int[] A, int[] B){
        int i = 0;
        int j = 0;
        int lenA = A.length;
        int lenB = B.length;

        int mid = (lenA + lenB)/2;
        int midElement = -1;
        int midMinusOneElement = -1;

        /* Walk the merged order up to index mid without storing it,
           remembering only the last two elements seen.
           Assumes at least one of the arrays is non-empty.
         */
        for(int k = 0; k <= mid; k++){
            midMinusOneElement = midElement;
            if(j >= lenB || (i < lenA && A[i] <= B[j])){
                // B is exhausted or A has the smaller element
                midElement = A[i++];
            }else{
                // A is exhausted or B has the smaller element
                midElement = B[j++];
            }
        }

        if((lenA+lenB)%2 == 0){
            // Divide by 2.0 so that the average is not truncated
            return (midElement + midMinusOneElement)/2.0;
        }
        return midElement;
    }

    public static void main(String[] args){
        int[] a = {1,3,5,6,7,8,9,11};
        int[] b = {1,4,6,8,12,14,15,17};

        double median = findMedianOptimized(a,b);
        System.out.println("Median is " + median);
    }
}

Median of two sorted arrays using binary search

One property that leads us to think about binary search is that the two arrays are sorted. Before going deep into how the binary search algorithm can solve this problem, let’s first find the mathematical condition that must hold for the median of two sorted arrays.
As explained above, the median divides the input into two equal parts, so the first condition a median index k satisfies is that a[start..k] and a[k+1..end] are of equal size. We have two arrays A and B; let’s split each of them into two parts. Array A is of size m, and it can be split in m+1 ways, at any position from 0 to m. If we split at i, length(A_left) = i and length(A_right) = m-i.

When i=0, len(A_left) = 0, and when i=m, len(A_right) = 0.

Similarly, array B of size n can be split in n+1 ways, with j going from 0 to n.

After splitting at specific indices i and j, how can we derive the condition for the median, which is that the left part should be the same size as the right part?

If len(A_left) + len(B_left) == len(A_right) + len(B_right), it satisfies our condition. As we already know these values for a split at i and j, the equation becomes

i+j = m-i + n-j


But is this the only condition to satisfy for the median? Since the median is the middle of the sorted list, we also have to guarantee that no element in the left part is greater than any element in the right part.
The max of the left part must be less than or equal to the min of the right part. What is the max of the left part? It is either A[i-1] or B[j-1]. What is the min of the right part? It is either A[i] or B[j]. We already know that A[i-1] <= A[i] and B[j-1] <= B[j], as arrays A and B are sorted. All we need to check is whether A[i-1] <= B[j] and B[j-1] <= A[i]. If indices i and j satisfy these conditions, the median is the average of the max of the left part and the min of the right part if n+m is even, and max(A[i-1], B[j-1]) if n+m is odd.

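Let’s assume that n >= m. The equation i+j = m-i + n-j gives j = (n+m)/2 - i when n+m is even. When n+m is odd, the two sides cannot be equal, so we let the left part hold one extra element; with integer division, j = (n+m+1)/2 - i covers both cases. With n >= m, j is never negative for any possible value of i (0 to m), which avoids array-out-of-bounds errors, and the first condition holds automatically.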
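To see the formula at work, the small sketch below prints the part sizes for every cut position in A. The helper name partnerCut and the sizes m = 3, n = 4 are assumptions chosen only for illustration; they do not come from the post's listings.

```java
final class PartitionSizes {
    // Cut position in B that pairs with cut i in A (hypothetical helper).
    static int partnerCut(int m, int n, int i) {
        return (m + n + 1) / 2 - i;
    }

    public static void main(String[] args) {
        int m = 3; // size of A, illustrative value
        int n = 4; // size of B, illustrative value, n >= m
        for (int i = 0; i <= m; i++) {
            int j = partnerCut(m, n, i);
            int left = i + j;              // elements in A_left and B_left
            int right = (m - i) + (n - j); // elements in A_right and B_right
            System.out.println("i=" + i + " j=" + j
                    + " left=" + left + " right=" + right);
        }
    }
}
```

Since m+n is odd here, every row shows the left part one element larger than the right part, and j stays within 0..n for every i.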

Now the problem reduces to finding an index i such that A[i-1] <= B[j] and B[j-1] <= A[i].

This is where binary search comes into the picture. We start with i as the middle of array A, j = (n+m+1)/2-i, and check whether this i satisfies the condition. There are three possible outcomes.
1. A[i-1] <= B[j] and B[j-1] <= A[i] is true: we have found the index i.
2. B[j-1] > A[i]: A[i] is too small. How can we increase it? By moving right. If i is increased, A[i] can only increase and j decreases, so B[j-1] can only decrease; this moves towards B[j-1] <= A[i]. So limit the search space for i to mid+1 to m and go to step 1.
3. A[i-1] > B[j]: A[i-1] is too big, and we must decrease i to get A[i-1] <= B[j]. Limit the search space for i to 0 to mid-1 and go to step 1.

Let’s take an example and see how this works. Our initial two arrays are as follows.

Index i is the middle of array A, and the corresponding j is as shown.

Since the condition B[j-1] <= A[i] is not met, we discard the left part of A and the right part of B, and find new i and j based on the remaining elements.

Finally, the condition A[i-1] <= B[j] and B[j-1] <= A[i] is satisfied. Find the max of the left part and the min of the right part; depending on whether the combined length is even or odd, return the average of the two or just the max of the left part.
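The walk-through can also be followed with a small trace. The sketch below is an illustration only: the class and method names are hypothetical, and the arrays are the ones used in this post's main methods, which are not necessarily the ones in the original figures. It records each (i, j) probe and the decision taken.

```java
import java.util.ArrayList;
import java.util.List;

final class PartitionTrace {
    // Records every (i, j) probe of the binary search over array a.
    // Assumes both arrays are sorted, b is non-empty and a.length <= b.length.
    static List<String> trace(int[] a, int[] b) {
        List<String> steps = new ArrayList<>();
        int m = a.length;
        int n = b.length;
        int half = (m + n + 1) / 2;
        int lo = 0;
        int hi = m;
        while (lo <= hi) {
            int i = (lo + hi) / 2;
            int j = half - i;
            if (i < m && b[j - 1] > a[i]) {
                steps.add("i=" + i + " j=" + j + " -> move right");
                lo = i + 1;
            } else if (i > 0 && a[i - 1] > b[j]) {
                steps.add("i=" + i + " j=" + j + " -> move left");
                hi = i - 1;
            } else {
                steps.add("i=" + i + " j=" + j + " -> found");
                break;
            }
        }
        return steps;
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 5, 6, 7, 8, 9, 11};
        int[] b = {1, 4, 6, 8, 12, 14, 15, 17};
        // Prints:
        // i=4 j=4 -> move right
        // i=6 j=2 -> move left
        // i=5 j=3 -> found
        trace(a, b).forEach(System.out::println);
    }
}
```

At the final probe the max of the left part is max(A[4], B[2]) = 7 and the min of the right part is min(A[5], B[3]) = 8, so the median of these 16 elements is 7.5.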

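This algorithm has an implementation caveat that is easy to get wrong: if i or j is 0, then i-1 or j-1 is an invalid index, and likewise A[i] or B[j] is invalid when i == m or j == n. When can j be zero? Only when i == m (and then only if n == m). As long as i < m, j is at least 1, so B[j-1] is safe; as long as i > 0, j is at most n-1, so B[j] is safe. So be sure to check i < m before comparing B[j-1] with A[i], and i > 0 before comparing A[i-1] with B[j].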
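One common way to sidestep these boundary checks, which is a different technique from the explicit checks described above, is to treat a missing neighbour as a sentinel: Integer.MIN_VALUE on the left and Integer.MAX_VALUE on the right. The sketch below is a minimal version under that assumption; the class name is hypothetical, and the inputs are assumed to be sorted with at least one element in total.

```java
final class SentinelMedian {
    static double median(int[] a, int[] b) {
        // Always binary search over the shorter array.
        if (a.length > b.length) {
            return median(b, a);
        }
        int m = a.length;
        int n = b.length;
        if (n == 0) {
            throw new IllegalArgumentException("both arrays are empty");
        }
        int half = (m + n + 1) / 2;
        int lo = 0;
        int hi = m;
        while (lo <= hi) {
            int i = (lo + hi) >>> 1;
            int j = half - i;
            // A missing neighbour is replaced by a sentinel value.
            int aLeft = (i == 0) ? Integer.MIN_VALUE : a[i - 1];
            int aRight = (i == m) ? Integer.MAX_VALUE : a[i];
            int bLeft = (j == 0) ? Integer.MIN_VALUE : b[j - 1];
            int bRight = (j == n) ? Integer.MAX_VALUE : b[j];
            if (aLeft > bRight) {
                hi = i - 1; // i is too big
            } else if (bLeft > aRight) {
                lo = i + 1; // i is too small
            } else {
                int maxLeft = Math.max(aLeft, bLeft);
                if ((m + n) % 2 == 1) {
                    return maxLeft;
                }
                int minRight = Math.min(aRight, bRight);
                // Widen to long so the sum cannot overflow.
                return ((long) maxLeft + minRight) / 2.0;
            }
        }
        throw new IllegalStateException("inputs must be sorted");
    }

    public static void main(String[] args) {
        int[] a = {1, 5, 9, 12, 15};
        int[] b = {3, 5, 7, 10, 17};
        System.out.println("Median is " + median(a, b)); // prints "Median is 8.0"
    }
}
```

The trade-off is four extra conditional reads per iteration in exchange for a loop body whose comparisons never touch an invalid index.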

Implementation

package com.company;

/**
 * Created by sangar on 18.4.18.
 */
public class Median {

    public static double findMedianWithBinarySearch(int[] A, int[] B){

        int[] temp;

        int lenA = A.length;
        int lenB = B.length;

        /* We want A to be the shorter array
           so that j is never negative
         */
        if(lenA > lenB){
            temp = A;
            A = B;
            B = temp;
        }

        int iMin = 0;
        int iMax = A.length;
        int midLength =  ( A.length + B.length + 1 )/2;

        int i = 0;
        int j = 0;

        while (iMin <= iMax) {
            i = (iMin + iMax) / 2;
            j = midLength - i;
            if (i < A.length && B[j - 1] > A[i]) {
                // i is too small, must increase it
                iMin = i + 1;
            } else if (i > 0 && A[i - 1] > B[j]) {
                // i is too big, must decrease it
                iMax = i - 1;
            } else {
                // i is perfect
                int maxLeft = 0;
                //If there we are at the first element on array A
                if (i == 0) maxLeft = B[j - 1];
                //If we are at te first element of array B
                else if (j == 0) maxLeft = A[i - 1];
                //We are in middle somewhere, we have to find max
                else maxLeft = Integer.max(A[i - 1], B[j - 1]);

                //If length of two arrays is odd, return max of left
                if ((A.length + B.length) % 2 == 1)
                    return maxLeft;

                int minRight = 0;
                if (i == A.length) minRight = B[j];
                else if (j == B.length) minRight = A[i];
                else minRight = Integer.min(A[i], B[j]);

                return (maxLeft + minRight) / 2.0;
            }
        }
        return -1;
    }

    public static void main(String[] args){
        int[] a = {1,3,5,6,7,8,9,11};
        int[] b = {1,4,6,8,12,14,15,17};

        double median = findMedianWithBinarySearch(a,b);
        System.out.println("Median is " + median);
    }
}

The complexity of this algorithm to find the median of two sorted arrays is O(log(min(m,n))), where m and n are the sizes of the two arrays, because the binary search runs only over the shorter array.