Interval partitioning problem

Continuing with greedy algorithm problems (earlier we discussed the event scheduling and coin change problems), today we discuss another one, known as the interval partitioning problem. It goes like this: there are n lectures to be scheduled and a certain number of classrooms. Each lecture has a start time si and a finish time fi. The task is to schedule all lectures in the minimum number of classrooms, with no more than one lecture in a classroom at any given point in time. For example, the schedule shown below uses four classrooms for nine lectures.

interval partition

However, with a little rearrangement, the same nine lectures fit into three classrooms, as shown below.

So the second solution is the better one: it uses fewer classrooms.

A related problem: you want to schedule jobs on a computer. Requests take the form (si, fi), meaning a job that runs from time si to time fi. You get many such requests and want to process as many as possible, but the computer can work on only one job at a time.

Interval partitioning : Line of thought

The first thing to note about the interval partitioning problem is that we have to minimize something, in this case the number of classrooms. Which template does this problem fit into? Greedy, maybe? Yes, it fits the greedy algorithm template: in a greedy algorithm we make each decision based on a local optimum.

Before discussing the solution, be clear about what the resource is and what needs to be minimized. In this problem the resource is the classroom, and the total number of classrooms is minimized by processing the lectures in a certain order.

There are a few natural orders in which we can arrange the lectures or, for the sake of generality, tasks: by finish time, by start time, by shortest duration, or by fewest conflicting jobs. Which one to choose?
You can come up with counterexamples showing that assigning lectures in order of finish time, shortest duration or fewest conflicts does not lead to an optimal solution. So let’s pick lectures by earliest start time. At any given point, pick the lecture with the earliest start time that is not yet scheduled and assign it to the first available classroom. Will it work? Yes. Whenever a new classroom is opened, every classroom already in use holds a lecture that overlaps the start of the new one, so that many classrooms are genuinely needed at that moment. Once all lectures are assigned, the number of classrooms used is the minimum required.

Interval partitioning algorithm

1. Sort all lectures by start time in ascending order.
2. Number of classrooms = 0
3. While there are lectures to be scheduled:
   3.1 Take the first lecture not yet scheduled.
   3.2 If a classroom is already free at the lecture's start time,
       assign the lecture to that classroom.
   3.3 If not, allocate a new classroom:
       number of classrooms = number of classrooms + 1
4. Return the number of classrooms.
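
As a minimal sketch, the steps above can be written directly with a plain list of classrooms, before any data structure is chosen. The function name `min_classrooms` and the (start, finish) tuple format are assumptions made for illustration only; the lecture times are the nine used in the Java example of this post.

```python
def min_classrooms(lectures):
    """Count classrooms used by the earliest-start-first greedy.

    `lectures` is a list of (start, finish) pairs (an assumed input format).
    """
    # room_finish[k] is the finish time of the last lecture placed in room k
    room_finish = []
    for start, finish in sorted(lectures):          # step 1: sort by start time
        for k, last_finish in enumerate(room_finish):
            if last_finish <= start:                # step 3.2: room k is free again
                room_finish[k] = finish
                break
        else:                                       # step 3.3: no free room, open one
            room_finish.append(finish)
    return len(room_finish)                         # step 4


lectures = [(930, 1100), (930, 1300), (930, 1100), (1130, 1300), (1100, 1400),
            (1330, 1500), (1330, 1500), (1430, 1700), (1530, 1700)]
print(min_classrooms(lectures))  # prints 3
```

Scanning every classroom for each lecture costs O(n) per lecture in the worst case, which is the part a better data structure can speed up.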

Before jumping into the code, let’s discuss some data structures which we can use to implement this algorithm.

Understand that we have to find a compatible classroom for a lecture. There may be many classrooms, and for each one we check whether the finish time of its last lecture is less than or equal to the start time of the new lecture. If yes, the classroom is compatible; if there is no such classroom, allocate a new one. If we store the allocated classrooms so that we can always get the classroom whose last lecture finishes earliest, we can safely say that if this classroom is not compatible, none of the others will be, because every other classroom frees up even later. One way is to sort the list of classrooms every time we assign a lecture, so that the first classroom has the least finish time. A sort takes O(n log n), and doing it for all n lectures makes the overall complexity O(n^2 log n).

We are sorting just to find minimum end time across all classrooms. This can easily be achieved by min heap or priority queue keyed on finish time of last lecture of class. Every time finish time of last lecture changes for a classroom, heap is readjusted and root gives us classroom with min finish time.

  • To determine whether lecture j is compatible with some classroom, compare sj to key of min classroom k in priority queue.
  • When a lecture is added to a classroom,  increase key of classroom k to fj.

Well, now we have the algorithm and the data structure to implement it with, so let’s code it.

The PriorityQueue implementation is given below:

import heapq
# This is our priority queue implementation
class PriorityQueue:
    def __init__(self):
        self._queue = []
        self._index = 0
 
    def push(self, item, priority):
        heapq.heappush(self._queue, (priority, self._index, item))
        self._index += 1
 
    def pop(self):
        # Return None when the heap is empty instead of raising IndexError
        if not self._queue:
            return None
        return heapq.heappop(self._queue)[-1]

Classroom class implementation

class Classroom:
	def __init__(self, number, finish_time):
		self.class_num = number
		self.finish_time = finish_time
	def __repr__(self):
		return 'Classroom({!r})'.format(self.class_num)

Interval partitioning problem : Implementation

from PriorityQueue import PriorityQueue
from Classroom import Classroom

jobs = [(1, 930, 1100),
        (2, 930, 1300),
        (3, 930, 1100),
        (5, 1100, 1400),
        (4, 1130, 1300),
        (6, 1330, 1500),
        (7, 1330, 1500),
        (8,1430,1700),
        (9, 1530, 1700),
        (10, 1530, 1700)
]

def find_num_classrooms():
    num_classrooms = 0
    priority_queue = PriorityQueue()

    # jobs must be processed in order of start time
    for job in sorted(jobs, key=lambda job: job[1]):
        # pop the classroom whose last lecture finishes earliest
        classroom = priority_queue.pop()
        if classroom is None:
            # no classroom yet, allocate the first one
            num_classrooms += 1
            priority_queue.push(Classroom(num_classrooms, job[2]), job[2])
        elif classroom.finish_time <= job[1]:
            # classroom is free by the time this lecture starts, reuse it
            classroom.finish_time = job[2]
            priority_queue.push(classroom, job[2])
        else:
            num_classrooms += 1
            # push the busy classroom back, keyed on its own finish time
            priority_queue.push(classroom, classroom.finish_time)
            # push the new classroom for this lecture
            priority_queue.push(Classroom(num_classrooms, job[2]), job[2])

    return num_classrooms

print("Number of classrooms required: " + str(find_num_classrooms()))

Java Implementation

package com.company;

import java.util.*;

/**
 * Created by sangar on 24.4.18.
 */
public class IntervalPartition {

    public static int findIntervalPartitions(ArrayList<Interval> intervals){
        PriorityQueue<Interval> queue =
                new PriorityQueue<Interval>(intervals.size(), Comparator.comparing(p -> p.getEndTime()));

        for(Interval currentInterval : intervals) {
            if (queue.isEmpty()) queue.add(currentInterval);
            else {
                if (queue.peek().getEndTime() > currentInterval.getStartTime()) {
                    queue.add(currentInterval);
                } else {
                    queue.remove();
                    queue.add(currentInterval);
                }
            }
        }
        return queue.size();
    }

    public static void main(String args[] ) throws Exception {
        ArrayList<Interval> intervals = new ArrayList<>();

        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(930,1300));
        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(1130,1300));
        intervals.add(new Interval(1100,1400));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1430,1700));
        intervals.add(new Interval(1530,1700));

        Collections.sort(intervals, Comparator.comparing(p -> p.getStartTime()));

        int minimumClassRooms = findIntervalPartitions(intervals);
        System.out.println(minimumClassRooms);
    }
}

This algorithm takes overall time of O(n log n) dominated by the sorting of jobs on start time. Total number of priority queue operations is O(n) as we have only n lectures to schedule and for each lecture we have push and pop operation.

There is another method, using the binary search algorithm, which can be used to solve this problem. As per the problem statement, we have to find the minimum number of classrooms needed to schedule n lectures. What is the maximum number of classrooms required? It is the number of lectures, when all lectures conflict with each other.
The minimum number of classrooms is 0, when there is no lecture to be scheduled. Now we know the range of values for the number of classrooms. How can we find the minimum?

The basic idea is that if we can schedule all n lectures in m rooms, then we can definitely schedule them in m+1 or more rooms. So the minimum number of rooms required is m or less, and we can safely discard all candidate solutions from m+1 to n (remember, n is the maximum number of classrooms).
And if we cannot schedule the lectures in m rooms, then there is no way to schedule them in fewer than m rooms. Hence we can discard all candidate solutions from 0 to m.

How do we select m? We can select it as the middle of the range (0, n) and try to fit all lectures into those m rooms so that no two lectures in a room conflict. Keep track of the end time of the last lecture in each classroom. If no classroom has an end time less than or equal to the start time of the new lecture, allocate a new classroom. If the total number of classrooms is less than or equal to m, discard m+1 to n. If it is more than m, discard 0 to m and search in m+1 to n.
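
A rough sketch of this search in Python is given below. The names `can_schedule` and `min_rooms_binary_search` are made up for illustration, lectures are assumed to be (start, finish) pairs, and the feasibility check stops as soon as more than m rooms would be needed.

```python
import heapq


def can_schedule(lectures, rooms):
    """Check whether lectures (already sorted by start time) fit in `rooms` rooms."""
    finish_heap = []  # finish time of the last lecture in each room in use
    for start, finish in lectures:
        if finish_heap and finish_heap[0] <= start:
            heapq.heapreplace(finish_heap, finish)   # reuse the earliest-free room
        elif len(finish_heap) < rooms:
            heapq.heappush(finish_heap, finish)      # open another room
        else:
            return False                             # would need more than `rooms`
    return True


def min_rooms_binary_search(lectures):
    lectures = sorted(lectures)
    low, high = 0, len(lectures)       # answer lies between 0 and n
    while low < high:
        mid = low + (high - low) // 2
        if can_schedule(lectures, mid):
            high = mid                 # mid rooms are enough, discard mid+1..high
        else:
            low = mid + 1              # mid rooms are too few, discard low..mid
    return low
```

Called with the nine lectures from the example, `min_rooms_binary_search` should agree with the greedy answer of 3.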

package com.company;

import java.util.*;

/**
 * Created by sangar on 24.4.18.
 */
public class IntervalPartition {

    public static boolean predicate(ArrayList<Interval> intervals, long candidateClassRooms){

        int i = 0;

        PriorityQueue<Interval> queue =
                new PriorityQueue<Interval>(intervals.size(), Comparator.comparing(p -> p.getEndTime()));

        for(Interval currentInterval : intervals){
            if(queue.isEmpty()) queue.add(currentInterval);
            else{
                if(queue.peek().getEndTime() > currentInterval.getStartTime()){
                    queue.add(currentInterval);
                }
                else{
                    queue.remove();
                    queue.add(currentInterval);
                }
            }
        }

        return queue.size() <= candidateClassRooms;
    }

    public static void main(String args[] ) throws Exception {
        ArrayList<Interval> intervals = new ArrayList<>();

        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(930,1300));
        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(1130,1300));
        intervals.add(new Interval(1100,1400));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1430,1700));
        intervals.add(new Interval(1530,1700));

        long low = 0;
        long high = intervals.size();

        Collections.sort(intervals, Comparator.comparing(p -> p.getStartTime()));

        while(low < high){
            long mid  = low + ( (high - low) >> 1);

            if(predicate(intervals, mid)){
                high = mid;
            }else{
                low = mid+1;
            }
        }
        System.out.println(low);
    }
}

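The complexity depends on the number of lectures to be scheduled. Sorting takes O(n log n), and the binary search calls the predicate O(log n) times at O(n log n) per call, so the total is O(n log^2 n), with additional space complexity of O(c), where c is the number of classrooms required.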

Median of two sorted arrays

Before we find the median of two sorted arrays, let’s understand what is the median?

Median is the middle value in a list of numbers.

For example,

Input:
A = [2,4,5,6,7,8,9].
Output:
6

To find the median, the input must be sorted. If it is not, sort it first and then return the middle element of the list. What if the number of elements in the list is even? In that case, the median is the average of the two middle elements.

Median of two sorted arrays

There are two sorted arrays nums1 and nums2 of size m and n respectively.
Find the median of the two sorted arrays. The overall run time complexity should be O(log (m+n)).

median of two sorted arrays

Before going into the post, find a pen and paper and try to work out an example. As I say in all our posts, first come up with a method that solves the problem assuming you have all the time and resources you need; in other words, think of the most brute-force solution.

Let’s simplify the question first and then work upwards. If the question were to find the median of one sorted array, how would you solve it?
If the array has an odd number of elements, return A[mid], where mid = (start + end)/2; if it has an even number of elements, return the average of A[mid] and A[mid+1].
For example for array A = [1,5,9,12,15], median is 9.
Complexity of this operation is O(1).
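
A small sketch of this O(1) lookup, assuming a non-empty, already sorted Python list (the name `median_sorted` is only illustrative):

```python
def median_sorted(a):
    """Median of a non-empty sorted list in O(1)."""
    mid = (len(a) - 1) // 2            # (start + end) / 2 with start = 0, end = len - 1
    if len(a) % 2 == 1:
        return a[mid]                  # odd count: the single middle element
    return (a[mid] + a[mid + 1]) / 2   # even count: average of the two middle elements


print(median_sorted([1, 5, 9, 12, 15]))  # prints 9
```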

Focus back on two sorted arrays. Finding the median of two sorted arrays is no longer simple and definitely not an O(1) operation. For example,

A = [ 1,5,9,12,15] and B = [ 3,5,7,10,17], median is 8.

How about merging these two sorted arrays into one? The problem is then reduced to finding the median of one array.

Although finding the median of a sorted array is O(1), the merge step takes O(n) operations. Hence the overall complexity is O(n).
Reuse the merge part of Merge sort algorithm to merge two sorted arrays.

Start from the beginning of two arrays and advance the pointer of the array whose current element is smaller than the current element of the other. This smaller element is put on to output array which is sorted merged array. Merge will use an additional space to store N elements (Note that N is here sum of the size of both sorted arrays). The best part of this method is that it does not consider if the size of the two arrays is the same or different.

This can be optimized by counting the total number of elements n in the two arrays in advance. The median sits at index n/2 of the merged array (together with index n/2 - 1 when n is even), so we need to merge only the first n/2 + 1 elements. This roughly halves the extra space.

There is another optimization: do not store the merged elements at all; keep track of only the last two elements merged and count how many elements have been merged so far. When n/2 + 1 elements have been merged, return the average of the last two elements if n is even, else return the last element as the median. With these optimizations the time complexity remains O(n), but the space complexity reduces to O(1).
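
The constant-space idea can be sketched as follows. This is an illustrative version only: the name `median_by_merging` is made up, and at least one of the two lists is assumed to be non-empty.

```python
def median_by_merging(a, b):
    """Median of two sorted lists using O(1) extra space."""
    n = len(a) + len(b)
    i = j = 0
    prev = curr = None                 # the last two elements "merged" so far
    for _ in range(n // 2 + 1):        # merge only the first n/2 + 1 elements
        prev = curr
        if i < len(a) and (j >= len(b) or a[i] <= b[j]):
            curr = a[i]
            i += 1
        else:
            curr = b[j]
            j += 1
    if n % 2 == 1:
        return curr                    # odd total: the last element merged
    return (prev + curr) / 2           # even total: average of the last two


print(median_by_merging([1, 5, 9, 12, 15], [3, 5, 7, 10, 17]))  # prints 8.0
```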

Implementation with merge function

package com.company;

/**
 * Created by sangar on 18.4.18.
 */
public class Median {

    public static double findMedian(int[] A, int[] B){
        int[] temp = new int[A.length + B.length];

        int i = 0;
        int j = 0;
        int k = 0;
        int lenA = A.length;
        int lenB = B.length;

        while(i<lenA && j<lenB){
            if(A[i] <= B[j]){
                temp[k++] = A[i++];
            }else{
                temp[k++] = B[j++];
            }
        }
        while(i<lenA){
            temp[k++] = A[i++];
        }
        while(j<lenB){
            temp[k++] = B[j++];
        }

        int lenTemp = temp.length;

        if((lenTemp)%2 == 0){
            return ( temp[lenTemp/2-1] + temp[lenTemp/2] )/2.0;
        }
        return temp[lenTemp/2];
    }

    public static void main(String[] args){
        int[] a = {1,3,5,6,7,8,9,11};
        int[] b = {1,4,6,8,12,14,15,17};

        double median = findMedian(a,b);
        System.out.println("Median is " + median);
    }
}

Optimized version to median of 2 sorted arrays

package com.company;

/**
 * Created by sangar on 18.4.18.
 */
public class Median {

    public static double findMedianOptimized(int[] A, int[] B){
        int i = 0;
        int j = 0;
        int lenA = A.length;
        int lenB = B.length;

        int mid = (lenA + lenB)/2;
        int midElement = -1;
        int midMinusOneElement = -1;

        // Merge only the first mid + 1 elements, remembering the last two.
        // Assumes at least one of the arrays is non-empty.
        for(int k = 0; k <= mid; k++){
            midMinusOneElement = midElement;
            if(i < lenA && (j >= lenB || A[i] <= B[j])){
                midElement = A[i++];
            }else{
                midElement = B[j++];
            }
        }

        // Divide by 2.0 so that the average is not truncated
        if((lenA+lenB)%2 == 0){
            return (midElement + midMinusOneElement)/2.0;
        }
        return midElement;
    }

    public static void main(String[] args){
        int[] a = {1,3,5,6,7,8,9,11};
        int[] b = {1,4,6,8,12,14,15,17};

        double median = findMedianOptimized(a,b);
        System.out.println("Median is " + median);
    }
}

Binary search approach

One of the properties which lead us to think about binary search is that two arrays are sorted. Before going deep into how binary search algorithm can solve this problem, first find out mathematical conditions which should hold true for a median of two sorted arrays.

As explained above, the median divides the input into two equal parts, so the first condition a median index m must satisfy is that a[start..m] and a[m+1..end] are of equal size. We have two arrays A and B; let’s split each of them into two. The first array A is of size m, and it can be split in m+1 ways, at positions 0 to m.

If we split at i, len(Aleft) = i and len(Aright) = m-i.
When i=0, len(Aleft) = 0 and when i=m, len(Aright) = 0.

median of two sorted arrays

Similarly, array B of size n can be split in n+1 ways, with j running from 0 to n.

find median of sorted arrays

After splitting at specific indices i and j, how can we derive the condition for the median, namely that the left part should be the same size as the right part?

If len(Aleft) + len(Bleft) == len(Aright) + len(Bright), it satisfies our condition. As we already know these values for a split at i and j, the equation becomes

i + j = (m - i) + (n - j)

median of two sorted array

But is this the only condition to satisfy for the median? As we know, the median is the middle of the sorted list, so we have to guarantee that all elements in the left part are less than or equal to the elements in the right part.

The max of the left part must be less than or equal to the min of the right part. What is the max of the left part? It can be either A[i-1] or B[j-1]. What can be the min of the right part? It can be either A[i] or B[j].

We already know that A[i-1] <= A[i] and B[j-1] <= B[j], as arrays A and B are sorted. All we need to check is whether A[i-1] <= B[j] and B[j-1] <= A[i]. If indices i and j satisfy these conditions, then the median is the average of the max of the left part and the min of the right part if n+m is even, and max(A[i-1], B[j-1]) if n+m is odd.

Let’s assume that n >= m and set j = (n+m+1)/2 - i. This always gives a non-negative j for every possible value of i (0 to m), which avoids array out-of-bounds errors, and it automatically makes the first condition true (for an odd total, the left part holds the one extra element).

Now the problem reduces to finding an index i such that A[i-1] <= B[j] and B[j-1] <= A[i].

This is where binary search comes into the picture. We start with i as the middle of array A, compute j = (n+m+1)/2 - i, and see if this i satisfies the condition. There are three possible outcomes.
1. A[i-1] <= B[j] and B[j-1] <= A[i] both hold: we have found i and return it.
2. B[j-1] > A[i]: A[i] is too small. How can we increase it? By moving i to the right. As i increases, A[i] can only increase and j decreases, so B[j-1] can only decrease; together this moves towards B[j-1] <= A[i]. So limit the search space for i to mid+1 to m and go to step 1.
3. A[i-1] > B[j]: A[i-1] is too big, and we must decrease i to get A[i-1] <= B[j]. Limit the search space for i to 0 to mid-1 and go to step 1.

Let’s take an example and see how this works. Our initial two arrays are as follows.

median of two sorted arrays leetcode

The index i is the middle of array A, and the corresponding j is as shown.

median of two sorted arrays leetcode solution

Since condition B[j-1] <= A[i] is not met, we discard left of A and right of B and find new i and j based on remaining array elements.

get median of two sorted arrays

Finally, our condition that A[i-1]<= B[j] and B[j-1] <=A[i] is satisfied, find the max of left and min of right and based on even or odd length of two arrays, return average of the max of left and min of right or return a max of left.

This algorithm has a dangerous implementation caveat: what if i or j is 0? In that case i-1 or j-1 is an invalid index. Likewise, A[i] is invalid when i == m and B[j] is invalid when j == n. With n >= m, j can be zero only when i == m, so as long as i < m there is no need to worry about j being zero. So be sure to check i < m before reading A[i] and B[j-1], and i > 0 before reading A[i-1] and B[j].
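
One way to sidestep these boundary checks, shown here purely as an alternative sketch and not as the approach this post implements, is to substitute minus and plus infinity for elements that fall outside an array. The function name `median_two_sorted` is hypothetical, and at least one input list is assumed to be non-empty.

```python
import math


def median_two_sorted(a, b):
    """Median of two sorted lists by binary search over the shorter one."""
    if len(a) > len(b):
        a, b = b, a                          # make sure len(a) <= len(b)
    m, n = len(a), len(b)
    half = (m + n + 1) // 2                  # size of the combined left part
    lo, hi = 0, m
    while lo <= hi:
        i = (lo + hi) // 2
        j = half - i
        # Sentinels stand in for elements that fall outside either list.
        a_left = a[i - 1] if i > 0 else -math.inf
        a_right = a[i] if i < m else math.inf
        b_left = b[j - 1] if j > 0 else -math.inf
        b_right = b[j] if j < n else math.inf
        if b_left > a_right:
            lo = i + 1                       # i is too small
        elif a_left > b_right:
            hi = i - 1                       # i is too big
        else:
            max_left = max(a_left, b_left)
            if (m + n) % 2 == 1:
                return max_left
            return (max_left + min(a_right, b_right)) / 2
    raise ValueError("inputs must be sorted")
```

For A = [1,5,9,12,15] and B = [3,5,7,10,17], the first probe i = 2, j = 3 already satisfies both conditions, giving (7 + 9) / 2 = 8.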

Implementation

package com.company;

/**
 * Created by sangar on 18.4.18.
 */
public class Median {

    public static double findMedianWithBinarySearch(int[] A, int[] B){

        int[] temp;

        int lenA = A.length;
        int lenB = B.length;

        /*We want array A to be always smaller than B
          so that j is always greater than zero
         */
        if(lenA > lenB){
            temp = A;
            A = B;
            B = temp;
        }

        int iMin = 0;
        int iMax = A.length;
        int midLength =  ( A.length + B.length + 1 )/2;

        int i = 0;
        int j = 0;

        while (iMin <= iMax) {
            i = (iMin + iMax) / 2;
            j = midLength - i;
            if (i < A.length && B[j - 1] > A[i]) {
                // i is too small, must increase it
                iMin = i + 1;
            } else if (i > 0 && A[i - 1] > B[j]) {
                // i is too big, must decrease it
                iMax = i - 1;
            } else {
                // i is perfect
                int maxLeft = 0;
                //If the left part takes nothing from array A
                if (i == 0) maxLeft = B[j - 1];
                //If the left part takes nothing from array B
                else if (j == 0) maxLeft = A[i - 1];
                //Otherwise both arrays contribute, take the larger one
                else maxLeft = Integer.max(A[i - 1], B[j - 1]);

                //If length of two arrays is odd, return max of left
                if ((A.length + B.length) % 2 == 1)
                    return maxLeft;

                int minRight = 0;
                if (i == A.length) minRight = B[j];
                else if (j == B.length) minRight = A[i];
                else minRight = Integer.min(A[i], B[j]);

                return (maxLeft + minRight) / 2.0;
            }
        }
        return -1;
    }

    public static void main(String[] args){
        int[] a = {1,3,5,6,7,8,9,11};
        int[] b = {1,4,6,8,12,14,15,17};

        double median = findMedianWithBinarySearch(a,b);
        System.out.println("Median is " + median);
    }
}

The complexity of this algorithm to find the median of two sorted arrays is O(log(min(m,n))), where m and n are the sizes of the two arrays, because the binary search runs over the smaller array.

2 Sum problem

2 sum problem goes like: given an array a[] and a number X, find two elements or pair with given sum X in the array. For example:

Given array : [3,4,5,1,2,6,8] X = 10
The answer could be (4,6) or (2,8).

Before looking at the post below, we strongly recommend taking a pen and paper and giving it a try yourself.

Thought process

Ask some basic questions about the problem; it’s a good way to dig deeper into it and gain confidence. Remember, interviewers are not trained interrogators; they slip in a hint or two about the solution when you ask relevant questions.

  • Is the array sorted? If not, think about the additional complexity you would add by sorting it.
  • Are duplicates present in the array?
  • Is returning the first pair enough, or should we return all pairs with a sum equal to X?
  • Can there be negative numbers in the array?

This problem is used regularly in interviews because it tests so many things about your programming knowledge.
It checks whether you can traverse an array properly, from both the lower and the higher end. It also checks your ability to optimize once you have a working solution. Can you work with additional constraints? Can you combine more than one data structure, such as an array and a hash, to solve a problem?

2 sum problem using sorting

Let’s start with the assumption that the input is a sorted array; if it is not, we sort it first. If you want to know how to sort an array efficiently, refer to Quick sort or Merge sort.
With a sorted array, we can apply the algorithm below to find a pair with the given sum.

1. Initialize two variables, left = 0 and right = array.length-1
2. While left and right do not cross each other,
3. Get the sum of the elements at indices left and right, i.e. A[left] + A[right]
4. If sum > X, move towards the left from the end, i.e. decrease right by 1
5. Else if sum < X, move towards the right from the start, i.e. increment left
6. Otherwise sum is equal to X, so return (left, right) as a pair.
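
The steps above can be sketched in Python as follows. The function name `pairs_with_sum` is an assumption for illustration; it collects every matching pair of values rather than stopping at the first, and it assumes the values are distinct (with duplicates, moving only `right` after a match can skip some combinations).

```python
def pairs_with_sum(sorted_values, target):
    """Return all value pairs from a sorted list that add up to target."""
    left, right = 0, len(sorted_values) - 1
    pairs = []
    while left < right:
        total = sorted_values[left] + sorted_values[right]
        if total > target:
            right -= 1                   # sum too big, move in from the end
        elif total < target:
            left += 1                    # sum too small, move in from the start
        else:
            pairs.append((sorted_values[left], sorted_values[right]))
            right -= 1                   # keep looking for further pairs
    return pairs


print(pairs_with_sum(sorted([3, 4, 5, 1, 2, 6, 8]), 10))  # prints [(2, 8), (4, 6)]
```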

Let’s see how this works with an example, and then we will implement it. Given an array as shown and sum = 17, find all pairs that sum to 17.

2 sum problem

Initialization step, left = 0 and right = array.length – 1

find two number with sum k

A[left] + A[right] = 20 which is greater than sum (17), move right towards left by 1.

pair with given sum

Again, A[left] + A[right] = 18 which is greater than sum (17), move right towards left by 1.

two numbers with given sum

At this point, A[left] + A[right] is less than sum(17), hence move left by 1

2 sum problem

Now, A[left] + A[right]  is equal to the sum and so add this pair in the result array. Also, decrease right by 1, why?

At this point, A[left] + A[right] is less than sum(17), hence move left by 1

sum of two numbers equal to a number

Again, A[left] + A[right] is less than sum(17), hence move left by 1

two elements add to a given number

A[left] + A[right]  is equal to the sum and so add this pair in the result array. Also, decrease right by 1.

Since left and right point to the same element now, there cannot be a pair anymore, hence return.

Show me the implementation

package com.company;

import javafx.util.Pair;

import java.util.ArrayList;

/**
 * Created by sangar on 5.4.18.
 */
public class PairWithGivenSum {
    public static ArrayList<Pair<Integer, Integer>> pairWithGivenSum(int[] a, int sum){
        int left = 0;
        int right = a.length - 1;

        ArrayList<Pair<Integer, Integer>> resultList = new ArrayList<>();

        while(left < right){
            int currentSum = a[left] + a[right];
            /*If sum of two elements is greater than
              sum required, move towards left */
            if(currentSum > sum) right--;
            /*
              If sum of two elements is less than
              sum required, move towards right
            */
            else if(currentSum < sum) left++;
            else{
                resultList.add(new Pair<>(left, right));
                right--;
            }
        }
        return resultList;
    }
    public static void main(String[] args) {
        int a[] = new int[] {10, 20, 30, 40, 50};

        ArrayList<Pair<Integer, Integer>> result = pairWithGivenSum(a,50);
        for (Pair<Integer, Integer> pair : result ) {
            System.out.println("("+ pair.getKey() + "," + pair.getValue()  + ")");
        }
    }
}

The complexity of this algorithm to find two numbers in an array with sum X depends on the sorting algorithm used, since the two-pointer scan itself is O(n). With merge sort, the complexity is O(n log n) with added space complexity of O(n). With quick sort, the worst-case complexity is O(n^2), and the only extra space is the recursion stack.

Solution with hashmap

In the first method the array is modified if it is not already sorted. Also, the preprocessing step (sorting) dominates the complexity of the algorithm. Can we do better than O(n log n) or, in other words, can we avoid sorting?

Suppose an additional constraint is put on the problem: you cannot modify the original input. Use basic mathematics: if A + B = C, then A = C - B. Take each element as B and work out the A it needs. The idea is to scan the entire array and record the A required by each element, then scan the array again and check whether the current element is an A that some B required.
To keep track of the required A values we create a hash, which makes each lookup in the second step O(1).
We can optimize further by scanning the array only once for both steps.

1. Create a hash
2. Check the element at each index of the array
    2.a If the element at the current index is already in the hash, return the pair of the current index and the value stored in the hash
    2.b If not, subtract the element from sum and store the (sum-A[index], index) key-value pair in the hash.
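
A minimal sketch of the single-scan version, using a Python dict as the hash. The name `pair_with_sum` is hypothetical, and the sketch returns only the first pair of indices it finds.

```python
def pair_with_sum(values, target):
    """Return indices (i, j), i < j, with values[i] + values[j] == target, else None."""
    needed = {}  # required complement -> index of the element that requires it
    for index, value in enumerate(values):
        if value in needed:                  # step 2.a: someone was waiting for this value
            return needed[value], index
        needed[target - value] = index       # step 2.b: record what this element needs
    return None


print(pair_with_sum([3, 4, 5, 1, 2, 6, 8], 10))  # prints (1, 5), i.e. values 4 and 6
```

Because a dict accepts any hashable key, negative numbers need no special treatment here.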

This algorithm scans the array only once and does not change the input. Time complexity is O(n), assuming O(1) hash operations; the hash brings additional space complexity. How big should the hash be? If the hash is a plain array indexed directly by the required value, the candidate A’s range from sum minus the max value of the array to sum minus the min value, so the table size is the difference between these two values.

A directly indexed table like this does not work in C if there are negative numbers in the array, because an index cannot be negative. Languages with built-in HashMaps do not have this problem. In C we have to do some preprocessing, such as adding the absolute value of the smallest negative number to all elements (the target sum then shifts by twice that amount). That’s where our fourth question above helps us decide.

2 sum problem hash based implementation

package com.company;

import javafx.util.Pair;

import java.util.ArrayList;
import java.util.HashMap;

/**
 * Created by sangar on 5.4.18.
 */
public class PairWithGivenSum {
    public static ArrayList<Pair<Integer, Integer>> pairsWithGivenSum2(int[] a, int sum){
        ArrayList<Pair<Integer, Integer>> resultList = new ArrayList<>();

        // maps a still-needed value to the index of the element that needs it
        HashMap<Integer, Integer> pairMap = new HashMap<>();
        for(int i=0; i< a.length; i++){
            if(pairMap.containsKey(a[i])){
                resultList.add(new Pair<>(pairMap.get(a[i]), i));
            }
            pairMap.put(sum-a[i], i);
        }
        return resultList;
    }
    public static void main(String[] args) {
        int a[] = new int[] {10, 20, 30, 40, 50};

        ArrayList<Pair<Integer, Integer>> result = pairsWithGivenSum2(a,50);
        for (Pair<Integer, Integer> pair : result ) {
            System.out.println("("+ pair.getKey() + "," + pair.getValue()  + ")");
        }
    }
}

Minimum number of pages to read

In the previous post, Ceiling in sorted array using binary search, we understood a very important concept about applying binary search to problems where the minimum or maximum of something is asked. In that post we were asked to find the minimum element greater than a target value. We will use the same concept to solve another interesting problem: find the minimum number of pages to read for each student. Problem statement:
Given N different books and M students. Each book has a certain number of pages. Every student is assigned some consecutive books to read. Find the minimum number of pages each student has to read so that all books are read; in other words, make the largest share any one student gets as small as possible. Note that a student cannot read part of a book; he or she needs to read the entire book. For example, if the number of pages of 8 books are as given below and there are 3 students to finish those books, a student has to read at least 84 pages. Books have to be read in sequence, and each book is either read completely or not at all by a student.

minimum number of pages to read

Books read by each student is shown below

If we change the order of books as shown below, minimum number of pages each student has to read are 82

Minimum number of pages to read : Thought process

Before we solve it, let’s revisit the basic premise to use binary search algorithm.

Binary search can be used if and only if for all x in candidate Set S, predicate(x) implies predicate(y) for all y > x.

In this problem, if students can finish N books with each student reading at most K pages, then it is definitely possible to finish N books with each student reading at most K+1 pages, or any larger limit. The predicate is therefore monotonic, so the problem satisfies the condition to apply binary search.
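The premise above can be sketched as a small generic helper. This is only an illustration, assuming a monotone predicate that is true at the upper bound; the names MonotoneSearch and firstTrue are made up for this sketch and are not part of the implementation given later in the post.

```java
import java.util.function.LongPredicate;

class MonotoneSearch {
    /**
     * Returns the smallest x in [low, high] for which predicate.test(x) is true.
     * Assumes the predicate is monotone (false ... false, true ... true)
     * and that predicate.test(high) is true.
     */
    static long firstTrue(long low, long high, LongPredicate predicate) {
        while (low < high) {
            long mid = low + ((high - low) >> 1);
            if (predicate.test(mid)) {
                high = mid;      // mid works, so the answer is mid or smaller
            } else {
                low = mid + 1;   // mid fails, so the answer is larger than mid
            }
        }
        return low;
    }

    public static void main(String[] args) {
        // Smallest x in [0, 100] with x * x >= 50; prints 8
        System.out.println(firstTrue(0, 100, x -> x * x >= 50));
    }
}
```

For the books problem, the predicate would be "all books can be read with at most K pages per student", and the bounds would be the ones derived in the next paragraphs.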

For binary search algorithm, three things are required : search space or candidate solution set, lower bound and upper bound of search space.
Assume that there is only one student. What will be the minimum number of pages he or she has to read to finish all books? Obviously, the student has to read all pages in all books. This gives us the upper bound of our solution set: the answer to this problem cannot be more than the sum of pages of all books.

Now, assume that we have M students but there is no book to read. Then the minimum number of pages to be read by each student is zero. A student cannot read fewer than zero pages, hence the lower bound of the solution is zero.
At this point, we know the lower and upper bounds of the solution. How can we find the required minimum number of pages with N books and M students?

Idea is to start with the middle of the lower and upper bounds of pages to be read. Let’s call it K. With each student reading at most K pages, will all books be completed? If yes, it is always possible to finish all books with each student reading more than K pages, hence there is no need to check from K+1 to the upper bound. All we need to verify is whether there is a solution with each student reading K or fewer pages.

Designing predicate function

What will be the predicate? The predicate goes through the books in order and keeps a running sum of pages for the current student. As soon as adding the next book would take the running sum above the current candidate, we add one more student and start a new running sum. If any single book has more pages than the candidate, the candidate can be rejected immediately. When all books are finished, we check if we required less than or equal to M students. If yes, this candidate solution is valid and the predicate should return true. If more than M students are required to finish all books, then the current candidate is not valid and the function returns false.

Based on what is returned from predicate function, either right or left subset of candidate solution is discarded. In this example, if predicate function returns true, upper bound to be searched will be set to K. Else lower bound will be set to K+1.

Minimum number of pages to read  implementation

package com.company;

import java.util.Arrays;
import java.util.Scanner;

/**
 * Created by sangar on 28.3.18.
 */
public class Books {
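    // Returns true if all books can be read by at most `students` students
    // when no student reads more than `candidate` pages.
    public static boolean predicate(long[] books, long candidate, int students){

        long currentPages = 0;
        int studentRequired = 1;
        int i = 0;

        while(i<books.length){
            // A single book larger than the candidate can never be assigned.
            if(books[i] > candidate){
                return false;
            }
            if(currentPages + books[i] <= candidate){
                currentPages+=books[i];
                i++;
            }else{
                // Current student is full; book i goes to the next student.
                currentPages = 0;
                studentRequired++;
            }
        }
        return students >= studentRequired;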
    }

    public static void main(String args[] ) throws Exception {
        Scanner scanner = new Scanner(System.in);

        int books = scanner.nextInt();
        int students = scanner.nextInt();

        long [] pages = new long[books];

        for(int i=0; i<books; i++){
            pages[i] = scanner.nextLong();
        }

        long low = 0;
        long high = Arrays.stream(pages).sum();

        while(low < high){
            long mid  = low + ( (high - low) >> 1);

            if(predicate(pages, mid, students)){
                high = mid;
            }else{
                low = mid+1;
            }
        }
        System.out.println(low);
    }
}

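The predicate takes O(N) time for N books, and binary search calls it O(log S) times, where S is the sum of pages of all books. Complexity of the algorithm to find the minimum number of pages is therefore O(N log S).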

More problems on similar lines

It’s very interesting to see how many problems can be solved using same approach. I solved one on Hacker Rank : BooBoo and upsolving

  public static boolean predicate(long[] time, long candidateTime, int days){

        long currentTime = 0;
        int daysRequired = 1;
        int i = 0;

        while(i<time.length){
            if(time[i] > candidateTime){
                return false;
            }
            if(currentTime + time[i] <= candidateTime){
                currentTime+=time[i];
                i++;
            }else{
                currentTime = 0;
                daysRequired++;
            }
        }
        return days >= daysRequired;
    }

    public static void main(String args[] ) throws Exception {
        Scanner scanner = new Scanner(System.in);

        int tasks = scanner.nextInt();
        int days = scanner.nextInt();

        long [] time = new long[tasks];

        for(int i=0; i<tasks; i++){
            time[i] = scanner.nextLong();
        }

        /* What will be the maximum time he has to practice?
        It will be when he has only one day and all problems needs to be solved.
        that will give us the upper bound of time.

        What will be minimum time required? When he has no problems to be solved.
        That will give us lower bound of time.

        Idea is to start with middle of lower and upper bounds.And see if all problems can be solved
        by practicing that amount of time each day. If yes, there is a possibility that it can be done
        in less than that, hence, we try to find reduce our search space from lower bound to mid. Should mid be included?

        If all problems can not be solved by practicing mid amount of time, then there is no way it can be done
        by practicing less. Hence we increase the time and start looking in mid+1 to higher bound
        */

        //first let's set lower and higher bound.
        long low = 0;
        long high = Arrays.stream(time).sum();

        while(low < high){
            long mid  = low + ( (high - low) >> 1);

            if(predicate(time, mid, days)){
                high = mid;
            }else{
                low = mid+1;
            }
        }

        System.out.println(low);
    }

A similar method can be applied to the Topcoder problem Fair Work. Try it yourself, and if you are able to solve it, please drop the code in a comment.


Longest alternating Subsequence

In this post, we will discuss another dynamic programming problem: the longest alternating subsequence, also called the longest zigzag subsequence.

A sequence of numbers is called an alternating sequence if the differences between successive numbers strictly alternate between positive and negative values. In other words, the elements of an alternating sequence are alternately in increasing and decreasing order, which means they satisfy one of the conditions below:

x1 < x2 > x3 < x4 > x5 < … xn   or   x1 > x2 < x3 > x4 < x5 > … xn

A sequence with fewer than two elements is trivially a zigzag subsequence.

For example, 1,9,3,9,1,6 is a zigzag sequence because the differences (8,-6,6,-8,5) are alternately positive and negative. In contrast, 1,6,7,4,5 and 1,9,4,4,5 are not zigzag sequences: the first because its first two differences are positive, and the second because one of its differences (between 4 and 4) is zero.
Coming to the problem of the day: Given an array of integers, find longest alternating subsequence.

We have already seen a similar problem, longest increasing subsequence in an array. That problem is solved using a dynamic programming approach. To apply dynamic programming, we need two properties: first, optimal substructure, that is, the solution of the original problem depends on the optimal solutions of subproblems; and second, overlapping subproblems, so that we can save computation by memoization.

Do these two properties exist in this problem? Does the longest zigzag subsequence ending at index i have anything to do with the longest zigzag subsequence ending at j, where j is less than i? Also, it is already clear that an alternating subsequence can start either with a decrease followed by an increase, or with an increase followed by a decrease.

To add the ith element as the next element in a subsequence, consider two cases. First, the ith element can be greater than the last element of the longest zigzag subsequence ending at j, where j < i. In this case, we are looking for all such j where A[j] < A[i]. Another criterion for j is that A[j] is less than the element before it in that subsequence; that means, at j, we are looking for exactly the opposite condition to the one at i.

Second, the ith element can be less than the last element of the longest zigzag subsequence ending at j, where j < i. In this case, we are looking for all such j where A[j] > A[i]. Another criterion for j is that A[j] is greater than the element before it in that subsequence; that means, at j again, we are looking for exactly the opposite condition to the one at i.
For each i we will store the lengths for both cases.

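Let’s say increase[i] is the length of the longest zigzag subsequence ending at i for the first case (the last step goes up) and decrease[i] is the length for the second case (the last step goes down). Both start at 1, since a single element is trivially a zigzag subsequence.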

  increase[i] = max(decrease[j] + 1) for all j< i && A[j] < A[i]
  decrease[i] = max(increase[j] + 1) for all j< i && A[j] > A[i]

Longest alternating subsequence dynamic programming approach

Before going through the implementation, it will be great if you can go through Longest increasing subsequence using dynamic programming.
Implementation wise, both the increase and decrease arrays can be kept in one two dimensional array Table[][]. Table[i][0] represents the length of the longest zigzag subsequence ending at i in which A[i] is greater than the element just before it in the subsequence.

Similarly, Table[i][1] represents the length of the longest zigzag subsequence ending at i in which A[i] is less than the element just before it in the subsequence.

Table(i,0) = max(Table(j,1) + 1); 
             for all j < i and A[j] < A[i] 
Table(i,1) = max(Table(j,0) + 1); 
             for all j < i and A[j] > A[i]

What will be the length of the longest zigzag subsequence? It is the maximum over all indices i:

Result =  max (Table(i,0), Table(i,1)) for all i

#include <stdio.h>
#include <stdlib.h>
 
int max(int a, int b) {  return (a > b) ? a : b; }
 
int longestZigzagSubsequence(int A[], int n)
{
    int Table[n][2];
 
    for (int i=0; i<n; i++){
    	Table[i][0] = 1; 
    	Table[i][1] = 1;
    }
 
    int result = 1;
 
    for (int i=1; i<n; i++) {
        for (int j=0; j<i; j++){
            // If A[i] is greater than last element in subsequence,
            // then check with Table[j][1]
            if (A[j] < A[i] && Table[i][0] < Table[j][1] + 1)
                Table[i][0] = Table[j][1] + 1;
            /* If A[i] is smaller than last element in subsequence,
               then check with Table[j][0] */
            if (A[j] > A[i] && Table[i][1] < Table[j][0] + 1)
                Table[i][1] = Table[j][0] + 1;
        }
 
        /* Pick maximum of both values at index i  */
        if (result < max(Table[i][0], Table[i][1]))
            result = max(Table[i][0], Table[i][1]);
        printf("\n %d", result);
    }
 
    return result;
}
Complexity of the dynamic programming approach to find the longest alternating subsequence is O(n^2), using O(n) extra space.
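For readers following the Java examples on this site, the same recurrence can be sketched in Java with two separate arrays instead of Table[][]. This is a minimal sketch; the names ZigZag, longestZigzag, increase and decrease are chosen here for illustration, and an empty array is assumed to have answer 0.

```java
class ZigZag {
    static int longestZigzag(int[] a) {
        int n = a.length;
        if (n == 0) {
            return 0; // assumption: empty input has no subsequence
        }
        int[] increase = new int[n]; // last step goes up into a[i]
        int[] decrease = new int[n]; // last step goes down into a[i]
        int best = 1;

        for (int i = 0; i < n; i++) {
            increase[i] = 1;
            decrease[i] = 1;
            for (int j = 0; j < i; j++) {
                if (a[j] < a[i]) {
                    increase[i] = Math.max(increase[i], decrease[j] + 1);
                }
                if (a[j] > a[i]) {
                    decrease[i] = Math.max(decrease[i], increase[j] + 1);
                }
            }
            best = Math.max(best, Math.max(increase[i], decrease[i]));
        }
        return best;
    }

    public static void main(String[] args) {
        // The whole sequence is a zigzag, so this prints 6
        System.out.println(longestZigzag(new int[] {1, 9, 3, 9, 1, 6}));
    }
}
```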

Boolean Parenthesization Problem

Given a Boolean expression, that is, a string with True or False as operands and a Boolean operator (and &, or |, xor ^) between each pair of operands, find the number of ways in which this Boolean expression can be parenthesized so that it evaluates to True. This is known as the Boolean Parenthesization problem. To understand the problem better, let’s take some examples.
Expression :

T ^ F & T

Two ways :

((T ^ F) & T) and (T ^ (F & T))

boolean parenthesization problem

T | T & F ^ T

Four ways :

((T|T)&(F^T)), (T|(T&(F^T))), (((T|T)&F)^T)
and (T|((T&F)^T))

boolean-parenthesization

Boolean Parenthesization problem : Line of thoughts

What will be the most trivial Boolean expression? Of course, an expression with only one Boolean value T or Boolean value F.

How many ways can this expression be parenthesized so that it evaluates to True? There is at most one way.

For Boolean value T, there is one way, (T); whereas for F, there is no way to parenthesize it so that it evaluates to True. An expression can evaluate to either True or False.

Let’s say, T(i,j) is number of ways expression from i to j can be parenthesized so that it evaluates to True. Similarly, F(i,j) is number of ways expression evaluates to False. With base case, only one value either T or F is there, hence i=j, hence following equations hold true

T(i,i) = 1 if operand is T
         0 if operand is F

F(i,i) = 0 if operand is T
         1 if operand is F

How to calculate T(i, j) for an expression with more than one value and operators between them? This is similar to the matrix chain multiplication problem. We will put parentheses at all possible positions and count in how many ways the two resulting sub-expressions hold True. Once we have the counts for each sub-expression, we can combine them based on the operator between the split expressions.

For the expression from index i to index j, try every k such that i <= k < j, and find the number of ways the expressions from i to k and from k+1 to j evaluate to True. Once these numbers are determined, the number of ways for expression i to j can be calculated based on the operator between expression i to k and expression k+1 to j. In an expression with mixed operators, each split k contributes the term for its own operator.

When Boolean operator is & (AND)

When can expression (i,j) be True if it is of the form Expression(i, k) & Expression(k+1, j)? Only if Expression(i,k) and Expression(k+1,j) are both True. Hence, for any k, the expression can be True in T(i,k) * T(k+1, j) ways, where T(i,k) is the number of ways Expression(i,k) is True and T(k+1, j) is the number of ways Expression(k+1, j) is True. Over all possible values of k, this becomes

T(i,j)  = Summation ( T(i,k) * T(k+1,j) ) for all k such that i <= k < j

How about expression (i,j) evaluating to False? Simple enough: at least one of the two sub-expressions should evaluate to False.

If Total(i,j) represents the total number of ways an expression can be parenthesized irrespective of the outcome being True or False, then

Total(i,j) = Summation ( Total(i,k) * Total(k+1, j) ) for all k such that i <= k < j
or
Total(i,j) = T(i,j) + F(i,j)

If, for a split at k, we take out the number of ways both sides are True from the total for that split, we get the number of ways the split evaluates to False. Hence, the below equation

F(i,j) = Sum ( Total(i,k) * Total(k+1, j) - T(i,k) * T(k+1, j) ) for all k such that i <= k < j

When Boolean operator | (OR)

In case the operator is OR, the whole expression is True if any one of the sub-expressions is True. It is easier to first count in how many ways both Exp(i,k) and Exp(k+1, j) are False.

Following the same logic as for AND being True, it can be derived that

F(i,j) = Summation ( F(i,k) * F(k+1,j) ) for all k such that i <= k < j

The overall expression is True whenever the two sub-expressions are not both False. Hence,

T(i,j) = Sum ( Total(i,k) * Total(k+1,j) - F(i,k) * F(k+1,j) ) for all k such that i <= k < j

In the same vein, T(i,j) and F(i,j) when the operator is XOR will be

T(i,j) = Sum ( T(i,k) * F(k+1,j) + F(i,k) * T(k+1,j) ) for all k such that i <= k < j
F(i,j) = Sum ( T(i,k) * T(k+1,j) + F(i,k) * F(k+1,j) ) for all k such that i <= k < j

The solution to the Boolean parenthesization problem is T(1,N), where N is the number of operands.
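As a quick check of these recurrences, here is the first example, T ^ F & T, worked by hand. The operands are numbered 1 to 3, and a split at k means the operator right after operand k; this numbering is an assumption made only for the worked example.

```latex
\begin{align*}
&T(1,1)=1,\ F(1,1)=0,\quad T(2,2)=0,\ F(2,2)=1,\quad T(3,3)=1,\ F(3,3)=0\\[4pt]
&T(1,2) = T(1,1)\,F(2,2) + F(1,1)\,T(2,2) = 1\cdot 1 + 0\cdot 0 = 1
  && (\text{T \textasciicircum\ F})\\
&T(2,3) = T(2,2)\,T(3,3) = 0\cdot 1 = 0,\qquad
 F(2,3) = 1\cdot 1 - 0 = 1
  && (\text{F \& T})\\[4pt]
&T(1,3) = \underbrace{T(1,1)\,F(2,3) + F(1,1)\,T(2,3)}_{k=1,\ \text{xor}}
        + \underbrace{T(1,2)\,T(3,3)}_{k=2,\ \text{and}}
        = (1\cdot 1 + 0\cdot 0) + 1\cdot 1 = 2
\end{align*}
```

The result, 2, matches the two parenthesizations listed for this expression at the start of the post.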

Implementation : Boolean parenthesization problem

package com.company;

/**
 * Created by sangar on 31.12.17.
 */
public class BooleanParenthesis {

    public static int calculateNumberOfWays(String operators, String operands){
        int numOperands = operands.length();


        int[][] F = new int[numOperands][numOperands];
        int[][] T = new int [numOperands][numOperands];

        for (int i=0; i<numOperands; i++){
            System.out.println(operands.charAt(i));
            F[i][i] = (operands.charAt(i) == 'F')? 1: 0;
            T[i][i] = (operands.charAt(i) == 'T')? 1: 0;
            System.out.println(T[i][i]);
        }

        for (int L=1; L<numOperands; L++) {
            for (int i=0; i<numOperands-L; ++i){
                int j = i+L;
                T[i][j] = F[i][j] = 0;
                for (int k=i; k<j; k++){
                    int totalIK = T[i][k] + F[i][k];
                    int totalKJ = T[k+1][j] + F[k+1][j];
                    if (operators.charAt(k) == '&') {
                        T[i][j] += T[i][k]*T[k+1][j];
                        F[i][j] += (totalIK *totalKJ - T[i][k]*T[k+1][j]);
                    }
                    if (operators.charAt(k) == '|'){
                        F[i][j] += F[i][k]*F[k+1][j];
                        T[i][j] += (totalIK*totalKJ - F[i][k]*F[k+1][j]);
                    }
                    if (operators.charAt(k) == '^'){
                        T[i][j] += F[i][k]*T[k+1][j] + T[i][k]*F[k+1][j];
                        F[i][j] += T[i][k]*T[k+1][j] + F[i][k]*F[k+1][j];
                    }
                }
            }
        }
        for(int i=0; i<numOperands; i++){
            for(int j=0; j<numOperands; j++){
                System.out.println("(" + i + "," + j + ") :"  + T[i][j]);
            }
        }
        return T[0][numOperands-1];
    }

    public static void main(String[] args) {

        String operands = "TTFT";
        String operators = "|&^";

        System.out.println("Number of ways to parenthisize expression : " +
                calculateNumberOfWays(operators, operands));

    }
}

Complexity of the dynamic programming approach to find the number of ways to parenthesize a Boolean expression so that it evaluates to True is O(n^3), and space complexity is O(n^2).

Longest common substring

Given two strings A and B, find the longest common substring in them. For example, if A = “DataStructureandAlgorithms” and B = “Algorithmsandme”, then the longest common substring in A and B is “Algorithms”. The figure below shows the longest common substring.

longest common substring

Brute force solution is to find all substrings of one string and check whether any of them is a substring of the second string, keeping track of the longest one found while comparing. There can be O(n^2) substrings for a string of length n, and to find if a string is a substring of another takes another m operations, where m is the length of the second string. Hence, overall complexity of this method is O(n^2 m).

Can we do better than that?

Longest common substring : Line of thoughts

We have to find the longest common substring in strings of length M and length N. Can we find the longest common substring for lengths M-1 and N-1 and then derive the longest common substring for M and N? Yes, we can. The length either grows by one if the last characters are equal, or resets to zero if the last characters are not equal. Why so?

First, see why we need to reset to zero when the characters are different. We are looking for a common substring, which means the characters must be consecutive. Any differing character restarts the entire search, because no common substring can end at those two different characters.

What if the characters are the same? In that case we increment by one, because the common substring ending at N-1 and M-1 has length either 0 or some number based on how many consecutive common characters there were up to N-1 and M-1.

What will be the longest common substring when one of the strings is empty? It will be zero.

So, do you see the recursion here? Let’s write the recurrence relation and then implement it.

LCS(i,j) = 1+LCS(i-1, j-1) if S[i] = T[j] 
         =  0 otherwise

This recurrence relation has the optimal substructure property: the solution to the problem depends on solutions to subproblems. Also, there are subproblems which will be calculated again and again, which is called overlapping subproblems. These two properties are required for dynamic programming. To avoid recalculating subproblems, we will use memoization: create a two dimensional array called LCS with dimensions n and m. LCS[i][j] represents the length of the longest common substring that ends exactly at A[i] and B[j], and the final answer is the maximum entry in the table. Since the solution for i-1 and j-1 is required before the solution for i and j, this matrix will be filled bottom up.

Longest common substring using dynamic programming

How to fill LCS[i][j]?

1. Check if A[i] is equal to B[j]
   1.1 If yes, LCS[i][j] = 1 + LCS[i-1][j-1]
       (Because new character is added to already common substring,
        if any, till A[0..i-1] and B[0..j-1])
   1.2 If both characters are not same, LCS[i][j] = 0
       (Because if characters are not same, there cannot be any
        common substring including A[i] and B[j])

Implementation

#include <stdio.h>
#include <string.h>

int max(int a, int b){
	return a>b ? a:b;
}
 int longestCommonSubstring(char * A, char * B){
     int lenA = strlen(A);
     int lenB = strlen(B);
     int LCS[lenA+1][lenB+1];

     for (int i=0; i<= lenA; i++){
         LCS[i][0] = 0;
     }

     for (int j=0; j <= lenB; j++){
         LCS[0][j] = 0;
     }
	
     int maxLength = 0;
     for (int i=1; i<= lenA; i++){
        for (int j=1; j <= lenB; j++){
            // Row i and column j stand for characters A[i-1] and B[j-1]
            if (A[i-1] == B[j-1]){
                LCS[i][j] = 1 + LCS[i-1][j-1];		
                maxLength = max( maxLength, LCS[i][j] );
            } 
            else {
               LCS[i][j] = 0;
            }
         }
     }
     return maxLength;
}

int main(void) {
    char *a = "ABCDEFGSE";
    char *b = "EBCDEFGV";
	
    printf("\n Longest common substring : %d",
			longestCommonSubstring(a,b));
    return 0;
}
package com.company;

/**
 * Created by sangar on 5.1.18.
 */
public class LCS {

    public  static int longestCommonSubstring(String A, String B){
        int lenA = A.length();
        int lenB = B.length();

        // One extra row and column represent the empty prefix
        int [][] LCS = new int[lenA + 1][lenB + 1];

        for (int i=0; i<=lenA; i++){
            LCS[i][0] = 0;
        }

        for (int j=0; j<=lenB; j++){
            LCS[0][j] = 0;
        }

        int maxLength = 0;
        for (int i=1; i<=lenA; i++){
            for (int j=1; j<=lenB; j++){
                // Row i and column j stand for characters i-1 and j-1
                if (A.charAt(i-1) == B.charAt(j-1)){
                    LCS[i][j] = 1 + LCS[i-1][j-1];
                    maxLength = Integer.max(maxLength, LCS[i][j]);
                }
                else {
                    LCS[i][j] = 0;
                }
            }
        }

        for (int i=0; i<=lenA; i++){
            System.out.println();
            for (int j=0; j<=lenB; j++){
                System.out.print(" " + LCS[i][j]);
            }
        }
        return maxLength;
    }

    public static void main(String[] args) {
	    String a = "ABCDEFGS";
	    String b = "EBCDEFG";

        System.out.println("Longest common substring :" +
                longestCommonSubstring(a,b));
    }
}

Time complexity of dynamic programming approach to find length of longest common substring in two string is O(n*m) and space complexity is O(n*m) where n and m are lengths of two given strings.

longest common substring dynamic programming

In the next post, we will discuss the suffix tree method to find the longest common substring, which is more optimized than the DP solution and can easily be generalized to multiple strings.

This solution is very similar to Longest common subsequence. The difference between the two problems is that a subsequence is a collection of characters which may or may not be contiguous in the string, whereas for a substring, characters must be contiguous. Based on this difference, our solution varies a bit.

Find bridges in graph

Given an undirected graph, detect bridges in the graph.

An edge is called a bridge if and only if removing that edge increases the number of connected components by one.

For example, in the below graphs, bridges are shown in green
bridges in graph

The concept of detecting bridges in a graph will be useful in solving the Euler path or tour problem.

Depth First Search of graph can be used to see if graph is connected or not. We can use the same concept, one by one remove each edge and see if the graph is still connected using DFS. If yes, then the edge is not bridge edge, if not, then edge is bridge edge.

However, this method entails quite a complexity of O(E * (V+E)) where E is number of edges and V is number of vertices.

Let’s think of something better. Consider that we are looking at the edge (u,v) in a graph. Under what condition can we say that it is a bridge edge?
If we can somehow reach node u or any ancestor of u from any node which is a descendant of v without using the edge (u,v), the graph stays connected when the edge is removed, and (u,v) is not a bridge edge. If that is not possible, then (u,v) is a bridge.

How can we determine that there is no edge from a descendant of v to u or its ancestors? For that we need to maintain the time when a node was discovered during the depth-first search; call it tin[].

tin[u] is the time when node u was discovered using DFS. If tin[u] < tin[v], it means u was discovered before v.

Below is a graph with tin[u] filled for each node.
find bridges in a graph

Now, figure out the lowest tin[x] which can be reached from each node. The reason to find it is to see if there is a node x which is reachable from the subtree of v and has tin[x] less than or equal to tin[u], i.e. x is u itself or an ancestor of u reachable from the subtree of v.

Store the lowest discovery time reachable from node u in an array low[]. It starts as tin[u] and is updated as follows:
low[u] = min(low[u], tin[x])  for a back edge (u,x)
low[u] = min(low[u], low[v])  for a tree edge (u,v)

Idea here is that if (u,v) is a tree edge, then either there is no back edge from the subtree of v to u or an ancestor of u, in which case low[v] stays greater than tin[u];
or there is a back edge to some node x from the subtree of v, in which case the minimum tin[x] reached by a node in the subtree is propagated up to low[v] and then to low[u].

The diagram shows the calculation of low[] in a graph.
bridge edges in a graph
Finally, if low[v] > tin[u], that is, the earliest node that can be reached from the subtree of v was discovered after u, we have a bridge, because there is no way to reach u or an ancestor of u from v once we disconnect edge (u,v).

Lots of theory, let’s code it. We will be modifying Depth First Search implementation to keep track of tin[] and low[].

Bridges in a graph implementation

package AlgorithmsAndMe;

import java.util.*;

public class Bridges {

    Set<Integer> visited = new HashSet<>();
    /* This map stores the time when the
    current node is visited
     */
    Map<Integer, Integer> tin = new HashMap<>();

    /*
      low will store minimum on 
       tin[v]
       tin[p] for all p for which (v,p) is a back edge
       low[to] for all to for which (v,to) is a tree edge
     */
    Map<Integer, Integer> low = new HashMap<>();
    
    //To maintain monotonic increasing order.
    int timer;

    void dfs(Graph g, int u, int parent) {
        visited.add(u);

        //Put the current timer.
        tin.put(u, timer);
        //Low is the time of entry to start with
        low.put(u,timer);
        
        timer++;
        
        /*
            Go through all the neighbors
         */
        for (int to : g.getNeighbors(u)) {
            //If it is parent, nothing to be done
            if (to == parent) continue;
            
            /* If the neighbor was already visited
                get the minimum of the neighbor entry time
                or the current low of the node.
             */
            if (visited.contains(to)) {
                low.put(u, Math.min(low.getOrDefault(u, Integer.MAX_VALUE),
                        tin.getOrDefault(to, Integer.MAX_VALUE)));
            } else {
                //Else do the DFS
                dfs(g, to, u);
                /*
                 Normal edge scenario,
                 take the minimum of low of the parent and the child. 
                 */
                low.put(u, Math.min(low.getOrDefault(u, Integer.MAX_VALUE),
                        low.getOrDefault(to, Integer.MAX_VALUE)));
                
                /* If low of the child node is greater than
                   time of entry of current node, then
                   there is a bridge.
                 */
                if (low.get(to) > tin.get(u))
                    System.out.println(u + "->" + to);
            }
        }
    }

    public void findBridges(Graph g) {
        timer = 0;
        Iterator it = g.getNodes().iterator();
        while(it.hasNext()){
            int i = (int) it.next();
            if (!visited.contains(i))
                dfs(g, i, -1);
        }
    }
}

The complexity of finding bridges in a graph is O(V+E) where V is number of vertices and E is number of edges in graph.
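The implementation above relies on a Graph class that is not shown in this post. For experimenting, here is a self-contained sketch of the same tin[]/low[] idea on a plain adjacency list. The class name BridgeFinder and its methods are hypothetical, vertices are assumed to be numbered 0 to V-1, and the graph is assumed to have no parallel edges (skipping the parent would otherwise hide a second edge to it).

```java
import java.util.ArrayList;
import java.util.List;

class BridgeFinder {
    private final List<List<Integer>> adj = new ArrayList<>();
    private final int[] tin;          // discovery time of each vertex
    private final int[] low;          // lowest discovery time reachable from the subtree
    private final boolean[] visited;
    private final List<int[]> bridges = new ArrayList<>();
    private int timer = 0;

    BridgeFinder(int vertices) {
        tin = new int[vertices];
        low = new int[vertices];
        visited = new boolean[vertices];
        for (int i = 0; i < vertices; i++) {
            adj.add(new ArrayList<>());
        }
    }

    // Undirected edge, so it is stored in both adjacency lists
    void addEdge(int u, int v) {
        adj.get(u).add(v);
        adj.get(v).add(u);
    }

    List<int[]> findBridges() {
        for (int u = 0; u < adj.size(); u++) {
            if (!visited[u]) {
                dfs(u, -1);
            }
        }
        return bridges;
    }

    private void dfs(int u, int parent) {
        visited[u] = true;
        tin[u] = low[u] = timer++;
        for (int to : adj.get(u)) {
            if (to == parent) {
                continue;
            }
            if (visited[to]) {
                // Back edge: only the discovery time of `to` counts
                low[u] = Math.min(low[u], tin[to]);
            } else {
                dfs(to, u);
                low[u] = Math.min(low[u], low[to]);
                // Nothing in the subtree of `to` reaches u or above
                if (low[to] > tin[u]) {
                    bridges.add(new int[] {u, to});
                }
            }
        }
    }

    public static void main(String[] args) {
        // Triangle 0-1-2 with a tail 1-3-4; the tail edges are the bridges
        BridgeFinder g = new BridgeFinder(5);
        g.addEdge(0, 1);
        g.addEdge(1, 2);
        g.addEdge(2, 0);
        g.addEdge(1, 3);
        g.addEdge(3, 4);
        for (int[] b : g.findBridges()) {
            System.out.println(b[0] + "->" + b[1]);
        }
    }
}
```

Arrays are used here instead of the maps in the original code only because the vertices are assumed to be dense integers; the logic is otherwise the same.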

Problems you can solve using this concept:
796 – Critical Links

Median of integers stream

We have solved two problems involving streams: the first was to find the first non repeated character in a stream and the second was LRU cache. Let’s discuss another problem, which is to find the median of an integer stream. The problem statement is like this: given a continuous stream of integers, find the median of the integers received till a given point of time. The median can be asked multiple times.

To understand problem better, ask yourself, what is a median?

The median is the value separating the higher half from the lower half of a data sample. For a data set, it may be thought of as the “middle” value.

Wikipedia

For example, in the data set {1, 3, 3, 6, 7, 8, 9}, the median is 6, the fourth largest, and also the fourth smallest, number in the sample.

Median of a sorted array of integers is the element at the middle index if the size of the array is odd, and the average of the two middle elements if the size of the array is even.

Now, that we understood the definition of median. let’s go back to our problem and take an example to understand it further. Problem is that we get integers from a stream, one by one and at any given point of time, we have to return median of set of integers received till now. 
First, stream throws 12, then 7 and then 8. What will be the median now? It will be 8, because if we arrange 12,7,8 in sorted order, 8 is element at middle. What if we get 11 next? Well, now sorted order looks like 7,8,11,12. As size of set is even, we take average of mid and mid+1 element which is 9.5.

Median of integers stream : thoughts

What will be the brute force solution? As integers are processed from the stream, store them in an array. Can we store elements in arrival order? If yes, to find the median we have to sort the array every time. Complexity of this method to find the median in a stream of integers will be O(n log n) per query, dominated by the sorting algorithm.
How about we insert each element into the array in sorted order? This makes processing one integer from the stream O(n), as we have to move up to n elements to the right in the worst case, and O(n^2) for n integers overall.
Another underlying problem in using an array here is that we do not know how many integers will come out of the stream, so it is difficult to pre-allocate memory for it. A linked list can solve that problem; however, it does not reduce the complexity of processing, and at the same time it increases the complexity of finding the median from O(1) to O(n).

Think of this, do we need completely sorted set of  integers before we can calculate the median? Actually, we need kth smallest element of array if size of set is odd and average of kth and k+1th element if size of set is even, k will be n/2. 

However, we do not have pre-processed array with us. What is the property of the median? Median is greater than all elements on left of it and less than all elements on the right side of it, where the number of elements on both groups is equal or differs by 1.

Median of integers stream : Heaps

How about we split the incoming integers into two halves, a lower half and an upper half? Whenever the median is asked, if the sizes of the two halves differ by 1, we return the boundary element of the larger half (the maximum of the lower half or the minimum of the upper half). If the sizes of the two halves are equal, we return the average of the maximum of the lower half and the minimum of the upper half.

What data structure is best to find min and max in constant time? Heap it is. In this case, we will need two heaps, one max and another min heap. Max heap will store all the elements on the left side of median and min heap will store all the elements on the right side of the median.
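
As a small illustration of the two heaps, the sketch below builds a min heap and a max heap with java.util.PriorityQueue and reads their roots with peek(), which is a constant time operation. The helper names are made up for this example.

```java
import java.util.Collections;
import java.util.PriorityQueue;

class HeapPeekDemo {
    // PriorityQueue is a min heap by default.
    static PriorityQueue<Integer> minHeapOf(int... values) {
        PriorityQueue<Integer> heap = new PriorityQueue<>();
        for (int v : values) {
            heap.add(v);
        }
        return heap;
    }

    // Reversing the natural order turns it into a max heap.
    static PriorityQueue<Integer> maxHeapOf(int... values) {
        PriorityQueue<Integer> heap = new PriorityQueue<>(Collections.reverseOrder());
        for (int v : values) {
            heap.add(v);
        }
        return heap;
    }

    public static void main(String[] args) {
        System.out.println(minHeapOf(12, 7, 18, 10).peek());  // 7
        System.out.println(maxHeapOf(12, 7, 18, 10).peek());  // 18
    }
}
```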

How do we balance the size difference between the two heaps? Insert each newly processed integer into the max heap; if the size of the max heap is then 2 more than that of the min heap, extract the maximum element from the max heap and put it in the min heap.

Also, maintain the property that all the elements in the max heap are less than or equal to the elements in the min heap. So, whenever the root of the max heap is greater than the root of the min heap, it should be removed from the max heap and added to the min heap.

Let’s take an example to understand the method first and then make a concrete algorithm out of it. We have the first number from the stream as 12, what should we do? We decided to put every number on the max heap to start with.

12 is added to the max heap, as both heaps are empty at this point of time. Max heap: {12}, min heap: {}.

Now comes the integer 7. First of all, we add the new integer to the max heap. This creates a difference of more than one between the sizes of the max and min heaps. In that case, we take out the maximum from the max heap and put it into the min heap.

7 is added to the max heap, which makes the size difference more than 1, so the root of the max heap (12) is moved to the min heap. Max heap: {7}, min heap: {12}.

Next integer is 18. What happens now? We add it to the max heap. The difference between sizes is not more than 1; however, the root of the max heap (18) is greater than the root of the min heap (12). In this case too, we take the root of the max heap and move it to the min heap. At this point, if the median of the integer stream is asked, return the root of the min heap, which is 12.

18 is added to the max heap; however, now the root of the max heap is greater than the root of the min heap, so 18 is removed from the max heap and added to the min heap. Max heap: {7}, min heap: {12, 18}.

Then comes the integer 10. It goes into the max heap, does not create a size difference of more than 1, and the root of the max heap (10) is less than the root of the min heap (12). At this point, the median of the stream of integers till now is 11 ((10+12)/2).

10 is added to the max heap. Max heap: {7, 10}, min heap: {12, 18}.

The next integer from the stream is 11. As usual, add the new integer to the max heap; the size difference remains less than 2 and 11 is less than the root of the min heap (12).
What should be the median now? At this point, the size of the max heap is more than that of the min heap, hence we return the root of the max heap (11).

11 is added to the max heap. Max heap: {7, 10, 11}, min heap: {12, 18}.
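
The walk-through above can be replayed with a few lines of code. The following is a rough sketch, with illustrative class and method names, that feeds 12, 7, 18, 10 and 11 through the two heaps and prints the median after each step.

```java
import java.util.Collections;
import java.util.PriorityQueue;

class WalkthroughTrace {
    // Max heap keeps the lower half, min heap keeps the upper half.
    private final PriorityQueue<Integer> maxHeap =
            new PriorityQueue<>(Collections.reverseOrder());
    private final PriorityQueue<Integer> minHeap = new PriorityQueue<>();

    void add(int value) {
        maxHeap.add(value);
        // Max heap too big, or its root is larger than the min heap's root.
        if (maxHeap.size() - minHeap.size() > 1
                || (!minHeap.isEmpty() && maxHeap.peek() > minHeap.peek())) {
            minHeap.add(maxHeap.poll());
        }
        // The move above can leave the min heap too big.
        if (minHeap.size() - maxHeap.size() > 1) {
            maxHeap.add(minHeap.poll());
        }
    }

    double median() {
        if (maxHeap.isEmpty() && minHeap.isEmpty()) {
            throw new IllegalStateException("no integers received yet");
        }
        if (maxHeap.size() == minHeap.size()) {
            return ((long) maxHeap.peek() + minHeap.peek()) / 2.0;
        }
        return maxHeap.size() > minHeap.size() ? maxHeap.peek() : minHeap.peek();
    }

    public static void main(String[] args) {
        WalkthroughTrace trace = new WalkthroughTrace();
        // Medians printed in order: 12.0, 9.5, 12.0, 11.0, 11.0
        for (int value : new int[]{12, 7, 18, 10, 11}) {
            trace.add(value);
            System.out.println("after " + value + ": median = " + trace.median());
        }
    }
}
```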

Median of a stream of integers: Algorithm

  1. Process integer from the stream and add it to the max heap.
  2. If the root of the max heap is greater than the root of the min heap:
    1. Delete the root from the max heap.
    2. Add the removed integer to the min heap.
  3. If the size difference between the two heaps is more than 1:
    1. Remove the root of the heap which has more elements.
    2. Add the removed node to the other heap.
  4. To calculate the median:
    1. If the sizes of both heaps are equal, return the average of their roots.
    2. Else, return the root of the heap with more elements.

Median of integers stream : Implementation

The implementation uses PriorityQueue in Java, which is a min heap by default; refer to the Stack Overflow question on how to use a priority queue as a max heap.

package com.company;

import java.util.Collections;
import java.util.PriorityQueue;

/**
 * Created by sangar on 18.10.18.
 */
public class MedianOfIntegerStream {
    //Max heap holds the smaller half, min heap holds the larger half.
    private final PriorityQueue<Integer> maxHeap;
    private final PriorityQueue<Integer> minHeap;

    public MedianOfIntegerStream(){
        maxHeap = new PriorityQueue<>(Collections.reverseOrder());
        minHeap = new PriorityQueue<>();
    }

    public double getMedian(){
        if(maxHeap.isEmpty() && minHeap.isEmpty())
            throw new IllegalStateException("No integers processed yet");

        //Sum as long so that two large ints cannot overflow.
        if(maxHeap.size() == minHeap.size())
            return ((long)maxHeap.peek() + minHeap.peek()) / 2.0;

        if(maxHeap.size() > minHeap.size())
            return maxHeap.peek();

        return minHeap.peek();
    }

    public void processInteger(int data){
        maxHeap.add(data);

        //Move the root of max heap if max heap is too big or
        //its root is greater than the root of min heap.
        if(maxHeap.size() - minHeap.size() > 1
                || ( !minHeap.isEmpty()
                && maxHeap.peek() > minHeap.peek())){
            minHeap.add(maxHeap.poll());
        }

        //Rebalance in the other direction if needed.
        if(minHeap.size() - maxHeap.size() > 1){
            maxHeap.add(minHeap.poll());
        }
    }
}

Test cases for median in integers stream

package test;

import com.company.MedianOfIntegerStream;
import org.junit.jupiter.api.Test;

import static org.junit.jupiter.api.Assertions.assertEquals;

/**
 * Created by sangar on 23.9.18.
 */
public class MedianOfIntegerStreamTest {

    MedianOfIntegerStream tester = new MedianOfIntegerStream();

    @Test
    public void baseTest() {

        tester.processInteger(12);
        tester.processInteger(7);

        assertEquals(9.5, tester.getMedian() );
    }

    @Test
    public void maxHeapWithMoreElementsTest() {

        tester.processInteger(12);
        tester.processInteger(7);
        tester.processInteger(9);

        assertEquals(9, tester.getMedian() );
    }

    @Test
    public void minHeapWithMoreElementsTest() {

        tester.processInteger(12);
        tester.processInteger(7);
        tester.processInteger(9);
        tester.processInteger(13);
        tester.processInteger(15);

        assertEquals(12, tester.getMedian() );
    }

    @Test
    public void minHeapSizeMoreThanTwoDifferenceTest() {

        tester.processInteger(12);
        tester.processInteger(7);
        tester.processInteger(9);
        tester.processInteger(13);
        tester.processInteger(15);
        tester.processInteger(17);
        tester.processInteger(19);

        assertEquals(13, tester.getMedian() );
    }

    @Test
    public void maxHeapGetsTheElementTest() {

        tester.processInteger(12);
        tester.processInteger(7);
        tester.processInteger(9);
        tester.processInteger(13);
        tester.processInteger(15);
        tester.processInteger(17);
        tester.processInteger(5);
        assertEquals(12, tester.getMedian() );
    }
}

Processing each integer is O(log n), the cost of inserting an element into a heap or removing its root. Fetching the median of the stream at any given time is O(1). Together, the two heaps store all n integers, so the space used is O(n).


Merge overlapping intervals

Given N intervals S = {E_1, E_2, ..., E_N}, where each interval E_i has a start time s_i and an end time e_i. Some of these intervals overlap. The problem is to merge these overlapping intervals.

For two intervals E_i and E_j with s_i <= s_j, E_i and E_j overlap when the start time of E_j, i.e. s_j, is less than the end time of E_i, i.e. e_i.

For example:

Input:
[(1,3),(2,4),(5,8), (6,9)] 
Output: 
[(1, 4),(5,9)]
Explanation:
Interval (1,3) and (2,4) and interval (5,8) and (6,9) overlap.
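
The overlap condition can be written as a small predicate. The sketch below is an illustration with made-up names; it does not assume which interval starts first, and like the definition above it uses strict comparisons, so intervals that merely touch, such as (1,3) and (3,5), are not reported as overlapping.

```java
class OverlapCheck {
    // Intervals (s1, e1) and (s2, e2) overlap when each one
    // starts before the other one ends.
    static boolean overlaps(int s1, int e1, int s2, int e2) {
        return s1 < e2 && s2 < e1;
    }

    public static void main(String[] args) {
        System.out.println(overlaps(1, 3, 2, 4));  // true
        System.out.println(overlaps(2, 4, 5, 8));  // false
        System.out.println(overlaps(5, 8, 6, 9));  // true
    }
}
```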


Merge overlapping intervals solution

As we always do, first try to come up with a brute force solution: given enough time, space and money, how would you solve this?
The natural course is to take the ith interval and compare the start time of every jth interval with the end time of the ith; if the start time of the jth interval is less than the end time of the ith interval, then you can merge the two intervals. What should be the end time of the merged interval then? It should be the maximum of the end times of the two merged intervals.

What will be the time complexity of this approach? We are not using any additional space; however, the worst-case time complexity is O(n^2). Can we do better?

What are the two times we are comparing in the brute force solution? It’s the start time of one interval with the end time of another. If we arrange the input in a specific order, can we avoid processing some entries?

If we sort all intervals based on their start time, then s_i <= s_{i+1} <= s_{i+2}. Also, an interval is always forward looking: e_i > s_i, e_{i+1} > s_{i+1} and so on.

If s_i is greater than e_{i-1}, then s_{i+1} will also be greater than e_{i-1}, so there is no need to compare s_{i+1} with e_{i-1}; that is, for any interval E_i there is no need to look beyond the immediately previous (possibly merged) interval.

If s_i is less than e_{i-1}, update e_{i-1} with the maximum of e_{i-1} and e_i and move to E_{i+1}.

Notice that we need the last interval E_{i-1} to decide whether to merge the new interval into the previous one or keep it standalone. A stack fits this well, since only its top element is ever needed. The algorithm will look like:

  1. Consider interval E_i.
  2. If the stack is empty, push E_i on to the stack.
  3. If the stack is not empty, pop the interval at the top of the stack and call it E_{i-1}.
  4. Compare s_i, the start time of E_i, with e_{i-1}, the end time of E_{i-1}.
  5. If s_i is less than e_{i-1}, update e_{i-1} to max(e_{i-1}, e_i), the maximum of the end times of the two intervals, and push E_{i-1} back on to the stack.
  6. Else push E_{i-1} back on to the stack, followed by E_i.
  7. Continue till all intervals are considered.
  8. At the end of processing, the stack will contain all merged intervals.

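Let’s take an example and see how this algorithm works, using the input from the example above: (1,3), (2,4), (5,8), (6,9). The stack is empty at the start, so (1,3) is pushed. The next interval (2,4) starts before the end time of (1,3), so the two overlap.

Find the maximum of the end times of the two intervals, update the previous interval with that end time and push it back on to the stack. The top of the stack is now (1,4). Interval (5,8) does not overlap with (1,4), so it is pushed as it is; (6,9) overlaps with (5,8), so the top of the stack becomes (5,9).

At this point, when there is no more interval remaining, the stack contains all merged overlapping intervals: (1,4) and (5,9).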

Merge intervals Java implementation

package com.company;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Stack;

/**
 * Created by sangar on 8.4.18.
 */
public class OverlappingIntervals {
    public  static ArrayList<Interval>
        mergeOverlappingIntervals(ArrayList<Interval> intervals){

        ArrayList<Interval> mergedIntervals = new ArrayList<>();
        Stack<Interval> s = new Stack<>();

        //Sort the ArrayList of interval based on start time.
        Collections.sort(intervals, Comparator.comparing(p -> p.getStartTime()));
        for(Interval currentInterval : intervals){
            if(s.empty())s.push(currentInterval);
            else {
                Interval previousInterval = s.pop();
                if(previousInterval.getEndTime() > 
                     currentInterval.getStartTime()){
                    /*
                    If current interval's start time is less than end time of
                    previous interval, find max of end times of two intervals
                    and push new interval on to stack.
                     */
                    int endTime = Integer.max(previousInterval.getEndTime(),
                                              currentInterval.getEndTime());
                    /* Notice that we create a new interval instead of
                       updating the old one, which lets Interval objects
                       stay immutable.
                     */
                    s.push(new Interval(previousInterval.getStartTime(),
                                        endTime));
                }
                else{
                    s.push(previousInterval);
                    s.push(currentInterval);
                }
            }
        }
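        while(!s.empty()){
            mergedIntervals.add(s.pop());
        }
        //Popping yields intervals in reverse order, restore start time order.
        Collections.reverse(mergedIntervals);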

        return mergedIntervals;
    }

    public static void main(String[] args) {
        ArrayList<Interval> intervals = new ArrayList<>();

        intervals.add(new Interval(1,3));
        intervals.add(new Interval(2,4));
        intervals.add(new Interval(5,8));
        intervals.add(new Interval(6,9));
        ArrayList<Interval> mergedIntervals 
                    = mergeOverlappingIntervals(intervals);
        for (Interval interval : mergedIntervals){
            System.out.print("(" + interval.getStartTime() +"," 
                      + interval.getEndTime() + ")");
        }
    }
}
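
The listing above relies on an Interval class that is not shown. A minimal sketch of what it could look like is given below; the field names and the validation in the constructor are assumptions, while the constructor arguments and the getStartTime, getEndTime and setEndTime methods are the ones the code in this article calls.

```java
class Interval {
    private final int startTime;
    private int endTime;

    Interval(int startTime, int endTime) {
        // Assumption: an interval never ends before it starts.
        if (endTime < startTime) {
            throw new IllegalArgumentException("end time is before start time");
        }
        this.startTime = startTime;
        this.endTime = endTime;
    }

    int getStartTime() {
        return startTime;
    }

    int getEndTime() {
        return endTime;
    }

    // Only needed by the variant that extends the last merged interval in place.
    void setEndTime(int endTime) {
        this.endTime = endTime;
    }
}
```

If only the stack based version is used, the setter can be dropped and endTime made final, which keeps the class immutable.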


The complexity of the algorithm to merge overlapping intervals is O(n log n) due to sorting, with O(n) extra space for the stack; copying into the list to return also takes O(n) space.

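There is another way to implement the same function without using the stack. Here we use the fact that ArrayList in Java is backed by an array, so getting an element at a particular index is an O(1) operation. The code looks more or less the same; however, there is no traversal of the stack at the end to create the list to return. Note that this version starts a new interval only when the last end time is strictly less than the next start time, so it also merges intervals that merely touch, such as (1,3) and (3,5), and it updates the end time of an interval in place instead of creating a new one.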


public List<Interval> mergeOptimized(List<Interval> intervals) {

        if(intervals.size() == 0) return intervals;

        Collections.sort(intervals,
           (Interval a, Interval b) ->
               Integer.compare(a.getStartTime(), b.getStartTime()));

        List<Interval> mergedIntervals = new ArrayList<Interval>();
        for(Interval interval : intervals){

            /*If the merged list is empty, add the interval to it;
              otherwise check whether the last interval in the
              merged list overlaps the current one.

              Remember the get function on ArrayList is an O(1) operation
              because ArrayLists in Java are backed by arrays. */
            if(mergedIntervals.isEmpty()
                    || mergedIntervals.get(
                           mergedIntervals.size()-1).getEndTime() < 
                           interval.getStartTime() ){
                mergedIntervals.add(interval);
            }
            else {
                int lastEndTime = Math.max(
                        mergedIntervals.get(mergedIntervals.size()-1)
                                            .getEndTime(),
                        interval.getEndTime()
                );
                mergedIntervals.get(mergedIntervals.size()-1)
                                     .setEndTime(lastEndTime);
            }
        }

        return mergedIntervals;
    }

You can adapt the above snippet to submit for the corresponding LeetCode problem and it should be accepted.
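
When intervals arrive as plain arrays instead of Interval objects, the same idea carries over. The following is a sketch under the assumption that every interval is a two-element int array {start, end}; the class and method names are hypothetical. Like mergeOptimized above, it merges intervals that merely touch, and it works on copies so the caller's input is left unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MergeIntervalArrays {
    static int[][] merge(int[][] intervals) {
        // Copy every pair so the caller's arrays are not modified.
        int[][] sorted = new int[intervals.length][];
        for (int i = 0; i < intervals.length; i++) {
            sorted[i] = intervals[i].clone();
        }
        // Sort by start time.
        Arrays.sort(sorted, (a, b) -> Integer.compare(a[0], b[0]));

        List<int[]> merged = new ArrayList<>();
        for (int[] current : sorted) {
            int[] last = merged.isEmpty() ? null : merged.get(merged.size() - 1);
            if (last == null || last[1] < current[0]) {
                merged.add(current);                      // no overlap, new interval
            } else {
                last[1] = Math.max(last[1], current[1]);  // extend the last interval
            }
        }
        return merged.toArray(new int[0][]);
    }

    public static void main(String[] args) {
        int[][] result = merge(new int[][]{{1, 3}, {2, 4}, {5, 8}, {6, 9}});
        System.out.println(Arrays.deepToString(result));  // [[1, 4], [5, 9]]
    }
}
```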
