Minimizing maximum lateness

Minimizing maximum lateness : Greedy algorithm

Since we have chosen the greedy approach, let’s continue with it for at least one more post. Today’s problem is to minimize the maximum lateness of a task. To clarify the problem: we are given a processor that runs one process at a time and, as always, a list of processes to schedule on it, with the goal that the lateness of the most-late process is as small as possible. Unlike the previous problems, we are not given start and end times this time; instead, for each process i we are given the length of time ti it will run and the deadline di it has to meet. fi is the actual finish time of the process.

Lateness of a process is defined as
li = max{0, fi − di}, i.e. the length of time past its deadline that it finishes.
The goal is to schedule all tasks so that the maximum lateness L = max li is minimized. For example:

[Figure: minimize maximum lateness]

Minimizing maximum lateness : algorithm

Let’s decide our optimization strategy. There are a few natural orders in which the jobs can be picked: shortest job first, earliest deadline first, and least slack time first.

Let’s see if any of the above strategies gives an optimal solution. For shortest processing time first, consider the example P1 = (1, 100), P2 = (10, 10), where each pair is (processing time, deadline). If we schedule the shortest job first, in the order (P1, P2), P2 finishes at time 11 and the maximum lateness is 1; if we take them as (P2, P1), the lateness is 0. So, clearly, taking the shortest process first does not give us an optimal solution.

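Now check the smallest slack time approach, where the slack of a process is di − ti. See if you can come up with a counterexample showing that it does not work. One such pair is P1 = (1, 2), P2 = (10, 10): P2 has the smaller slack, but scheduling it first makes P1 finish at time 11, nine units late, while the order (P1, P2) has a maximum lateness of 1.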

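That leaves us with only one option: take the process with the most pressing deadline, that is, the one with the smallest deadline that has not been scheduled yet. If you noticed, the example given in the problem statement is solved using this method. That alone is not a proof, but the rule can be shown to be optimal with an exchange argument: swapping two adjacent jobs that are out of deadline order never increases the maximum lateness.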
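To make the comparison of the three orderings concrete, here is a minimal sketch that computes the maximum lateness of a given order. The helper names (max_lateness, shortest_first, least_slack_first, earliest_deadline_first) and the second job list are assumptions made for illustration; jobs are (processing time, deadline) pairs.

```python
def max_lateness(order):
    """Run jobs back to back in the given order; each job is (length, deadline)."""
    t = 0
    worst = 0  # starting at 0 implements max{0, f - d}
    for length, deadline in order:
        t += length                       # finish time of this job
        worst = max(worst, t - deadline)  # lateness of this job, if positive
    return worst

def shortest_first(jobs):
    return sorted(jobs, key=lambda j: j[0])

def least_slack_first(jobs):
    return sorted(jobs, key=lambda j: j[1] - j[0])

def earliest_deadline_first(jobs):
    return sorted(jobs, key=lambda j: j[1])

jobs_a = [(1, 100), (10, 10)]  # the example from the text
jobs_b = [(1, 2), (10, 10)]    # assumed counterexample for least slack first

print(max_lateness(shortest_first(jobs_a)))           # 1
print(max_lateness(earliest_deadline_first(jobs_a)))  # 0
print(max_lateness(least_slack_first(jobs_b)))        # 9
print(max_lateness(earliest_deadline_first(jobs_b)))  # 1
```

On both job lists, earliest deadline first is at least as good as the other two orderings.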

  1. Sort all jobs in ascending order of deadlines
  2. Start with time t = 0
  3. For each job in the list
    1. Schedule the job at time t
    2. Finish time = t + processing time of job
    3. t = finish time
  4. Return (start time, finish time) for each job

Minimizing maximum lateness : implementation

from operator import itemgetter

# Each job is (id, processing time, deadline)
jobs = [(1, 3, 6), (2, 2, 9), (3, 1, 8), (4, 4, 9),
        (5, 3, 14), (6, 2, 15)]

def get_minimum_lateness():
    schedule = []
    max_lateness = 0
    t = 0

    # Earliest deadline first
    sorted_jobs = sorted(jobs, key=itemgetter(2))

    for job in sorted_jobs:
        job_start_time = t
        job_finish_time = t + job[1]

        t = job_finish_time
        # Lateness is max(0, finish time - deadline)
        max_lateness = max(max_lateness, job_finish_time - job[2])
        schedule.append((job_start_time, job_finish_time))

    return max_lateness, schedule

max_lateness, sc = get_minimum_lateness()
print("Maximum lateness will be: " + str(max_lateness))
for start, finish in sc:
    print(start, finish)

The complexity of the implementation is dominated by the sort, which is O(n log n); the rest of the processing takes O(n).


Coin change problem : Greedy algorithm


Today, we will look at a very common problem that can be solved using a greedy algorithm. If you are not very familiar with greedy algorithms, here is the gist: at every step of the algorithm, you take the best available option and hope that everything turns out optimal at the end. That hope holds for many problems, but not for all, so it has to be checked for each one. The problem at hand is the coin change problem: given coins of denominations 1, 5, 10, 25 and 100, find a way to give a customer an amount using the fewest coins. For example, if I ask you to return change for 30, there are several ways to do so:

 
Amount: 30
Solutions : 3 X 10  ( 3 coins ) 
            6 X 5   ( 6 coins ) 
            1 X 25 + 5 X 1 ( 6 coins )
            1 X 25 + 1 X 5 ( 2 coins )

The last solution is the optimal one, as it makes the change with only 2 coins, whereas all other solutions need more than two coins.

The solution to the coin change problem using a greedy algorithm is very intuitive and is called the cashier’s algorithm. The basic principle is: at every iteration, take the largest coin that fits into the remaining amount we need change for. For the denominations above, this ends with an optimal solution.

Coin change problem : Algorithm

1. Sort the n coin denominations in increasing order of value.
2. Initialize the set of coins as empty: S = {}
3. While amount is not zero:
3.1 Ck is the largest coin such that amount >= Ck
3.1.1 If there is no such coin, return “no viable solution”
3.1.2 Else include the coin in the solution S.
3.1.3 Decrease the remaining amount: amount = amount – Ck

Coin change problem : implementation

#include <stdio.h>
 
int coins[] = { 1,5,10,25,100 };
 
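/* Returns the index of the largest coin that is <= amount,
   or -1 if even the smallest coin is larger than amount.
   Assumes coins[] is sorted in increasing order. */
int findMaxCoin(int amount, int size){
	for(int i=0; i<size; i++){
		if(amount < coins[i]) return i-1;
	}
	/* amount is at least as large as the largest coin */
	return size-1;
}

/* Fills change[] with the coins used and returns their count,
   or -1 if the amount cannot be paid with the given coins. */
int findMinimumCoinsForAmount(int amount, int change[]){

	int numOfCoins = sizeof(coins)/sizeof(coins[0]);
	int count = 0;
	while(amount){
		int k = findMaxCoin(amount, numOfCoins);
		if(k == -1){
			printf("No viable solution\n");
			return -1; /* stop instead of looping forever */
		}
		amount -= coins[k];
		change[count++] = coins[k];
	}
	return count;
}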
 
int main(void) {
	int change[10]; // This needs to be dynamic
	int amount = 34;
	int count = findMinimumCoinsForAmount(amount, change);
 
	printf("\n Number of coins for change of %d : %d", amount, count);
	printf("\n Coins : ");
	for(int i=0; i<count; i++){
		printf("%d ", change[i]);
	}
	return 0;
}

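What is the time complexity of the implementation? If the denominations are not already sorted, sorting the n coins takes O(n log n) (the array in the code above is already in increasing order). The while loop runs O(amount) times in the worst case, when the only coin available is the 1-denomination, and each iteration scans the denominations in O(n). The overall complexity as implemented is therefore O(n log n) + O(n × amount); if n is treated as a small constant, this is O(amount).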

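Will this algorithm work for all sorts of denominations? The answer is no. It may fail to find any solution if there is no coin with denomination 1. Even with a 1-coin, it is not always optimal: with denominations 1, 3 and 4 and amount 6, greedy picks 4 + 1 + 1 (three coins) while 3 + 3 needs only two. So be careful while applying this algorithm.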
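The limits of the greedy choice are easy to see in a few lines. The following is a minimal sketch of the cashier's algorithm, not a translation of the C code above: the function name greedy_change is an assumption, and divmod is used to take as many copies of the largest fitting coin as possible in one step, which is equivalent to taking them one at a time.

```python
def greedy_change(amount, coins):
    """Cashier's algorithm: keep taking the largest coin that still fits.
    Returns the list of coins used, or None if greedy cannot reach the amount."""
    change = []
    for coin in sorted(coins, reverse=True):
        count, amount = divmod(amount, coin)  # how many of this coin fit
        change.extend([coin] * count)
    return change if amount == 0 else None

print(greedy_change(30, [1, 5, 10, 25, 100]))  # [25, 5]
print(greedy_change(6, [1, 3, 4]))             # [4, 1, 1], although 3 + 3 uses two coins
print(greedy_change(6, [3, 4]))                # None, although 3 + 3 is a valid answer
```

The last two calls show the two failure modes: a suboptimal answer, and no answer at all even though one exists.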


Interval partitioning problem


In continuation of greedy algorithm problems (earlier we discussed the event scheduling and coin change problems), we will discuss another problem today. It is known as the interval partitioning problem and goes like this: there are n lectures to be scheduled and a certain number of classrooms. Each lecture has a start time si and a finish time fi. The task is to schedule all lectures in the minimum number of classrooms, with no more than one lecture in a classroom at any point in time. For example, one way to schedule these nine lectures uses 4 classrooms, as shown below.

[Figure: interval partition]

However, with some rearrangement we can schedule the same nine lectures in three classrooms, as shown below.

So the second solution is the better one.

Another variant of this problem is :  You want to schedule jobs on a computer. Requests take the form (si , fi) meaning a job that runs from time si to time fi. You get many such requests, and you want to process as many as possible, but the computer can only work on one job at a time.

Interval partitioning : Line of thought

The first thing to note about the interval partitioning problem is that we have to minimize something, in this case the number of classrooms. What template does this problem fit into? Greedy, maybe? Yes, it fits the greedy algorithm template. In a greedy algorithm, we take each decision based on the local optimum.

Before discussing the solution, be clear about what the resource is and what needs to be minimized. In this problem, the resource is a classroom, and the total number of classrooms needs to be minimized by arranging lectures in a certain order.

There are a few natural orders in which we can arrange all lectures or, for the sake of generality, tasks. The first is to arrange them in order of finish time, the second in order of start time, the third by smallest duration, and the fourth by minimum number of conflicting jobs. Which one to choose?
You can come up with counterexamples showing that arranging lectures by end time, by smallest duration or by minimum number of conflicting jobs does not lead to an optimal solution. So, let’s pick lectures based on earliest start time. At any given point, pick the not-yet-scheduled lecture with the least start time and assign it to the first available classroom. Will it work? Yes, it does. A new classroom is opened only when every existing classroom is busy at the lecture's start time, which means that many lectures really do overlap at that moment. So when all lectures have been assigned, the number of classrooms used is the minimum number required.

Interval partitioning algorithm

1. Sort all lectures based on start time in ascending order.
2. Number of initial classrooms = 0
3. While there are lectures to be scheduled:
   3.1 Take the first lecture not yet scheduled.
   3.2 If there is already a classroom available at the lecture's start time,
       assign the lecture to that classroom.
   3.3 If not, allocate a new classroom:
       number of classrooms = number of classrooms + 1
4. Return number of classrooms.

Before jumping into the code, let’s discuss some data structures which we can use to implement this algorithm.

Understand that we have to find a compatible classroom for a lecture. There are many classrooms, and for each we need to check whether the finish time of its last lecture is less than or equal to the start time of the new lecture. If yes, the classroom is compatible; if there is no such classroom, allocate a new one. If we store the allocated classrooms so that we can always get the classroom whose last lecture finishes earliest, we can safely say that if this classroom is not compatible, none of the others will be, because every other classroom finishes even later. One way is to sort the list of classrooms every time we assign a lecture, so that the first classroom has the least finish time. A sort has complexity O(n log n), and if we do it for all n lectures, the overall complexity of the algorithm will be O(n^2 log n).

We are sorting just to find minimum end time across all classrooms. This can easily be achieved by min heap or priority queue keyed on finish time of last lecture of class. Every time finish time of last lecture changes for a classroom, heap is readjusted and root gives us classroom with min finish time.

  • To determine whether lecture j is compatible with some classroom, compare sj to key of min classroom k in priority queue.
  • When a lecture is added to a classroom,  increase key of classroom k to fj.

Well, now we have the algorithm and the data structure to implement it with, so let’s code it.

The PriorityQueue implementation is given below:

import heapq

# This is our priority queue implementation
class PriorityQueue:
    def __init__(self):
        self._queue = []
        self._index = 0

    def push(self, item, priority):
        # _index breaks ties so that items themselves are never compared
        heapq.heappush(self._queue, (priority, self._index, item))
        self._index += 1

    def pop(self):
        # Return None when the queue is empty
        if not self._queue:
            return None
        return heapq.heappop(self._queue)[-1]

Classroom class implementation

class Classroom:
	def __init__(self, number, finish_time):
		self.class_num = number
		self.finish_time = finish_time
	def __repr__(self):
		return 'Classroom({!r})'.format(self.class_num)

Interval partitioning problem : Implementation

from PriorityQueue import PriorityQueue
from Classroom import Classroom

# Each job is (lecture id, start time, finish time)
jobs = [(1, 930, 1100),
        (2, 930, 1300),
        (3, 930, 1100),
        (5, 1100, 1400),
        (4, 1130, 1300),
        (6, 1330, 1500),
        (7, 1330, 1500),
        (8, 1430, 1700),
        (9, 1530, 1700),
        (10, 1530, 1700)
]

def find_num_classrooms():
    num_classrooms = 0
    priority_queue = PriorityQueue()

    # Process lectures in order of start time
    for job in sorted(jobs, key=lambda j: j[1]):
        # Pop the classroom with the least finish time
        classroom = priority_queue.pop()
        if classroom is None:
            # No classroom yet: allocate a new one
            num_classrooms += 1
            priority_queue.push(Classroom(num_classrooms, job[2]), job[2])
        elif classroom.finish_time <= job[1]:
            # The classroom is free by the time this lecture starts
            classroom.finish_time = job[2]
            priority_queue.push(classroom, job[2])
        else:
            num_classrooms += 1
            # Push the popped classroom back, keyed on its own finish time
            priority_queue.push(classroom, classroom.finish_time)
            # Push the new classroom
            priority_queue.push(Classroom(num_classrooms, job[2]), job[2])

    return num_classrooms

print("Number of classrooms required: " + str(find_num_classrooms()))

Java Implementation

package com.company;

import java.util.*;

/**
 * Created by sangar on 24.4.18.
 */
public class IntervalPartition {

    public static int findIntervalPartitions(ArrayList<Interval> intervals){
        PriorityQueue<Interval> queue =
                new PriorityQueue<Interval>(intervals.size(), Comparator.comparing(p -> p.getEndTime()));

        for(Interval currentInterval : intervals) {
            if (queue.isEmpty()) queue.add(currentInterval);
            else {
                if (queue.peek().getEndTime() > currentInterval.getStartTime()) {
                    queue.add(currentInterval);
                } else {
                    queue.remove();
                    queue.add(currentInterval);
                }
            }
        }
        return queue.size();
    }

    public static void main(String args[] ) throws Exception {
        ArrayList<Interval> intervals = new ArrayList<>();

        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(930,1300));
        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(1130,1300));
        intervals.add(new Interval(1100,1400));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1430,1700));
        intervals.add(new Interval(1530,1700));

        Collections.sort(intervals, Comparator.comparing(p -> p.getStartTime()));

        int minimumClassRooms = findIntervalPartitions(intervals);
        System.out.println(minimumClassRooms);
    }
}

This algorithm takes O(n log n) time overall, dominated by sorting the jobs on start time. The number of priority queue operations is O(n), since we have only n lectures to schedule and each lecture needs a constant number of push and pop operations, each costing O(log n).


There is another method, based on binary search, which can be used to solve this problem. As per the problem statement, we have to find the minimum number of classrooms to schedule n lectures. What is the maximum number of classrooms required? It is the number of lectures, when all lectures conflict with each other.
The minimum number of classrooms is 0, when there is no lecture to be scheduled. Now we know the range of values for the number of classrooms. How can we find the minimum?

The basic idea is that if we can schedule all n lectures in m rooms, then we can definitely schedule them in m+1 or more rooms. So the minimum number of rooms required is m or less, and we can safely discard all candidate solutions from m+1 to n (remember, n is the maximum number of classrooms).
And if we cannot schedule the lectures in m rooms, there is no way we can schedule them in fewer than m rooms. Hence we can discard all candidate solutions up to and including m.

How do we select m? We select it as the middle of the range, which is initially (0, n), and try to fit all lectures into m rooms under the condition that no two lectures in a room conflict. Keep track of the end time of the last lecture in each classroom. If no classroom has an end time less than or equal to the start time of the new lecture, allocate a new classroom. If the total number of classrooms is less than or equal to m, discard m+1 to n. If it is more than m, discard 0 to m and search m+1 to n.

package com.company;

import java.util.*;

/**
 * Created by sangar on 24.4.18.
 */
public class IntervalPartition {

    public static boolean predicate(ArrayList<Interval> intervals, long candidateClassRooms){

        int i = 0;

        PriorityQueue<Interval> queue =
                new PriorityQueue<Interval>(intervals.size(), Comparator.comparing(p -> p.getEndTime()));

        for(Interval currentInterval : intervals){
            if(queue.isEmpty()) queue.add(currentInterval);
            else{
                if(queue.peek().getEndTime() > currentInterval.getStartTime()){
                    queue.add(currentInterval);
                }
                else{
                    queue.remove();
                    queue.add(currentInterval);
                }
            }
        }

        return queue.size() <= candidateClassRooms;
    }

    public static void main(String args[] ) throws Exception {
        ArrayList<Interval> intervals = new ArrayList<>();

        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(930,1300));
        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(1130,1300));
        intervals.add(new Interval(1100,1400));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1430,1700));
        intervals.add(new Interval(1530,1700));

        long low = 0;
        long high = intervals.size();

        Collections.sort(intervals, Comparator.comparing(p -> p.getStartTime()));

        while(low < high){
            long mid  = low + ( (high - low) >> 1);

            if(predicate(intervals, mid)){
                high = mid;
            }else{
                low = mid+1;
            }
        }
        System.out.println(low);
    }
}

Each call to predicate takes O(n log n) time, and the binary search calls it O(log n) times, so this method takes O(n log^2 n) overall, on top of the O(n log n) initial sort. The heap holds at most c intervals at a time, where c is the number of classrooms required.
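For comparison, here is a hedged Python sketch of the same binary search in which the feasibility check gives up as soon as it would need more than m rooms. The names fits_in and min_rooms_by_search are assumptions for illustration, and lectures are plain (start, finish) pairs rather than Interval objects.

```python
import heapq

def fits_in(lectures, rooms):
    """True if lectures (already sorted by start) fit in at most `rooms` rooms."""
    finish_times = []  # min-heap of the last finish time per room in use
    for start, finish in lectures:
        if finish_times and finish_times[0] <= start:
            heapq.heapreplace(finish_times, finish)  # reuse the earliest-free room
        elif len(finish_times) < rooms:
            heapq.heappush(finish_times, finish)     # open another room
        else:
            return False                             # would need more than `rooms`
    return True

def min_rooms_by_search(lectures):
    lectures = sorted(lectures)
    low, high = 0, len(lectures)  # the answer always lies in [0, n]
    while low < high:
        mid = (low + high) // 2
        if fits_in(lectures, mid):
            high = mid       # mid rooms are enough, try fewer
        else:
            low = mid + 1    # mid rooms are not enough
    return low

lectures = [(930, 1100), (930, 1300), (930, 1100), (1130, 1300), (1100, 1400),
            (1330, 1500), (1330, 1500), (1430, 1700), (1530, 1700)]
print(min_rooms_by_search(lectures))  # 3
```

The search is valid because fits_in is monotone: once it returns True for some m, it returns True for every larger m.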


Dijkstra’s Algorithm to find shortest path


Given a graph, directed or undirected, with non-negative edge weights, and two nodes, find the shortest path between these two nodes.
This is a standard problem, so we do not need to invent the approach. We will use the adjacency list representation of the graph. The algorithm is widely published and goes as below.

  1. Initialize the distance of all nodes from the start node as INFINITE and mark all nodes as not finalized.
  2. Take the source node to start with; call it u. The distance from the source node to itself is zero.
  3. Mark u as considered and its distance as finalized.
  4. For every neighbor v of u, update the distance of v if its current distance is more than the distance of u + weight of (u,v).
    For example, if the current node A is marked with a distance of 6, and the edge connecting it with a neighbor B has length 2, then the distance to B (through A) will be 6 + 2 = 8. If B was previously marked with a distance greater than 8, change it to 8. Otherwise, keep the current value. (From Wikipedia)
  5. Now select the node whose distance is not yet finalized and is the minimum so far, and make it u. Go to step 3.

Let’s work out an example and see how it works. Input graph is


Execution of above algorithm on this graph will be

To figure out the path which was followed to reach destination from source, we can have an array to keep track of the parent node whenever distance to the node is updated. By reverse tracking parents from destination to source, we can figure out the path.

Dijkstra’s algorithm to find shortest path : Implementation

#include<stdio.h>
#include<stdlib.h>
#include<string.h>

#define NUM_NODE 7
#define true 1
#define false 0

/* Must be larger than any real path length in the graph */
#define INFINITE 1000

typedef struct node{
        int value;
        int wt;
        struct node *next;
}Node;

Node *graph[NUM_NODE + 1];

int find_minimum_node(int visited[], int dist[]){
        int min = INFINITE;
        int index = -1;
        int i;
        for(i=1; i<= NUM_NODE; i++){
                if(visited[i] == false && min>dist[i]){
                        min = dist[i];
                        index = i;
                }
        }
        return index;
}

void dijstras(Node * graph[], int start, int end ){
    int i;
    int parent[NUM_NODE +1];
    int distance[NUM_NODE+1];
    int visited[NUM_NODE+1];

    for(i=1; i<=NUM_NODE; i++){
        visited[i] = false;
        distance[i] = INFINITE;
        parent[i] = -1;
    }
   // Mark distance of start as 0.
    distance[start] =0;
    for(i=1; i<=NUM_NODE; i++){
        int index  = find_minimum_node(visited, distance);
        if(index != -1){
            Node * node = graph[index];
            Node * current  = node->next;

            while(current){
               /* If the neighbour node is not visited and its current distance is
                more than the distance of the current node + cost of the edge between
                the current node and this node, update the distance */
                if(visited[current->value] == false && distance[current->value] >
                    distance[node->value] + current->wt){
                
                    distance[current->value] = distance[node->value] + current->wt;
                    parent[current->value] = node->value;
                }
                current = current->next;
            }
            visited[node->value] = true;
            if(node->value == end)
                break;
        }
        else{
            break;
        }
    }


    printf("\nDistance between %d and %d : %d", start , end, distance[end]);
    
    // Printing the path in reverse order; using a stack, we can print it in normal order too.
    printf("\nPath is  (In reverse order): ");
    int cur_parent =0;
    while(cur_parent != -1){
        printf("%d ", end );
        cur_parent = parent[end];
        end = cur_parent;
    }
    printf("\n");
}

Node *createNode(int j, int wt){

	Node * new_node = (Node *)malloc(sizeof(Node));
	if(new_node){
		new_node->value = j;
		new_node->next = NULL;
		new_node->wt = wt;
	}
	else{
		printf("\n Node cannot be allocated");
	}
	return new_node;
}

void addEdge(int i, int j, int wt){

	Node * temp = graph[i];
	if(temp == NULL){
		graph[i] = createNode(j, wt);
	}
	else{
		while(temp->next){
			temp = temp->next;
		}
		temp->next = createNode(j, wt);
	}
}

//driver program
int main(){

    int i,j;
    for(i=1; i<=NUM_NODE; i++){
        graph[i] = createNode(i,0);
    }
    // creating graph with weighted edges.
    addEdge(1,2,4);
    addEdge(1,3,8);
    addEdge(2,3,9);
    addEdge(2,4,9);
    addEdge(3,4,2);
    addEdge(2,5,10);
    addEdge(3,6,1);
    addEdge(4,5,7);
    addEdge(4,6,9);
    addEdge(5,6,6);
    addEdge(6,7,2);
    addEdge(7,5,5);
    
    dijstras(graph, 1, 6);
    
    return 0;
}

The complexity of the above code is O(V^2), where V is the number of vertices in the graph. This can be reduced to O((V + E) log V), often written O(E log V) for connected graphs, by using a heap, which cuts the cost of finding the node with minimum distance from O(V) to O(log V).
Limitation of the algorithm:
  1. It does not work with negative edge weights.
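As an illustration of the heap-based variant mentioned above, here is a minimal Python sketch using heapq. It is not a port of the C code: the function name dijkstra and the dict-based adjacency list are assumptions, and instead of decreasing keys it pushes a fresh entry on every update and skips stale entries when they are popped.

```python
import heapq

def dijkstra(graph, start, end):
    """graph: dict node -> list of (neighbour, weight), all weights >= 0.
    Returns (distance, path), or (None, []) if end is unreachable."""
    dist = {start: 0}
    parent = {start: None}
    done = set()            # nodes whose distance is finalized
    heap = [(0, start)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue        # stale entry left over from an earlier update
        done.add(u)
        if u == end:
            break
        for v, w in graph.get(u, []):
            if v not in done and d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                parent[v] = u  # remember how we reached v
                heapq.heappush(heap, (d + w, v))
    if end not in done:
        return None, []
    path, node = [], end
    while node is not None:    # walk parents back to the start
        path.append(node)
        node = parent[node]
    return dist[end], path[::-1]

# Same directed edges as the addEdge calls in main() above.
edges = [(1, 2, 4), (1, 3, 8), (2, 3, 9), (2, 4, 9), (3, 4, 2), (2, 5, 10),
         (3, 6, 1), (4, 5, 7), (4, 6, 9), (5, 6, 6), (6, 7, 2), (7, 5, 5)]
graph = {}
for u, v, w in edges:
    graph.setdefault(u, []).append((v, w))

print(dijkstra(graph, 1, 6))  # (9, [1, 3, 6])
```

The parent dictionary plays the role of the parent array in the C version, so the path comes out in start-to-end order after one reversal.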


Merge overlapping intervals


Given N intervals S = {E_1, E_2, …, E_n}, where each E_i has a start time s_i and an end time e_i. Some of these intervals may overlap. To clarify, with the intervals ordered by start time, E_i and E_j overlap when the start time of E_j, s_j, is less than the end time of E_i, e_i. The task is to merge all overlapping intervals. For example, [(1,3),(2,4),(5,8),(6,9)] should transform into [(1,4),(5,9)], as intervals (1,3) and (2,4) overlap and intervals (5,8) and (6,9) also overlap.

[Figure: merge overlapping intervals]

Merge overlapping intervals  : Thought process

As we always do, first try to come up with a brute-force solution: given enough time, space and money, how would you solve this?
The natural course is to take the ith interval and compare the start time of every other interval j with the end time of the ith; if the start time of the jth interval is less than the end time of the ith, the two intervals can be merged. What should the end time of the merged interval be? It should be the maximum of the end times of the two merged intervals.

What is the time complexity of this approach? We are not using any additional space; however, the worst-case time complexity is O(n^2). Can we do better?

Which two times are we comparing in the brute-force solution? The start time of one interval and the end time of another. If we arrange the input in a specific order, can we avoid processing some entries?

If we sort all intervals by start time, then s_i ≤ s_{i+1} ≤ s_{i+2}. Also, an interval always looks forward: e_i > s_i, e_{i+1} > s_{i+1}, and so on.

If s_i is greater than e_{i-1}, then s_{i+1} will also be greater than e_{i-1}, so there is no need to compare s_{i+1} with e_{i-1}; that is, for any interval E_i there is no need to look beyond the immediately previous interval. If s_i is less than e_{i-1}, update e_{i-1} to the maximum of e_{i-1} and e_i and move on to E_{i+1}.
Notice that we need the last interval E_{i-1} to decide whether to merge the new interval into it or keep the new one standalone. A stack is a natural data structure for this. The algorithm will look like:

  1. Consider interval E_i.
  2. If the stack is empty, push E_i on to the stack.
  3. If the stack is not empty, pop the interval at the top of the stack and call it E_{i-1}.
  4. Compare s_i, the start time of E_i, with e_{i-1}, the end time of E_{i-1}.
  5. If s_i is less than e_{i-1}, update e_{i-1} to max(e_{i-1}, e_i), the maximum of the end times of the two intervals, and push E_{i-1} back on to the stack.
  6. Else push E_{i-1} back on to the stack, followed by E_i.
  7. Continue till all intervals are considered.
  8. At the end of processing, the stack will contain all merged intervals.

Let’s take an example and see how this algorithm works. We have following intervals and we have to merge overlapping intervals.

First of all, sort all interval based on their start time.

Create a stack and start with the first interval. Since the stack is empty, push the first interval on to the stack.

After pushing the first interval, the problem state looks like this:

Take the second interval. Its start time (2) is less than the end time of the interval on top of the stack (3), so find the maximum of the end times of these two intervals, update the top interval with that end time, and push it back on to the stack.

 

Look at the third interval, the start time of it is greater than the end time of interval on top of the stack, just push interval on to the stack.

Last interval, this time, the start time of the new interval is less than the end time of interval on top of the stack.

Find the maximum of end times of two intervals and update the previous interval with that end time and push it back on to stack.

[Figure: merge overlapping intervals]

At this point, when there is no more interval remaining, stack contains all merged overlapping intervals.
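The walkthrough above can be replayed in a few lines. This is a minimal sketch with an assumed function name (merge_intervals), using a Python list as the stack and (start, end) tuples as intervals; it uses the same strict "start less than end" overlap test as the steps above.

```python
def merge_intervals(intervals):
    """intervals: list of (start, end). Returns merged intervals sorted by start."""
    stack = []
    for start, end in sorted(intervals):
        if stack and start < stack[-1][1]:
            # Overlaps the interval on top of the stack: extend its end time.
            top_start, top_end = stack.pop()
            stack.append((top_start, max(top_end, end)))
        else:
            # No overlap with the top of the stack: keep it as a new interval.
            stack.append((start, end))
    return stack

print(merge_intervals([(1, 3), (2, 4), (5, 8), (6, 9)]))  # [(1, 4), (5, 9)]
```

Because the list is read from bottom to top, the merged intervals come out in increasing order of start time.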

Merge overlapping intervals : Implementation

package com.company;


import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.Stack;

/**
 * Created by sangar on 8.4.18.
 */
public class OverlappingIntervals {
    public  static ArrayList<Interval>
        mergeOverlappingIntervals(ArrayList<Interval> intervals){

        ArrayList<Interval> mergedIntervals = new ArrayList<>();
        Stack<Interval> s = new Stack<>();

        //Sort the ArrayList of interval based on start time.
        Collections.sort(intervals, Comparator.comparing(p -> p.getStartTime()));
        for(Interval currentInterval : intervals){
            if(s.empty())s.push(currentInterval);
            else {
                Interval previousInterval = s.pop();
                if(previousInterval.getEndTime() > 
                     currentInterval.getStartTime()){
                    /*
                    If current interval's start time is less than end time of
                    previous interval, find max of end times of two intervals
                    and push new interval on to stack.
                     */
                    int endTime = Integer.max(previousInterval.getEndTime(),
                                              currentInterval.getEndTime());
                    /* Notice that we have created new interval and 
                       did not update the old one
                       This concept is called as immutability of class
                     */
                    s.push(new Interval(previousInterval.getStartTime(),
                                        endTime));
                }
                else{
                    s.push(previousInterval);
                    s.push(currentInterval);
                }
            }
        }
        while(!s.empty()){
            mergedIntervals.add(s.pop());
        }

        return mergedIntervals;
    }

    public static void main(String[] args) {
        ArrayList<Interval> intervals = new ArrayList<>();

        intervals.add(new Interval(1,3));
        intervals.add(new Interval(2,4));
        intervals.add(new Interval(5,8));
        intervals.add(new Interval(6,9));
        ArrayList<Interval> mergedIntervals = mergeOverlappingIntervals(intervals);
        for (Interval interval : mergedIntervals){
            System.out.print("(" + interval.getStartTime() +"," + interval.getEndTime() + ")");
        }
    }
}

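The complexity of the algorithm to merge overlapping intervals is O(n log n) due to sorting, with O(n) extra space for the stack; copying the stack into the list that is returned also takes O(n) space. Note that popping the stack yields the merged intervals in decreasing order of start time.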

There is another way to implement the same function without using the stack, here we use the fact that ArrayList in Java is implemented using the array as the base and getting an element at a particular index should be O(1) operation. The code looks more or less the same, however, there is no traversal of the stack at the end to create the list to return.

public List<Interval> mergeOptimized(List<Interval> intervals) {

        if(intervals.size() == 0) return intervals;

        Collections.sort(intervals, 
           (Interval a, Interval b) -> a.getStartTime() - b.getStartTime());

        List<Interval> mergedIntervals = new ArrayList<Interval>();
        for(Interval interval : intervals){

            /* If the merged list is empty, add the interval to it;
               otherwise check whether the last interval in the merged
               list overlaps the current one.
               Remember the get function on ArrayList is an O(1) operation
               because ArrayLists in Java are backed by arrays. */
            if(mergedIntervals.isEmpty()
                    || mergedIntervals.get(mergedIntervals.size()-1).getEndTime() < 
                       interval.getStartTime() ){
                mergedIntervals.add(interval);
            }
            else {
                int lastEndTime = Math.max(
                        mergedIntervals.get(mergedIntervals.size()-1).getEndTime(),
                        interval.getEndTime()
                );
                mergedIntervals.get(mergedIntervals.size()-1).setEndTime(lastEndTime);
            }
        }

        return mergedIntervals;
    }

You can use the above snippet of code to submit for this leetcode problem and it should be accepted.


Interval Scheduling Algorithm


Any interval has two timestamps: its start time and its end time. When scheduling a number of intervals onto a particular resource, no two intervals may overlap; that is, the second interval cannot be scheduled while the first is running. Given a set of intervals S with start times si and end times ei, find the largest subset R such that all intervals in R are mutually compatible. Two intervals are called compatible if they do not overlap (the second job starts after, or at the same time as, the first one finishes). This problem is called the interval scheduling problem, and the algorithm that solves this class of problems is called the interval scheduling algorithm.
Example: given 8 intervals {A,B,C,D,E,F,G,H}, the optimal set would be {B,E,H}.
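As a rough sketch of the compatibility rule, the check below treats each interval as a start time and an end time. The class and method names are hypothetical and are not part of the original post.

```java
class Compatibility {
    // Hypothetical helper: two intervals are compatible when one of them
    // starts at or after the time the other one finishes.
    static boolean compatible(int start1, int end1, int start2, int end2) {
        return start2 >= end1 || start1 >= end2;
    }
}
```

For example, (930, 1100) and (1100, 1400) are compatible because the second starts exactly when the first finishes, while (930, 1300) and (1130, 1300) are not.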

interval scheduling

Interval Scheduling : Line of thought

Notice from the problem statement that the ask is to maximize output under the given constraints. What template does this kind of problem fit? A greedy algorithm. At each step we need to select the job that maximizes output, i.e. gives us the maximum number of compatible intervals. In what order should the intervals be evaluated? There are some natural orders we can think of:
1. Order intervals by earliest start time first.
2. Order intervals by earliest end time first.
3. Order intervals by minimum number of overlapping jobs.
4. Order intervals by shortest job first.

Let’s take some examples and see how things work out with each criterion.
1. Earliest start time first

interval scheduling algorithm

In the above arrangement, if we schedule the interval with the earliest start time first, only one interval will be scheduled; optimally, three intervals, {B, C, D}, should have been scheduled.
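The figure for this case is not reproduced here, so the sketch below uses made-up intervals A = (0, 10), B = (1, 3), C = (4, 6) and D = (7, 9) with the same shape: one long interval that starts first and covers three short ones. The class name, method name and interval values are illustrative assumptions.

```java
import java.util.Arrays;
import java.util.Comparator;

class GreedyOrderDemo {
    // Walks the intervals in the given order and keeps each one that starts
    // at or after the end of the last interval kept. This single comparison
    // is sufficient for start-time and end-time orderings.
    static int countSelected(int[][] intervals, Comparator<int[]> order) {
        int[][] sorted = intervals.clone();
        Arrays.sort(sorted, order);
        int count = 0;
        int lastEnd = Integer.MIN_VALUE;
        for (int[] interval : sorted) {
            if (interval[0] >= lastEnd) {
                count++;
                lastEnd = interval[1];
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Assumed data: A = (0,10), B = (1,3), C = (4,6), D = (7,9)
        int[][] intervals = { {0, 10}, {1, 3}, {4, 6}, {7, 9} };
        // Earliest start first keeps only A: prints 1
        System.out.println(countSelected(intervals,
                Comparator.comparingInt((int[] iv) -> iv[0])));
        // Earliest end first keeps B, C and D: prints 3
        System.out.println(countSelected(intervals,
                Comparator.comparingInt((int[] iv) -> iv[1])));
    }
}
```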

2. Minimum number of conflicting jobs

interval scheduling problem

If we select the job with the fewest conflicts first, we will select F (2 conflicts), then C (3 conflicts), and then E (again 3 conflicts). However, the ideal set is {B, C, D, E}.

3. Shortest job first.

In this case, if we select the shortest job first, the set will contain only interval A, whereas the optimal set is {B, C}.
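To make the shortest-job-first case concrete, here is a small sketch with assumed values B = (0, 5), A = (4, 7) and C = (6, 11), where the short interval A overlaps both of its neighbours. The names and numbers are hypothetical, chosen only to match the situation described above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ShortestFirstDemo {
    // Considers intervals in the given order and keeps a candidate only if
    // it does not overlap any interval that has already been kept.
    static List<int[]> pick(int[][] intervals, Comparator<int[]> order) {
        int[][] sorted = intervals.clone();
        Arrays.sort(sorted, order);
        List<int[]> chosen = new ArrayList<>();
        for (int[] candidate : sorted) {
            boolean clash = false;
            for (int[] kept : chosen) {
                // Two intervals overlap when each starts before the other ends.
                if (candidate[0] < kept[1] && kept[0] < candidate[1]) {
                    clash = true;
                    break;
                }
            }
            if (!clash) {
                chosen.add(candidate);
            }
        }
        return chosen;
    }

    public static void main(String[] args) {
        // Assumed data: B = (0,5), A = (4,7), C = (6,11)
        int[][] intervals = { {0, 5}, {4, 7}, {6, 11} };
        // Shortest first keeps only A: prints 1
        System.out.println(pick(intervals,
                Comparator.comparingInt((int[] iv) -> iv[1] - iv[0])).size());
        // Earliest end first keeps B and C: prints 2
        System.out.println(pick(intervals,
                Comparator.comparingInt((int[] iv) -> iv[1])).size());
    }
}
```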
These are counterexamples for three of the four natural orderings, so these three criteria do not always give us the optimum output, which is the maximum number of compatible intervals. If we take the interval with the earliest end time first, it will give us the optimal set. Can you check that the above three examples give the correct answer if you pick intervals by earliest end time first? If we take the first example and order it by end time, the intervals will look like this. From this we can easily find out that the compatible intervals are B, E and H.

Interval scheduling algorithm

Sort all jobs by end time in increasing order.

  1. Take the interval which has the earliest finish time.
  2. Repeat the next two steps till you have processed all jobs.
  3. Eliminate all intervals which have a start time less than the selected interval’s end time.
  4. If an interval has a start time greater than or equal to the current interval’s end time, add it to the set. Set the current interval to the new interval.
package com.company;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;

/**
 * Created by sangar on 25.4.18.
 */
public class IntervalScheduling {
    public static ArrayList<Interval> intervalScheduling(ArrayList<Interval> intervals){
        Collections.sort(intervals, Comparator.comparing(p -> p.getEndTime()));

        ArrayList<Interval> resultList = new ArrayList<>();

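        for(Interval currentInterval : intervals) {
            // The first interval in end-time order is always selected.
            if(resultList.isEmpty()) resultList.add(currentInterval);
            else{
                // Compatible if it starts at or after the end of the last selected interval.
                if(currentInterval.getStartTime() >= resultList.get(resultList.size()-1).getEndTime()){
                    resultList.add(currentInterval);
                }
            }
        }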
        return resultList;
    }

    public static void main(String args[] ) throws Exception {
        ArrayList<Interval> intervals = new ArrayList<>();

        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(930,1300));
        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(1130,1300));
        intervals.add(new Interval(1100,1400));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1430,1700));
        intervals.add(new Interval(1530,1700));

        ArrayList<Interval> compatibleIntervals = intervalScheduling(intervals);

        for(Interval interval : compatibleIntervals) {
            System.out.println("(" + interval.getStartTime() + "," + interval.getEndTime() + ")");
        }
    }
}
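The listing above relies on an Interval class that is not shown in this post. A minimal sketch, assuming only the constructor and accessors that the snippets here actually call, might look like the following; the field names are guesses, and the class is left package-private so it can sit in the same package as IntervalScheduling.

```java
// Hypothetical Interval class, reconstructed from how it is used above.
class Interval {
    private final int startTime;
    private int endTime;

    Interval(int startTime, int endTime) {
        this.startTime = startTime;
        this.endTime = endTime;
    }

    int getStartTime() {
        return startTime;
    }

    int getEndTime() {
        return endTime;
    }

    // Used by the merge-intervals snippet to extend the last merged interval.
    void setEndTime(int endTime) {
        this.endTime = endTime;
    }
}
```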

The complexity of the algorithm is dominated by the sort, which is O(N log N); the single pass over the sorted intervals that follows is O(N).

Reference 
http://courses.cs.vt.edu/cs5114/spring2009/lectures/lecture04-greedy-scheduling.pdf

Please share if you find something missing or wrong. If you want to contribute to algorithm and me and share your knowledge with thousands of students across the world, please reach out to us at communications@algorithmsandme.com.