Interval partitioning problem

Interval partitioning problem

In continuation of greedy algorithm problem, (earlier we discussed : even scheduling and coin change problems) we will discuss another problem today. Problem is known as interval partitioning problem and it goes like : There are n lectures to be schedules and there are certain number of classrooms. Each lecture has a start time si and finish time fi. Task is to schedule all lectures in minimum number of classes and there cannot be more than one lecture in a classroom at a given point of time. For example, minimum number of classrooms required to schedule these nine lectures is 4 as shown below.

interval partition

However,  we can do some tweaks and manage to schedule same nine lectures in three classrooms as shown below.

So, second solution optimizes the output.

Another variant of this problem is :  You want to schedule jobs on a computer. Requests take the form (si , fi) meaning a job that runs from time si to time fi. You get many such requests, and you want to process as many as possible, but the computer can only work on one job at a time.

Interval partitioning : Line of thought

First thing to note about interval partitioning problem is that we have to minimize something, in this case, number of classrooms. What template this problem fits into? Greedy may be? Yes it fits into greedy algorithm template. In greedy algorithm we take decision on local optimum.

Before discussing the solution, be clear that what is resource and what needs to be minimized? In this problem, resource is classroom and total number of classroom needs to be minimized by arranging lectures in certain order.

There are few natural orders in which we can arrange all lectures or for sake of generality, tasks. First is to arrange them in order of finish time,  second is to arrange in order of start time, third is to order them by smallest duration of task, fourth is by minimum number of conflicting jobs. Which one to chose?
You can come up with counter example when if lectures are arranged in classrooms by order of their end time, or smallest duration or minimum number of conflicting jobs, it does not end to optimal solution  So, let’s pick lectures based on earliest start time. At any given pint of time, pick lecture with least start time and yet not scheduled and then assign it to first available class. Will it work? Sure it does.  When you have assigned all lectures, total number of classrooms will be minimum number of classrooms required.

Interval partitioning algorithm

1. Sort all lectures based on start time in ascending order.
2. Number of initial classrooms = 0
3. While lecture to be scheduled:
   3.1 Take first lecture yet not scheduled,
   3.2 If there a already a class available for lecture's start time
       Assign lecture to the class.
   3.3 If not, then allocate a new classroom
       number of classroom = number of classroom + 1
4. Return number of classrooms.

Before jumping into the code, let’s discuss some data structures which we can use to implement this algorithm.

Understand that we have to find a compatible classroom for a lecture. There are many classrooms, we need to check if the finish time of lecture in that classroom is less than start time of new lecture. If yes , then classroom is compatible, if there is no such class, allocate a new class. If we store our allocated classrooms in such a way that it always gives classroom with least finish time of last lecture scheduled there, we can safely say that if this classroom is not compatible, none of the others will be.(Why?) Every time we assign a lecture to a classroom, sort the list of classroom, so that first classroom is with least finish time.  Sort has complexity of O(n log n) and if we do it for all n intervals, overall complexity of algorithm will be O(n2 log n).

We are sorting just to find minimum end time across all classrooms. This can easily be achieved by min heap or priority queue keyed on finish time of last lecture of class. Every time finish time of last lecture changes for a classroom, heap is readjusted and root gives us classroom with min finish time.

  • To determine whether lecture j is compatible with some classroom, compare sj to key of min classroom k in priority queue.
  • When a lecture is added to a classroom,  increase key of classroom k to fj.

Well know we have algorithm and data structure to implement in, so let’s code it.

PrioritityQueue implementation is given below:

import heapq
# This is our priority queue implementation
class PriorityQueue:
    def __init__(self):
        self._queue = []
        self._index = 0
 
    def push(self, item, priority):
        heapq.heappush(self._queue, (priority, self._index, item))
        self._index += 1
 
    def pop(self):
        if(self._index == 0):
                return None
        return heapq.heappop(self._queue)[-1];

Classroom class implementation

class Classroom:
	def __init__(self, number, finish_time):
		self.class_num = number
		self.finish_time = finish_time
	def __repr__(self):
		return 'Classroom({!r})'.format(self.class_num)

Interval partitioning problem : Implementation

from PriorityQueue import PriorityQueue
from Classroom import Classroom

jobs = [(1, 930, 1100),
        (2, 930, 1300),
        (3, 930, 1100),
        (5, 1100, 1400),
        (4, 1130, 1300),
        (6, 1330, 1500),
        (7, 1330, 1500),
        (8,1430,1700),
        (9, 1530, 1700),
        (10, 1530, 1700)
]

def find_num_classrooms():
	num_classrooms = 0;
	priority_queue = PriorityQueue()

	for job in jobs:
		# we have job here, now pop the classroom with least finishing time
		classroom = priority_queue.pop();
		if(classroom == None) :
			#allocate a new class
			num_classrooms+= 1;
			priority_queue.push(Classroom(num_classrooms,job[2]),job[2]);
		else:
			#check if finish time of current classroom is
			#less than start time of this lecture
			if(classroom.finish_time  <= job[1]):
				classroom.finish_time = job[2]
				priority_queue.push(classroom,job[2])
			else:
				num_classrooms+= 1;
				#Since last classroom needs to be compared again, push it back
				priority_queue.push(classroom,job[2])
				#Push the new classroom in list
				priority_queue.push(Classroom(num_classrooms,job[2]),job[2])

    return  num_classrooms
	
print "Number of classrooms required: " +  find_num_classrooms();

Java Implementation

package com.company;

import java.util.*;

/**
 * Created by sangar on 24.4.18.
 */
public class IntervalPartition {

    public static int findIntervalPartitions(ArrayList<Interval> intervals){
        PriorityQueue<Interval> queue =
                new PriorityQueue<Interval>(intervals.size(), Comparator.comparing(p -> p.getEndTime()));

        for(Interval currentInterval : intervals) {
            if (queue.isEmpty()) queue.add(currentInterval);
            else {
                if (queue.peek().getEndTime() > currentInterval.getStartTime()) {
                    queue.add(currentInterval);
                } else {
                    queue.remove();
                    queue.add(currentInterval);
                }
            }
        }
        return queue.size();
    }

    public static void main(String args[] ) throws Exception {
        ArrayList<Interval> intervals = new ArrayList<>();

        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(930,1300));
        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(1130,1300));
        intervals.add(new Interval(1100,1400));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1430,1700));
        intervals.add(new Interval(1530,1700));

        Collections.sort(intervals, Comparator.comparing(p -> p.getStartTime()));

        int minimumClassRooms = findIntervalPartitions(intervals);
        System.out.println(minimumClassRooms);
    }
}

This algorithm takes overall time of O(n log n) dominated by the sorting of jobs on start time. Total number of priority queue operations is O(n) as we have only n lectures to schedule and for each lecture we have push and pop operation.

Reference :

There is another method using binary search algorithm which can be used to solve this problem. As per problem statement, we have to find minimum number of classrooms to schedule n lectures. What are the maximum number of classrooms required? It will be number of lectures when all lectures conflict with each other.
Minimum number of classrooms will be 0 when there is no lecture to be scheduled. Now, we know the range of values of classrooms. How can we find minimum?

Basic idea is that if we can schedule all n lectures in m rooms, then we can definitely schedule them in m+1 and more rooms. So minimum number of rooms required will be either m or less than it. In this case, we can safely discard all candidate solution from m to n (remember n is the maximum number of classrooms).
Again what if we can not schedule lectures in m rooms, then there is no way we can schedule them in less than m rooms. Hence we can discard all candidate solutions less than m.

How can we select m? We can select is as mid of range which is (0,n). And try to fit all lectures on those m rooms based on condition that none of lecture conflicts. Keep track of end time of last lecture of each classroom. If none of the classroom has end time less than start time of new lecture, allocate new class. If total number of classrooms is less than or equal to m, discard m+1 to n. If it is more than m, then discard 0 to m and search for m+1 to n.

package com.company;

import java.util.*;

/**
 * Created by sangar on 24.4.18.
 */
public class IntervalPartition {

    public static boolean predicate(ArrayList<Interval> intervals, long candidateClassRooms){

        int i = 0;

        PriorityQueue<Interval> queue =
                new PriorityQueue<Interval>(intervals.size(), Comparator.comparing(p -> p.getEndTime()));

        for(Interval currentInterval : intervals){
            if(queue.isEmpty()) queue.add(currentInterval);
            else{
                if(queue.peek().getEndTime() > currentInterval.getStartTime()){
                    queue.add(currentInterval);
                }
                else{
                    queue.remove();
                    queue.add(currentInterval);
                }
            }
        }

        return queue.size() <= candidateClassRooms;
    }

    public static void main(String args[] ) throws Exception {
        ArrayList<Interval> intervals = new ArrayList<>();

        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(930,1300));
        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(1130,1300));
        intervals.add(new Interval(1100,1400));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1430,1700));
        intervals.add(new Interval(1530,1700));

        long low = 0;
        long high = intervals.size();

        Collections.sort(intervals, Comparator.comparing(p -> p.getStartTime()));

        while(low < high){
            long mid  = low + ( (high - low) >> 1);

            if(predicate(intervals, mid)){
                high = mid;
            }else{
                low = mid+1;
            }
        }
        System.out.println(low);
    }
}

Complexity of algorithm is dependent on number of lectures to be scheduled which is O(n log n ) with additional space complexity of O(c) where c is number of classrooms required.

Please share your views and suggestions in comments and feel free to share and spread the word. If you are interested to share your knowledge to learners across the world, please write to us on communications@algorithmsandme.com

Interval Scheduling Algorithm

Interval Scheduling Algorithm

Any interval has two time stamps, it’s start time and end time. To schedule number of intervals on to particular resource, take care that no two intervals are no overlapping, that is to say second interval cannot be scheduled while first is running. Given a set of intervals S with their start time si and end time ei, find out largest set R such that all events in R are mutually compatible. Two intervals are called compatible if they do not overlap (2nd job starts after or at the same time as the 1st one finishes). This problem is called as interval scheduling problem and algorithm which helps solve this class of problems is called as interval scheduling algorithm.
Example: 8 intervals{A,B,C,D,E,F,G,H}, optimal set would be {B,E,H}

interval scheduling

Interval Scheduling : Line of thought

Notice from problem statement that ask is to maximize output with given constraints. What template this kind of problems fit in? It’s greedy algorithm. We need to select each job which maximizes output, i.e gives us maximum number of compatible intervals. What should be the order of evaluation of intervals? There are some natural orders we can think of :
1. Order intervals by earliest start time first.
2. Order intervals by earliest end time first.
3. Order intervals by minimum number of overlapping jobs.
4. Order intervals by shortest job first.

Let’s take some examples and see how things work out with each criteria.
1. Earliest start time first

interval scheduling algorithm

In above arrangement, if we schedule interval with earliest start time first, only one interval will scheduled, however, optimally, 3 intervals, { B, C, D } should have been scheduled.

2. Minimum number of conflicting jobs

interval scheduling problem

If we select job with least conflicts first, we will select F ( 2 conflicts) then C ( with 3 conflicts ) and then E ( again with 3 conflicts ). However, ideal set should be { B, C, D, E }

3. Shortest job first.

In this case, if we select shortest job first, set will contain only interval A, where as optimal set is {B, C}.
These are counter examples for three of the four natural ordering, these three criteria cannot give us optimum output, which is maximum number of compatible intervals. If we take interval with earliest end time, it will give us optimal set. Can you check if above three examples give you correct answer if you take interval based on earliest end time first? If we take first example, when order by end time, intervals will look like this. From this we can easily find out that compatible intervals are B, E and H.

Interval scheduling algorithm

Sort all jobs which based on end time in increasing order.

  1. Take the interval which has earliest finish time.
  2. Repeat net two steps till all you process all jobs
  3. Eliminate all intervals which have start time less than selected interval’s end time.
  4. If interval has start time greater than current interval’s end time, at it to set. Set current interval to new interval
package com.company;

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.PriorityQueue;

/**
 * Created by sangar on 25.4.18.
 */
public class IntervalScheduling {
    public static ArrayList<Interval> intervalScheduling(ArrayList<Interval> intervals){
        Collections.sort(intervals, Comparator.comparing(p -> p.getEndTime()));

        ArrayList<Interval> resultList = new ArrayList<>();

        for(Interval currentInterval : intervals) {
            if(resultList.isEmpty()) resultList.add(currentInterval);
            else{
                if(currentInterval.getStartTime() > resultList.get(resultList.size()-1).getEndTime()){
                    resultList.add(currentInterval);
                }
            }
        }
        return resultList;
    }

    public static void main(String args[] ) throws Exception {
        ArrayList<Interval> intervals = new ArrayList<>();

        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(930,1300));
        intervals.add(new Interval(930,1100));
        intervals.add(new Interval(1130,1300));
        intervals.add(new Interval(1100,1400));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1330,1500));
        intervals.add(new Interval(1430,1700));
        intervals.add(new Interval(1530,1700));

        ArrayList<Interval> compatibleIntervals = intervalScheduling(intervals);

        for(Interval interval : compatibleIntervals) {
            System.out.println("(" + interval.getStartTime() + "," + interval.getEndTime() + ")");
        }
    }
}

Complexity of algorithm is dominated by the sorting which is O(N log N) which is actually the complexity of sort algorithm.

Reference 
http://courses.cs.vt.edu/cs5114/spring2009/lectures/lecture04-greedy-scheduling.pdf

Please share if you find something missing or wrong. If you want to contribute to algorithm and me and share your knowledge with thousands of students across world, please reach out to us at communications@algorithmsandme.com