Scheduling weighted jobs

Suppose we are given n jobs j1, j2, j3 … jn with their start times s1, s2, … sn and finish times f1, f2, f3 … fn. There is a value vi associated with each job. The problem is to schedule weighted jobs such that all scheduled jobs are compatible and we get the maximum total value. Two jobs are said to be compatible if their execution times do not overlap.

For example, we have four jobs as shown below:

Job    Start time    Finish time    Value
 1         1              3           50
 2         3              5           20
 3         6              9          100
 4         3             12          200

In the above example, the maximum value of 250 is achieved by scheduling job 1 and job 4. Notice that there is one more schedule with mutually compatible jobs (job 1, job 2 and job 3); however, the value we get from that schedule is only 170, which is less than the earlier schedule.

Scheduling weighted jobs : Line of thought

There is a strong urge to use a greedy algorithm here, since the problem is very similar to the Interval Scheduling Algorithm. However, the greedy algorithm works only when all jobs have equal value. Since job values differ here, the greedy approach fails: in the example above, greedy selection by earliest finish time picks jobs 1, 2 and 3 for a value of 170, while the optimal schedule is worth 250.

Let’s consider a brute force solution. First of all, sort all jobs in increasing order of finish time. Now, for each job, decide whether including it in the schedule gives us the maximum value or excluding it does. When we include a job, we must check that it is compatible with the other jobs already included in the schedule. To determine compatibility quickly, we pre-calculate an array P such that

p(j) = largest index i < j such that job i is compatible with j.

For the jth job or interval to be compatible with the ith one, the start time of the jth interval should be greater than or equal to the end time of the ith interval.

For example: p(8) = 5, p(7) = 3, p(2) = 0.

(Figure: jobs on a timeline illustrating p(8) = 5, p(7) = 3 and p(2) = 0.)

Now, let’s say OPT(j) represents the maximum value we can gain by scheduling jobs from 1 to j. As mentioned above, there are two cases:

Case 1: OPT selects job j. In this case we cannot use the incompatible jobs {p(j) + 1, p(j) + 2, …, j – 1} and must include the optimal solution to the problem consisting of the remaining compatible jobs 1, 2, …, p(j).

Case 2: OPT does not select job j. Then it must include the optimal solution to the problem consisting of jobs 1, 2, …, j – 1.

For case 1, we already have P[j] calculated. With P[j] prepared, we know we don’t have to check any job with index greater than P[j] (and less than j), as all of them conflict with the current job. The recursive formula for calculating the maximum value for n jobs is:

OPT(j) = 0                                       if j = 0
         max { vj + OPT(p(j)), OPT(j-1) }        otherwise
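
For the four example jobs (sorted by finish time), the p values are p(1) = 0, p(2) = 1, p(3) = 2 and p(4) = 1, and the recurrence unfolds as:

OPT(1) = 50
OPT(2) = max { 20 + OPT(1), OPT(1) }  = max { 70, 50 }   = 70
OPT(3) = max { 100 + OPT(2), OPT(2) } = max { 170, 70 }  = 170
OPT(4) = max { 200 + OPT(1), OPT(3) } = max { 250, 170 } = 250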

Scheduling weighted jobs : Recursive solution

package com.company;

import java.util.Arrays;

/**
 * Created by sangar on 4.5.18.
 */
public class ScheduleWeightedJobs {

    /**
     * Returns the maximum value achievable using jobs 0..j.
     * nonConflictJobs[j] holds the index of the latest job that finishes
     * before job j starts, or -1 if there is no such job.
     */
    public static int optimalScheduling(Job[] jobs, int[] nonConflictJobs, int j){
        if(j == -1){
            return 0;
        }

        // Either include job j (its value plus the best schedule of compatible
        // earlier jobs) or exclude it and take the best schedule of jobs 0..j-1.
        return Integer.max(optimalScheduling(jobs, nonConflictJobs, nonConflictJobs[j]) + jobs[j].getValue(),
                            optimalScheduling(jobs, nonConflictJobs, j-1));
    }

    public static void main(String[] args) {

        Job[] jobs = new Job[4];
        jobs[0] = new Job(1, 3, 50);
        jobs[1] = new Job(3, 5, 20);
        jobs[2] = new Job(6, 9, 100);
        jobs[3] = new Job(3, 12, 200);

        // Sort jobs in increasing order of finish time.
        Arrays.sort(jobs, (o1, o2) -> o1.getEndTime() - o2.getEndTime());

        // nonConflictingJobs[j] = latest index i < j whose job finishes
        // before job j starts (p(j) in the text), or -1 if none exists.
        int[] nonConflictingJobs = new int[jobs.length];

        for (int j = 0; j < jobs.length; j++) {
            nonConflictingJobs[j] = -1;
            for(int i = j-1; i >= 0; i--) {
                if(jobs[i].getEndTime() <= jobs[j].getStartTime()) {
                    nonConflictingJobs[j] = i;
                    break;
                }
            }
        }

        int maxValue = optimalScheduling(jobs, nonConflictingJobs, jobs.length-1);

        System.out.println(maxValue);
    }
}
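
The Job class is not shown in the post; a minimal version consistent with the constructor and getters used above would be:

package com.company;

/**
 * Simple container for a job's start time, finish time and value,
 * matching the constructor and getters used in the code above.
 */
public class Job {
    private final int startTime;
    private final int endTime;
    private final int value;

    public Job(int startTime, int endTime, int value){
        this.startTime = startTime;
        this.endTime = endTime;
        this.value = value;
    }

    public int getStartTime(){ return startTime; }
    public int getEndTime(){ return endTime; }
    public int getValue(){ return value; }
}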

This recursive algorithm has exponential complexity because many subproblems are calculated repeatedly.

(Figure: an example set of jobs and the recursion tree for the recursive solution, with the same subproblems appearing multiple times.)

If we revisit the problem, it has two properties. First, it has optimal substructure: the optimal solution to a subproblem leads to the optimal solution of the bigger problem. Second, it has overlapping subproblems: from the figure, we can see that the same subproblems are recalculated many times. The typical way to avoid this repetition is to store the solutions to subproblems, a technique called memoization. It acts as a cache where results of subproblems are stored and looked up whenever required.
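
As a sketch, here is how the recursive solution above could be memoized, assuming the same Job class and nonConflictJobs array as before, and that job values are non-negative so that -1 can mark an uncomputed entry:

    // Top-down memoized variant of optimalScheduling.
    // memo[j] caches the best value achievable with jobs 0..j; -1 means "not computed yet".
    public static int optimalSchedulingMemo(Job[] jobs, int[] nonConflictJobs, int j, int[] memo){
        if(j == -1){
            return 0;
        }
        if(memo[j] != -1){
            return memo[j];
        }
        int include = jobs[j].getValue()
                + optimalSchedulingMemo(jobs, nonConflictJobs, nonConflictJobs[j], memo);
        int exclude = optimalSchedulingMemo(jobs, nonConflictJobs, j - 1, memo);
        memo[j] = Integer.max(include, exclude);
        return memo[j];
    }

    // Usage: int[] memo = new int[jobs.length]; java.util.Arrays.fill(memo, -1);
    //        optimalSchedulingMemo(jobs, nonConflictingJobs, jobs.length - 1, memo);

With memoization, each subproblem is solved only once, so the recursion itself takes O(n) once the nonConflictJobs array is ready.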

This is a typical application of dynamic programming.

Scheduling weighted jobs : Dynamic programming implementation

package com.company;

import java.util.Arrays;

/**
 * Created by sangar on 4.5.18.
 */
public class ScheduleWeightedJobs {

    /**
     * Bottom-up dynamic programming. optimalValue[i] holds the maximum value
     * achievable using jobs 0..i; jobs must be sorted by finish time.
     */
    public static int optimalSchedulingDP(Job[] jobs, int[] nonConflictJobs){
        int[] optimalValue = new int[jobs.length];

        optimalValue[0] = jobs[0].getValue();

        for(int i = 1; i < jobs.length; i++){
            // Value if job i is included: its value plus the best value of the
            // latest compatible job (0 if it has no compatible predecessor).
            int includeJob = jobs[i].getValue()
                    + (nonConflictJobs[i] >= 0 ? optimalValue[nonConflictJobs[i]] : 0);
            // Value if job i is excluded.
            int excludeJob = optimalValue[i-1];
            optimalValue[i] = Integer.max(includeJob, excludeJob);
        }
        return optimalValue[jobs.length-1];
    }

    public static void main(String[] args) {

        Job[] jobs = new Job[4];
        jobs[0] = new Job(1, 3, 50);
        jobs[1] = new Job(3, 5, 20);
        jobs[2] = new Job(6, 9, 100);
        jobs[3] = new Job(3, 12, 200);

        // Sort jobs in increasing order of finish time.
        Arrays.sort(jobs, (o1, o2) -> o1.getEndTime() - o2.getEndTime());

        // nonConflictingJobs[j] = latest index i < j whose job finishes
        // before job j starts (p(j) in the text), or -1 if none exists.
        int[] nonConflictingJobs = new int[jobs.length];

        for (int j = 0; j < jobs.length; j++) {
            nonConflictingJobs[j] = -1;
            for(int i = j-1; i >= 0; i--) {
                if(jobs[i].getEndTime() <= jobs[j].getStartTime()) {
                    nonConflictingJobs[j] = i;
                    break;
                }
            }
        }

        int maxValue = optimalSchedulingDP(jobs, nonConflictingJobs);

        System.out.println(maxValue);
    }
}
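
For the four example jobs, both the recursive and the dynamic programming versions print 250, the value of scheduling job 1 and job 4.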

The run time of this dynamic programming approach, as implemented, is O(n²): sorting takes O(n log n), filling the optimalValue array takes only O(n), and building the nonConflictingJobs array with the nested loop takes O(n²) in the worst case. Note that we also need O(n) additional space to store the results of subproblems. The pre-processing, not the DP itself, is the bottleneck; it can be brought down to O(n log n) with binary search over finish times, as sketched below.
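
As a sketch, the nested loop that builds nonConflictingJobs could be replaced with a binary search, assuming jobs are already sorted by finish time:

    // For each job j, binary-search for the latest job that finishes
    // no later than job j starts (p(j) in the text); -1 if there is none.
    public static int[] computeNonConflictingJobs(Job[] jobs){
        int[] p = new int[jobs.length];
        for(int j = 0; j < jobs.length; j++){
            p[j] = -1;
            int low = 0, high = j - 1;
            while(low <= high){
                int mid = (low + high) / 2;
                if(jobs[mid].getEndTime() <= jobs[j].getStartTime()){
                    p[j] = mid;      // mid is compatible; look for a later compatible job
                    low = mid + 1;
                } else {
                    high = mid - 1;
                }
            }
        }
        return p;
    }

This brings the pre-processing down to O(n log n), so the overall algorithm is dominated by sorting.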

How about finding the solution itself, that is, which jobs actually give us the optimal value? This requires some post-processing. The algorithm is as follows:

Find-Solution(j):
    if j = 0: output nothing
    else if vj + Table[p(j)] > Table[j-1]:
        print j
        Find-Solution(p(j))
    else:
        Find-Solution(j-1)
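
A minimal Java sketch of this post-processing, assuming the optimalValue array from the dynamic programming solution is kept (rather than returning only its last entry) and that nonConflictJobs[j] is -1 when job j has no compatible predecessor:

    // Prints the indices (0-based, in sorted order) of one optimal set of jobs.
    public static void printSolution(Job[] jobs, int[] optimalValue, int[] nonConflictJobs, int j){
        if(j < 0){
            return;
        }
        int withJob = jobs[j].getValue()
                + (nonConflictJobs[j] >= 0 ? optimalValue[nonConflictJobs[j]] : 0);
        int withoutJob = (j > 0) ? optimalValue[j - 1] : 0;
        if(withJob > withoutJob){
            System.out.println("Job " + j + " with value " + jobs[j].getValue());
            printSolution(jobs, optimalValue, nonConflictJobs, nonConflictJobs[j]);
        } else {
            printSolution(jobs, optimalValue, nonConflictJobs, j - 1);
        }
    }

For the example jobs, this prints the jobs at indices 3 and 0 of the sorted array, i.e. job 4 and job 1, matching the optimal value of 250.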

Please share if there is something wrong or missing. If you are interested in contributing to Algorithms and Me, please drop us a mail.