Proof of optimality of a greedy solution to job sequencing

Asked 13/1, 2015 at 11:59 Answered 16/10, 2019 at 2:10

In this Job Sequencing Problem, how do we prove that the solution which greedy approach will provide is a optimal one?

Moreover, I am also not able to figure out the O(n) solution as the author later is claiming

It can be optimized to almost O(n) by using union-find data structure.

Trout answered 13/1, 2015 at 11:59 Comment(6)

By proving that it's not NP-complete. Note that Job Shop scheduling is NP-complete, so a greedy algorithm won't be optimal on the latter (no idea about Job Sequencing Problem though). – Luci 13/1, 2015 at 12:33

They're still sorting... – Recitative 13/1, 2015 at 12:38

At least they're linking to lecture notes that mention union-find and its associated complexities. – Maren 13/1, 2015 at 13:7

At least I'm seriously trying to give an answer which is both correct and understandable. – Cystocarp 13/1, 2015 at 13:17

@Cystocarp I assumed David was referring to the article linked to in the question, not the answers here - at least that was what I thought I was commenting on. – Maren 13/1, 2015 at 21:45

@G.Bach You're right. I didn't mean to come across rude. – Cystocarp 9/4, 2018 at 16:54

The optimality of the greedy solution can be seen by an exchange argument as follows. Without loss of generality, assume that all profits are different and that the jobs are sorted in decreasing order of profits.

Fix a solution S. From this solution, delete every job that misses its deadline. As by this transformation every job finishes within its deadline, the resulting solution S1 is still optimal. For job i, consider the interval I_i:=[0,min(n,deadline(i))] (as in the greedy algorithm). In this interval, to the right of job i, there are only jobs with larger processing time (if not, they either would have been considered earlier by our ordering, or they could be exchanged witout violation of their deadlines). Place job i to the rightmost position possible in I_i.

In total, we have the following statement.

Each solution S can be transformed into a solution S' with the following propoerties.

S' contains every job of S which finishes before its deadline.
For every job i in S, all jobs after i in I_i have a larger processing time.
For every job i in S, there is no idle time after i in I_i.
S' has the same objective value as S.

In particular, there exists an optimal solution S* with the properties above. Let S be the algorithm generated by the greedy algorithm. Note that S also has the properties 2 and 3 above. Let i be the first job which occurs in S but not in S*. Let pos be the time slot of i in S. If pos was empty in S*, S* could be improved by adding i, in contradiction to the optimality of S*. If pos was not empty in S*, the job i'at pos in S* cannot be of larger profit than i, since otherwise the greedy algorithm would have chosen i' to be at pos in S. Hence, it must be of lower profit than i. This means that it can be deleted and replaced by i in S*, which also contradicts the optimality of S*.

Cystocarp answered 13/1, 2015 at 12:5 Comment(4)

Thanks @Codor. Didn't get "The described greedy algorithm does not solve it to optimality" – Trout 13/1, 2015 at 12:7

I think, the processing time in this problem is always 1. This is not a knapsack problem. – Rotman 13/1, 2015 at 12:15

I don't understand how I_i is defined; a minimum is a number, not an interval. When is a job to the right of another job, do you mean jobs with later deadlines or jobs that are scheduled later in S? The exchange argument confuses me somewhat, since S is chosen as a solution (to the problem? then it is chosen as maximizing profit) initially, but then it is argued about as if it was the result of the algorithm. Also, when is a job in I_i? Do you mean the jobs scheduled during I_i (in S or S')? Do properties 2 and 3 refer to S' or are they only talking about S? – Maren 13/1, 2015 at 22:1

Concerning the interval, I made a mistake which is now fixed. I_i is meant to be the interval from zero to the deadline of job i. By 'right' of i I mean executed later in some solution. The argument is supposed to work as follows: First it is shown that only some solutions are 'interesting' (namely the ones with properties 1-4) in the sense that any solution can be modified to have properties 1-4 but the same profit. This holds in particular for an optimal solution, which is then compared to the solution generated by the greedy algorithm. I admit that the presentation is not elegant. – Cystocarp 14/1, 2015 at 7:4

Proof of correctness. Let's assume that it is not correct. What does it mean? It means that at some step we have thrown away a job because it was not possible to add it without removing an already taken one. If we had removed an already taken one and put the current one into its time slot, we wouldn't have changed anything for the next jobs(the set of occupied time slots would have been the same). But the total profit would have reduced. Thus, a solution produced by this algorithm is optimal.
Making it faster than O(n ^ 2). To do it, we need to find the rightmost free position to the left from the given one quickly. If we use a union-find structure to maintain a group of occupied position, we can just find the group where a deadline of the current job is located and an take element located to the left from it. But it is still O(n log n) because of the sorting.

Recreant answered 13/1, 2015 at 12:44 Comment(0)

Consider a job worth 2 points with deadline 4, and three jobs worth 1 point each, with deadline 3. This disproves the optimality of the "greedy" algorithm: it is more profitable to do the three cheaper jobs first, before the more expensive one.

As for the O(n^2) vs. O(n), I think both claims are wrong too. The "greedy" algorithm, as described, consists of sorting the jobs - O(nlogn) - plus a single scan of the sorted list, filling the solution sequence slots - O(n) - and is therefor O(nlogn).

To make it linear, one would have to do something about sorting, such use using radix sort, which can be considered linear for practical purposes.

Rotman answered 13/1, 2015 at 12:30 Comment(4)

Your example is not correct. We can do all jobs. The answer is 5. And a greedy algorithm finds it. – Recreant 13/1, 2015 at 12:48

Yes, we can do all jobs, but have to start with the less profitable ones to get the result of 5. That's the point. What's "not correct"? – Rotman 13/1, 2015 at 12:53

"not correct" means that it does not disprove the correctness of the greedy algorithm. It does solve this instance optimally. – Recreant 13/1, 2015 at 12:55

This is not a counterexample. It is possible to do all of the the jobs by a deadline of 5. The time horizon apparently is [0,n] where n is the number of jobs, although this is not stated very clearly in the problem description. The problem demands a sequence of jobs, which implicitly means that there is no idle time in the schedule, which means that in either case n is an upper bound for the length of the time horizon. – Cystocarp 13/1, 2015 at 12:58

The given proof is not complete. Namely, the sentence "the job i'at pos in S* cannot be of larger profit than i" is not proven, and I think it cannot be. Anyhow, there can be proven that the statement holds even if job i has profit larger or equal to i''.

for complete proof see http://ggn.dronacharya.info/CSEDept/Downloads/QuestionBank/Even/VI%20sem/ADA/Section-B/job-scheduling1.pdf

Arsine answered 16/10, 2019 at 2:10 Comment(0)

Recommended topics

Hot tags