Why is the greedy algorithm optimal?

Asked 8/10, 2014 at 16:56 Answered 8/10, 2014 at 17:38

Codility, lesson 14, task TieRopes (https://codility.com/demo/take-sample-test/tie_ropes). Stated briefly, the problem is to partition a list A of positive integers into the maximum number of (contiguous) sublists having sum at least K.

I've only come up with a greedy solution because that's the name of the lesson. It passes all the tests but I don't know why it is an optimal solution (if it is optimal at all).

int solution(int K, vector<int> &A) {
    int sum = 0, count = 0;
    for (int a : A)
    {
        sum += a;
        if (sum >= K)
        {
            ++count;
            sum = 0;
        }
    }
    return count;
}

Can somebody tell me if and why this solution is optimal?

Selfabsorption answered 8/10, 2014 at 16:56 Comment(4)

The normal pattern for proving a greedy algorithm optimal is to (1) posit a case where greedy doesn't produce an optimal result; (2) look at the first place where its solution and the optimal solution diverge; and (3) prove that that place can't exist. Proof by contradiction. – Frescobaldi 8/10, 2014 at 17:3

To me it looks like that contiguity requirement already eliminates all meaningful flexibility from the solution of this problem. I.e. the solution is obvious and straightforward: just accumulate a partition from the beginning until you get K and then start a new partition. The only flexibility it allows for is to continue to extend the current partition when its sum is already K, but it is immediately obvious that this will only reduce the quality of the solution. – Highbinder 8/10, 2014 at 17:16

@AndreyT That's your intuition and you're probably right but I don't think that's enough of a proof. – Selfabsorption 8/10, 2014 at 17:19

@NPS:, Yes, I could be wrong here. I assumed that we are not allowed to skip elements between partitions. But apparently we are. There's another flexibility then. We can sacrifice (skip) a few initial elements and start accumulating the first partition from some later element. The same can be done between partitions. Does it have a chance of producing a better solution? – Highbinder 8/10, 2014 at 17:19

Maybe I'm being naive or making some mistake here, but I think that is not too hard (although not obvious) to see that the algorithm is indeed optimal.

Suppose that you have an optimal partition of the list that with the maximum number of sublists. You may or may not have all of the elements of the list, but since adding an element to a valid list produces an also valid lists, lets suppose that any possible "remaining" element that was initially not assigned to any sublist was assigned arbitrarily to one of its adjacent sublists; so we have a proper optimal partition of the list, which we will call P1.

Now lets think about the partition that the greedy algorithm would produce, say P2. There are two things that can happen for the first sublist in P2:

It can be the same as the first sublist in P1.
It can be shorter than the first sublist in P1.

In 1. you would repeat the reasoning starting in the next element after the first sublist. If every subsequent sublist produced by the algorithm is equal to that in P1, then P1 and P2 will be equal.

In 2. you would also repeat the reasoning, but now you have at least one "extra" item available. So, again, the next sublist may:

2.1. Get as far as the next sublist in P1.
2.2. End before the next sublist in P1.

And repeat. So, in every case, you will have at least as many sublists as P1. Which means, that P2 is at least as good as any possible partition of the list, and, in particular, any optimal partition.

It's not a very formal demonstration, but I think it's valid. Please point out anything you think may be wrong.

Systematist answered 8/10, 2014 at 17:38 Comment(1)

That's also how I think about this problem. Basically, you can show that for any optimal solution having k sublists, the greedy solution also has at least k sublists whose first positions are each <= the corresponding first positions in the optimal solution. (One typo: It should be "It can be shorter than the first sublist in P1", not "... in P2".) – Jacobo 8/10, 2014 at 17:44

Here are the ideas that lead to a formal proof.

If A is a suffix of B, then the maximum partition size for A is less than or equal to the maximum partition size for B, because we can extend the first sublist of a partition of A to include the new elements without decreasing its sum.
Every proper prefix of every sublist in the greedy solution sums to less than K.
There is no point in having gaps, because we can add the missing elements to an adjacent list (I thought that my wording of the question had ruled out this possibility by definition, but I'll say it anyway).

The formal proof can be carried out by induction to show that, for every nonnegative integer i, there exists an optimal solution that agrees with the greedy solution on the first i sublists of each. It follows that, when i is sufficiently large, the only solution that agrees with greedy is greedy, so the greedy solution is optimal.

The basis i = 0 is trivial, since an arbitrary optimal solution will do. The inductive step consists of finding an optimal solution that agrees with greedy on the first i sublists and then shrinking the i+1th sublist to match the greedy solution (by observation 2, we really are shrinking that sublist, since it starts at the same position as greedy's; by observation 1, we can extend the i+2th sublist of the optimal solution correspondingly).

Halinahalite answered 8/10, 2014 at 17:7 Comment(3)

I don't follow your point #2 I'm afraid -- either why it must be true, or why it's helpful! The rest makes sense, but I think it also needs to be said explicitly that the first sublist in any optimal solution cannot be shorter than the first sublist in the greedy solution, to explain why the only case we have to deal with in the inductive step is to shrink the optimal solution's (i+1)th sublist (i.e. why we never need to grow them instead). – Jacobo 8/10, 2014 at 17:39

@Jacobo That was my intended meaning, and I agree that the previous wording was poor. – Halinahalite 8/10, 2014 at 17:54

That's clearer to me now, thanks. FWIW I don't think point #3 is needed; "partition" in the OP's question tells me that no solution is allowed to contain a gap, and gaps don't appear during the greedy construction. (I'm also puzzled by dasblinkenlights's mentioning of this.) While I'm nitpicking: Writing "only" in "the only solution that agrees with greedy is greedy" seems to hint that the greedy solution might be unique, but it's not (and doesn't need to be); AFAICT, all that's needed is that a matching optimal solution can be found when i = opt_number_of_sublists. – Jacobo 8/10, 2014 at 18:17

Recommended topics

Hot tags