Better results in set partitioning than by differencing

The partition problem is known to be NP-hard. Depending on the particular instance of the problem, we can try dynamic programming or heuristics such as differencing (also known as the Karmarkar-Karp algorithm).

The latter seems very useful for instances with large numbers (which make dynamic programming intractable), but it is not always optimal. What is an efficient way to find a better solution (randomized search, tabu search, other approximations)?
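For reference, the differencing heuristic itself is simple: repeatedly replace the two largest numbers with their difference. A minimal Python sketch (illustration only, not a submission):

```python
import heapq

def kk_difference(values):
    """Karmarkar-Karp differencing: repeatedly replace the two largest
    numbers with their difference.  The final remaining number is the
    achieved partition difference (not necessarily the optimal one)."""
    heap = [-v for v in values]        # max-heap via negation
    heapq.heapify(heap)
    while len(heap) > 1:
        a = -heapq.heappop(heap)       # largest
        b = -heapq.heappop(heap)       # second largest
        heapq.heappush(heap, -(a - b))
    return -heap[0] if heap else 0
```

On {4, 5, 6, 7, 8} this returns 2 even though a perfect split (8+7 vs 6+5+4) exists, which is exactly the gap the question is about.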

PS: There is a story behind this question. The challenge Johnny Goes Shopping has been available at SPOJ since July 2004. So far, the challenge has been solved by 1087 users, but only 11 of them scored better than a correct Karmarkar-Karp implementation (with the current scoring, Karmarkar-Karp gives 11.796614 points). How can one do better? (Answers supported by an accepted submission are most wanted, but please do not reveal your code.)

Tadd answered 2/9, 2015 at 13:13 Comment(18)
Forgive my ignorance - but wouldn't the score rather depend on the choice of input? Some algorithms may process certain data sets better than others...Lavernelaverock
Sounds like a job for Computer Science...Bolyard
@Robbie Dee, You are right, it depends of course on the input too. But if you know a better method than testing on some input data, please let me know.Tadd
Ah, OK - just wondered if they had a canonical set of data they ran against.Lavernelaverock
Try reading the next section on the Wikipedia page: en.wikipedia.org/wiki/Partition_problem#Other_approachesAnalects
@ j_random_hacker - Thanks (have seen it). Here m < 47 (ceil(lg 10^14)), while n <= 10000. So m is rather small but still not very easy. Anyway the section states which instances are hard but not how to cope with them.Tadd
@kuszi: I meant the 2 papers referenced in note 6.Analects
@ j_random_hacker - Thank you, I have missed them.Tadd
OK, I've just run it thru a few times and I'm getting the same score back each time (despite the source changing) so I guess they must use the same dataset for everyone.Lavernelaverock
And whatever set they're using must be nasty as all my test runs come out somewhere between 10 & 13 but my submission only got 7!Lavernelaverock
OK, seems to be some form of quadratic equation.Lavernelaverock
@Robbie Dee: Datasets are certainly the same for everyone.Tadd
If you are mostly interested in improving contest's score, you could try a simple approach: for smaller n (up to 40..50) use brute force algorithm, for larger n switch to Karmarkar-Karp. Also you could analyse input numbers' ranges, and if they are small, switch to DP.Annabelannabela
@Evgeny By brute force, do you mean branch and bound?Tadd
I mean Horowitz and Sahni method mentioned in Korf's paper (see comment #5). Or maybe Schroeppel and Shamir method, but I'm not sure if it will be useful here because contest's time limit is too strict.Annabelannabela
My experience with SPOJ challenge problems is that usually a combination of the following aspects leads to very good solutions: (1) Combine the exact approaches with the "fast" approximate approaches, i.e. by computing an initial solution with the approximate algorithm and then using DP/branch&bound to improve it. (2) Use random swapping or some other form of local search to improve an initial solution. Use all the time you have, i.e. by measuring the runtime (3) Use different algorithms and use the best result.Latreshia
Unfortunately, more often than not, there is also some test-case-specific optimization involved, because there are often only very few actually large sets, and those are often randomly generated.Latreshia
@NiklasB. That is exactly my experience. I would rather calculate exact answers for given problems (as at Project Euler) than have to hack algorithms to fit certain unknown test data.Lavernelaverock
For whatever it's worth, a straightforward, unoptimized Python implementation of the "complete Karmarkar-Karp" (CKK) search procedure in [Korf88] -- modified only slightly to bail out of the search after a given time limit (say, 4.95 seconds) and return the best solution found so far -- is sufficient to score 14.204234 on the SPOJ problem, beating the score for Karmarkar-Karp. As of this writing, this is #3 on the rankings (see Edit #2 below).

A slightly more readable presentation of Korf's CKK algorithm can be found in [Mert99].
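The core of CKK is a two-way branch at each step: commit the two largest numbers to different sets (replace them with their difference, the KK move) or to the same set (replace them with their sum), pruning when the largest number dominates the rest. A minimal, unoptimized sketch of the idea (illustrative only, not the actual submission):

```python
def ckk(nums):
    """Optimal partition difference via complete Karmarkar-Karp search:
    difference branch first (the KK heuristic), then the sum branch."""
    best = [sum(nums)]                     # upper bound on the difference
    def search(ns):
        if best[0] == 0:                   # perfect partition already found
            return
        if len(ns) == 1:
            best[0] = min(best[0], ns[0])
            return
        ns = sorted(ns, reverse=True)
        a, b, rest = ns[0], ns[1], ns[2:]
        if a >= b + sum(rest):             # largest dominates: result is forced
            best[0] = min(best[0], a - b - sum(rest))
            return
        search([a - b] + rest)             # two largest in different sets
        search([a + b] + rest)             # two largest in the same set
    search(list(nums))
    return best[0]
```

The real implementation additionally checks a wall-clock limit inside `search` and returns the best solution found so far, which is what makes it an anytime algorithm.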


EDIT #2 - I've implemented Evgeny Kluev's hybrid heuristic of applying Karmarkar-Karp until the list of numbers is below some threshold and then switching over to the exact Horowitz-Sahni subset enumeration method [HS74] (a concise description may be found in [Korf88]). As suspected, my Python implementation required lowering the switchover threshold versus his C++ implementation. With some trial and error, I found that a threshold of 37 was the maximum that allowed my program to finish within the time limit. Yet, even at that lower threshold, I was able to achieve a score of 15.265633, good enough for second place.

I further attempted to incorporate this hybrid KK/HS method into the CKK tree search, basically by using HS as a very aggressive and expensive pruning strategy. In plain CKK, I was unable to find a switchover threshold that even matched the KK/HS method. However, using the ILDS (see below) search strategy for CKK and HS (with a threshold of 25) to prune, I was able to yield a very small gain over the previous score, up to 15.272802. It probably should not be surprising that CKK+ILDS would outperform plain CKK in this context since it would, by design, provide a greater diversity of inputs to the HS phase.
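For readers unfamiliar with Horowitz-Sahni, the underlying idea is meet in the middle: enumerate all subset sums of each half of the input and combine the two lists. The sketch below is a simplified variant (sorting plus binary search instead of the original sorted-list merge), just to illustrate the O(2^(n/2)) structure; it is not the code used for the scores above:

```python
from bisect import bisect_left

def half_sums(nums):
    """All subset sums of nums (2^len(nums) values)."""
    sums = [0]
    for v in nums:
        sums += [s + v for s in sums]
    return sums

def mitm_partition_diff(values):
    """Minimal partition difference by meeting in the middle."""
    total = sum(values)
    left = half_sums(values[:len(values) // 2])
    right = sorted(half_sums(values[len(values) // 2:]))
    best = total
    for s in left:
        target = total / 2 - s             # ideal right-half contribution
        i = bisect_left(right, target)
        for j in (i - 1, i):               # closest right sums around the target
            if 0 <= j < len(right):
                best = min(best, abs(total - 2 * (s + right[j])))
    return best
```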


EDIT #1 - I've tried two further refinements to the base CKK algorithm:

  1. "Improved Limited Discrepancy Search" (ILDS) [Korf96]: an alternative to the natural DFS ordering of paths within the search tree. It tends to explore more diverse solutions earlier than regular depth-first search.

  2. "Speeding up 2-Way Number Partitioning" [Cerq12]: generalizes one of the pruning criteria in CKK from nodes within 4 levels above the leaves to nodes within 5, 6, and 7 levels above the leaves.

In my test cases, both of these refinements generally provided noticeable benefits over the original CKK in reducing the number of nodes explored (in the case of the latter) and in arriving at better solutions sooner (in the case of the former). However, within the confines of the SPOJ problem structure, neither of these were sufficient to improve my score.
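To see what visiting order ILDS imposes, here is a generic sketch over a fixed-depth binary tree where a heuristic always prefers the '0' child (an illustration of the idea in [Korf96]; the actual integration with CKK's branching is more involved):

```python
def ilds_paths(depth):
    """Yield all 0/1 paths of a fixed-depth binary tree in Improved
    Limited Discrepancy Search order: iteration k visits exactly the
    paths containing k discrepancies (non-preferred '1' moves)."""
    def walk(prefix, remaining, k):
        if remaining == 0:
            if k == 0:                     # spent exactly k discrepancies
                yield prefix
            return
        # Follow the heuristic ('0' child) only if the k discrepancies
        # can still be spent in the remaining depth below.
        if remaining > k:
            yield from walk(prefix + "0", remaining - 1, k)
        if k > 0:                          # spend a discrepancy on '1'
            yield from walk(prefix + "1", remaining - 1, k - 1)
    for k in range(depth + 1):
        yield from walk("", depth, k)
```

Iteration k visits exactly the paths with k non-preferred moves, and within an iteration the discrepancies closest to the leaves are tried first, which is the "improved" part relative to plain LDS.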

Given the idiosyncratic nature of this SPOJ problem (i.e.: 5-second time limit and only one specific and undisclosed problem instance), it is hard to give advice on what may actually improve the score*. For example, should we continue to pursue alternate search ordering strategies (e.g.: many of the papers by Wheeler Ruml listed here)? Or should we try incorporating some form of local improvement heuristic to solutions found by CKK in order to help pruning? Or maybe we should abandon CKK-based approaches altogether and try for a dynamic programming approach? How about a PTAS? Without knowing more about the specific shape of the instance used in the SPOJ problem, it's very difficult to guess at what kind of approach would yield the most benefit. Each one has its strengths and weaknesses, depending on the specific properties of a given instance.

* Aside from simply running the same thing faster, say, by implementing in C++ instead of Python.


References

[Cerq12] Cerquides, Jesús, and Pedro Meseguer. "Speeding Up 2-way Number Partitioning." ECAI. 2012, doi:10.3233/978-1-61499-098-7-223

[HS74] Horowitz, Ellis, and Sartaj Sahni. "Computing partitions with applications to the knapsack problem." Journal of the ACM (JACM) 21.2 (1974): 277-292.

[Korf88] Korf, Richard E. (1998), "A complete anytime algorithm for number partitioning", Artificial Intelligence 106 (2): 181–203, doi:10.1016/S0004-3702(98)00086-1.

[Korf96] Korf, Richard E. "Improved limited discrepancy search." AAAI/IAAI, Vol. 1. 1996.

[Mert99] Mertens, Stephan (1999), A complete anytime algorithm for balanced number partitioning, arXiv:cs/9903011

Phenazine answered 9/9, 2015 at 14:22 Comment(1)
Nice job on implementing the algorithm :)Latreshia
There are many papers describing various advanced algorithms for set partitioning. Here are only two of them:

Honestly, I don't know which of them gives a more efficient solution. Probably neither of these advanced algorithms is needed to solve that SPOJ problem. Korf's paper is still very useful. The algorithms described there are very simple (to understand and implement). He also overviews several even simpler algorithms (in section 2). So if you want to know the details of the Horowitz-Sahni or Schroeppel-Shamir methods (mentioned below), you can find them in Korf's paper. He also writes (in section 8) that stochastic approaches do not guarantee good enough solutions, so it is unlikely you will get significant improvements with something like hill climbing, simulated annealing, or tabu search.

I tried several simple algorithms and their combinations to solve partitioning problems with sizes up to 10000, maximum values up to 10^14, and a time limit of 4 seconds. They were tested on uniformly distributed random numbers, and an optimal solution was found for every problem instance I tried. For some problem instances optimality is guaranteed by the algorithm; for others optimality is not 100% guaranteed, but the probability of getting a sub-optimal solution is very small.

(Diagram: problem space divided between the algorithms.)

For sizes up to 4 (green area to the left), the Karmarkar-Karp algorithm always gives an optimal result.

For sizes up to 54, a brute-force algorithm is fast enough (red area). There is a choice between the Horowitz-Sahni and Schroeppel-Shamir algorithms. I used Horowitz-Sahni because it seems more efficient for the given limits. Schroeppel-Shamir uses much less memory (everything fits in the L2 cache), so it may be preferable when other CPU cores perform memory-intensive tasks, when doing set partitioning with multiple threads, or for solving bigger problems under a less strict time limit (where Horowitz-Sahni just runs out of memory).

When the size multiplied by the sum of all values is less than 5*10^9 (blue area), the dynamic programming approach is applicable. The border between the brute-force and dynamic programming areas on the diagram shows where each algorithm performs better.
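This DP is the classic subset-sum table over achievable sums, which is why its cost scales with size times sum. A compact illustrative sketch, using a Python integer as a bitset (illustration only, not the implementation used here):

```python
def min_partition_diff(values):
    """Exact minimal partition difference via subset-sum DP.
    Bit s of `reachable` is set iff some subset of values sums to s."""
    total = sum(values)
    reachable = 1                          # only the empty subset: sum 0
    for v in values:
        reachable |= reachable << v        # add v to every known subset sum
    # The achievable sum closest to total/2 minimizes the difference.
    return min(abs(total - 2 * s)
               for s in range(total // 2 + 1) if (reachable >> s) & 1)
```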

The green area to the right is where the Karmarkar-Karp algorithm gives an optimal result with almost 100% probability. Here there are so many perfect partitioning options (with delta 0 or 1) that Karmarkar-Karp almost certainly finds one of them. It is possible to construct a data set where Karmarkar-Karp always gives a sub-optimal result, for example {17 13 10 10 10 ...}. If you multiply these numbers by some large value, neither KK nor DP would be able to find the optimal solution. Fortunately such data sets are very unlikely in practice. But the problem setter could add such a data set to make the contest more difficult. In this case you can choose some advanced algorithm for better results (but only for the grey and right green areas on the diagram).

I tried two ways to implement the Karmarkar-Karp algorithm's priority queue: with a max-heap and with a sorted array. The sorted-array option appears to be slightly faster with linear search and significantly faster with binary search.

The yellow area is where you can choose between a guaranteed optimal result (with DP) and an optimal result with high probability (with Karmarkar-Karp).

Finally, the grey area, where neither of the simple algorithms by itself gives an optimal result. Here we can use Karmarkar-Karp to pre-process the data until it is small enough for either Horowitz-Sahni or dynamic programming. In this area there are also many perfect partitioning options, but fewer than in the green area, so Karmarkar-Karp by itself can sometimes miss the proper partitioning. Update: as noted by @mhum, it is not necessary to implement the dynamic programming algorithm to make this work. Horowitz-Sahni with Karmarkar-Karp pre-processing is enough. But it is essential that the Horowitz-Sahni algorithm handles sizes up to 54 within the given time limit to (almost) guarantee an optimal partitioning, so C++ (or another language with a good optimizing compiler) and a fast computer are preferable.

Here is how I combined Karmarkar-Karp with other algorithms:

template<bool Preprocess = false>
i64 kk(const vector<i64>& values, i64 sum, Log& log)
{
    log.name("Karmarkar-Karp");
    // The active numbers live in a sorted window [last, first) in the
    // upper half of pq; the unused lower half leaves room for the
    // window to slide left as differences are inserted.
    vector<i64> pq(values.size() * 2);
    copy(begin(values), end(values), begin(pq) + values.size());
    sort(begin(pq) + values.size(), end(pq));
    auto first = end(pq);
    auto last = begin(pq) + values.size();

    while (first - last > 1)
    {
        // Hand off to an exact algorithm once the remaining problem is
        // small enough for it.
        if (Preprocess && first - last <= kHSLimit)
        {
            hs(last, first, sum, log);
            return 0;
        }
        if (Preprocess && static_cast<double>(first - last) * sum <= kDPLimit)
        {
            dp(last, first, sum, log);
            return 0;
        }
        // Replace the two largest numbers with their difference (KK step).
        const auto diff = *(first - 1) - *(first - 2);
        sum -= *(first - 2) * 2;
        first -= 2;
        // Re-insert the difference, keeping the window sorted.
        const auto place = lower_bound(last, first, diff);
        --last;
        copy(last + 1, place, last);
        *(place - 1) = diff;
    }

    const auto result = (first - last)? *last: 0;
    log(result);
    return result;
}

Link to full C++11 implementation. This program only determines the difference between partition sums; it does not report the partitions themselves. Warning: if you want to run it on a computer with less than 1 GB of free memory, decrease the kHSLimit constant.

Annabelannabela answered 8/9, 2015 at 21:8 Comment(7)
Awesome post! I think this kind of case-by-case distinction is exactly what leads to really good scores in those challenge problems.Latreshia
@EvgenyKluev Are you able to tell how your approach performs on the SPOJ problem instance?Phenazine
@mhum: I don't plan to add actual subsets reconstruction or try this on SPOJ. I was mostly interested in comparison of different algorithms (and their performance) on random data. As for SPOJ problem, this approach should get top points if it does not contain any specific anti-Karmarkar-Karp data set, or much less otherwise.Annabelannabela
@EvgenyKluev Ok. Fair enough. If I have time, I'll try to implement your hybrid KK/HS approach (and figure out how to recover the actual partitions) in Python and see how it does. Although, I suspect the performance gap between Python and C++ may leave my implementation unable to fully exploit the power of this approach (e.g.: I will likely need to set a much lower kHSlimit).Phenazine
@mhum: Good point! It seems DP is not necessary for this approach. Decreasing kHSlimit to anything below 54 already gives slightly non-perfect and possibly sub-optimal results with max numbers near 10^14. Maybe CKK will be able to solve problematic instances... What do you think?Annabelannabela
@mhum: as for recovering partitions after HS, one possibility is to remember an optimal pair of sums, then apply HS recursively to each subset (which was arbitrarily chosen at the beginning) and corresponding sum. Should be pretty quick. But a little bit tricky.Annabelannabela
@EvgenyKluev Recovering the partitions after HS shouldn't be a problem so long as each potential subset sum has a reference back to the subset that created (e.g.: via a hashmap/dictionary). It's integrating the result of HS back into the partial KK solution I'll need to careful about. I'm pretty sure it's easy, just fiddly detail stuff.Phenazine
EDIT: Here's an implementation that starts with Karmarkar-Karp differencing and then tries to optimize the resulting partitions.

The only optimizations that time allows are giving one element from one partition to the other and swapping one element for one between the two partitions.

My implementation of Karmarkar-Karp at the beginning must be inaccurate, since the resulting score with just Karmarkar-Karp is 2.711483, not the 11.796614 points cited by the OP. The score goes up to 7.718049 when the optimizations are used.

SPOILER WARNING: C# submission code follows.

using System;
using System.Collections.Generic;
using System.Linq;
public class Test
{
    // some comparers to lazily avoid using a proper max-heap implementation
    public class Index0 : IComparer<long[]>
    {
        public int Compare(long[] x, long[] y)
        {
            if(x[0] == y[0]) return 0;
            return x[0] < y[0] ? -1 : 1;
        }
        public static Index0 Inst = new Index0();
    }
    public class Index1 : IComparer<long[]>
    {
        public int Compare(long[] x, long[] y)
        {
            if(x[1] == y[1]) return 0;
            return x[1] < y[1] ? -1 : 1;
        }
    }

    public static void Main()
    {
        // load the data
        var start = DateTime.Now;
        var list = new List<long[]>();
        int size = int.Parse(Console.ReadLine());
        for(int i=1; i<=size; i++) {
            var tuple = new long[]{ long.Parse(Console.ReadLine()), i };
            list.Add(tuple);
        }
        list.Sort((x, y) => { if(x[0] == y[0]) return 0; return x[0] < y[0] ? -1 : 1; });

        // Karmarkar-Karp differences
        List<long[]> diffs = new List<long[]>();
        while(list.Count > 1) {
            // get max
            var b = list[list.Count - 1];
            list.RemoveAt(list.Count - 1);
            // get max
            var a = list[list.Count - 1];
            list.RemoveAt(list.Count - 1);
            // (b - a)
            var diff = b[0] - a[0];
            var tuple = new long[]{ diff, -1 };
            diffs.Add(new long[] { a[0], b[0], diff, a[1], b[1] });
            // insert (b - a) back in
            var fnd = list.BinarySearch(tuple, new Index0());
            list.Insert(fnd < 0 ? ~fnd : fnd, tuple);
        }
        var approx = list[0];
        list.Clear();

        // set up partitions
        var listA = new List<long[]>();
        var listB = new List<long[]>();
        long sumA = 0;
        long sumB = 0;

        // Karmarkar-Karp rebuild partitions from differences
        bool toggle = false;
        for(int i=diffs.Count-1; i>=0; i--) {
            var inB = listB.BinarySearch(new long[]{diffs[i][2]}, Index0.Inst);
            var inA = listA.BinarySearch(new long[]{diffs[i][2]}, Index0.Inst);
            if(inB >= 0 && inA >= 0) {
                toggle = !toggle;
            }
            if(toggle == false) {
                if(inB >= 0) {
                    listB.RemoveAt(inB);
                }else if(inA >= 0) {
                    listA.RemoveAt(inA);
                }
                var tb = new long[]{diffs[i][1], diffs[i][4]};
                var ta = new long[]{diffs[i][0], diffs[i][3]};
                var fb = listB.BinarySearch(tb, Index0.Inst);
                var fa = listA.BinarySearch(ta, Index0.Inst);
                listB.Insert(fb < 0 ? ~fb : fb, tb);
                listA.Insert(fa < 0 ? ~fa : fa, ta);
            } else {
                if(inA >= 0) {
                    listA.RemoveAt(inA);
                }else if(inB >= 0) {
                    listB.RemoveAt(inB);
                }
                var tb = new long[]{diffs[i][1], diffs[i][4]};
                var ta = new long[]{diffs[i][0], diffs[i][3]};
                var fb = listA.BinarySearch(tb, Index0.Inst);
                var fa = listB.BinarySearch(ta, Index0.Inst);
                listA.Insert(fb < 0 ? ~fb : fb, tb);
                listB.Insert(fa < 0 ? ~fa : fa, ta);
            }
        }
        listA.ForEach(a => sumA += a[0]);
        listB.ForEach(b => sumB += b[0]);

        // optimize our partitions with give/take 1 or swap 1 for 1
        bool change = false;
        while(DateTime.Now.Subtract(start).TotalSeconds < 4.8) {
            change = false;
            // give one from A to B
            for(int i=0; i<listA.Count; i++) {
                var a = listA[i];
                if(Math.Abs(sumA - sumB) > Math.Abs((sumA - a[0]) - (sumB + a[0]))) {
                    var fb = listB.BinarySearch(a, Index0.Inst);
                    listB.Insert(fb < 0 ? ~fb : fb, a);
                    listA.RemoveAt(i);
                    i--;
                    sumA -= a[0];
                    sumB += a[0];
                    change = true;
                } else {break;}
            }
            // give one from B to A
            for(int i=0; i<listB.Count; i++) {
                var b = listB[i];
                if(Math.Abs(sumA - sumB) > Math.Abs((sumA + b[0]) - (sumB - b[0]))) {
                    var fa = listA.BinarySearch(b, Index0.Inst);
                    listA.Insert(fa < 0 ? ~fa : fa, b);
                    listB.RemoveAt(i);
                    i--;
                    sumA += b[0];
                    sumB -= b[0];
                    change = true;
                } else {break;}
            }
            // swap 1 for 1
            for(int i=0; i<listA.Count; i++) {
                var a = listA[i];
                for(int j=0; j<listB.Count; j++) {
                    var b = listB[j];
                    if(Math.Abs(sumA - sumB) > Math.Abs((sumA - a[0] + b[0]) - (sumB -b[0] + a[0]))) {
                        listA.RemoveAt(i);
                        listB.RemoveAt(j);
                        var fa = listA.BinarySearch(b, Index0.Inst);
                        var fb = listB.BinarySearch(a, Index0.Inst);
                        listA.Insert(fa < 0 ? ~fa : fa, b);
                        listB.Insert(fb < 0 ? ~fb : fb, a);
                        sumA = sumA - a[0] + b[0];
                        sumB = sumB - b[0] + a[0];
                        change = true;
                        break;
                    }
                }
            }
            //
            if(change == false) { break; }
        }

        /*
        // further optimization with 2 for 1 swaps
        while(DateTime.Now.Subtract(start).TotalSeconds < 4.8) {
            change = false;
            // trade 2 for 1
            for(int i=0; i<listA.Count >> 1; i++) {
                var a1 = listA[i];
                var a2 = listA[listA.Count - 1 - i];
                for(int j=0; j<listB.Count; j++) {
                    var b = listB[j];
                    if(Math.Abs(sumA - sumB) > Math.Abs((sumA - a1[0] - a2[0] + b[0]) - (sumB - b[0] + a1[0] + a2[0]))) {
                        listA.RemoveAt(listA.Count - 1 - i);
                        listA.RemoveAt(i);
                        listB.RemoveAt(j);
                        var fa = listA.BinarySearch(b, Index0.Inst);
                        var fb1 = listB.BinarySearch(a1, Index0.Inst);
                        var fb2 = listB.BinarySearch(a2, Index0.Inst);
                        listA.Insert(fa < 0 ? ~fa : fa, b);
                        listB.Insert(fb1 < 0 ? ~fb1 : fb1, a1);
                        listB.Insert(fb2 < 0 ? ~fb2 : fb2, a2);
                        sumA = sumA - a1[0] - a2[0] + b[0];
                        sumB = sumB - b[0] + a1[0] + a2[0];
                        change = true;
                        break;
                    }
                }
            }
            //
            if(DateTime.Now.Subtract(start).TotalSeconds > 4.8) { break; }
            // trade 2 for 1
            for(int i=0; i<listB.Count >> 1; i++) {
                var b1 = listB[i];
                var b2 = listB[listB.Count - 1 - i];
                for(int j=0; j<listA.Count; j++) {
                    var a = listA[j];
                    if(Math.Abs(sumA - sumB) > Math.Abs((sumA - a[0] + b1[0] + b2[0]) - (sumB - b1[0] - b2[0] + a[0]))) {
                        listB.RemoveAt(listB.Count - 1 - i);
                        listB.RemoveAt(i);
                        listA.RemoveAt(j);
                        var fa1 = listA.BinarySearch(b1, Index0.Inst);
                        var fa2 = listA.BinarySearch(b2, Index0.Inst);
                        var fb = listB.BinarySearch(a, Index0.Inst);
                        listA.Insert(fa1 < 0 ? ~fa1 : fa1, b1);
                        listA.Insert(fa2 < 0 ? ~fa2 : fa2, b2);
                        listB.Insert(fb < 0 ? ~fb : fb, a);
                        sumA = sumA - a[0] + b1[0] + b2[0];
                        sumB = sumB - b1[0] - b2[0] + a[0];
                        change = true;
                        break;
                    }
                }
            }
            //
            if(change == false) { break; }
        }
        */

        // output the correct ordered values
        listA.Sort(new Index1());
        foreach(var t in listA) {
            Console.WriteLine(t[1]);
        }

        // DEBUG/TESTING
        //Console.WriteLine(approx[0]);
        //foreach(var t in listA) Console.Write(": " + t[0] + "," + t[1]);
        //Console.WriteLine();
        //foreach(var t in listB) Console.Write(": " + t[0] + "," + t[1]);

    }
}
Warfold answered 4/9, 2015 at 14:30 Comment(4)
Karmarkar-Karp is superior to the greedy approach; why wouldn't I start with the result of KK?Tadd
@Tadd - Please point me to an implementation of Karmarkar-Karp that results in the actual partitions and not just the difference of their sums. The pseudo code at wikipedia only returns an integer (en.wikipedia.org/wiki/Partition_problem#Differencing_algorithm). The values in each partition are necessary to output the answer to the OP's referenced question.Warfold
@LouisRicci A version of the original paper by Karmarkar and Karp can be found here (pdf). Section 2.1 describes how the partitions are recovered. The presentation in Wikipedia is incomplete.Phenazine
@Phenazine - Thanks, recording the values and differences at each step, then working backward, didn't occur to me before reading section 2.1. When I get a chance I'll update my answer with the pseudo code for Karmarkar-Karp as the initial estimate.Warfold