Finding set of pairs that correspond to list of sums

Asked 28/2, 2013 at 9:38 Answered 28/6, 2013 at 6:2

Given two lists of numbers and a list of totals (none in any particular order):

a = [1,2,3]
b = [4,5,6]
c = [6,7,8]

How can I find all sets of pairs d where d[k] = (a[i], b[j]) such that c[k] = a[i] + b[j] where pairs are used from a and b without replacement? (all lists can have duplicates)

d = [(1,5), (3,4), (2,6)]
d = [(2,4), (1,6), (3,5)]

For c = [7,7,7]:

d = [(1,6), (2,5), (3,4)]

(1 answer because all permutations are essentially equivalent)

I'd like to do this with lists of length ~500, so a naive matching/backtracking search is out of the question.

Blinders answered 28/2, 2013 at 9:38 Comment(6)

You want a set of pair sequences where each sequence in the set has totals matching the sequence c? Also, in the first example, would [(1,5), (1,6), (2,6)]--and many more such--also be included? – Behaviorism 28/2, 2013 at 9:44

No replacement. The problem I'm trying to solve is that each list contains scores of students. I have access to each list and the sum of both, but would like to know given a total score, what the possible subscores are. If it makes the problem any easier to solve (or increases the chance of a unique solution), I have access to N of these lists and can query a database for the list of totals for any subset of them. – Blinders 28/2, 2013 at 9:48

This problem is described in Wikipedia as Numerical 3-dimensional matching. It is NP-complete. – Diaphone 28/2, 2013 at 11:25

The third set being the negative of c and with b = 0. Alright thanks. – Blinders 28/2, 2013 at 19:42

Ah, solving an NP-complete problem on a non-negligible size data set. Good times. – Bedwarmer 28/2, 2013 at 22:0

What counts as "equivalent"? For example, if a = [5]*100, b = [10]*100 and c = [15]*100, is there one solution or ~100! solutions? – Laius 6/3, 2013 at 4:18

Okay, there is the brute force approach with pruning. This takes O(N^3)

For ease of demonstration, I will go through an N-by-N square that has the sum of a and b

S:
 + | 4 5 6 
 --|-------
 1 | 5 6 7 
 2 | 6 7 8
 3 | 7 8 9

And I am looking to build c={6,7,8}
I find a '6' in S. I remove it, and mark its row and column as unavailable

S:
 + | 4 5 6 
 --|-------
 1 | / X / 
 2 | 6 / 8
 3 | 7 / 9

Solution = { (1,5) }

Then I try to find a '7'

S:
 + | 4 5 6 
 --|-------
 1 | / X / 
 2 | / / 8
 3 | X / /

Solution = { (1,5) (3,4) }

And finally the '6'

S:
 + | 4 5 6 
 --|-------
 1 | / X / 
 2 | / / X
 3 | X / /

Solution = { (1,5) (3,4) (2,6) }

The 1st loop ( the one for '6' ) will continue and find another match : (2,4). This will then form the second solution { (2,4) (1,6) (3,5) }

Now, One way to improve this is, use some dynamic-programming: find out all possible combinations that give the result beforehand.

Given c={ 6 7 8}, create sets S_x where x is {6,7,8} and 
    S_x = { (i,j) } such that S[i][j]=x
So:
    S_6 = { (1,2) (2,1) }
    S_7 = { (1,3) (2,2) (3,1)  }
    S_8 = { (2,3) (3,2) }

And now, the same algorithm with given heuristics will run in O(S_l1 * S_l2 * ... S_lN), where S_li denotes the length of S_i.

This may run a factor faster in the average case. It will also handle the c={7,7,7} case properly.

That's pretty much all I got.

Thormora answered 28/6, 2013 at 6:2 Comment(0)

Here is a brute-force approach in C++. It doesn't prune equivalent permutations e.g. for c=[7,7,7].

#include <vector>
#include <iostream>
#include <algorithm>
#include <utility>

using namespace std;

// numerical 3d match: x + y + z = b where                                                                                         
// x = a, y = b, z = -c, b = 0                                                                                                     
template <typename T>
vector<pair<vector<T>, vector<T> > > n3dmatch(vector<T> a, vector<T> b, vector<T> c) {
  vector<pair<vector<T>, vector<T> > > result;
  if (a.size() != b.size() || b.size() != c.size()) return result;

  vector<vector<T> > ap, bp;
  sort(a.begin(), a.end());
  sort(b.begin(), b.end());
  do { ap.push_back(a); } while (next_permutation(a.begin(), a.end()));
  do { bp.push_back(b); } while (next_permutation(b.begin(), b.end()));

  for (int i = 0; i < ap.size(); i++) {
    for (int j = 0; j < ap.size(); j++) {
      bool match = true;
      for (int k = 0; k < a.size(); k++) {
        if ((ap[i][k] + bp[j][k]) != c[k]) {
          match = false; break;
        }
      }
      if (match) result.push_back({ ap[i], bp[j] });
    }
  }
  return result;
}

int main(int argc, char *argv[]) {
  vector<int> a = { 1, 2, 3 };
  vector<int> b = { 4, 5, 6 };
  vector<int> c = { 6, 7, 8 };
  //vector<int> c = { 7, 7, 7 };                                                                                                   
  auto result = n3dmatch(a, b, c);
  for (int i = 0; i < result.size(); i++) {
    vector<int> &a = result[i].first;
    vector<int> &b = result[i].second;
    for (int j = 0; j < a.size(); j++) cout << a[j] << " "; cout << endl;
    for (int j = 0; j < b.size(); j++) cout << b[j] << " "; cout << endl;
    cout << "-" << endl;
  }
  return 0;
}

Radcliff answered 6/3, 2013 at 4:11 Comment(0)

Recommended topics

Hot tags