How to (efficiently) generate disjoint sets while usings pairs of elements only once?

Asked 26/9, 2015 at 21:38 Answered 27/9, 2015 at 15:26

Solved algorithm combinatorics solver constraint-programming disjoint-sets

What I would like to do is split a group of (n) items into groups of equal size (groups of size m, and for simplicity assume that there are no leftovers, i.e. n is divisible by m). Doing this multiple times, I would like to ensure that no pair of items is in the same group together twice.

To make this slightly more concrete, for building groups of two out of the six items A..F, once could partition the set five times in different ways:

(A, B), (C, D), (E, F)
(A, C), (B, E), (D, F)
(A, D), (B, F), (C, E)
(A, E), (B, D), (C, F)
(A, F), (B, C), (D, E)

The same set of items can be partitioned only once into groups of three without overlapping pairs:

(A, B, C), (D, E, F)

_{(As @DavidHammen points out below, there are different ways of making the partition in this example. However, having made the partition once, there is never another, second split which keeps all pairs of items apart. That's fine -- my application doesn't need to generate all possibly ways of partitioning the set globally, one solution meeting the constraints will do)}

My question, now, is this: Is there a way of doing this efficiently? Are there tricks to speed up the generation of these sets?

So, far, I've been treating this as an exact cover problem, and solving it with a backtracking algorithm (a variant of DLX). This works extremely well for pairs, but as the groups become larger the number of possibilities the algorithm has to consider explodes, and processing becomes very unwieldy.

What I'm looking for are tricks to speed things up. Any ideas are very welcome, in particular (but not limited to):

Optimizations and heuristics to reduce the number of possibilities that need to be considered prior to solving (for example, it is clear from the examples above that the first split can be made simply arbitrarily, and the first set of each partition [the first column above] can be generated automatically).
Are there variants of backtracking that can cope with huge amounts of candidates? (i.e. not needing to generate all possibilities beforehand)
Other algorithms, approaches or mathematical concepts I should consider?

Any ideas and suggestions are very welcome. Thank you very much for considering this!

Update

Ok, so this has been a while, but I spent a lot more time on this and wanted to get back to you. @david-eisenstat put me on the right path by giving me the correct search term (thank you so much!) -- I've since read quite a bit on the Social Golfer Problem.

One of the best resources that I found, that I would like to share here, is the work of Markus Triska, who discusses several approaches (and then goes on to present a very nice algorithm) in his thesis. This is highly recommended if anybody runs into a similar problem!

Flour answered 26/9, 2015 at 21:38 Comment(6)

Re The same set of items can be partitioned only once into groups of three without overlapping pairs: What's wrong with ((A B D), (C E F)) and ((A B E) (C D F)), and at least six more? The question as stated doesn't specify why you've ruled out those combinations. – Wolcott 26/9, 2015 at 22:30

@DavidHammen the pair A B appears to be in the same group in both partitions. In the OP's question it stated that no pair of items is in the same group together twice – Distiller 26/9, 2015 at 22:47

Thanks a lot, you two! @DavidHammen: You're totally correct in that I could have used any of your examples instead of the one I gave. I will clarify this in the question. – Flour 26/9, 2015 at 23:2

@a-s-h: You got my intention perfectly right, thanks for your clarification! Again, thanks for your efforts and putting thought into the question. – Flour 26/9, 2015 at 23:3

You may want to search for keywords: combinatorial designs, orthogonal arrays. – Saturniid 26/9, 2015 at 23:15

@Saturniid Thank you! Your pointers have been extremely helpful, would you consider expanding on your comment slightly and providing an answer? I have been reading up on combinatorics over the last days, and have found a few parallels to my question, but I had never stumbled across the field of combinatorial design. Am I right in thinking that what I am asking for is a "Pairwise Balanced Design"? Do you know of any resources in this field that would be accessable to a computer scientist, or any libraries? – Flour 26/9, 2015 at 23:46

This problem is studied under the name Social Golfer Problem. The literature has nontrivial size, but there are three main approaches:

Local search methods, which can handle the cases where many pairs are not present.
Complete search methods like your reduction to exact cover. From what I remember, the research here revolves around efficient methods for symmetry breaking, of which your idea of fixing the first row is probably the simplest.
Mathematical constructions. When q is a prime power, there's a construction for q groups of q involving finite affine planes that's not too awful to implement. Beyond that, there are a lot of one-off constructions. The Handbook of Combinatorial Designs is probably your best bet for summarizing what's known.

Chesty answered 27/9, 2015 at 15:26 Comment(1)

wow, thank you so much for your quick and comprehensive answer! Your pointer has been extremely helpful in finding additional literature and pointers to implementations. I've looked at the Handbook of Combinatorial Designs, too, looks like I will just have to dive into the math :-) . I've upvoted your answer for now (wishing I could do so multiple times), will come back and add my findings as a comment when I have done some reading. – Flour 27/9, 2015 at 21:58

Let n=m*k, partition has m groups with k items.

After x partitions, each item is in a group with x*(k-1) other items. After creating t-1 partitions, in next partition A can choose for:

second element : n -   (t-1)*(k-1)   - 1  items
third element  : n - 2*(t-1)*(k-1)   - 2  items
fourth element : n - 3*(t-1)*(k-1)   - 3  items
...
k'th element   : n -   (t-1)*(k-1)^2 - (k-1)  items

To create t'th partition we need:

n - (t-1)*(k-1)^2 - (k-1) > 0
=>
t < (n - k + 1) / ((k-1)^2) + 1

Number of possible partitions decrease with square of group length. That means, there are not too much of possible partitions :-)

I would go with some greedy approach. Store for each item set of items that are available, and create new partition by adding first available item to a group.

Fugitive answered 27/9, 2015 at 10:12 Comment(0)

Recommended topics

Hot tags