Speed up set partition generation by skipping partitions with subsets smaller or larger than X, or with more or fewer than Y subsets

There is a great answer regarding the generation of partition sets:

Set partitions in Python

(a similar answer in Ruby can be found here: Calculate set partitions with specific subset sizes, which also has a picture)

The problem is that it quickly gets slow as N increases; the slowdown becomes noticeable around N = 12 or 13.
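
For scale, the number of unrestricted set partitions of N elements is the Bell number, which is what blows up here. A minimal sketch of the standard Bell-triangle recurrence (just to illustrate the growth, not part of my layout code):

def bell_numbers(n_max):
    # Bell triangle: each row starts with the last entry of the previous row,
    # and new_row[k+1] = new_row[k] + row[k]. B(n) is the first entry of row n.
    row = [1]
    bells = [1]  # B(0)
    for _ in range(n_max):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
        bells.append(row[0])
    return bells

print(bell_numbers(15))  # B(12) is already in the millions, B(15) over a billion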

CONTEXT:

I need set partitions to fit a collection of images onto either a portrait or landscape sheet of paper, reducing blank space as much as possible.

I tried bin packing algorithms like the one answered here: How is 2D bin packing achieved programmatically?

but it generates poor solutions (I posted a comment explaining why, with a screenshot). Other bin packing solutions are too complex.

A good solution I found is to scale sets of images so they have the same height, put them in a row, and assemble several rows on a landscape or portrait layout. Description here:

[image: rows of equal-height images assembled into a portrait or landscape layout]

I filter out candidates that:

  • scale images up or down too much, since I want each image to be allocated a roughly equal surface area

  • waste too much space

END OF CONTEXT

This technique works well for up to 10 or 11 images, but at 12 and beyond it is too slow.

more-itertools' set_partitions() also works, but it has the same speed problem.

Many of the generated partitions have subsets of size 1, or have too few large subsets. As you can see in the image, I don't need singletons (subsets of size 1), and I need partitions with a minimum number of subsets (not exactly the square root of N; it depends on whether the layout is landscape or portrait).
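
For reference, the naive generate-then-filter baseline I would like to avoid looks roughly like this (a sketch; the bounds in the example call are arbitrary):

from more_itertools import set_partitions

def filtered_partitions(items, min_size, max_size, min_count, max_count):
    # Naive baseline: enumerate every partition, then throw away those that
    # violate the constraints. Correct, but everything is generated first,
    # which is exactly what becomes too slow beyond roughly 12 elements.
    for partition in set_partitions(items):
        if not (min_count <= len(partition) <= max_count):
            continue
        if all(min_size <= len(subset) <= max_size for subset in partition):
            yield partition

# example: no singletons, between 3 and 5 subsets
# for p in filtered_partitions(range(1, 11), 2, 5, 3, 5):
#     print(p)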

My question is this:

Is it possible to speed up set partition generation and directly skip:

  • set partitions with subsets of size SMALLER than X

AND/OR

  • set partitions with fewer than Y subsets

and optionally:

  • set partitions with subsets of size LARGER than X

or optionally:

  • set partitions with MORE than Y subsets

These X and Y can vary, since I want to generate such set partitions for different values of N.

EDIT: I also found this: https://codereview.stackexchange.com/questions/1526/finding-all-k-subset-partitions

This seems to be somewhat fast.

Theresatherese asked 24/8, 2022 at 12:19 Comment(4)
The algorithms in the linked questions are pretty much optimal for the task; your problem is combinatorial explosion. So you need an algorithm that correctly filters out non-conforming solutions during recursion, e.g. anything that will necessarily lead to a solution with a singleton. But you do need the singleton subsets during the early (lower) parts of the recursion, so this is not trivial. Be ready with some validated solutions for moderately large N (at least 7 or 8), because it's very easy to get this wrong.Orelia
Also note that even with an optimal algorithm, combinatorial explosion will get you quickly. You'll be lucky to raise the limit from 12 to about 15, but (at least for a minimum X = 2) I wouldn't expect much more. So think about whether you really need all candidates, or whether you should just focus on how to get a good packing of the images (and not necessarily the very best).Orelia
The provided answer sort of worked; it narrowed things down enough to make it fast. I was able to get results for N=15, subset size=3 and partition size=3. The person who answered mentioned one possible faster solution, so I'm still waiting.Theresatherese
@Theresatherese The speedup was only a factor of 3. But I posted it.Newlywed

Just use recursion.

Note: the lists returned each time are modified in place for efficiency. If you're going to do anything interesting with them, you probably want to make your own copies (see the example after the code).

def split_partition (elements, min_size, max_size, partition=None, rest=None, i = 0):
    if partition is None:
        partition = []
    if rest is None:
        rest = []

    # The first element always goes in.
    if i == 0:
        partition.append(elements[i])
        yield from split_partition(elements, min_size, max_size, partition, rest, i+1)
    elif len(partition) + len(elements) - i < min_size:
        pass # Even with every remaining element added, this partition cannot reach min_size.
    elif max_size == len(partition):
        for j in range(i, len(elements)):
            rest.append(elements[j])
        yield (partition, rest) # Just the one solution.
        for j in range(i, len(elements)):
            rest.pop()
    elif i == len(elements):
        yield (partition, rest) # Just the one solution.
    else:
        partition.append(elements[i])
        yield from split_partition(elements, min_size, max_size, partition, rest, i+1)
        rest.append(partition.pop())
        yield from split_partition(elements, min_size, max_size, partition, rest, i+1)
        rest.pop()

def desired_partitions (elements, min_size, max_size, min_count, max_count):
    if len(elements) < min_size * min_count:
        pass # No solutions possible.
    elif max_size * max_count < len(elements):
        pass # No solutions possible.
    else:
        for partition, rest in split_partition(elements, min_size, max_size):
            if 0 == len(rest):
                yield [partition]
            else:
                for split in desired_partitions(rest, min_size, max_size, min_count - 1, max_count - 1):
                    split.append(partition)
                    yield split
                    split.pop()


for split in desired_partitions([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3, 5, 2, 3):
    print(split)
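
If you want to keep the generated splits around instead of consuming them immediately, take a snapshot of each one, since the same lists are reused (a short usage sketch):

import copy

# Without deepcopy, every stored entry would end up referencing the same
# lists, which keep being modified by the generator.
all_splits = [copy.deepcopy(split)
              for split in desired_partitions([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3, 5, 2, 3)]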
Newlywed answered 24/8, 2022 at 16:11 Comment(5)
so copy.deepcopy(split) is necessary.Theresatherese
@Theresatherese Exactly.Newlywed
I found this: codereview.stackexchange.com/questions/1526/… It seems to have two good answers; not sure if it's faster than yours.Theresatherese
@Theresatherese Those fix the number of partitions, but don't fix the size of each partition. I have some ideas which would make mine potentially a lot faster.Newlywed
Are you trying to tell me you want a bounty? :pTheresatherese

Here is a faster approach. It is still using recursion to figure out the sizes of the partitions. But for the heavy lifting of producing the partitions it avoids recursive calls, unnecessary checks, and excessive copying of data. In my testing that makes it about 3x as fast.

I'm not sure that a factor of 3 will help too much with the combinatorial explosion though.

def desired_partition_sizes (element_count, min_size, max_size, min_count, max_count):
    if max_size * max_count < element_count:
        pass # No solutions.
    elif element_count < min_size * min_count:
        pass # No solutions.
    elif max_count == 1 and 0 < element_count:
        if min_size <= element_count:
            yield [element_count]
    elif 0 == element_count and 0 == min_count:
        yield []
    else:
        for i in range(min_size, max_size+1):
            for solution in desired_partition_sizes(
                    element_count-i, min_size, max_size,
                    max(min_count-1, 0), max_count-1):
                solution.append(i)
                yield solution

# The first partition contains the first element and has the first size
# The second partition contains the first element not in the first
#     partition and has the second size.
# The third partition contains the first element not in the first 2
#     partitions and has the third size.
# etc
def partitions_fitting_desired_sizes (elements, sizes):
    partitions = [[] for _ in sizes]
    assigned = []
    while True:
        # Attempt to assign.
        i = len(assigned)
        j = 0
        while i < len(elements):
            while sizes[j] <= len(partitions[j]):
                j += 1

            partitions[j].append(elements[i])
            assigned.append(j)
            i += 1

        yield partitions

        # Unassign the last element.
        partitions[ assigned.pop() ].pop()

        # Unassign until an element can be moved to a later partition.
        while 0 < len(assigned):
            k = assigned.pop()
            partitions[k].pop()
            # Remember: the k'th partition starts with the first element not in the
            # earlier partitions, so that first element cannot be moved to a later one.
            found = False
            if 0 < len(partitions[k]):
                for j in range(k+1, len(sizes)):
                    if len(partitions[j]) < sizes[j]:
                        found = True
                        partitions[j].append(elements[len(assigned)])
                        assigned.append(j)
                        break # Inner loop.
            if found: # Exit the unassign loop (we moved an element to a later partition).
                break
        if 0 == len(assigned):
            break # All done.

def desired_partitions (elements, min_size, max_size, min_count, max_count):
    for sizes in desired_partition_sizes(len(elements), min_size, max_size, min_count, max_count):
        yield from partitions_fitting_desired_sizes(elements, sizes)

for partitions in desired_partitions([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], 3, 5, 2, 3):
    print(partitions)
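
For moderate N it is worth validating any constrained generator against a brute-force reference before trusting it (a sketch; it assumes more_itertools is installed and desired_partitions is in scope, and canonical() is just a hypothetical helper for comparing partitions regardless of ordering):

from more_itertools import set_partitions

def canonical(partition):
    # Order-independent form so two partitions compare equal
    # regardless of subset order and element order.
    return frozenset(frozenset(subset) for subset in partition)

def brute_force(elements, min_size, max_size, min_count, max_count):
    return {canonical(p) for p in set_partitions(elements)
            if min_count <= len(p) <= max_count
            and all(min_size <= len(s) <= max_size for s in p)}

elements = list(range(9))
fast = {canonical(p) for p in desired_partitions(elements, 2, 4, 2, 4)}
assert fast == brute_force(elements, 2, 4, 2, 4)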
Newlywed answered 28/8, 2022 at 21:4 Comment(1)
Thanks a lot, I will try it later against the other one I had. I hope this answer helps other people too; I might still use it for image printing I will do in the future.Theresatherese

This is an adaptation of my answer to the basic partition problem here. The idea is that even if all subsets must have a minimum size of 2, the recursive partitions of a smaller set should still include subsets of size 1, since these can be brought up to size as the recursion unwinds. The number of missing elements that can be forgiven equals the number of elements still to be added.

So we can avoid repetition and also trim at every level, which lets the number of cases grow less steeply (but still exponentially, of course). It runs pretty fast, I'd say. If you do any timing comparisons, please report what you find out.

def partition_filtered(collection, minsize=1, forgive=0):
    """
    Generate partitions that contain at least `minsize` elements per set;
    allow `forgive` missing elements, which can get added in subsequent steps
    """
    if len(collection) == 1:
        yield [ collection ]
        return

    first = collection[0]
    for smaller in partition_filtered(collection[1:], minsize, forgive=forgive+1):
        # insert `first` in each of the subpartition's subsets
        for n, subset in enumerate(smaller):
            candidate = smaller[:n] + [[ first ] + subset]  + smaller[n+1:]
            if conforms(candidate, minsize, forgive):
                yield candidate

        # put `first` in its own subset
        candidate = [ [ first ] ] + smaller
        if conforms(candidate, minsize, forgive):
            yield candidate

def conforms(candidate, minsize, forgive):
    """
    Check if partition `candidate` is at most `forgive` additions from making
    all its elements conform to having minimum size `minsize`
    """
    deficit = 0
    for p in candidate:
        need = minsize - len(p)
        if need > 0:
            deficit += need

    # Is the deficit small enough?
    return (deficit <= forgive)

something = list(range(1,13)) # [1, ... 12]
for n, p in enumerate(partition_filtered(something, minsize=2), 1):
    print(n, sorted(p))

# Alternately, save the list and check it for correctness
# clean_partitions = list(partition_filtered(something, minsize=2))
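
Note that partition_filtered only enforces a minimum subset size. If you also need a maximum subset size or bounds on the number of subsets, one simple option is to post-filter its output (a sketch with example bounds):

min_count, max_count, max_size = 3, 4, 5  # example bounds
for p in partition_filtered(something, minsize=2):
    if min_count <= len(p) <= max_count and max(len(s) for s in p) <= max_size:
        print(sorted(p))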
Orelia answered 30/8, 2022 at 23:11 Comment(1)
Thanks a lot. I have already generated the images and SVG for the layout I had in mind, but this will greatly help me next time. I will try it against @Orelia's answer. I don't really know which answer to pick, as I don't know which is fastest, although I hope this will help people who have a similar problem.Theresatherese
