Fitting n variable-height images into a 3-column layout with similar column lengths
I'm looking to make a 3-column layout similar to that of piccsy.com. Given a number of images of the same width but varying heights, what is an algorithm to order them so that the difference in column lengths is minimal? Ideally in Python or JavaScript...

Thanks a lot for your help in advance!

Martin

Cinchonism answered 11/4, 2011 at 19:34 Comment(7)
You're looking for a packing algorithm. Specifically, a 2-D rectangle strip packing algorithm. Hope this helps. – Differentiate
But the fact that all rectangles are the same width makes it simpler, no? – Brillatsavarin
@Robin: yes, it's just plain Bin Packing (see Wikipedia). But Bin Packing is NP-complete, so finding the optimal solution probably won't scale enough for you. The First Fit Decreasing algorithm clintp proposes is probably good enough for your needs. If you need more, look into meta-heuristics etc. – Elevate
No, as I said in my answer, it's actually a closely related problem to Bin Packing, which can be described as offline makespan minimisation, or the multiprocessor scheduling problem. The difference is that bin packing involves a variable number of bins and a fixed bin size, whereas this problem is the opposite: effectively a fixed number of bins and a variable bin size. – Brillatsavarin
@Robin Good point that it's not an exact Bin Packing problem. – Elevate
@Robin Green, +1 – I thought it was bin packing myself, but you're right. – Lipps
Hi, thanks for the bounty. I wonder, did things turn out OK? – Quintet

How many images?

If you limit the maximum page size, and have a value for the minimum picture height, you can calculate the maximum number of images per page. You would need this when evaluating any solution.

I think there were 27 pictures on the link you gave.

The following uses the first_fit algorithm mentioned by Robin Green earlier, but then improves on it by greedy swapping.

The swapping routine finds the column that is furthest away from the average column height then systematically looks for a swap between one of its pictures and the first picture in another column that minimizes the maximum deviation from the average.

I used a random sample of 30 pictures with heights in the range 5 to 50 'units'. The convergence was swift in my case and improved significantly on the first_fit algorithm.

The code (Python 3.2):

def first_fit(items, bincount=3):
    items = sorted(items, reverse=1) # New - improves first fit.
    bins     = [[] for c in range(bincount)]
    binsizes = [0] * bincount
    for item in items:
        minbinindex = binsizes.index(min(binsizes))
        bins[minbinindex].append(item)
        binsizes[minbinindex] += item
    average = sum(binsizes) / float(bincount)
    maxdeviation = max(abs(average - bs) for bs in binsizes)

    return bins, binsizes, average, maxdeviation

def swap1(columns, colsize, average, margin=0):
    'See if you can do a swap to smooth the heights'
    colcount = len(columns)
    maxdeviation, i_a = max((abs(average - cs), i)
                              for i,cs in enumerate(colsize))
    col_a = columns[i_a]
    for pic_a in set(col_a): # use set as if same height then only do once
        for i_b, col_b in enumerate(columns):
            if i_a != i_b: # Not same column
                for pic_b in set(col_b):
                    if (abs(pic_a - pic_b) > margin): # Not same heights
                        # new heights if swapped
                        new_a = colsize[i_a] - pic_a + pic_b
                        new_b = colsize[i_b] - pic_b + pic_a
                        if all(abs(average - new) < maxdeviation
                               for new in (new_a, new_b)):
                            # Better to swap (in-place)
                            colsize[i_a] = new_a
                            colsize[i_b] = new_b
                            columns[i_a].remove(pic_a)
                            columns[i_a].append(pic_b)
                            columns[i_b].remove(pic_b)
                            columns[i_b].append(pic_a)
                            maxdeviation = max(abs(average - cs)
                                               for cs in colsize)
                            return True, maxdeviation
    return False, maxdeviation

def printit(columns, colsize, average, maxdeviation):
    print('columns')
    pp(columns)
    print('colsize:', colsize)
    print('average, maxdeviation:', average, maxdeviation)
    print('deviations:', [abs(average - cs) for cs in colsize])
    print()


if __name__ == '__main__':
    from pprint import pprint as pp
    import random  # needed for the random sample data and the extra trials

    ## Some data
    #heights = [random.randint(5, 50) for i in range(30)]
    ## Here's some from the above, but 'fixed'.

    heights = [45, 7, 46, 34, 12, 12, 34, 19, 17, 41,
               28, 9, 37, 32, 30, 44, 17, 16, 44, 7,
               23, 30, 36, 5, 40, 20, 28, 42, 8, 38]

    columns, colsize, average, maxdeviation = first_fit(heights)
    printit(columns, colsize, average, maxdeviation)
    while True:
        swapped, maxdeviation = swap1(columns, colsize, average, maxdeviation)
        printit(columns, colsize, average, maxdeviation)
        if not swapped:
            break
        #input('Paused: ')

The output:

columns
[[45, 12, 17, 28, 32, 17, 44, 5, 40, 8, 38],
 [7, 34, 12, 19, 41, 30, 16, 7, 23, 36, 42],
 [46, 34, 9, 37, 44, 30, 20, 28]]
colsize: [286, 267, 248]
average, maxdeviation: 267.0 19.0
deviations: [19.0, 0.0, 19.0]

columns
[[45, 12, 17, 28, 17, 44, 5, 40, 8, 38, 9],
 [7, 34, 12, 19, 41, 30, 16, 7, 23, 36, 42],
 [46, 34, 37, 44, 30, 20, 28, 32]]
colsize: [263, 267, 271]
average, maxdeviation: 267.0 4.0
deviations: [4.0, 0.0, 4.0]

columns
[[45, 12, 17, 17, 44, 5, 40, 8, 38, 9, 34],
 [7, 34, 12, 19, 41, 30, 16, 7, 23, 36, 42],
 [46, 37, 44, 30, 20, 28, 32, 28]]
colsize: [269, 267, 265]
average, maxdeviation: 267.0 2.0
deviations: [2.0, 0.0, 2.0]

columns
[[45, 12, 17, 17, 44, 5, 8, 38, 9, 34, 37],
 [7, 34, 12, 19, 41, 30, 16, 7, 23, 36, 42],
 [46, 44, 30, 20, 28, 32, 28, 40]]
colsize: [266, 267, 268]
average, maxdeviation: 267.0 1.0
deviations: [1.0, 0.0, 1.0]

columns
[[45, 12, 17, 17, 44, 5, 8, 38, 9, 34, 37],
 [7, 34, 12, 19, 41, 30, 16, 7, 23, 36, 42],
 [46, 44, 30, 20, 28, 32, 28, 40]]
colsize: [266, 267, 268]
average, maxdeviation: 267.0 1.0
deviations: [1.0, 0.0, 1.0]

Nice problem.


Here's the info on the reverse-sorting mentioned in my separate comment below.

>>> h = sorted(heights, reverse=1)
>>> h
[46, 45, 44, 44, 42, 41, 40, 38, 37, 36, 34, 34, 32, 30, 30, 28, 28, 23, 20, 19, 17, 17, 16, 12, 12, 9, 8, 7, 7, 5]
>>> columns, colsize, average, maxdeviation = first_fit(h)
>>> printit(columns, colsize, average, maxdeviation)
columns
[[46, 41, 40, 34, 30, 28, 19, 12, 12, 5],
 [45, 42, 38, 36, 30, 28, 17, 16, 8, 7],
 [44, 44, 37, 34, 32, 23, 20, 17, 9, 7]]
colsize: [267, 267, 267]
average, maxdeviation: 267.0 0.0
deviations: [0.0, 0.0, 0.0]

With the reverse-sorting in place, this extra code, appended to the bottom of the above code (inside the 'if __name__ == ...' section), will run extra trials on random data:

for trial in range(2,11):
    print('\n## Trial %i' % trial)
    heights = [random.randint(5, 50) for i in range(random.randint(5, 50))]
    print('Pictures:',len(heights))
    columns, colsize, average, maxdeviation = first_fit(heights)
    print('average %7.3f' % average, '\nmaxdeviation:')
    print('%5.2f%% = %6.3f' % ((maxdeviation * 100. / average), maxdeviation))
    swapcount = 0
    while maxdeviation:
        swapped, maxdeviation = swap1(columns, colsize, average, maxdeviation)
        if not swapped:
            break
        print('%5.2f%% = %6.3f' % ((maxdeviation * 100. / average), maxdeviation))
        swapcount += 1
    print('swaps:', swapcount)

The extra output shows the effect of the swaps:

## Trial 2
Pictures: 11
average  72.000 
maxdeviation:
 9.72% =  7.000
swaps: 0

## Trial 3
Pictures: 14
average 118.667 
maxdeviation:
 6.46% =  7.667
 4.78% =  5.667
 3.09% =  3.667
 0.56% =  0.667
swaps: 3

## Trial 4
Pictures: 46
average 470.333 
maxdeviation:
 0.57% =  2.667
 0.35% =  1.667
 0.14% =  0.667
swaps: 2

## Trial 5
Pictures: 40
average 388.667 
maxdeviation:
 0.43% =  1.667
 0.17% =  0.667
swaps: 1

## Trial 6
Pictures: 5
average  44.000 
maxdeviation:
 4.55% =  2.000
swaps: 0

## Trial 7
Pictures: 30
average 295.000 
maxdeviation:
 0.34% =  1.000
swaps: 0

## Trial 8
Pictures: 43
average 413.000 
maxdeviation:
 0.97% =  4.000
 0.73% =  3.000
 0.48% =  2.000
swaps: 2

## Trial 9
Pictures: 33
average 342.000 
maxdeviation:
 0.29% =  1.000
swaps: 0

## Trial 10
Pictures: 26
average 233.333 
maxdeviation:
 2.29% =  5.333
 1.86% =  4.333
 1.43% =  3.333
 1.00% =  2.333
 0.57% =  1.333
swaps: 4
Quintet answered 16/4, 2011 at 21:3 Comment(3)
You could improve on the swap1 function above by also allowing a picture to be moved from one column to another, as well as swapped. As it stands, each column keeps the same number of pictures the first-fit algorithm gave it, so one exceptionally tall picture could skew things. – Quintet
I just found that you get a perfect distribution with this data if you use first_fit on the reverse-sorted heights. I doubt this will always be the case, though. Maybe we should always reverse-sort the picture heights? – Quintet
Just looked beyond the crappy Wikipedia article to developerfusion.com/article/5540/bin-packing, where I think what I have called first_fit might be better called worst_fit, going by their examples. – Quintet
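The move improvement suggested in the first comment above (relocating a single picture rather than only swapping pairs) could be sketched in the same style as swap1. The function name move1 and its interface are my own, not part of the original answer:

```python
def move1(columns, colsize, average):
    """Try moving one picture out of the column furthest from the average
    height, into another column, if that reduces the maximum deviation.
    Mutates columns and colsize in place; returns True if a move was made."""
    maxdev = max(abs(average - cs) for cs in colsize)
    # Column furthest from the average height
    i_a = max(range(len(columns)), key=lambda i: abs(average - colsize[i]))
    for pic in set(columns[i_a]):          # each distinct height only once
        for i_b in range(len(columns)):
            if i_b == i_a:
                continue
            new_a = colsize[i_a] - pic     # source column shrinks
            new_b = colsize[i_b] + pic     # target column grows
            if max(abs(average - new_a), abs(average - new_b)) < maxdev:
                columns[i_a].remove(pic)
                columns[i_b].append(pic)
                colsize[i_a], colsize[i_b] = new_a, new_b
                return True
    return False
```

Unlike swap1, this changes the number of pictures per column, so it can recover from a first-fit layout skewed by one very tall image.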

This is the offline makespan minimisation problem, which I think is equivalent to the multiprocessor scheduling problem. Instead of jobs you have images, and instead of job durations you have image heights, but it's exactly the same problem. (The fact that it involves space instead of time doesn't matter.) So any algorithm that (approximately) solves either of them will do.
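For reference, the classic greedy approximation for makespan minimisation is Longest Processing Time first (LPT): sort the jobs (here, image heights) in descending order and always give the next one to the least-loaded machine (column). Graham showed LPT's makespan is at most (4/3 − 1/(3m)) times optimal for m machines. This is the same procedure as the reverse-sorted first_fit in the accepted answer; a minimal sketch:

```python
def lpt(heights, columns=3):
    """Longest Processing Time first: tallest image into the
    currently shortest column. Returns the columns and their heights."""
    loads = [0] * columns
    assignment = [[] for _ in range(columns)]
    for h in sorted(heights, reverse=True):
        i = loads.index(min(loads))   # least-loaded column (first on ties)
        loads[i] += h
        assignment[i].append(h)
    return assignment, loads
```

On the 30-height sample from the accepted answer this happens to balance the three columns perfectly at 267 units each.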

Brillatsavarin answered 11/4, 2011 at 20:3 Comment(1)
Yep. And here's a link to an algorithm: columbia.edu/~cs2035/courses/ieor6400.F07/hs.pdf – Chroma

Here's an algorithm (called First Fit Decreasing) that will get you a very compact arrangement in a reasonable amount of time. There may be a better algorithm, but this one is ridiculously simple.

  1. Sort the images in order from tallest to shortest.
  2. Take the next image and place it in the shortest column. (If multiple columns are tied for shortest, pick any one.)
  3. Repeat step 2 until no images remain.

When you're done, you can rearrange the elements in each column however you choose if you don't like the tallest-to-shortest look.

Differentiate answered 11/4, 2011 at 21:34 Comment(2)
Thanks clint, I've tried your solution, and in some cases the results are not satisfactory, e.g. when the columns already have very similar lengths after you've added, say, 20 images and there's only one more to add, that last one will stick out very noticeably. – Cinchonism
Finding the absolute best packing arrangement is an NP-hard (or, I believe, even NP-complete) problem. Given a small number of elements you can certainly brute-force a best solution: simply get all the permutations of the images, partition into 3 buckets, calculate the heights, and save the best-fit partition/permutation. – Differentiate
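The exhaustive search described in the comment above can be sketched by trying every assignment of images to columns directly (3^n possibilities), which is simpler than permuting and then partitioning. Feasible only for a handful of images; the function name is illustrative:

```python
from itertools import product

def brute_force(heights, columns=3):
    """Try every image->column assignment and keep the one with the
    smallest spread between the tallest and shortest column."""
    best, best_spread = None, float('inf')
    for assign in product(range(columns), repeat=len(heights)):
        loads = [0] * columns
        for h, c in zip(heights, assign):
            loads[c] += h
        spread = max(loads) - min(loads)
        if spread < best_spread:
            best, best_spread = assign, spread
    return best, best_spread
```

With n images this examines 3^n assignments, so it is only practical for roughly a dozen images or fewer.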

Here's one:

 // Create initial solution
 <run First Fit Decreasing algorithm first>
 // Calculate "error", i.e. maximum height difference
 // after running FFD
 err = (maximum_height - minimum_height)
 minerr = err

 // Run simple greedy optimization and random search
 repeat for a number of steps: // e.g. 1000 steps
    <find any two random images a and b from two different columns such that
     swapping a and b decreases the error>
    if <found>:
         swap a and b
         err = (maximum_height - minimum_height)
         if (err < minerr):
              <store as best solution so far> // X
    else:
         swap two random images from two columns
         err = (maximum_height - minimum_height)

 <output the best solution stored on line marked with X>
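The outline above can be fleshed out into a runnable Python sketch (function and variable names are my own). It seeds with First Fit Decreasing, then applies random swaps, remembering the best layout seen; this folds the outline's two branches (improving swap vs. random perturbation) into one step that always keeps the swap but only records improvements:

```python
import random

def optimize(heights, columns=3, steps=1000, seed=42):
    """FFD seed plus random-swap search; returns (best columns, best error)."""
    rng = random.Random(seed)

    # Create the initial solution with First Fit Decreasing:
    # tallest picture first, always into the currently shortest column.
    cols = [[] for _ in range(columns)]
    for h in sorted(heights, reverse=True):
        min(cols, key=sum).append(h)

    def error(c):
        sums = [sum(col) for col in c]
        return max(sums) - min(sums)   # maximum height difference

    best, minerr = [col[:] for col in cols], error(cols)
    for _ in range(steps):
        a, b = rng.sample(range(columns), 2)
        if not cols[a] or not cols[b]:
            continue
        i = rng.randrange(len(cols[a]))
        j = rng.randrange(len(cols[b]))
        # Swap two random pictures between two different columns. The swap
        # is kept regardless (the random perturbation); only improvements
        # are stored as the best solution so far (the line marked X above).
        cols[a][i], cols[b][j] = cols[b][j], cols[a][i]
        if error(cols) < minerr:
            best, minerr = [col[:] for col in cols], error(cols)
    return best, minerr
```

On the 30-height sample used in the accepted answer, the FFD seed already balances the columns perfectly, so the search simply confirms an error of zero.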
Gemina answered 14/4, 2011 at 3:31 Comment(0)
