How do I generate a uniform random integer partition?
E

7

26

A Google search reveals plenty about generating all possible partitions of an integer n into m parts, but I haven't found anything about sampling a uniformly distributed random partition of n into m parts.

Elsey answered 29/1, 2010 at 10:57 Comment(9)
Maybe I'm missing something. Why not just do m uniformly distributed cuts (over the remaining possible cut-points)? You might be able to optimize a little, but probably not a lot.Parricide
@Parricide It's not entirely clear to me what algorithm you are suggesting. Could you be more specific? Also, of the possible interpretations I can think of for your suggestion, some of them seem like they might plausibly result in uniform distribution, but others do not.Fluellen
opencover: To clarify, you mean an algorithm that's equivalent to (a) generate all possible partitions; (b) choose one at random. But hopefully much faster. Right?Phlegmatic
There is some ambiguity in the question. For instance, does the order matter? Are zeroes allowed?Dalmatia
@Moron Generally, when working with partitions, it is accepted that order does not matter and zeros are not allowed.Fluellen
@Beta: say n=9 and m=3. Then choosing random cut-points, there are several ways to get the partition {1,1,7}, but only one way to get the partition {3,3,3}, so it will not be uniform.Celie
The answer of Beta is only valid if the order matters.Venal
Well, when n is large and n/m = k where k is a small integer, you can use the same trick that I just posted here -- https://mcmap.net/q/536566/-what-is-the-algorithm-for-generating-a-random-deterministic-finite-automata Or you can check section 7 of this paper -- arxiv.org/pdf/1504.06238v1.pdfExsiccate
Related: stats.stackexchange.com/q/497858/2921, cs.stackexchange.com/q/154784/755Deathful
P
11

Here is some code that does it. This is O(n^2) the first time you call it, but it builds a cache so that subsequent calls are O(n).

import random

cache = {}

def count_partitions(n, limit):
    if n == 0:
        return 1
    if (n, limit) in cache:
        return cache[n, limit]
    x = cache[n, limit] = sum(count_partitions(n-k, k) for k in range(1, min(limit, n) + 1))
    return x

def random_partition(n):
    a = []
    limit = n
    total = count_partitions(n, limit)
    which = random.randrange(total)
    while n:
        for k in range(1, min(limit, n) + 1):
            count = count_partitions(n-k, k)
            if which < count:
                break
            which -= count
        a.append(k)
        limit = k
        n -= k
    return a

How this works: We can calculate how many partitions of an integer n there are in O(n^2) time. As a side effect, this produces a table of size O(n^2) which we can then use to generate the kth partition of n, for any integer k, in O(n) time.

So let total = the number of partitions. Pick a random number k from 0 to total - 1. Generate the kth partition.
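Since the question asks for exactly m parts, the same count-then-unrank idea can be adapted. What follows is a minimal sketch, not the answerer's code: the names count_exact and random_partition_exact are made up for illustration, and functools.lru_cache stands in for the hand-written cache.

```python
import random
from functools import lru_cache


@lru_cache(maxsize=None)
def count_exact(n, m, limit):
    """Number of partitions of n into exactly m parts, each part <= limit."""
    if m == 0:
        return 1 if n == 0 else 0
    if n < m or n > m * limit:
        return 0  # too small for m positive parts, or too large for the limit
    # Pick the largest part k, then split the rest into m-1 parts that are <= k.
    return sum(count_exact(n - k, m - 1, k) for k in range(1, min(limit, n) + 1))


def random_partition_exact(n, m, rng=random):
    """Uniform random partition of n into exactly m parts, in non-increasing order."""
    total = count_exact(n, m, n)
    if total == 0:
        raise ValueError("there is no partition of n into m parts")
    which = rng.randrange(total)  # index of the partition to construct
    parts, limit = [], n
    while m:
        for k in range(1, min(limit, n) + 1):
            count = count_exact(n - k, m - 1, k)
            if which < count:
                break
            which -= count
        parts.append(k)
        limit = k
        n -= k
        m -= 1
    return parts
```

For example, random_partition_exact(24, 5) returns a list of five positive integers that sum to 24. The recursion depth grows with m, so for large inputs an iterative table would probably be a better fit.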

Phlegmatic answered 29/1, 2010 at 17:24 Comment(4)
So count_partitions(n, limit) counts the number of partitions of n into parts less than or equal to limit. Okay, I see how you're counting that. And then, given a bijection between the integers 1,...,count_partitions(n,n) and the partitions of n, random_partition chooses one of those integers and constructs the corresponding partition. Of course, this solution doesn't quite address the question, which asked for a random partition of n into exactly m parts. I suppose I should have made that clear in the title. However, I can definitely come up with what I need based on this.Elsey
Oh, I'm sorry. I misread the question! Anyway, all I did was take some code I already had for generating all partitions, make two copies, and turn one copy into the counting function and the other copy into the kth-partition-constructing function. The exact same approach should work for your problem.Phlegmatic
Oops! Never got around to marking this as answered. Sorry about that.Elsey
This algorithm is found in this paper: Uniform random integer partitionKestrel
T
26

The title of this post is a bit misleading. A random integer partition is by default unrestricted, meaning it can have any number of parts of any size. The specific question asked is about partitions of n into m parts, which is a type of restricted integer partition.

For generating unrestricted integer partitions, a very fast and simple algorithm is due to Fristedt, in a paper called The Structure of Random Partitions of Large Integers (1993). The algorithm is as follows:

  1. Set x = exp(-pi/sqrt(6n) ).
  2. Generate independent random variables Z(1), Z(2), ..., Z(n), where Z(i) is geometrically distributed with parameter 1-x^i.
  3. IF sum i*Z(i) = n, where the sum is taken over all i=1,2,...,n, then STOP.
    ELSE, repeat 2.

Once the algorithm stops, Z(1) is the number of 1s, Z(2) is the number of 2s, etc., in a partition chosen uniformly at random. The probability of accepting a randomly chosen set of Z's is asymptotically 1/(96n^3)^(1/4), which means one would expect to run this algorithm O(n^(3/4)) times before accepting a single sample.
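The three steps above can be sketched in Python as follows. This is an illustrative sketch under a few assumptions: the function names are made up, the geometric variables are drawn by inversion (Z(i) takes values 0, 1, 2, ... with P(Z(i) = k) = (1 - x^i) x^(ik)), and an attempt is abandoned as soon as the running sum exceeds n, which rejects exactly the same attempts that step 3 would.

```python
import math
import random


def geometric(log_q, rng):
    """Sample Z >= 0 with P(Z = k) = (1 - q) * q**k, given log_q = log(q) < 0."""
    u = 1.0 - rng.random()  # uniform on (0, 1], so log(u) is always defined
    return int(math.log(u) / log_q)


def fristedt_partition(n, rng=random):
    """Uniform random partition of n >= 1, as a non-increasing list of parts."""
    log_x = -math.pi / math.sqrt(6 * n)  # step 1: x = exp(-pi / sqrt(6n))
    while True:
        z = [0] * (n + 1)  # z[i] = number of parts equal to i
        total = 0
        for i in range(1, n + 1):  # step 2: Z(i) has parameter 1 - x^i
            z[i] = geometric(i * log_x, rng)
            total += i * z[i]
            if total > n:  # this attempt cannot hit the target any more
                break
        if total == n:  # step 3: accept only when the sum equals n
            return [i for i in range(n, 0, -1) for _ in range(z[i])]
```

Calling fristedt_partition(100) returns one partition of 100; no table is stored between calls, only the current vector of multiplicities.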

The reason I took the time to explain this algorithm is because it applies directly to the problem of generating a partition of n into exactly m parts. First, observe that

The number of partitions of n into exactly m parts is equal to the number of partitions of n with largest part equal to m.

Then we may apply Fristedt's algorithm directly, but instead of generating Z(1), Z(2), ..., Z(n), we can generate Z(1), Z(2), ..., Z(m-1), Z(m)+1 (the +1 here ensures that the largest part is exactly m, and 1+Z(m) is equal in distribution to Z(m) conditional on Z(m)>=1) and set all other Z(m+1), Z(m+2), ... equal to 0. Then once we obtain the target sum in step 3 we are also guaranteed to have an unbiased sample. To obtain a partition of n into exactly m parts simply take the conjugate of the partition generated.
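A sketch of this restricted variant, again with made-up names and with the step 1 value of x reused as an assumption (any x between 0 and 1 gives an unbiased sample, but this choice is not tuned for the restricted case). It samples a partition whose largest part is exactly m and then returns its conjugate, which has exactly m parts.

```python
import math
import random


def geometric(log_q, rng):
    """Sample Z >= 0 with P(Z = k) = (1 - q) * q**k, given log_q = log(q) < 0."""
    u = 1.0 - rng.random()  # uniform on (0, 1]
    return int(math.log(u) / log_q)


def fristedt_exactly_m_parts(n, m, rng=random):
    """Uniform random partition of n into exactly m parts (non-increasing list)."""
    if not 1 <= m <= n:
        raise ValueError("need 1 <= m <= n")
    log_x = -math.pi / math.sqrt(6 * n)
    while True:
        z = [0] * (m + 1)  # z[i] = number of parts equal to i; no parts above m
        z[m] = 1 + geometric(m * log_x, rng)  # the +1 forces a part equal to m
        total = m * z[m]
        for i in range(1, m):
            if total > n:  # already past the target, give up on this attempt
                break
            z[i] = geometric(i * log_x, rng)
            total += i * z[i]
        if total == n:
            break
    # Partition of n with largest part exactly m.
    parts = [i for i in range(m, 0, -1) for _ in range(z[i])]
    # Conjugate: the j-th part counts how many parts are >= j.
    return [sum(1 for p in parts if p >= j) for j in range(1, m + 1)]
```

For instance, fristedt_exactly_m_parts(24, 5) returns five positive integers in non-increasing order that sum to 24.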

The advantage this has over the recursive method of Nijenhuis and Wilf is that there are no memory requirements other than storing the random variables Z(1), Z(2), etc. Also, the value of x can be anything between 0 and 1 and this algorithm is still unbiased! Choosing a good value of x, however, can make the algorithm much faster, though the choice in Step 1 is nearly optimal for unrestricted integer partitions.

If n is really huge and Fristedt's algorithm takes too long (and table methods are out of the question), then there are other options, but they are a little more complicated; see my thesis https://sites.google.com/site/stephendesalvo/home/papers for more info on probabilistic divide-and-conquer and its applications.

Tillage answered 7/11, 2013 at 6:46 Comment(2)
@stephen-desalvo do you know, from the top of your head, a nice algorithm to generate set partitions? I need to generate plenty of random set partitions, where the set size is a) 64, b) 128.Dora
Set partitions are an example of an assembly: arxiv.org/pdf/1308.3279.pdf One uses Poisson(\lambda_i) in place of Geometric, where \lambda_i = x^i/i!, for any x >0, with x satisfying: x*e^x = n being optimal. An extremely fast algorithm using PDC deterministic second half appears in: arxiv.org/pdf/1411.6698.pdf Section 8.7Tillage
K
5

Another algorithm from Combinatorial Algorithms page 52, "Random Generation of n into k parts"

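  1. Choose a_1, a_2, ..., a_{k-1}, a random (k-1)-subset of {1, 2, ..., n+k-1}, in increasing order (see below 1., 2.)
  2. Set r_1 = a_1 - 1; r_j = a_j - a_{j-1} - 1 (j = 2, ..., k-1); r_k = n + k - 1 - a_{k-1}
  3. The r_j (j = 1, ..., k) constitute the random composition of n into k parts (the order of the parts matters and parts may be zero)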

This algorithm for random compositions is based on the "balls-in-cells" model.

Briefly, we choose the positions of the cell boundaries at random, then by differencing we find out how many balls are in each cell.
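A minimal Python sketch of the balls-in-cells steps, assuming random.sample is an acceptable way to draw the random subset; the function name random_composition is made up for illustration. Note that it produces compositions (ordered, zero parts allowed), not unordered partitions.

```python
import random


def random_composition(n, k, rng=random):
    """Uniform random composition of n into k non-negative parts."""
    # Step 1: a random (k-1)-subset of {1, ..., n+k-1}, sorted increasingly.
    a = sorted(rng.sample(range(1, n + k), k - 1))
    # Add sentinels 0 and n+k so that every part is a difference minus one.
    bounds = [0] + a + [n + k]
    # Steps 2 and 3: the number of balls between consecutive cell boundaries.
    return [bounds[j + 1] - bounds[j] - 1 for j in range(k)]
```

For example, random_composition(24, 5) returns five non-negative integers that sum to 24.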

For efficiently generating a random subset of a set, see a 1. related answer here and 2. here

update

Another approach using a single random number in [0,1] to uniformly generate a random partition (also called composition) is given in IVAN STOJMENOVIC, "ON RANDOM AND ADAPTIVE PARALLEL GENERATION OF COMBINATORIAL OBJECTS" (section 5, section 10)


Kestrel answered 27/8, 2015 at 2:30 Comment(0)
M
1

Just one more version in c#.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

namespace ConsoleApplication6
{
    class Program
    {
        static Random random = new Random();

        static void Main(string[] args)
        {
            PrintPartition(GetUniformPartition(24, 5));
            PrintPartition(GetUniformPartition(24, 5));
            PrintPartition(GetUniformPartition(24, 5));
            PrintPartition(GetUniformPartition(24, 5));
            PrintPartition(GetUniformPartition(24, 5));
            Console.ReadKey();
        }

        static int[] GetUniformPartition(int input, int parts)
        {
            if(input<= 0 || parts <= 0)
                throw new ArgumentException("invalid input or parts");
            if (input < MinUniformPartition(parts))
                throw new ArgumentException("input is too small");

            int[] partition = new int[parts];
            int sum = 0;
            for (int i = 0; i < parts-1; i++)
            {
                int max = input - MinUniformPartition(parts - i - 1) - sum;
                partition[i] = random.Next(parts - i, max);
                sum += partition[i];
            }
            partition[parts - 1] = input - sum; // last 
            return partition;
        }

        // minimum input accepted for n parts (n*n - 1; note this is not the sum 1+2+...+n)
        static int MinUniformPartition(int n)
        {
            return n * n - 1;
        }

        static void PrintPartition(int[] p)
        {
            for (int i = 0; i < p.Length; i++)
            {
                Console.Write("{0},", p[i]);
            }
            Console.WriteLine();
        }
    }
}

This code produces output like the following:

5,8,7,2,2,
6,6,7,2,3,
5,7,6,2,4,
6,4,3,2,9,
7,8,4,4,1,
Mahlstick answered 26/6, 2010 at 3:45 Comment(0)
R
1

I have an evenly distributed partition generator.

Let n be the integer to be partitioned and r the number of slices. The algorithm is a patched version of the naive method of simply inserting partings at random. The problem with the naive method, as it appeared to me when I looked at its output, was that outcomes where partings land on the same spot are less likely to occur. There is only one way to get {1,1,1}, while there are 3! ways of getting {2,4,9}: any of {4,2,9}, {2,4,9}, {9,4,2}, ... leads to the same parting placement once sorted. I amended this by providing additional explicit opportunities for repeats: for each parting insertion, there is a chance that the position won't be random but will instead repeat a previously selected value. This balances out the uneven probability distribution of the naive method.

I have proved by exhaustion that each partitioning is exactly equally likely for r = 3, n = 2. I haven't bothered proving it for higher values, but halfhearted attempts to do so found only promising signs. I also tested it on random input and found that it is at least roughly even for every value I tried (but probably perfectly even).

Here it is in C++11. (The output format is different from what you're expecting: it gives the positions of the partings rather than the sizes of the spaces between them. The conversion is easy, though.)

#include <vector>
#include <algorithm>
#include <functional>
#include <cstdlib>
#include <cassert>

// nparts is the number of parts, that is, one greater than the number of
// dividers listed in the output vector. bandw is the integer being partitioned.
template <typename Parting, typename Seed>
std::vector<Parting> partitionGen(unsigned nparts, unsigned bandw, Seed seed){
    assert(nparts > 0);
    std::vector<Parting> out(nparts-1);
    std::srand(seed);
    unsigned genRange = bandw;
    for(auto i=out.begin(); i<out.end(); ++i, ++genRange){
        unsigned gen = std::rand()%genRange;
        // gen < bandw: a fresh random position; otherwise repeat an earlier parting
        *i = ((gen<bandw)?
            gen:
            *(i-(gen-bandw+1)));
    }
    std::sort(out.begin(), out.end(), std::less<Parting>());
    return out;
}

I don't like the fact that I have to sort it though. If Vlody's version has an even distribution, it appears that it'd be better.

Romona answered 26/5, 2012 at 0:33 Comment(0)
F
0

After some googling I found an algorithm for this in the "Handbook of Applied Algorithms," which Google Books has indexed. The algorithm is given in section 1.12.2, on page 31.

Fluellen answered 29/1, 2010 at 17:19 Comment(1)
Yeah, I came across that, too. It doesn't generate partitions with exactly m parts, and it assumes that RP(n,m) (equivalent to Jason's count_partitions(n, limit), with limit = m) has already been computed. I'm thinking less work needs to be done to compute the number of partitions of n into m parts.Elsey
I
0

I have implemented the above solution and found that it works very well if one wants to calculate integer partitions for n but not with respect to m. If working with large n, recursion limits and call stacks may need to be increased a lot.

However, you don't need the first function, because count_partitions(n, limit) actually equals the number of partitions of 'n+limit' with exactly 'limit' parts. Some mathematical software has very fast functions for finding the number of partitions of n into m parts.
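The identity can be checked numerically for small values. The sketch below is only an illustration: both function names are made up, parts_at_most mirrors count_partitions from the earlier answer, and exactly_m_parts uses the standard recurrence p(n, m) = p(n-1, m-1) + p(n-m, m).

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def parts_at_most(n, limit):
    """Number of partitions of n into parts that are each <= limit."""
    if n == 0:
        return 1
    return sum(parts_at_most(n - k, k) for k in range(1, min(limit, n) + 1))


@lru_cache(maxsize=None)
def exactly_m_parts(n, m):
    """Number of partitions of n into exactly m positive parts."""
    if n == 0 and m == 0:
        return 1
    if n <= 0 or m <= 0:
        return 0
    # Either some part equals 1 (remove it), or every part is >= 2
    # (subtract 1 from each of the m parts).
    return exactly_m_parts(n - 1, m - 1) + exactly_m_parts(n - m, m)


# The identity: parts of size <= limit in n  <->  exactly `limit` parts in n + limit.
for n in range(0, 30):
    for limit in range(1, 15):
        assert parts_at_most(n, limit) == exactly_m_parts(n + limit, limit)
```

The loop at the end raises an AssertionError if the identity fails for any of the tested pairs.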

I have recently derived a definitely unbiased, very simple, and very fast method (using memoization) to solve your exact question: An algorithm for randomly generating integer partitions of a particular length, in Python?

It's based on knowing something about lexically ordered partitions of n having m parts and uses a similar approach to well-accepted algorithms (e.g. Nijenhuis and Wilf 1978) that find random partitions of n, and is conceptually similar to the above.

In short, if there are x partitions of n with m parts, then we choose a random number between 1 and x. That random number will code for one and only one partition satisfying n and m. I hope this helps.

Inositol answered 25/4, 2012 at 7:4 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.