Getting N random numbers whose sum is M
Asked Answered
A

9

53

I want to get N random numbers whose sum is a value.

For example, let's suppose I want 5 random numbers that sum to 1.

Then, a valid possibility is:

0.2 0.2 0.2 0.2 0.2

Another possibility is:

0.8 0.1 0.03 0.03 0.04

And so on. I need this for the creation of a matrix of belongings for Fuzzy C-means.

Aspic answered 14/4, 2010 at 18:37 Comment(2)
Possible duplicate of Random numbers that add to 100: MatlabZarathustra
With a uniform distribution? Nonnegative numbers? In the range [0,1]?Aliquant
L
64

Short Answer:

Just generate N random numbers, compute their sum, divide each one by the sum and multiply by M.

Longer Answer:

The above solution does not yield a uniform distribution which might be an issue depending on what these random numbers are used for. Another method proposed by Matti Virkkunen:

Generate N-1 random numbers between 0 and 1, add the numbers 0 and 1 themselves to the list, sort them, and take the differences of adjacent numbers.

This yields a uniform distribution as is explained here

Labia answered 14/4, 2010 at 18:39 Comment(7)
Then multiply by M (unless M is 1 like in the example).Goop
It's not a good randomization as increasing N would give a variance which tends to zeroWeatherly
I want to jump on the "this solution does provide well distributed answers" banwagonMilner
This is a bad answer. See this answer which proves using pretty graphs that this solution is incorrect: https://mcmap.net/q/175850/-random-numbers-that-add-to-100-matlabNestling
Please see below answer of taking the difference which is more accurateIbadan
This is incorrect. It won't yield uniformly random numbers. See my answer for the easiest and fastest way to do this.Czarist
Is there a way to generalize this to include negative numbers? AND...how about just integers?Hawkinson
S
52

Generate N-1 random numbers between 0 and 1, add the numbers 0 and 1 themselves to the list, sort them, and take the differences of adjacent numbers.

State answered 14/4, 2010 at 18:41 Comment(11)
All right, this was overly complicated. Maybe useful if someone wants to limit it to integers though (obviously using a range larger than 0 to 1)State
It is not immediately obvious to me that this will result in a normal distribution for N > 3Goop
I make no guarantees about math I don't fully understand.State
It looks like this is the only solution so far that results in a uniform distribution (unless I made a mistake verifying this, which is always possible).Limewater
Today, I encountered the same problem and I found your answer is very helpful. After some basic calculation, I could prove that all N variables are drawn by the identical PDF f(x) = M^(1 - N) (-1 + N) (M - x)^(-2 + N). Note that its mean is M/N as expected.Teamster
However, I want to comment that the PDF is not clearly uniform distribution so I don't think this method can be called uniform sampling. But in my opinion this method is the most natural way to do such task.Teamster
@Limewater Even though it seems too late, but the above my comments might be interesting to you:)Teamster
@Goop It will not result in a normal distribution.Teamster
can someone ELI5? I don't understand how this produces 3 numbers that add up to 8 for example.Merely
@chovy: To get "0 between 8" use 8 instead of 1 in the algorithm and use 3 for N. The reason it works is that it's like taking a piece of string with a set length, marking it at random places and then cutting it where the marks are. You end up with N pieces of string which must add up to the original length.State
Is there a way to do this if I have a lower limit on numbers? Numbers have to be bigger than A.Oletaoletha
L
27

I think it is worth noting that the currently accepted answer does not give a uniform distribution:

"Just generate N random numbers, compute their sum, divide each one by the sum"

To see this let's look at the case N=2 and M=1. This is a trivial case, since we can generate a list [x,1-x], by choosing x uniformly in the range (0,1). The proposed solution generates a pair [x/(x+y), y/(x+y)] where x and y are uniform in (0,1). To analyze this we choose some z such that 0 < z < 0.5 and compute the probability that the first element is smaller than z. This probaility should be z if the distribution were uniform. However, we get

Prob(x/(x+y) < z) = Prob(x < z(x+y)) = Prob(x(1-z) < zy) = Prob(x < y(z/(1-z))) = z/(2-2z).

I did some quick calculations and it appears that the only solution so far that appers to result in a uniform distribution was proposed by Matti Virkkunen:

"Generate N-1 random numbers between 0 and 1, add the numbers 0 and 1 themselves to the list, sort them, and take the differences of adjacent numbers."

Limewater answered 15/4, 2010 at 6:19 Comment(6)
In your example, x+y = 1 so P(\frac{x}{x+y} < z) = P(x < z). The problem with your statement is P(x < y\frac{z}{1-z}) != P(x < y) P(x < \frac{z}{1-z}). If that were true and \frac{z}{1-z} = 10, then P(x < 10y) = P(x < y) P(x < 10) = P(x < y) = 1/2 but the real answer is 10/11.Axel
@Apprentice Queue: Note that I'm only analyzing the case where 0 < z < 0.5 in the text above. Your assumption \frac{z}{1-z} = 10 implies z = 10/11. Hence you can not expect that the equations hold for this case.Limewater
I don't think your analysis is correct, since normal / uniform refers to the distribution of the values, which doesn't changed when dividing the range by a constant. If the original distribution was uniform, then dividing by the sum produces a uniform distribution that adds to the sum. Likewise for normal.Shinberg
@John: but the number that you divide by depends on the random values chosen (it's not a constant), so it can affect the uniformity of the distribution. For a more obvious example if you choose a uniform random value x and then divide it by sqrt(x), the result is not uniformly distributed.Squishy
Yes, the provided solution does not provide a uniform distribution. Because you're applying a constraint to a uniform distribution which changes the distribution. So while .1 .1 .1 .1 .1 is a fine generation for the originally distribution, within this contraint, it is not. So distribution will change.Orts
Am I missing something? I know the accepted answer does not provide a normal distribution, but doesn't it provide a uniform distribution? Doesn't uniform mean each number is equally random and is no more or less likely to be be any higher or lower? 0.2 0.2 0.2 0.2 0.2 adds up to 1. It is a uniform distribution. If your target number is 57 instead of 1, take the 0.2s, divide by 1, multiply by 57... And you get 11.4 11.4 11.4 11.4 11.4, which, correct me if I'm wrong, is also a uniform distribution. People keep saying "obvious example" but none of the examples are at all obvious to me.Fritzsche
C
7

Unfortunately, a number of the answers here are incorrect if you'd like uniformly random numbers. The easiest (and fastest in many languages) solution that guarantees uniformly random numbers is just

# This is Python, but most languages support the Dirichlet.
import numpy as np
np.random.dirichlet(np.ones(n))*m

where n is the number of random numbers you want to generate and m is the sum of the resulting array. This approach produces positive values and is particularly useful for generating valid probabilities that sum to 1 (let m = 1).

Czarist answered 26/6, 2018 at 18:30 Comment(1)
I find that the sum does not always perfectly add up to M,Syllabary
L
4

To generate N positive numbers that sum to a positive number M at random, where each possible combination is equally likely:

  • Generate N exponentially-distributed random variates. One way to generate such a number can be written as—

      number = -ln(1.0 - RNDU())
    

    where ln(x) is the natural logarithm of x and RNDU() is a method that returns a uniform random variate greater than 0 and less than 1. Note that generating the N variates with a uniform distribution is not ideal because a biased distribution of random variate combinations will result. However, the implementation given above has several problems, such as being ill-conditioned at large values because of the distribution's right-sided tail, especially when the implementation involves floating-point arithmetic. Another implementation is given in another answer.

  • Divide the numbers generated this way by their sum.

  • Multiply each number by M.

The result is N numbers whose sum is approximately equal to M (I say "approximately" because of rounding error). See also the Wikipedia article Dirichlet distribution.

This problem is also equivalent to the problem of generating random variates uniformly from an N-dimensional unit simplex.


However, for better accuracy (compared to the alternative of using floating-point numbers, which often occurs in practice), you should consider generating n random integers that sum to an integer m * x, and treating those integers as the numerators to n rational numbers with denominator x (and will thus sum to m assuming m is an integer). You can choose x to be a large number such as 232 or 264 or some other number with the desired precision. If x is 1 and m is an integer, this solves the problem of generating random integers that sum to m.

The following pseudocode shows two methods for generating n uniform random integers with a given positive sum, in random order. (The algorithm for this was presented in Smith and Tromble, "Sampling Uniformly from the Unit Simplex", 2004.) In the pseudocode below—

  • the method PositiveIntegersWithSum returns n integers greater than 0 that sum to m, in random order,
  • the method IntegersWithSum returns n integers 0 or greater that sum to m, in random order, and
  • Sort(list) sorts the items in list in ascending order (note that sort algorithms are outside the scope of this answer).

 

METHOD PositiveIntegersWithSum(n, m)
    if n <= 0 or m <=0: return error
    ls = [0]
    ret = NewList()
    while size(ls) < n
      c = RNDINTEXCRANGE(1, m)
      found = false
      for j in 1...size(ls)
        if ls[j] == c
          found = true
          break
        end
      end
      if found == false: AddItem(ls, c)
    end
    Sort(ls)
    AddItem(ls, m)
    for i in 1...size(ls): AddItem(ret,
        ls[i] - ls[i - 1])
    return ret
END METHOD

METHOD IntegersWithSum(n, m)
  if n <= 0 or m <=0: return error
  ret = PositiveIntegersWithSum(n, m + n)
  for i in 0...size(ret): ret[i] = ret[i] - 1
  return ret
END METHOD

Here, RNDINTEXCRANGE(a, b) returns a uniform random integer in the interval [a, b).

Least answered 25/7, 2017 at 12:5 Comment(0)
M
3

In Java:

private static double[] randSum(int n, double m) {
    Random rand = new Random();
    double randNums[] = new double[n], sum = 0;

    for (int i = 0; i < randNums.length; i++) {
        randNums[i] = rand.nextDouble();
        sum += randNums[i];
    }

    for (int i = 0; i < randNums.length; i++) {
        randNums[i] /= sum * m;
    }

    return randNums;
}
Machute answered 14/4, 2010 at 20:2 Comment(2)
> Then multiply by M (unless M is 1 like in the example). – ILMTitan Apr 14 at 18:49Amido
randNums[i] /= sum * m; is equivalent to randNums[i] = randNums[i] / (sum * m);. This needs to be randNums[i] = randNums[i] / sum * m; so that the order of operations is correct.Reporter
L
3

Just generate N random numbers, compute their sum, divide each one by the sum.

Expanding on Guillaume's accepted answer, here's a Java function that does exactly that.

public static double[] getRandDistArray(int n, double m)
{
    double randArray[] = new double[n];
    double sum = 0;

    // Generate n random numbers
    for (int i = 0; i < randArray.length; i++)
    {
        randArray[i] = Math.random();
        sum += randArray[i];
    }

    // Normalize sum to m
    for (int i = 0; i < randArray.length; i++)
    {
        randArray[i] /= sum;
        randArray[i] *= m;
    }
    return randArray;
}

In a test run, getRandDistArray(5, 1.0) returned the following:

[0.38106150346121903, 0.18099632814238079, 0.17275044310377025, 0.01732932296660358, 0.24786240232602647]
Laruelarum answered 2/4, 2017 at 17:32 Comment(0)
A
2
  1. Generate N-1 random numbers.
  2. Compute the sum of said numbers.
  3. Add the difference between the computed sum and the desired sum to the set.

You now have N random numbers, and their sum is the desired sum.

Autoharp answered 14/4, 2010 at 19:13 Comment(1)
Except if you get the last number to be negative.Kiyokokiyoshi
N
0

You're a little slim on constraints. Lots and lots of procedures will work.

For example, are numbers normally distributed? Uniform?
I'l assume that all the numbers must be positive and uniformly distributed around the mean, M/N.

Try this.

  1. mean= M/N.
  2. Generate N-1 values between 0 and 2*mean. This can be a standard number between 0 and 1, u, and the random value is (2*u-1)*mean to create a value in an appropriate range.
  3. Compute the sum of the N-1 values.
  4. The remaining value is N-sum.
  5. If the remaining value does not fit the constraints (0 to 2*mean) repeat the procedure.
Nupercaine answered 14/4, 2010 at 18:51 Comment(1)
The "remaining value" is not uniformly chosen because the sum of (n-1) uniform randoms is not uniform.Ureide

© 2022 - 2024 — McMap. All rights reserved.