Understanding the Poisson distribution of a random number generator

I'm working with the random number generator available in C++11. At the moment, I'm using a uniform distribution, which should give me an equal probability of getting any number in the range [A, B] that I specify.

However, I'm confused about generating Poisson distributions. While I understand how to determine the Poisson probability, I don't understand how a random series of numbers can be "distributed" based on the Poisson distribution.

For instance, the C++11 constructor for a Poisson distribution takes one argument, λ, which is the mean of the distribution:

// eng is an already-constructed engine such as std::mt19937; the result type must be an integer type
std::poisson_distribution<int> poisson(7.0);
std::cout << poisson(eng) << std::endl;

In a Poisson probability problem, this is equal to the expected number of successes / occurrences during a given interval. However, I don't understand what it represents in this instance. What is a "success" / "occurrence" in a random number scenario?

I appreciate any assistance or reference materials which I can use to help me understand this.

Diphenylhydantoin answered 22/2, 2012 at 9:4 Comment(2)
Part of the issue here may be that I do not completely understand the purpose of a Poisson distribution. My statistics / probability texts discuss determining the Poisson probability, but provide nothing regarding generating numbers within a Poisson distribution. I don't have an actual application at the moment. I'm really just curious as to how this works. - Diphenylhydantoin
A sample implementation could calculate the probability of each value and then use the cumulative probabilities as thresholds that translate a uniform random number into a Poisson one. E.g. for λ = 2 there is a 13.5% chance of 0, a 27.1% chance of 1, a 27.1% chance of 2, and so on. Generate a good old uniform random number between 0.0 and 1.0: if it is <= 0.135, return 0; otherwise if it is <= 0.406, return 1; otherwise if it is <= 0.677, return 2; and so on. - Skidway

A Poisson distribution gives the probability that a specific count occurs. Imagine counting how many cars pass a certain point each day. Some days the count is higher, some days lower. Tracked over a long period, a mean emerges: counts close to it occur most often, while counts far from it (0 cars in a day, or ten times the usual number) are much less likely. λ is that mean.

Applied to an RNG, each call returns the number of cars that passed on one randomly chosen day. Values close to the mean λ are the most likely to come up, and the extremes are the least likely.
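
To see this in practice, one could draw many samples and count how often each value comes up. The following is only a minimal sketch: `tally_poisson` is a made-up helper name, and `std::mt19937` is just one possible choice of engine.

```cpp
#include <cassert>
#include <map>
#include <random>

// Draw n Poisson(lambda) samples and count how often each value appears.
// tally_poisson is a hypothetical helper, not part of the standard library.
std::map<int, int> tally_poisson(double lambda, int n, std::mt19937& eng) {
    assert(lambda > 0.0 && n >= 0);
    // The result type of std::poisson_distribution must be an integer type.
    std::poisson_distribution<int> poisson(lambda);
    std::map<int, int> counts;
    for (int i = 0; i < n; ++i) {
        ++counts[poisson(eng)];  // one call = one "day": the number of cars seen
    }
    return counts;
}
```

With λ = 7 and a large `n`, the entries for 6, 7 and 8 should hold the biggest counts, while 0 or 20 should be rare, and the average of all drawn values should settle close to 7.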

The following link shows an example of a Poisson distribution: the discrete values it can produce and the probability of each one occurring:

http://www.mathworks.com/help/toolbox/stats/brn2ivz-127.html

A sample implementation could calculate the probability of each value and then use the cumulative probabilities as thresholds that translate a uniform random number into a Poisson one. E.g. for λ = 2 there is a 13.5% chance of 0, a 27.1% chance of 1, a 27.1% chance of 2, and so on. Generate a good old uniform random number between 0.0 and 1.0: if it is <= 0.135, return 0; otherwise if it is <= 0.406, return 1; otherwise if it is <= 0.677, return 2; and so on.
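
The threshold idea described above could be sketched in C++ roughly as follows. This is an illustration only, not how `std::poisson_distribution` is required to work internally; the function names are made up, and the approach assumes a modest λ, since `std::exp(-lambda)` underflows to zero for very large means.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Walk the cumulative Poisson probabilities until they exceed the uniform
// draw u. poisson_from_uniform is a hypothetical name used for illustration.
int poisson_from_uniform(double lambda, double u) {
    assert(lambda > 0.0 && u >= 0.0 && u <= 1.0);
    double p = std::exp(-lambda);  // P(X = 0)
    double cumulative = p;
    int k = 0;
    // Cap k so that rounding in the far tail cannot loop forever.
    const int k_max = static_cast<int>(lambda + 20.0 * std::sqrt(lambda) + 50.0);
    while (u >= cumulative && k < k_max) {
        ++k;
        p *= lambda / k;  // P(X = k) = P(X = k - 1) * lambda / k
        cumulative += p;
    }
    return k;
}

// Feed a fresh uniform random number through the thresholds.
int sample_poisson(double lambda, std::mt19937& eng) {
    std::uniform_real_distribution<double> uniform(0.0, 1.0);
    return poisson_from_uniform(lambda, uniform(eng));
}
```

For λ = 2, a uniform draw of 0.10 falls below the first threshold and maps to 0, a draw of 0.30 maps to 1, and a draw of 0.50 maps to 2, matching the ranges described above.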

Skidway answered 22/2, 2012 at 9:31 Comment(5)
Ok -- this is similar to what I expected. However, what determines the "range" of the output numbers? For instance, if the mean is 75, we can have two numbers, such as 50 and 150, or 74 and 76. Both of these average to 75, but the range between 50 and 150 is significantly greater. In addition, what determines how many samples are required for the mean to begin to emerge? - Diphenylhydantoin
50 and 150 each have a much lower chance of occurring than 74 and 76. Although each pair averages to 75, you should look at the individual numbers and how they relate to the mean, i.e. how likely each is to occur in the experiment: how likely is it that 50 or 150 cars passed on a day, compared with 74 or 76? The spread is not a separate parameter either: for a Poisson distribution the variance also equals λ, so with λ = 75 the standard deviation is √75 ≈ 8.7, and most results land within a couple of standard deviations of 75. - Skidway
Regarding how many samples you need for the mean to emerge, this depends on the mean, as this is a discrete distribution. E.g. if your mean is 2, the value 2 turns up much sooner (27% chance the RNG returns 2) than 1050 does when your mean is 1050 (1.2% chance the RNG returns 1050). With a λ of 75, the probability of getting exactly 75 is about 4.6%. - Skidway
Thanks for all of the help oddstar -- the "sample implementation" you provided in the comments of the question REALLY cleared everything up for me. I was missing that link in my mind -- despite how perhaps it should have been obvious to me. - Diphenylhydantoin
I should have figured that would have been the clearest from the beginning, but anyway... glad it made you see the light ;) - Skidway
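
The probabilities of landing exactly on the mean, discussed in the comments above, can be checked directly from the Poisson formula P(X = k) = λ^k e^(-λ) / k!. The helper below is a hedged sketch: `poisson_pmf` is a made-up name, and it works in log space with `std::lgamma` only to avoid overflow for large k.

```cpp
#include <cassert>
#include <cmath>

// P(X = k) for X ~ Poisson(lambda), evaluated as
// exp(k * ln(lambda) - lambda - ln(k!)), using lgamma(k + 1) = ln(k!).
// poisson_pmf is a hypothetical helper, not a standard library function.
double poisson_pmf(double lambda, int k) {
    assert(lambda > 0.0 && k >= 0);
    return std::exp(k * std::log(lambda) - lambda - std::lgamma(k + 1.0));
}
```

For example, `poisson_pmf(2.0, 2)` evaluates to roughly 0.27, `poisson_pmf(75.0, 75)` to roughly 0.046, and `poisson_pmf(1050.0, 1050)` to roughly 0.012, while `poisson_pmf(75.0, 50)` is far smaller than `poisson_pmf(75.0, 74)`.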
