Random number generation in C++11: how to generate, how does it work? [closed]
Asked Answered
S

2

119

I recently came across new way to generate random numbers in C++11, but couldn't digest the papers that I read about it (what is that engine, maths term like distribution, "where all integers produced are equally likely").

So can anyone please explain

  • what are they?
  • what does they mean?
  • how to generate?
  • how do they work?
  • etc

You can call it all in one FAQ about random number generation.

Steelwork answered 18/8, 2011 at 20:56 Comment(3)
Asking about RNGs without knowing what is a distribution is like asking about expression parsers without knowing what an expression is... The RNG library in C++11 is aimed to people who know some statistics and have greater needs than the flat distribution generated by rand, you should have a quick look at wikipedia for some basic statistic and RNG concepts, otherwise it will be really hard to explain you the rationale of <random> and the usage of its various pieces.Cornejo
@Matteo: Hardly. A child can grasp the concept that a die produces random numbers, without understanding what a distribution is.Unwearied
@Benjamin: and that's where his understanding stops, which is just the very first step (the engines), and without even understanding why it is important that they generate a flat distribution. All the rest of the library remains a mystery without understanding distributions and other statistics concepts.Cornejo
C
160

The question is way too broad for a complete answer, but let me cherry-pick a couple of interesting points:

Why "equally likely"

Suppose you have a simple random number generator that generate the numbers 0, 1, ..., 10 each with equal probability (think of this as the classic rand()). Now you want a random number in the range 0, 1, 2, each with equal probability. Your knee-jerk reaction would be to take rand() % 3. But wait, the remainders 0 and 1 occur more often than the remainder 2, so this isn't correct!

This is why we need proper distributions, which take a source of uniform random integers and turn them into our desired distribution, like Uniform[0,2] in the example. Best to leave this to a good library!

Engines

Thus at the heart of all randomness is a good pseudo-random number generator that generates a sequence of numbers that uniformly distributed over a certain interval, and which ideally have a very long period. The standard implementation of rand() isn't often the best, and thus it's good to have a choice. Linear-congruential and the Mersenne twister are two good choices (LG is actually often used by rand(), too); again, it's good to let the library handle that.

How it works

Easy: first, set up an engine and seed it. The seed fully determines the entire sequence of "random" numbers, so a) use a different one (e.g. taken from /dev/urandom) each time, and b) store the seed if you wish to recreate a sequence of random choices.

#include <random>

typedef std::mt19937 MyRNG;  // the Mersenne Twister with a popular choice of parameters
uint32_t seed_val;           // populate somehow

MyRNG rng;                   // e.g. keep one global instance (per thread)

void initialize()
{
  rng.seed(seed_val);
}

Now we can create distributions:

std::uniform_int_distribution<uint32_t> uint_dist;         // by default range [0, MAX]
std::uniform_int_distribution<uint32_t> uint_dist10(0,10); // range [0,10]
std::normal_distribution<double> normal_dist(mean, stddeviation);  // N(mean, stddeviation)

...And use the engine to create random numbers!

while (true)
{
  std::cout << uint_dist(rng) << " "
            << uint_dist10(rng) << " "
            << normal_dist(rng) << std::endl;

}

Concurrency

One more important reason to prefer <random> over the traditional rand() is that it is now very clear and obvious how to make random number generation threadsafe: Either provide each thread with its own, thread-local engine, seeded on a thread-local seed, or synchronize access to the engine object.

Misc

  • An interesting article on TR1 random on codeguru.
  • Wikipedia has a good summary (thanks, @Justin).
  • In principle, each engine should typedef a result_type, which is the correct integral type to use for the seed. I think I had a buggy implementation once which forced me to force the seed for std::mt19937 to uint32_t on x64, eventually this should be fixed and you can say MyRNG::result_type seed_val and thus make the engine very easily replaceable.
Collette answered 18/8, 2011 at 21:41 Comment(8)
Once again, Kerrek beats me to the punch with a much better answer than the one I was working on. +1Trefor
@Justin: I'm sure I've missed out on a ton of things, do feel free to add further aspects to this topic! :-)Collette
the only thing I would add is a link to the listing of available engines and distributions in C++0x.Trefor
For the "populate somehow" part, I think std::random_device is worth mentioning rather than /dev/urandomEncarnacion
An example of std::random_device can be found here.Jacksnipe
@Cubbi: Thanks for the tip. Looks like GCC 4.7 implements std::random_device precisely by reading /dev/urandom, and consequently it fails on Windows (where this doesn't exist). I imagine that a good library implementation will eventually provide whichever platform-specific source of "true randomness" is available via std::random_device.Collette
Check out this presentation on the subject linkCasey
The code in the Wikipedia article is buggy. random and random2 are identical. From the comments in the code snippet it is clear the author doesn't understand how to use the features in <random>.Radioman
A
5

A random number generator is a equation that, given a number, will give you a new number. Typically you either provide the first number or its pulled from something like the system time.

Each time you ask for a new number it uses the previous number to execute the equation.

A random number generator is not considered very good if it has a tendency to produce the same number more often than other numbers. i.e. if you wanted a random number between one and 5 and you had this distribution of numbers:

  • 1: 1%
  • 2: 80%
  • 3: 5%
  • 4: 5%
  • 5: 9%

2 is generated FAR more often than any other number, so it is more likely to be produced than other numbers. If all numbers were equally like you would have a 20% chance of getting each number every time. To say it another way, the above distribution is very uneven because 2 is favored. A distribution with all 20%'s would be even.

Typically, if you want a true random number you would pull data from something like weather or some other natural source rather than a random number generator.

Avertin answered 18/8, 2011 at 21:4 Comment(4)
Most random number generators do generate good even distributions. They are just not random; the problem is they are calculated and thus you can guess the next number given enough number in the sequence (this fact makes them bad for security where truly random numbers are required). For games and stuff you should be fine.Operose
I'm pretty sure the OP is asking for specific information about the facilities provided in the C++ <random> header. This answer doesn't even address programming let alone C++.Unwearied
@Martin: Security doesn't necessarily require a source of truly random numbers. AES in counter mode (for one example) can do quite nicely even though it's deterministic. It requires a reasonable amount of entropy in the key, but not any true randomness.Tzar
@Benjamin Lindley: Nevermind. Just reread and realized I was wrong.Avertin

© 2022 - 2024 — McMap. All rights reserved.