Priming the Mersenne twister PRNG
Asked Answered
D

1

7

There seems to be some mythology around the use of mt19937, specifically that once seeded 'some' number of bits produced by the generator should be ignored so as to have only as near as is possible to pseudo randomness.

Examples of code I've seen are as follows:

boost::mt19937::result_type seed = 1234567; //taken from some entropy pool etc
boost::mt19937 prng(seed);
boost::uniform_int<unsigned int> dist(0,1000);
boost::variate_generator<boost::mt19937&,boost::uniform_int<unsigned int> > generator(prng,dist);

unsigned int skip = 10000;
while (skip--)
{
   generator();
}

//now begin using for real.
....

My questions are:

  1. Is this myth or is there some truth to it all?

  2. If it's something viable, how many bits should be ignored? as the numbers I've seen
    seem to be arbitrary

Discriminating answered 29/10, 2012 at 1:20 Comment(4)
This guy seems to imply it isn't a Myth: math.sci.hiroshima-u.ac.jp/~m-mat/MT/MT2002/emt19937ar.html -- if I read it right, it takes "some time" for a twister to clear out the initial zeros in the starting state, and if you initialize with an initial state that is mostly zero (say, with a single 32 bit value, leaving most of the state of the twister as a zero), the values will be "insufficiently random" (or, too similar to other values with a low hamming distance from your seed). This is barely above wikipedia level research, so take with a grain of salt.Pyroligneous
@Yakk Interesting so there seems to be 'something' to all this hocuspocus I keep on seeing.Discriminating
Regardless of truthiness or number of required iterations, wouldn't the right solution be to fill the entire seed with random bits from an entropy pool, instead of filling just first 32 or whatever bits?Cirrostratus
Gilly: What has this question to do with Java?Seneca
C
4

The paper referenced in the first comment, Mersenne Twister with improved initialization, isn't just some guy, he's one of the two co-authors of the paper upon which the Boost implementation is based.

The problem with using a single 32-bit integer (4 bytes) as a seed for this generator is that the internal state of the generator is 2496 bytes, according to the Boost documentation. It's not too surprising that such a small seed takes a while to propagate into the rest of the internal state of the generator, particular since the Twister isn't meant to be cryptographically secure.

To address your concern about needing to run the generator for a while to get started, you want the alternate (and explicit) constructor.

template<typename SeedSeq> explicit mersenne_twister_engine(SeedSeq &);

This is the spirit of the third comment, where you initialize with something longer than a single integer. The sequence provide comes from some generator. To use an entropy pool, write a generator as an adapter from an entropy pool, returning values from the pool as needed.

Chellean answered 8/11, 2012 at 14:48 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.