Why is Google Chrome's Math.random number generator not *that* random?
Asked Answered
M

3

21

I ran into an odd "bug" today when I was running some unit tests in various browsers. I had run the tests in Firefox many times before today, and even IE but apparently not Chrome (v19-dev) yet. When I ran them in Chrome it consistently failed one test because two values I was calculating did not match.

When I really dug into what was happening I realized that the issue was that I was assuming that if I filled an array with 100,000 Math.random() values that they would all be unique (there wouldn't be any collisions). Turned out that in Chrome that is not true.

In Chrome I was consistently getting at least two pairs of values that matched out of 100,000. Firefox and IE9 never experience a collision. Here is a jsfiddle I wrote just for testing this that creates 1M Math.random() entries in an array: http://jsfiddle.net/pseudosavant/bcduj/

Does anyone know why the Chrome pseudo-random number generator that is used for Math.random is really not that random? It seems like this could have implications for any client-side js encryption routines that ever use Math.random.

Milore answered 3/3, 2012 at 23:29 Comment(14)
The existence of duplicates doesn't imply non-randomness. en.wikipedia.org/wiki/Birthday_paradox.Inject
And of course, a pseudo-random generator is, by definition, not at all random.Inject
But Math.random() is supposed to be a double-precision floating-point number in the range [0,1), so duplicates are very unlikely for a good random number generator.Reed
If you're interested, you can see how V8 generates its random numbers.Vulnerable
@Oli Charlesworth: The repeated existence of duplicates among 100k 64 bit values most certainly does imply non-randomness (or at least low-quality randomness) - it's about a one in a billion chance if I read the Probability table in the Wikipedia article you linked to correctly.Biceps
@NayukiMinase: Math.random() generates a double-precision floating-point number, but the OP's test for equality works by converting those numbers to strings first, so unless the browser decides to include more than fifteen places past the decimal in its string representations, the OP is comparing values with much less entropy than that.Cassette
@delnan: You've misread the answer that you link to. According to that answer, there are roughly 7,036,874,417,766 double-precision floating-point numbers between 100.0 and 100.1. Between 0.0 and 1.0, there are about 2-to-the-power-of-62 double-precision floating-point numbers.Cassette
@Cassette You make a good point about toString(), but that still retains high precision (although not necessarily full precision). So I would still suspect that the granularity of Math.random() is at fault here.Reed
@ruakh: Of course, those numbers are not equiprobable (if they were, it wouldn't be a uniform distribution).Inject
@OliCharlesworth: Absolutely. But you could still get a uniform distribution with 2^52 distinct, equiprobable values. Whereas, in the specific case of Chrome, see Michael Brogwardt's answer below: apparently Chrome randomly generates a 32-bit integer, then scales that down to the range [0,1), so there are only 2^32 distinct values it can generate (and those are equiprobable).Cassette
@Cassette Great point about the type conversion to a string, didn't think about that. Funny enough though the part in my code where it caused the issue the random numbers were being multiplied by 100000, so it had at least 5 more characters of data than in the jsfiddle example I included.Milore
@pseudosavant: Are you sure about that? I would have thought that stringification of 5555555.5555555555555 might include a similar number of significant figures, and fewer places past the decimal-point, than stringification of 5.5555555555555.Cassette
@Cassette Another great point. I updated my jsfiddle and sure enough multiplying by 100000 usually makes it 1 character shorter in FF/IE, but 2 in Chrome. The Math.random().toString() is usually 18 digits +/- 1 in my jsfiddle. That is characters, not digits, since it includes the . in the string length. So usually 17 digits of numbers, which will go as high as 999 quadrillion. Still shouldn't have 2 sets of collisions per 100,000.Milore
I updated the jsfiddle one more time to implement a proper function for return an array with only unique values without an implicit type conversion to a string because of it being an object key (like my previous histogram function was doing). The numbers still look the same though. In an array with 1000000 Math.random() entries there are consistently still more than 100 non-unique pairs of numbers.Milore
B
32

Apparently Math.random() in V8 only works with 32 bit values (and didn't even correctly randomize all of those in the past). And with 32 bits, the probability of a collision reaches 50% around 2^16 = 65k values...

Biceps answered 3/3, 2012 at 23:43 Comment(1)
This is addressed more directly in another bug report.Ironworker
J
4

Other answers have explained the issue. If you're after better pseudo-random number generation in JavaScript, I'd recommend this page as a good place to start:

http://baagoe.com/en/RandomMusings/javascript/

I adapted one of the algorithms on this page for a script I'm using to generate UUIDs in the browser and had no collisions in my tests.

UPDATE 22 October 2013

The pages linked to above are no longer live. Here's a link to a snapshot from the Wayback Machine:

http://web.archive.org/web/20120502223108/http://baagoe.com/en/RandomMusings/javascript/

And here's a link to a Node.js module that includes Alea.js:

https://npmjs.org/package/alea

Jdavie answered 4/3, 2012 at 0:51 Comment(3)
@BrianBallsun-Stanton: So it is. I've added a link to a snapshot of that page.Jdavie
blah that one is down now too :(Haemorrhage
@user1544793: Found another one.Jdavie
T
4

See https://medium.com/@betable/tifu-by-using-math-random-f1c308c4fd9d:

If we analyze the first sub-generator independently we see that it has 32 bits of internal state. It’s not a full-cycle generator — its actual cycle length is about 590 million (18,030*2¹⁵-1, the math is tricky but it’s explained here and here, or you can just trust me). So we can only produce a maximum of 590 million distinct request identifiers with this generator. If they were randomly selected there would be a 50% chance of collision after generating just 30,000 identifiers.

Theressa answered 20/11, 2015 at 21:4 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.