arc4random() and arc4random_uniform() not really random?
Asked Answered
I

3

14

I have been using arc4random() and arc4random_uniform() and I always had the feeling that they wasn't exactly random, for example, I was randomly choosing values from an Array but often the values that came out were the same when I generated them multiple times in a row, so today I thought that I would use an Xcode playground to see how these functions are behaving, so I first tests arc4random_uniform to generate a number between 0 and 4, so I used this algorithm :

import Cocoa

var number = 0

for i in 1...20 {
    number = Int(arc4random_uniform(5))
}

And I ran it several times, and here is how to values are evolving most of the time :
enter image description here enter image description here

So as you can see the values are increasing and decreasing repeatedly, and once the values are at the maximum/minimum, they often stay at it during a certain time (see the first screenshot at the 5th step, the value stays at 3 during 6 steps, the problem is that it isn't at all unusual, the function actually behaves in that way most of the time in my tests.

Now, if we look at arc4random(), it's basically the same :
enter image description here enter image description here

So here are my questions :

  • Why is this function behaving in this way ?
  • How to make it more random ?

Thank you.

EDIT :
Finally, I made two experiments that were surprising, the first one with a real dice :
enter image description here
What surprised me is that I wouldn't have said that it was random, since I was seeing the same sort of pattern that as described as non-random for arc4random() & arc4random_uniform(), so as Jean-Baptiste Yunès pointed out, humans aren't good to see if a sequence of numbers is really random.

I also wanted to do a more "scientific" experiment, so I made this algorithm :

import Foundation

var appeared = [0,0,0,0,0,0,0,0,0,0,0]
var numberOfGenerations = 1000

for _ in 1...numberOfGenerations {
    let randomNumber = Int(arc4random_uniform(11))
    appeared[randomNumber]++
}

for (number,numberOfTimes) in enumerate(appeared) {
    println("\(number) appeard \(numberOfTimes) times (\(Double(numberOfGenerations)/Double(numberOfTimes))%)")
}

To see how many times each number appeared, and effectively the numbers are randomly generated, for example, here is one output from the console :
0 appeared 99 times.
1 appeared 97 times.
2 appeared 78 times.
3 appeared 80 times.
4 appeared 87 times.
5 appeared 107 times.
6 appeared 86 times.
7 appeared 97 times.
8 appeared 100 times.
9 appeared 91 times.
10 appeared 78 times.

So it's definitely OK 😊

EDIT #2 : I made again the dice experiment with more rolls, and it's still as surprising to me :
enter image description here

Ibnrushd answered 7/1, 2015 at 6:35 Comment(12)
Check this: #57148Needle
@Needle Aren't they using the same functions than I do?Ibnrushd
@Needle My problem isn't about how to replace elements in an array, just how to generate really random numbers. :)Ibnrushd
Oh ok... Random numbers can be in repetitive manner. It requires to handle on developer end. Check this now: #1618130Needle
A sequence of random numbers can contain a subsequence of repetitive numbers...Testator
Your question is unclear to me. You claim that "once the values are at the maximum/minimum, they often stay at it during a certain time". But the repeated value 3 in picture #1 is not the "maximum" value (and there will always be repeated values, as Jean-Baptiste said). – And what is "basically the same" in picture #3 and #4? I cannot see anything unusual there.Diphosgene
If you tossed a coin and got alternating heads and tails, would you consider that "random enough"?Happy
@Happy To you mean something like : Tail, Head, Tail, head, tail, head... ? If yes, no, but would I would NOT considered as random would be if I was getting a dozen times head, then a dozen times tail, and so on... all the time.Ibnrushd
@MartinR Basically, what I don't like about this function is that it looks like most of the numbers that are generated are close to the "generation bounds", if I say that I want a number between 0 and 10, it looks like most of the numbers will be close to 0 or 10, but never that much close to the "middle".Ibnrushd
If you generated random numbers for long enough you could expect to see that. I was making the point that seeing repeated values is expected with randomness. It doesn't take previous values into account, so each subsequent value is equally probable.Happy
@Happy Yes I understand, but I meant if it was always like this.Ibnrushd
This is a visual artifact! Try to use another kind of visualization, random points in a circle for example... You will be able to observe more easily that the distribution is "uniform". In your "1D" plotting, just try to swap between values for example take a graph and swap the line for value 6 with line for value 0, you will observe that the graph is almost the same!Testator
K
11

A true random sequence of numbers cannot be generated by an algorithm. They can only produce pseudo-random sequence of numbers (something that looks like a random sequence). So depending on the algorithm chosen, the quality of the "randomness" may vary. The quality of arc4random() sequences is generally considered to have a good randomness.

You cannot analyze the randomness of a sequence visually... Humans are very bad to detect randomness! They tend to find some structure where there is no. Nothing really hurts in your diagrams (except the rare subsequence of 6 three in-a-row, but that is randomness, sometimes unusual things happens). You would be surprised if you had used a dice to generate a sequence and draw its graph. Beware that a sample of only 20 numbers cannot be seriously analyzed against its randomness, your need much bigger samples.

If you need some other kind of randomness, you can try to use /dev/random pseudo-file, which generate a random number each time you read in. The sequence is generated by a mix of algorithms and external physical events that ay happens in your computer.

Kanzu answered 7/1, 2015 at 7:11 Comment(3)
When you told me that I would be surprised if I has used a dice... well that's exactly what I just done, and yes, I've been surprised : imgur.com/DZHfTUs If I had seen this graph and didn't knew it was a dice, I would have said that it was a bad random generator. Anyway, I'm still surprised that in my random generations, the values seem to be close to the maximum and the minimum of the range of numbers that arc4random can generate, even if I don't get exactly what it means, I might be searching like the discrepancy sequence Jeffrey Thomas is mentioning.Ibnrushd
I made an algorithm to see how many times each number appears, and that's definitely random ! For example, for 1000 generations of numbers between 0 and 10, I had : 0 appeard 101 times (9.9009900990099%) 1 appeard 87 times (11.4942528735632%) 2 appeard 79 times (12.6582278481013%) 3 appeard 90 times (11.1111111111111%) 4 appeard 95 times (10.5263157894737%) 5 appeard 106 times (9.43396226415094%) 6 appeard 90 times (11.1111111111111%) 7 appeard 84 times (11.9047619047619%) 8 appeard 78 times (12.8205128205128%) 9 appeard 106 times (9.43396226415094%) 10 appeard 84 times (11.9047619047619%)Ibnrushd
Be really careful, testing against randomness it a true challenge! May I suggest you to read en.wikipedia.org/wiki/Randomness_tests. There is a lot of software suite to test this...Testator
A
4

It depends on what you mean when you say random.

As stated in the comments, true randomness is clumpy. Long strings of repeats or close values are expected.

If this doesn't fit your requirement, then you need to better define your requirement.

Other options could include using a shuffle algorithm to dis-order things in an array, or use an low-discrepancy sequence algorithm to give a equal distribution of values.

Assignor answered 7/1, 2015 at 7:21 Comment(2)
I am not getting exactly what a low-discrepancy sequence is, is it that if I had used such a function in my algorithm, I would have got really fast as many 0, 1, 2, 3 and 4 ? (for my arc4random_uniform(5)) ?Ibnrushd
@TrevörAnneDenise low-discrepancy sequence has the look of random numbers, but each value is carefully picked to avoid clumpiness. To humans, it looks more random than actual randomness.Assignor
J
0

I don’t really agree with the idea of humans who are very bad to detect randomness. Would you be satisfied if you obtain 1-1-2-2-3-3-4-4-5-5-6-6 after throwing 6 couples of dices ? however the dices frequencies are perfect…

This is exactly the problem i’m encountering with arc4random or arc4random_uniform functions. I’m developing a backgammon application since many years which is based on a neural network trained by word champions players. I DO know that it plays much better than any one but many users think it is cheating. I also have doubts sometimes so I’ve decided to throw all dices by myself…

I’m not satisfied at all with arc4random, even if frequencies are OK. I always throw a couple of dices and results lead to unacceptable situations, for example : getting five consecutive double dices for the same player, waiting 12 turns (24 dices) until the first 6 occurs.

It is easy to test (C code) :

void randomDices ( int * dice1, int * dice2, int player )
{
    ( * dice1 ) = arc4random_uniform ( 6 ) ;
    ( * dice2 ) = arc4random_uniform ( 6 ) ;

    // Add to your statistics
    [self didRandomDice1:( * dice1 ) dice2:( * dice2 )  forPlayer:player] ;
}

Maybe arc4random doesn’t like to be called twice during a short time…

So I’ve tried several solutions and finally choose this code which runs a second level of randomization after arc4random_uniform :

int CFRandomDice ()
{
    int __result = -1 ;

    BOOL __found = NO ;

    while ( ! __found )
    {
        // random int big enough but not too big
        int __bigint = arc4random_uniform ( 10000 ) ;

        // Searching for the first character between '1' and '6'
        // in the string version of bigint :

        NSString * __bigString = @( __bigint ).stringValue ;

        NSInteger __nbcar = __bigString.length ;
        NSInteger __i = 0 ;

        while ( ( __i < __nbcar ) && ( ! __found ) )
        {
            unichar __ch = [__bigString characterAtIndex:__i] ;
            if ( ( __ch >= '1' ) && ( __ch <= '6' ) )
            {
                __found = YES ;
                __result = __ch - '1' + 1 ;
            }
            else
            {
                __i++ ;
            }
        }
    }

    return ( __result ) ;
}

This code create a random number with arc4random_uniform ( 10000 ), convert it to string and then searches for the first digit between ‘1’ and ‘6’ in the string.

This appeared to me as a very good way to randomize the dices because : 1/ frequencies are OK (see the statistics hereunder) ; 2/ Exceptional dice sequences occur at exceptional times.

10000 dices test:
----------
Game Stats
----------
HIM :
Total 1 = 3297
Total 2 = 3378
Total 3 = 3303
Total 4 = 3365
Total 5 = 3386
Total 6 = 3271
----------
ME :
Total 1 = 3316
Total 2 = 3289
Total 3 = 3282
Total 4 = 3467
Total 5 = 3236
Total 6 = 3410
----------
HIM doubles = 1623
ME doubles = 1648

Now I’m sure that players won’t complain…

Julieannjulien answered 3/7, 2019 at 15:16 Comment(1)
So you're pinning your intuition against all the mathematical theory and empirical statistical testing backing arc4random? Disagreeing with the observation that humans are bad at judging randomness is an opinion, not a fact-based statement. It's been very well documented.Calaboose

© 2022 - 2024 — McMap. All rights reserved.