Fast pseudo random number generator for procedural content
Asked Answered
C

7

15

I am looking for a pseudo random number generator which would be specialized to work fast when it is given a seed before generating each number. Most generators I have seen so far assume you set seed once and then generate a long sequence of numbers. The only thing which looks somewhat similar to I have seen so far is Perlin Noise, but it generates too "smooth" data - for similar inputs it tends to produce similar results.

The declaration of the generator should look something like:

int RandomNumber1(int seed);

Or:

int RandomNumber3(int seedX, int seedY, int seedZ);

I think having good RandomNumber1 should be enough, as it is possible to implement RandomNumber3 by hashing its inputs and passing the result into the RandomNumber1, but I wrote the 2nd prototype in case some implementation could use the independent inputs.

The intended use for this generator is to use it for procedural content generator, like generating a forest by placing trees in a grid and determining a random tree species and random spatial offsets for each location.

The generator needs to be very efficient (below 500 CPU cycles), because the procedural content is created in huge quantities in real time during rendering.

Contingency answered 3/10, 2008 at 16:25 Comment(2)
The reason Perlin noise is similar to what you're asking for is that Perlin noise uses a deterministic (repeatable) pseudorandom function to do part of its job (and then smooths the result). If you look at a Perlin noise implementation, especially the earlier pre-"improved" ones, you will often find the type of efficient, repeatable "random" function you're looking for, though the language, domain and range will vary. E.g. RandomNumber(vec2 seed, float x, float y) { return fract(sin(dot(seed + vec2(fx, fy), vec2(12.9898,78.233))) * 43758.5453); } (GLSL ES)Odoric
I've been trying to research this question too, and have come to the conclusion that the word "generator" implies the sequential, streaming behavior that we're trying to avoid. This is why a PRNG is usually understood as providing stateful "functions", not strictly deterministic ones. Maybe we'd have better success in research if we searched for PRNF (function) rather than PRNG. blogs.unity3d.com/2015/01/07/… calls them "random hash functions."Odoric
M
22

Seems like you're asking for a hash-function rather than a PRNG. Googling 'fast hash function' yields several promising-looking results.

For example:

uint32_t hash( uint32_t a)
    a = (a ^ 61) ^ (a >> 16);
    a = a + (a << 3);
    a = a ^ (a >> 4);
    a = a * 0x27d4eb2d;
    a = a ^ (a >> 15);
    return a;
}

Edit: Yep, some hash functions definitely look more suitable than others.

For your purposes, it should be sufficient to eyeball thefunction and check that a single-bit change in the input will propagate to lots of output bits.

Mimesis answered 3/10, 2008 at 16:30 Comment(3)
I hope this is a good direction to go. At first look it seems to me while hash functions do have one important property (uniform distribution), I am not quite sure if its output can be considered "random" - how do I know for a particular function how much does its output resemble noise?Contingency
One test for a good hash function is to give it the sequence of integers 0, 1, 2.. and test the output for 'randomness' using pseudo random number generator tests.Datura
Good answer, though I disagree with "hash function rather than a PRNG." In general, hash functions are not always good random number generators (they're designed more for other properties: en.wikipedia.org/wiki/Hash_function#Properties), and the OP does need a certain quality of randomness, or his forests will look fake. That being said, some hash functions make "random-enough" PRNGs, and they're certainly deterministic as the OP is asking for.Odoric
P
10

Yep, you are looking for a fast integer hash algorithm rather than a PRNG.

This page has a few algorithms, I'm sure you'll find plenty more now you know the correct search terms.

Edit: The original page has been removed, a live version can be found on GitHub.

Parik answered 3/10, 2008 at 16:38 Comment(0)
S
4

Here's a small random number generator developed by George Marsaglia. He's an expert in the field, so you can be confident the generator has good statistical properties.

v = 36969*(v & 65535) + (v >> 16);
u = 18000*(u & 65535) + (u >> 16);
return (v << 16) + (u & 65535);

Here u and v are unsigned ints. Initialize them to any non-zero values. Each time you generate a random number, store u and v somewhere. You could wrap this in a function to match your signature above (except the ints are unsigned.)

Sexagenary answered 19/10, 2008 at 1:7 Comment(3)
Unfortunately this does not match the question. I need to provide my own U and V each time, not to have them stored somewhere and updated between iterations. The function needs to always produce the same output given the same inputs.Contingency
@Suma: Why can't you provide your own U and V each time if you just pass them as parameters to this function? And having the same U and same V each time will also always produce the same result. It matches your question exactly!Rebeckarebeka
I tried this and did not get good results. Varying u by 1 does not vary the output significantly.Noleta
D
2

see std::tr1::ranlux3, or other random number generators that are part of TR1 additions to the standard C++ library. I suggested mt19937 initialially, but then saw your note that it needs to be very fast. TR1 is should be available on Microsoft VC++ and GCC, and can also be found in the boost libraries which support even more compilers.

example adapted from boost documentation:

#include <random>
#include <iostream>
#include <iterator>
#include <functional>
#include <algorithm>
#include <ctime>
using namespace std;
using namespace std::tr1;
int main(){
    random_device trueRand;
    ranlux3 rng(trueRand);  // produces randomness out of thin air
                            // see pseudo-random number generators
    uniform_int<> six(1,6); // distribution that maps to 1..6
                            // see random number distributions
    variate_generator<ranlux3&, uniform_int<> >
           die(rng, six);   // glues randomness with mapping

    // simulate rolling a die
    generate_n( ostream_iterator<int>(cout, " "), 10, ref(die));
}

example output:

2 4 4 2 4 5 4 3 6 2

Any TR1 random number generator can seed any other random number generator. If you need higher quality results, consider feeding the output of mt19937 (which is slower, but higher quality) into a minstd_rand or randlux3, which are faster generators.

Datura answered 3/10, 2008 at 17:48 Comment(0)
D
0

If memory is not really an issue and speed is of utmost importance then you can prebuild a large array of random numbers and just iterate through it at runtime. For example have a seperate program generate 100,000 random numbers and save it as it's own file like

unsigned int randarray []={1,2,3,....}

then include that file into your compile and at runtime your random number function only needs to pull numbers from that array and loop back to the start when it hits the end.

Deviationism answered 5/10, 2008 at 0:4 Comment(4)
Computing a simple hash as in stackoverflow.com/questions/167735/#167764 will almost always be faster than accessing a huge array (huge array will not fit into the cache, therefore accessing it is slow)Contingency
I just profiled it on my PC and using my lookup table method vs the hash function, the lookup table is 13x faster.Deviationism
The lookup table can be faster when it is small enough to fit into L2 cache, and when you are not using the L2 cache for anything else - which was most likely the case in your test. If you want to test real world performance, you need to perform some significant data processing between the lookups.Contingency
A SIMD implementation of Xorshift+ can generate random numbers faster than DRAM bandwidth, as in What's the fastest way to generate a 1 GB text file containing random digits? (and that includes extra work to turn them into ASCII decimal digits.)Kinzer
R
0

I use the following code in my Java random number library - this has worked pretty well for me. I also use this for generating procedural content.

/**
 * State for random number generation
 */
private static volatile long state=xorShift64(System.nanoTime()|0xCAFEBABE);

/**
 * Gets a long random value
 * @return Random long value based on static state
 */
public static long nextLong() {
    long a=state;
    state = xorShift64(a);
    return a;
}

/**
 * XORShift algorithm - credit to George Marsaglia!
 * @param a initial state
 * @return new state
 */
public static final long xorShift64(long a) {
    a ^= (a << 21);
    a ^= (a >>> 35);
    a ^= (a << 4);
    return a;
}
Resound answered 18/11, 2009 at 21:21 Comment(0)
A
-2

tt800.h

#ifndef TT800_H
#define TT800_H

#include <stdint.h>

typedef struct
{
    uint32_t state_array[25];       // the array for the state vector 
    int state_index;                // index into state vector array
} tt_state;

uint32_t random_uint32(tt_state* state);
void seed_initial_state(tt_state* state, uint32_t seed);
void set_initial_state_key(tt_state* state, char* key);
void set_initial_state(tt_state* state, uint32_t* seed_array);

#endif

tt800.c

/* A C-program for TT800 : July 8th 1996 Version 
   by M. Matsumoto, email: m-mat @ math.sci.hiroshima-u.ac.jp

   Formatting modifications for streamlined performance and
   better readability by Robert Bristow-Johnson  April 14th 2024
   Also added additional initialization functions similar to the
   MT19937 utility.

   random_uint32(tt_state* state) generates and returns one
   pseudorandom unsigned integer (32bit) which is uniformly
   distributed from 0 to 2^32 - 1.

   seed_initial_state(tt_state* state, uint32_t seed) sets initial
   values of the PRNG state of 25 words according to a given single
   32-bit seed word.

   set_initial_state_key(tt_state* state, char* key) sets the initial
   state according to the text of a key in a given string of chars.

   set_initial_state(tt_state* state, uint32_t* seed_array) sets the
   initial state to exactly the given seed_array.

   Before random_uint32() is called, a tt_state must be created and 
   seed_initial_state() or set_initial_state_key() or
   set_initial_state() must be called once.  One may choose any
   initial 25 values except all zeros.
 
   Copyright (C) 1996, Makoto Matsumoto
   All rights reserved.

   Redistribution and use in source and binary forms, with or without
   modification, are permitted provided that the following conditions
   are met:

     1. Redistributions of source code must retain the above copyright
        notice, this list of conditions and the following disclaimer.

     2. Redistributions in binary form must reproduce the above copyright
        notice, this list of conditions and the following disclaimer in the
        documentation and/or other materials provided with the distribution.

     3. The names of its contributors may not be used to endorse or promote 
        products derived from this software without specific prior written 
        permission.

   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
   A PARTICULAR PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT
   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

   See: ACM Transactions on Modelling and Computer Simulation,
   Vol. 4, No. 3, 1994, pages 254-266.
*/

#include "tt800.h"

// Period parameters 
#define N 25
#define M 7                          // not used directly in this version
#define K 18                         // redefined from N-M
#define MATRIX_A   0x8ebfd028UL      // constant vector A


uint32_t random_uint32(tt_state* state)
{
    uint32_t* state_array = &(state->state_array[0]);
    int n = state->state_index;     // 0 <= state_index <= N-1   always

    uint32_t x = state_array[n];    // get state N iterations ago 

    uint32_t xA = x >> 1;
    if (x & 0x00000001UL) xA ^= MATRIX_A;

    int k = n - K;                  // point to state K iterations ago
    if (k < 0) k += N;              // modulo N circular indexing

    x = state_array[k] ^ xA;        // generate next value in the state
    state_array[n++] = x;           // update new state

    if (n >= N) n = 0;              // modulo N circular indexing
    state->state_index = n;
                                    // tempering
    x ^= (x << 7) & 0x2b5b2500UL;   // s and b, magic vectors
    x ^= (x << 15) & 0xdb8b0000UL;  // t and c, magic vectors
    x ^= (x >> 16);                 // this line added by Makoto Matsumoto in 1996 to improve lower bits corellation.

    return x;
}


/*
    Initializing the state_array from a seed 
*/
void seed_initial_state(tt_state* state, uint32_t seed) 
{
    uint32_t* state_array = &(state->state_array[0]);

    for (int n=0; n<N; n++)
    {
        seed = 1812433253UL * (seed ^ (seed >> 30)) + n;    // Knuth TAOCP Vol2. 3rd Ed. P.106 for multiplier.
        state_array[n] = seed;                              // suggested initial seed = 19650218UL 
    }

    state->state_index = 0;
}


/*
    Initializing the state_array from a text key.

    This is, hopefully, well scrambled but is a fully deterministic
    function of the key, which must be at least 8 characters.  If the
    key is less than 8 characters, the PRNG state is still adequately 
    initialized and the default seed is used.
*/
void set_initial_state_key(tt_state* state, char* key)  
{
    seed_initial_state(state, 19650218UL);

    uint32_t* state_array = &(state->state_array[0]);

    int offset = 0;
    int k = 0;
    while ((key[k] != '\0') && (k < 4*N))           // determine string length, stop at 4N
    {
        offset += (int)key[k++];                    // see how many times we circle around the ASCII table.
    }
 
    if (k >= 8)                                     // key must be at least 8 chars
    {
        offset &= 0x007f;
        offset += N-128;                            // N-128 <= offset <= N-1
        k -= 3;                                     // shorten reach by 3 chars

        int i = 0;
        for (int n=0; n<N; n++)
        {
            uint32_t key28bit = ( ((uint32_t)key[i++] & 0x0000007fUL) << 21) 
                              | ( ((uint32_t)key[i++] & 0x0000007fUL) << 14) 
                              | ( ((uint32_t)key[i++] & 0x0000007fUL) << 7) 
                              | ( ((uint32_t)key[i++] & 0x0000007fUL) );

            if (i >= k) i -= k;                     // wrap i around
            
            state_array[n] += 1664525UL * (key28bit + n);

            int m = n - offset;
            if (m < 0) m += N;                      // modulo N

            state_array[n] ^= 1566083941UL * (state_array[m] ^ (state_array[m] >> 30));
        }
    }

    uint32_t alt_state_array[N];

    for (int n=0; n<N; n++)
    {
        alt_state_array[n] = random_uint32(state);      // scramble it up a bit with tempered random values
    }

    for (int n=0; n<N; n++)
    {
        state_array[n] = alt_state_array[(N-1)-n];      // swap with alternate state and reverse order
    }

    for (int n=0; n<2*N; n++)
    {
        random_uint32(state);     // this should scramble it up enough to dash any hope of reverse engineering the key
    }
}

/*
    Initialization by "seed_initial_state()" is an example.
    Theoretically, there are 2^800-1 possible states as an intial
    state.  This function allows to choose any of 2^800-1 states. 
    Essential bits in "seed_array[]" are the following 800 bits: 
    seed_array[0], seed_array[1], ..., seed_array[N-1]. 
    Theoretically, seed_array[0], seed_array[1], ..., seed_array[N-1] 
    can take any values except all zeros.

    The sizeof(seed_array[]) must be at least sizeof(state_array[]) = N = 25 
*/
void set_initial_state(tt_state* state, uint32_t* seed_array)
{
    uint32_t* state_array = &(state->state_array[0]);
 
    for (int n=0; n<N; n++)
    {
        state_array[n] = seed_array[n];
    }

    state->state_index = 0;
}

/*   This main() outputs first 1000 generated numbers.

#include <stdio.h>
#include "tt800.h"

main()
{ 
    tt_state state;
    
    seed_initial_state(&state, 19650218UL);
    for (int i=0; i<1000; i++)
    {
        printf("%10lu ", random_uint32(&state));
        if (i%5==4) printf("\n");
    }
}
*/
Atop answered 19/4, 2024 at 3:2 Comment(14)
yah, the haters gonna hate.Atop
It sort of answers the question title, but not the actual question being asked, where they want to give a new seed for every output. So it's actually a hash function. Also, a lot of good fast PRNGs have been developed since 1994, notably the xorshift family, leading to xoshiro (en.wikipedia.org/wiki/Xorshift). Those have good properties without as large a state as this. (Although your answer here only accesses a couple elements from the state array for each random number.)Kinzer
Well, it's not my algorithm but was a predecessor to the Mersenne Twister. I think the author called tt800 the smaller senior cousin to mt19377 . I just did some stuff to polish up the code, compared to the original. It's the same algorithm, but the batch update is not done, so a single word in the state is updated each time it's called. I did the same to the Mersenne Twister.Atop
You chose to post it as an answer to this question, presenting it as a good choice to solve this specific problem in 2024. That's fairly questionable even as an answer to the title question, and I'm pretty sure it's not a good choice to answer the reseed-every-output actual question (where the accepted answer has no state, just a hash function). The initial downvote wansn't mine, but voting should reflect how good an answer it is to the question asked. It's not a criticism of you directly, just of the answer as posted.Kinzer
If this is faster and/or has better randomness properties (like bigcrush metrics) than more modern PRNGs like xoshiro, your answer should say so. xoshiro variants are AFAIK the state of the art for fast but relatively good PRNGs with period larger than 2^32.Kinzer
Listen, I'm not going to make apologies for the code. It comes from the very same author who cooked up the Mersenne Twister, which I thought was the going standard. I polished up the code just as I had for the MT. I have not, myself, done any tests of randomness on it. I presume, because the state is 800 bits instead of 32 bits, that the period is much longer and that the tempering operations make the resulting pseudo-random values decently uncorrelated.Atop
Okay, the original author says this: "TT800 : A small senior cousin of Mersenne Twister. It is also in pLab : Salzburg University Random Number Research Team "Atop
Mersenne Twister is slow for the quality of random numbers it produces. MT uses uint32_t state_array[624]; each element is 32-bit but the full state is huge allowing a very large period. en.wikipedia.org/wiki/Mersenne_Twister#Characteristics lists some of the known disadvantages. It's still the default PRNG in a lot of things, but I think that's more a matter of inertia. The "alternatives" section of wikipedia discusses other more modern improvements, all having a less extreme amount of state space but still having advantages.Kinzer
You should see it before I rewrote it and this. Listen, I do DSP algorithms for real-time processing of audio and music signals. Once the batch-processing was undone and the alg only updates a single number in the state for each call, then it's not so slow, when considering worst-case timing. It's on par with a simple first-order or biquad linear filter.Atop
With that weakening to touch less state, is the quality of randomness better than an xorshift128+ that would also run fast, but have a smaller cache footprint for its state? Especially on a 64-bit machine where two 64-bit integers can hold 128 bits of state. There's a lot of room between "good enough" and "best tradeoff of speed vs. quality".Kinzer
This answer still doesn't address the part of the question that asks for "reseeding" before every output. Calling seed_initial_state before every random_uint32() call is probably not very fast. If you want to suggest a way to use this PRNG for the OP's use-case, edit your answer to explain how. Or if you want to frame-challenge that they shouldn't be re-seeding for their use-case (procedural content), explain that.Kinzer
seed_initial_state() is intended to be called once before random_uint32() is ever called. Either that or set_initial_state_key() or set_initial_state(). The Mersenne Twister is not meant to be reseeded every call. That's not how it works. It's seeded once, and in this version computes one new word in the state for each call. Each call 623 words remain the same but get bumped over by one. I dunno what they did that batch processing. It accomplishes nothing but increase worst-case timing by a factor of about 600.Atop
oops, I forgot it's tt800, 25 words in the state, not the mt19937 with 624.Atop
The Mersenne Twister is not meant to be reseeded every call - exactly, that's why it or tt800 aren't good answers to this Stack Overflow question. Unless your answer spends some time explaining that you shouldn't do that for the procedural-content use-case.Kinzer

© 2022 - 2025 — McMap. All rights reserved.