Port of Random generator from C to Java?
Asked Answered
B

6

20

George Marsaglia has written an excellent random number generator that is extremely fast, simple, and has a much higher period than the Mersenne Twister. Here is the code with a description:

good C random number generator

I wanted to port the CMWC4096 code to Java, but it uses several unsigned datatypes so I am not sure how to do this properly. Here is the full C code:

/* choose random initial c<809430660 and */
/* 4096 random 32-bit integers for Q[]   */
static unsigned long Q[4096],c=362436;

unsigned long CMWC4096(void) {
    unsigned long long t, a=18782LL;
    static unsigned long i=4095;
    unsigned long x,r=0xfffffffe;
    i = (i+1) & 4095;
    t = a*Q[i] + c;
    c = (t>>32);
    x = t + c;
    if (x < c) {
        x++;
        c++;
    }
    return (Q[i] = r - x);
}

Can anyone port this to Java? How does this work when you only have signed numbers available?

EDIT: Thanks everybody for the quick answers! For the first 100 million numbers this java code seems to produce the same result as the C code. It is 3 times faster than Java's java.util.Random.

public class ComplimentaryMultiplyWithCarryRandom {

    /**
     * Choose 4096 random 32-bit integers
     */
    private long[] Q;

    /**
     * choose random initial c<809430660
     */
    private long c = 362436;

    private int i;

    public ComplimentaryMultiplyWithCarryRandom() {
        Random r = new Random(1);
        Q = new long[4096];

        // TODO initialize with real random 32bit values
        for (int i = 0; i < 4096; ++i) {
            long v = r.nextInt();
            v -= Integer.MIN_VALUE;
            Q[i] = v;
        }
        i = 4095;
    }

    int next() {
        i = (i + 1) & 4095;
        long t = 18782 * Q[i] + c;
        c = t >>> 32;
        long x = (t + c) & 0xffffffffL;
        if (x < c) {
            ++x;
            ++c;
        }

        long v = 0xfffffffeL - x;
        Q[i] = v;
        return (int) v;
    }
}
Brace answered 29/12, 2008 at 15:8 Comment(0)
P
14

Can anyone port this to Java? How does this work when you only have signed numbers available?

No Stress! a=18782 so the largest t could ever be is not large enough to cause signed vs. unsigned problems. You would have to "upgrade" the result of using Q to a value equal to a 32-bit unsigned number before using it anywhere. e.g. if Q is an int (32-bit signed) then you'd have to do this before using it in the t=a*Q[i]+c statement, e.g.

t=a*(((long)Q[i])&0xffffffffL)+c

where this (((long)Q[i])&0xffffffffL) business promotes Q[i] to a 64-bit # and ensures its high 32 bits are 0's. (edit: NOTE: you need 0xffffffffL here. Java does the wrong thing if you use 0xffffffff, it seems like it "optimizes" itself to the wrong answer & you get a negative number if Q[i]'s high bit is 1.)

You should be able to verify this by running the algorithms in C++ and Java to compare the outputs.

edit: here's a shot at it. I tried running it in C++ and Java for N=100000; they both match. Apologies if I used bad Java idioms, I'm still fairly new to Java.

C++:

// marsaglia2003.cpp 

#include <stdio.h>
#include <stdlib.h> // for atoi

class m2003
{
    enum {c0=362436, sz=4096, mask=4095};
    unsigned long Q[sz];
    unsigned long c;
    short i;

public:
    m2003()
    {
        // a real program would seed this with a good random seed
        // i'm just putting in something that makes the output interesting
        for (int j = 0; j < sz; ++j)
            Q[j] = j + (j << 16);
        i = 4095;
        c = c0;
    }

    unsigned long next()
    {
        unsigned long long t, a=18782LL;
        unsigned long x;
        unsigned long r=0xfffffffe;
        i = (i+1)&mask;
        t=a*Q[i]+c;
        c=(unsigned long)(t>>32);
        x=(unsigned long)t + c;
        if (x<c)
        {
            x++;
            c++;
        }
        return (Q[i]=r-x);
    }
};

int main(int argc, char *argv[])
{
    m2003 generator;
    int n = 100;
    if (argc > 1)
        n = atoi(argv[1]);

    for (int i = 0; i < n; ++i)
    {
        printf("%08x\n", generator.next());
    }
    return 0;
}

java: (slower than compiled C++ but it matches for N=100000)

// Marsaglia2003.java

import java.util.*;

class Marsaglia2003
{
    final static private int sz=4096;
    final static private int mask=4095;
    final private int[] Q = new int[sz];
    private int c=362436;
    private int i=sz-1;

    public Marsaglia2003()
    {
        // a real program would seed this with a good random seed
        // i'm just putting in something that makes the output interesting
        for (int j = 0; j < sz; ++j)
            Q[j] = j + (j << 16);
    }

  public int next() 
    // note: returns a SIGNED 32-bit number.
    // if you want to use as unsigned, cast to a (long), 
    // then AND it with 0xffffffffL
    {
        long t, a=18782;
        int x;
        int r=0xfffffffe;
        i = (i+1)&mask;
        long Qi = ((long)Q[i]) & 0xffffffffL; // treat as unsigned 32-bit
        t=a*Qi+c;
        c=(int)(t>>32); 
           // because "a" is relatively small this result is also small

        x=((int)t) + c;
        if (x<c && x>=0) // tweak to treat x as unsigned
        {
            x++;
            c++;
        }
        return (Q[i]=r-x);
    }

    public static void main(String args[])
    {
        Marsaglia2003 m2003 = new Marsaglia2003();

        int n = 100;
        if (args.length > 0)
            n = Integer.parseInt(args[0]);
        for (int i = 0; i < n; ++i)
        {
            System.out.printf("%08x\n", m2003.next());
        }
    }
};
Pascha answered 29/12, 2008 at 15:48 Comment(0)
O
52

Most of the time there is no need to use larger numeric types for simulating unsigned types in Java.

For addition, subtraction, multiplication, shift left, the logical operations, equality and casting to a smaller numeric type it doesn't matter whether the operands are signed or unsigned, the result will be the same regardless, viewed as a bit pattern.

For shifting to the right use >> for signed, >>> for unsigned.

For signed casting to a larger type just do it.

For unsigned casting from a smaller type to a long use & with a mask of type long for the smaller type. E.g., short to long: s & 0xffffL.

For unsigned casting from a smaller type to an int use & with a mask of type int. E.g., byte to int: b & 0xff.

Otherwise do like in the int case and apply a cast on top. E.g., byte to short: (short) (b & 0xff).

For the comparison operators < etc. and division the easiest is to cast to a larger type and do the operation there. But there also exist other options, e.g. do comparisons after adding an appropriate offset.

Onomastic answered 29/12, 2008 at 16:5 Comment(2)
Good summary. Although you forgot one critical operation: casting to larger types. (e.g. cast a 32-bit to 64-bit #.) You need to AND the result with a mask so that it interprets the original as "unsigned".Pascha
I didn't see it that way then, but you are right. Without the mask the cast is signed, with the mask unsigned. Usually you don't even need to make the cast explicit, as & with a suitable value does widening.Onomastic
P
14

Can anyone port this to Java? How does this work when you only have signed numbers available?

No Stress! a=18782 so the largest t could ever be is not large enough to cause signed vs. unsigned problems. You would have to "upgrade" the result of using Q to a value equal to a 32-bit unsigned number before using it anywhere. e.g. if Q is an int (32-bit signed) then you'd have to do this before using it in the t=a*Q[i]+c statement, e.g.

t=a*(((long)Q[i])&0xffffffffL)+c

where this (((long)Q[i])&0xffffffffL) business promotes Q[i] to a 64-bit # and ensures its high 32 bits are 0's. (edit: NOTE: you need 0xffffffffL here. Java does the wrong thing if you use 0xffffffff, it seems like it "optimizes" itself to the wrong answer & you get a negative number if Q[i]'s high bit is 1.)

You should be able to verify this by running the algorithms in C++ and Java to compare the outputs.

edit: here's a shot at it. I tried running it in C++ and Java for N=100000; they both match. Apologies if I used bad Java idioms, I'm still fairly new to Java.

C++:

// marsaglia2003.cpp 

#include <stdio.h>
#include <stdlib.h> // for atoi

class m2003
{
    enum {c0=362436, sz=4096, mask=4095};
    unsigned long Q[sz];
    unsigned long c;
    short i;

public:
    m2003()
    {
        // a real program would seed this with a good random seed
        // i'm just putting in something that makes the output interesting
        for (int j = 0; j < sz; ++j)
            Q[j] = j + (j << 16);
        i = 4095;
        c = c0;
    }

    unsigned long next()
    {
        unsigned long long t, a=18782LL;
        unsigned long x;
        unsigned long r=0xfffffffe;
        i = (i+1)&mask;
        t=a*Q[i]+c;
        c=(unsigned long)(t>>32);
        x=(unsigned long)t + c;
        if (x<c)
        {
            x++;
            c++;
        }
        return (Q[i]=r-x);
    }
};

int main(int argc, char *argv[])
{
    m2003 generator;
    int n = 100;
    if (argc > 1)
        n = atoi(argv[1]);

    for (int i = 0; i < n; ++i)
    {
        printf("%08x\n", generator.next());
    }
    return 0;
}

java: (slower than compiled C++ but it matches for N=100000)

// Marsaglia2003.java

import java.util.*;

class Marsaglia2003
{
    final static private int sz=4096;
    final static private int mask=4095;
    final private int[] Q = new int[sz];
    private int c=362436;
    private int i=sz-1;

    public Marsaglia2003()
    {
        // a real program would seed this with a good random seed
        // i'm just putting in something that makes the output interesting
        for (int j = 0; j < sz; ++j)
            Q[j] = j + (j << 16);
    }

  public int next() 
    // note: returns a SIGNED 32-bit number.
    // if you want to use as unsigned, cast to a (long), 
    // then AND it with 0xffffffffL
    {
        long t, a=18782;
        int x;
        int r=0xfffffffe;
        i = (i+1)&mask;
        long Qi = ((long)Q[i]) & 0xffffffffL; // treat as unsigned 32-bit
        t=a*Qi+c;
        c=(int)(t>>32); 
           // because "a" is relatively small this result is also small

        x=((int)t) + c;
        if (x<c && x>=0) // tweak to treat x as unsigned
        {
            x++;
            c++;
        }
        return (Q[i]=r-x);
    }

    public static void main(String args[])
    {
        Marsaglia2003 m2003 = new Marsaglia2003();

        int n = 100;
        if (args.length > 0)
            n = Integer.parseInt(args[0]);
        for (int i = 0; i < n; ++i)
        {
            System.out.printf("%08x\n", m2003.next());
        }
    }
};
Pascha answered 29/12, 2008 at 15:48 Comment(0)
S
5

If you are implementing an RNG in Java, it is best to sub-class the java.util.Random class and over-ride the protected next(int) method (your RNG is then a drop-in replacement for java.util.Random). The next(int) method is concerned with randomly-generated bits, not what vales those bits might represent. The other (public) methods of java.util.Random use these bits to construct random values of different types.

Stink answered 29/12, 2008 at 15:42 Comment(0)
E
2

To get around Java's lack of unsigned types you usually store numbers in a bigger variable type (so shorts get upgraded to ints, ints to long). Since you're using long variables here, you're going to have to step up to BigInteger, which will probably wreck any speed gains that you're getting out of the algorithm.

Eolic answered 29/12, 2008 at 15:16 Comment(2)
On the other hand, classes that use the RNG will, of course, also use Java's signed types otherwise, so if the full unsigned long range is required, the RNG will probably not be the only place with a speed problem.Greggs
Your interpretation of long in the original C code is incorrect. It refers to a 32-bit integer type, which corresponds to int in Java.Theorist
U
0

You can use signed numbers provided the values don't overflow...for example long in java is a 64 bit signed integer. However the intent in this algorithm seems to be to use a 64 bit unsigned value, and if so I think you would be out of luck with the basic types.

You could use the multiprecision integers provided in the java class libraries (BigInteger). Or you could implement your own 64 bit unsigned type as an Object containing two java longs to represent the least significant and most significant words (but you'd have to implement the basic arithmetic operations yourself in the class).

Unfortunate answered 29/12, 2008 at 15:16 Comment(4)
Impressively old link. There is no java.lang.Bignum class, I assume that it became java.math.BigInteger before Java 1.1 went final.Stink
Whoops - you're right! That shows you how long it is since I needed to use MP ints. I'll fix that now.Unfortunate
Your interpretation of long in the original C code is incorrect. It refers to a 32-bit integer type, which corresponds to int in Java.Theorist
Interestingly, the old link to java.math.Bignum shows that it is a fusion of both BigInteger and BigDecimal. I am thankful for the final design, which is much clearer. Plus, I use BigInteger often but BigDecimal very rarely, so it's good that I don't need to pay attention to the BigDecimal methods when working with BigInteger.Theorist
T
0

Note: In your C code, I inferred that long is 32 bits wide, and long long is 64 bits wide.

Here is my way of porting that code to Java with the minimum number of changes:

/* choose random initial 0<=c<809430660 and */
/* 4096 random 32-bit integers for Q[]      */
int[] Q = new int[4096];
int c = 362436;
int i = 4095;

int CMWC4096() {
    long a = 18782;
    int r = 0xfffffffe;
    i = (i + 1) & 4095;
    long t = a * Q[i] + c;
    c = (int)(t >>> 32);
    int x = (int)(t + c);
    if (0 <= x && x < c) {
        x++;
        c++;
    }
    return (Q[i] = r - x);
}
Theorist answered 22/8, 2022 at 3:34 Comment(0)

© 2022 - 2024 — McMap. All rights reserved.