WARNING: Alpha code. I'm not happy with the amount of avalanche, or with it using a multiplication. This is indefinitely in alpha until I have time to explore this space properly. I may get a better idea and replace these and still call them the same name. In the meantime you are encouraged to time it and detect bias in it. It's faster than JFS64 on my home machine, but slower on my work machine. The difference seems to be that my work machine has a slow multiply. Chips with slow multiplies are very common.
WOB2M (Wrangler Of Bits, 2 mixing variables, with Multiply) is a 64-bit noncryptographic pseudorandom number generator. It uses 1 counter and 2 mixing variables. It is reversible, but does not have the ability to jump forward or back by arbitrary amounts. All 1-mix 1-counter functions were too slow, so there will be no WOB or WOB1 or WOB1M. WOB2, WOB3, WOB3M are likely in the future.
#include <stdint.h> // 64-bit rotate left #define ROTL(d,lrot) ((d<<(lrot)) | (d>>(8*sizeof(d)-(lrot)))) class Wob2m { public: Wob2m(uint64_t seed1=0, uint64_t seed2=0) { m_a = seed1; m_b = seed2; m_count = (uint64_t)-10; for (int i=0; i<10; i++) (void)Next(); } uint64_t Next() { uint64_t temp = m_a + m_count++; m_a = m_b + ROTL(temp, 12); return m_b = 0x0581af43eb71d8b3 * temp ^ ROTL(m_a, 28); } uint64_t Prev() { uint64_t temp = 0x6cc3621b095c967b * (m_b ^ ROTL(m_a, 28)); m_b = m_a - ROTL(temp, 12); m_a = temp - --m_count; return m_b; } uint64_t m_a, m_b, m_count; }; |
Its cycle length is guaranteed to be at least 264 values because it takes that long for the counter to wrap around. Every seed produces a distinct sequence for at least 264 values, because every seed starts with the same counter value but different other state, so it takes at least 264 values before the counter wraps around and the rest of the state gets a chance to collide with the initial state of some other seed. The longest cycle is expected to be of length 2191, containing roughly half of all possible states, and the average cycle length is expected to be 2190. Two arbitrary seeds will land on the same cycle with probability about 1/3, but that is OK because the sequences won't overlap for at least 264 values.
The design philosophy is the same as for smallprng.html. The internal state had to be at least 128 bytes, the mix should be reversible and nonlinear (yielding no obvious cycle length), and the result should be part of the internal state at some point in mixing the internal state (no wasted computation just for producing the result). The counter just multiplies that uncertain cycle length by 264. If there are n mixing variables, the most likely bias will be in subsequences of length n+1. These can be tested by initializing a pair of states to differ by 1 bit then walking forward or backwards n+1 results, and comparing the pair of results (test2.cpp). Test2 found that states differing in 1 bit tend to differ in at least 5 bits for results 2+1=3 away (forward or backward). That's lower than I'd like, but maybe it is OK? Diffs in all bit positions usually cause a difference of at least 26 bits for results 4 away. Testing states that differ in 2 bits seems relevant too, but was not done.
It is uncommon for states to differ in only a few bits, even more so with 64-bit values than 32-bit values. Subsequences can be tested directly looking for bias. Frequency, gap, and run tests pass for at least 1 trillion values for each of the 16 4-bit ranges 0..3, 4..7, on up to 60..63. The strongest test on subsequences known is countx.cpp, which tests how many bits are set in each of 3 consecutive values. countx.cpp passes for at least 32 trillion values. Ditto for 4 consecutive values.
The Init() routine calls Next() 10 times. Calling it 5 times caused detectable bias after a billion initializations of seed2 with ROTL(counter,60), but 6 did not detect bias (within a billion initializations) with a counter at any bit position. So 10 calls is overkill, but not extremely so.
I used dice to generate possible generators. Finding random number generators is an exercise in exploring possibility spaces. That is done considering all possibilities for a sequence of small decisions, where each decision leaves the later decisions undecided. The first decision is the number of mixing variables: 1,2,3,4,5. The second is the arrangement of variables: screen it both for avalanche (exceedingly good mixing of each variable gives an upper bound on avalanche) and also time (just adding each variable gives a lower bound on time). The third is the type of mixing of each variable. The fourth is using +, -, or ^ to combine variables, and parenthesization. The fifth is the amounts for rotates and shifts. The sixth is fine tuning constants for addition and multiplication. The amount of shifts and the constants used tend not to affect timing. I should try unrolling the loop two or four iterations and using different constants for those unrolled iterations.
My small 32-bit noncryptographic PRNGs from 2007
My cryptographic PRNG from 1994
Text for a talk I gave in 2007 on designing smallstate PRNGs
Hash functions for 32-bit integers
Table of Contents