Random Number Results

Here's some code and analysis related to random numbers.

Here are some random facts about pseudorandom number generation.

• The Birthday Paradox. If you choose 1 out of N things sqrt(2N) times, you'll get about 1 duplicate value.
• A random number generator is reversible if every state has a unique predecessor; usually it's easy to deduce what that predecessor was. Irreversible generators can have multiple states map to the same next state.
• A reversible generator with S possible states has an expected cycle length of (S+1)/2, that is about S/2.

Proof: the probability of a cycle of length 1 is 1/S. Of length 2 is ((S-1)/S)(1/(S-1))=(1/S). And so on. The sum of i/S for i in 1..S is S(S+1)/2, and the average is (S+1)/2.

• An irreversible generator has an expected cycle length of about sqrt(S). References for these are
• V. F. Kolchin, "Random Mappings", Optimization Software Inc., New York 1986.
• P. Flajolet and A. M. Odlyzko, "Random Mapping Statistics", Advances in Cryptology -- EUROCRYPT '89, Springer-Verlag Lecture Notes in Computer Science #434 pp 329-354.
• Theorem SQRT
• Reversible generators often have a single cycle longer than S/2, that is, it contains over half of all possible internal states. This makes all values occur with equal frequency in reversible generators. If a test samples values that are smaller than the internal state (say, sampling 5 consecutive numbers when the internal state stores 256 numbers), over the long run things will look perfectly random. The tests that actually find bias are the ones that consider samples bigger than the internal state (for example, Knuth's gap test for gaps up to 512 values).
• If f(a) is a permutation of 0..N-1, then (f(a)+a mod N) is not a permutation of 0..N-1 if N is even. Proof: sum(f(a)) mod N = sum(a) mod N = N/2, but sum(f(a)+a) mod N/2 = 0. Combining a with f(a) is a common source of subtle biases in random number generators.
• Suppose you are testing something of the form f(x,f(y,f(x,f(y)))) and you want to know whether this method of combining f(x) with x will produce bias. Well, write a little program with a loop for x and a loop for y, generate all possible values, and see. It helps to shrink x and y first so they only have like 8 values apiece.
• If N/b items are hashed into N buckets, you should expect N/(b+.5) of the buckets to get filled. See Theorem BUCKETS.

Here are some other random number generator related pages.