Stephen Wolframstephenwolfram.com
Publications by Stephen Wolfram * Articles * Cellular Automata * Random Sequence Generation by Cellular Automata (1986)
Random Sequence Generation by Cellular Automata (1986)


Appendix A: Statistical Procedures

This Appendix describes the statistical randomness testing procedures used in Section 10. The procedures are mostly taken from [1], although their numbering has been changed slightly. The basic method in each case is to compare an observed distribution with that calculated for a purely probabilistic sequence.

The sequences studied consist of strings of binary bits. In many of the tests, these bits are grouped into blocks: either length 8 (non-overlapping) bytes, or length 4 (non-overlapping) nybbles. The possible bit sequences in these blocks can be represented by integer ``values'' between 0 and 255 or 16, respectively.

A. Block Frequency Distribution.

Each of the possible -blocks should occur with equal frequency. ( is used.)

B. Gap Length Distribution.

The lengths of runs of -blocks whose values are all greater than or less than should follow a binomial distribution. (, , are used; runs longer than 16 blocks are lumped together.)

C. Distinct Blocks Distribution.

The frequencies with which out of successive -blocks are distinct should follow a definite distribution. (, are used.)

D. Block Accumulation Distribution.

The number of successive -blocks necessary for all possible -blocks to appear in order as their first elements should follow a definite distribution. (, are used; numbers greater than 40 are lumped together.)

E. Permutation Frequency Distribution.

The values of successive -blocks should occur in all possible orderings with equal frequency. (, are used.)

F. Monotone Sequence Length Distribution.

The lengths of sequences in which successive -blocks have monotonically increasing values should follow a definite distribution. ( is used; lengths greater than 6 are lumped together; elements immediately following each run are discarded to make successive runs statistically independent.)

G. Maxima Distribution.

The maximum values of -blocks in sequences of -blocks should follow a power law distribution. (, are used.)

previous  l  next