Tests for randomness


Randomness tests (or tests for randomness), in data evaluation, are used to analyze the distribution of a set of data to see if it is random (patternless). In stochastic modeling, as in some computer simulations, the hoped-for randomness of potential input data can be verified by a formal test for randomness, to show that the data are valid for use in simulation runs. In some cases, data reveal an obvious non-random pattern, as with so-called "runs in the data" (such as expecting random digits 0–9 but finding "4 3 2 1 0 4 3 2 1..." and rarely going above 4). If a selected set of data fails the tests, then parameters can be changed or other randomized data can be used which does pass the tests for randomness. One classical check for runs is sketched below.
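As an illustrative sketch (not part of any standard test suite), the Wald–Wolfowitz runs test counts maximal blocks of identical symbols and flags sequences whose run count deviates strongly from what randomness predicts; the function name runs_test_z and the example data are choices made here for illustration.

```python
import math

def runs_test_z(bits):
    """Wald-Wolfowitz runs test: z-score for the number of runs
    in a binary sequence under the null hypothesis of randomness."""
    n1 = sum(bits)            # count of ones
    n2 = len(bits) - n1       # count of zeros
    if n1 == 0 or n2 == 0:
        raise ValueError("constant sequence; runs test undefined")
    # A run is a maximal block of identical symbols.
    runs = 1 + sum(1 for a, b in zip(bits, bits[1:]) if a != b)
    # Mean and variance of the run count under randomness.
    mean = 2 * n1 * n2 / (n1 + n2) + 1
    var = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)) / (
        (n1 + n2) ** 2 * (n1 + n2 - 1))
    return (runs - mean) / math.sqrt(var)

# A strictly alternating sequence has far too many runs and
# yields a large |z|, flagging it as non-random.
print(runs_test_z([0, 1] * 50))   # roughly z = 9.85
```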

There are many practical measures of randomness for a binary sequence. These include measures based on statistical tests, transforms, and complexity, or a mixture of these. A widely used collection of such tests, introduced by George Marsaglia, is called the Diehard Battery of Tests.
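As a minimal sketch of the statistical-test category (a simplified form of the frequency, or monobit, test found in standard batteries; the name monobit_chi2 is illustrative), the check below asks whether zeros and ones occur about equally often:

```python
def monobit_chi2(bits):
    """Chi-squared statistic for the frequency (monobit) test:
    are zeros and ones equally likely in the sequence?"""
    n = len(bits)
    ones = sum(bits)
    zeros = n - ones
    expected = n / 2
    # One degree of freedom: chi2 > 3.84 rejects fairness
    # at the 5% significance level.
    return ((ones - expected) ** 2 + (zeros - expected) ** 2) / expected

# A sequence with 60 ones in 100 bits gives chi2 = 4.0,
# just past the 5% rejection threshold.
print(monobit_chi2([1] * 60 + [0] * 40))
```

A single test like this is necessary but far from sufficient: a sequence such as "0101..." passes the frequency test perfectly while failing the runs test above, which is why batteries combine many complementary measures.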

The issue of randomness is an important philosophical and theoretical question. Tests for randomness can be used to determine whether a data set has a recognisable pattern, which would indicate that the process that generated it is significantly non-random.

Many "random number generators" in use today are defined algorithms, and so are actually pseudo-random number generators. The sequences they produce are called pseudo-random sequences. These generators do not always generate sequences which are sufficiently random, but instead can produce sequences which contain patterns. For example, the infamous RANDU fails many randomness tests dramatically, including the Spectral Test. Wolfram used randomness tests on the output of Rule 30 to examine its potential for generating random numbers, though it was shown to have an effective key size far smaller than its actual size and to perform poorly on a chi-squared test. The use of an ill-conceived random number generator can put the validity of an experiment in doubt by violating statistical assumptions. Though there are commonly used statistical testing techniques such as NIST standards, Yongge Wang showed that NIST standards are not sufficient. Furthermore, Yongge Wang designed statistical–distance–based and law–of–the–iterated–logarithm–based testing techniques. Using this technique, Yongge Wang and Tony Nicol detected the weakness in commonly used pseudorandom generators such as the well known Debian version of OpenSSL pseudorandom generator which was fixed in 2008.

