
5

Generation of Random Numbers

What is a random number generator? The aim of a random
number generator is to provide a stream of numbers that meet
the requirements of randomness. Randomness excludes the presence
of any pattern, whether recognizable or otherwise. Applications
of random numbers are varied and include music and graphic
composition, numerical analysis, cryptology, security protocols,
computer games, virtual casinos on the internet, and the study of
manufacturing systems. Random numbers in
use can be classified into four categories as described
below.

1. True random numbers: True random numbers, such as those
used in cryptography, cannot be produced by any
mathematical algorithm. They must be generated from
naturally occurring phenomena such as radioactive decay
of isotopes or electrical noise from a resistor or
semiconductor.
2. Pseudo-random numbers: The most commonly used method of
generating random numbers today uses a computer
algorithm. Since the random numbers are generated by a
mathematical algorithm, these are called pseudo-random
numbers as opposed to true random numbers. These
pseudo-random numbers, however, must show statistical
independence and follow a uniform distribution to qualify
as valid random numbers for use in a simulation study.
3. Quasi-random numbers: These are generated by
algorithms formulated to optimize uniformity of the distribution in order to
improve the accuracy of the Monte Carlo method. The random
numbers generated are not independent and are not
suitable for general use.
4. Hardware-based random numbers: The only difference
between these and pseudo-random numbers is that these
are produced by an algorithm encoded in the hardware.

Pseudo-random numbers and their generators are of interest
in a simulation study, be it of a discrete event system or a
continuous system. A pseudo-random number generator uses
mathematical algorithms to generate i.i.d. (independent and
identically distributed) U(0, 1) random numbers. These random
numbers are used primarily to generate random variables from
other distributions as well as to model flow logic in a system. A
number of (pseudo) random number generators (RNGs) are
currently available for use. A good RNG should have the following
properties:

Good statistical properties
Long cycle length: Any algorithm used as a generator
starts repeating the sequence after a finite length. A good
RNG should have a very large cycle length.
Good reproduction: A starting seed leads to a unique
sequence of random numbers and repeats the sequence
whenever the seed is repeated.
Insensitivity to seeds: Randomness and cycle length should
not depend on the seed.
Maximum density: The random number sequence
generated should not have large gaps in its range [0, 1].
Fast execution: The algorithm must execute quickly. In a
complex system, a generator may be called millions of
times during a simulation run.
Portability: Ideally, an RNG must produce identical results
across several computing platforms.

5.1 GENERATING RANDOM NUMBERS

There are several methods used for generating random numbers.
Some of the most common methods are described below.

5.1.1 Linear Congruential Method (LCG)

The most widely used random number generators are based on
linear recurrences of the following form:

$x_i = (a_1 x_{i-1} + \cdots + a_k x_{i-k}) \bmod m$

where the modulus m and the order k of the recurrence are positive
integers and the coefficients $a_j$ belong to the set $\{0, 1, \ldots, m - 1\}$.
If m is a prime number and the $a_j$ satisfy certain conditions,
the sequence $\{x_i, i \ge 0\}$ has the maximal period of
length $m^k - 1$.

The random number returned at each step is given by $R_i = x_i / m$.

This generator is known as a multiple recursive generator (it has no
additive constant, that is, c = 0), and for k = 1 we obtain the classical multiplicative linear
congruential generator (MLCG). A commonly used random
number generator that is a variant of the LCG with k = 1 has the
following form:

$x_{i+1} = (a x_i + c) \bmod m$

and yields integers $x_1, x_2, \ldots$ that depend on the starting
seed $x_0$. The random number sequence is generated by dividing
each $x_i$ by m. This leads to a sequence of numbers between 0 and
1. The parameter a is called the multiplier, c is called the
increment, and m the modulus. For c = 0, we obtain an MLCG. For
an LCG, m > 0, m > a, m > c, and $x_0 < m$ hold true. The
parameters of an LCG, namely a, c, and m, must be selected
carefully to avoid degeneracy. A full period generator is defined as
one with cycle length = m. The following conditions on the
parameters a, c, and m must be satisfied for an LCG to be a full
period generator.

1. c and m must be relatively prime, that is, 1 is the only positive
integer that divides both. For example, 4 and 13 are relatively
prime even though 4 is not a prime number.
2. Every prime number p that divides m must also
divide (a - 1).
3. If m is divisible by 4, then 4 must divide (a - 1).

All three conditions must be satisfied simultaneously for an LCG to
be a full period generator.
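
The three conditions can be checked mechanically. The following Python sketch is one possible way to do so; the function names `prime_factors` and `is_full_period` are illustrative and do not come from the text, and trial division is assumed to be adequate because the moduli used in the examples are small.

```python
from math import gcd


def prime_factors(n):
    """Return the set of prime factors of n (trial division, fine for small n)."""
    factors = set()
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def is_full_period(a, c, m):
    """Check the three full period conditions for x_{i+1} = (a*x_i + c) mod m."""
    cond1 = gcd(c, m) == 1                                   # c and m relatively prime
    cond2 = all((a - 1) % p == 0 for p in prime_factors(m))  # each prime factor of m divides a - 1
    cond3 = (m % 4 != 0) or ((a - 1) % 4 == 0)               # if 4 divides m, 4 divides a - 1
    return cond1 and cond2 and cond3


print(is_full_period(5, 4, 16))  # c = 4 and m = 16 share a common factor
print(is_full_period(5, 3, 16))  # all three conditions hold
```

Note that gcd(0, m) = m, so the check also reports that a generator with c = 0 and m > 1 is not full period.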

For example, the generator $x_{i+1} = (5 x_i + 4) \bmod 16$
degenerates promptly when started with a seed of $x_0 = 15$. The
generator returns the single value 15/16 repeatedly. This
generator does not satisfy the first of the three conditions
mentioned earlier, since 4 and 16 are not relatively prime.
Another example of a degenerate RNG is provided in Section 5.1.2.

However, if the parameter c is changed to 3, we have an LCG of
the form:

$x_{i+1} = (5 x_i + 3) \bmod 16$

This generator satisfies all three conditions required of a full
period generator. Starting with a seed $x_0 = 10$, the generator
returns the sequence 5, 12, 15, 14, 9, 0, 3, 2, 13, 4, 7, 6, 1, 8, 11, 10
and then the sequence repeats. The cycle length 16 = m shows
that it is a full period generator. A change of seed changes the
starting number in the sequence but does not change the cycle
length.
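
The cycle of a small LCG can be listed by iterating the recurrence until a value repeats. The sketch below does this for the two generators discussed above; the helper name `lcg_cycle` is an illustrative choice, and the approach is only practical for small m because every value visited is stored.

```python
def lcg_cycle(a, c, m, seed):
    """Iterate x_{i+1} = (a*x_i + c) mod m from the seed and return the
    values produced before the first repeated value."""
    produced = []
    seen = set()
    x = seed
    while True:
        x = (a * x + c) % m
        if x in seen:
            break
        seen.add(x)
        produced.append(x)
    return produced


full = lcg_cycle(5, 3, 16, seed=10)
print(full, len(full))               # 16 distinct values, so the period equals m

degenerate = lcg_cycle(5, 4, 16, seed=15)
print(degenerate, len(degenerate))   # a single value that repeats forever
```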

If c = 0, the method is called the Multiplicative Congruential Method
and the generator is called the Multiplicative Linear Congruential
Generator (MLCG). An MLCG is not a full period generator since it
fails to satisfy condition 1 described earlier: with c = 0, every
divisor of m divides both c and m, so they are not relatively prime. The
parameter m must be very large to guarantee a large cycle
length.

5.1.2 A Degenerate Random Number Generator

The following example illustrates how a bad random number
generator can distort the randomness of an element in a
simulation model.

Let us assume that the time between random failures of a machine in
a system under study follows an Exponential distribution with a
mean of 120 min. This may be a representative value for the average
time between failures, implying that a failure occurs on average
every 2 hours. During a simulation run, a machine will break
down every now and then, and the frequency of breakdown
depends on the length of the run, the statistical distribution used
to model the time between failures, and the distribution
parameters. Any simulation software available today has a built-in
random number generator that guarantees a very large cycle
length. A large cycle length implies that the generator supplies a
large number of non-repeating random numbers before the cycle
repeats. A very large cycle length is necessary to faithfully
reproduce the randomness of an element.

Let us assume that we use the following LCG as our random
number generator:

$Z_{i+1} = (5 Z_i + 4) \bmod 16$

If we start with a seed of 5 ($Z_0 = 5$), the sequence of random
numbers generated is as follows:

$Z_1 = 13, Z_2 = 5, Z_3 = 13, Z_4 = 5, \ldots$

As we can see here, the generator degenerates and returns only the
numbers $R_1 = 13/16$ and $R_2 = 5/16$. Each value of the
random number generates a value for time between failures from
the exponential (120) distribution. In this example, the generator
returns only two values of time between failures. These values
are:

$\mathrm{tbf}_1 = -120 \ln(13/16) = 24.92$ min and $\mathrm{tbf}_2 = -120 \ln(5/16) = 139.58$ min.
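
The two times between failures can be reproduced with a short Python sketch. It assumes the inverse-transform relation t = -mean * ln(R) for the exponential distribution, which is consistent with the values quoted above; the function names `lcg_stream` and `exponential_from_uniform` are illustrative.

```python
import math


def lcg_stream(a, c, m, seed, n):
    """Return n numbers R_i = Z_i / m from Z_{i+1} = (a*Z_i + c) mod m."""
    out = []
    z = seed
    for _ in range(n):
        z = (a * z + c) % m
        out.append(z / m)
    return out


def exponential_from_uniform(r, mean):
    """Inverse transform assumed here: t = -mean * ln(r), valid for 0 < r <= 1."""
    return -mean * math.log(r)


# The degenerate generator with seed 5 only ever returns 13/16 and 5/16.
rs = lcg_stream(5, 4, 16, seed=5, n=6)
tbfs = [round(exponential_from_uniform(r, 120.0), 2) for r in rs]
print(rs)
print(tbfs)
```

However long the run, only two distinct times between failures appear, and they alternate.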

Table 5.1 shows the original distribution and how it has been
changed for a simulation study by a degenerate random number
generator.

In this case, the exponential distribution has been reduced to a
discrete distribution with only two values for time between failures.
Each value carries equal probability. The variability of the
distribution has been changed significantly by a bad random
number generator. The random nature of time between failures
has been replaced by two alternating times between failures. One
must realize that this is not how a machine in the real world
behaves and a simulation study based on such a distribution will
generate erroneous results. This simple example shows the
importance of a good random number generator in a simulation model.

Figure 5.1 Original distribution: exponential (120)

Figure 5.2 Changed distribution: disc(0.5, 5/16, 1.0, 13/16)

Table 5.1 Distribution parameters

Distribution                                         Mean of the distribution   Standard deviation
Original distribution: exponential (120)             120                        120
Changed distribution: disc(0.5, 5/16, 1.0, 13/16)    82.24                      57.53

5.2 LAGGED FIBONACCI GENERATORS (LFG)


Lagged Fibonacci generators improve on LCGs by using more than one
seed integer. They are based on the lagged Fibonacci
sequence $X_n = (X_{n-p} \odot X_{n-q}) \bmod m$, where $\odot$ represents an
arithmetic operation such as +, -, *, or XOR. p and q are the
lags and must be chosen carefully to give a large cycle length.
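
As an illustration, the sketch below implements an additive lagged Fibonacci generator, that is, the recurrence above with + as the operation. The function name `lagged_fibonacci` and the tiny lags in the usage example are assumptions made for readability; practical generators use much larger lags (the pair 24 and 55 is often cited) and need q seed values.

```python
from collections import deque


def lagged_fibonacci(seed_values, p, q, m, n):
    """Additive LFG: X_n = (X_{n-p} + X_{n-q}) mod m with 0 < p < q.

    seed_values holds the q most recent values, oldest first."""
    if not 0 < p < q or len(seed_values) != q:
        raise ValueError("need 0 < p < q and exactly q seed values")
    state = deque(seed_values, maxlen=q)  # state[-1] is X_{n-1}, state[0] is X_{n-q}
    out = []
    for _ in range(n):
        x = (state[-p] + state[-q]) % m
        state.append(x)                   # the oldest value drops out automatically
        out.append(x)
    return out


# With lags p = 1, q = 2 the recurrence is the ordinary Fibonacci sequence mod m.
print(lagged_fibonacci([0, 1], p=1, q=2, m=10, n=8))
```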

5.3 COMBINATION GENERATORS

Combination generators combine two generators, expressed
as $S_n = (X_n \odot Y_n) \bmod m$, where $\odot$ denotes an arithmetic
operation such as addition or subtraction. Such generators give longer periods
and empirical tests show that the sequences are better
synchronized. Component generators consisting of two LCGs, or of an
LCG and an LFG, have been studied. The old random number
generator used by Arena software is the LCG16807, which is
described in Section 5.5. It is not a full period generator
and has a maximum cycle length of $m - 1 \approx 2.1$ billion. This cycle
length is no longer adequate for the complex systems now studied
using simulation. With the increase in computing speed, this
generator will repeat the sequence of random numbers even in a
short simulation run.

5.4 THE NEW RNG USED IN ARENA

A new random number generator has now been installed in Arena
that replaces the old MLCG. The new generator can be described
as a combined multiple recursive generator. It involves two
separate component generators, each based on the ideas of an
LCG. The following steps are followed by the new generator.

1. Start two component generators concurrently. The
generators are:

$A_n = (1403580 A_{n-2} - 810728 A_{n-3}) \bmod 4294967087$

$B_n = (527612 B_{n-1} - 1370589 B_{n-3}) \bmod 4294944443$

2. The generator combines these two values for each n as
follows:

$Z_n = (A_n - B_n) \bmod 4294967087$

3. The generator delivers the nth random number $R_n$ as
follows:

$R_n = Z_n / 4294967088$ if $Z_n > 0$, and $R_n = 4294967087 / 4294967088$ if $Z_n = 0$

The generator must define the six seed elements, the first
three $A_n$ values and the first three $B_n$ values.

The seeds $(A_0, A_1, A_2)$ and $(B_0, B_1, B_2)$ are required to start the generator. The
constants used in the component generators have been selected
through extensive research to guarantee a very large cycle length
and very good statistical properties of the random numbers
generated. Remember that the principal objective of a good
random number generator is to generate a very large sequence of
i.i.d. U(0, 1) random numbers. The old MLCG had a cycle length of
$2.1 \times 10^9$ compared to $3.1 \times 10^{57}$ for the new generator.
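
The three steps can be sketched in Python as follows. The two recurrences and the combination step use the constants given above. The delivery step is an assumption here: it follows the commonly published form of this generator, dividing by 4294967088 and mapping $Z_n = 0$ to 4294967087/4294967088. The function name `combined_mrg` and the seed values in the usage example are hypothetical and are not Arena's defaults.

```python
M1 = 4294967087  # modulus of the first component and of the combination step
M2 = 4294944443  # modulus of the second component


def combined_mrg(seed_a, seed_b, n):
    """Sketch of the combined multiple recursive generator.

    seed_a = (A0, A1, A2) and seed_b = (B0, B1, B2); returns n numbers in (0, 1)."""
    a = list(seed_a)  # holds A_{n-3}, A_{n-2}, A_{n-1}
    b = list(seed_b)  # holds B_{n-3}, B_{n-2}, B_{n-1}
    out = []
    for _ in range(n):
        a_n = (1403580 * a[1] - 810728 * a[0]) % M1   # uses A_{n-2} and A_{n-3}
        b_n = (527612 * b[2] - 1370589 * b[0]) % M2   # uses B_{n-1} and B_{n-3}
        a = [a[1], a[2], a_n]
        b = [b[1], b[2], b_n]
        z = (a_n - b_n) % M1
        # Assumed delivery step: never return exactly 0 or 1.
        out.append(z / (M1 + 1) if z > 0 else M1 / (M1 + 1))
    return out


# Hypothetical seeds chosen only for illustration.
print(combined_mrg((12345, 12345, 12345), (12345, 12345, 12345), 3))
```

Python's `%` operator returns a non-negative result for a positive modulus, so the differences inside the recurrences need no extra correction.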

5.5 SOME POPULAR GENERATORS

A brief description of each of a few widely used generators is
provided below.

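Java: This generator is part of the Java standard library. It is based
on a linear recurrence with period $2^{48}$, but each output value is
constructed by taking two successive values from the linear
recurrence, as follows:

$x_{i+1} = (25214903917 x_i + 11) \bmod 2^{48}$

$u_i = (2^{27} \lfloor x_{2i} / 2^{22} \rfloor + \lfloor x_{2i+1} / 2^{21} \rfloor) / 2^{53}$

A variant of the same generator is used in the UNIX standard
library.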

Visual Basic: The generator used in Microsoft Visual Basic is an
LCG with period length $2^{24}$, defined by

$x_i = (1140671485 x_{i-1} + 12820163) \bmod 2^{24}$, $u_i = x_i / 2^{24}$

Excel: The generator used in Microsoft Excel has the following
form:

$R_i = (9821.0 R_{i-1} + 0.211327) \bmod 1$

It is an LCG with the exception that its recurrence is implemented
directly for the $R_i$ in floating point arithmetic. Its period length
actually depends on the numerical precision of the floating point
numbers used for the implementation.

LCG16807: This LCG is defined by the following relationship:

$x_i = 16807 x_{i-1} \bmod (2^{31} - 1)$, $R_i = x_i / (2^{31} - 1)$

This generator has a period length of $2^{31} - 2$. It was first
proposed by Lewis, Goodman and Miller (1969). This LCG has
been widely used in many software libraries for statistics,
simulation, and optimization, as well as in operating system
libraries. This LCG has been suggested in several textbooks and
was used in earlier versions of Arena software. A similar LCG with
a different multiplier, 742938285, is used in AutoMod simulation
software.
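
A minimal Python sketch of LCG16807 follows, assuming the usual definition with multiplier 16807 and modulus $2^{31} - 1$. The function names are illustrative. Python integers do not overflow, so the product can be formed directly; an implementation in a language with 32-bit integers would need an overflow-safe multiplication.

```python
M = 2**31 - 1   # modulus (a prime number)
A = 16807       # multiplier


def lcg16807_ints(seed, n):
    """Return the next n integers x_i = 16807 * x_{i-1} mod (2**31 - 1)."""
    if not 0 < seed < M:
        raise ValueError("seed must satisfy 0 < seed < 2**31 - 1")
    out = []
    x = seed
    for _ in range(n):
        x = (A * x) % M
        out.append(x)
    return out


def lcg16807(seed, n):
    """Return n numbers R_i = x_i / (2**31 - 1), each strictly between 0 and 1."""
    return [x / M for x in lcg16807_ints(seed, n)]


print(lcg16807_ints(1, 3))
print(lcg16807(1, 3))
```

Because the modulus is prime and the seed is nonzero, the state never becomes 0, so the generator never returns exactly 0 or 1.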

5.6 STATISTICAL TESTS FOR A RANDOM NUMBER GENERATOR

The objective of a good random number generator is to generate
a stream of independent uniform random variables. The
successive values of $R_i$ in the stream should appear as
independent and identically distributed (i.i.d.) U(0, 1) variables.
Statistical tests are carried out to assess the quality of the
random numbers generated. Two types of tests are performed:
a test for uniformity (goodness-of-fit test) and a test for
independence. These tests are described below.

1. Frequency test: Uses the Kolmogorov-Smirnov or the Chi-Square
test to compare the distribution of the generated
random numbers with the Uniform (0, 1) distribution.
2. Autocorrelation test: Estimates the correlation between the
generated random numbers and compares the sample
correlation to the expected correlation of zero.

In testing for goodness-of-fit, the hypotheses are set up as
follows:

H0: The $R_i$ are i.i.d. U(0, 1) random variables ($R_i \sim U(0, 1)$)

HA: The $R_i$ do not follow the U(0, 1) distribution ($R_i \not\sim U(0, 1)$)

The test is carried out by comparing the behavior of a test
statistic as observed in the data to its expected behavior if the
null hypothesis is true. Failure to reject the null hypothesis does
not imply that the hypothesis is true. It only implies that the data
do not provide sufficient evidence to reject it at the selected level of significance
for the test. The result may change if either the sample size or the
level of significance for the test is changed.
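
As an illustration of a frequency test, the sketch below computes the Chi-Square statistic for k equal-width intervals on [0, 1]. The function name `chi_square_uniformity`, the choice of k = 10 intervals, and the use of Python's own `random` module as the generator under test are all assumptions made for the example. With 10 intervals there are 9 degrees of freedom, and the tabulated critical value at the 0.05 level of significance is 16.919.

```python
import random


def chi_square_uniformity(numbers, k):
    """Return (statistic, counts) for k equal-width intervals on [0, 1]."""
    n = len(numbers)
    counts = [0] * k
    for r in numbers:
        idx = min(int(r * k), k - 1)   # a value of exactly 1.0 goes in the last interval
        counts[idx] += 1
    expected = n / k                   # expected count per interval under H0
    stat = sum((o - expected) ** 2 / expected for o in counts)
    return stat, counts


rng = random.Random(2024)              # arbitrary seed, used only for repeatability
sample = [rng.random() for _ in range(1000)]
stat, counts = chi_square_uniformity(sample, k=10)
print(counts)
print(stat, "reject H0" if stat > 16.919 else "fail to reject H0")
```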

In testing for independence, the hypotheses are set up as follows:

H0: The autocorrelation is zero ($\rho_{jm} = 0$)

HA: The autocorrelation is not zero ($\rho_{jm} \ne 0$)

where $\rho_{jm}$ represents the lag m autocorrelation starting
with the jth random number in the sequence. Thus, j = 1 and m = 2
represent the autocorrelation among the random
numbers $R_1, R_3, R_5, R_7, \ldots, R_n$ in the sequence of random numbers
generated. A non-zero autocorrelation implies lack of
independence. The test must be performed for different values
of m. The property of independence implies that there is no
relationship between random numbers separated by lag m. A
good random number generator must generate an independent
sequence of random numbers.
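
The chapter does not give the estimator, so the sketch below uses one commonly quoted textbook form as an assumption: the sample autocorrelation is the average of the products $R_{j+km} R_{j+(k+1)m}$ for k = 0, ..., M minus 0.25, its standard error is $\sqrt{13M + 7} / (12(M + 1))$, and the ratio of the two is compared with the standard normal critical value (1.96 at the 0.05 level). M is the largest integer with j + (M + 1)m no greater than the sample size. The function name `autocorrelation_test` is illustrative.

```python
import math
import random


def autocorrelation_test(numbers, start, lag):
    """Return (rho_hat, z0) for the lag-`lag` autocorrelation starting at the
    1-based position `start`, using the assumed textbook estimator."""
    n = len(numbers)
    big_m = (n - start) // lag - 1     # largest M with start + (M + 1) * lag <= n
    if big_m < 0:
        raise ValueError("sequence too short for this start and lag")
    total = sum(
        numbers[start - 1 + k * lag] * numbers[start - 1 + (k + 1) * lag]
        for k in range(big_m + 1)
    )
    rho_hat = total / (big_m + 1) - 0.25
    sigma = math.sqrt(13 * big_m + 7) / (12 * (big_m + 1))
    return rho_hat, rho_hat / sigma


rng = random.Random(7)                 # arbitrary seed, used only for repeatability
sample = [rng.random() for _ in range(500)]
rho_hat, z0 = autocorrelation_test(sample, start=1, lag=2)
print(rho_hat, z0, "reject H0" if abs(z0) > 1.96 else "fail to reject H0")
```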

For each test, a level of significance, typically denoted by the
parameter $\alpha$, must be specified. The basics of a test of hypothesis
have been described earlier in this book.

Once a random number generator has been constructed and
implemented, it is recommended to submit it to a battery of
empirical tests and test the null hypotheses described earlier.
Unfortunately, there is no universal battery of tests that can
guarantee, when passed, that a given generator is completely
reliable. One feels confident in using an RNG if it passes a number
of empirical tests for goodness-of-fit. One simple solution to the
problem is to treat an RNG as bad if it fails simple tests, whereas a
good RNG is one that fails only complex tests for goodness-of-fit.

5.6.1 Frequency Tests

Two different methods are available to test the goodness-of-fit
to the U(0, 1) distribution. They are the Kolmogorov-Smirnov test and
the Chi-Square test. Both of these tests measure the degree of
agreement between the distribution of a sample of generated
random numbers and the theoretical U(0, 1) distribution. In each
test, the null hypothesis implies no significant difference between
the sample distribution and the theoretical distribution. These
tests have been described earlier in this book.

Distribution Summary

Distribution: Uniform
Expression: UNIF(0, 1)
Square error: 0.003797

Chi-Square Test

Number of intervals = 15
Degrees of freedom = 14
Test statistic = 14.2
Corresponding P-value = 0.44

Kolmogorov-Smirnov Test

Test statistic = 0.0603
Corresponding P-value > 0.15

Data Summary

Number of data points = 250
Min data value = 0.00449
Max data value = 0.98
Sample mean = 0.467
Sample standard deviation = 0.284

Histogram Summary

Histogram range = 0 to 1
Number of intervals = 15

The results are based on an analysis of 250 randomly generated U(0, 1)
values by the Arena Input Analyzer. Both goodness-of-fit tests
show that the data fit the distribution well. The large P-values derived
from both tests indicate a good fit of the data to the U(0, 1) distribution.
Usually, with a P-value > 0.05, we fail to reject the null hypothesis.
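
For readers without access to the Input Analyzer, the Kolmogorov-Smirnov statistic for the U(0, 1) distribution can be computed directly. The sketch below is a minimal illustration; the function name `ks_statistic` and the use of Python's `random` module to produce the sample are assumptions. For large samples, the critical value at the 0.05 level of significance is approximately $1.36 / \sqrt{n}$, which is about 0.086 for n = 250.

```python
import math
import random


def ks_statistic(numbers):
    """Kolmogorov-Smirnov statistic D against the U(0, 1) distribution."""
    xs = sorted(numbers)
    n = len(xs)
    d_plus = max((i + 1) / n - x for i, x in enumerate(xs))   # empirical CDF above F(x) = x
    d_minus = max(x - i / n for i, x in enumerate(xs))        # empirical CDF below F(x) = x
    return max(d_plus, d_minus)


rng = random.Random(99)                 # arbitrary seed, used only for repeatability
sample = [rng.random() for _ in range(250)]
d = ks_statistic(sample)
critical = 1.36 / math.sqrt(len(sample))  # large-sample approximation at the 0.05 level
print(d, critical, "reject H0" if d > critical else "fail to reject H0")
```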

SUMMARY

A broad overview of the generation of pseudo-random numbers is
provided. The popular generating mechanism based on the LCG is
described. The need for a good random number generator with
a large cycle length is illustrated through a simple example. A brief
description of a number of other popular random number
generators is provided. The current trend of using a combined
multiple recursive generator is discussed, including the new
random number generator used in Arena.

The next chapter discusses the methods used to generate random
variates from different statistical distributions.

EXERCISES

1. State and explain the requirements of a good random
number generator.
2. What are the differences between (a) a linear congruential
generator (LCG) and (b) a multiplicative linear congruential
generator (MLCG)?
3.
a. Develop a degenerate LCG.
b. Use it to generate ten random variables from a
uniform (6, 8) distribution. Comment on the results
if an unusual pattern is observed.
c. Comment on the impact of using a degenerate
random number generator in a simulation study.
4. Consider an LCG with the following parameters: (a) a =
7, m = 12, c = 5, and the seed X0 = 9; (b) a = 5, m =
16, c = 3, and X0 = 11; and (c) a = 12, m = 24, c = 8,
and X0 = 19. Generate a complete cycle for each
generator. Identify the full period generators.
5. (a) Refer to Problem 4 and verify the rules for a full period
generator in each case. (b) Why is an MLCG not a full
period generator? Explain.
6. Use the following LCG: a = 9, c = 13, and m = 32. (a)
Check if it is a full period generator. (b) Generate the random
numbers for a full cycle and plot $R_k$ vs. $R_{k+1}$ (adjacent
random numbers in the sequence). Comment on the shape
of the plot.
7.
a. How would you increase the cycle length of an LCG?
b. Does an LCG return all the values in the interval (0,
1)? If not, why?
8. The recent trend is to use composite random number
generators (CRG), which combine more than one LCG to
increase the cycle length. The book discusses the
composite generator used in Arena software. Compare the
input requirements of an LCG with those of a CRG.
9. Use a random number generator to generate m random
numbers from a discrete uniform distribution with values 1,
2, ..., n. For each trial, estimate the relative frequencies of
occurrence of the different values 1, 2, ..., n. Compare the
results with the expected values. Explain how the results
may be used to test the goodness of the random number
generator used for this study. Carry out the following
experiments: (a) m = 100, n = 9; (b) m = 100, n = 19.
10. You are interested in designing a random number generator
(RNG) with a larger cycle length than the current
best RNG. Describe your approach to this problem. A good
starting point is to review the motivations behind moving
from the use of an LCG to the use of a CRG.
