
Modeling & Simulation

Lecture 9

Random Number
Generators

Instructor:
Eng. Ghada Al-Mashaqbeh
The Hashemite University
Computer Engineering Department
Outline
Introduction.
Random number generators.
LCG.
Mixed generators.
Multiplicative generators.
Performance tests.
Empirical tests.
Theoretical tests.
Random-Number Generation
Any simulation with random components
requires generating a sequence of random
numbers.
E.g., we have talked about arrival times,
service times being drawn from a
particular distribution.
How do we generate values from these
distributions? We do this by:
first generating a random number (uniform on
[0,1]) (Chapter 7),
and then transforming it appropriately to obtain random
variates from other distributions (Chapter 8).
Types of Random Numbers
True random numbers:
Throw a die or use a specialized machine to do that.
Not possible to produce with a deterministic computer algorithm alone.
Pseudo-random numbers:
Deterministic sequence that is statistically
indistinguishable from a random sequence.
No real randomness exists.
Quasi-random numbers:
A regular distribution of numbers over the desired
interval.
We will mainly study pseudo-random numbers.
True Random Number
Generators
Many true random number generators are
hardware solutions that you plug into a
computer.
The usual method is to amplify noise
generated by a resistor (Johnson noise). Once
you sample the output, you get a series of
bits which can be used to generate random
numbers.
True random number generators can be used
for research, modeling, encryption, and
lottery prediction, among many other uses.
[Figure: Pseudorandom Numbers Coverage]
Quasi-random Numbers
If we change our generator so as to
maintain a nearly uniform density of
coverage of the domain, then we have a
random number generator known as a
quasi-random number generator.
Quasi-random numbers give up serial
independence of subsequently generated
values in order to obtain as uniform as
possible coverage of the domain. This
avoids clusters and voids in the pattern
of a finite set of selected points.
[Figure: Quasi-random Numbers Coverage]
Is the Random-Number Model
Important for Simulation?
Validity
The simulation model may not be valid
due to cycles and dependencies in the
random input model.

Precision
You can improve the output analysis by
carefully choosing the generated
random numbers.
Properties of Good Generators
Produce numbers that are uniformly distributed and
independent, i.e. independent and identically
distributed (IID).
Fast and need the least resources (e.g. memory
space, CPU time, etc.).
Reproducibility: Have the ability to reproduce a
previously generated random sequence when needed.
Why?
For debugging.
For performance comparison with other systems.
Produce separate streams of random numbers to
allow the use of the same generator for modeling
several inputs.
Portable, i.e. can be run with the
same accuracy or precision on different machines.
Pseudo-Random Numbers
Use a formula to generate a series of numbers
where each number is computed from its
predecessor.
Want an iterative algorithm that outputs numbers on
a fixed interval.
When we subject this sequence to a number of
statistical tests, we cannot distinguish it from a random
sequence.
In reality, it is completely deterministic, not random!
Linear congruential generators (LCGs) are widely
used to generate such numbers.
Linear Congruential
Generators (LCGs)
Introduced in the early 1950s and still in
very wide use today.
It is based on a recursive formula:

$$Z_i = (a Z_{i-1} + c) \bmod m$$

where
a = multiplier
c = increment
m = modulus
$Z_0$ = initial seed (the value to start with)
-- Every number is determined
by these four values
-- All these values are nonnegative
integers.
-- Also, a, c, and $Z_0$ must be < m
Transform to Unit Uniform
Simply divide by m:

$$U_i = \frac{Z_i}{m}$$

Generally the value of m is chosen to
be very large, e.g. $m > 10^9$.
Examples of LCGs:
$Z_i = (11 Z_{i-1}) \bmod 16$, $Z_0 = 1$
$Z_i = (3 Z_{i-1}) \bmod 13$, $Z_0 = 1$
$Z_i = (Z_{i-1} + 12) \bmod 16$, $Z_0 = 1$
LCGs Characteristics I
All LCGs loop.
This is due to the modulus operator, where
$0 \le Z_i \le m - 1$.
Thus, all the $Z_i$'s are between 0 and m-1, and the $U_i$'s take values in
{0, 1/m, 2/m, ..., (m-1)/m}.
So, $Z_i$ will repeat in values endlessly (a repeated
sequence is called a cycle).
The length of the cycle is called the period.
LCGs with period m have full period.
If an LCG does not have a full period, then its period
depends on the chosen seed value (i.e. $Z_0$).
Exercise: See how the properties of good generators
(found in slide 10) apply to LCGs.
Example
Use $Z_0 = 27$, a = 17, c = 43, and m = 100.
The $Z_i$ and $U_i$ values are:
$Z_1$ = (17*27+43) mod 100 = 502 mod 100 = 2, $U_1$ = 0.02;
$Z_2$ = (17*2+43) mod 100 = 77, $U_2$ = 0.77;
$Z_3$ = (17*77+43) mod 100 = 52, $U_3$ = 0.52;
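The recursion in this example can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `lcg` and its argument names are assumptions made here, not something the lecture defines.

```python
from itertools import islice


def lcg(a, c, m, seed):
    """Yield (Z_i, U_i) for Z_i = (a*Z_{i-1} + c) mod m, starting at i = 1."""
    z = seed
    while True:
        z = (a * z + c) % m   # the LCG recursion
        yield z, z / m        # transform to the unit interval: U_i = Z_i / m


# Parameters of the worked example: Z_0 = 27, a = 17, c = 43, m = 100.
first_three = list(islice(lcg(a=17, c=43, m=100, seed=27), 3))
print(first_three)  # [(2, 0.02), (77, 0.77), (52, 0.52)]
```

Re-running with the same seed reproduces the same sequence, which is the reproducibility property listed among the properties of good generators.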

Example of an LCG
Parameters m = 63, a = 22, c = 4, $Z_0 = 19$:
$Z_i = (22 Z_{i-1} + 4) \bmod 63$, seed with $Z_0 = 19$

 i     22 Z_{i-1} + 4     Z_i     U_i = Z_i / m
 0                        19
 1     422                44      0.6984
 2     972                27      0.4286
 3     598                31      0.4921
 4     686                56      0.8889
 :     :                  :       :
 61    158                32      0.5079
 62    708                15      0.2381
 63    334                19      0.3016
 64    422                44      0.6984
 65    972                27      0.4286
 66    598                31      0.4921
 :     :                  :       :

Cycling will repeat forever
Cycle length $\le m$
(could be < m depending
on parameters)
Result: Pick m BIG
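As a rough check of the cycle length, the sketch below walks an LCG until a state repeats and reports the period. The helper name `lcg_period` is an assumption for illustration; the parameter sets are the table's LCG and the three small example LCGs listed earlier in the lecture.

```python
def lcg_period(a, c, m, seed):
    """Return the length of the cycle that the sequence eventually enters."""
    first_seen = {}            # state -> step at which it was first visited
    z, step = seed, 0
    while z not in first_seen:
        first_seen[z] = step
        z = (a * z + c) % m
        step += 1
    return step - first_seen[z]


print(lcg_period(22, 4, 63, 19))   # the table's LCG: period 63 = m (full period)
print(lcg_period(11, 0, 16, 1))    # Z_i = 11 Z_{i-1} mod 16
print(lcg_period(3, 0, 13, 1))     # Z_i = 3 Z_{i-1} mod 13
print(lcg_period(1, 12, 16, 1))    # Z_i = (Z_{i-1} + 12) mod 16
```

The dictionary stores every visited state, so this brute-force approach only suits small m; for realistic moduli the period is established by theory instead.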
LCGs Characteristics II
LCGs will have a full period if and only if:
The only positive integer that divides both m and c
is 1.
If q is a prime that divides m, then q divides a-1.
If 4 divides m, then 4 divides a-1.
For performance reasons m is selected to be $2^b$.

Longest Possible Period (P)
If $m = 2^b$ and $c \ne 0$, P = m
If $m = 2^b$ and c = 0, P = m/4
If m is prime and c = 0, P = m-1
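The three full-period conditions can be checked mechanically. The following is a minimal sketch; the function names `prime_factors` and `has_full_period` are assumptions introduced here, and trial division is used only because it is short, not because it is efficient.

```python
from math import gcd


def prime_factors(n):
    """Return the set of primes dividing n (simple trial division)."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def has_full_period(a, c, m):
    """Check the three full-period conditions for Z_i = (a*Z_{i-1} + c) mod m."""
    cond1 = gcd(c, m) == 1                                   # only 1 divides both m and c
    cond2 = all((a - 1) % q == 0 for q in prime_factors(m))  # every prime q | m divides a-1
    cond3 = (m % 4 != 0) or ((a - 1) % 4 == 0)               # 4 | m implies 4 | a-1
    return cond1 and cond2 and cond3


print(has_full_period(22, 4, 63))     # True: the m = 63 example has period 63
print(has_full_period(17, 43, 100))   # False: 5 divides 100 but not a-1 = 16
```

Note that any generator with c = 0 fails the first condition, since gcd(0, m) = m.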
LCGs Types
Based on the parameters values LCGs are
divided into two types:
If c=0 then it is called multiplicative LCG,
otherwise it is called mixed LCG.
Multiplicative LCGs are more widely used than
mixed ones.
Mixed LCGs Generators
Mixed Generator
Want m to be large
A good choice is $m = 2^b$, where b is the number
of bits in a computer word minus 1, since the MSB is
reserved for the sign.
Such a choice avoids the need for explicit
division to compute the result modulo m, since
division is expensive on computers.
Obtain full period if c is odd and a-1 is divisible
by 4
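To illustrate why $m = 2^b$ avoids division, the sketch below compares the explicit modulus with a bit mask that keeps the b low-order bits. The values b = 31, a = 5 and c = 3 are hypothetical choices made for this illustration (c is odd and a-1 is divisible by 4, so the full-period condition on this slide holds).

```python
B = 31            # assumed: 32-bit word minus the sign bit
M = 1 << B        # m = 2^b
MASK = M - 1      # binary 0111...1, i.e. b one-bits

A, C = 5, 3       # hypothetical multiplier and increment


def step_mod(z):
    """One LCG step using an explicit modulus."""
    return (A * z + C) % M


def step_mask(z):
    """Same step: for m = 2^b, 'mod m' just keeps the b low-order bits."""
    return (A * z + C) & MASK


z1 = z2 = 1
for _ in range(1000):
    z1, z2 = step_mod(z1), step_mask(z2)
print(z1 == z2)   # True: both versions produce the same sequence
```

In Python the speed difference is negligible; the point is only that the two operations agree, which is what lets fixed-width integer hardware skip the division.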
Multiplicative LCGs Generators
Multiplicative LCGs
Simpler
Cannot have full period (first condition cannot
be satisfied)
But can have a period of m-1 if m and a are
chosen carefully.
Still an attractive option
Examples of Practical RNGs I
Classic LCG16807
Multiplicative LCGs cannot have full period, but they can
get very close:

$$Z_i = 16807\, Z_{i-1} \bmod (2^{31} - 1), \qquad U_i = \frac{Z_i}{2^{31} - 1}$$

Has period of $2^{31} - 2$, that is, best possible
Dates back to 1969
Suggested in many simulation texts and was (is) the
standard for simulation software
Still in use in many software packages
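A minimal Python sketch of LCG16807 follows. Python integers do not overflow, so the product can be reduced directly; in a fixed-width language the multiplication would need extra care. The function name `lcg16807` is an assumption for illustration.

```python
from itertools import islice

M = 2**31 - 1     # prime modulus
A = 16807         # multiplier


def lcg16807(seed):
    """Yield (Z_i, U_i) for Z_i = 16807 * Z_{i-1} mod (2^31 - 1)."""
    z = seed      # the seed should be in 1 .. M-1 (0 would stay at 0 forever)
    while True:
        z = (A * z) % M
        yield z, z / M


# Starting from seed 1, take the 10000th state as a self-check.
z_10000, _ = next(islice(lcg16807(1), 9999, None))
print(z_10000)    # 1043618065, the widely quoted check value for this generator
```

Since $Z_i$ is never 0 for a nonzero seed, every $U_i$ lies strictly between 0 and 1.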
Examples of Practical RNGs II
Java RNG
Mixed LCG with full period:

$$Z_i = (25214903917\, Z_{i-1} + 11) \bmod 2^{48}$$

$$U_i = \frac{2^{27} \left\lfloor Z_{2i} / 2^{22} \right\rfloor + \left\lfloor Z_{2i+1} / 2^{21} \right\rfloor}{2^{53}}$$

Variant of the old rand48() Unix LCG:

$$U_i = \frac{Z_i}{2^{48}}$$
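The two formulas on this slide can be sketched as follows. This mirrors only the arithmetic shown here; it does not claim to reproduce the output of any particular Java runtime (seed handling is not covered on the slide), and the function names are assumptions for illustration.

```python
M48 = 1 << 48
A = 25214903917
C = 11


def next_state(z):
    """One step of Z_i = (25214903917 * Z_{i-1} + 11) mod 2^48."""
    return (A * z + C) % M48


def uniform_from_two_states(z):
    """Combine two successive states into one U in [0, 1), as in the slide's formula.

    Returns (U, last state) so the caller can continue the sequence.
    """
    z_first = next_state(z)
    z_second = next_state(z_first)
    high = z_first >> 22       # floor(Z / 2^22): the top 26 bits
    low = z_second >> 21       # floor(Z / 2^21): the top 27 bits
    u = ((high << 27) + low) / (1 << 53)
    return u, z_second


u, state = uniform_from_two_states(0)
print(u, state)
```

Only the high-order bits of each state are used, which matters because the low-order bits of a power-of-two-modulus LCG have short periods.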
Examples of Practical RNGs III
VB

$$Z_i = (1140671485\, Z_{i-1} + 12820163) \bmod 2^{24}, \qquad U_i = \frac{Z_i}{2^{24}}$$

Excel

$$U_i = (9821.0\, U_{i-1} + 0.211327) \bmod 1$$
Performance Tests
Goal
random number generators are deterministic, so we
test whether their output appears IID uniform on [0,1]
Empirical Tests vs. Theoretical Tests
empirical tests
generate $U_i$'s and examine statistically whether they
resemble IID U(0,1) variates
are local, i.e., examine only a segment of a cycle
(e.g., for LCGs)
theoretical tests
analyze structure and defining constants of RNG
are global, i.e., examine the entire cycle
often sophisticated and mathematically complex
Empirical Tests
Use the RNG to generate some numbers
and then test the null hypothesis
$H_0$: The sequence is IID U(0,1)
Four tests, plus simple simulation tests:
Chi-square test: tests for uniformity (identically distributed).
Serial test: tests for both uniformity and
independence.
Runs test: tests for independence.
Correlation test: tests for independence.
Simple simulation tests: test for both uniformity
and independence.
Test 1: Chi-Square Test
Similar to before:
Generate $U_1, U_2, \ldots, U_n$.
Split [0,1] into k subintervals (k > 100).
Test statistic is

$$\chi^2 = \frac{k}{n} \sum_{j=1}^{k} \left( f_j - \frac{n}{k} \right)^2, \qquad f_j = \text{number of } U_i\text{'s in the } j\text{th subinterval}$$

With k-1 degrees of freedom.
Then compare with the critical value to decide
whether the uniform distribution fits or not.
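A hedged sketch of the chi-square uniformity test in Python is given below. The function names are assumptions for illustration, and the critical value uses the Wilson-Hilferty approximation to the chi-square quantile, which is a substitution made here; in practice the value would be read from a chi-square table.

```python
from statistics import NormalDist


def chi_square_statistic(us, k):
    """(k/n) * sum_j (f_j - n/k)^2 over k equal-width subintervals of [0, 1]."""
    n = len(us)
    counts = [0] * k
    for u in us:
        counts[min(int(u * k), k - 1)] += 1   # u == 1.0 goes into the last subinterval
    expected = n / k
    return (k / n) * sum((f - expected) ** 2 for f in counts)


def chi_square_critical(k, alpha=0.05):
    """Approximate upper-alpha critical value with k-1 degrees of freedom
    (Wilson-Hilferty approximation; an assumption of this sketch)."""
    df = k - 1
    z = NormalDist().inv_cdf(1 - alpha)
    t = 2 / (9 * df)
    return df * (1 - t + z * t ** 0.5) ** 3


# Perfectly even toy data: every subinterval receives exactly n/k points.
even = [(i + 0.5) / 1000 for i in range(1000)]
print(chi_square_statistic(even, 10))   # 0.0
print(chi_square_critical(101))         # close to the tabulated 124.3 for 100 d.f.
```

The null hypothesis of uniformity is rejected when the statistic exceeds the critical value. A statistic of exactly 0, as in the toy data, is itself suspicious for a real generator, since truly random data would not be that even.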
Test 2: Serial Test I
Generalization of the chi-square test to a higher dimension.
Can test for independence between different streams and the
uniformity of these streams at the same time.
Consider the d-dimensional vectors

$$\mathbf{U}_1 = (U_1, U_2, \ldots, U_d), \quad \mathbf{U}_2 = (U_{d+1}, U_{d+2}, \ldots, U_{2d}), \ \ldots$$

Similar to before, find chi-square and compare to the critical
value:

$$\chi^2(d) = \frac{k^d}{n} \sum_{j_1=1}^{k} \sum_{j_2=1}^{k} \cdots \sum_{j_d=1}^{k} \left( f_{j_1 j_2 \cdots j_d} - \frac{n}{k^d} \right)^2$$

where $f_{j_1 j_2 \cdots j_d}$ = number of $\mathbf{U}_i$'s having first component in
subinterval $j_1$, second component in subinterval $j_2$, etc.
Test 2: Serial Test II
Degrees of freedom = $k^d - 1$, why??
n: number of random-number vectors.
k: number of intervals (or subintervals) on every
dimension.
$j_1$: index of the subinterval on dimension 1.
$j_2$: index of the subinterval on the second dimension. And
so on.
Let's try the following:
d = 2, so we test if our RNG is IID over the unit square.
Why unit square????
k = 4.
How will we use the previous equation?
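One way to apply the serial-test equation with d = 2 and k = 4 is sketched below: the stream is cut into non-overlapping pairs, each pair is assigned to one of the $k^d = 16$ cells of the unit square, and the statistic is computed from the cell counts. The function name and the toy data are assumptions for illustration.

```python
def serial_test_statistic(us, k=4, d=2):
    """(k^d / n) * sum over all cells of (f_cell - n/k^d)^2, n = number of d-vectors."""
    n = len(us) // d                      # non-overlapping vectors (U_1..U_d), (U_{d+1}..U_{2d}), ...
    counts = {}
    for i in range(n):
        vector = us[i * d:(i + 1) * d]
        cell = tuple(min(int(u * k), k - 1) for u in vector)   # (j_1, ..., j_d), 0-based
        counts[cell] = counts.get(cell, 0) + 1
    expected = n / k ** d
    occupied = sum((f - expected) ** 2 for f in counts.values())
    empty = (k ** d - len(counts)) * expected ** 2             # cells that received no vector
    return (k ** d / n) * (occupied + empty)


# Toy data: each of the 16 cells receives exactly 10 pairs.
pairs = [((a + 0.5) / 4, (b + 0.5) / 4) for a in range(4) for b in range(4)] * 10
stream = [u for pair in pairs for u in pair]
print(serial_test_statistic(stream))      # 0.0
```

The result is compared with a chi-square critical value with $k^d - 1 = 15$ degrees of freedom.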
Test 3: Runs Test I
A run is a series of random numbers that are
monotonically increasing (called a run up) or
decreasing (called a run down).
We are interested in the length of these runs.
Example:
Consider the following series of random numbers: 0.1,
0.2, 0.15, 0.7, 0.8, 0.9
Run1 = 0.1, 0.2, length = 2
Run2 = 0.15, 0.7, 0.8, 0.9, length = 4 (the drop from 0.2 to
0.15 ends the first run up and starts a new one)
For large n (number of generated random
numbers) the runs test statistic approximately follows the
chi-square distribution.
So, we depend on the critical value comparison
with some level of significance to judge whether
the generated numbers are independent or not.
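Counting runs up is easy to sketch in code. Here a run up ends as soon as a value is not larger than its predecessor. The sample sequence below is a hypothetical one chosen for illustration, and the function names are assumptions.

```python
def runs_up_lengths(us):
    """Split the sequence into maximal increasing runs and return their lengths."""
    if not us:
        return []
    lengths, current = [], 1
    for prev, cur in zip(us, us[1:]):
        if cur > prev:
            current += 1             # the run up continues
        else:
            lengths.append(current)  # the run ends; a new one starts at cur
            current = 1
    lengths.append(current)
    return lengths


def run_counts(lengths):
    """r_1..r_5 = number of runs of length 1..5; r_6 = number of runs of length >= 6."""
    r = [0] * 6
    for length in lengths:
        r[min(length, 6) - 1] += 1
    return r


sample = [0.86, 0.11, 0.23, 0.03, 0.13, 0.06, 0.55, 0.64, 0.87, 0.10]
lengths = runs_up_lengths(sample)
print(lengths)               # [1, 2, 2, 4, 1]
print(run_counts(lengths))   # [2, 2, 0, 1, 0, 0]
```

The six counts are the inputs to the runs test statistic.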
Test 3: Runs Test II
In this test calculate, for $U_1, U_2, \ldots, U_n$:

$$r_i = \begin{cases} \text{number of runs up of length } i & \text{for } i = 1, 2, \ldots, 5 \\ \text{number of runs up of length} \ge 6 & \text{for } i = 6 \end{cases}$$

Test statistic (chi-square with 6 degrees of freedom)

$$R = \frac{1}{n} \sum_{i=1}^{6} \sum_{j=1}^{6} a_{ij} (r_i - n b_i)(r_j - n b_j)$$

Where the $a_{ij}$ and $b_i$ values are given empirically
Finally compare with the critical value with the
appropriate level of significance to reject or accept
the null hypothesis.
Test 4: Correlation Test I
It is based on the calculation of the correlation factor for a
stream of random numbers.
Remember that if two random variables are independent,
this value = 0.
Here, for large n, the test statistic, which is called $A_j$, will
approximately follow the standard normal distribution.
The final result is based on the comparison of the absolute
value of $A_j$ with a critical value.
If $|A_j| > z_{1-\alpha/2}$ then we reject the null hypothesis (the critical
values are found in Table T.1 page 716, last row).
This quantity is different for different values of j, where j is
called the lag.
Must be tested for different lag values since it may pass
for some values and fail for others.
And remember that you are working on a series of random
numbers (a sample) from the RNG.
Test 4: Correlation Test II
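For uniform variables:

$$E[U_i] = \frac{1}{2}, \qquad \operatorname{Var}[U_i] = \frac{1}{12}$$

$$C_j = \operatorname{Cov}(U_i, U_{i+j}) = E[U_i U_{i+j}] - E[U_i]\,E[U_{i+j}] = E[U_i U_{i+j}] - \frac{1}{4}$$

$$\rho_j = \frac{C_j}{\operatorname{Var}[U_i]} = 12\, E[U_i U_{i+j}] - 3$$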


Test 4: Correlation Test III
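Empirical estimate is

$$\hat{\rho}_j = \frac{12}{h+1} \sum_{k=0}^{h} U_{1+kj}\, U_{1+(k+1)j} - 3, \qquad h = \left\lfloor \frac{n-1}{j} \right\rfloor - 1$$

$$\operatorname{Var}[\hat{\rho}_j] = \frac{13h + 7}{(h+1)^2}$$

Test statistic

$$A_j = \frac{\hat{\rho}_j}{\sqrt{\operatorname{Var}[\hat{\rho}_j]}}$$

Approximately standard normal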
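The estimate and test statistic can be sketched in Python as follows. The function name `lag_correlation_test` is an assumption for illustration; note that the slide's $U_1$ corresponds to `us[0]` in the zero-based list.

```python
from math import sqrt


def lag_correlation_test(us, j):
    """Return (rho_hat_j, A_j) for lag j."""
    n = len(us)
    h = (n - 1) // j - 1
    if h < 0:
        raise ValueError("sequence too short for this lag")
    # U_{1+kj} * U_{1+(k+1)j} for k = 0..h, shifted to zero-based indices
    total = sum(us[k * j] * us[(k + 1) * j] for k in range(h + 1))
    rho_hat = 12 / (h + 1) * total - 3
    variance = (13 * h + 7) / (h + 1) ** 2
    return rho_hat, rho_hat / sqrt(variance)


# Degenerate toy inputs, chosen only because the results are easy to verify by hand.
print(lag_correlation_test([0.5] * 10, 2))   # (0.0, 0.0): every product is 1/4
print(lag_correlation_test([1.0] * 10, 2))   # rho_hat = 12 * 1 - 3 = 9
```

In use, $|A_j|$ is compared with $z_{1-\alpha/2}$ (for example 1.96 at the 5% level), and the test is repeated for several lags j.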
Simple Simulation Tests I
We will explore two additional empirical tests
briefly:
Collision test.
Birthday spacing test.
Collision Test
Divide [0,1) into d equal intervals.
Generate n random numbers and plot them as points in
$[0,1)^t$, where t = 2 in our case (so you are looking at
the unit square).
So, if you generate U = 0.4 then plot it as (0.4, 0.4)
C = Number of times a point falls in a box that already
has a point (collision).
Based on the percentage C/n decide whether to accept
the RNG or not.
Mainly used to test whether the generated numbers are
identically (uniformly) distributed or not.
Simple Simulation Tests II
Birthday Spacing Test
Have k boxes, labeled with $I_{(1)} \le I_{(2)} \le \cdots \le I_{(n)}$
These boxes can be viewed as the days within
one year.
In our case these boxes are the values of the
generated random numbers.
This RNG will be considered good if it
generates random numbers that have different
birth dates (no collision in values) and they
are distributed evenly between 0 and 1.
Define the spacing $S_j = I_{(j+1)} - I_{(j)}$
Consider $Y = \{\, j : S_j = S_{j+1},\ j = 1, \ldots, n-2 \,\}$
As j gets larger this means that the RNG is
closer to uniform (i.e. identical).
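The spacing definitions can be sketched as below. Two assumptions are made here that the slide does not state: each U is mapped to a box by `int(u * k)`, and the quantity reported is the number of indices j in the set Y (equal consecutive spacings). The function name is also illustrative.

```python
def birthday_spacing_count(us, k):
    """Count indices j with S_j == S_{j+1}, where S_j = I_(j+1) - I_(j)."""
    boxes = sorted(int(u * k) for u in us)                   # I_(1) <= I_(2) <= ... <= I_(n)
    spacings = [b - a for a, b in zip(boxes, boxes[1:])]     # S_1 .. S_{n-1}
    return sum(1 for s, t in zip(spacings, spacings[1:]) if s == t)


# Toy data with k = 16 boxes: box indices 1, 3, 5, 9 give spacings 2, 2, 4.
toy = [1.5 / 16, 3.5 / 16, 5.5 / 16, 9.5 / 16]
print(birthday_spacing_count(toy, 16))   # 1: only S_1 == S_2
```

Published versions of the birthday spacing test usually sort the spacings before comparing neighbors; the sketch follows the formula as written on the slide instead.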
Performance: Collision
After $2^{15}$ numbers, VB starts failing.
After $2^{17}$ numbers, Excel starts failing.
After $2^{19}$ numbers, LCG16807 starts
failing.
The Java RNG does OK up to at least $2^{20}$
numbers.
Note that this means that a clear pattern
is observed from the VB RNG with less
than 100,000 numbers generated!
Performance: Birthday Spacing
After $2^{10}$ numbers, VB starts failing.
After $2^{14}$ numbers, Excel starts failing.
After $2^{14}$ numbers, LCG16807 starts
failing.
After $2^{18}$ numbers, Java starts failing.
For this test, the VB RNG is only good for
about 1000 numbers!
The performance gets even worse if we
look at less significant digits.
Passing the Test
An RNG with a long period that passes a fixed
set of statistical tests is not guaranteed
to be a good RNG.
Many commonly used generators are not
good at all, even though they pass all of
the most basic tests.
And remember all empirical tests are
local. So, the test results are based on the
sample of random numbers used in the test.
Why do RNGs Fail?
We have seen that many commonly used
RNGs fail simulation tests, even though
they pass the standard empirical tests.
Why do these RNGs fail?
Remember that all previously stated tests are
local (i.e. work on a subset or sample of
random numbers).
So, we need a global test (i.e. theoretical one).
In such a test we need to analyze the structure
of the RNG as we will see in the next slides.
Theoretical Tests
Based on analyzing the structure of the numbers
that can be generated.
In these theoretical tests you must consider a
complete cycle of the random numbers from the
RNG.
Again you are working on vectors of random
numbers (multi-dimension).
Such theoretical tests are called spectral
tests since you study the distribution of these
vectors in space.
We will mainly explore the lattice test, which tests
the identical-distribution property (i.e. whether the
RNG is uniform or not).
Lattice Test
For all LCGs, the numbers generated fall in a
fixed number of planes.
We want this to be as many planes as possible
and fill-up the space, i.e. uniform.
This should be true in any number of dimensions.
Approach:
Generate a complete period or cycle of random numbers
from the RNG under test.
Define the following overlapping vectors: $(U_1, U_2, \ldots, U_d)$,
$(U_2, U_3, \ldots, U_{d+1})$, ... (only shift one random number at a
time), where d is the number of dimensions.
Plot these vectors in d-dimensional space.
Consider the distribution of these points and how it
covers the space.
Based on that you can judge whether this RNG is
uniform or not.
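The approach above can be sketched for a small generator. The example reuses the m = 63 LCG from earlier in the lecture. Wrapping around the end of the cycle (so that every point of the period starts a vector) is an assumption of this sketch, as are the function names. The resulting list of points is what one would hand to a plotting tool.

```python
def full_cycle(a, c, m, seed):
    """Return U_1 .. U_m for a full-period LCG Z_i = (a*Z_{i-1} + c) mod m."""
    us, z = [], seed
    for _ in range(m):
        z = (a * z + c) % m
        us.append(z / m)
    return us


def overlapping_vectors(us, d):
    """(U_1..U_d), (U_2..U_{d+1}), ...: shift by one number at a time, wrapping at the end."""
    n = len(us)
    return [tuple(us[(i + t) % n] for t in range(d)) for i in range(n)]


us = full_cycle(a=22, c=4, m=63, seed=19)
points = overlapping_vectors(us, d=2)
print(points[:3])          # the first few (U_i, U_{i+1}) pairs
print(len(set(points)))    # 63 distinct points in the unit square
```

Only 63 of the 63 x 63 possible grid points ever occur, and they lie on a small number of parallel lines; counting and spacing those lines is what the lattice test formalizes.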

Example: Two Full-Period LCGs
[Figure: two panels, "Good RNG" and "Poor RNG"]
LCG RANDU in 3 Dimensions
The RNG is poor
since the planes are
far from each other
(do not cover the
total space). And
the random
numbers are
clustered in groups.
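The plane structure can be checked numerically. The slide does not give RANDU's constants; the values used below (multiplier 65539, modulus $2^{31}$, odd seed) are the commonly documented definition and are stated here as an assumption. Because $65539 = 2^{16} + 3$, three consecutive values satisfy a fixed linear relation, which is what forces the triples onto a few planes.

```python
M = 2**31
A = 65539          # = 2^16 + 3 (commonly documented RANDU multiplier; not given on the slide)


def randu(seed, count):
    """Return Z_1 .. Z_count for Z_i = 65539 * Z_{i-1} mod 2^31 (seed should be odd)."""
    zs, z = [], seed
    for _ in range(count):
        z = (A * z) % M
        zs.append(z)
    return zs


zs = randu(seed=1, count=1000)
# A^2 = 6A - 9 (mod 2^31), hence Z_{i+2} - 6 Z_{i+1} + 9 Z_i = 0 (mod 2^31) for every i.
relation_holds = all(
    (zs[i + 2] - 6 * zs[i + 1] + 9 * zs[i]) % M == 0 for i in range(len(zs) - 2)
)
print(relation_holds)   # True
```

Dividing the relation by $2^{31}$ shows that $U_{i+2} - 6U_{i+1} + 9U_i$ is always an integer, so every point $(U_i, U_{i+1}, U_{i+2})$ lies on one of a handful of parallel planes.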
Additional Notes
The lecture covers the following
sections from the textbook:
Chapter 7:
Sections: 7.1, 7.2, 7.4
