
Distinguishing Features of Simulation
Time (CLK): DYNAMIC
(the modeling section of the course focused on this aspect)
Pseudorandom variables (RND): STOCHASTIC
(the coming weeks will focus on this aspect)
(Pseudo) Random Number Generation
Properties of pseudo-random numbers
Continuous numbers between 0 and 1
Probability of selecting a number in the interval (a, b) is proportional to (b - a), i.e., uniformly distributed
Numbers are statistically independent
Can't really generate random numbers: only a finite algorithm or table is available
Example: Excel spreadsheet function =RAND()
Also, want generation to be fast and repeatable
Random Number Generation
How to generate random numbers
Table look-up
Computer generation: these values cannot be
truly random and a computer cannot express a
number to an infinite number of decimal places
Pseudorandom numbers
Random Number Generation
Random number seed:
Virtually all computer methods of random
number generation start with an initial
random number seed. This seed is used to
generate the next random number and then
is transformed into a new seed value.

Random Generators
Reasons for pseudorandom numbers:
Flexible policies
Lack of knowledge
Generate stochastic processes
Decision making (random decision)
Numerical analysis (numerical integration)
Monte Carlo integration
Desirable Properties of Random Number Generators
Fast
Should not require much memory
Long cycle or period
Should support multiple streams
Sequence should be replicable
Debugging
Compare various scenarios under similar conditions
Numbers should come close to:
Uniformity (or known distribution)
Independence
Historical Generator
Midsquare method:
1. Start with an initial seed (e.g. a 4-digit
integer).
2. Square the number.
3. Take the middle 4 digits.
4. This value becomes the new seed. Divide
the number by 10,000. This becomes the
random number. Go to 2.
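The four steps above can be sketched in Python as follows. This is only an illustration: the function name `midsquare` and its parameters are hypothetical choices, and the sketch assumes a 4-digit seed whose square is treated as a zero-padded 8-digit number.

```python
def midsquare(seed, count):
    """Return `count` midsquare random numbers starting from a 4-digit seed."""
    numbers = []
    x = seed
    for _ in range(count):
        squared = x * x                  # step 2: square the current seed
        x = (squared // 100) % 10_000    # step 3: middle 4 digits of the 8-digit square
        numbers.append(x / 10_000)       # step 4: new seed / 10,000 is the random number
    return numbers


print(midsquare(5497, 3))
```

Dropping the last two digits and keeping the next four is equivalent to taking the middle four digits of the square when it is padded with leading zeros to eight digits.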

Midsquare Method, example
$x_0 = 5497$
$x_1$: $5497^2 = 30217009$, so $x_1 = 2170$, $R_1 = 0.2170$
$x_2$: $2170^2 = 04708900$, so $x_2 = 7089$, $R_2 = 0.7089$
$x_3$: $7089^2 = 50253921$, so $x_3 = 2539$, $R_3 = 0.2539$

Drawback: Hard to state conditions for picking
initial seed that will generate a good
sequence.
Midsquare Generator, examples
Bad sequences:
$x_0 = 5197$
$x_1$: $5197^2 = 27008809$, so $x_1 = 0088$, $R_1 = 0.0088$
$x_2$: $0088^2 = 00007744$, so $x_2 = 0077$, $R_2 = 0.0077$
$x_3$: $0077^2 = 00005929$, so $x_3 = 0059$, $R_3 = 0.0059$
(the sequence decays toward zero)
$x_i = 6500$
$x_{i+1}$: $6500^2 = 42250000$, so $x_{i+1} = 2500$, $R_{i+1} = 0.2500$
$x_{i+2}$: $2500^2 = 06250000$, so $x_{i+2} = 2500$, $R_{i+2} = 0.2500$
(the sequence is stuck at 2500)
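To see how quickly a seed degenerates, one option is to record the seeds visited until one repeats. The sketch below assumes 4-digit seeds; the names `midsquare_step` and `seeds_until_repeat` are hypothetical.

```python
def midsquare_step(x):
    """Middle 4 digits of the zero-padded 8-digit square of x."""
    return (x * x // 100) % 10_000


def seeds_until_repeat(seed):
    """Return the distinct seeds visited and the first seed that repeats."""
    seen = []
    x = seed
    while x not in seen:
        seen.append(x)
        x = midsquare_step(x)
    return seen, x


for start in (5197, 6500):
    visited, repeated = seeds_until_repeat(start)
    print(start, visited, "repeats at", repeated)
```

A short list of visited seeds is a sign that the starting seed produces a poor sequence.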

Linear Congruential Generator (LCG)
Start with a random seed $Z_0 < m$, where m is often the largest possible integer on the machine
Recursively generate integers between 0 and m - 1:
$Z_i = (a Z_{i-1} + c) \bmod m$

Use $U_i = Z_i / m$ to get a pseudo-random number (avoid 0 and 1)

When c = 0: called a Multiplicative Congruential Generator
When c > 0: Mixed LCG

Linear Congruential Generator (LCG) (Lehmer 1951)

Let $Z_i$ be the $i$-th number (integer) in the sequence

$Z_i = (a Z_{i-1} + c) \bmod m$, $Z_i \in \{0, 1, 2, \ldots, m-1\}$

where $Z_0$ = seed
a = multiplier
c = increment
m = modulus

Define $U_i = Z_i / m$ (to obtain a U(0,1) value)
LCG, example
16-bit machine
$a = 1217$, $c = 0$, $Z_0 = 23$, $m = 2^{15} - 1 = 32767$
$Z_1 = (1217 \cdot 23) \bmod 32767 = 27991$
$U_1 = 27991/32767 = 0.85424$
$Z_2 = (1217 \cdot 27991) \bmod 32767 = 20134$
$U_2 = 20134/32767 = 0.61446$
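The recursion in this example can be sketched as a Python generator. The name `lcg` and its keyword parameters are illustrative choices; the parameter values are those of the example above.

```python
from itertools import islice


def lcg(a, c, m, seed):
    """Yield (Z_i, U_i) pairs with Z_i = (a*Z_{i-1} + c) mod m and U_i = Z_i / m."""
    z = seed
    while True:
        z = (a * z + c) % m
        yield z, z / m


# First two values for a = 1217, c = 0, Z_0 = 23, m = 2^15 - 1
pairs = list(islice(lcg(a=1217, c=0, m=2**15 - 1, seed=23), 2))
for z, u in pairs:
    print(z, round(u, 5))
```

A generator keeps the seed state internal, so each call to `next` advances the stream by one number, and rerunning with the same seed replicates the sequence.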
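An LCG can be expressed as a function of the seed $Z_0$.

THEOREM:

$Z_i = \left[ a^i Z_0 + \frac{c(a^i - 1)}{a - 1} \right] \bmod m$

Proof: By induction on i.

i = 0: $Z_0 = \left[ a^0 Z_0 + \frac{c(a^0 - 1)}{a - 1} \right] \bmod m$

Assume the expression holds for i. Show that it holds for i+1:
$Z_{i+1} = [a Z_i + c] \bmod m$
$= \left[ a \left\{ \left[ a^i Z_0 + \frac{c(a^i - 1)}{a - 1} \right] \bmod m \right\} + c \right] \bmod m$
$= \left[ a^{i+1} Z_0 + \frac{ac(a^i - 1)}{a - 1} + c \right] \bmod m$
$= \left[ a^{i+1} Z_0 + \frac{c(a^{i+1} - 1)}{a - 1} \right] \bmod m$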
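The theorem can be spot-checked numerically by comparing step-by-step iteration with the closed form. This is a sketch with hypothetical function names; it assumes a is not 1 and uses exact integer arithmetic, since $(a^i - 1)/(a - 1) = 1 + a + \cdots + a^{i-1}$ is always an integer.

```python
def lcg_iterate(a, c, m, seed, i):
    """Z_i obtained by applying the recursion i times."""
    z = seed
    for _ in range(i):
        z = (a * z + c) % m
    return z


def lcg_closed_form(a, c, m, seed, i):
    """Z_i from the closed-form expression (requires a != 1)."""
    geometric_sum = (a**i - 1) // (a - 1)   # exact: a - 1 divides a**i - 1
    return (a**i * seed + c * geometric_sum) % m


# Compare both routes for a small generator with seed 7
for i in range(6):
    print(i, lcg_iterate(4, 2, 9, 7, i), lcg_closed_form(4, 2, 9, 7, i))
```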
Examples:
$Z_i = (69069 Z_{i-1} + 1) \bmod 2^{32}$, $U_i = Z_i / 2^{32}$

$Z_i = (65539 Z_{i-1} + 76) \bmod 2^{31}$, $U_i = Z_i / 2^{31}$

$Z_i = (630360016 Z_{i-1}) \bmod (2^{31} - 1)$, $U_i = Z_i / 2^{31}$

$Z_i = 13^{13} Z_{i-1} \bmod 2^{59}$, $U_i = Z_i / 2^{59}$



What makes one LCG better than another?
A full period (full cycle) LCG generates all m values before it cycles.

Consider $Z_i = (3 Z_{i-1} + 2) \bmod 9$ with $Z_0 = 7$.
Then $Z_1 = 5$, $Z_2 = 8$, $Z_3 = 8$, and $Z_j = 8$ for $j = 3, 4, 5, 6, \ldots$
On the other hand, $Z_i = (4 Z_{i-1} + 2) \bmod 9$ has full period.
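For a small modulus, the cycle length can be measured by brute force. The sketch below uses a hypothetical helper `cycle_length`; it records the step at which each value first appears and stops at the first repeat.

```python
def cycle_length(a, c, m, seed):
    """Length of the cycle that the LCG eventually enters from `seed`."""
    first_seen = {}
    z, step = seed, 0
    while z not in first_seen:
        first_seen[z] = step
        z = (a * z + c) % m
        step += 1
    return step - first_seen[z]


print(cycle_length(3, 2, 9, 7))   # generator that gets stuck at 8
print(cycle_length(4, 2, 9, 7))   # generator that visits all 9 values
```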



Why?
Random Number Generation
Mixed congruential generator is full period if

1. $m = 2^B$ (B is often the number of bits in a word), which makes the mod operation fast
2. c and m are relatively prime (g.c.d. = 1)
3. If 4 divides m, then 4 divides a - 1
(e.g., a = 1, 5, 9, 13, ...)
The period of an LCG is m (full period or full cycle) if and only if

1. If q is a prime that divides m, then q divides a - 1
2. The only positive integer that divides both m and c is 1
3. If 4 divides m, then 4 divides a - 1.

Examples
$Z_{i+1} = (16807 Z_i + 3) \bmod 451605$,
where $16807 = 7^5$, $16806 = (2)(3)(2801)$, $451605 = (3)(5)(7)(11)(17)(23)$.
This LCG does not satisfy the first two conditions.

$Z_{i+1} = (16807 Z_i + 5) \bmod 635493681$,
where $16807 = 7^5$, $16806 = (2)(3)(2801)$, $635493681 = (3^4)(2801^2)$.
This LCG satisfies all three conditions.
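The three full-period conditions can be checked mechanically. The following is a minimal sketch with hypothetical helper names; it factors m by trial division, which is adequate only for moduli of modest size such as the ones in these examples.

```python
from math import gcd


def prime_factors(n):
    """Set of distinct prime factors of n, found by trial division."""
    factors = set()
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.add(d)
            n //= d
        d += 1
    if n > 1:
        factors.add(n)
    return factors


def full_period_conditions(a, c, m):
    """Return the truth value of each of the three conditions, in order."""
    cond1 = all((a - 1) % q == 0 for q in prime_factors(m))  # every prime q | m divides a - 1
    cond2 = gcd(c, m) == 1                                   # c and m relatively prime
    cond3 = (m % 4 != 0) or ((a - 1) % 4 == 0)               # 4 | m implies 4 | a - 1
    return cond1, cond2, cond3


print(full_period_conditions(16807, 3, 451605))
print(full_period_conditions(16807, 5, 635493681))
```

The generator has full period exactly when all three returned values are true.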
- $m = 2^B$, where B = number of bits in the machine word, is often a good choice to maximize the period.

- If c = 0, we have a power residue or multiplicative generator.

Note that $Z_n = (a Z_{n-1}) \bmod m$ implies $Z_n = (a^n Z_0) \bmod m$.

If $m = 2^B$, where B = number of bits in the machine word, the longest period is m/4 (the best one can do), attained if and only if

$Z_0$ is odd
$a = 8k \pm 3$, $k \in \mathbb{Z}^+$ (5, 11, 13, 19, 21, 27, ...)
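The m/4 claim can be checked on a small modulus. The sketch below assumes $m = 2^8 = 256$ as a stand-in for a machine word; the function name `multiplicative_period` is hypothetical.

```python
def multiplicative_period(a, m, seed):
    """Number of steps until Z_n = (a * Z_{n-1}) mod m returns to the seed."""
    z = seed
    for step in range(1, m + 1):
        z = (a * z) % m
        if z == seed:
            return step
    return None  # the seed never recurs (possible when a and m share a factor)


m = 2**8
print(multiplicative_period(5, m, 1))    # a = 8k - 3, odd seed
print(multiplicative_period(11, m, 3))   # a = 8k + 3, odd seed
print(multiplicative_period(9, m, 1))    # a not of the form 8k +/- 3
print(multiplicative_period(5, m, 2))    # even seed
```

With m = 256 the best possible period is m/4 = 64; the last two calls show a shorter period when either condition is violated.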

Random Number Generation
Other kinds of generators
Quadratic Congruential Generator
$S_{\text{new}} = (a_1 S_{\text{old}}^2 + a_2 S_{\text{old}} + b) \bmod L$
Combination of Generators
Shuffling, L'Ecuyer, Wichmann/Hill
Tausworthe Generator
Generates a sequence of random bits
Feedback Shift Generators
Tausworthe, Mathematics of Computation, 1965
If $\{a_k\}$ is a sequence of binary digits (0 or 1) defined by
$a_k = (c_1 a_{k-1} + c_2 a_{k-2} + \cdots + c_p a_{k-p}) \bmod 2$
and the c's are relatively prime, then $\{a_k\}$ has period $2^p - 1$
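A bit recurrence of this form can be sketched as below. The coefficients $c = (1, 0, 0, 1)$ with $p = 4$ and the starting bits are assumptions chosen for illustration (they are not from the slides); with them the bit stream repeats every $2^4 - 1 = 15$ bits.

```python
def tausworthe_bits(coeffs, initial_bits, count):
    """Generate `count` bits with a_k = (c_1 a_{k-1} + ... + c_p a_{k-p}) mod 2.

    coeffs = [c_1, ..., c_p]; initial_bits = [a_0, ..., a_{p-1}], not all zero.
    """
    p = len(coeffs)
    bits = list(initial_bits)
    while len(bits) < count:
        k = len(bits)
        # coeffs[j] is c_{j+1}, which multiplies a_{k-(j+1)}
        new_bit = sum(coeffs[j] * bits[k - 1 - j] for j in range(p)) % 2
        bits.append(new_bit)
    return bits[:count]


bits = tausworthe_bits([1, 0, 0, 1], [1, 0, 0, 0], 30)
print(bits[:15])
print(bits[15:30])
```

Groups of consecutive bits can then be read as binary fractions to form numbers in [0, 1).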
IBM - Randu
If c = 0: power residue generator (multiplicative generator)

$u_n = a^n u_0 \bmod m$
$u_n = a u_{n-1} \bmod m$ (homework)
NOTES

Never invent your own LCG. It will probably not be good.

All simulation languages and many software packages have
their own PRN generator. Most use some variation of a linear
congruential generator.

Power residue generators are the most common.
Tests of RNG, cont'd
Theoretical tests
Prove sample moments over entire cycle are correct
Lattice structure of LCGs
"Random numbers fall mainly in the planes" (Marsaglia)
Spacing between hyperplanes: the smaller, the better
Tests of Random Number Generators
Empirical tests
Uniformity
Compute sample moments
Goodness of fit
Independence
Gap Test
Runs Test
Poker Test
Spectral Test
Autocorrelation Test
Testing Random Number Generators

Desirable Properties:

Mean and Variance
Theorem: $E \to 1/2$ and $V \to 1/12$ as $m \to \infty$.
Proof:
For a full period LCG, every integer value from 0 to m-1 is represented. Therefore
$E = \frac{0 + 1 + \cdots + (m-1)}{m^2} = \frac{(m-1)m/2}{m^2} = \frac{1}{2} - \frac{1}{2m}$
$V = \frac{0^2 + 1^2 + 2^2 + \cdots + (m-1)^2}{m^3} - E^2$
$= \frac{m(m-1)(2m-1)/6}{m^3} - \left[ \frac{1}{2} - \frac{1}{2m} \right]^2$
$= \frac{1}{12} - \frac{1}{12m^2}$
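The two closed forms can be verified exactly for any particular m. This sketch uses exact rational arithmetic; the function name `full_period_moments` is hypothetical.

```python
from fractions import Fraction


def full_period_moments(m):
    """Exact mean and variance of U = Z/m over one full cycle Z = 0, ..., m-1."""
    values = [Fraction(z, m) for z in range(m)]
    mean = sum(values) / m
    variance = sum(v * v for v in values) / m - mean**2
    return mean, variance


for m in (9, 16, 1000):
    mean, variance = full_period_moments(m)
    print(m, mean, variance)
```

As m grows, the printed values approach 1/2 and 1/12, the mean and variance of U(0,1).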
Uniformity

$\chi^2$ Goodness of Fit Test

Divide n observations into k (equal) intervals

Do a frequency count $f_i$, $i = 1, 2, \ldots, k$

Compute $X^2 = \sum_i \frac{(f_i - n/k)^2}{n/k} = \sum_i \frac{(f_i - n p_i)^2}{n p_i}$,
where $p_i = 1/k$, $i = 1, 2, \ldots, k$.

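Data Classification
[Figure: the interval [0, 1] divided at 1/k, 2/k, ..., (k-2)/k, (k-1)/k into k classes, labeled with observed counts $f_1, f_2, \ldots, f_{k-1}, f_k$ and expected counts $e_1, e_2, \ldots, e_{k-1}, e_k$]
$e_i$ = expected number of observations in interval i = $n p_i$ = n/k, i = 1, 2, ..., k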

NOTE
$(f_i - n p_i)/(n p_i)^{1/2}$ is the N(0,1) approximation of a multinomial distribution for $p_i$ small, where $E[f_i] = n p_i$ and $\mathrm{Var}[f_i] = n p_i (1 - p_i)$.
For n large, $X^2$ is distributed $\chi^2$ with k-1 degrees of freedom
Reject randomness assumption if $X^2 > \chi^2_{k-1,\alpha}$
NOTE: if $X^2$ is too close to zero, it may be because the numbers have been fudged.
BE WARY OF PRN WHICH LOOK TOO RANDOM
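[Figure: $\chi^2$ density with a "Do Not Reject $H_0$" region and a "Reject $H_0$" region separated by the critical value $\chi^2_{k-1,\alpha}$]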
Goodness of Fit Test

- Repeat test m times with independent samples of size n
- If $H_0$ is true, test will reject $H_0$ $\alpha m$ times (on average)
Trouble Spots

Choosing the intervals evenly

Choosing the intervals such that you would expect
each class to contain at least 5 or 10 observations

$p_i$ should (ideally) be small (< .05)
Example

n = 1000

[0, .1)   $f_i = 87$     [.1, .2)  $f_i = 93$     [.2, .3)  $f_i = 113$
[.3, .4)  $f_i = 106$    [.4, .5)  $f_i = 108$    [.5, .6)  $f_i = 99$
[.6, .7)  $f_i = 91$     [.7, .8)  $f_i = 95$     [.8, .9)  $f_i = 103$
[.9, 1.]  $f_i = 105$

$X^2 = 628/100 = 6.28 < \chi^2_{9,.05} = 16.919$
Do not reject $H_0$: U(0,1).
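The computation in this example can be reproduced with a short sketch. The helper names are hypothetical, and the critical value 16.919 is copied from the example rather than computed.

```python
def bin_counts(values, k):
    """Frequency count of values in [0, 1] over k equal-width intervals."""
    counts = [0] * k
    for u in values:
        counts[min(int(u * k), k - 1)] += 1   # u == 1.0 goes in the last interval
    return counts


def chi_square_uniform(counts):
    """X^2 = sum (f_i - n/k)^2 / (n/k) for k equally likely classes."""
    n = sum(counts)
    expected = n / len(counts)
    return sum((f - expected) ** 2 / expected for f in counts)


counts = [87, 93, 113, 106, 108, 99, 91, 95, 103, 105]
x2 = chi_square_uniform(counts)
CRITICAL_9_05 = 16.919   # chi-square critical value for 9 d.f. at .05, from the example
print(x2, "reject" if x2 > CRITICAL_9_05 else "do not reject")
```

`bin_counts` is included to show how raw U(0,1) values would be turned into the frequency counts that the statistic needs.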
NOTE
The $\chi^2$ goodness of fit test is also used to fit distributions to data, where
$X^2 = \sum_i \frac{(f_i - e_i)^2}{e_i}$
$e_i$ = expected number of observations in interval i.
Kolmogorov-Smirnov Goodness-of-fit Test

Order n U[0,1] variates $\{x_{[i]}\}$
Construct an empirical CDF for the n variates $\{x_{[i]}\}$
(i.e., $F(x_{[i]}) = i/n$, $i = 1, 2, \ldots, n$)
Construct a hypothesized CDF for n uniform variates
(i.e., $\tilde{F}(x) = x$, $0 \le x \le 1$)
Compute $D = \max\{D^+, D^-\}$, where
$D^+ = \max_{1 \le i \le n} [(i/n) - \tilde{F}(x_{[i]})]$
$D^- = \max_{1 \le i \le n} [\tilde{F}(x_{[i]}) - ((i-1)/n)]$.
Check tables
Reject if D is too large, with a risk $\alpha$, which means that we reject (uniformity) falsely with probability $\alpha$.
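[Figure: empirical CDF (steps at .25, .50, .75, 1.00) and hypothesized CDF for the variates .1, .2, .3, .9, with $D^+$ and $D^-$ marked]
$D^+ = \max\{.15, .30, .45, .10\} = .45$
$D^- = \max\{.10, -.05, -.20, .15\} = .15$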
Examples

If $\{U_i\} = \{.1, .2, .3, .9\}$, then D = .45.

If $\{U_i\} = \{.2, .6, .8, .9\}$, then D = .35.

If $\{U_i\} = \{.25, .5, .75, 1.\}$, then D = .25.

NOTE: The minimum value that D can take on is 1/(2n).
(How?)
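The statistic D for these examples can be computed with a few lines. This is a sketch for the uniform case only, where the hypothesized CDF value at x is x itself; the function name `ks_statistic` is hypothetical.

```python
def ks_statistic(values):
    """Return (D, D_plus, D_minus) for testing values against U(0,1)."""
    xs = sorted(values)
    n = len(xs)
    # enumerate starts at 0, so the i-th order statistic uses (i + 1)/n and i/n
    d_plus = max((i + 1) / n - x for i, x in enumerate(xs))
    d_minus = max(x - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus), d_plus, d_minus


for sample in ([.1, .2, .3, .9], [.2, .6, .8, .9], [.25, .5, .75, 1.]):
    d, d_plus, d_minus = ks_statistic(sample)
    print(sample, round(d, 2), round(d_plus, 2), round(d_minus, 2))
```

Placing each ordered variate at the midpoint (2i - 1)/(2n) of its step, for example {.125, .375, .625, .875} when n = 4, makes both $D^+$ and $D^-$ equal to 1/(2n).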
Independence

Sign Test
* Test Statistic: S = number of runs of numbers above or below the median
* For large N, S is distributed $N(\mu = 1 + (N/2),\ \sigma^2 = N/2)$
Example
N = 15, S = 7, distributed $N(\mu = 8.5,\ \sigma^2 = 15/2)$
Maximum value for S: N (negative dependency)
Minimum value for S: 1 (positive dependency)
.87 .15 .23 .45 .69 .32 .30 .19 .24 .18 .65 .82 .93 .22 .81
+ - - - + - - - - - + + + - +
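[Figure: Normal Curve Rejection Regions, with REJECT (+ve) and REJECT (-ve) tails beyond $-Z_{\alpha/2}$ and $Z_{\alpha/2}$ and Do Not REJECT in between]

$H_0$: Independence
$H_A$: Dependence

Reject $H_0$ in favor of $H_A$ if

$Z = \frac{S - (1 + N/2)}{(N/2)^{1/2}} > Z_{\alpha/2}$ or $Z \le -Z_{\alpha/2}$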
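Counting the runs for the example sequence can be sketched as follows. The cutoff of 0.5 (the median of U(0,1)) is an assumption, chosen because it reproduces the + and - signs shown in the example; the function name `runs_about_cutoff` is hypothetical.

```python
def runs_about_cutoff(values, cutoff=0.5):
    """Return the sign string and the number of runs above/below the cutoff."""
    if not values:
        return "", 0
    signs = ["+" if v > cutoff else "-" for v in values]
    # a new run starts at every position where the sign changes
    runs = 1 + sum(1 for s, t in zip(signs, signs[1:]) if s != t)
    return "".join(signs), runs


data = [.87, .15, .23, .45, .69, .32, .30, .19, .24, .18, .65, .82, .93, .22, .81]
signs, s = runs_about_cutoff(data)
print(signs, s)
```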
Runs Up and Down Test
(runs of increasing and decreasing numbers)

Assign + if $x_i < x_{i+1}$, assign - if $x_i > x_{i+1}$

Test Statistic: S = number of runs up AND down (sequence of + and -)
E(S) = (2N-1)/3, V(S) = (16N-29)/90
Use Normal approximation for N > 30.

Example:
N = 15, S = 8, distributed $N(\mu = 29/3,\ \sigma^2 = 211/90)$
Maximum value for S: N-1 (negative dependency)
Minimum value for S: 1 ?
.87 .15 .23 .45 .69 .32 .30 .19 .24 .18 .65 .82 .93 .22 .81
- + + + - - - + - + + + - +
[Figure: Normal Curve Rejection Regions, with REJECT tails beyond $-Z_{\alpha/2}$ and $Z_{\alpha/2}$ and Do Not REJECT in between]

$H_0$: Independence
$H_A$: Dependence

Reject $H_0$ in favor of $H_A$ if

$Z = \frac{S - (2N-1)/3}{[(16N-29)/90]^{1/2}} > Z_{\alpha/2}$ or $Z \le -Z_{\alpha/2}$
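The runs up and down statistic and its Z value can be sketched as below, using E(S) = (2N-1)/3 and V(S) = (16N-29)/90. The function name is hypothetical, and ties (equal neighbors) are counted as "-" purely as a simplifying assumption. The example has N = 15, below the N > 30 guideline, so the Z value here is only illustrative.

```python
from math import sqrt


def runs_up_down(values):
    """Return (signs, S, Z) for the runs up and down test."""
    signs = ["+" if b > a else "-" for a, b in zip(values, values[1:])]
    s = 1 + sum(1 for p, q in zip(signs, signs[1:]) if p != q)
    n = len(values)
    mean = (2 * n - 1) / 3
    variance = (16 * n - 29) / 90
    z = (s - mean) / sqrt(variance)
    return "".join(signs), s, z


data = [.87, .15, .23, .45, .69, .32, .30, .19, .24, .18, .65, .82, .93, .22, .81]
signs, s, z = runs_up_down(data)
print(signs, s, round(z, 3))
```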
Test of Cycling
Floyd's Test for Cycling
Assume $u_i = G(u_{i-1})$
$x_0 = y_0$ = seed
$x_i = G(x_{i-1})$, $y_i = G(G(y_{i-1}))$, i.e., y skips every other value, so y goes twice as fast as x.
Then check to see if there is some value of n for which $x_n = y_n$.
If $x_n = y_n$, cycling occurred.
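Floyd's test can be sketched with two copies of the generator state, one advanced once per step and one advanced twice. The function name `floyd_meeting_index` and the `max_steps` cap are hypothetical; the two small generators used to exercise it are the mod 9 examples from earlier.

```python
def floyd_meeting_index(g, seed, max_steps=1_000_000):
    """Return the first n >= 1 with x_n == y_n, or None if none is found in time."""
    x = y = seed
    for n in range(1, max_steps + 1):
        x = g(x)        # x advances one step
        y = g(g(y))     # y advances two steps
        if x == y:
            return n    # cycling detected
    return None


print(floyd_meeting_index(lambda z: (4 * z + 2) % 9, 7))
print(floyd_meeting_index(lambda z: (3 * z + 2) % 9, 7))
```

Only two state values are stored, so the check uses constant memory no matter how long the cycle is.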
Marsaglia's Theorem
All N-tuples generated by a congruential generator will fall in fewer than $(N!\,m)^{1/N}$ hyperplanes.
(Proc. Nat. Acad. Sci. 61, 1968, pp. 25-28)
e.g., all 10-tuples fall in at most 13 nine-dimensional hyperplanes for $m = 2^{16}$. Randu in ONLY 15 PLANES in 3D cube.
(Solution: Make m bigger, limited by computer word size.)
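The bound is easy to evaluate directly. In this sketch the function name `marsaglia_bound` is hypothetical, and the modulus $2^{31}$ used for the 3D case is an assumption (it is the modulus commonly quoted for Randu, not a value stated above).

```python
from math import factorial


def marsaglia_bound(n, m):
    """Upper bound (n! * m)^(1/n) on the number of hyperplanes holding all n-tuples."""
    return (factorial(n) * m) ** (1.0 / n)


print(marsaglia_bound(10, 2**16))   # 10-tuples, m = 2^16
print(marsaglia_bound(3, 2**31))    # triples, m = 2^31 (assumed Randu modulus)
```

Comparing the second value with Randu's 15 planes shows how far below the bound a poorly chosen multiplier can fall.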
[Figure: Plot of $RND_{i+1}$ vs $RND_i$ using LCG in SIGMA]
