The Gaussian probability distribution is perhaps the most used distribution in all of science.
It is sometimes called the bell-shaped curve or the normal distribution.
Unlike the binomial and Poisson distributions, the Gaussian is a continuous distribution:
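$$p(y) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(y-\mu)^2}{2\sigma^2}}$$
Here y is a continuous variable ($-\infty \le y \le \infty$), $\mu$ is the mean of the distribution, and $\sigma^2$ is its variance.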
Probability (P) of y being in the range [a, b] is given by an integral:
$$P(a \le y \le b) = \int_a^b p(y)\,dy = \frac{1}{\sigma\sqrt{2\pi}} \int_a^b e^{-\frac{(y-\mu)^2}{2\sigma^2}}\,dy$$
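The density and the interval probability above can be evaluated numerically. The following is a minimal sketch, not part of the original notes: the function names are illustrative, and the closed form relies on the standard identity that expresses the Gaussian integral through the error function, here taken from Python's `math.erf`. A midpoint-rule sum over the pdf is included only as a cross-check.

```python
import math

def gaussian_pdf(y, mu=0.0, sigma=1.0):
    """Gaussian probability density p(y) for mean mu and standard deviation sigma."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def gaussian_prob(a, b, mu=0.0, sigma=1.0):
    """P(a <= y <= b), written in terms of the error function."""
    za = (a - mu) / (sigma * math.sqrt(2.0))
    zb = (b - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * (math.erf(zb) - math.erf(za))

def gaussian_prob_numeric(a, b, mu=0.0, sigma=1.0, steps=10_000):
    """The same probability from a midpoint-rule sum over the pdf (cross-check)."""
    h = (b - a) / steps
    return h * sum(gaussian_pdf(a + (i + 0.5) * h, mu, sigma) for i in range(steps))

# Probability of falling within one standard deviation of the mean
print(gaussian_prob(-1.0, 1.0))
print(gaussian_prob_numeric(-1.0, 1.0))
```

Both calls should print roughly 0.68 for the interval [-1, 1] with mean 0 and standard deviation 1.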
[Figure: plot of the Gaussian pdf, $p(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$]
R. Kass/S06
P416 Lec 3
The total area under the curve is normalized to one by the $\sigma\sqrt{2\pi}$ factor:
$$\frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{-\frac{(y-\mu)^2}{2\sigma^2}}\,dy = 1$$
We often talk about a measurement being a certain number of standard deviations ($\sigma$) away from the mean ($\mu$) of the Gaussian.
We can associate a probability with a measurement lying more than $n\sigma$ from the mean, $|y - \mu| \ge n\sigma$, just by calculating the area outside of this region.
$n\sigma$     Prob. of exceeding $\pm n\sigma$
0.67          0.5
1             0.32
2             0.05
3             0.003
4             0.00006

It is very unlikely (< 0.3%) that a measurement taken at random from a Gaussian pdf will be more than $\pm 3\sigma$ from the true mean of the distribution.
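The tabulated probabilities can be reproduced from the complementary error function. This is a small sketch under the assumption that "exceeding" means the two-sided area, $|y - \mu| > n\sigma$; the function name `prob_exceeding` is illustrative only.

```python
import math

def prob_exceeding(n_sigma):
    """Two-sided probability that |y - mu| > n_sigma * sigma for a Gaussian."""
    return math.erfc(n_sigma / math.sqrt(2.0))

# Print the same rows as the table, to two significant figures
for n in (0.67, 1, 2, 3, 4):
    print(f"{n:>5}  {prob_exceeding(n):.2g}")
```

The result does not depend on $\mu$ or $\sigma$ separately, only on the number of standard deviations.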
[Figure: Gaussian pdf P(y) versus y; the shaded area gives the probability.]
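For a Gaussian distribution:
$$P(\mu-\sigma \le y \le \mu+\sigma) = \frac{1}{\sigma\sqrt{2\pi}} \int_{\mu-\sigma}^{\mu+\sigma} e^{-\frac{(y-\mu)^2}{2\sigma^2}}\,dy = 0.68$$
See Taylor 10.4.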
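Central Limit Theorem (CLT): for the sum $Y = Y_1 + Y_2 + \cdots + Y_n$ of n independent random variables, each with mean $\mu$ and variance $\sigma^2$:
$$\lim_{n\to\infty} P\left(a \le \frac{Y - n\mu}{\sigma\sqrt{n}} \le b\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-\frac{1}{2}y^2}\,dy$$
Equivalently, in terms of the sample mean $\bar{Y} = Y/n$ and the standard deviation of the mean $\sigma_m = \sigma/\sqrt{n}$:
$$\lim_{n\to\infty} P\left(a \le \frac{\bar{Y} - \mu}{\sigma/\sqrt{n}} \le b\right) = \lim_{n\to\infty} P\left(a \le \frac{\bar{Y} - \mu}{\sigma_m} \le b\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-\frac{1}{2}y^2}\,dy$$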
The CLT is true even if the Y's are from different pdfs, as long as the means
and variances are defined for each pdf!
See Appendix of Barlow for a proof of the Central Limit Theorem.
A random number generator gives numbers distributed uniformly in the interval [0,1],
with $\mu = 1/2$ and $\sigma^2 = 1/12$.
Procedure:
a) Take 12 numbers (r1, r2, ..., r12) from your computer's random number generator (ran(iseed))
b) Add them together
c) Subtract 6
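The procedure above can be sketched in a few lines. This is an illustration rather than the original code: Python's `random.Random` stands in for `ran(iseed)`, and the seed value, the sample count, and the function name `approx_gaussian` are arbitrary choices made here.

```python
import random
import statistics

def approx_gaussian(rng):
    """One approximately Gaussian number (mean 0, sigma 1): sum of 12 uniforms minus 6."""
    return sum(rng.random() for _ in range(12)) - 6.0

rng = random.Random(12345)  # fixed, arbitrary seed so the run is repeatable
samples = [approx_gaussian(rng) for _ in range(50_000)]

mean = statistics.fmean(samples)    # should be near 0
sigma = statistics.pstdev(samples)  # should be near 1
within_1sigma = sum(abs(s) <= 1.0 for s in samples) / len(samples)  # near 0.68
print(mean, sigma, within_1sigma)
```

Because each term has variance 1/12, the sum of 12 terms has variance 1, which is why no further rescaling is needed after subtracting 6.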
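With $n = 12$, $\mu = 1/2$, and $\sigma^2 = 1/12$:
$$P\left(a \le \frac{Y_1 + Y_2 + \cdots + Y_n - n\mu}{\sigma\sqrt{n}} \le b\right) = P\left(a \le \frac{\sum_{i=1}^{12} r_i - 12\cdot\frac{1}{2}}{\sqrt{\frac{1}{12}}\sqrt{12}} \le b\right) = P\left(a \le \sum_{i=1}^{12} r_i - 6 \le b\right)$$
$$P\left(-6 \le \sum_{i=1}^{12} r_i - 6 \le 6\right) \approx \frac{1}{\sqrt{2\pi}} \int_{-6}^{6} e^{-\frac{1}{2}y^2}\,dy$$
The sum minus 6 can only range from -6 to +6, but $n = 12$ is already close to the $n \to \infty$ limit.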
Example:
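$$\lim_{n\to\infty} P\left(a \le \frac{Y_1 + Y_2 + \cdots + Y_n - n\mu}{\sigma\sqrt{n}} \le b\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-\frac{1}{2}y^2}\,dy$$
Here $n = 365$, $\mu = 0$, and $\sigma = \sqrt{1/12}$ (in minutes). The upper limit corresponds to +25 minutes:
$$b = \frac{Y_1 + Y_2 + \cdots + Y_n - n\mu}{\sigma\sqrt{n}} = \frac{25 - 365 \times 0}{\sqrt{\frac{1}{12}}\sqrt{365}} = 4.5$$
The lower limit corresponds to -25 minutes:
$$a = \frac{-25 - 365 \times 0}{\sqrt{\frac{1}{12}}\sqrt{365}} = -4.5$$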
The probability to be within $\pm 25$ minutes is:
$$P = \frac{1}{\sqrt{2\pi}} \int_{-4.5}^{4.5} e^{-\frac{1}{2}y^2}\,dy = 0.999993$$
This integral is 1 to about 7 parts in $10^6$!
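The limits and the integral in this example can be checked numerically. The sketch below assumes only the values that appear in the formulas (365 terms, mean 0, variance 1/12, limits of ±25 minutes); the variable names are illustrative, and the unit-Gaussian integral is evaluated with `math.erf`.

```python
import math

n = 365                        # number of terms in the sum
mu = 0.0                       # mean of each term
sigma = math.sqrt(1.0 / 12.0)  # standard deviation of each term (minutes)
limit = 25.0                   # half-width of the interval on the sum (minutes)

# Standardized limits from the CLT
b = (limit - n * mu) / (sigma * math.sqrt(n))
a = (-limit - n * mu) / (sigma * math.sqrt(n))

# P(a <= y <= b) for a unit Gaussian, via the error function
prob = 0.5 * (math.erf(b / math.sqrt(2.0)) - math.erf(a / math.sqrt(2.0)))
print(round(b, 2), prob, 1.0 - prob)
```

With the unrounded limit (b is about 4.53 rather than 4.5), the probability of falling outside the interval comes out at a few parts in $10^6$.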
Example: The daily income of a card shark has a uniform distribution in the interval [-$40, $50].
What is the probability that s/he wins more than $500 in 60 days?
Let's use the CLT to estimate this probability:
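$$\lim_{n\to\infty} P\left(a \le \frac{Y_1 + Y_2 + \cdots + Y_n - n\mu}{\sigma\sqrt{n}} \le b\right) = \frac{1}{\sqrt{2\pi}} \int_a^b e^{-\frac{1}{2}y^2}\,dy$$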
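The probability distribution of daily income is uniform, p(y) = 1.
p(y) needs to be normalized in computing the average daily winning ($\mu$) and its standard deviation ($\sigma$):
$$\mu = \frac{\int_{-40}^{50} y\,p(y)\,dy}{\int_{-40}^{50} p(y)\,dy} = \frac{\frac{1}{2}\left[50^2 - (-40)^2\right]}{50 - (-40)} = 5$$
$$\sigma^2 = \frac{\int_{-40}^{50} y^2\,p(y)\,dy}{\int_{-40}^{50} p(y)\,dy} - \mu^2 = \frac{\frac{1}{3}\left[50^3 - (-40)^3\right]}{50 - (-40)} - 25 = 675$$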
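The two normalized integrals can be verified with exact rational arithmetic. This is a minimal sketch; `uniform_mean_variance` is a hypothetical helper name, and it simply evaluates the closed-form integrals of $y$ and $y^2$ over the interval.

```python
from fractions import Fraction

def uniform_mean_variance(lo, hi):
    """Mean and variance of a uniform pdf on [lo, hi], from the normalized integrals."""
    lo, hi = Fraction(lo), Fraction(hi)
    norm = hi - lo                         # integral of p(y) = 1 over [lo, hi]
    mean = (hi**2 - lo**2) / 2 / norm      # integral of y p(y), normalized
    mean_sq = (hi**3 - lo**3) / 3 / norm   # integral of y^2 p(y), normalized
    return mean, mean_sq - mean**2

mu, var = uniform_mean_variance(-40, 50)
print(mu, var)
```

The same helper applied to the interval [0, 1] returns 1/2 and 1/12, the values quoted for a uniform random number generator.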
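The lower limit corresponds to winning $500:
$$a = \frac{Y_1 + Y_2 + \cdots + Y_n - n\mu}{\sigma\sqrt{n}} = \frac{500 - 60 \times 5}{\sqrt{675}\sqrt{60}} = \frac{200}{201} \approx 1$$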
The upper limit is the maximum that the shark could win ($50/day for 60 days):
$$b = \frac{Y_1 + Y_2 + \cdots + Y_n - n\mu}{\sigma\sqrt{n}} = \frac{3000 - 60 \times 5}{\sqrt{675}\sqrt{60}} = \frac{2700}{201} = 13.4$$
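$$P = \frac{1}{\sqrt{2\pi}} \int_{1}^{13.4} e^{-\frac{1}{2}y^2}\,dy \approx \frac{1}{\sqrt{2\pi}} \int_{1}^{\infty} e^{-\frac{1}{2}y^2}\,dy = 0.16$$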
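The whole estimate can be reproduced in a few lines. This sketch takes the mean of 5 and variance of 675 per day as given and evaluates the unit-Gaussian integral between the two standardized limits with `math.erfc`; the names `unit_gaussian_cdf`, `scale`, `a`, and `b` are illustrative.

```python
import math

n, mu, var = 60, 5.0, 675.0       # days, mean and variance of the daily income
scale = math.sqrt(var * n)        # sigma * sqrt(n), about 201

a = (500.0 - n * mu) / scale      # lower limit: winning more than $500
b = (60 * 50.0 - n * mu) / scale  # upper limit: $50/day for 60 days

def unit_gaussian_cdf(z):
    """Cumulative probability of a Gaussian with mean 0 and sigma 1."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

prob = unit_gaussian_cdf(b) - unit_gaussian_cdf(a)
print(round(a, 2), round(b, 1), round(prob, 2))
```

Since the upper limit is more than 13 standard deviations out, replacing it by infinity changes the result by a negligible amount.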