• Midterm on April 5
• Any questions?
Ch1 Uncertainty
• “Uncertainty is the refuge of hope.”
Henri Frederic Amiel
• “Uncertainty is fun.”
JP Wang
Aleatory uncertainty in daily life
• In studying geotechnical
foundation engineering, you will
be introduced to the Terzaghi,
Meyerhof, and Vesic methods, all
targeting the same problem
• Therefore, for equally likely outcomes,
=> probability = number of sample points in the event / number of sample points in the sample space
Finite and infinite sample space
Ex 2.2
Jerry manages three projects, and each outcome will
be good, bad, or terrible with equal chance.
He was told that if he gets at least one
bad and at least one terrible, he is history at the
company. What is the probability of Jerry getting
fired?
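The sample space of Ex 2.2 is small enough to enumerate. The sketch below assumes the three projects are independent and reads the condition as "at least one bad and at least one terrible"; the variable names are illustrative only.

```python
from fractions import Fraction
from itertools import product

OUTCOMES = ("good", "bad", "terrible")

# All 3^3 = 27 equally likely results for the three projects.
sample_space = list(product(OUTCOMES, repeat=3))

# Event of interest: at least one "bad" AND at least one "terrible".
fired = [s for s in sample_space if "bad" in s and "terrible" in s]

# probability = sample points in the event / sample points in the sample space
pr_fired = Fraction(len(fired), len(sample_space))
print(len(fired), len(sample_space), pr_fired)
```

Counting sample points this way is the same bookkeeping a tree diagram does by hand.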
• Venn’s diagram
Tree diagrams
Ex 2.2
1) sample space Ra
Ra : [0, 300]
2) sample space Rb
Rb: [0, 300]
Fundamental probability rules:
• Commutative rule (switching place)
=> A U B = B U A ; AB = BA
• Distributive rules => (A U B)C = AC U BC
• Mutually exclusive (ME) => Pr(AB) = 0
Ex 2.5
Draw the respective Venn diagrams for ME (mutually exclusive) and SI (statistically independent) events
Conditional probability *****
• Pr (A | B)
=> probability of A happening GIVEN B has occurred
A => route 1 open
B => route 2 open
Pr(A) = 0.75
Pr(B) = 0.5
Pr(AB) = 0.4
Review of last lecture
Pr (A | B) = Pr(AB) / Pr(B)
A => route 1 open
B => route 2 open
Pr(A) = 0.75
Pr(B) = 0.5
Pr(AB) = 0.4
• What are your findings?
• What is Pr(A)?
Theorem of total probability
Ex2.7: flood question
Given:
H: heavy snow accumulation; Pr(H) = 0.2
N: normal snow accumulation ; Pr(N) = 0.5
L: light snow accumulation ; Pr(L) = 0.3
Pr(F|H) = 0.9; Pr(F|N) = 0.4; Pr(F|L) = 0.1
F: flood
What is Pr(F)?
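The theorem of total probability applied to Ex2.7 can be sketched as a weighted sum over the three snow conditions. The dictionary names are illustrative; the numbers are the ones given above.

```python
# Prior probabilities of the snow accumulation levels (H, N, L).
prior = {"H": 0.2, "N": 0.5, "L": 0.3}

# Conditional probabilities of a flood given each level.
flood_given = {"H": 0.9, "N": 0.4, "L": 0.1}

# Pr(F) = sum over i of Pr(F | Ei) * Pr(Ei)
pr_flood = sum(flood_given[s] * prior[s] for s in prior)
print(round(pr_flood, 2))
```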
• Can we solve the problem with alternative or
“elementary-school” methods?
What is Pr(D)?
=> Pr(D) = 0.35 x 0.05 + 0.25 x 0.1 + 0.14 x 0.25
+ 0.05 x 0.6 + 0.01 x 1 = 0.12
• What if now you are interested in: given A
happening, what is the probability of E1
happening
=> Pr(E1 | A) = Pr(E1A) / Pr(A)
Pr(Ei | A) = Pr(EiA) / Pr(A) (1)
$$\Pr(E_i \mid A) = \frac{\Pr(A \mid E_i)\,\Pr(E_i)}{\sum_j \Pr(A \mid E_j)\,\Pr(E_j)}$$
Ex2.9: construction question
You order 60% and 40% of your aggregates from companies A and B.
Their sub-standard rates are 3% and 1%, respectively.
Given that an aggregate is poor, what is the probability that it came from A?
                              A      B
Aggregates (% of order)       60     40
Poor quality (% of order)     1.8    0.4
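The table above is already a Bayes calculation in disguise. A minimal sketch, with illustrative names, that reproduces the 1.8 and 0.4 entries and then normalizes them:

```python
# Share of the order from each company and its sub-standard rate.
share = {"A": 0.60, "B": 0.40}
poor_rate = {"A": 0.03, "B": 0.01}

# Joint probability Pr(company AND poor), i.e. the "Poor quality" row / 100.
joint = {c: share[c] * poor_rate[c] for c in share}

# Total probability of a poor aggregate, then Bayes' theorem.
pr_poor = sum(joint.values())
posterior = {c: joint[c] / pr_poor for c in joint}
print(round(posterior["A"], 3), round(posterior["B"], 3))
```

Under these numbers, roughly 82% of the poor aggregates come from A, even though A's share of the order is only 60%.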
• Bayes’ theorem
Summary of Ch 2
• Probability =
• Venn’s diagram
• Conditional probability
=> Pr(A|B) = Pr(AB) / Pr(B)
[Figure: histogram of No. Students by score range: 30-40, 40-50, 50-60, 60-70, 70-80, 80-90, 90-100]
Discrete RV Continuous RV
Basic axioms:
F(∞) = Pr(X ≤ ∞) = 1
F(-∞) = Pr(X ≤ -∞) = 0
Pr(a < X ≤ b) = F(b) – F(a)
Ex3.1: construction management example
You manage three bulldozers in
construction. Each of them has a 50-50
probability of being functional or not after six
months. Let Y be the RV for the number of functional bulldozers.
1) What type of RV is Y?
=> discrete RV
2) Range of Y
=> 0 ~ 3
3) PDF of Y
=> f(0) = 1/8, f(1) = 3/8, f(2) = 3/8, f(3) = 1/8
4) CDF of Y
F(0) = f(0) = 1/8
F(1) = f(0) + f(1) = 1/2
F(2) = f(0) + f(1) + f(2) = 7/8
F(3) = 1
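The PDF and CDF in Ex3.1 can be checked with a few lines. This sketch assumes the three bulldozers behave independently, so that Y is binomial with n = 3 and p = 1/2; that independence is an assumption, not something stated in the example.

```python
from fractions import Fraction
from math import comb

n, p = 3, Fraction(1, 2)

# PDF (probability mass) of Y = number of functional bulldozers.
pmf = {y: comb(n, y) * p**y * (1 - p) ** (n - y) for y in range(n + 1)}

# CDF: running sum of the PDF.
cdf = {}
running = Fraction(0)
for y in range(n + 1):
    running += pmf[y]
    cdf[y] = running

for y in range(n + 1):
    print(y, pmf[y], cdf[y])
```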
2) Given PDF of T:
What is the CDF?
3) Given λ = 1 per year, what is the probability that a
bulldozer will fail within 2 years?
=> Pr(T ≤ 2) = F(2)
= 1 – e^(-1 x 2) = 0.86
=> 0.5
• RV
• CDF
How to describe a RV, or the ID of a RV
1. Distribution
2. Central values
– Mean => E(X) or μX
– Mode => Pr (X = xmode) = highest
– Median => Pr (X ≤ x50) = 0.5
3. Variability
– Standard deviation or variance (= SD^2)
=> V(X) = E[(X - μX)^2]
4. Higher order of central moment => skewness,
kurtosis, etc…
Ex3.4: Statistics about how we use the gym in
HKUST
[Figure: histogram of No. people using the HKUST Gym vs. Duration (in 10 minutes), durations 1 to 7; a total of 420 data]
1) What is mode?
=> 1
2) What is median?
=> 4
3) What is mean?
=> 3.7
What is “E” or how to calculate the mean and SD
• Let f(x) be the PDF of X,
E(X) = ∫ xf(x) dx ***** (continuous RV)
E(X) = ∑ xf(x) ***** (discrete RV)
[Figure: histogram of No. people using the HKUST Gym vs. Duration (in 10 minutes)]
2) E[(X - μX)^2]
= (1 - 3.7)^2 x 120/420 + (2 - 3.7)^2 x 120/420 + … = 5.2
3) E[(X - μX)^3]
= (1 - 3.7)^3 x 120/420 + …
• PDF:
Ex3.6: HKUST students’ height
Given our height (X) following the normal
distribution with mean and SD equal to 170 cm
and 7.5 cm, what is the probability that Jerry’s
roommate is taller than 185 cm
2) Ф(-0.15)
=> 1-Ф(0.15)
= 0.4404
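Both numbers in Ex3.6 can be reproduced without a table. The sketch below uses Python's standard-library `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

# 1) Height X ~ N(mean = 170 cm, SD = 7.5 cm); Pr(X > 185) = 1 - Phi(2).
height = NormalDist(mu=170, sigma=7.5)
pr_taller = 1 - height.cdf(185)

# 2) Standard normal CDF at -0.15, which equals 1 - Phi(0.15) by symmetry.
std_normal = NormalDist()
phi_neg = std_normal.cdf(-0.15)

print(round(pr_taller, 4), round(phi_neg, 4))
```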
Review of last lecture
E(X – μX) = ∫ xf(x) dx – μX
= μX – μX = 0
[Figure: symmetric PDF of X; skewness = 0]
How to find Ф()
• Using Excel
Ф(0.15) = NORMSDIST(0.15)
Uniform distribution
• Clearly, it means that the probability density is
uniform, so that its PDF looks as follows when X is in
[a, b]
[Figure: constant PDF between a and b]
=> PDF of X = 1 / (b-a)
=> Once again, Excel
[Figure: PDF vs. X for X from 0 to 10, mean rate = 1]
Review of last lecture
Exponential distribution
• It is part of the statistical Poisson process. The
difference from the Poisson model is
that the exponential model describes a different RV: the
recurrence time
$$\Rightarrow \Pr(M = m) = \binom{n}{m}\, p^m (1-p)^{n-m}$$
Ex3.14
You manage three projects, and the probability that
your supervisor is happy or not happy with your
performance on each is 0.9 and 0.1, respectively. What is the probability
that your boss is happy with two out of three projects?
F (∞, ∞) = 1
F (-∞, -∞) = 0
[Figure: joint probability Pr plotted over X and Y]
• Joint PDF can be reduced to single PDF or the
so-called marginal PDF
Pr (Y = 10) = fY(10): the summation of blue columns
Pr (Y = 20) = fY(20): the summation of purple columns
Pr (Y = 30) = fY(30): the summation of yellow columns
[Figure: joint PDF bar chart over X = 1, 2, 3 and Y = 10, 20, 30]
Ex4.2: continuous case
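$$f(x, y) = \begin{cases} \frac{6}{5}(x + y^2) & ; 0 \le x \le 1,\ 0 \le y \le 1 \\ 0 & ; \text{otherwise} \end{cases}$$
What is fX(x)?
$$f_X(x) = \int_0^1 f(x, y)\,dy = \int_0^1 \frac{6}{5}(x + y^2)\,dy = \frac{6}{5}x + \frac{2}{5}$$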
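The marginal PDF in Ex4.2 can be checked numerically. This is a rough sketch using a midpoint rule written from scratch; the helper names `midpoint` and `marginal_x` are illustrative and not part of the course material.

```python
def f(x, y):
    """Joint PDF of Ex4.2: 6/5 * (x + y^2) on the unit square, 0 elsewhere."""
    if 0 <= x <= 1 and 0 <= y <= 1:
        return 6 / 5 * (x + y**2)
    return 0.0


def midpoint(g, a, b, n=1000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def marginal_x(x):
    """fX(x): integrate the joint PDF over y."""
    return midpoint(lambda y: f(x, y), 0, 1)


# Compare with the closed form 6/5 * x + 2/5 at x = 0.5.
print(marginal_x(0.5), 6 / 5 * 0.5 + 2 / 5)

# A legal joint PDF integrates to 1 over its whole range.
total = midpoint(marginal_x, 0, 1, n=200)
print(total)
```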
What is the No.1 quality of a “legal” joint PDF
=> The summation of it needs to be 1.0
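$$f(x, y) = \begin{cases} \frac{6}{5}(x + y^2) & ; 0 \le x \le 1,\ 0 \le y \le 1 \\ 0 & ; \text{otherwise} \end{cases}$$
Since $\int_{y=0}^{1}\int_{x=0}^{1} \frac{6}{5}(x + y^2)\,dx\,dy = 1$, it is a legal joint PDF.
Can we always do $\int_{x=0}^{1}\int_{y=0}^{1}$ ?
How about $\int_{x=0}^{1}\int_{y=0}^{1-x}$ ?
[Figure: region of positive density in the X-Y plane; at X = x*, Y runs from (x*, 0) to (x*, 1-x*)]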
Therefore,
f(x) = 0
Review of last lecture
• In this case:
fX, Y(x, y) = fX (x) * fY(y)
For example:
fX (x) = 0.1, fY (y) = 0.1 and fX, Y(x, y) = 0.01
Ex4.5
Are X and Y in Ex4.1 statistically independent?
Ex4.7
What is E(XY) in Ex4.4?
[Figure: region of positive density in the X-Y plane]
$$E(XY) = \int_{x=0}^{1}\int_{y=0}^{1-x} xy \cdot 24xy \,dy\,dx = \frac{2}{15}$$
Covariance
• Cov(X,Y) = E [ (X-μX)*(Y-μY) ] ***
= ∫∫ (x-μX)*(y-μY) * f(x,y) dxdy
!! Cov(X,Y) is unit-dependent
[Figure: PDFs of X and Y]
Ex4.8
Given X ~ U(0, 20) and Y = 3X
[Figure: PDF of Y over 0 to 60]
ii) Pr(Y < 100)?
= 1
iii) Pr(Y ≤ 30)?
= 0.5
• FY(y)
= Pr(Y ≤ y)
= Pr(X ≤ x)
= Pr(X ≤ f^-1(y))
= FX(f^-1(y))
(You can imagine Y = 3X so that X = Y/3, and use x = 10 and y = 30)
Ex4.9
Given X ~ U(10, 100) and Y = 1/X, so that X = 1/Y
[Figure: PDF of X]
=> 0.01 to 0.1
= 1
• FY(y)
= Pr(Y ≤ y)
= Pr(X > x)
= 1 - Pr(X ≤ x)
= 1 - Pr(X ≤ f^-1(y))
= 1 - FX(f^-1(y))
(You can imagine Y = 1/X so that X = 1/Y, and use x = 50 and y = 0.02)
• Function of RV
=> given PDF of X, and y = g(x), find PDF of Y
=> mapping
Ex4.10: X ~ Uniform(0, π); Y = sin X
What is the range of Y
=> 0 to 1
What is Pr (Y ≤ 1)
=> 1
What is Pr(Y ≤ 0.5)
=> mapping
Pr(Y ≤ 0.5) = Pr (X ≤ π/6) + Pr (X > 5π/6)
Pr (X ≤ π/6) = π/6 * 1/π = 1/6
Pr (X > 5π/6) = 1 - Pr (X ≤ 5π/6) = 1/6
Pr(Y ≤ 0.5) = 1/6 + 1/6 = 1/3
Linear combination of RVs
• Given PDFs of X and Y, and Z = aX + bY for
example, what is PDF of Z
Since Z ~ N as well,
Pr(Z < 210) = Ф ((210 - 200) / 20) = 0.69
Extended special case:
=> Given Z = X/Y and both X and Y follow the
lognormal distribution, Z follows the
lognormal distribution as well
Pr (C > 35000)
= Pr (lnC > ln35000)
= 1 – Pr (lnC ≤ ln35000)
= 1 - Ф ((ln35000 – 10.36) /0.26)
= 1 - Ф (0.39) = 0.35
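The lognormal calculation above reduces to a standard normal lookup on ln C. A small sketch, assuming 10.36 and 0.26 are the mean and SD of ln C as the calculation implies (the names `mean_ln_c` and `sd_ln_c` are illustrative):

```python
from math import log
from statistics import NormalDist

mean_ln_c = 10.36  # mean of ln C, as used in the calculation above
sd_ln_c = 0.26     # SD of ln C, as used in the calculation above

# Standardize ln(35000), then take the right-tail probability.
z = (log(35000) - mean_ln_c) / sd_ln_c
pr_exceed = 1 - NormalDist().cdf(z)

print(round(z, 2), round(pr_exceed, 2))
```

Without rounding z to two digits first, the result differs from the hand calculation only in the third decimal place.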
Special case II: Poisson distribution
=> Z = X + Y and both X and Y follow the Poisson
distribution, so does Z
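$$f(x) = \begin{cases} k e^{-kx} & ; x \ge 0 \\ 0 & ; \text{otherwise} \end{cases} \qquad f(y) = \begin{cases} h e^{-hy} & ; y \ge 0 \\ 0 & ; \text{otherwise} \end{cases}$$
Let Z = Y – X (boundary line Y – X = z; A is the integration region), with X and Y independent.
For z < 0, therefore,
$$F_Z(z) = \Pr(Z \le z) = \Pr(Y - X \le z) = \iint_A f(x, y)\,dx\,dy = \int_{x=-z}^{\infty}\int_{y=0}^{z+x} f(x, y)\,dy\,dx$$
For z ≥ 0, therefore,
$$F_Z(z) = \Pr(Z \le z) = \Pr(Y - X \le z) = \iint_A f(x, y)\,dx\,dy = \int_{x=0}^{\infty}\int_{y=0}^{z+x} f(x, y)\,dy\,dx$$
As a result, for z ≥ 0,
$$F_Z(z) = \int_{x=0}^{\infty}\int_{y=0}^{z+x} kh\,e^{-kx} e^{-hy}\,dy\,dx = 1 - \frac{k}{h+k}\,e^{-hz}$$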
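Ex4.15: Following Ex4.14, what is the CDF of Z,
given Z = X + Y, and given Z = Y/X?
$$f(x) = \begin{cases} k e^{-kx} & ; x \ge 0 \\ 0 & ; \text{otherwise} \end{cases} \qquad f(y) = \begin{cases} h e^{-hy} & ; y \ge 0 \\ 0 & ; \text{otherwise} \end{cases}$$
For Z = X + Y (boundary line X + Y = z; A is the integration region):
i) z < 0
=> F(z) = 0
ii) z ≥ 0
$$F_Z(z) = \Pr(Z \le z) = \Pr(X + Y \le z) = \iint_A f(x, y)\,dx\,dy = \int_{x=0}^{z}\int_{y=0}^{z-x} f(x, y)\,dy\,dx$$
With some tedious calculation:
$$F_Z(z) = \int_{x=0}^{z}\int_{y=0}^{z-x} kh\,e^{-kx} e^{-hy}\,dy\,dx = 1 - \frac{1}{k-h}\left(k e^{-hz} - h e^{-kz}\right)$$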
**********************************************************
As for Z = Y/X (boundary line Y - zX = 0; A is the integration region),
FZ(z) = Pr(Y/X ≤ z)
= Pr(Y – zX ≤ 0)
$$F_Z(z) = \iint_A f(x, y)\,dx\,dy = \int_{x=0}^{\infty}\int_{y=0}^{zx} kh\,e^{-kx} e^{-hy}\,dy\,dx = \frac{hz}{k + hz}$$
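Closed-form CDFs like these are easy to get wrong, so a quick simulation check is worthwhile. The sketch below assumes X and Y are independent exponentials with rates k and h, and takes the Y – X result for z ≥ 0 as 1 – k/(h+k) e^(-hz). The values k = 1, h = 2, z = 0.5 and the function names are arbitrary choices for illustration.

```python
import random
from math import exp


def simulate(k, h, z, n=200_000, seed=0):
    """Empirical Pr(Y-X <= z), Pr(X+Y <= z), Pr(Y/X <= z) by sampling."""
    rng = random.Random(seed)
    diff = total = ratio = 0
    for _ in range(n):
        x = rng.expovariate(k)
        y = rng.expovariate(h)
        diff += y - x <= z
        total += x + y <= z
        ratio += y <= z * x  # same event as Y/X <= z since X > 0
    return diff / n, total / n, ratio / n


def closed_form(k, h, z):
    """The three CDF formulas for z >= 0 (requires k != h for the sum)."""
    f_diff = 1 - k / (h + k) * exp(-h * z)
    f_sum = 1 - (k * exp(-h * z) - h * exp(-k * z)) / (k - h)
    f_ratio = h * z / (k + h * z)
    return f_diff, f_sum, f_ratio


sim = simulate(1.0, 2.0, 0.5)
exact = closed_form(1.0, 2.0, 0.5)
for s, e in zip(sim, exact):
    print(round(s, 3), round(e, 3))
```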
Tips of finding FZ(z) given Z = g(X,Y)
FZ(z)
= Pr(Z ≤ z)
= Pr(Y – X ≤ z)
= ∫∫ f(x,y) dxdy over region A, with limits such as x from 0 and y from 0 to z + x
=> The key is to identify the region A
Computer-aided analysis:
• Repeating a number of “tossing-a-dice”
experiments on a computer
[Figure: scatter plots of simulated points; legend: total point, point inside]
• Can you calculate the size of Hong Kong
by MCS?
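The idea behind measuring an area by MCS is to scatter random points over a bounding box and count the fraction that lands inside the shape. The sketch below uses a unit circle as a stand-in shape, since no map data is given here; `mcs_area` and `in_unit_circle` are hypothetical helper names.

```python
import random


def mcs_area(inside, x_range, y_range, n=100_000, seed=0):
    """Estimate an area as box_area * (points inside / total points)."""
    rng = random.Random(seed)
    (x0, x1), (y0, y1) = x_range, y_range
    hits = sum(
        inside(rng.uniform(x0, x1), rng.uniform(y0, y1)) for _ in range(n)
    )
    box_area = (x1 - x0) * (y1 - y0)
    return box_area * hits / n


def in_unit_circle(x, y):
    """Stand-in shape: a circle of radius 1 centered at the origin."""
    return x * x + y * y <= 1.0


area = mcs_area(in_unit_circle, (-1, 1), (-1, 1))
print(area)  # should be close to pi
```

For a real map, `inside` would be replaced by a test of whether a point falls within the boundary, and the box by the map's extent.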
Summary of computer-aided probabilistic
analysis or MCS
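Candidate estimators for the population mean μ:
$$\bar{X} = \frac{x_1 + \cdots + x_n}{n}, \qquad \bar{X} = \frac{x_1 + \cdots + x_n}{n^2}, \qquad \bar{X} = x_1$$
$$E\left(\frac{x_1 + \cdots + x_n}{n}\right) = \frac{1}{n}\left[E(x_1) + \cdots + E(x_n)\right] = \frac{1}{n}\,n\mu = \mu \quad \text{(unbiased)}$$
$$E\left(\frac{x_1 + \cdots + x_n}{n^2}\right) = \frac{1}{n^2}\left[E(x_1) + \cdots + E(x_n)\right] = \frac{1}{n^2}\,n\mu = \frac{\mu}{n} \quad \text{(biased)}$$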
• Given two estimators that are both unbiased,
we compare their consistency and
efficiency, …
• Consistency:
$$\lim_{n \to \infty} \Pr(|\hat{\theta} - \theta| > \varepsilon) = 0$$
(ε: very small value)
• Efficiency:
V(Estimator)
Central Limit Theorem
No matter what distribution X is following, when n
> 30, the sample mean of X follows:
$$\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$$
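The Central Limit Theorem can be seen in a short simulation. This sketch draws samples from an exponential distribution with rate 1 (so μ = 1 and σ = 1), which is an arbitrary choice of a clearly non-normal X; the function name and settings are illustrative.

```python
import random
from statistics import mean, stdev


def sample_means(n, trials=5000, rate=1.0, seed=0):
    """Return `trials` sample means, each from n exponential draws."""
    rng = random.Random(seed)
    return [
        mean(rng.expovariate(rate) for _ in range(n)) for _ in range(trials)
    ]


n = 36
means = sample_means(n)

# CLT prediction: mean of sample means = mu = 1, SD = sigma / sqrt(n) = 1/6.
print(round(mean(means), 3), round(stdev(means), 3), round(1 / n**0.5, 3))
```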
• Make-up midterm will be held on April 13, for
those who want to improve their midterm grades
(40%)
• Unbiased estimator
• Ex: $\bar{X} = \frac{x_1 + \cdots + x_n}{n}$
is an unbiased estimator for calculating the sample
mean, used as an estimate of the population
mean
Estimator for standard deviation
$$S^2 = \frac{1}{n-1}\sum \left(X_i - \bar{X}\right)^2$$
Ex6.2:
You bought 10 helmets and found 3 of them are
flawed, what is your estimate for the flawed
rate?
(By Method of Moment: 3/10)
=> Maximize the probability of the observed samples
Pr(3 bad out of 10) = C x p^3 x (1-p)^7
(p denotes the flawed rate)
Ex6.2:
Let y = Pr(3 bad out of 10) = C x p^3 x (1-p)^7 and maximize ln(y):
$$\frac{d \ln y}{dp} = 0 \;\Rightarrow\; \frac{3}{p} - \frac{7}{1-p} = 0 \,;\quad p = \frac{3}{10}$$
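The maximum likelihood estimate in Ex6.2 can also be found by brute force, which makes the idea of "maximizing the probability of the observed samples" concrete. A minimal sketch with a simple grid search; the grid resolution is an arbitrary choice.

```python
from math import comb


def likelihood(p, n=10, k=3):
    """Pr(k flawed out of n) for a given flawed rate p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Evaluate the likelihood on a grid of candidate flawed rates.
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=likelihood)

print(p_hat, round(likelihood(p_hat), 4))
```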
Ex6.3:
x1,…,xn are samples from a normal distribution,
what are the estimates for the population’s mean
and SD from MLE
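$$y = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x_1-\mu)^2/2\sigma^2} \times \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x_2-\mu)^2/2\sigma^2} \times \cdots \times \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(x_n-\mu)^2/2\sigma^2} = \left(\frac{1}{2\pi\sigma^2}\right)^{n/2} e^{-\sum (x_i-\mu)^2/2\sigma^2}$$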
Ex6.3:
$$\Rightarrow \ln(y) = -\frac{n}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum (x_i - \mu)^2$$
$$\Rightarrow \text{find } \frac{d \ln y}{d\mu} = 0:\quad \sum 2(x_i - \mu) = 0,\quad \sum x_i - n\mu = 0,\quad \mu = \frac{\sum x_i}{n}$$
Ex6.3:
$$\Rightarrow \text{find } \frac{d \ln y}{d\sigma} = 0:\quad \sigma^2 = \frac{\sum \left(x_i - \bar{X}\right)^2}{n}$$
• Do some calculus
Review of last lecture
Ex6.4
JP caught 100 roaches and found a sample mean of 40
mm. What is the interval estimate for the population
mean at a 95% confidence level, given a
population SD of 4 mm?
=> Based on CLT => $\bar{X} \sim N\left(\mu, \frac{\sigma}{\sqrt{n}}\right)$
Let $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$, so that Z ~ N(0, 1)
[Figure: PDF of Z ~ N(0, 1) with the central 95% and 2.5% in each tail]
Ex6.4
[Figure: PDF of Z ~ N(0, 1) with the central 95% and 2.5% in each tail]
* because $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
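The interval in Ex6.4 follows from solving the Z expression for μ at the two 2.5% tail points. A minimal sketch using the standard library; `z_interval` is an illustrative helper name.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, confidence=0.95):
    """Two-sided interval for the population mean with known sigma."""
    # Critical value that leaves (1 - confidence)/2 in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


low, high = z_interval(xbar=40, sigma=4, n=100)
print(round(low, 2), round(high, 2))
```

With these inputs the critical value is about 1.96, so the half width is about 1.96 x 4/10 mm.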
$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
• The t-distribution is also
symmetrical
Ex6.5
JP caught 4 roaches and found sample mean = 40 mm and
sample SD = 4 mm. What is the interval estimate for the
population mean at a 95% confidence level?
=> $t = \frac{\bar{X} - \mu}{S/\sqrt{n}}$
[Figure: PDF of the t-distribution with DOF = 3; central 95% between -3.18 and 3.18, 2.5% in each tail]
* 3.18 = (40 - μ) / (4/2)
Review of last lecture
• Interval estimation
Interval estimation for population SD
$$\chi^2 = \frac{(n-1)S^2}{\sigma^2}$$ ******
• $\chi^2$, chi-square, is a random variable following
the chi-square distribution
In Excel:
CHIDIST
CHIINV
Chi-square table
Ex6.6
JP caught 21 roaches and found sample SD = 1 mm,
what is the interval estimation for population SD
given confidence interval = 95%.
$$\chi^2 = \frac{(n-1)S^2}{\sigma^2}$$
[Figure: PDF of the chi-square distribution with 2.5% in each tail]
* 2.08 = 20x1/9.6
Interval estimation for variance ratio
between two populations
$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$$ ******
In Excel:
FDIST
FINV
Ex6.7
Find F-value with DOF1 = 10 and DOF2 = 5 with
right-tail probability = 5%
=> 4.74
Ex6.8
Find F-value with DOF1 = 10 and DOF2 = 5 with
left-tail probability = 5%
$$\Rightarrow F_{\text{left tail}}(v_1, v_2) = \frac{1}{F_{\text{right tail}}(v_2, v_1)} = 1/3.33 = 0.3$$
Ex6.9
Jerry caught 21 cockroaches in Hall I and Mike got 16
in Hall II at HKUST. The sample variances are
1.85 and 1.65, respectively. Find the ratio of the two
populations’ variances at a 90% confidence
level
=> Thinking of
$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$$
[Figure: PDF of the F-dist with DOF1 = 20 and DOF2 = 15; central 90% and 5% in each tail]
[Figure: PDF with the confidence interval CI in the middle and (1-CI)/2 in each of the LEFT and RIGHT tails]
• Interval estimation
$$\chi^2 = \frac{(n-1)S^2}{\sigma^2}$$
$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$$
Ch7 Hypothesis testing
• It is like placing a bet on unknowns
[Figure: PDF of Z ~ N(0, 1); rejection regions in both tails, acceptance region in the middle]
Pr (Rejection region)
= level of significance = α
$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
If the statistic falls in the rejection region,
reject H0, accept H1
Type I and Type II errors in hypothesis
testing
• It is a mistake that a judge sends an innocent
person to jail
• It is also a mistake that a judge does not send a
guilty person to jail
• As a result, hypothesis testing is associated with
a given error, like interval estimation associated
with a confidence interval
• Type I error: reject H0 given H0 is true
• Type II error: accept H0 given H0 is false
Ex7.1
The population mean is 168 cm in HKUST.
Should this hypothesis be accepted given α =
5% and with 10 random samples?
i) H0: μ = 168
ii) H1: μ ≠ 168 (two-tail)
iii) Statistics:
$$t = \frac{\bar{X} - \mu}{S/\sqrt{n}}$$
[Figure: PDF of the t-dist with DOF = 9]
i) H0: σ = 3
ii) H1: σ ≠ 3 (two-tail)
iii) Statistics:
$$\chi^2 = \frac{(n-1)S^2}{\sigma^2}$$
[Figure: PDF of the χ² distribution with DOF = 9]
• The larger the α, the larger the rejection zone, and the
more easily H0 is rejected
• Given H0 not rejected at 1% α and at 10% α, the latter is
“stronger.” That should be the reason α is referred to as
“level of significance.”
Review of last lecture
• Set H0 and H1
• Use a proper statistic (Z, t, F, Chi2…) and calculate it
• Find the rejection region
• If the statistic is inside the rejection region, reject H0
[Figure: PDF with rejection regions in both tails and the acceptance region in the middle]
Pr (Rejection region)
= level of significance = α
Ex7.3
The steel company claims that its steel can sustain
more than 2650 pounds. Randomly select 6
samples and their strengths are:
2680, 2780, 2450, 2620, 2480, 2500
Should we trust the statement of the company given
level of sig. = 5%?
=> H0: μ = 2650
H1: μ ≠ 2650
[Figure: PDF of the t-dist with DOF = 5; 5% rejection region; critical value 2.01]
Since $t = \frac{\bar{X} - \mu}{S/\sqrt{n}}$,
$$t = \frac{2585 - 2650}{130.2/\sqrt{6}} = -1.22$$
=> Accept H0
=> Reject the company’s statement
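The t statistic in Ex7.3 can be recomputed directly from the six strengths. This sketch uses the standard library, where `stdev` is the sample SD with n - 1 in the denominator; the variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

strengths = [2680, 2780, 2450, 2620, 2480, 2500]
mu0 = 2650  # claimed strength under H0

xbar = mean(strengths)  # sample mean
s = stdev(strengths)    # sample SD (n - 1 denominator)
t = (xbar - mu0) / (s / sqrt(len(strengths)))

print(xbar, round(s, 1), round(t, 2))
```

The magnitude 1.22 is below the critical value 2.01 shown for DOF = 5, so the statistic lies in the acceptance region.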
• In Ex7.3, why not set
H0: μ > 2650
H1: μ < 2650
In this case, when H0 is accepted, the company is
right?
$t = \frac{\bar{X} - \mu}{S/\sqrt{n}}$ and μ > 2650, so that => t < -1.22
[Figure: PDF of the t-dist with DOF = 5; left-tail critical value -2.01]
How to set a solid H0 and H1
H1: μ > 12
H1: μ < 12
$$Z = \frac{\bar{X} - \mu}{S/\sqrt{n}} = 2.6$$
[Figure: PDF with a 5% right-tail rejection region; critical value 1.64]
=> Reject H0; the company is right, given 5% level of sig.
P-value
• When level of significance is not given, we calculate
p-value for decision making
[Figure: PDF of Z ~ N(0, 1); the right-tail area beyond 2.6 is the p-value]
=> when p-value is small,
reject H0
p-value = Pr(A > calculated statistic) = Pr(Z > 2.6) ≈ 0.5%
Ex7.5
Below are the final scores of six male and six female
students in CIVL2160. Is there any difference in the
performance between boys and girls? Given level of
sig. = 5%
H1: μd ≠ 0
H1: μd > 0
$$t = \frac{\bar{X}_d}{S_d/\sqrt{n}} = 0.4$$
[Figure: PDF with two-tail critical values -2.57 and 2.57]
=> accept H0 => no difference in
the performance
Ex7.6
Selecting 21 samples for each production line:
[Figure: PDF of the F-dist with DOF1 = 20 and DOF2 = 20; 1% right-tail region; critical value 2.94]
Review of last lecture
• P-value
right-tail test: P-value = Pr (A > a*)
left-tail test: P-value = Pr (A < a*)
[Figure: PDFs with the p-value shaded in the right tail (right-tail test) and in the left tail (left-tail test)]
[Figure: scatter plot of Series1 with a linear trendline: y = 0.9179x + 0.7658, R² = 0.9711]
– Model basics: y = ax + b + ε
– How to solve model parameters a, b, and ε
– What is r2 and how to determine it
– ……
Model basics:
=> Y = aX + b + ε *****
a: slope of the regression line
b: intercept of regression line
ε: model error; ε ~ N(0, σε)
[Figure: data points and the regression line Y = aX + b + ε]
Ex8.1
Given a regression model as Y = 3X + 4; σε = 5. What
are the mean and SD of Y given X = 1?
=> Y = 3X + 4 + ε
E(Y) = E(3X + 4 + ε)
because X = 1 => E(Y) = E(3 + 4 + ε) = 7 + E(ε)
also because E(ε) = 0 => E(Y) = 7
=> For SD of Y
Var(Y) = Var(3X + 4 + ε) = Var(7 + ε) = Var(ε)
therefore, σY = σε = 5
Ex8.2
Following Ex8.1, what is Pr(Y > 12) given X = 1 with
the regression model Y = 3X + 4; σε = 5
[Figure: data points and the regression line Y = aX + b + ε]
Finding out model parameters: a and b
[Figure: a data point (Xi, Yi), its fitted point (Xi, aXi + b) on the regression line, and the distance di between them]
Total distance or difference:
$$f(a, b) = \sum_{i=1}^{n} \left[y_i - (a x_i + b)\right]^2$$ *****
Let
$$\frac{\partial f(a, b)}{\partial a} = 0, \qquad \frac{\partial f(a, b)}{\partial b} = 0$$
then you can find “a”:
$$a = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$
[Figure: population regression Y = apX + bp + εp (left) and sample regression Y = asX + bs + εs (right)]
• ap, bp, and εp are constants, but you never know them
• You can have as, bs, and εs, but they are RVs
• as, bs, and εs are point estimates for ap, bp, and εp
Review of last lecture
• ε ~ N (0, σε)
• Least square
$$a = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}, \qquad b = \bar{y} - a\bar{x}$$
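The least-squares formulas for a and b translate directly into code. A minimal sketch; `fit_line` is an illustrative name, and the data are hypothetical points placed exactly on Y = 3X + 4 so that the result is easy to verify.

```python
def fit_line(xs, ys):
    """Least-squares slope a and intercept b for y = a*x + b."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator and denominator of the slope formula.
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    s_xx = sum((x - x_bar) ** 2 for x in xs)
    a = s_xy / s_xx
    b = y_bar - a * x_bar
    return a, b


# Hypothetical noise-free data on the line Y = 3X + 4.
xs = [0, 1, 2, 3]
ys = [4, 7, 10, 13]
a, b = fit_line(xs, ys)
print(a, b)
```

With real, noisy data the same function returns the sample estimates of the slope and intercept.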
Ex8.3
The data below show the study time and score of an
exam. Find the relationship between the two
variables with simple linear regression analysis.
$$a = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}, \qquad b = \bar{y} - a\bar{x}$$
[Figure: scatter plots of the data]
Ex8.4
Following Ex8.3, what is the SD of ε
$$SSE = \sum_{i=1}^{n} \left[y_i - (a x_i + b)\right]^2, \qquad \sigma_\varepsilon^2 = \frac{SSE}{n-2}$$
=> SD of ε = 13.66
or “STEYX” or “FORECAST”
Model explain-ability or R2
$$SSE = \sum_{i=1}^{n} \left[y_i - (a x_i + b)\right]^2, \qquad SST = \sum (y_i - \bar{y})^2$$
$$R^2 = 1 - \frac{SSE}{SST}$$
[Figure: data points (Xi, Yi), the regression line, and the distances di]
=> R2 = 0.53
or “Rsq”
Point-estimate and interval estimate in regression
analysis
[Figure: PDF of Y with left-tail and right-tail bounds (Left_t, right_t)]
a
i
( x x ) 2
• R2 =>
variability model can explain
= 1 – model can not explain
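Let $S_{xy} = \sum (x_i - \bar{x})(y_i - \bar{y})$
• Sxy normalized by square root of Sxx and Syy to make it
unit-independent
$$r = \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$$
where r is the
correlation coefficient
in regression analysis
$$S_{xx} = \sum (x_i - \bar{x})(x_i - \bar{x}), \qquad S_{yy} = \sum (y_i - \bar{y})(y_i - \bar{y})$$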
Some notes on r and R2
• Correlation coefficient r
= (coefficient of determination R^2)^0.5
• 0 ≤ R^2 ≤ 1
(R^2 = 1 => model fits the samples perfectly)
• -1 ≤ r ≤ 1
(r = 1 or -1 => model fits the samples perfectly)
Hypothesis testing on population correlation ρ
[Figure: Y vs. X, and residuals (Y - Forecast) vs. X]
• For Ex.8.7, what if one uses the square root of Y to
perform linear regression on X and Y^0.5?
[Figure: Y vs. X with R2 = 0.94 (left); Y^0.5 vs. X with R2 = 1.0 (right)]
Y = aX2 + bX + c
• How to find R2
=> 1 – model cannot explain => 1 – SSE/SST
Review of last lecture
• Regression analysis:
– Simple linear
– Intrinsic linear (nonlinear) after variable
transformation
– Polynomial regression for “increase-decrease”
data
=> 170 cm
Bad-quality ratio
• Bayesian updating algorithm for discrete cases:
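$$\Pr{}''(\theta_i) = \frac{\Pr(\varepsilon \mid \theta_i)\,\Pr{}'(\theta_i)}{\sum_j \Pr(\varepsilon \mid \theta_j)\,\Pr{}'(\theta_j)}$$
ε: observations (samples)
• The key to using the Bayesian approach:
$\Pr(\varepsilon \mid \theta_i)$,
the likelihood function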
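The discrete updating rule is a multiply-then-normalize step, which a few lines of code make explicit. In this sketch the prior values are hypothetical and chosen only for illustration, and the likelihood of observing one bad sample is taken to be the bad ratio θ itself; `bayes_update` is an illustrative name.

```python
def bayes_update(prior, likelihood):
    """Posterior Pr''(theta) from prior Pr'(theta) and Pr(eps | theta)."""
    unnormalized = {th: likelihood(th) * p for th, p in prior.items()}
    total = sum(unnormalized.values())  # denominator of the updating rule
    return {th: u / total for th, u in unnormalized.items()}


# Hypothetical discrete prior on the bad ratio theta (illustration only).
prior = {0.1: 0.5, 0.2: 0.3, 0.3: 0.2}

# Observation: one randomly picked sample is bad, so Pr(eps | theta) = theta.
posterior = bayes_update(prior, lambda theta: theta)

prior_mean = sum(th * p for th, p in prior.items())
posterior_mean = sum(th * p for th, p in posterior.items())
print(posterior, round(prior_mean, 3), round(posterior_mean, 3))
```

Feeding the posterior back in as the next prior gives sequential updating, one observation after another.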
Ex9.2
You randomly picked one sample from Jerry’s
company and found it is bad, what is the point
estimate for the bad ratio?
100%
i) When θ = 0.2,
obviously, P’(θ = 0.2) = 0.3
what is Pr(ε|θ=0.2)?
ε: one pile tested and failed
Pr(ε | θi)
=> 0.2 x 0.2 = 0.04
• This is the
sequence
updated with bad
product one after
another
Review of last lecture
Discrete prior, with likelihood $\Pr(\varepsilon \mid \theta_i)$:
$$\Pr{}''(\theta_i) = \frac{\Pr(\varepsilon \mid \theta_i)\,\Pr{}'(\theta_i)}{\sum_j \Pr(\varepsilon \mid \theta_j)\,\Pr{}'(\theta_j)}$$
Given f’(r) = 2r
Pr(ε | r) = ?
Pr(ε | r) = r
Therefore,
$$\int_0^1 \Pr(\varepsilon \mid r)\, f'(r)\,dr = \int_0^1 2r^2\,dr = \frac{2}{3}$$
Posterior = f’’(r) = 3r^2
[Figure: Prior f’(r) = 2r and posterior f’’(r) = 3r^2 for r from 0 to 1, with the posterior mean marked]
What is the mean rate with the two combined by the Bayesian
approach? Assuming accidents follow the Poisson distribution
$$\Pr{}''(\theta_i) = \frac{\Pr(\varepsilon \mid \theta_i)\,\Pr{}'(\theta_i)}{\sum_j \Pr(\varepsilon \mid \theta_j)\,\Pr{}'(\theta_j)}$$
One accident per year = 1/12 accident per month
Pr(ε | v = 1/12) = ?
ε: 1 accident in one month
Poisson's PDF: $\frac{v^x e^{-v}}{x!}$
Pr(ε | v1) = 1/12 x e^(-1/12)
[Figure: bar charts of prior and posterior probability for cases 1, 2, 3]
• Judgment + samples
$$\Pr{}''(\theta_i) = \frac{\Pr(\varepsilon \mid \theta_i)\,\Pr{}'(\theta_i)}{\sum_j \Pr(\varepsilon \mid \theta_j)\,\Pr{}'(\theta_j)}$$
About the final: on 22nd May, art hall, 04:30 pm (please
double check yourself!)
• About statistics
– Use samples (or sample + judgment) to picture the
populations