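Example : Election 2000 voting poll from 2386 interviews

Result of nationwide Gallup poll, November 5, 2000

George W. Bush     47%
Al Gore            45%
Ralph Nader         4%
Buchanan            1%
Not yet decided     3%

Can Bush get more votes than Gore?

Point estimates : 47% > 45%

Interval estimates (Bush and Gore intervals on a % axis marked 43, 45, 47, 49) :
Margin of error : 0.05%   "Bush will win the race"
Margin of error : 2%      "too close to call"

Example : VSRS $\{X_1, X_2, \ldots, X_n\}$ from $N(\mu, \sigma^2)$

\[ \bar{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right) \]

\[ \Pr\!\left(\mu - 1.96\frac{\sigma}{\sqrt{n}} \le \bar{X} \le \mu + 1.96\frac{\sigma}{\sqrt{n}}\right)
 = \Phi\!\left(\frac{\mu + 1.96\,\sigma/\sqrt{n} - \mu}{\sigma/\sqrt{n}}\right) - \Phi\!\left(\frac{\mu - 1.96\,\sigma/\sqrt{n} - \mu}{\sigma/\sqrt{n}}\right) \]
\[ = \Phi(1.96) - \Phi(-1.96) = 0.975 - (1 - 0.975) = 0.95 \]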
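The value 0.95 in the derivation above can be checked numerically. The following is a minimal sketch using the standard normal CDF from Python's standard library; the helper name `central_probability` is an illustrative choice, not part of the slides.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def central_probability(z: float) -> float:
    """Return Pr(-z <= Z <= z) for Z ~ N(0, 1), i.e. Phi(z) - Phi(-z)."""
    return std_normal.cdf(z) - std_normal.cdf(-z)


prob_196 = central_probability(1.96)
print(f"Phi(1.96) - Phi(-1.96) = {prob_196:.4f}")
```

Because the statement is about the standardized variable, the result does not depend on µ, σ or n.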
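\[ \Pr\!\left(-1.96 \le \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \le 1.96\right) = \Phi(1.96) - \Phi(-1.96) = 0.95 \]

\[ \Pr\!\left(\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95 \]

The same statement holds with 1.645 in place of 1.96 (probability 0.90) and with 2.576 in place of 1.96 (probability 0.99).

In general, with $Z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1)$ :

\[ \Pr\!\left(-Z_{\alpha/2} \le Z \le Z_{\alpha/2}\right) = \Phi(Z_{\alpha/2}) - \Phi(-Z_{\alpha/2}) = 1 - \alpha \]

\[ \Pr\!\left(\bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha \]

[Figure: N(0,1) density with central area 1-α and area α/2 in each tail]

1-α      0.90     0.95     0.99
Z α/2    1.645    1.96     2.576

Other upper points : $Z_{0.15} = 1.036$, $Z_{0.1} = 1.282$, $Z_{0.9} = -Z_{0.1} = -1.282$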
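The upper points Z α/2 and the meaning of the coverage probability can be illustrated with a short simulation. This is a sketch under assumed values (µ = 10, σ = 2, n = 25, 4000 repetitions, all chosen only for illustration); the function names are hypothetical.

```python
import random
from statistics import NormalDist, fmean


def z_upper(tail_area: float) -> float:
    """Upper point Z_a with Pr(Z > Z_a) = a for Z ~ N(0, 1)."""
    return NormalDist().inv_cdf(1.0 - tail_area)


def coverage(mu, sigma, n, conf, trials, seed=0):
    """Fraction of simulated intervals xbar +/- Z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)
    half_width = z_upper((1.0 - conf) / 2.0) * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials


# Upper points for the three usual confidence levels
z_table = {conf: round(z_upper((1 - conf) / 2), 3) for conf in (0.90, 0.95, 0.99)}
print(z_table)

# Illustrative values only: mu = 10, sigma = 2, n = 25
cov95 = coverage(mu=10.0, sigma=2.0, n=25, conf=0.95, trials=4000)
print(f"simulated coverage of the 95% interval: {cov95:.3f}")
```

The simulated fraction should be close to 0.95, which is the sense in which the interval, not µ, is random.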
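Confidence Interval

Example : X = alcohol concentration of French wine

$X \sim N(\mu, 1.2^2)$

VSRS $\{X_1, X_2, \ldots, X_{60}\}$ , $\bar{X} = 9.3$

\[ \bar{X} \sim N\!\left(\mu, \frac{1.2^2}{60}\right) \]

[Figure: density of $\bar{X}$ with area = 0.90 between µ − 0.255 and µ + 0.255]

A 90% C.I. for µ :

\[ \bar{X} \pm Z_{0.05}\frac{\sigma}{\sqrt{n}} = 9.3 \pm (1.645)\frac{1.2}{\sqrt{60}} = 9.3 \pm 0.255 = [9.045 , 9.555] \]

90 % confidence intervals from different samples :

$\bar{X} = 9.3$ :  [9.045 , 9.555]
$\bar{X} = 8.9$ :  [8.645 , 9.155]
$\bar{X} = 9.5$ :  [9.245 , 9.755]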
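The three intervals above can be reproduced with a few lines of Python. This is a minimal sketch; the function name `z_interval` is an illustrative choice, and σ = 1.2 and n = 60 are taken from the example.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(xbar, sigma, n, conf):
    """Confidence interval xbar +/- Z_{alpha/2} * sigma / sqrt(n) for known sigma."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


# Sample means from the wine example; sigma = 1.2 and n = 60 as on the slide
intervals = {xbar: z_interval(xbar, sigma=1.2, n=60, conf=0.90)
             for xbar in (9.3, 8.9, 9.5)}
for xbar, (lower, upper) in intervals.items():
    print(f"xbar = {xbar}: [{lower:.3f}, {upper:.3f}]")
```

All three intervals have the same half-width 0.255; only the centre moves with the observed sample mean.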
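Student-t Densities

\[ \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0,1) , \qquad \frac{\bar{X} - \mu}{S/\sqrt{n}} \sim t_{n-1} , \qquad S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (X_i - \bar{X})^2 \]

[Figure: densities of $N(0,1) \equiv t_\infty$, $t_3$ and $t_1 \equiv$ Cauchy, plotted from -3 to 3]

For $X \sim t_r$ :

\[ \mathrm{Var}(X) = \frac{r}{r-2} \quad \text{for } r > 2 \]

[Figure: $t_r$ density with the points $-t_{r,\alpha/2}$, 0 and $t_{r,\alpha/2}$ marked]

\[ \Pr\!\left(\bar{X} - t_{n-1,\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{n-1,\alpha/2}\frac{S}{\sqrt{n}}\right) = 1 - \alpha \]

σ unknown, 100(1-α)% C.I. for µ :

\[ \bar{X} \pm t_{n-1,\alpha/2}\frac{S}{\sqrt{n}} \]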
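A small sketch of the t-based interval for unknown σ follows. The data are hypothetical, invented only to exercise the formula, and the critical value t(4, 0.025) = 2.776 is typed in from a standard t table because the Python standard library has no t quantile function.

```python
from math import sqrt
from statistics import mean, stdev


def t_interval(data, t_crit):
    """Interval xbar +/- t_crit * S / sqrt(n), with S the sample sd (divisor n - 1)."""
    n = len(data)
    xbar = mean(data)
    s = stdev(data)  # statistics.stdev uses the n - 1 divisor
    half_width = t_crit * s / sqrt(n)
    return xbar - half_width, xbar + half_width


sample = [9.1, 9.4, 8.8, 9.6, 9.3]   # hypothetical measurements, n = 5
T_4_0025 = 2.776                      # t_{4, 0.025} from a t table (assumed input)
lower, upper = t_interval(sample, T_4_0025)
print(f"95% C.I. for mu: [{lower:.3f}, {upper:.3f}]")
```

With only 4 degrees of freedom the multiplier 2.776 is noticeably larger than 1.96, so the interval is wider than the known-σ interval would be.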
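Normal Model, Unknown Variance

Example : X = frequency of elephant call

$X \sim N(\mu, \sigma^2)$ , σ unknown

VSRS $\{X_1, X_2, \ldots, X_{12}\}$ , $\bar{X} = 22.33$ , S = 56.4242

100(1-α)% C.I. for µ from $\{X_1, X_2, \ldots, X_n\}$ :

\[ \underbrace{\bar{X}}_{\text{point estimate}} \pm \underbrace{Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}}_{\text{margin of error}} \]

If σ is unknown, replace it by S.

Two Samples Problem

Population 1 : $\mu_x , \sigma_x^2$ (sample of size m, sample mean $\bar{X}$)
Population 2 : $\mu_y , \sigma_y^2$ (sample of size n, sample mean $\bar{Y}$)

Point estimator for µx - µy :

\[ \bar{X} - \bar{Y} \sim N\!\left(\mu_x - \mu_y , \frac{\sigma_x^2}{m} + \frac{\sigma_y^2}{n}\right) \]

\[ \Pr\!\left((\bar{X} - \bar{Y}) - Z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{m} + \frac{\sigma_y^2}{n}} \le \mu_x - \mu_y \le (\bar{X} - \bar{Y}) + Z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{m} + \frac{\sigma_y^2}{n}}\right) = 1 - \alpha \]

100(1-α)% C.I. for µx - µy (variances known) :

\[ (\bar{X} - \bar{Y}) \pm Z_{\alpha/2}\sqrt{\frac{\sigma_x^2}{m} + \frac{\sigma_y^2}{n}} \]

Variances unknown. Assumption : Equal variances $\sigma_x^2 = \sigma_y^2 = \sigma^2$

Pooled sample variance :

\[ S_{pool}^2 = \frac{(m-1)S_x^2 + (n-1)S_y^2}{m+n-2} \]

100(1-α)% C.I. for µx - µy :

\[ (\bar{X} - \bar{Y}) \pm t_{m+n-2,\alpha/2}\, S_{pool}\sqrt{\frac{1}{m} + \frac{1}{n}} \]

95% C.I. for µx - µy with m = 8, n = 7 and $t_{13,0.025} = 2.160$ :

\[ (\bar{X} - \bar{Y}) \pm t_{13,0.025}\, S_{pool}\sqrt{\frac{1}{8} + \frac{1}{7}} = 3.661 \pm 4.027 = [-0.366 , 7.688] \]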
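The pooled-variance interval can be sketched as follows. The two samples are hypothetical (the raw data behind the slide example are not given); only the sample sizes m = 8, n = 7 and the table value t(13, 0.025) = 2.160 match the example.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_interval(x, y, t_crit):
    """C.I. for mu_x - mu_y assuming equal variances (pooled sample variance)."""
    m, n = len(x), len(y)
    # statistics.variance uses the (size - 1) divisor, matching S_x^2 and S_y^2
    s2_pool = ((m - 1) * variance(x) + (n - 1) * variance(y)) / (m + n - 2)
    half_width = t_crit * sqrt(s2_pool) * sqrt(1 / m + 1 / n)
    diff = mean(x) - mean(y)
    return diff - half_width, diff + half_width


# Hypothetical samples, invented for illustration only
x = [12.0, 15.0, 11.0, 18.0, 14.0, 16.0, 13.0, 17.0]   # m = 8
y = [10.0, 13.0, 9.0, 12.0, 11.0, 14.0, 8.0]           # n = 7
T_13_0025 = 2.160                                       # t_{13, 0.025} from a t table
lower, upper = pooled_t_interval(x, y, T_13_0025)
print(f"95% C.I. for mu_x - mu_y: [{lower:.3f}, {upper:.3f}]")
```

If the resulting interval contains 0, the data are compatible with equal population means at that confidence level, as in the slide example where the interval is [-0.366, 7.688].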
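Population Proportion

Large population, e.g. virus carriers : population proportion π of carriers, 1 - π of non-carriers

Random sample of size n
X = number of virus carriers in the sample
$X \sim b(n, \pi)$

Point estimator for π : sample proportion

\[ \hat{\pi} = \frac{X}{n} \]

For large n :

\[ \frac{X - n\pi}{\sqrt{n\pi(1-\pi)}} \sim N(0,1) \ \text{(approximately)} \]

\[ \Pr\!\left(-Z_{\alpha/2} \le \frac{X - n\pi}{\sqrt{n\pi(1-\pi)}} \le Z_{\alpha/2}\right) \approx 1 - \alpha \]

Approximate 100(1-α)% C.I. for π :

\[ \frac{\left(2\hat{\pi} + Z_{\alpha/2}^2/n\right) \pm \sqrt{\left(2\hat{\pi} + Z_{\alpha/2}^2/n\right)^2 - 4\hat{\pi}^2\left(1 + Z_{\alpha/2}^2/n\right)}}{2\left(1 + Z_{\alpha/2}^2/n\right)} \approx \hat{\pi} \pm Z_{\alpha/2}\sqrt{\frac{\hat{\pi}(1-\hat{\pi})}{n}} \]

Example :

\[ \hat{\pi} = \frac{41}{500} = 0.082 \]

Approximate 95% C.I. for π :

\[ \hat{\pi} \pm Z_{0.025}\sqrt{\frac{\hat{\pi}(1-\hat{\pi})}{n}} = 0.082 \pm (1.96)\sqrt{\frac{(0.082)(0.918)}{500}} = 0.082 \pm 0.024 = [5.8\% , 10.6\%] \]

Estimated unemployment rate is 8.2% with margin of error 2.4% .

Two populations with unknown proportions π1 and π2 :
Sample with size n1 from population 1 (count X), sample with size n2 from population 2 (count Y)
Compare π1 and π2 by : π1 − π2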
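Both forms of the approximate interval for π can be computed for the 41-out-of-500 example. This is a minimal sketch; the function names are illustrative, and the quadratic form is the interval often called the Wilson score interval.

```python
from math import sqrt
from statistics import NormalDist


def simple_interval(x, n, conf=0.95):
    """pi_hat +/- Z * sqrt(pi_hat * (1 - pi_hat) / n)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p = x / n
    half_width = z * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def quadratic_interval(x, n, conf=0.95):
    """Roots in pi of (pi_hat - pi)^2 = Z^2 * pi * (1 - pi) / n."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    p = x / n
    c = z * z / n
    b = 2 * p + c
    root = sqrt(b * b - 4 * p * p * (1 + c))
    denom = 2 * (1 + c)
    return (b - root) / denom, (b + root) / denom


simple = simple_interval(41, 500)
quad = quadratic_interval(41, 500)
print(f"simple form:    [{simple[0]:.3f}, {simple[1]:.3f}]")
print(f"quadratic form: [{quad[0]:.3f}, {quad[1]:.3f}]")
```

For n = 500 the two forms differ only in the third decimal place, which is why the simpler form is used in the example.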
Sample Size Determination

100(1-α)% C.I. for µ :

\[ \bar{X} \pm Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \]

[Figure: intervals $\bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n}$ from repeated samples, drawn around µ]

Example : survey on entertainment expenses
σ = $400 from past surveys
Precision requirement : 95% confidence that estimation error is at most $120

D : precision requirement

For a population proportion :

\[ D = Z_{\alpha/2}\sqrt{\frac{\pi(1-\pi)}{n}} \iff n = \frac{Z_{\alpha/2}^2\,\pi(1-\pi)}{D^2} \]
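For the entertainment expenses example, setting the margin of error of the C.I. for µ equal to D, that is D = Z α/2 · σ/√n, and solving for n gives n = (Z α/2 · σ / D)². The sketch below applies this and the proportion formula. The function names are illustrative, rounding up to the next integer is an assumption about how the result is used, and π = 0.5 is the usual conservative choice when nothing is known about π.

```python
from math import ceil
from statistics import NormalDist


def n_for_mean(sigma, d, conf=0.95):
    """Smallest n with Z_{alpha/2} * sigma / sqrt(n) <= d."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil((z * sigma / d) ** 2)


def n_for_proportion(d, conf=0.95, pi=0.5):
    """Smallest n with Z_{alpha/2} * sqrt(pi * (1 - pi) / n) <= d."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil(z * z * pi * (1 - pi) / d ** 2)


# Survey example: sigma = $400, error at most $120 with 95% confidence
n_expenses = n_for_mean(sigma=400, d=120)
print("sample size for the expenses survey:", n_expenses)

# Hypothetical proportion survey with a 2-percentage-point margin of error
n_poll = n_for_proportion(d=0.02)
print("sample size for a 2% margin on a proportion:", n_poll)
```

Halving D multiplies the required sample size by four, since n depends on 1/D².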