
STAT 3008: Applied Linear Regression

2014-15 Term 2
Assignment #2 Solutions
Problem 1:

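(a)

$$
\begin{aligned}
E(\hat{Y}'\hat{Y})
&= \operatorname{tr} E\big[(X(X'X)^{-1}X'Y)'\,(X(X'X)^{-1}X'Y)\big] \\
&= \operatorname{tr}\big(E(Y'X(X'X)^{-1}X'X(X'X)^{-1}X'Y)\big) \\
&= \operatorname{tr}\big(X(X'X)^{-1}X'\,E(YY')\big) \\
&= \operatorname{tr}\big(X(X'X)^{-1}X'\,E(X\beta\beta'X' + ee' + X\beta e' + e\beta'X')\big) \\
&= \operatorname{tr}\big(X(X'X)^{-1}X'\,(X\beta\beta'X' + \sigma^2 I_n + 0)\big) \\
&= \operatorname{tr}\big(X(X'X)^{-1}X'X\beta\beta'X'\big) + \operatorname{tr}\big(X(X'X)^{-1}X'\,\sigma^2 I_n\big) + 0 \\
&= \beta'X'X\beta + \operatorname{tr}\big(\sigma^2 (X'X)^{-1}X'X\big) = \beta'X'X\beta + (p+1)\sigma^2 .
\end{aligned}
$$

(b)

$$
E(Y'Y) = E(\beta'X'X\beta + e'e + 2\beta'X'e) = \beta'X'X\beta + \operatorname{tr}(\sigma^2 I_n) + 0 = \beta'X'X\beta + n\sigma^2 .
$$

Hence,

$$
E(Y'Y) = \beta'X'X\beta + (p+1)\sigma^2 + (n-p-1)\sigma^2 = E(\hat{Y}'\hat{Y}) + E(\hat{e}'\hat{e}) .
$$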
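The two trace identities behind Problem 1, tr(H) = p + 1 and tr(I - H) = n - p - 1 with H = X(X'X)^{-1}X', can be checked numerically. The sketch below is only an illustration: the design matrix, `beta`, and `sigma2` are hypothetical values chosen for the check and are not part of the assignment.

```r
# Hypothetical design: n = 6 observations, intercept plus p = 2 predictors.
n <- 6
X <- cbind(1, c(1, 2, 3, 4, 5, 6), c(2, 1, 4, 3, 6, 8))
p <- ncol(X) - 1

# Hat matrix H = X (X'X)^{-1} X'
H <- X %*% solve(t(X) %*% X) %*% t(X)

tr_H  <- sum(diag(H))             # trace of H, should equal p + 1
tr_IH <- sum(diag(diag(n) - H))   # trace of I - H, should equal n - p - 1

# Hypothetical parameter values used to evaluate both expectations
beta   <- c(1, 0.5, -2)
sigma2 <- 4
quad   <- as.numeric(t(beta) %*% t(X) %*% X %*% beta)  # beta' X'X beta

E_fit <- quad + tr_H * sigma2     # E(Yhat'Yhat) from part (a)
E_tot <- quad + n * sigma2        # E(Y'Y) from part (b)
E_tot - E_fit                     # equals (n - p - 1) * sigma2 = E(ehat'ehat)
```

The difference `E_tot - E_fit` does not depend on `beta`, which matches the decomposition in part (b).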
Problem 2: (Please refer to the R codes at the end)

(a)
(b) R outputs: $\hat\beta_0 = -36.8759$, $\hat\beta_1 = 0.5821$, $\hat\sigma^2 = 8.456^2 = 71.5039$, $SE(\hat\beta_0) = 64.4728$, $SE(\hat\beta_1) = 0.3892$
(c) R outputs:

            Df   Sum Sq   Mean Sq   F value   Pr(>F)
x            1   159.95   159.947     2.237   0.1731
Residuals    8   572.01    71.502

Hypotheses: $H_0: E(y \mid x) = \beta_0$ vs $H_1: E(y \mid x) = \beta_0 + \beta_1 x$
Decision: Since p-value $= \Pr(F_{1,8} > 2.237) = 0.1731 > 0.05$, we do not reject $H_0$ at $\alpha = 0.05$.
Conclusion: We do not have sufficient evidence that the response and the predictor are dependent.
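The F statistic and p-value in part (c) can be recomputed from the printed ANOVA quantities. This is a minimal sketch that uses only the rounded values from the R output; the variable names are illustrative.

```r
# Values copied from the ANOVA output in part (c)
ss_reg <- 159.947; df_reg <- 1   # regression sum of squares and df
ms_res <- 71.502;  df_res <- 8   # residual mean square and df

# F statistic: regression mean square over residual mean square
F0 <- (ss_reg / df_reg) / ms_res

# Upper-tail probability of the F(1, 8) distribution
p_value <- pf(F0, df_reg, df_res, lower.tail = FALSE)

c(F0 = F0, p_value = p_value)
```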

(d) R outputs:

     Estimate   Std. Error   t value   Pr(>|t|)
x      0.5821       0.3892     1.496      0.173

Hypotheses: $H_0: \beta_1 = 0$ vs $H_1: \beta_1 \neq 0$
Decision: Since p-value $= 2\Pr(t_8 > |t_0|) = 2\Pr(t_8 > 1.496) = 0.1731 > 0.05$, we do not reject $H_0$ at $\alpha = 0.05$.
Conclusion: We do not have sufficient evidence that the response and the predictor are dependent.
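The t test in part (d) can be reproduced from the estimate and its standard error. In simple linear regression the squared t statistic equals the F statistic of part (c), which is why both tests give the same p-value. A minimal sketch using the rounded R output; the variable names are illustrative.

```r
est <- 0.5821   # slope estimate from the R output
se  <- 0.3892   # its standard error
df  <- 8        # residual degrees of freedom (n - 2)

t0 <- est / se                                      # test statistic under H0: beta1 = 0
p_value <- 2 * pt(abs(t0), df, lower.tail = FALSE)  # two-sided p-value

c(t0 = t0, t0_squared = t0^2, p_value = p_value)
```

Up to rounding, `t0^2` agrees with the F value 2.237 reported in part (c).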

(e) Hypotheses: $H_0: \beta_1 = 2$ vs $H_1: \beta_1 \neq 2$. Under $H_0$, $t_0 = (0.5821 - 2.0)/0.3892 = -3.64$.
Decision: Since p-value $= 2\Pr(t_8 > |t_0|) = 2\Pr(t_8 > 3.64) = 0.0066 < 0.05$, we reject $H_0$ at $\alpha = 0.05$.
Conclusion: We have sufficient evidence that $\beta_1$ is not equal to 2.0.
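(f) A 95% CI for the mean Wt at $x_*$ (the R codes use $x_* = 166.8$) is given by

$$
\hat\beta_0 + \hat\beta_1 x_* \pm t_{8,0.025}\,\hat\sigma\left(\frac{1}{n} + \frac{(x_* - \bar{x})^2}{SXX}\right)^{1/2} = (53.94598,\ 66.49078).
$$

(g) A 99% PI for Wt at $x_*$ is given by

$$
\hat\beta_0 + \hat\beta_1 x_* \pm t_{8,0.005}\,\hat\sigma\left(1 + \frac{1}{n} + \frac{(x_* - \bar{x})^2}{SXX}\right)^{1/2} = (30.41346,\ 90.02330).
$$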
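The interval formulas in (f) and (g) correspond to what `predict()` returns for an `lm` fit with `interval = "confidence"` and `interval = "prediction"`. The sketch below compares the two routes on a small hypothetical data set; it is NOT the htwt data from the assignment, so the numbers differ from the answers above.

```r
# Hypothetical heights (x) and weights (y), for illustration only
x <- c(150, 155, 160, 165, 170, 175, 180, 185)
y <- c(50, 54, 53, 60, 63, 62, 70, 74)
fit <- lm(y ~ x)

x0 <- 166.8                        # point at which the intervals are computed
n <- length(x)
df <- n - 2
sigma_hat <- summary(fit)$sigma    # residual standard error
sxx <- sum((x - mean(x))^2)
y0 <- sum(coef(fit) * c(1, x0))    # fitted value at x0

# Standard errors from the formulas in (f) and (g)
se_fit  <- sigma_hat * sqrt(1/n + (x0 - mean(x))^2 / sxx)
se_pred <- sigma_hat * sqrt(1 + 1/n + (x0 - mean(x))^2 / sxx)

ci_manual <- y0 + c(-1, 1) * qt(0.975, df) * se_fit    # 95% CI for the mean
pi_manual <- y0 + c(-1, 1) * qt(0.995, df) * se_pred   # 99% PI for a new response

# Same intervals via predict()
new <- data.frame(x = x0)
ci_pred <- predict(fit, new, interval = "confidence", level = 0.95)
pi_pred <- predict(fit, new, interval = "prediction", level = 0.99)
```

The prediction interval is always wider than the confidence interval at the same point, because of the extra 1 under the square root.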

Problem 3:
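(a) $(X'X)^{-1} = \dfrac{1}{3}$, $\hat\beta = (X'X)^{-1}X'Y = \dfrac{y_1 + y_2 + y_3}{3} = \bar{y}$.

(b)

$$
H = X(X'X)^{-1}X' = \frac{1}{3}\begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix},
\qquad
Y'(I-H)Y = \frac{1}{3}\left(2Y_1^2 + 2Y_2^2 + 2Y_3^2 - 2Y_1Y_2 - 2Y_1Y_3 - 2Y_2Y_3\right).
$$

$$
\begin{aligned}
E[Y'(I-H)Y]
&= \frac{1}{3}E\left(2Y_1^2 + 2Y_2^2 + 2Y_3^2 - 2Y_1Y_2 - 2Y_1Y_3 - 2Y_2Y_3\right) \\
&= \frac{1}{3}\Big[2(3 + 2^2) + 2(1 + (-1)^2) + 2(1 + 1^2) - 2(-1 + 2(-1)) - 2(0 + 2(1)) - 2(0 + (-1)(1))\Big] \\
&= (14 + 4 + 4 + 6 - 4 + 2)/3 = 26/3 .
\end{aligned}
$$

(c)

$$
E(YY') =
\begin{pmatrix}
E(Y_1^2) & E(Y_1Y_2) & E(Y_1Y_3) \\
E(Y_1Y_2) & E(Y_2^2) & E(Y_2Y_3) \\
E(Y_1Y_3) & E(Y_2Y_3) & E(Y_3^2)
\end{pmatrix}
=
\begin{pmatrix}
3 + 2^2 & -1 + (-1)(2) & 0 + (2)(1) \\
-1 + (-1)(2) & 1 + (-1)^2 & 0 + (-1)(1) \\
0 + (2)(1) & 0 + (-1)(1) & 1 + 1^2
\end{pmatrix}
=
\begin{pmatrix}
7 & -3 & 2 \\
-3 & 2 & -1 \\
2 & -1 & 2
\end{pmatrix},
$$

$$
(I-H)\,E(YY') =
\frac{1}{3}
\begin{pmatrix} 2 & -1 & -1 \\ -1 & 2 & -1 \\ -1 & -1 & 2 \end{pmatrix}
\begin{pmatrix} 7 & -3 & 2 \\ -3 & 2 & -1 \\ 2 & -1 & 2 \end{pmatrix}
=
\frac{1}{3}
\begin{pmatrix} 15 & -7 & 3 \\ -15 & 8 & -6 \\ 0 & -1 & 3 \end{pmatrix},
$$

$$
E[\operatorname{tr}(Y'(I-H)Y)] = \operatorname{tr}[(I-H)\,E(YY')] = \frac{15 + 8 + 3}{3} = \frac{26}{3} .
$$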
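The value 26/3 in Problem 3 can be checked with a few lines of matrix arithmetic. The mean vector and covariance matrix below are read off the arithmetic in the solution (for example E(Y1^2) = 3 + 2^2 and E(Y1 Y2) = -1 + (2)(-1)), since the problem statement itself is not reproduced here; treat them as an inference from the solution rather than as given data.

```r
# Mean and covariance of Y as implied by the terms in the solution (inferred)
mu    <- c(2, -1, 1)
Sigma <- matrix(c( 3, -1, 0,
                  -1,  1, 0,
                   0,  0, 1), nrow = 3, byrow = TRUE)

# E(YY') = Var(Y) + E(Y) E(Y)'
EYY <- Sigma + mu %*% t(mu)

# X is a column of ones, so H = X (X'X)^{-1} X' has every entry equal to 1/3
H <- matrix(1/3, 3, 3)

# E[Y'(I - H)Y] = tr[(I - H) E(YY')]
M <- (diag(3) - H) %*% EYY
expected_rss <- sum(diag(M))
expected_rss   # 26/3
```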

Problem 4:
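(a) From the R Codes, SYY = 517.875, RSS = 145.0752, SSreg = 372.7998, $\hat\sigma^2 = 29.01503$, $R^2 = 0.7198$,

$$
\hat\beta = \begin{pmatrix} 16.5784 \\ 10.1144 \\ -8.3464 \end{pmatrix},
\quad
\hat{Y} = \begin{pmatrix} 21.88235 \\ 32.49020 \\ 23.65033 \\ 21.88235 \\ 28.95425 \\ 32.49020 \\ 23.65033 \\ 10.00000 \end{pmatrix},
\quad
\hat{e} = \begin{pmatrix} -0.88235 \\ -7.49020 \\ -4.65033 \\ 2.11765 \\ 7.04575 \\ 3.50980 \\ 0.34967 \\ 0.00000 \end{pmatrix},
$$

$$
\widehat{\mathrm{Var}}(\hat\beta) =
\begin{pmatrix}
24.74812 & 17.35213 & -21.05012 \\
17.35213 & 41.62614 & -43.99665 \\
-21.05012 & -43.99665 & 47.03090
\end{pmatrix}.
$$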
(b) Hypotheses: $H_0: E(Y \mid X) = \beta_0 + \beta_1 x_1$ vs $H_1: E(Y \mid X) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$

Source        df   SS       MS       F0       p-value
Regression     1   42.98    42.98    1.4814   0.2779
Residual       5   145.07   29.014
Total          6   188.05

Decision: Since p-value $= 0.2779 > \alpha = 0.05$, we do not reject $H_0$ at $\alpha = 0.05$.
Conclusion: We do not have sufficient evidence that Model 2 is more appropriate than Model 1.
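The partial F test in part (b) compares the residual sums of squares of the two nested models. The sketch below recomputes F0 and the p-value from the rounded table entries (RSS 188.05 on 6 df for Model 1, RSS 145.07 on 5 df for Model 2); the variable names are illustrative.

```r
rss0 <- 188.05; df0 <- 6   # Model 1: y ~ x1
rss1 <- 145.07; df1 <- 5   # Model 2: y ~ x1 + x2

# Extra sum of squares for x2 given x1, per extra parameter,
# divided by the residual mean square of the larger model
F0 <- ((rss0 - rss1) / (df0 - df1)) / (rss1 / df1)

# Upper-tail probability of the F(1, 5) distribution
p_value <- pf(F0, df0 - df1, df1, lower.tail = FALSE)

c(F0 = F0, p_value = p_value)
```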

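(c) $x_* = (1, -2.5, -3)'$, $\tilde{y} = x_*'\hat\beta = 16.3317$, $t_{5,0.025} = 2.5706$, $\mathrm{sepred}(\tilde{y} \mid x_*) = \hat\sigma\sqrt{1 + x_*'(X'X)^{-1}x_*} = 10.80717$.
A 95% PI for the response is $\tilde{y} \pm t_{5,0.025}\,\mathrm{sepred}(\tilde{y} \mid x_*) = (-11.44902,\ 44.11242)$.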
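The prediction interval in part (c) can also be obtained from `predict()` on the fitted two-predictor model. The R codes at the end use `xstar <- c(1,-2.5,-3)`, that is x1 = -2.5 and x2 = -3, and the sketch below uses the same point; `fit1`, `new`, and `pi95` are illustrative names.

```r
# Data from the R codes for Problem 4
y  <- c(21, 25, 19, 24, 36, 36, 24, 10)
x1 <- c(3, 9, 4, 3, 7, 9, 4, 1)
x2 <- c(3, 9, 4, 3, 7, 9, 4, 2)
fit1 <- lm(y ~ x1 + x2)

# New point x* = (1, -2.5, -3)'; the leading 1 is the intercept column
new <- data.frame(x1 = -2.5, x2 = -3)

# 95% prediction interval: columns fit, lwr, upr
pi95 <- predict(fit1, newdata = new, interval = "prediction", level = 0.95)
pi95
```

Both predictor values lie outside the observed range of the data, which is one reason the interval is so wide.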

R Codes for Problem #2


### Problem 2(b) ###
library(car); library(alr3)   # alr3 provides the htwt data
y <- htwt$Wt; x <- htwt$Ht
fit <- lm(y ~ x); fit
# Coefficients:
# (Intercept)            x
#    -36.8759       0.5821
summary(fit)
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept) -36.8759    64.4728  -0.572    0.583
# x             0.5821     0.3892   1.496    0.173
# Residual standard error: 8.456 on 8 degrees of freedom
# Multiple R-squared: 0.2185, Adjusted R-squared: 0.1208
# F-statistic: 2.237 on 1 and 8 DF, p-value: 0.1731

### Problem 2(c) ###
anova(fit)
# Analysis of Variance Table
# Response: y
#           Df Sum Sq Mean Sq F value Pr(>F)
# x          1 159.95 159.947   2.237 0.1731
# Residuals  8 572.01  71.502

### Problem 2(e) ###
2 * (1 - pt(abs((0.5821 - 2) / 0.3892), 8))   # two-sided p-value for H0: beta1 = 2
# [1] 0.006559349

### Problem 2(f) ###
wthat <- -36.8759 + 0.5821 * 166.8   # fitted mean weight at Ht = 166.8
sefit <- 8.456 * sqrt(1/10 + (166.8 - mean(x))^2 / sum((x - mean(x))^2))
c(wthat - qt(0.975, 8) * sefit, wthat + qt(0.975, 8) * sefit)   # 95% CI
# [1] 53.94598 66.49078

### Problem 2(g) ###
wthat <- -36.8759 + 0.5821 * 166.8
sepred <- 8.456 * sqrt(1 + 1/10 + (166.8 - mean(x))^2 / sum((x - mean(x))^2))
c(wthat - qt(0.995, 8) * sepred, wthat + qt(0.995, 8) * sepred)   # 99% PI
# [1] 30.41346 90.02330


R Codes for Problem #4


### Problem 4(a) ###
y <- c(21,25,19,24,36,36,24,10); x1 <- c(3,9,4,3,7,9,4,1); x2 <- c(3,9,4,3,7,9,4,2)
X <- cbind(rep(1, length(x1)), x1, x2)              # design matrix with intercept column
betahat <- solve(t(X) %*% X) %*% t(X) %*% y         # OLS estimate
yhat <- X %*% betahat; ehat <- y - yhat             # fitted values and residuals
SYY <- as.numeric(t(y - mean(y)) %*% (y - mean(y)))
RSS <- as.numeric(t(ehat) %*% ehat); SSreg <- SYY - RSS; R2 <- SSreg / SYY
sigmahat2 <- RSS / (length(y) - 2 - 1)              # RSS / (n - p - 1)
varbetahat <- sigmahat2 * solve(t(X) %*% X)

### Problem 4(b) ### (NOT required, but the anova function will provide the answers right away)
fit0 <- lm(y ~ x1)
anova(fit0)
# Analysis of Variance Table
# Response: y
#           Df Sum Sq Mean Sq F value Pr(>F)
# x1         1 329.82  329.82  10.523 0.0176 *
# Residuals  6 188.05   31.34
fit1 <- lm(y ~ x1 + x2)
anova(fit0, fit1)
# Analysis of Variance Table
# Model 1: y ~ x1
# Model 2: y ~ x1 + x2
#   Res.Df    RSS Df Sum of Sq      F Pr(>F)
# 1      6 188.05
# 2      5 145.07  1    42.977 1.4812 0.2779

### Problem 4(c) ###
xstar <- c(1, -2.5, -3)
ytilde <- as.numeric(xstar %*% betahat); ytilde     # as.numeric() drops the 1 x 1 matrix shape
# [1] 16.3317
sepred <- sqrt(sigmahat2) * sqrt(1 + as.numeric(t(xstar) %*% solve(t(X) %*% X) %*% xstar))
ytilde + c(-1, 1) * qt(0.975, length(y) - 2 - 1) * sepred   # 95% PI
# [1] -11.44902 44.11242
