\[ \log(y_i) = \alpha + \beta x_i + \epsilon_i, \]
where $\alpha = \log(\gamma)$ and $\beta = \log(\delta)$.
(b) Now, consider the sum of squares:
\[ SS(\alpha,\beta) = \sum_{i=1}^{n} \epsilon_i^2 = \sum_{i=1}^{n}\left(\log(y_i) - \alpha - \beta x_i\right)^2 \]
and differentiating with respect to the parameters and setting to zero, that is,
\begin{align*}
\frac{\partial}{\partial \alpha} SS(\alpha,\beta) &= \sum_{i=1}^{n}(-2)\left(\log(y_i) - \alpha - \beta x_i\right) = 0 \\
\frac{\partial}{\partial \beta} SS(\alpha,\beta) &= \sum_{i=1}^{n}(-2x_i)\left(\log(y_i) - \alpha - \beta x_i\right) = 0,
\end{align*}
which gives the normal equations
\[ \sum_{i=1}^{n}\log(y_i) = n\alpha + \beta\sum_{i=1}^{n} x_i, \qquad
\sum_{i=1}^{n} x_i\log(y_i) = \alpha\sum_{i=1}^{n} x_i + \beta\sum_{i=1}^{n} x_i^2. \]
Solving, we get:
\[ \hat{\alpha} = \sum_{i=1}^{n}\log(y_i)/n - \hat{\beta}\sum_{i=1}^{n} x_i/n = \overline{\log(y)} - \hat{\beta}\,\overline{x} \]
\begin{align*}
\hat{\beta} &= \frac{\sum_{i=1}^{n} x_i\log(y_i) - \hat{\alpha}\sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i^2}
= \frac{\sum_{i=1}^{n} x_i\log(y_i) - \left(\overline{\log(y)} - \hat{\beta}\,\overline{x}\right)\sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i^2} \\
\Rightarrow\quad \hat{\beta} &= \frac{\sum_{i=1}^{n} x_i\log(y_i) - \overline{\log(y)}\sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i^2 - \left(\sum_{i=1}^{n} x_i\right)^2/n} \\
&= \frac{\sum_{i=1}^{n} x_i\log(y_i) - \sum_{i=1}^{n} x_i\sum_{j=1}^{n}\log(y_j)/n}{\sum_{i=1}^{n} x_i^2 - n\overline{x}^2} \\
&= \frac{\sum_{i=1}^{n}(x_i - \overline{x})\log(y_i)}{\sum_{i=1}^{n} x_i^2 - n\overline{x}^2}
= \sum_{i=1}^{n} c_i\log(y_i),
\end{align*}
where $c_i = (x_i - \overline{x})\big/\left(\sum_{j=1}^{n} x_j^2 - n\overline{x}^2\right)$.
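As a quick numerical illustration, the following Python sketch (using simulated data, since no data set is given in this part, and assuming numpy is available) computes $\hat\beta = \sum_i c_i\log(y_i)$ and $\hat\alpha = \overline{\log(y)} - \hat\beta\,\overline{x}$ and checks them against a standard least-squares fit:

```python
import numpy as np

# Hypothetical example data: y_i = gamma * delta**x_i * error,
# so that log(y_i) = alpha + beta*x_i + eps_i.
rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 20)
log_y = np.log(2.0) + np.log(1.3) * x + rng.normal(0.0, 0.1, size=x.size)
y = np.exp(log_y)

n = x.size
xbar = x.mean()
c = (x - xbar) / (np.sum(x**2) - n * xbar**2)    # weights c_i from the derivation
beta_hat = np.sum(c * np.log(y))                  # beta_hat = sum_i c_i log(y_i)
alpha_hat = np.log(y).mean() - beta_hat * xbar    # alpha_hat = mean(log y) - beta_hat * xbar

# Cross-check against a standard least-squares fit of log(y) on x.
beta_chk, alpha_chk = np.polyfit(x, np.log(y), 1)
print(beta_hat, beta_chk)    # the two slope values should agree
print(alpha_hat, alpha_chk)  # the two intercept values should agree
```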
(c)
\begin{align*}
E\left[\hat{\beta}\,\middle|\,X = x\right] &= E\left[\sum_{i=1}^{n}\frac{x_i - \overline{x}}{\sum_{j=1}^{n} x_j^2 - n\overline{x}^2}\log(y_i)\,\middle|\,X = x\right] \\
&= \sum_{i=1}^{n}\frac{x_i - \overline{x}}{\sum_{j=1}^{n} x_j^2 - n\overline{x}^2}\,E\left[\log(y_i)\,|\,X = x\right] \\
&= \sum_{i=1}^{n}\frac{x_i - \overline{x}}{\sum_{j=1}^{n} x_j^2 - n\overline{x}^2}\left(\alpha + \beta x_i\right) \\
&= \alpha\underbrace{\sum_{i=1}^{n}\frac{x_i - \overline{x}}{\sum_{j=1}^{n} x_j^2 - n\overline{x}^2}}_{=0}
+ \beta\underbrace{\frac{\sum_{i=1}^{n} x_i(x_i - \overline{x})}{\sum_{j=1}^{n} x_j^2 - n\overline{x}^2}}_{=1} = \beta.
\end{align*}
(d)
\begin{align*}
Var\left(\hat{\beta}\,\middle|\,X = x\right) &= Var\left(\sum_{i=1}^{n} c_i\log(Y_i)\,\middle|\,X = x\right)
= \sigma^2\sum_{i=1}^{n} c_i^2 \\
&= \sigma^2\frac{\sum_{i=1}^{n}(x_i - \overline{x})^2}{\left(\sum_{i=1}^{n} x_i^2 - n\overline{x}^2\right)^2}
= \frac{\sigma^2}{\sum_{i=1}^{n} x_i^2 - n\overline{x}^2}.
\end{align*}
(f)
\begin{align*}
E\left[\hat{\alpha}\,\middle|\,X = x\right] &= E\left[\overline{\log(y)} - \hat{\beta}\,\overline{x}\,\middle|\,X = x\right]
= \frac{1}{n}\sum_{i=1}^{n} E\left[\log(Y_i)\,|\,X = x\right] - \overline{x}\,E\left[\hat{\beta}\,|\,X = x\right] \\
&= \frac{1}{n}\sum_{i=1}^{n}\left(\alpha + \beta x_i\right) - \beta\overline{x}
= \frac{1}{n}\left(n\alpha + \beta\sum_{i=1}^{n} x_i\right) - \beta\overline{x} \\
&= \alpha + \beta\left(\sum_{i=1}^{n} x_i/n - \overline{x}\right) = \alpha.
\end{align*}
(g)
\begin{align*}
Var\left(\hat{\alpha}\,|\,X = x\right) &= Var\left(\overline{\log(y)} - \hat{\beta}\,\overline{x}\,\middle|\,X = x\right) \\
&= Var\left(\overline{\log(y)}\,|\,X = x\right) + \overline{x}^2\,Var\left(\hat{\beta}\,|\,X = x\right)
- 2\overline{x}\,Cov\left(\overline{\log(y)}, \hat{\beta}\,\middle|\,X = x\right) \\
&= Var\left(\sum_{i=1}^{n}\frac{\log(y_i)}{n}\,\middle|\,X = x\right) + \overline{x}^2\,Var\left(\hat{\beta}\,|\,X = x\right)
- 2\overline{x}\,Cov\left(\sum_{i=1}^{n}\frac{\log(y_i)}{n},\,\sum_{i=1}^{n} c_i\log(y_i)\,\middle|\,X = x\right) \\
&= \frac{1}{n^2}\sum_{i=1}^{n} Var\left(\log(y_i)\,|\,X = x\right) + \overline{x}^2\,Var\left(\hat{\beta}\,|\,X = x\right)
- \frac{2\overline{x}}{n}\sum_{i=1}^{n}\sum_{j=1}^{n} c_i\,Cov\left(\log(y_i), \log(y_j)\,|\,X = x\right) \\
&\overset{*}{=} \frac{1}{n}Var\left(\log(y_i)\,|\,X = x\right) + \overline{x}^2\,Var\left(\hat{\beta}\,|\,X = x\right)
- \frac{2\overline{x}}{n}\sum_{i=1}^{n} c_i\,Var\left(\log(y_i)\,|\,X = x\right) \\
&= \frac{\sigma^2}{n} + \frac{\overline{x}^2\sigma^2}{\sum_{i=1}^{n}(x_i - \overline{x})^2}
- \frac{2\overline{x}\sigma^2}{n}\underbrace{\sum_{i=1}^{n} c_i}_{=0\ \text{(since }\sum_{i=1}^{n}(x_i - \overline{x}) = 0)} \\
&= \sigma^2\left(\frac{1}{n} + \frac{\overline{x}^2}{\sum_{i=1}^{n}(x_i - \overline{x})^2}\right),
\end{align*}
* using that $Cov(\log(y_i), \log(y_j)\,|\,X = x)$ is equal to zero if $i \neq j$ and equal to $Var(\log(y_i)\,|\,X = x)$ if $i = j$.
(h) We have that $\hat{\alpha}$ is a linear combination of two normally distributed random variables, i.e., $(\overline{\log(Y)}\,|\,X = x)$ and $(\hat{\beta}\,|\,X = x)$, and is thus also normally distributed. Its mean and variance are given in questions (f) and (g).
(i)
b = x =Cov log(y) x,
b |X
b =x
Cov
b, |X
b = x xCov ,
b |X
b =x
=Cov log(y), |X
|
{z
}
=0(see (g))
x 2
2
i=1 (xi x)
= Pn
(j) We have $\gamma = \exp(\alpha)$, $\delta = \exp(\beta)$. Moreover, we have that $(\hat{\alpha}\,|\,X = x)$, $(\hat{\beta}\,|\,X = x)$ and $\log(Y)$ are normally distributed with their means and variances as given in (e), (f), and (g). Thus, $(\hat{\gamma}\,|\,X = x)$, $(\hat{\delta}\,|\,X = x)$ and $(Y\,|\,X = x)$ are lognormally distributed with parameters $\mu$ the mean of the logarithm of the variable and $\sigma^2$ the variance of the logarithm of the variable. For example, for $(\hat{\gamma}\,|\,X = x)$, $\mu$ is $E[\hat{\alpha}\,|\,X = x]$ and $\sigma^2$ is $Var(\hat{\alpha}\,|\,X = x)$.
[Figure: scatter plot of the concentration of 3-MT (vertical axis, roughly 0.5 to 3.5) against the post mortem interval x (horizontal axis, roughly 0 to 60).]
2. (a) Interesting features are that, in general, the concentration of 3-MT in the brain seems to decrease as the post mortem interval increases. Another interesting feature is that we observe two observations with a much higher post mortem interval than the others.
The data seem appropriate for linear regression. The linear relationship seems to hold, especially for values of the interval between 5 and 26 (we have enough observations there). Care should be taken when evaluating y for x lower than 5 or larger than 26 (only two observations), because we do not know whether the linear relationship between x and y still holds there.
(b) We test:
\[ H_0: \rho = 0 \quad \text{v.s.} \quad H_1: \rho \neq 0. \]
The corresponding test statistic is given by:
\[ T = \frac{R\sqrt{n-2}}{\sqrt{1 - R^2}} \sim t_{n-2}. \]
We reject the null hypothesis for large and small values of the test statistic.
We have $n = 18$ and the correlation coefficient is given by:
\[ r = \frac{\sum x_i y_i - n\overline{x}\,\overline{y}}{\sqrt{\left(\sum x_i^2 - n\overline{x}^2\right)\left(\sum y_i^2 - n\overline{y}^2\right)}}, \]
so that
\[ T = \frac{-0.827\sqrt{16}}{\sqrt{1 - 0.827^2}} = -5.89. \]
From Formulae and Tables page 163 we observe $\Pr(t_{16} \geq 4.015) = \Pr(t_{16} \leq -4.015) = 0.05\%$,
* using the symmetry property of the student-$t$ distribution. We observe that the value of our test statistic ($-5.89$) is smaller than $-4.015$, thus our p-value should be smaller than $2 \times 0.05\% = 0.1\%$.
Thus, we can reject the null hypothesis even at a significance level of 0.1%, hence we can conclude that there is a linear dependency between interval and concentration. Note that the alternative hypothesis here is "a linear dependency" and not "a negative linear dependency", so by rejecting the null hypothesis you accept the alternative of a linear dependency. Had the alternative been a negative dependency you would also have accepted it, but due to the construction of this (two-sided) test we have to use the phrase "a linear dependency" and not "a negative linear dependency".
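A short numerical check of this test (a Python sketch; the values n = 18 and r = -0.827 are those used in the solution above, and scipy is assumed to be available for the t-distribution):

```python
from scipy import stats

n, r = 18, -0.827                             # sample size and correlation from the solution
T = r * (n - 2) ** 0.5 / (1 - r**2) ** 0.5    # test statistic
p_value = 2 * stats.t.sf(abs(T), df=n - 2)    # two-sided p-value

print(round(T, 2))        # approximately -5.9 (quoted as -5.89 above)
print(p_value < 0.001)    # True: the p-value is below 0.1%, so reject H0
```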
(c) The linear regression model is given by:
\[ y = \alpha + \beta x + \epsilon. \]
The (BLUE) estimate of the slope is given by:
\[ \hat{\beta} = \frac{\sum x_i y_i - n\left(\sum x_i/n\right)\left(\sum y_i/n\right)}{\sum x_i^2 - n\left(\sum x_i/n\right)^2}
= \frac{672.8 - 337 \times 42.98/18}{9854.4 - 337^2/18} = -0.0372008. \]
The (BLUE) estimate of the intercept is given by:
\[ \hat{\alpha} = \overline{y} - \hat{\beta}\,\overline{x} = 3.084259, \]
so that the fitted line is $\hat{y} = 3.084259 - 0.0372008\,x$.
We have:
\[ \frac{\hat{\beta}}{s.e.(\hat{\beta})} \sim t_{n-2}, \qquad s.e.(\hat{\beta}) = \sqrt{\frac{\hat{\sigma}^2}{\sum x_i^2 - n\overline{x}^2}}, \]
where
\begin{align*}
\hat{\sigma}^2 &= \frac{1}{n-2}\left(\sum y_i^2 - \left(\sum y_i\right)^2/n - \frac{\left(\sum x_i y_i - \sum x_i\sum y_i/n\right)^2}{\sum x_i^2 - \left(\sum x_i\right)^2/n}\right) \\
&= \frac{1}{16}\left(109.7936 - 42.98^2/18 - \frac{(672.8 - 337 \times 42.98/18)^2}{9854.5 - 337^2/18}\right) = 0.1413014,
\end{align*}
so that
\[ s.e.(\hat{\beta}) = \sqrt{\frac{0.1413014}{9854.5 - 337^2/18}} = 0.00631331. \]
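The arithmetic in this part can be verified quickly; a small Python sketch using the summary statistics quoted above (n = 18, Σx = 337, Σy = 42.98, Σx² = 9854.4, Σy² = 109.7936, Σxy = 672.8):

```python
import math

# Summary statistics for the 3-MT concentration vs post mortem interval data.
n = 18
sum_x, sum_y = 337.0, 42.98
sum_x2, sum_y2, sum_xy = 9854.4, 109.7936, 672.8

sxx = sum_x2 - sum_x**2 / n
syy = sum_y2 - sum_y**2 / n
sxy = sum_xy - sum_x * sum_y / n

beta_hat = sxy / sxx                             # slope estimate, about -0.0372
alpha_hat = sum_y / n - beta_hat * sum_x / n     # intercept estimate, about 3.084
sigma2_hat = (syy - sxy**2 / sxx) / (n - 2)      # residual variance estimate, about 0.1413
se_beta = math.sqrt(sigma2_hat / sxx)            # standard error of the slope, about 0.00631

print(beta_hat, alpha_hat, sigma2_hat, se_beta)
```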
3. (a) We have:
\[ \sum_{i=1}^{n}(y_i - \beta x_i)^2 = \sum_{i=1}^{n} y_i^2 + \beta^2\sum_{i=1}^{n} x_i^2 - 2\beta\sum_{i=1}^{n} y_i x_i. \]
Solving for $\beta$ we obtain the LSE estimator for $\beta$:
\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n} y_i x_i}{\sum_{i=1}^{n} x_i^2}. \]
Its expectation is:
\[ E\left[\hat{\beta}_1\right] = E\left[\frac{\sum_{i=1}^{n} y_i x_i}{\sum_{i=1}^{n} x_i^2}\right]
\overset{*}{=} \frac{\sum_{i=1}^{n} x_i\,E[y_i\,|\,x_i]}{\sum_{i=1}^{n} x_i^2}
\overset{**}{=} \frac{\sum_{i=1}^{n} x_i\,\beta x_i}{\sum_{i=1}^{n} x_i^2} = \beta, \]
* using that $\hat{\beta}_1$ given a value of $x_i$ only depends on the value of $y_i$, hence the $E[y_i\,|\,x_i]$ with the condition, and ** using $E[y_i\,|\,x_i] = \beta x_i$.
For the variance we have:
\[ Var\left(\hat{\beta}_1\right) = Var\left(\frac{\sum_{i=1}^{n} y_i x_i}{\sum_{i=1}^{n} x_i^2}\right)
= \frac{\sum_{i=1}^{n} x_i^2\,Var(y_i\,|\,x_i)}{\left(\sum_{i=1}^{n} x_i^2\right)^2}
= \frac{\sigma^2}{\sum_{i=1}^{n} x_i^2}. \]
(b) 1. The expected value of the alternative estimator $\hat{\beta}_2 = \sum_{i=1}^{n} Y_i\big/\sum_{i=1}^{n} x_i$ is given by:
\[ E\left[\hat{\beta}_2\right] = E\left[\frac{\sum_{i=1}^{n} Y_i}{\sum_{i=1}^{n} x_i}\right]
= \frac{\sum_{i=1}^{n} E[Y_i\,|\,x_i]}{\sum_{i=1}^{n} x_i}
= \frac{\beta\sum_{i=1}^{n} x_i}{\sum_{i=1}^{n} x_i} = \beta. \]
The variance of the estimator is given by:
\[ Var\left(\hat{\beta}_2\right) = Var\left(\frac{\sum_{i=1}^{n} Y_i}{\sum_{i=1}^{n} x_i}\right)
= \frac{\sum_{i=1}^{n} Var(Y_i\,|\,x_i)}{\left(\sum_{i=1}^{n} x_i\right)^2}
= \frac{n\sigma^2}{(n\overline{x})^2} = \frac{\sigma^2}{n\overline{x}^2}. \]
nx2
2. We need to prove: V ar(b2 ) V ar(b1 ) which is equivalent to prove that V ar(b2 )V ar(b1 )
c Katja Ignatieva
Page 7 of 19
Solutions Week 10
0.
2
2
Pn
2
2
nx
i=1 xi
1
1
2
P
0
n
=
2
nx2
i=1 xi
n
n
X
X
x2i nx2 =
V ar(b2 ) V ar(b1 ) =
i=1
where
s2x
i=1
(xi x)2 =
=
=
n
X
i=1
n
X
i=1
n
X
i=1
n
X
i=1
(c)
x2i + x2 2 (xi x)
x2i + nx2 2x
n
X
xi
i=1
Thus the variance of the estimator b2 is at least as large as the variance of the least squares
estimator b1 and is strictly larger if there is variability in the value xi can take.
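A quick numerical illustration of this inequality (a Python sketch with an arbitrary, made-up design vector x and σ² set to 1):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 5.0, 8.0])   # arbitrary example design points
sigma2 = 1.0
n = x.size

var_beta1 = sigma2 / np.sum(x**2)          # variance of the least squares estimator
var_beta2 = sigma2 / (n * x.mean()**2)     # variance of the alternative estimator

# The gap is governed by sum(x_i^2) - n*xbar^2 = sum((x_i - xbar)^2) >= 0.
print(var_beta1 <= var_beta2)                              # True
print(np.isclose(np.sum(x**2) - n * x.mean()**2,
                 np.sum((x - x.mean())**2)))               # True
```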
(c) 1. Our estimator is now $\hat{\beta}_3 = \sum_{i=1}^{n} a_i Y_i$. The mean of the estimator is:
\[ E\left[\hat{\beta}_3\right] = E\left[\sum_{i=1}^{n} a_i Y_i\right] = \sum_{i=1}^{n} a_i\,E[Y_i\,|\,x_i]
= \sum_{i=1}^{n} a_i\,\beta x_i = \beta\sum_{i=1}^{n} a_i x_i. \]
Thus if $\hat{\beta}_3$ is unbiased we have $E[\hat{\beta}_3] = \beta$, which is only the case if $\sum_{i=1}^{n} a_i x_i = 1$.
The variance of the estimator is given by:
\[ Var\left(\hat{\beta}_3\right) = Var\left(\sum_{i=1}^{n} a_i Y_i\right) = \sum_{i=1}^{n} a_i^2\,Var(Y_i\,|\,x_i) = \sigma^2\sum_{i=1}^{n} a_i^2. \]
2. For $\hat{\beta}_1$ we have:
\[ \hat{\beta}_1 = \frac{\sum_{i=1}^{n} x_i Y_i}{\sum_{i=1}^{n} x_i^2} = \sum_{i=1}^{n}\frac{x_i}{\sum_{j=1}^{n} x_j^2}\,Y_i, \]
hence $a_i = x_i\big/\sum_{j=1}^{n} x_j^2$ for $i = 1, \ldots, n$. We need to verify the condition $\sum_{i=1}^{n} a_i x_i = 1$:
\[ \sum_{i=1}^{n} a_i x_i = \sum_{i=1}^{n}\frac{x_i}{\sum_{j=1}^{n} x_j^2}\,x_i = \frac{\sum_{i=1}^{n} x_i^2}{\sum_{j=1}^{n} x_j^2} = 1. \]
For $\hat{\beta}_2$ we have:
\[ \hat{\beta}_2 = \frac{\sum_{i=1}^{n} Y_i}{\sum_{i=1}^{n} x_i} = \sum_{i=1}^{n}\frac{1}{\sum_{j=1}^{n} x_j}\,Y_i, \]
hence $a_i = 1\big/\sum_{j=1}^{n} x_j = \dfrac{1}{n\overline{x}}$ for $i = 1, \ldots, n$. We need to verify the condition $\sum_{i=1}^{n} a_i x_i = 1$:
\[ \sum_{i=1}^{n} a_i x_i = \sum_{i=1}^{n}\frac{x_i}{\sum_{j=1}^{n} x_j} = \frac{\sum_{i=1}^{n} x_i}{\sum_{j=1}^{n} x_j} = 1. \]
3. We have that $\hat{\beta}_3$ is the general notation of a linear estimator. The condition $\sum_{i=1}^{n} a_i x_i = 1$ implies that we only look at unbiased estimators. This means that the linear estimator with $a_i = x_i\big/\sum_{j=1}^{n} x_j^2$, which is the least squares estimator, is the best (i.e., minimum variance) linear unbiased estimator (BLUE).
4. (a) The linear regression model is given by:
\[ y_i = \alpha + \beta x_i + \epsilon_i, \]
where $\epsilon_i \sim N(0, \sigma^2)$ i.i.d. for $i = 1, \ldots, n$.
The fitted linear regression equation is given by:
\[ \hat{y} = \hat{\alpha} + \hat{\beta}x. \]
The estimated coefficients of the linear regression model are given by (see Formulae and Tables page 25):
\begin{align*}
\hat{\beta} &= \frac{s_{xy}}{s_{xx}} = \frac{1122}{\sum_{i=1}^{n} x_i^2 - n\overline{x}^2}
= \frac{1122}{60016 - 836^2/12} = \frac{1122}{1774.67} = 0.63223, \\
\hat{\alpha} &= \overline{y} - \hat{\beta}\,\overline{x} = \frac{\sum_{i=1}^{n} y_i}{n} - \hat{\beta}\,\frac{\sum_{i=1}^{n} x_i}{n}
= \frac{867}{12} - 0.63223 \times \frac{836}{12} = 28.205.
\end{align*}
Thus, the fitted linear regression equation is given by:
\[ \hat{y} = 28.205 + 0.63223\,x. \]
\begin{align*}
\hat{\sigma}^2 &= \frac{1}{n-2}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2
= \frac{1}{n-2}\left(\sum_{i=1}^{n} y_i^2 - n\overline{y}^2 - \frac{\left(\sum_{i=1}^{n}(x_i - \overline{x})(y_i - \overline{y})\right)^2}{\sum_{i=1}^{n} x_i^2 - n\overline{x}^2}\right) \\
&= \frac{1}{10}\left(63603 - \frac{867^2}{12} - \frac{1122^2}{60016 - 836^2/12}\right) = 25.289.
\end{align*}
Note: we have $n - 2$ degrees of freedom because we have to estimate two parameters from the data ($\hat{\alpha}$ and $\hat{\beta}$). We have that $s^2 = \hat{\sigma}^2$. Thus the 90% confidence interval is given by:
\begin{align*}
\frac{10\hat{\sigma}^2}{\chi^2_{0.95,10}} &< \sigma^2 < \frac{10\hat{\sigma}^2}{\chi^2_{0.05,10}} \\
\frac{10 \times 25.289}{18.3} &< \sigma^2 < \frac{10 \times 25.289}{3.94} \\
13.8 &< \sigma^2 < 64.2.
\end{align*}
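As a check on this part, a small Python sketch (scipy is assumed available for the chi-square quantiles; the summary statistics n = 12, Σx = 836, Σy = 867, Σx² = 60016, Σy² = 63603 and s_xy = 1122 are those quoted above):

```python
from scipy import stats

# Summary statistics from the question.
n = 12
sum_x, sum_y = 836.0, 867.0
sum_x2, sum_y2, sxy = 60016.0, 63603.0, 1122.0

sxx = sum_x2 - sum_x**2 / n
syy = sum_y2 - sum_y**2 / n

beta_hat = sxy / sxx                              # about 0.63223
alpha_hat = sum_y / n - beta_hat * sum_x / n      # about 28.205
sigma2_hat = (syy - sxy**2 / sxx) / (n - 2)       # about 25.289

# 90% confidence interval for sigma^2, based on (n-2)*sigma2_hat/sigma^2 ~ chi2(n-2).
lower = (n - 2) * sigma2_hat / stats.chi2.ppf(0.95, df=n - 2)
upper = (n - 2) * sigma2_hat / stats.chi2.ppf(0.05, df=n - 2)

print(round(beta_hat, 5), round(alpha_hat, 3), round(sigma2_hat, 3))
print(round(lower, 1), round(upper, 1))           # about (13.8, 64.2)
```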
v) The value of the test statistic is in the rejection region, hence we reject the null hypothesis of
a zero correlation.
(d) We have that
\[ \frac{y_i\,|\,x_i - \hat{y}\,|\,x_i}{\sqrt{\widehat{Var}(y_i\,|\,x_i)}} \sim t_{n-2}, \]
where
\[ \widehat{Var}(y_i\,|\,x_i = 53) = \hat{\sigma}^2\left(\frac{1}{n} + \frac{(53 - \overline{x})^2}{\sum_{i=1}^{n}(x_i - \overline{x})^2}\right)
= 25.289\left(\frac{1}{12} + \frac{(53 - 836/12)^2}{60016 - 836^2/12}\right) = 6.0657. \]
Thus, the 95% confidence interval for the value of $y$ given that $x = 53$ is given by:
\[ \hat{y} - t_{1-0.05/2,10}\sqrt{\widehat{Var}(y_i\,|\,x_i = 53)} < y\,|\,x = 53 < \hat{y} + t_{1-0.05/2,10}\sqrt{\widehat{Var}(y_i\,|\,x_i = 53)}. \]
where
\begin{align*}
z_r &= \frac{1}{2}\log\left(\frac{1+r}{1-r}\right) = \frac{1}{2}\log\left(\frac{1 + 0.85860}{1 - 0.85860}\right) = 1.2880, \\
z_\rho &= \frac{1}{2}\log\left(\frac{1+\rho}{1-\rho}\right) = \frac{1}{2}\log\left(\frac{1 + 0.75}{1 - 0.75}\right) = 0.97296, \\
r &= \frac{\sum_{i=1}^{n}(x_i - \overline{x})(y_i - \overline{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \overline{x})^2\sum_{i=1}^{n}(y_i - \overline{y})^2}}
= \frac{1122}{\sqrt{\left(\sum_{i=1}^{n} y_i^2 - n\overline{y}^2\right)\left(\sum_{i=1}^{n} x_i^2 - n\overline{x}^2\right)}}
= \frac{1122}{\sqrt{962.25 \times 1774.667}} = 0.85860.
\end{align*}
v) We have that $z_{0.82894} = 0.95$, i.e. $\Phi(0.95) = 0.82894$. Thus, the p-value is given by $2(1 - 0.82894) = 0.34212$. The value of the test statistic is not in the critical region if the level of significance is lower than 0.34212 (which is normally the case). Hence, for reasonable values of the level of significance we would not reject the null hypothesis.
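A sketch of this Fisher z-transformation test in Python (assuming, as the numbers above suggest, n = 12, an observed correlation r = 0.85860 and a null value ρ₀ = 0.75; scipy is used for the normal CDF):

```python
import math
from scipy import stats

n, r, rho0 = 12, 0.85860, 0.75

z_r = 0.5 * math.log((1 + r) / (1 - r))          # Fisher transform of the sample correlation
z_rho = 0.5 * math.log((1 + rho0) / (1 - rho0))  # Fisher transform of the null value

# Under H0, sqrt(n - 3) * (z_r - z_rho) is approximately standard normal.
stat = math.sqrt(n - 3) * (z_r - z_rho)
p_value = 2 * (1 - stats.norm.cdf(abs(stat)))

print(round(z_r, 4), round(z_rho, 5))     # about 1.2880 and 0.97296
print(round(stat, 2), round(p_value, 3))  # about 0.95 and about 0.34
```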
(f) The proportion of the variability explained by the model is given by:
\begin{align*}
R^2 &= \frac{SSM}{SST} = 1 - \frac{SSE}{SST}
= 1 - \frac{\sum_{i=1}^{n}(y_i - \hat{y}_i)^2}{\sum_{i=1}^{n}(y_i - \overline{y})^2} \\
&= 1 - \frac{\sum_{i=1}^{n} y_i^2 - n\overline{y}^2 - \dfrac{\left(\sum_{i=1}^{n}(x_i - \overline{x})(y_i - \overline{y})\right)^2}{\sum_{i=1}^{n} x_i^2 - n\overline{x}^2}}{\sum_{i=1}^{n} y_i^2 - n\overline{y}^2} \\
&= \frac{\left(\sum_{i=1}^{n}(x_i - \overline{x})(y_i - \overline{y})\right)^2}{\left(\sum_{i=1}^{n} y_i^2 - n\overline{y}^2\right)\left(\sum_{i=1}^{n} x_i^2 - n\overline{x}^2\right)} \\
&= \frac{1122^2}{962.25 \times 1774.667} = 0.737193.
\end{align*}
Source       D.F.   Sum of Squares           Mean Squares   F-Ratio
Regression   1      639.5 - 475.6 = 163.9    163.9          163.9/8.2 = 19.99
Residual     58     8.2 x 58 = 475.6         8.2
Total        59     639.5
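A small Python sketch reconstructing the missing entries of this table from the quantities that were given (total sum of squares 639.5, residual mean square 8.2, and degrees of freedom 1 and 58):

```python
# Quantities given in the question.
sst, mse = 639.5, 8.2
df_reg, df_res = 1, 58

sse = mse * df_res            # residual sum of squares: 475.6
ssm = sst - sse               # regression sum of squares: 163.9
msm = ssm / df_reg            # regression mean square: 163.9
f_ratio = msm / mse           # F ratio: about 19.99

print(round(sse, 1), round(ssm, 1), round(msm, 1), round(f_ratio, 2))
```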
Using the $R^2$ value computed below, $r^2 = 0.4794$ and hence $r = +\sqrt{0.4794} = 69.2\%$. You take the positive square root because of the positive sign of the coefficient of EPS.
The 95% confidence interval is given by:
\[ \hat{\beta} \pm t_{1-0.025,\,n-2}\,\frac{s}{\sqrt{(n-1)s_x^2}}
= 7.445 \pm 2.0147 \times \frac{\sqrt{247}}{2.004\sqrt{47}}
= 7.445 \pm 2.305
= (5.14,\ 9.75). \]
(d) $s = \sqrt{247} = 15.716$ and $R^2 = \dfrac{SSM}{SST} = \dfrac{10475}{21851} = 47.94\%$.
(e) A scatter plot or diagram of the fitted values against the residuals (standardised) will provide us with an indication of the constancy of the variation in the errors.
(f) To test for the significance of the variable EPS, we test $H_0: \beta = 0$ against $H_a: \beta \neq 0$. The test statistic is:
\[ t_{\hat{\beta}} = \frac{\hat{\beta}}{se(\hat{\beta})} = \frac{7.445}{1.144} = 6.508. \]
This is larger than $t_{1-\alpha/2,n-2} = 2.0147$ and therefore we reject the null. There is evidence to support the fact that the EPS variable is a significant predictor of stock price.
(g) To test $H_0: \beta = 24$ against $H_a: \beta > 24$, the test statistic is given by:
\[ t_{\hat{\beta}} = \frac{\hat{\beta} - \beta_0}{se(\hat{\beta})} = \frac{7.445 - 24}{1.144} = -14.47. \]
Thus, since this test statistic is smaller than $t_{1-\alpha,n-2} = t_{0.95,46} = 1.676$, we do not reject the null hypothesis.
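These two tests are easy to reproduce numerically; a Python sketch using the quoted values $\hat\beta = 7.445$ and $se(\hat\beta) = 1.144$ with n = 48 (so 46 degrees of freedom), with scipy providing the critical values:

```python
from scipy import stats

n = 48
beta_hat, se_beta = 7.445, 1.144

# (f) H0: beta = 0 against Ha: beta != 0 (two-sided).
t_f = beta_hat / se_beta
crit_f = stats.t.ppf(0.975, df=n - 2)
print(round(t_f, 3), round(crit_f, 4), abs(t_f) > crit_f)   # about 6.508, 2.0129, True -> reject

# (g) H0: beta = 24 against Ha: beta > 24 (one-sided).
t_g = (beta_hat - 24) / se_beta
crit_g = stats.t.ppf(0.95, df=n - 2)
print(round(t_g, 2), round(crit_g, 3), t_g > crit_g)         # about -14.47, 1.679, False -> do not reject
```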
7. The grand total/sum is $\sum x = 2479 + 2619 + 2441 + 2677 = 10216$, so that the grand mean is $\overline{x} = 10216/40 = 255.4$. Also, $\sum x^2 = 617163 + 687467 + 597607 + 718973 = 2621210$. Therefore the total sum of squares is:
\[ SST = \sum\left(x - \overline{x}\right)^2 = \sum x^2 - N\overline{x}^2 = 2621210 - 40 \times 255.4^2 = 12043.6. \]
The ANOVA table is:

Source            d.f.   Sum of Squares   Mean Squares   F-Statistic
Between regions   3      3774.8           1258.27        1258.27/229.69 = 5.478
Residual          36     8268.8           229.69
Total             39     12043.6
Thus, to test the equality of the mean premiums across the regions, we test:
\[ H_0: \tau_A = \tau_B = \tau_C = \tau_D = 0 \]
using the $F$-test. Since $F = 5.478 > F_{0.95}(3, 36) = 2.9$ (approximately), we therefore reject $H_0$. There is evidence to support a difference in the mean premiums across regions. The one-way ANOVA model assumptions are as follows: each random variable $x_{ij}$ is observed according to the model
\[ x_{ij} = \mu + \tau_i + \epsilon_{ij}, \quad \text{for } i = 1, \ldots, I, \text{ and } j = 1, 2, \ldots, n_i, \]
where $\epsilon_{ij}$ refers to the random error in the $j$th observation of the $i$th treatment which satisfies:
- $E[\epsilon_{ij}] = 0$ and $Var(\epsilon_{ij}) = \sigma^2$ for all $i, j$.
- The $\epsilon_{ij}$ are independent and normally distributed (normal errors), and where $\mu$ is the overall mean and $\tau_i$ is the effect of the $i$th treatment with:
\[ \sum_{i=1}^{I}\tau_i = 0. \]
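For completeness, a Python sketch of the F computation for this question from the regional totals (the per-region sums 2479, 2619, 2441, 2677 and Σx² = 2621210 are quoted above; equal group sizes of 10 per region are assumed, consistent with N = 40):

```python
from scipy import stats

totals = [2479, 2619, 2441, 2677]    # premium totals per region (from the question)
sum_x2 = 2621210.0                    # sum of squared observations
n_per_group = 10                      # assumed: 40 observations split equally over 4 regions
N = n_per_group * len(totals)

grand_total = sum(totals)
sst = sum_x2 - grand_total**2 / N
ssb = sum(t**2 for t in totals) / n_per_group - grand_total**2 / N
sse = sst - ssb

df_b, df_e = len(totals) - 1, N - len(totals)
f_stat = (ssb / df_b) / (sse / df_e)
f_crit = stats.f.ppf(0.95, df_b, df_e)

print(round(sst, 1), round(ssb, 1), round(sse, 1))          # about 12043.6, 3774.8, 8268.8
print(round(f_stat, 3), round(f_crit, 2), f_stat > f_crit)  # about 5.478, 2.87, True -> reject H0
```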
Here $i = 1, \ldots, I$ and $j = 1, \ldots, J$, and the error terms $\epsilon_{ij}$ are i.i.d. normal random variables with mean 0 and common variance $\sigma^2$. Since
\[ Y_{ij} \sim N\left(\mu + \tau_i, \sigma^2\right), \]
the likelihood function is given by:
\[ L\left(y_{ij}; \mu, \tau_i, \sigma^2\right) \propto \sigma^{-N}\exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{I}\sum_{j=1}^{J}\left(y_{ij} - \mu - \tau_i\right)^2\right), \]
where $N = I \cdot J$ is the grand total sample size. Now, take the log-likelihood and differentiate with respect to each parameter:
\[ \log L\left(y_{ij}; \mu, \tau_i, \sigma^2\right) = -\frac{N}{2}\log(2\pi) - N\log\sigma - \frac{1}{2\sigma^2}\sum_{i=1}^{I}\sum_{j=1}^{J}\left(y_{ij} - \mu - \tau_i\right)^2 \]
and
\begin{align*}
\frac{\partial\log L}{\partial\mu} &= \frac{1}{\sigma^2}\sum_{i=1}^{I}\sum_{j=1}^{J}\left(y_{ij} - \mu - \tau_i\right)
= \frac{1}{\sigma^2}\left(\sum_{i=1}^{I}\sum_{j=1}^{J} y_{ij} - IJ\mu - J\sum_{i=1}^{I}\tau_i\right) = 0, \\
\frac{\partial\log L}{\partial\tau_k} &= \frac{1}{\sigma^2}\sum_{j=1}^{J}\left(y_{kj} - \mu - \tau_k\right) = 0, \quad \text{for } k = 1, 2, \ldots, I, \\
\frac{\partial\log L}{\partial\sigma} &= -\frac{N}{\sigma} + \frac{\sum_{i=1}^{I}\sum_{j=1}^{J}\left(y_{ij} - \mu - \tau_i\right)^2}{\sigma^3} = 0.
\end{align*}
Assuming $\sum_{i=1}^{I}\tau_i = 0$, which is a standard assumption in the one-way ANOVA model, we have from the first equation:
\[ \hat{\mu} = \frac{1}{IJ}\sum_{i=1}^{I}\sum_{j=1}^{J} y_{ij} = \overline{y}. \]
From the second equation:
\[ \hat{\tau}_k = \frac{1}{J}\sum_{j=1}^{J} y_{kj} - \overline{y} = \overline{y}_{k\cdot} - \overline{y}, \]
and from the last equation, we have the MLE for the variance of the error term:
\[ \hat{\sigma}^2 = \frac{1}{IJ}\sum_{i=1}^{I}\sum_{j=1}^{J}\left(y_{ij} - \overline{y} - \left(\overline{y}_{i\cdot} - \overline{y}\right)\right)^2
= \frac{1}{IJ}\sum_{i=1}^{I}\sum_{j=1}^{J}\left(y_{ij} - \overline{y}_{i\cdot}\right)^2. \]
The likelihood function can be written as:
\[ L\left(y_{ij}; \mu, \tau_i, \sigma\right) = \left(\frac{1}{\sigma\sqrt{2\pi}}\right)^{\sum_{i=1}^{I} n_i}
\exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{I}\sum_{j=1}^{n_i}\left(y_{ij} - (\mu + \tau_i)\right)^2\right), \]
so that the log-likelihood is
\[ l = -\sum_{i=1}^{I} n_i\log(\sigma) - \frac{1}{2\sigma^2}\sum_{i=1}^{I}\sum_{j=1}^{n_i}\left(y_{ij} - (\mu + \tau_i)\right)^2 + \text{constant}. \]
Taking the partial derivative of $l$ w.r.t. $\mu$ and equating to 0:
\begin{align*}
\frac{\partial l}{\partial\mu} &= \frac{1}{\sigma^2}\sum_{i=1}^{I}\sum_{j=1}^{n_i}\left(y_{ij} - (\mu + \tau_i)\right) = 0 \\
\sum_{i=1}^{I}\sum_{j=1}^{n_i} y_{ij} - \mu\sum_{i=1}^{I} n_i - \underbrace{\sum_{i=1}^{I} n_i\tau_i}_{=0} &= 0 \\
\sum_{i=1}^{I}\sum_{j=1}^{n_i} y_{ij} - \mu N &= 0 \\
\hat{\mu} &= \frac{\sum_{i=1}^{I}\sum_{j=1}^{n_i} y_{ij}}{N}.
\end{align*}
Taking the partial derivative of $l$ w.r.t. $\tau_i$ and equating to 0:
\begin{align*}
\frac{\partial l}{\partial\tau_i} &= \frac{1}{\sigma^2}\sum_{j=1}^{n_i}\left(y_{ij} - (\mu + \tau_i)\right) = 0 \\
\sum_{j=1}^{n_i} y_{ij} - n_i\mu - n_i\tau_i &= 0 \\
n_i\tau_i &= \sum_{j=1}^{n_i} y_{ij} - n_i\hat{\mu} \\
\hat{\tau}_i &= \frac{\sum_{j=1}^{n_i} y_{ij}}{n_i} - \hat{\mu}.
\end{align*}
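A tiny numerical check of these estimators on a made-up unbalanced data set (a sketch; the groups and values below are arbitrary):

```python
import numpy as np

# Arbitrary example: I = 3 treatments with unequal group sizes n_i.
groups = [np.array([4.0, 5.0, 6.0]),
          np.array([7.0, 9.0]),
          np.array([2.0, 3.0, 4.0, 3.0])]

N = sum(len(g) for g in groups)
mu_hat = sum(g.sum() for g in groups) / N        # grand mean
tau_hat = [g.mean() - mu_hat for g in groups]    # tau_i_hat = group mean - grand mean

print(round(mu_hat, 4))
print([round(t, 4) for t in tau_hat])
# With unequal n_i the weighted sum of the estimated effects vanishes: sum_i n_i * tau_i_hat = 0.
print(round(sum(len(g) * t for g, t in zip(groups, tau_hat)), 10))   # essentially 0
```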
for $0 < y < 1$. The best test therefore rejects $H_0$ for small values of $y$, i.e. for $y < c$, where $c$ is determined from the size of the test:
\[ \Pr(y < c\,|\,\theta = 2) = \int_0^c 2y\,\mathrm{d}y = c^2. \]
Therefore $c = \sqrt{0.05} = 0.2236$ and the rejection region of the best test is defined by:
\[ y < 0.2236, \]
i.e. reject $H_0$ when $y < 0.2236$ for a 5% level of significance.
11. (a) We have the estimated correlation coefficient:
\begin{align*}
r &= \frac{s_{ms}}{\sqrt{s_{mm}\,s_{ss}}}
= \frac{\sum ms - n\overline{m}\,\overline{s}}{\sqrt{\left(\sum m^2 - n\overline{m}^2\right)\left(\sum s^2 - n\overline{s}^2\right)}} \\
&= \frac{221{,}022.58 - 1136.1 \times 1934.2/10}{\sqrt{\left(129{,}853.03 - 1136.1^2/10\right)\left(377{,}700.62 - 1934.2^2/10\right)}} = 0.764.
\end{align*}
i) The hypotheses are:
\[ H_0: \rho = 0 \quad \text{v.s.} \quad H_1: \rho > 0. \]
ii) The test statistic is:
\[ T = \frac{r\sqrt{n-2}}{\sqrt{1 - r^2}} \sim t_{n-2}. \]
iii) The critical region is given by:
\[ C = \left\{(X_1, \ldots, X_n): T \in \left(t_{n-2,1-\alpha}, \infty\right)\right\}. \]
iv) The value of the test statistic is:
\[ T = \frac{r\sqrt{n-2}}{\sqrt{1 - r^2}} = \frac{0.764\sqrt{10-2}}{\sqrt{1 - 0.764^2}} = 3.35. \]
v) We have $t_{8,1-0.005} = 3.35$. Thus the p-value is approximately 0.005 and we reject the null hypothesis of a zero correlation at any level of significance greater than 0.005 (which is usually the case, so we reject the null).
(b) Given the issue of whether mortality can be used to predict sickness, we require a plot of sickness against mortality:

[Figure: scatter plot of sickness (s, roughly 160 to 230) against mortality (m, roughly 100 to 130).]

There seems to be an increasing linear relationship, such that mortality could be used to predict sickness.
(c) We have the estimates:
\begin{align*}
\hat{\beta} &= \frac{s_{ms}}{s_{mm}} = \frac{\sum ms - n\overline{m}\,\overline{s}}{\sum m^2 - n\overline{m}^2}
= \frac{221{,}022.58 - 1136.1 \times 1934.2/10}{129{,}853.03 - 1136.1^2/10} = 1.6371, \\
\hat{\alpha} &= \overline{s} - \hat{\beta}\,\overline{m} = \frac{1934.2}{10} - 1.6371 \times \frac{1136.1}{10} = 7.426, \\
\hat{\sigma}^2 &= \frac{1}{n-2}\sum_{i=1}^{n}(y_i - \hat{y}_i)^2 = \frac{1}{n-2}\left(s_{ss} - \frac{s_{ms}^2}{s_{mm}}\right)
= \frac{1}{8}\left(3587.656 - \frac{(1278.118)^2}{780.709}\right) = 186.902, \\
\widehat{Var}(\hat{\beta}) &= \hat{\sigma}^2/s_{mm} = 186.902/780.709 = 0.2394.
\end{align*}
i) Hypothesis:
\[ H_0: \beta = 2 \quad \text{v.s.} \quad H_1: \beta < 2. \]
ii) Test statistic:
\[ T = \frac{\hat{\beta} - 2}{\sqrt{\hat{\sigma}^2/s_{mm}}} \sim t_{n-2}. \]
iii) Critical region:
\[ C = \left\{(X_1, \ldots, X_n): T \in \left(-\infty, -t_{n-2,1-\alpha}\right)\right\}. \]
iv) The value of the test statistic is:
\[ T = \frac{1.6371 - 2}{\sqrt{0.2394}} = -0.742. \]
v) We have from Formulae and Tables page 163: $t_{8,1-0.25} = 0.7064$ and $t_{8,1-0.20} = 0.8889$. Thus the p-value (using symmetry) is between 0.2 and 0.25. Thus, we accept the null hypothesis if the level of significance is smaller than the p-value (which is usually the case). Note: the exact p-value using a computer package is 0.2402.
(d) For a region with $m = 115$ we have the estimated value:
\[ \hat{s} = 7.426 + 1.6371 \times 115 = 195.69, \]
with
\[ \widehat{Var}(\hat{s}\,|\,m = 115) = \hat{\sigma}^2\left(\frac{1}{n} + \frac{(115 - \overline{m})^2}{s_{mm}}\right)
= 186.902\left(\frac{1}{10} + \frac{(115 - 113.61)^2}{780.709}\right) = 19.1528. \]
The corresponding 95% confidence limits are $195.69 - t_{8,1-0.025}\,s.e.(\hat{s}\,|\,m = 115) = 195.69 - 2.306\sqrt{19.1528} = 185.60$ and $195.69 + t_{8,1-0.025}\,s.e.(\hat{s}\,|\,m = 115) = 195.69 + 2.306\sqrt{19.1528} = 205.78$.
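The calculations for this question can be reproduced from the summary statistics; a Python sketch (summary values as quoted above: n = 10, Σm = 1136.1, Σs = 1934.2, Σm² = 129853.03, Σs² = 377700.62, Σms = 221022.58; scipy is used only for the t quantile):

```python
import math
from scipy import stats

n = 10
sum_m, sum_s = 1136.1, 1934.2
sum_m2, sum_s2, sum_ms = 129853.03, 377700.62, 221022.58

smm = sum_m2 - sum_m**2 / n
sss = sum_s2 - sum_s**2 / n
sms = sum_ms - sum_m * sum_s / n

r = sms / math.sqrt(smm * sss)                    # about 0.764
beta_hat = sms / smm                              # about 1.6371
alpha_hat = sum_s / n - beta_hat * sum_m / n      # about 7.426
sigma2_hat = (sss - sms**2 / smm) / (n - 2)       # about 186.902

# (c) test of H0: beta = 2 against H1: beta < 2
t_stat = (beta_hat - 2) / math.sqrt(sigma2_hat / smm)   # about -0.742

# (d) 95% confidence limits for the mean sickness when m = 115
m0 = 115.0
s_fit = alpha_hat + beta_hat * m0
var_fit = sigma2_hat * (1 / n + (m0 - sum_m / n) ** 2 / smm)
t_crit = stats.t.ppf(0.975, df=n - 2)

print(round(r, 3), round(beta_hat, 4), round(alpha_hat, 3), round(sigma2_hat, 3))
print(round(t_stat, 3))
print(round(s_fit - t_crit * math.sqrt(var_fit), 2),
      round(s_fit + t_crit * math.sqrt(var_fit), 2))     # about 185.6 and 205.8
```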
12. (a) 1.

[Figure: plot of the number of deaths against quarter i (i = 1 to 12), with the number of deaths ranging from roughly 0 to 40.]

2. The mean number of deaths increases with quarter, at an increasing rate. The variance also appears to increase with quarter.
(b) 1. We have $q = \sum_{i=1}^{12}\left(n_i - \lambda i^2\right)^2$. Take the derivative of $q$ with respect to $\lambda$ and equate it to zero; we obtain:
\begin{align*}
\frac{\partial q}{\partial\lambda} &= -2\sum_{i=1}^{12} i^2\left(n_i - \lambda i^2\right) = 0 \\
\sum_{i=1}^{12} n_i i^2 &= \lambda\sum_{i=1}^{12} i^4 \\
\frac{\sum_{i=1}^{12} n_i i^2}{\sum_{i=1}^{12} i^4} &= \hat{\lambda}.
\end{align*}
This is a minimum since $\dfrac{\partial^2 q}{\partial\lambda^2} > 0$:
\[ \frac{\partial^2 q}{\partial\lambda^2} = 2\sum_{i=1}^{12} i^4 > 0. \]
2. We have $q = \sum_{i=1}^{12}\left(n_i/i - \lambda i\right)^2 = \sum_{i=1}^{12}\dfrac{\left(n_i - \lambda i^2\right)^2}{i^2}$. Take the derivative of $q$ with respect to $\lambda$ and equate it to zero; we obtain:
\begin{align*}
\frac{\partial q}{\partial\lambda} &= -2\sum_{i=1}^{12} i\left(n_i/i - \lambda i\right) = 0 \\
\sum_{i=1}^{12} n_i &= \lambda\sum_{i=1}^{12} i^2 \\
\frac{\sum_{i=1}^{12} n_i}{\sum_{i=1}^{12} i^2} &= \tilde{\lambda}.
\end{align*}
This is a minimum since $\dfrac{\partial^2 q}{\partial\lambda^2} > 0$:
\[ \frac{\partial^2 q}{\partial\lambda^2} = 2\sum_{i=1}^{12} i^2 > 0. \]
3. We have:
\[ \hat{\lambda} = \frac{\sum_{i=1}^{12} n_i i^2}{\sum_{i=1}^{12} i^4} = \frac{15694}{60710} = 0.259,
\qquad \tilde{\lambda} = \frac{\sum_{i=1}^{12} n_i}{\sum_{i=1}^{12} i^2} = \frac{174}{650} = 0.268. \]
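A one-line check of these two estimates from the totals quoted above (Σnᵢi² = 15694, Σi⁴ = 60710, Σnᵢ = 174, Σi² = 650):

```python
# Totals quoted in the solution for quarters i = 1,...,12.
sum_ni_i2, sum_i4 = 15694, 60710
sum_ni, sum_i2 = 174, 650

lambda_hat = sum_ni_i2 / sum_i4    # unweighted least squares: about 0.259
lambda_tilde = sum_ni / sum_i2     # weighted (by 1/i^2) least squares: about 0.268
print(round(lambda_hat, 3), round(lambda_tilde, 3))
```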
(c) The test statistic is:
\[ \frac{\hat{\beta} - 2}{s.e.(\hat{\beta})} \sim t_{n-2}, \qquad
\frac{\hat{\beta} - 2}{s.e.(\hat{\beta})} = \frac{1.6008 - 2}{0.2525} = -1.58. \]
v) From Formulae and Tables page 163 we obtain $t_{10,1-0.10} = 1.372$ and $t_{10,1-0.05} = 1.812$. Thus the p-value of the hypothesis is between 0.1 and 0.2 (two-sided test!). For a level of significance lower than 0.1 we will accept the null hypothesis that $\beta = 2$, and thus this assumption seems appropriate. Note: the exact p-value using a computer package is 0.1452.
13. (a) 1. We have:
\begin{align*}
SST &= \sum y^2 - \left(\sum y\right)^2/n = 70.8744 - 29.12^2/16 = 17.8760, \\
\sum x &= 4 \times (1 + 2 + 3 + 4) = 40, \\
\sum x^2 &= 4 \times (1^2 + 2^2 + 3^2 + 4^2) = 120, \\
\sum xy &= 1 \times 2.73 + 2 \times 6.26 + 3 \times 9.22 + 4 \times 10.91 = 86.55, \\
s_{xy} &= \sum xy - \sum x\sum y/n = 86.55 - 40 \times 29.12/16 = 13.75, \\
SSM &= \hat{\beta}_1^2\,s_{xx} = \frac{s_{xy}^2}{s_{xx}} = \frac{13.75^2}{20} = 9.453125, \\
SSE &= SST - SSM = 17.8760 - 9.453125 = 8.422875.
\end{align*}
2.
\begin{align*}
\hat{\beta} &= \frac{s_{xy}}{s_{xx}} = \frac{13.75}{20} = 0.6875, \\
\hat{\alpha} &= \overline{y} - \hat{\beta}\,\overline{x} = 29.12/16 - 0.6875 \times 40/16 = 0.1012.
\end{align*}
Thus, the fitted model is given by $\hat{y} = \hat{\alpha} + \hat{\beta}x = 0.1012 + 0.6875x$.
For $x = 1$ we have: $\hat{y} = \hat{\alpha} + \hat{\beta}x = 0.1012 + 0.6875 \times 1 = 0.7887$.
For $x = 4$ we have: $\hat{y} = \hat{\alpha} + \hat{\beta}x = 0.1012 + 0.6875 \times 4 = 2.8512$.
3. We have $s.e.(\hat{\beta}) = \sqrt{\dfrac{8.4229/14}{20}} = 0.1734$.
i) Hypothesis:
\[ H_0: \beta = 0 \quad \text{v.s.} \quad H_1: \beta \neq 0. \]
ii) Test statistic:
\[ T = \frac{\hat{\beta}}{s.e.(\hat{\beta})} \sim t_{n-2}. \]
iii) Critical region:
\[ C = \left\{(X_1, \ldots, X_n): T \in \left(-\infty, -t_{n-2,1-\alpha/2}\right) \cup \left(t_{n-2,1-\alpha/2}, \infty\right)\right\}. \]
iv) The value of the test statistic is:
\[ T = \frac{\hat{\beta} - 0}{s.e.(\hat{\beta})} = \frac{0.6875 - 0}{0.1734} = 3.965. \]
v) We have $t_{14,1-0.001} = 3.787$ and $t_{14,1-0.0005} = 4.140$. Thus the p-value is between 0.1% and 0.2%. We would accept the null hypothesis only if the level of significance were lower than the p-value (which is usually not the case). Hence, we have strong evidence against the "no linear relationship" hypothesis. Note: the exact p-value using a computer package is 0.00070481.
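A Python sketch reproducing part (a) from the totals given above (n = 16, Σy = 29.12, Σy² = 70.8744, with x taking the values 1 to 4 four times each and the per-x totals of y being 2.73, 6.26, 9.22 and 10.91):

```python
import math

n = 16
sum_y, sum_y2 = 29.12, 70.8744
y_totals = {1: 2.73, 2: 6.26, 3: 9.22, 4: 10.91}   # sum of the four y-values observed at each x

sum_x = 4 * sum(y_totals)                          # 40
sum_x2 = 4 * sum(x**2 for x in y_totals)           # 120
sum_xy = sum(x * t for x, t in y_totals.items())   # 86.55

sxx = sum_x2 - sum_x**2 / n                        # 20
sxy = sum_xy - sum_x * sum_y / n                   # 13.75
sst = sum_y2 - sum_y**2 / n                        # about 17.8760
ssm = sxy**2 / sxx                                 # about 9.453125
sse = sst - ssm                                    # about 8.422875

beta_hat = sxy / sxx                               # 0.6875
alpha_hat = sum_y / n - beta_hat * sum_x / n       # about 0.1012
se_beta = math.sqrt((sse / (n - 2)) / sxx)         # about 0.1734
t_stat = beta_hat / se_beta                        # close to 3.965

print(round(sst, 4), round(ssm, 6), round(sse, 6))
print(round(beta_hat, 4), round(alpha_hat, 4), round(se_beta, 4), round(t_stat, 3))
```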
(b) 1. We have:
\begin{align*}
SST &= 17.8760, \\
SSB &= \left(2.73^2 + 6.26^2 + 9.22^2 + 10.91^2\right)/4 - 29.12^2/16 = 9.6709, \\
SSR &= SST - SSB = 17.8760 - 9.6709 = 8.2051.
\end{align*}
2.
\[ \hat{\mu} = 29.12/16 = 1.82. \]