1 Preface

In the spring of 2001 the Biometry Research Group at the Danish Institute of Agricultural Sciences
arranged a course on Mixed Models for researchers at the Department of Animal Health and
Animal Welfare at the same institute. The course consisted of a combination of lectures, group
exercises, written assignments and a final project report based on data from experiments that
the project participants were involved in.
During the course, the book SAS System for Mixed Models by Littell et al. (1996), referred to
as LMSW in the present document, was used. It was necessary to supplement the book with
additional theoretical material and with examples based on data from the research institute. This
led to a comprehensive set of slides used for the presentations.
This supplementary material is compiled in the present document. We hope the readers will
find it useful. The online version1 of this document may be even more useful, because of
the hypertext facilities.
1 http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/HSVmixed2001Slides.pdf
Contents

1 Preface
Contents
2 Overview of slides
6 An overview
    Outline
    Darwin's maize
    Galton's approach
    Correct approach
    What has happened
    The 5th pot
    Population genetics
    Population genetics / Animal breeding
    Mixed Models in general
Bibliography
2 Overview of slides
1. Brush-up on the necessary prerequisites: statistical concepts, linear algebra and
linear normal models. In addition, a historical review was given and experimental planning
was discussed. This covers Chapters 3-7.
2. This block of lectures covered the basic application of Mixed Models within the experi-
mental designs typically used at the Department of Animal Health and Animal Welfare.
In addition, the fundamentals concerning estimation and tests in Mixed Models are dis-
cussed in Chapter 12. The two remaining issues, numerical problems (Chapter 13) and
factor structure diagrams (Chapter 17), were included because of questions raised by the
participants. In the practical examples some of the variance component estimates were very
often set to 0, leading to problems concerning the calculation of degrees of freedom (i.e., with
Satterthwaite's approximation). This further raised a need for a more 'manual' approach
towards d.f. calculations in different designs.
3. In the final part of the course some additional topics and developments within Mixed
Models were presented, and efforts were made to give a general summary and overview of
the topics. Lectures concerning variance heterogeneity are presented in Chapters 19 and 20.
An example using the presented methods on data concerning diurnal variation is presented
in Chapter 21.
In addition, the preliminary work on the final project reports was presented during this
final block.
The final chapter (22) in this book consists of links to supplementary material, mainly SAS
examples.
The exercises used in the course are not included but can be found by visiting the home page of
the course1.
Finally, it should be mentioned that each chapter starts with a very short introduction to the
topic. In addition, a link to the full screen version of the presentation can be found.
1 http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/HSVmixed2001.htm
3 Basic Concepts from Linear Algebra
Linear algebra is an important prerequisite for understanding the model formulation and
calculations within Mixed Models. The following slides served as a brush-up on the theory,
presenting the most important concepts and results.
Link to the full screen presentation1
1 http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/LinAlg.f.pdf
Vectors
b = (2, 1, 3)
\[ a = (a^\top)^\top \]
• Example:
\[ \begin{pmatrix}1\\3\\2\end{pmatrix}^{\!\top} = [1, 3, 2] \quad\text{and}\quad [1, 3, 2]^\top = \begin{pmatrix}1\\3\\2\end{pmatrix} \]
• Example:
\[ 7\begin{pmatrix}1\\3\\2\end{pmatrix} = \begin{pmatrix}7\\21\\14\end{pmatrix} \]
Sum of vectors: Let a and b be n-vectors. The sum a + b is the n-vector
\[ a + b = \begin{pmatrix}a_1\\a_2\\\vdots\\a_n\end{pmatrix} + \begin{pmatrix}b_1\\b_2\\\vdots\\b_n\end{pmatrix} = \begin{pmatrix}a_1+b_1\\a_2+b_2\\\vdots\\a_n+b_n\end{pmatrix} = b + a \]
Matrices
A = [a1 : a2 : · · · : ac]
October 17, 2001 Mixed Models Course 10
Transpose of matrices: A matrix is transposed by interchanging
rows and columns and is denoted by "⊤". That is,
\[ A^\top = \begin{pmatrix}a_{11} & a_{21} & \dots & a_{r1}\\ a_{12} & a_{22} & \dots & a_{r2}\\ \vdots & \vdots & & \vdots\\ a_{1c} & a_{2c} & \dots & a_{rc}\end{pmatrix} \]
Example:
\[ \begin{pmatrix}1&2\\3&8\\2&9\end{pmatrix}^{\!\top} = \begin{pmatrix}1&3&2\\2&8&9\end{pmatrix} \]
Example:
\[ 7\begin{pmatrix}1&2\\3&8\\2&9\end{pmatrix} = \begin{pmatrix}7&14\\21&56\\14&63\end{pmatrix} \]
\[ A + B = [a_1 + b_1 : a_2 + b_2 : \cdots : a_c + b_c] \]
\[ = \begin{pmatrix}a_{11}&a_{12}&\dots&a_{1c}\\a_{21}&a_{22}&\dots&a_{2c}\\\vdots&\vdots&&\vdots\\a_{r1}&a_{r2}&\dots&a_{rc}\end{pmatrix} + \begin{pmatrix}b_{11}&b_{12}&\dots&b_{1c}\\b_{21}&b_{22}&\dots&b_{2c}\\\vdots&\vdots&&\vdots\\b_{r1}&b_{r2}&\dots&b_{rc}\end{pmatrix} = \begin{pmatrix}a_{11}+b_{11}&\dots&a_{1c}+b_{1c}\\a_{21}+b_{21}&\dots&a_{2c}+b_{2c}\\\vdots&&\vdots\\a_{r1}+b_{r1}&\dots&a_{rc}+b_{rc}\end{pmatrix} = B + A \]
• Note: Only matrices with the same dimensions can be added.
Example:
\[ \begin{pmatrix}1&2\\3&8\\2&9\end{pmatrix} + \begin{pmatrix}5&4\\8&2\\3&7\end{pmatrix} = \begin{pmatrix}6&6\\11&10\\5&16\end{pmatrix} \]
• Example:
\[ \begin{pmatrix}1&2\\3&8\\2&9\end{pmatrix}\begin{pmatrix}5\\8\end{pmatrix} = \begin{pmatrix}1\cdot5+2\cdot8\\3\cdot5+8\cdot8\\2\cdot5+9\cdot8\end{pmatrix} = \begin{pmatrix}21\\79\\82\end{pmatrix} \]
Example:
\[ \begin{pmatrix}1&2\\3&8\\2&9\end{pmatrix}\begin{pmatrix}5&4\\8&2\end{pmatrix} = \begin{pmatrix}1\cdot5+2\cdot8 & 1\cdot4+2\cdot2\\ 3\cdot5+8\cdot8 & 3\cdot4+8\cdot2\\ 2\cdot5+9\cdot8 & 2\cdot4+9\cdot2\end{pmatrix} = \begin{pmatrix}21&8\\79&28\\82&26\end{pmatrix} \]
Special matrices:
• An n × n matrix is said to be a square matrix.
• A matrix with 0 in all entries is the 0-matrix and is often written
simply as 0 (or as 0_{r×c} to emphasize the dimension).
• A matrix consisting of 1s in all entries is often written J (or as J_{r×c}
to emphasize the dimension).
• A square matrix with 0 in all off-diagonal entries and elements
d_1, d_2, ..., d_n on the diagonal is said to be a diagonal matrix and
is often written diag{d_1, d_2, ..., d_n}.
• A diagonal matrix with 1s on the diagonal is called the identity matrix
and is denoted I (or I_{n×n} to emphasize the dimension).
• A matrix A is symmetric if A = A^⊤.
\[ (AB)^\top = B^\top A^\top \qquad A(B + C) = AB + AC \qquad AB = AC \not\Rightarrow B = C \]
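These matrix rules are easy to check numerically. The following sketch is not part of the original course material (which used SAS); it uses numpy, with the 3 × 2 matrix from the examples above and freely chosen 2 × 2 matrices:

```python
import numpy as np

A = np.array([[1, 2], [3, 8], [2, 9]])   # the 3x2 matrix used in the examples
B = np.array([[5, 4], [8, 2]])           # a 2x2 matrix
C = np.array([[1, 0], [1, 1]])           # another 2x2 matrix

# (AB)^T = B^T A^T
assert np.array_equal((A @ B).T, B.T @ A.T)
# A(B + C) = AB + AC
assert np.array_equal(A @ (B + C), A @ B + A @ C)

# AB = AC does NOT imply B = C: multiply by a singular matrix S
S = np.array([[1, 1], [1, 1]])
B2 = np.array([[1, 0], [0, 1]])
C2 = np.array([[0, 1], [1, 0]])
assert np.array_equal(S @ B2, S @ C2) and not np.array_equal(B2, C2)
```

The last pair shows why one cannot "cancel" a matrix from both sides of an equation unless it is invertible.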
Example 2. Finding the inverse of a diagonal matrix is easy: Let
\[ A = \begin{pmatrix}a_1&0&\dots&0\\0&a_2&&0\\\vdots&&\ddots&\\0&0&\dots&a_n\end{pmatrix} \]
Then, provided all a_i ≠ 0,
\[ A^{-1} = \begin{pmatrix}1/a_1&0&\dots&0\\0&1/a_2&&0\\\vdots&&\ddots&\\0&0&\dots&1/a_n\end{pmatrix} \]
Linear Combinations
n–dimensional Spaces
Linear Subspaces
More precisely, L consists of all vectors of the form
a 1 v1 + a 2 v2 + · · · + a c vc
Throw-out technique: If one vector, say a_c, can be written as a
linear combination of the other vectors, then it can be thrown away
without changing the structure of the space, i.e.
3. The vectors a_1, a_2 are linearly independent and so are the sets
a_1, a_3 and a_2, a_3.
\[ \hat{y} = a(a^\top a)^{-1}a^\top y \]
Then ŷ = P y where
\[ P = A(A^\top A)^{-1}A^\top \]
1. P y is in span(A).
Example 8. Consider the 3 × 2 matrix A = [a_1 : a_2], where
\[ a_1 = \begin{pmatrix}1\\3\\2\end{pmatrix} \quad\text{and}\quad a_2 = \begin{pmatrix}2\\8\\9\end{pmatrix} \]
\[ A^\top A = \begin{pmatrix}1&3&2\\2&8&9\end{pmatrix}\begin{pmatrix}1&2\\3&8\\2&9\end{pmatrix} = \begin{pmatrix}14&44\\44&149\end{pmatrix} \]
Hence
\[ (A^\top A)^{-1} = \frac{1}{150}\begin{pmatrix}149&-44\\-44&14\end{pmatrix} \]
\[ (A^\top A)^{-1}A^\top = \frac{1}{150}\begin{pmatrix}149&-44\\-44&14\end{pmatrix}\begin{pmatrix}1&3&2\\2&8&9\end{pmatrix} = \frac{1}{150}\begin{pmatrix}61&95&-98\\-16&-20&38\end{pmatrix} \]
Finally we find
\[ P = A(A^\top A)^{-1}A^\top = \begin{pmatrix}1&2\\3&8\\2&9\end{pmatrix}\frac{1}{150}\begin{pmatrix}61&95&-98\\-16&-20&38\end{pmatrix} = \frac{1}{150}\begin{pmatrix}29&55&-22\\55&125&10\\-22&10&146\end{pmatrix} \]
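The hand computation of Example 8 can be checked numerically. This numpy sketch is not part of the original course material; it verifies the projection matrix and its characteristic properties:

```python
import numpy as np

A = np.array([[1, 2], [3, 8], [2, 9]])

# Projection matrix onto the column space of A
P = A @ np.linalg.inv(A.T @ A) @ A.T

# Matches the hand computation: 150*P has the integer entries derived above
assert np.allclose(150 * P, [[29, 55, -22], [55, 125, 10], [-22, 10, 146]])
# P is symmetric and idempotent, and leaves the columns of A unchanged
assert np.allclose(P, P.T)
assert np.allclose(P @ P, P)
assert np.allclose(P @ A, A)
```

The idempotence P P = P is exactly what makes P a projection: projecting a second time changes nothing.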
Exercise 2. Let
\[ A = \begin{pmatrix}1&2\\3&4\\5&6\end{pmatrix}. \]
1. Is A symmetric?
2. Is A^⊤A symmetric?
3. Is AA^⊤ symmetric?
Exercise 3. Let
\[ A = \begin{pmatrix}1&2\\3&4\end{pmatrix} \quad\text{and}\quad B = \begin{pmatrix}1&0\\1&1\end{pmatrix}. \]
Exercise 5. Let
\[ A = \begin{pmatrix}a&b\\c&d\end{pmatrix} \quad\text{and}\quad B = \frac{1}{ad-bc}\begin{pmatrix}d&-b\\-c&a\end{pmatrix} \]
\[ x_1 + 2x_2 = 3 \]
\[ 2x_1 + 3x_2 = 4 \]
can be written as
\[ \begin{pmatrix}1&2\\2&3\end{pmatrix}\begin{pmatrix}x_1\\x_2\end{pmatrix} = \begin{pmatrix}3\\4\end{pmatrix}, \]
i.e. as Ax = b. Find A^{-1} and use this for solving the system of
equations as follows:
\[ x = Ix = A^{-1}Ax = A^{-1}b. \]
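The recipe x = A^{-1}b above can be carried out numerically. This is an illustrative numpy sketch, not part of the original exercise material:

```python
import numpy as np

A = np.array([[1.0, 2.0], [2.0, 3.0]])
b = np.array([3.0, 4.0])

# x = A^{-1} b, exactly as in the exercise
A_inv = np.linalg.inv(A)
x = A_inv @ b
print(x)  # solution x1 = -1, x2 = 2

# In numerical practice np.linalg.solve is preferred over forming the inverse
assert np.allclose(np.linalg.solve(A, b), x)
```

Checking by hand: det A = 1·3 − 2·2 = −1, so A^{-1} = [[−3, 2], [2, −1]], and x = (−1, 2) indeed satisfies both equations.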
Exercise 8. Let
\[ A = \begin{pmatrix}1&0\\1&0\\0&1\\0&1\end{pmatrix}. \]
4 Linear normal models
Linear normal models serve as a natural starting point for the presentation of Mixed Models
theory. Most researchers within animal science have at least a working knowledge of linear normal
models.
These slides served the purpose of giving an overview of the different concepts, and of linking the
concepts with the underlying statistical theory. Finally, the standard terminology used within
SAS was presented from a theoretical point of view.
Link to the full screen presentation1 .
1 http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/LNM.f.pdf
Introduction
The SAS procedure PROC GLM is designed to deal with the class of
linear normal models,
\[ Y = X\beta + \varepsilon \]
Example 1. One-way analysis of variance.
The model
\[ Y_{kl} = \alpha_k + \varepsilon_{kl} \]
where ε_{kl} ∼ N(0, σ²) for k = 1, 2 and l = 1, 2, 3 can be written in
matrix form as
\[ \begin{pmatrix}Y_{11}\\Y_{12}\\Y_{13}\\Y_{21}\\Y_{22}\\Y_{23}\end{pmatrix} = \begin{pmatrix}1&0\\1&0\\1&0\\0&1\\0&1\\0&1\end{pmatrix}\begin{pmatrix}\alpha_1\\\alpha_2\end{pmatrix} + \begin{pmatrix}\varepsilon_{11}\\\varepsilon_{12}\\\varepsilon_{13}\\\varepsilon_{21}\\\varepsilon_{22}\\\varepsilon_{23}\end{pmatrix} \]
that is, Y = Xβ + ε with mean µ = Xβ.
There are good reasons for dealing with LNMs in general instead of
treating regression analysis, analysis of variance etc. separately:
\[ y = X\beta + \varepsilon \]
Example 2. Simple linear regression:
\[ Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i \]
that is, Y = Xβ + ε with mean µ = Xβ.
\[ Y_i = \mu_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \sigma^2). \]
Hence each Yi is allowed to have its own mean value, but the
variance σ 2 is the same for all i = 1, . . . , n.
As it has been illustrated, any LNM can be cast in matrix form as
\[ Y = X\beta + \varepsilon \]
where
Y is an n × 1 vector of observations,
X is an n × p design matrix,
β is a p × 1 vector of unknown parameters, and
ε is an n × 1 vector of random errors.
The matrix X is called the design matrix (or model matrix) because
it contains information about covariates, i.e. about the design of the
study.
\[ Y_i = \beta_0 + \beta_1 x_i + \beta_2 x_i^2 + \varepsilon_i \]
with mean µ = Xβ.
• Most frequently the interest is in the mean vector.
\[ E(Z) = \begin{pmatrix}E(Z_1)\\E(Z_2)\\\vdots\\E(Z_n)\end{pmatrix} = \begin{pmatrix}\mu_1\\\mu_2\\\vdots\\\mu_n\end{pmatrix} = \mu \]
The covariance matrix Cov(Z) of
\[ Z = (Z_1, \dots, Z_n)^\top \]
is the n × n matrix whose element in the ith row and jth column is
the covariance between Z_i and Z_j.
Example 5. The error term ε = (ε_1, ..., ε_n)^⊤ from a linear normal
model has a very simple covariance matrix:
\[ \mathrm{Cov}(\varepsilon) = \sigma^2\begin{pmatrix}1&0&\dots&0\\0&1&\dots&0\\\vdots&\vdots&\ddots&\vdots\\0&0&\dots&1\end{pmatrix} = \sigma^2 I_n \]
Result 1.
The Multivariate Normal Distribution
\[ Z \sim N_n(\mu, \Sigma) \]
\[ f(z) = (2\pi)^{-n/2}\det(\Sigma)^{-1/2}\exp\left\{-\tfrac{1}{2}(z - \mu)^\top\Sigma^{-1}(z - \mu)\right\} \]
Hence we have
\[ \varepsilon \sim N_n(0, \sigma^2 I) \]
Hence for the linear normal model Y = Xβ + ε we find that
\[ E(Y) = \mu = E(X\beta + \varepsilon) = X\beta + E(\varepsilon) = X\beta \]
\[ \mathrm{Cov}(Y) = \mathrm{Cov}(X\beta + \varepsilon) = \mathrm{Cov}(\varepsilon) = \sigma^2 I \]
Let
\[ X = \begin{pmatrix}1&0\\1&0\\1&0\\0&1\\0&1\\0&1\end{pmatrix}, \quad X_2 = \begin{pmatrix}1&1\\1&1\\1&1\\1&0\\1&0\\1&0\end{pmatrix}, \quad X_3 = \begin{pmatrix}1&1&0\\1&1&0\\1&1&0\\1&0&1\\1&0&1\\1&0&1\end{pmatrix} \tag{3} \]
But that is also the case for vectors of the form X_2β_2 and X_3β_3.
From this we conclude that with respect to the mean vector the
matrices X, X_2 and X_3 are "all the same":
\[ \mu = X\beta = X_2\beta_2 = X_3\beta_3. \]
Consider the mean vector µ = (2, 2, 2, 3, 3, 3)^⊤. The formulation as
µ = X_3β_3, where β_3 = (δ, ρ_1, ρ_2)^⊤, is different from the two others in
an important way:
\[ y = X\beta + \varepsilon \quad\text{where}\quad \mu = X\beta. \]
The columns of X define a subspace of R^n which we denote by L,
i.e.
L = span(X).
Hence by saying that µ = Xβ, all one really says is that µ belongs
to L.
Let x̄. = \frac{1}{n}\sum_i x_i denote the average of the x_i's. Define new variables
z_i = x_i − x̄. and consider the regression model
\[ Y_i = \alpha_0 + \alpha_1 z_i + \varepsilon_i. \]
\[ \tilde{X} = \begin{pmatrix}1&z_1\\1&z_2\\1&z_3\\1&z_4\\1&z_5\\1&z_6\end{pmatrix} = \begin{pmatrix}1&x_1-\bar{x}.\\1&x_2-\bar{x}.\\1&x_3-\bar{x}.\\1&x_4-\bar{x}.\\1&x_5-\bar{x}.\\1&x_6-\bar{x}.\end{pmatrix}, \qquad \tilde{\beta} = \begin{pmatrix}\alpha_0\\\alpha_1\end{pmatrix} \]
Representations of Models in SAS
The illustration is with PROC MIXED but applies to PROC GLM too.
PROC MIXED;
CLASS TREAT;
MODEL Y = TREAT / SOLUTION;
RUN;
This model is highly over-parametrized. SAS handles this problem in
the way indicated above: a new design matrix giving the same model
is created, namely
\[ \mu = \begin{pmatrix}1&1&1&1\\1&1&0&0\\1&0&1&0\\1&0&0&0\end{pmatrix}\begin{pmatrix}\delta\\\alpha_1\\\beta_1\\\gamma_{11}\end{pmatrix} = X_2\beta_2 \]
Example 10. (Continuation of Example 2)
\[ D(\beta) = \sum_{i=1}^{n}\left(y_i - (\beta_0 + \beta_1 x_i)\right)^2 \]
This gives
\[ \hat{\beta}_1 = \frac{\sum_i(y_i - \bar{y}.)(x_i - \bar{x}.)}{\sum_i(x_i - \bar{x}.)^2}, \qquad \hat{\beta}_0 = \bar{y}. - \hat{\beta}_1\bar{x}. \]
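The closed-form formulas above agree with the general matrix solution β̂ = (X^⊤X)^{-1}X^⊤y. A numpy sketch (illustrative data, not the course data set):

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.arange(1.0, 7.0)                    # x = 1, ..., 6 as in Example 2
y = -1.3 + 0.5 * x + rng.normal(0, 1, 6)   # simulated responses

# Closed-form least squares estimates from the formulas above
b1 = np.sum((y - y.mean()) * (x - x.mean())) / np.sum((x - x.mean()) ** 2)
b0 = y.mean() - b1 * x.mean()

# Same result via the matrix formula beta-hat = (X'X)^{-1} X'y
X = np.column_stack([np.ones_like(x), x])
beta = np.linalg.inv(X.T @ X) @ X.T @ y
assert np.allclose(beta, [b0, b1])
```

Both routes minimize the same sum of squares D(β), so they must coincide.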
\[ \hat{\alpha}_k = \frac{1}{3}\sum_{l=1}^{3} y_{kl} = \bar{y}_k \]
The vector µ̂ is in this case (ȳ_1, ȳ_1, ȳ_1, ȳ_2, ȳ_2, ȳ_2)^⊤.
\[ \hat{\delta} = 0, \quad \hat{\alpha}_1 = \bar{y}_1, \quad \hat{\alpha}_2 = \bar{y}_2 \]
and
both result in the same vector µ̂ = (ȳ_1, ȳ_1, ȳ_1, ȳ_2, ȳ_2, ȳ_2)^⊤.
Estimation in matrix form
\[ D(\beta) = (y - \mu)^\top(y - \mu) \]
β̂ = (X >X)−1X >y.
Example 12. (Continuation of Example 7).
We shall now assume that the LNM is such that the columns of X
are linearly independent such that the least squares estimate
β̂ = (X >X)−1X >y.
of β is unique.
If the elements of A are denoted a_{ij} we see that the ith component of β̂ is
\[ \hat{\beta}_i = \sum_{j=1}^{n} a_{ij}y_j \]
Equation (4) says that the expected value of the least squares
estimator β̂ is simply the true but unknown value β.
Cov(β̂(Y )) = A Cov(Y )A> = σ 2AIA> = σ 2AA>
= σ 2(X >X)−1X >[(X >X)−1X >]>
= σ 2(X >X)−1X >X(X >X)−1
= σ 2(X >X)−1 (5)
Equation (5) says that the covariance of the least squares estimator
β̂ is proportional to the residual variance σ 2. Moreover, the matrix
(X >X)−1 does not depend on the data y but only on the design
matrix X, i.e. on how the study at hand was conducted.
Standard
Parameter Estimate Error t Value Pr > |t|
Intercept -1.286578758 0.83603651 -1.54 0.1987
x 0.483593802 0.21467436 2.25 0.0874
The first two diagonal elements of (X^⊤X)^{-1} times the variance
estimate σ̂² (i.e. the Mean Square Error) give variance estimates of
the regression parameters.
The square roots of these estimates are the standard errors reported.
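The recipe above (diagonal of σ̂²(X^⊤X)^{-1}, then square roots) is easy to reproduce. This numpy sketch uses made-up data; the SAS output shown came from the course data, which is not reproduced here:

```python
import numpy as np

# Illustrative data (assumed, not the course data set)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([-0.8, 0.1, 0.2, 0.5, 1.3, 1.5])

X = np.column_stack([np.ones_like(x), x])
n, p = X.shape

beta = np.linalg.solve(X.T @ X, X.T @ y)       # least squares estimates
resid = y - X @ beta
mse = resid @ resid / (n - p)                   # sigma-hat squared (MSE)
cov_beta = mse * np.linalg.inv(X.T @ X)         # estimated Cov(beta-hat)
se = np.sqrt(np.diag(cov_beta))                 # reported standard errors
print(beta, se)
```

The off-diagonal element of `cov_beta` is the (generally nonzero) covariance between intercept and slope estimates, which centering the x values removes.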
Standard
Parameter Estimate Error t Value Pr > |t|
Intercept 0.4059995498 0.36662626 1.11 0.3302
z 0.4835938022 0.21467436 2.25 0.0874
In this case we see that centering the x values around their average
(3.5) gives parameter estimates which are uncorrelated. Moreover,
the estimate of the slope (and the associated standard error) is the
same as before.
Example 14. (Continuation of Example 2)
With
\[ X = \begin{pmatrix}1&x_1\\1&x_2\\1&x_3\\1&x_4\\1&x_5\\1&x_6\end{pmatrix}, \qquad \beta = \begin{pmatrix}\beta_0\\\beta_1\end{pmatrix} \]
\[ X^\top X = \begin{pmatrix}n & \sum_i x_i\\ \sum_i x_i & \sum_i x_i^2\end{pmatrix} \]
Recall that
\[ A = \begin{pmatrix}a&b\\c&d\end{pmatrix} \quad\text{implies that}\quad A^{-1} = \frac{1}{ad - bc}\begin{pmatrix}d&-b\\-c&a\end{pmatrix} \]
so
\[ (X^\top X)^{-1} = \frac{1}{n\sum_i x_i^2 - (\sum_i x_i)^2}\begin{pmatrix}\sum_i x_i^2 & -\sum_i x_i\\ -\sum_i x_i & n\end{pmatrix} \]
Letting K = n\sum_i x_i^2 - (\sum_i x_i)^2, the variance of the estimator β̂_0 for the
intercept is
\[ \mathrm{Var}(\hat{\beta}_0) = \frac{\sigma^2\sum_i x_i^2}{K} \]
\[ \mathrm{Cov}(\hat{\beta}_0, \hat{\beta}_1) = -\frac{\sigma^2}{K}\sum_i x_i \]
The estimator β̂ has a p–dimensional multivariate normal
distribution (in short MVN), with mean vector β and covariance
matrix σ 2(X >X)−1.
This is written
β̂ ∼ Np(β, σ 2(X >X)−1).
This means that any linear combination λ>β̂ has a univariate normal
distribution
λ>β̂ ∼ N (λ>β, σ 2λ>(X >X)−1λ) (6)
and that is a very important result for practical statistics.
Hence there are some constraints on what can actually be said about
β.
\[ Y_{ij} = \delta + \alpha_i + \beta_j + \varepsilon_{ij} \]
where
\[ \mu = \begin{pmatrix}1&1&0&1&0\\1&1&0&0&1\\1&0&1&1&0\\1&0&1&0&1\end{pmatrix}\begin{pmatrix}\delta\\\alpha_1\\\alpha_2\\\beta_1\\\beta_2\end{pmatrix} = X\beta \]
It is clear that this model is grossly over-parametrized (why?)
\[ \alpha_1 - \alpha_2, \qquad \delta + \alpha_1, \qquad \delta + \alpha_1 + \tfrac{1}{2}(\beta_1 + \beta_2) \]
Therefore the only thing one can truly say something about is linear
combinations of µ, i.e. linear combinations of the form
\[ a^\top\mu \]
Therefore, we can say something about the contrast λ>β only if one
can find an n–vector a such that
a> X = λ >
In other words,
From the general result (6) we know the distribution of the contrast λ^⊤β̂,
and hence testing for the contrast being zero is straightforward.
\[ \delta + \alpha_1 + \tfrac{1}{2}(\beta_1 + \beta_2) = (1, 1, 0, \tfrac{1}{2}, \tfrac{1}{2})\beta \]
is indeed estimable.
That is, we seek a vector a = (a_1, a_2, a_3, a_4)^⊤ such that
\[ a^\top X = (1, 1, 0, \tfrac{1}{2}, \tfrac{1}{2}). \]
\[ a_1 + a_2 + a_3 + a_4 = 1 \]
\[ a_1 + a_2 = 1 \]
\[ a_3 + a_4 = 0 \]
\[ a_1 + a_3 = \tfrac{1}{2} \]
\[ a_2 + a_4 = \tfrac{1}{2} \]
It is not hard to spot that a solution to these equations is
a_1 = a_2 = 1/2 and a_3 = a_4 = 0.
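Estimability of λ^⊤β amounts to asking whether a^⊤X = λ^⊤ has a solution, which can be checked numerically. A numpy sketch (not part of the original course material, which used the SAS E-option shown below):

```python
import numpy as np

# Design matrix of the two-way model; columns: delta, a1, a2, b1, b2
X = np.array([
    [1, 1, 0, 1, 0],
    [1, 1, 0, 0, 1],
    [1, 0, 1, 1, 0],
    [1, 0, 1, 0, 1],
], dtype=float)

def estimable(lam, X, tol=1e-10):
    """lam is estimable iff a' X = lam' has an exact solution a."""
    a, *_ = np.linalg.lstsq(X.T, lam, rcond=None)
    return bool(np.allclose(X.T @ a, lam, atol=tol)), a

ok, a = estimable(np.array([1, 1, 0, 0.5, 0.5]), X)
print(ok, a)   # estimable; the solution found is a = (1/2, 1/2, 0, 0)
ok2, _ = estimable(np.array([1, 0.5, 0.5, 0, 0]), X)
print(ok2)     # delta + (a1 + a2)/2 is not estimable
```

`lstsq` finds the best least-squares a; the contrast is estimable exactly when the residual of that fit is zero.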
Estimability in SAS
The output caused by the E–option in the MODEL statement is
General Form of Estimable Functions
Effect Coefficients
1 Intercept L1
2 i 1 L2
3 i 2 L1-L2
4 j 1 L4
5 j 2 L1-L4
Recall that β = (δ, α1, α2, β1, β2). The numbers 1,2,3,4,5 identify
the entry of the λ–vector, λ = (λ1, λ2, . . . , λ5), and the Ls specify
the constraints to be satisfied by the λis.
It reads as follows: λ1 can be set to any value L1, and λ2 can be set
to any value L2. But then λ3 is constrained to be equal to L1 − L2.
Likewise, λ4 can be set to any value L4, but then λ5 is constrained
to be equal to L1 − L4.
λ = (1, 1, 0, 1, 0):  λ^⊤β = δ + α_1 + β_1
λ = (1, 1, 0, 1/2, 1/2):  λ^⊤β = δ + α_1 + ½(β_1 + β_2)
λ = (0, 1, −1, 0, 0):  λ^⊤β = α_1 − α_2
But we can also see that the contrast δ + ½(α_1 + α_2) is not
estimable: Taking λ_1 = 1 and λ_2 = λ_3 = ½ would give the desired
result, but setting λ_4 = 0 implies that λ_5 = 1, so it is not possible.
Least Squares Means
Coefficients for i Least Square Means i Level
Effect 1 2
1 Intercept 1 1
2 i 1 1 0
3 i 2 0 1
4 j 1 0.5 0.5
5 j 2 0.5 0.5
\[ \lambda^\top\beta = \delta + \alpha_1 + \tfrac{1}{2}(\beta_1 + \beta_2). \]
From this we see that the LSMEANS for i = 1 is δ + α_1 plus the
"average effect" of the factor j, i.e. ½(β_1 + β_2).
Hypothesis Testing
\[ Y_{ij} = \delta + \alpha_i + \beta_j + \varepsilon_{ij}, \qquad i = 1, 2, \; j = 1, 2 \]
The mean µ_{ij} of Y_{ij} is δ + α_i + β_j and the mean vector has the form
\[ \mu = \begin{pmatrix}\mu_{11}\\\mu_{12}\\\mu_{21}\\\mu_{22}\end{pmatrix} = \begin{pmatrix}1&1&0&1&0\\1&1&0&0&1\\1&0&1&1&0\\1&0&1&0&1\end{pmatrix}\begin{pmatrix}\delta\\\alpha_1\\\alpha_2\\\beta_1\\\beta_2\end{pmatrix} = X\beta \]
\[ Y_{ij} = \delta + \alpha_i + \varepsilon_{ij} \]
Under the reduced model, the mean µ_{ij} of Y_{ij} is δ + α_i and the mean
vector has the form
\[ \mu = \begin{pmatrix}\mu_{11}\\\mu_{12}\\\mu_{21}\\\mu_{22}\end{pmatrix} = \begin{pmatrix}1&1&0\\1&1&0\\1&0&1\\1&0&1\end{pmatrix}\begin{pmatrix}\delta\\\alpha_1\\\alpha_2\end{pmatrix} = X_0\beta_0 \]
Note that any vector µ that can be written as µ = X0β0 can also be
written as µ = Xβ – simply by setting the last two elements of β to
zero.
The answer lies in the "distance" between the observations and the
expected values.
\[ e = y - \hat{\mu} = y - Py = (I - P)y \]
The residuals e reflect random deviations from the mean under the large model (in
which we "believe").
If the reduced model is true then e0 = (I − P0)y is also the vector of
residuals, and the length of the vector should also be small.
On the other hand if the reduced model is not true, then e0 is not
just residuals, because it contains some of the variation due to the
factor βj .
Result 3.
\[ E\left(\frac{D^\top D}{d - d_0}\right) = \frac{1}{d - d_0}E(D^\top D) = \sigma^2 + k \]
or equivalently that
Problem 1: σ² is not known
\[ \hat{\sigma}^2 = e^\top e/(n - d), \]
Therefore, if the reduced model is true (and hence k = 0), the ratio
\[ F = \frac{D^\top D/(d - d_0)}{e^\top e/(n - d)} \approx 1. \]
If the reduced model is not true, then the ratio F would tend to be
larger than 1. The problem remaining is to define what is meant by
"large". One can show the following:
If the reduced model is not true, then F has an expected value larger
than 1.
Result 5.
where RSS and RSS0 denote the residual (or error) sums of squares
under the large and the reduced model respectively.
Tests in LNMs in short form
• Under M, MY = Mµ + Me = µ + Me.
• If M_0 is true, then
• Hence
\[ \frac{\|(M - M_0)Y\|^2}{d - d_0} = \frac{Y^\top(M - M_0)Y}{r(M - M_0)} \]
is a measure of how close M_0Y is to MY in relation to the difference in
dimensionality of the models.
• Assuming only M,
\[ \frac{1}{\sigma^2}Y^\top(M - M_0)Y \sim \chi^2\left(d - d_0,\; \beta^\top X^\top(M - M_0)X\beta\right), \]
i.e. a non-central χ² distribution.
• If we use MSE = \frac{Y^\top(I - M)Y}{n - d} = σ̃² as an estimate for σ², then
under M_0,
\[ F = \frac{Y^\top(M - M_0)Y/(d - d_0)}{Y^\top(I - M)Y/(n - d)} \approx 1 \]
• Hence large values of F cause doubt about M_0.
In the notation from before, the model Y_{ij} = δ + α_i + β_j + γ_{ij} + ε_{ij} has mean
\[ \mu = \begin{pmatrix}\mu_{11}\\\mu_{12}\\\mu_{21}\\\mu_{22}\end{pmatrix} = \begin{pmatrix}1&1&0&1&0&1&0&0&0\\1&1&0&0&1&0&1&0&0\\1&0&1&1&0&0&0&1&0\\1&0&1&0&1&0&0&0&1\end{pmatrix}\begin{pmatrix}\delta\\\alpha_1\\\alpha_2\\\beta_1\\\beta_2\\\gamma_{11}\\\gamma_{12}\\\gamma_{21}\\\gamma_{22}\end{pmatrix} = X\beta \]
\[ \mu = \begin{pmatrix}1&1&1&1\\1&1&0&0\\1&0&1&0\\1&0&0&0\end{pmatrix}\begin{pmatrix}\delta\\\alpha_1\\\beta_1\\\gamma_{11}\end{pmatrix} = X_2\beta_2 \tag{8} \]
5 Some Basic Statistical Concepts
This lecture presented and refreshed basic statistical concepts, such as the central limit theorem,
principles of estimation, the likelihood principle and tests of hypotheses.
Link to the full screen presentation1
1 http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/StatTheory.f.pdf
y = (y1, . . . , yn)
We shall in general use the term experiment even though the setting
may not be that of a controlled experiment.
Some Characteristics:
Y = (Y1, . . . , Yn)
of the experiment.
Here Yi could be for example
• the set of all real numbers,
• the set of positive real numbers,
• the set {diseased, not diseased}, or
• the set {low, medium, high}.
The link between the observed value yi and the set of possible values
Yi is established through the notion of a random variable Yi.
For the ith animal in the population the state of disease is denoted by
Yi and Yi can therefore take one of the values {diseased, not diseased}
(for brevity written simply as {1, 0}).
October 18, 2001 Mixed Models Course 4
In statistical terms, one speaks of a parametric statistical model:
[Figure: histogram of z.mean (relative frequency against z.mean) with an overlaid normal density]
\[ f(y; \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left(-\frac{1}{2\sigma^2}(y - \mu)^2\right) \]
Result 1. The Central Limit Theorem says that the average of a large number of independent,
identically distributed random variables is approximately normally distributed.
[Figure: histograms of the simulated averages for increasing sample sizes, and a normal QQ-plot of the averages]
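The behaviour shown in the figure can be reproduced by simulation. A numpy sketch (the original course used a different, unspecified simulation):

```python
import numpy as np

rng = np.random.default_rng(0)

# Averages of n uniform(0, 1) variables, repeated many times
n, reps = 30, 10_000
z_mean = rng.uniform(0, 1, size=(reps, n)).mean(axis=1)

# CLT: the averages are approximately N(0.5, 1/(12 n))
print(z_mean.mean(), z_mean.std())
assert abs(z_mean.mean() - 0.5) < 0.01
assert abs(z_mean.std() - np.sqrt(1 / (12 * n))) < 0.01
```

A histogram of `z_mean` would look like the bell-shaped plots in the figure, even though the underlying uniform distribution is flat.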
Example 5. (Continuation of Example 2) Consider the experiment
of tossing a "pin" n times, giving data y = (y_1, ..., y_n). Hence the
possible outcomes are Y_i = {up, down}, which we write {1, 0}.
It is assumed that
\[ P(Y_i = 1) = \theta \]
for all i, such that the probability of observing "pin up" (!) is the
same every time. If we observe that the pin points upwards altogether
y_+ = Σ_i y_i times, then it takes only very little creativity to
suggest the relative frequency
\[ y_+/n \]
In the examples above it is easy to suggest ways of estimating the
unknown parameters. These can be described as:
Method of Moments
\[ Z_1 = \frac{1}{n}\sum_{i=1}^{n} Y_i \sim N(\theta, 1/n) \]
The estimate z_1 = \frac{1}{n}\sum_{i=1}^{n} y_i can then be regarded as a realization
of the random variable Z_1 which has mean E(Z_1) = θ.
We say that
The method of moments is to consider θ̂(y) as a good estimate of θ
because the corresponding random variable Z_1(Y) has θ as its
expectation:
\[ E(Z_1(Y)) = \theta \tag{1} \]
However, there are many estimators with the property (1). Two
additional ones are
• the average Z_2(Y) = (Y_1 + Y_2)/2 of the first two random variables,
and
• the single observation Z_3(Y) = Y_1.
\[ \mathrm{Var}(Z_1(Y)) = 1/n, \qquad \mathrm{Var}(Z_2(Y)) = 1/2, \qquad \mathrm{Var}(Z_3(Y)) = 1 \]
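These three variances can be confirmed by simulation. An illustrative numpy sketch, assuming (as the variances above imply) that Y_i ∼ N(θ, 1):

```python
import numpy as np

rng = np.random.default_rng(0)
theta, n, reps = 0.7, 100, 20_000   # theta chosen arbitrarily for illustration

Y = rng.normal(theta, 1, size=(reps, n))   # reps independent experiments
Z1 = Y.mean(axis=1)          # average of all n observations
Z2 = Y[:, :2].mean(axis=1)   # average of the first two observations
Z3 = Y[:, 0]                 # the first observation alone

print(Z1.var(), Z2.var(), Z3.var())  # roughly 1/n, 1/2 and 1
```

All three estimators are unbiased, but their variances differ by a factor of up to n, which is why one prefers Z_1.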
Someone might suggest to estimate θ by Z4(Y ) = Z1(Y ) + 7.
Consistency of Estimators
• Unbiasedness,
• Consistency
Estimators, whatever kind they are, are functions of the random
variables Y1, . . . , Yn from which data y1, . . . , yn are realizations.
Hence estimators are random variables and as such they have a
distribution. This distribution is needed when drawing inference
about a parameter, e.g. when making a test or constructing a
confidence interval.
Suppose the observed data are y = {1, 1, 0, 1, 0, 1, 0, ..., 0, 0}.
\[ L(\theta) = \theta^{y_+}(1 - \theta)^{n - y_+}. \]
[Figure: likelihood function L(θ) plotted against the value of θ]
The Maximum Likelihood principle
For clarity one should write θ̂(y) for the estimate and θ̂(Y) for the
corresponding estimator, but this is too cumbersome to do. So,
except for special cases, we simply write θ̂ for both entities and then
derive from the context whether it is an estimate (a number) or an
estimator (the corresponding random variable).
[Figure: log-likelihood function l(θ) plotted against the value of θ]
Maximization of l(θ) is done by solving the score equation
\[ S(\theta) = l'(\theta) = 0. \]
We find that
\[ S(\theta) = l'(\theta) = \frac{y_+}{\theta} - \frac{n - y_+}{1 - \theta} = 0 \]
\[ \hat{\theta}(Y_+) = \frac{Y_+}{n}. \]
In Figure 3 is shown the likelihood function for (n = 5, y_+ = 2),
(n = 10, y_+ = 4), (n = 25, y_+ = 10) and (n = 50, y_+ = 20).
[Figure 3: the four likelihood functions L(θ), each plotted against θ on [0, 1]]
It is clear from those graphs that the more observations, the more
"peaked" the likelihood function is, and the higher its curvature at
its maximum.
That is, the value of L(θ̂) is more and more distinct from the value
of L(θ) for θ ≠ θ̂ when more and more observations are made.
The key to answering this question is the random variable θ̂(Y). Put
in a popular way, one has to investigate whether 0.5 is a "likely"
outcome of θ̂(Y). To answer that question, one needs to know
the distribution of θ̂(Y), and this distribution is in general very
complicated to find.
Therefore one frequently resorts to an approximate result, on which
so much resides in statistics:
\[ \hat{\theta} \sim N\left(\theta, -\frac{1}{l''(\hat{\theta})}\right) \]
Example 10. For the binomial experiment, it is not hard to see why
the MLE is asymptotically normal:
We can regard y_+ as a sum of independent random variables y_i where
y_i = 1 corresponds to pin up and y_i = 0 is "pin not up".
Hence the Central Limit Theorem gives that y_+ is approximately
normally distributed, and hence so is θ̂ = y_+/n.
For a single experiment we know that E(y_i) = θ and Var(y_i) =
θ(1 − θ). From this we find that
\[ E(\hat{\theta}) = \theta, \qquad \mathrm{Var}(\hat{\theta}) = \frac{\theta(1 - \theta)}{n} \]
so approximately,
\[ \hat{\theta} \sim N\left(\theta, \frac{\theta(1 - \theta)}{n}\right) \]
The expression for the variance is obtained as follows: Recall that the
likelihood and score functions give
\[ -l''(\theta) = \frac{x}{\theta^2} + \frac{n - x}{(1 - \theta)^2} \]
Hence, asymptotically,
\[ \hat{\theta} \sim N\left(\theta, \frac{\hat{\theta}(1 - \hat{\theta})}{n}\right) \]
With n = 25, x = 10 we get θ̂ = 0.4 and Var(θ̂) ≈ 0.0096. Hence,
an (approximate) 95% confidence interval for θ is
\[ \left(\hat{\theta} - 1.96\sqrt{\mathrm{Var}(\hat{\theta})}\; ; \; \hat{\theta} + 1.96\sqrt{\mathrm{Var}(\hat{\theta})}\right) = (0.4 - 0.19\; ; \; 0.4 + 0.19) = (0.21; 0.59) \]
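The arithmetic of this confidence interval is quickly reproduced. A small Python sketch (not part of the original notes):

```python
import math

n, x = 25, 10
theta_hat = x / n                            # MLE = 0.4
var_hat = theta_hat * (1 - theta_hat) / n    # 0.4 * 0.6 / 25 = 0.0096
half = 1.96 * math.sqrt(var_hat)             # about 0.19
print(theta_hat - half, theta_hat + half)    # roughly (0.21, 0.59)
```

Rounding the endpoints to two decimals gives exactly the interval (0.21; 0.59) stated above.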
\[ \hat{\eta} \sim N\left(\frac{\theta}{1 - \theta}, \frac{\hat{\theta}}{(1 - \hat{\theta})n}\right) = N\left(\frac{\theta}{1 - \theta}, 0.0133\right). \]
Tests of Hypotheses
In other words, it is tempting to consider the
likelihood ratio test statistic Q defined by
\[ Q = \frac{L(\theta_0)}{L(\hat{\theta})} \]
[Figure 4 sketch: the log-likelihood curve l(θ), the drop l(θ̂) − l(θ_0), the slope l'(θ), and the points θ̂ and θ_0]
Figure 4: Illustration of the likelihood ratio test, the score test and
the Wald test.
It can be shown that when n is large and the hypothesis is true, the
distribution of the so-called score test
\[ S = -l'(\theta_0)^2/l''(\theta_0) \]
Hence when n is large the likelihood ratio test and the score test are
equivalent.
A third test is the Wald test which compares the values of θ̂ and θ_0.
It can be shown that when n is large and the hypothesis is true, the
distribution of the Wald test statistic
\[ W = -(\hat{\theta} - \theta_0)^2\, l''(\hat{\theta}) \]
Hence when n is large the likelihood ratio test, the score test and
the Wald test are equivalent.
For later purposes we need the mean and the variance of the score
function.
\[ S(\theta) = l'(\theta; x) = (\log p(x; \theta))' = \frac{1}{p(x; \theta)}p'(x; \theta) \]
\[ S'(\theta) = l''(\theta; x) = -\frac{1}{p(x; \theta)^2}(p'(x; \theta))^2 + \frac{1}{p(x; \theta)}p''(x; \theta) \]
\[ \int p(x; \theta)\,dx = 1 \]
\[ \frac{d}{d\theta}\int p(x; \theta)\,dx = \int \frac{d}{d\theta}p(x; \theta)\,dx = \frac{d}{d\theta}1 = 0 \]
\[ E(S(\theta)) = \int \frac{p'(x; \theta)}{p(x; \theta)}\,p(x; \theta)\,dx = \int p'(x; \theta)\,dx = 0 \]
So the expected value of the score function is zero.
Interchanging the order of differentiation and integration as before
gives that \int p''(x; \theta)\,dx = 0. Hence
\[ E(S'(\theta)) = -\int \frac{1}{p(x; \theta)}(p'(x; \theta))^2\,dx = -\mathrm{Var}(S(\theta, X)). \]
\[ E(S(\theta)) = 0 \]
\[ I(\theta) = \mathrm{Var}(S(\theta)) = E(S(\theta)^2) = -E(S'(\theta)) \tag{3} \]
From (2) it is seen that the likelihood for all data is the product of
the likelihood for each observation, i.e.
\[ L(\theta; y) = p(y_1; \theta)\cdots p(y_n; \theta) = \prod_i p(y_i; \theta), \]
so the log-likelihood and the score function split into components:
\[ l(\theta) = \sum_i l(\theta; y_i) = \sum_i l_i(\theta) \]
\[ S(\theta) = l'(\theta; y) = \sum_i l'(\theta; y_i) = \sum_i S(\theta; y_i) = \sum_i S_i(\theta), \]
\[ S'(\theta) = \sum_i S_i'(\theta), \tag{4} \]
\[ E(S_i(\theta)) = 0 \]
\[ I(\theta) = \mathrm{Var}(S_i(\theta)) = E(S_i(\theta)^2) = -E(S_i'(\theta)) \]
and correspondingly for all observations
\[ E(S(\theta)) = 0, \qquad \mathrm{Var}(S(\theta)) = nI(\theta). \]
That is
\[ \frac{1}{\sqrt{n}}S(\theta_0) \approx -\frac{1}{n}S'(\theta_0)\,\sqrt{n}(\hat{\theta} - \theta_0) \approx I(\theta_0)\,\sqrt{n}(\hat{\theta} - \theta_0) \]
or
\[ \hat{\theta} \sim N(\theta_0, (nI(\theta_0))^{-1}), \]
as desired.
Because the observations are independent, the likelihood becomes
\[ \hat{\sigma}^2 = \frac{1}{n}(y - \hat{\mu})^\top(y - \hat{\mu}) \]
In practice, one never uses this variance estimate. Instead one uses
\[ \tilde{\sigma}^2 = \frac{1}{n - p}(y - \hat{\mu})^\top(y - \hat{\mu}) \]
\[ E(\hat{\sigma}^2) = \frac{n - p}{n}\sigma^2, \qquad E(\tilde{\sigma}^2) = \sigma^2 \]
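The bias of σ̂² and the unbiasedness of σ̃² can be seen by simulation. A numpy sketch (the model, σ² = 4 and the cubic design matrix are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, sigma, reps = 10, 3, 2.0, 20_000    # true sigma^2 = 4

X = np.column_stack([np.ones(n), np.arange(n), np.arange(n) ** 2])
H = X @ np.linalg.inv(X.T @ X) @ X.T      # hat matrix: mu-hat = H y

# Residuals depend only on the errors: y - mu-hat = (I - H) eps
eps = rng.normal(0, sigma, size=(reps, n))
rss = ((eps @ (np.eye(n) - H)) ** 2).sum(axis=1)

sigma2_ml = rss / n          # sigma-hat^2: ML estimate, biased downwards
sigma2_ub = rss / (n - p)    # sigma-tilde^2: unbiased estimate
print(sigma2_ml.mean(), sigma2_ub.mean())  # about (n-p)/n * 4 = 2.8, and 4.0
```

The averages over the repetitions match E(σ̂²) = (n − p)σ²/n and E(σ̃²) = σ² from the display above.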
6 An overview
The purpose of this lecture was to illustrate how the problems of research within the
biological sciences are related to the progress of statistical theory, both in general and in relation
to mixed models.
Starting out with an experiment reported by Darwin, the lecture discussed the state of the art
of experimental design and analysis in Darwin's time, proceeded with the progress in statistical
theory, very much related to animal breeding, and ended up with the general theory of mixed
models. Important researchers such as F. Galton, R.A. Fisher, S. Wright and C.R. Henderson were
presented.
The slides are in Danish. Link to the full screen presentation1
1 http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/oversigt.f.pdf
Outline
• Background of the methods
• Historical development
• Relation to our fields of research
February 7, 2001
Darwin's Maize
Darwin's Maize

Pot      Crossed   Self-fertilised
Pot I    23 4/8    17 3/8
         12        20 3/8
         21        20
Pot II   22        20
         19 1/8    18 3/8
         21 4/8    18 5/8
Pot III  22 1/8    18 5/8
         20 3/8    15 2/8
         18 2/8    16 4/8
         21 5/8    18
         23 2/8    16 2/8
Pot IV   21        18
         22 1/8    12 6/8
         23        15 4/8
         12        18
Darwin's Maize

Pot      Crossed   Self-fertilised
Pot I    23.50     17.38
         12.00     20.38
         21.00     20.00
Pot II   22.00     20.00
         19.13     18.38
         21.50     18.63
Pot III  22.13     18.63
         20.38     15.25
         18.25     16.50
         21.63     18.00
         23.25     16.25
Pot IV   21.00     18.00
         22.13     12.75
         23.00     15.50
         12.00     18.00
Darwins Majs
” As only a moderate number of crossed and self-fertilised plants
were measured, it was of great importance to learn, how far the
averages were trustworthy. I therefore asked Mr Galton, who has
much experience in statistical researches, to examine some of my
tables..... I may premise that if we took by chance a dozen score of
men belonging to different nations and measured them, it would I
presume, be very rash to form any judgment from such small
numbers on their average heights. But the case is somewhat
different with my crossed and self-fertilised plants, as they were of
exactly the same age, were subjected from first to last to the same
conditions, and were descended from the same parents”
Galton's Approach

            Original                 Sorted                   Diff.
            Crossed   Self-fert.     Crossed   Self-fert.
Pot I       23.50     17.38          23.50     20.38          3.125
            12.00     20.38          23.25     20.00          3.250
            21.00     20.00          23.00     20.00          3.000
Pot II      22.00     20.00          22.13     18.63          3.500
            19.13     18.38          22.13     18.63          3.500
            21.50     18.63          22.00     18.38          3.625
Pot III     22.13     18.63          21.63     18.00          3.625
            20.38     15.25          21.50     18.00          3.500
            18.25     16.50          21.00     18.00          3.000
            21.63     18.00          21.00     17.38          3.625
            23.25     16.25          20.38     16.50          3.875
Pot IV      21.00     18.00          19.13     16.25          2.875
            22.13     12.75          18.25     15.50          2.750
            23.00     15.50          12.00     15.25          -3.250
            12.00     18.00          12.00     12.75          -0.750
Galton's Approach
• Sorting
• Differences
K. Pearson’s Guru
Correct approach?

            Crossed   Self-fert.   Diff.
Pot I       23.50     17.38        6.12
            12.00     20.38        -8.38
            21.00     20.00        1.00
Pot II      22.00     20.00        2.00
            19.13     18.38        0.75
            21.50     18.63        2.88
Pot III     22.13     18.63        3.50
            20.38     15.25        5.13
            18.25     16.50        1.75
            21.63     18.00        3.63
            23.25     16.25        7.00
Pot IV      21.00     18.00        3.00
            22.13     12.75        9.38
            23.00     15.50        7.50
            12.00     18.00        -6.00
Correct approach?
• Differences
• Standard deviation + t-test
• Independence assumption
• Randomisation
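The "correct approach" above — pairwise differences within pots, their standard deviation and a paired t-test — can be sketched in Python. This is an illustrative translation of the analysis, not part of the original course material:

```python
from statistics import mean, stdev
from math import sqrt

# Darwin's maize data: final heights (inches) of 15 cross- and
# self-fertilised plants grown pairwise in the same pots.
crossed = [23.500, 12.000, 21.000, 22.000, 19.125, 21.500, 22.125,
           20.375, 18.250, 21.625, 23.250, 21.000, 22.125, 23.000, 12.000]
selfed  = [17.375, 20.375, 20.000, 20.000, 18.375, 18.625, 18.625,
           15.250, 16.500, 18.000, 16.250, 18.000, 12.750, 15.500, 18.000]

# Pairwise differences -- the paired structure must be respected.
d = [c - s for c, s in zip(crossed, selfed)]
n = len(d)
t = mean(d) / (stdev(d) / sqrt(n))   # paired t statistic on n - 1 = 14 df
print(f"mean difference = {mean(d):.3f}, t = {t:.3f}")
```

Fisher's classical analysis of these data gives t ≈ 2.148 on 14 degrees of freedom, borderline significant at the 5% level.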
What has happened
• R.A. Fisher
  – Rothamsted
The 5th pot
• Why?
• Random effects,
  populations,
  samples
Population genetics
• Population
• P = A + M
• Ao = ½Am + ½Af
Population genetics
• R.A. Fisher
• Sewall Wright
• (Haldane)
Hierarchical populations
Sires → Females → Offspring
Population genetics / Animal breeding
• R.A. Fisher
• Sewall Wright
• Jay R. Lush
• C.R. Henderson
• S.R. Searle.
Animal breeding
• Spatial observations
7 Experimental planning and design
The purpose of the lecture was to refresh the concepts used in experimental planning and design,
i.e. hypotheses, power of designs, and blocking. Typical blocking factors were discussed.
Different types of experimental design, such as randomized block, split-plot, Latin square and
factorial designs, were discussed, and examples were sought within the participants' areas of
research.
The slides are in Danish. Link to full-screen presentation1
1
http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/Forsplanpl.f.pdf
Outline
• Hypotheses
• Decision Support
• Need of information for planning
• Restrictions in experimental design
• Different designs
The research process
[Diagram: Ansøgning (application) – Forsøg (experiment) – Publicering (publication) – Pakke (package)]
The research process
• (Quantification of knowledge)
Darwin's Maize — height, inches

            Crossed    Self-fert.
Pot I       23.50      17.38
            12.00      20.38
            21.00      20.00
Pot II      22.00      20.00
            19.13      18.38
            21.50      18.63
Pot III     22.13      18.63
            20.38      15.25
            18.25      16.50
            21.63      18.00
            23.25      16.25
Pot IV      21.00      18.00
            22.13      12.75
            23.00      15.50
            12.00      18.00
Hypotheses
Lice: decision support
Research vs. decision support
Types of erroneous conclusions
Hypothesis 1 vs. Hypothesis 2
Options in the design phase
[Figure: distributions under Hypothesis 1 and Hypothesis 2]
Biological input
• Measurement properties
• Expected treatment effects
• Hypothesis-generating properties
Typical blocking factors
• Litter
• Pen, flock, cage
• Sex
• Ancestry
• Herd
• Observer
• Block size
• Housing/management
• Competition for resources
Design types
• Randomized block designs
• Split-plot designs
• Latin squares
• Incomplete block designs
• Factorial designs
• Fractional designs
8 Randomized Complete Block Design
These are the first slides in the second block of lectures. They start off with the augmentation
of the linear normal model to a mixed model. Then PROC MIXED in SAS was presented, and
example 1.2.1 in LMSW (Littell et al., 1996) was discussed. The slides can be seen as a summary
of chapter 1 in LMSW.
Link to the full screen presentation1
1
http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/RDBC.f.pdf
Outline
• Hypotheses
• Extension of the linear normal model (LNM)
Y11 = δ + α1 + u1 + ε11
Y12 = δ + α2 + u1 + ε12
Y21 = δ + α1 + u2 + ε21
Y22 = δ + α2 + u2 + ε22
εij ∼ N(0, σ²)
ε ∼ N(0, σ²I)
Matrix formulation

  (Y11)   (1 1 0) ( δ )   (1 0)        (ε11)
  (Y12) = (1 0 1) (α1 ) + (1 0) (u1) + (ε12)
  (Y21)   (1 1 0) (α2 )   (0 1) (u2)   (ε21)
  (Y22)   (1 0 1)         (0 1)        (ε22)

Y = Xβ + Zu + ε
• Are inferences to be drawn from these data about just these levels of the factor? (Searle, 1971)
ML estimation
Likelihood function
l(y; β, σ², σu²) = −½ log|V| − ½ (y − Xβ)⊤V⁻¹(y − Xβ) − (n/2) log(2π)
[Figure: profile of the log-likelihood as a function of σ²]
February 28, 2001
Proc Mixed I
Proc Mixed II
CLASS variables;
Proc Mixed
The MODEL statement concerns Xβ
Design
             Ingot no.
Bond         1 2 3 4 5 6 7
1            n i c c c n n
2            c n i i n c i
3            i c n n i i c
• Paired observations
The "rolling" on-farm test programme
• (Report 685) Increasing amounts of sunflower seed (4 levels); 20 litters of 4 pigs each.
• Report 546: Rearing intensity, Jersey; 10 pairs of monozygotic twins; high vs. low intensity.
• Research Report 25: The Airwash system; the herd is split by even vs. odd cow numbers.
Second notation
Yij = µ + αi + uj + εij
uj ∼ N(0, σu²)
εij ∼ N(0, σε²)
Third notation
Y = Xβ + Zu + ε
u ∼ N(0, G)
ε ∼ N(0, R)
Model Information
Class Level Information:
  ingot  7 levels: 1 2 3 4 5 6 7
  metal  3 levels: c i n
Dimensions
Covariance Parameters 2
Columns in X 4
Columns in Z 7
Subjects 1
Max Obs Per Subject 21
Observations Used 21
Observations Not Used 0
Total Observations 21
Iteration History
0 1 112.40987952
1 1 107.79020201 0.00000000
Covariance Parameter
Estimates
ingot 11.4478
Residual 10.3716
Fit Statistics
Significance tests
Num Den
Effect DF DF F Value Pr > F
Degrees of Freedom, Numerator
H0 : α1 = α2 = α3

                (0  1  −1   0) ( µ )
K⊤β = 0  ⇔      (0  1   0  −1) (α1)  = 0
                (0  0   1  −1) (α2)
                               (α3)

Num DF is rank(K) = 2
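That the numerator degrees of freedom equal rank(K) can be checked numerically; the following is a small illustrative Python sketch (pure standard library, not part of the slides):

```python
def matrix_rank(rows):
    """Rank via Gaussian elimination on a copy of the matrix."""
    m = [list(map(float, r)) for r in rows]
    rank, nrows, ncols = 0, len(rows), len(rows[0])
    for col in range(ncols):
        # find a pivot row for this column
        pivot = next((r for r in range(rank, nrows) if abs(m[r][col]) > 1e-12), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(nrows):
            if r != rank and abs(m[r][col]) > 1e-12:
                f = m[r][col] / m[rank][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

# Rows of K^T for beta = (mu, alpha1, alpha2, alpha3):
K_T = [[0, 1, -1, 0],
       [0, 1, 0, -1],
       [0, 0, 1, -1]]
print(matrix_rank(K_T))
```

The third row is the difference of the first two, so the rank — and hence Num DF — is 2, matching the F test for a three-level factor.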
Estimates Standard
Label Estimate Error DF t Value Pr > |t|
Contrasts
Num Den
Label DF DF F Value Pr > F
Standard
Effect metal Estimate Error DF t Value Pr > |t|
Standard
Effect metal _metal Estimate Error DF t Value Pr > |t|
GLM
GLM:
Source DF Type III SS Mean Square F Value Pr > F
Mixed:
Num Den
Effect DF DF F Value Pr > F
GLM:
Standard LSMEAN
metal pres LSMEAN Error Pr > |t| Number
Standard
Effect metal Estimate Error DF t Value Pr > |t|
32
GLM: Standard
Parameter Estimate Error t Value Pr > |t|
Mixed: Standard
Label Estimate Error DF t Value Pr > |t|
Summary
• Model specification
• Output elements
• Estimation Methods
• Fit Statistics/Information Criteria
• Degrees of freedom, model parameters.
• GLM differs
IC Options
9 Randomized Complete Block Design II
These slides discuss the concepts of BLUE and BLUP estimates and address the question of
model control.
Link to the full-screen presentation1
1
http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/RDBC2SLU.f.pdf
Outline
• BLUEs and BLUPs
• Examples of model control
Linear Regression
[Figure: two scatter plots of x2 against x1]
Linear Regression

(x2, x1)⊤ ∼ N( (µx2, µx1)⊤, [ V2  C21 ; C12  V1 ] )

V(E(X2|X1)) = C21 V1⁻¹ C21⊤
(u, y)⊤ ∼ N( (µu, µy)⊤, [ G  C ; C⊤  V ] )
Y11 = δ + α1 + u1 + ε11
Y12 = δ + α2 + u1 + ε12
Y21 = δ + α1 + u2 + ε21
Y22 = δ + α2 + u2 + ε22
Variance in BLUP
u = ũ + εu ⇔ u − ũ = εu
Example
ũi = BLUP(ui) = [niσu² / (σ² + niσu²)] (ȳi· − µ)
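The factor niσu²/(σ² + niσu²) shrinks the raw group deviation towards zero; a small Python sketch of this behaviour, with invented numbers (not from the course data):

```python
def blup(ybar_i, mu, n_i, sigma2, sigma2_u):
    """BLUP of a random effect u_i: the raw deviation (ybar_i - mu)
    shrunk by the factor n_i*sigma2_u / (sigma2 + n_i*sigma2_u)."""
    shrink = n_i * sigma2_u / (sigma2 + n_i * sigma2_u)
    return shrink * (ybar_i - mu)

# Hypothetical group mean 22 around an overall mean of 20:
mu, sigma2, sigma2_u = 20.0, 4.0, 1.0
for n_i in (1, 5, 100):
    print(n_i, blup(22.0, mu, n_i, sigma2, sigma2_u))
# As n_i grows the shrinkage factor tends to 1 and the BLUP
# approaches the raw deviation ybar_i - mu = 2.
```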
BLUP summary
• Probit plots.
• εi,t vs εi,t−1
• etc.
where VG = ZGZ⊤, i.e. the residuals are not i.i.d. (option OUTP in PROC MIXED).
Standardized residuals?
Residual vs. predicted
[Figure: residuals plotted against predicted values, panels (e) and (f)]
10 Split-Plot Experiments
These slides present the theoretical background for split-plot designs and augment the
presentation of split-plot designs in chapter 2 of LMSW (Littell et al., 1996). The concept of
variance components is presented, along with the different variances of different contrasts. In
addition, concepts such as the distribution of sums of squares, Satterthwaite's approximation
and the distinction between random and fixed effects are presented.
Link to the full screen presentation1
1
http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/SplitPlot.f.pdf
Other examples:
• There are 4 blocks (BLOCK, indexed by k), and CULT is randomly assigned to each half of
the block.
The variance components have implications for the correlation structure among the variables:
1. Observations within the same block (k) but with different levels of factor A (i) are
   correlated through the block component:
   Corr(yijk, yi′j′k) = Corr(yijk, yi′jk) = Cov(yijk, yi′jk)/Var(yijk) = σr²/σtot²
2. Observations within the same block (k) and with the same level of factor A (i) but
   different levels of factor B (j) are correlated through the block component and the
   whole-plot component:
   Corr(yijk, yij′k) = Cov(yijk, yij′k)/σtot² = (σr² + σw²)/σtot²
Hence in the split-plot model it is assumed that the correlation, when present, is positive.
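The two correlations are simple ratios of variance components; a small Python sketch with invented component values (illustration only):

```python
def splitplot_correlations(sigma2_r, sigma2_w, sigma2_e):
    """Correlations implied by a split-plot model with block (r),
    whole-plot (w) and residual (e) variance components."""
    total = sigma2_r + sigma2_w + sigma2_e
    same_block = sigma2_r / total                    # different whole plots
    same_wholeplot = (sigma2_r + sigma2_w) / total   # same whole plot
    return same_block, same_wholeplot

rb, rw = splitplot_correlations(2.0, 1.0, 5.0)
print(rb, rw)  # the same-whole-plot correlation is always the larger one
```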
Comparing Differences
188
Hence Var(ȳ1.. − ȳ2..) is bigger than Var(ȳ.1. − ȳ.2.).
In PROC MIXED one can make “approximate F–tests” (but SAS never
informs you that the tests are only approximate).
For simplicity suppose that factor B does not represent a treatment but only replications
within each whole plot. Then the model reduces to
ȳi·k = µ + αi + rk + (wik + ε̄i·k),  where Var(wik + ε̄i·k) = σw² + σ²/b
• Hence the between whole-plot variation (σw²) remains unchanged, while the within
whole-plot variation σ² is reduced by a factor b.
Modelling the Mean
Let zik = ȳi·k denote the mean and define uik = wik + ε̄i·k. Then
zik = µ + αi + rk + uik
In connection with ANOVA calculations, one frequently uses the following results:
ANOVA2: Let Y1, …, Yn be independent with Yi ∼ N(µ, σ²), and let SSD = Σi=1..n (Yi − Ȳ·)².
Then SSD ∼ σ²χ²(n − 1).
ANOVA3: Let Y1, …, Yn be independent with Yi = µi + εi, where εi ∼ N(0, σ²), and let
SSD = Σi=1..n (Yi − Ȳ·)²  and  Q(µ) = Σi=1..n (µi − µ̄·)²
Then
E(SSD) = Q(µ) + E(Σi (εi − ε̄·)²) = Q(µ) + (n − 1)σ²
With
zik = µ + αi + rk + uik
summation gives the difference
z̄i· − z̄·· = (αi − ᾱ·) + (ūi· − ū··)
Letting SSDA = Σi (z̄i· − z̄··)², we find that
E(SSDA) = Σi (αi − ᾱ·)² + E(Σi (ūi· − ū··)²) = Q(α) + (a − 1)σu²/c
and hence
E(c Σi (z̄i· − z̄··)²) = c·Q(α) + (a − 1)σu²
F = [c·SSDA/(a − 1)] / [SSDAC/((a − 1)(c − 1))]
  = [c Σi (z̄i· − z̄··)²/(a − 1)] / [Σik (zik − z̄i· − z̄·k + z̄··)²/((a − 1)(c − 1))]
  ∼ F(a − 1, (a − 1)(c − 1))
Back to the Original Setup
1. The interaction effect γij is tested exactly as if wik and rk had been
fixed effects. I.e. the test is made “against” the residual variation σ 2.
2. In the absence of γij , the main effect βj is also tested as if wik and rk
had been fixed effects.
Unbalanced cases
All the nice calculations previously presented break down when the design is no longer
balanced.
Consider again
yijk = µ + αi + rk + wik + ijk
and suppose this time that i = 1 . . . a, k = 1 . . . c and j = 1 . . . bik .
Hence there might not be the same number of replicates (j) within each
whole–plot unit.
Var(uik) = σw² + σ²/bik = σ²u,ik
and so the mean has variance (σw² + σ²u,i·)/c, which depends on i.
Some consequences of this:
A related problem:
The optimal estimate of this contrast is, in the balanced case, the difference
ȳ11· − ȳ21·
• The problem is that to estimate σw² + σ², two sums of squares are needed.
Satterthwaite's approximation
Then
Ȳi ∼ N(µi, σi²/ni),  Ȳ1 − Ȳ2 ∼ N(µ1 − µ2, σ1²/n1 + σ2²/n2)
Si² = (1/fi) Σj=1..ni (Yij − Ȳi·)² ∼ (σi²/fi) χ²(fi),  fi = ni − 1
Let σD² = σ1²/n1 + σ2²/n2. A natural and unbiased estimate for σD² is
SD² = S1²/n1 + S2²/n2    (1)
Question: What is the distribution of SD²?
• With SD² = S1²/n1 + S2²/n2 we have
E(SD²) = σ1²/n1 + σ2²/n2 = σD²
Var(SD²) = 2(σ1⁴/(n1²f1) + σ2⁴/(n2²f2))
• Under the approximation SD² ∼approx (φ²/η)χ²(η) we have
E(SD²) = φ²
Var(SD²) = 2φ⁴/η
April 6, 2001 Mixed Models Course 32
Matching the first two moments gives
φ² = σD²
η = (σD²)² / (σ1⁴/(n1²f1) + σ2⁴/(n2²f2))
which is estimated by
η̂ = (sD²)² / (s1⁴/(n1²f1) + s2⁴/(n2²f2))
Example:
σD² = 2/6 + 10/6 = 2
η = 2² / (2²/(6²·5) + 10²/(6²·5)) = 6.9 ≈ 7
Hence
SD² = S1²/n1 + S2²/n2 ∼approx (σD²/7) χ²(7)
Example 3. Let σ1² = 100, σ2² = 90, n1 = 100, n2 = 10, f1 = 99, f2 = 9. Then
σD² = 100/100 + 90/10 = 10
η = (1 + 9)² / (1²/99 + 9²/9) = 11.1
Quite a difference!
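Satterthwaite's η can be computed directly from the formula above; a Python sketch reproducing the two numerical examples (an illustrative translation, not course material):

```python
def satterthwaite_df(s1, s2, n1, n2, f1, f2):
    """Approximate degrees of freedom for S_D^2 = s1/n1 + s2/n2,
    where s1, s2 are variances with f1, f2 degrees of freedom."""
    sd2 = s1 / n1 + s2 / n2
    return sd2 ** 2 / (s1 ** 2 / (n1 ** 2 * f1) + s2 ** 2 / (n2 ** 2 * f2))

# The two examples from the slides:
print(satterthwaite_df(2, 10, 6, 6, 5, 5))        # approx. 6.9
print(satterthwaite_df(100, 90, 100, 10, 99, 9))  # approx. 11.1
```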
Two–sample Problem
Model:
Yij = µi + εij,  i = 1, 2,  j = 1, …, ni
where εij ∼ N(0, σi²).
Split–Plot Experiment
where wik ∼ N(0, σw²) and εijk ∼ N(0, σ²).
• Make simulations for different values of σw².
n1 = 3 and n2 = 8
i = 1: j = 1, …, n1k = 5
i = 2, k = 1…3: j = 1, …, n2k = 3
i = 2, k = 4…8: j = 1, …, n2k = 9
A typical SAS program for analyzing the split plot data above is like
proc mixed data=sim noitprint;
class i j k subject;
model y = i j /ddfm=contain chisq;
random i*k;
run;
• This tells SAS that when testing any of the fixed effects in the model, SAS should look for
a random effect which syntactically contains the fixed effect: since i is contained in i*k,
SAS then knows that it is this random effect the test should be "made against".
• It is well known that this is the right thing to do when the experiment
is balanced.
A Severe Warning!!
subject and (i, k) really identify the same units in the experiment, so it should be immaterial
what one writes.
This will also make SAS perform the test of the effect of the factor i against the residual
variance, which, as pointed out above, is wrong.
Some Tentative Conclusions on Satterthwaite
• For larger samples, there is not much difference between the two methods. In practice, this
is because the difference between the quantiles of an F(1, 7) and an F(1, 14) distribution is
not large, whereas the differences between quantiles of an F(1, 2) and an F(1, 4) distribution
can be substantial.
• Both methods generally perform better than the large-sample χ² tests.
somewhat intensive.
lactation number, management system, cage number, and breed class. Usually if the
sampling were to be repeated a second time, those factors which maintain the same
classes between the two samplings would be fixed factors. For example, a growth trial
on pigs using two diets would probably need to use the same housing facilities, the
same age groups of pigs, and the same diets, but the individual pigs would necessarily
have to be new animals because an animal could not go through the same growth
phase a second time in its life. Pig effects would be considered a random factor while
the other effects would be fixed.
Random factors are factors whose levels are considered to be drawn randomly from an
infinitely large population of levels. As in the previous pig experiment, pigs were
considered random because the pig population of the world is large enough to be
considered infinitely large, and the group that was involved in that experiment was a
random sample from that population. In actual fact, however, the pigs on that
experiment were likely sampled from those relatively few pigs that were available at the
time the trial started, but still they are considered to be a random factor because if the
experiment were to be repeated again, there would likely be a completely different
group of pigs involved.
Another way to determine if a factor is fixed or random is to know how the results will
be used. In a nutrition trial the results infer something about the diets in the trial. The
diets are specific and no inferences should be made about other diets not tested in the
experiment. Hence diet effects would be a fixed factor. In contrast, if animal effects
were in the model, inferences about how any animal might respond to a specific diet
may need to be made. There should not be anything peculiar about the animals on the
trial that would nullify that inference. Animal effects would be a random factor.
In general, a few questions need to be answered to make the correct choice of fixed or
random factor designation. Some of the questions are:
1. How many levels of the factor are in the model? If small, then perhaps this is a
fixed factor. If large, then perhaps this is a random factor.
3. Would the same levels be used again if the experiment were to be repeated a second
time?
4. Are inferences to be made about levels not included in the experiment? If yes, then
perhaps this factor should be random.
5. Were the levels of a factor determined in a nonrandom manner? If yes, then perhaps
this factor should be treated as fixed.
By studying the scientific literature, a researcher should be able to get some help in this
decision process. If in doubt, then the assistance of an experienced statistician should
be sought.
Multilocation Trials
Note that since there are replicates within each farm, the
farm–treatment interaction can be estimated.
It is reasonable to assume that (RL)jk and εijk are random. But other effects need more
consideration:
But if the farms are selected as e.g. "those 9 farms whose owners responded to a
questionnaire sent out to all farms with given characteristics", then the farms are not
random representatives from the
11 Examples of Split-Plot Designs
The purpose of this lecture was to illustrate the kind of problems that may arise if split-plot
designs are not treated properly. Most of the experiments presented were made at the Danish
Institute of Agricultural Sciences, or rather the National Institute of Animal Science, as it was
called in those days.
Another common aspect of several of the experiments was that they had led to heated debate.
The pros and cons in those debates were presented.
Link to the full screen presentation1
1
http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/SPLITPLOTExamples.pdf
March 6, 2001 1
ANOVAs
Reported model:
Yijk = µ + Pi + Hj + εijk
Are the present feeding standards for essential nutrients per FUp sufficient for ad libitum
feeding?
Model
Similar designs
Straw shortener
A number of sows were fed either control feed or feed containing straw from fields treated
with straw shortener (CCC). To investigate long-term effects the study covered 4 parities.
Reported model:
Yijk = µ + ti + pj + (tp)ij + εijk
Yijk: observed variable, e.g. litter size. ti: effect of treatment. pj: effect of parity.
(tp)ij: interaction between parity and treatment. εijk: random residual.
Correct model:
Yijk = µ + ti + pj + (tp)ij + Sik + εijk
Sik: effect of sow k on treatment i, Sik ∼ N(0, σS²)
Group housing
Herd Investigations
Multi-location trials
12 Estimation and tests in mixed models
The purpose of this lecture was to give a detailed description of theoretical issues of estimation
and tests in mixed models, i.e. properties of maximum likelihood estimators in the linear normal
model and the mixed linear normal model. Concepts such as ML and REML are introduced.
Link to the full screen presentation1
1
http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/MLMixed.f.pdf
yi = β0 + β1xi + εi
We shall show that the maximum likelihood estimate and the least squares estimate for
β = (β0, β1)
are identical.
Because of the independence, the joint density for y1, …, yn (and hence the likelihood
function) becomes
f(y1, …, yn; β) = ∏i=1..n f(yi; β)
= ∏i=1..n (1/(√(2π) σ)) exp(−(1/(2σ²)) (yi − (β0 + β1xi))²)
= (1/(√(2π))ⁿ) (1/σⁿ) exp(−(1/(2σ²)) Σi (yi − (β0 + β1xi))²)
= L(β)
The likelihood function is
L(β) = (1/(√(2π))ⁿ) (1/σⁿ) exp(−(1/(2σ²)) Σi (yi − (β0 + β1xi))²)
• Let D(β0, β1) = Σi (yi − (β0 + β1xi))².
For
y = Xβ + ε,  where ε ∼ N(0, σ²I)
the likelihood is
L(β, σ²) = (1/(√(2π))ⁿ) (1/σⁿ) exp(−(1/(2σ²)) Σi (yi − µi)²)
= (1/(√(2π))ⁿ) (1/σⁿ) exp(−(1/(2σ²)) (y − Xβ)⊤(y − Xβ))
so maximizing L(β, σ²) over β amounts to minimizing
(y − Xβ)⊤(y − Xβ).
Once β̂ (and hence µ̂) is found, it is not hard to verify that L(β̂, σ²) is maximized as a
function of σ² by
σ̂² = (1/n) (y − Xβ̂)⊤(y − Xβ̂)
However, in practice one never uses the ML estimate for σ². Instead one uses
σ̃² = (1/(n − p)) (y − Xβ̂)⊤(y − Xβ̂)
because
E(σ̃²) = σ²,   E(σ̂²) = ((n − p)/n) σ²
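The downward bias of the ML estimate relative to the divisor n − p can be checked by simulation; a Python sketch (illustrative only, fixed seed, simple linear regression so p = 2):

```python
import random

# Monte Carlo check that RSS/n is biased downwards while RSS/(n-p)
# is unbiased, for simple linear regression with p = 2.
random.seed(1)
n, p, sigma2 = 20, 2, 1.0
x = [i / (n - 1) for i in range(n)]

def fit_rss(y):
    # ordinary least squares for y = b0 + b1*x, returning the RSS
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    return sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))

reps = 2000
rss = [fit_rss([1.0 + 2.0 * xi + random.gauss(0, 1) for xi in x])
       for _ in range(reps)]
ml   = sum(r / n for r in rss) / reps        # tends to (n-p)/n * sigma2 = 0.9
reml = sum(r / (n - p) for r in rss) / reps  # tends to sigma2 = 1.0
print(ml, reml)
```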
It can be noted that
σ̃² = (1/(n − p)) (y − Xβ̂)⊤(y − Xβ̂)
• So we write V = V (α).
Case 2 — V is unknown: If V is unknown (which of course is generally the case in practice),
things become more complicated.
L(β̂(V), V) = (1/(√(2π))ⁿ) det(V)^(−1/2) exp(−½ (y − Xβ̂(V))⊤ V⁻¹ (y − Xβ̂(V)))
y = Xβ + Zu + ε,  where Var(y) = V
y − Xβ ∼ N(0, V)
and one could use the ML method from before for estimating V.
While not the optimal estimate for β, it is still an unbiased estimate.
The likelihood for the "residuals" then depends only on V, and one can maximize that
likelihood numerically.
β̂reml = β̂(V̂reml) = (X⊤V̂reml⁻¹X)⁻¹ X⊤V̂reml⁻¹ y
Using ML or REML
The main argument for REML estimation is that, at least in the balanced
cases, V̂reml is unbiased while V̂ml is not.
In dealing with tests in mixed models we shall first assume that the
covariance matrix V is known.
Standard calculations gives that
Var(X β̂) = X(X >V −1X)−1X >V −1X(X >V −1X)−1X >
= X(X >V −1X)−1X >
so
X β̂ ∼ N (Xβ, X(X >V −1X)−1X >).
Hence
a>X β̂ ∼ N (a>Xβ, a>X(X >V −1X)−1X >a)
How to derive f2 shall not be discussed here. We just note that PROC
MIXED attempts to construct such test statistics and to derive the
appropriate number f2 of denominator degrees of freedom.
Another approach is to construct approximate F–tests by establishing
a denominator D, such that
One can force PROC MIXED to make such tests by adding the CHISQ option to the MODEL
statement.
13 Complications concerning Variance
Components
This lecture illustrated some of the problems that may arise because of numerical problems
in the iterative search for the maximum likelihood, and the reason why some of the variance
components are set equal to 0.
Based on an example from one of the exercises, the profile of the likelihood function is illustrated.
A special problem is that Satterthwaite's approximation fails in the cases where a variance
component is set to 0 and the G matrix is not positive semidefinite. Rules of thumb are suggested
for that case.
Finally, the relevance of a test of a positive variance component is discussed, e.g. comparable
to a test of a block effect when block is treated as a fixed effect.
Link to the full-screen presentation1
1
http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/Complicate.pdf
Pct Sukk (per cent sugar)
Num Den
Effect DF DF F Value Pr > F
OPTAGN 1 2 15.21 0.0599
SAATID 4 16 189.37 <.0001
OPTAGN*SAATID 4 16 5.37 0.0061
Kg
OPTAGN 1 18 336.85 <.0001
SAATID 4 18 408.52 <.0001
OPTAGN*SAATID 4 18 12.70 <.0001
Inspection of Log
Pct Sukk
Kg
Sugar beet example
Kg
Cov Parm        Estimate   Alpha   Lower     Upper
BLOK            0.05344    0.05    0.01660   3.13E192
BLOK(OPTAGN)    0          .       .         .
Residual        5.1215     0.05    2.9241    11.2004
Outline
Reason
[Figure: profile log-likelihood as a function of log10(σ²B(O))]
March 13, 2001
Likelihood contour plot, Kg
[Figure: likelihood contours over log10(σ²B) and log10(σ²B(O))]
V(u) = G = diag(σB², …, σB², σB(O)², …, σB(O)²)
i.e. a diagonal matrix with the block variance σB² for the BLOK effects and σB(O)² for the
BLOK(OPTAGN) effects.
Correspondingly,
V̂(u) = Ĝ = diag(σ̂B², …, σ̂B², σ̂B(O)², …, σ̂B(O)²)
and with σ̂B(O)² = 0 this becomes
Ĝ = diag(σ̂B², …, σ̂B², 0, …, 0)
Ĝ⁻¹ = ???
Warning: Satterthwaite Goes Wrong
Conclusions
• If not
– If model reductions are "natural", re-estimate parameters using the revised models.
– Nested designs should be reformulated to maintain the design structure.
– Use the containment method, but be careful to specify the model syntactically correctly
  (compare with the RANDOM statement in GLM).
• Why are we interested in testing σB² > 0?
• Model reduction
• σ̂B² = 0 is not a test and may not be used for this purpose.
Model Reduction
General recommendations
Fixed Effects
However:
Biologically significant
14 Repeated Measurements
Link to the full screen presentation1
1
http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/Repeated.f.pdf
Some questions:
Mean profiles
[Figure: group mean strength over time (1–7) for the programs W, R and C]
Individual profiles:
[Figure: individual strength profiles over time, one panel per program (CONT, RI, WI)]
where wik ∼ N(0, σw²) and εijk ∼ N(0, σ²).
2. It is assumed that the correlation between two measurements on the
same individual is the same – no matter how far the measurements
are apart in time.
This may not be a reasonable assumption: Observations close
to each other in time might be expected to be more alike than
observations far from each other.
µij = µ + αi + β·j + β2·j²
Modelling of Covariances
[Figure: four scatter plots illustrating different covariance patterns]
250
Can be summarized as:
• Serial dependence
• Residual variation
A very general model is the model where for each treatment i and
time j there is mean value µij , and the measurements have a
completely unstructured covariance matrix.
Yi1k µi1
Yik = .. ∼ N7(µi = ..
,V )
Yi7k µi7
where k refers to to subject within treatment, and where V is a
7 × 7 unstructured matrix.
October 11, 2001 Mixed Models Course 12
251
14 Repeated Measurements
Since the subjects are independent, the random vector arising after stacking all Yik's on top
of each other has a covariance matrix consisting of V's on the "diagonal" and 0's outside.
The estimated correlation matrix is
1.0000 0.9602 0.9246 0.8716 0.8421 0.8091 0.7968
0.9602 1.0000 0.9396 0.8770 0.8596 0.8273 0.7917
0.9246 0.9396 1.0000 0.9556 0.9372 0.8975 0.8755
0.8716 0.8770 0.9556 1.0000 0.9601 0.9094 0.8874
0.8421 0.8596 0.9372 0.9601 1.0000 0.9514 0.9165
0.8091 0.8273 0.8975 0.9094 0.9514 1.0000 0.9531
0.7968 0.7917 0.8755 0.8874 0.9165 0.9531 1.0000
The AR(1)–model
zt = ρ·zt−1 + εt,  t = 2, …, T
Hence ω² = σ²/(1 − ρ²).
In general, the covariance between observations k time–steps apart is
Cov(zt, zt−k ) = ρk ω 2
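This geometric decay of the autocovariance is easy to sketch in Python (illustrative only):

```python
def ar1_autocovariance(k, rho, sigma2_eps):
    """Cov(z_t, z_{t-k}) = rho**k * omega**2 for a stationary AR(1)
    process z_t = rho*z_{t-1} + eps_t with Var(eps_t) = sigma2_eps."""
    omega2 = sigma2_eps / (1 - rho ** 2)  # stationary variance
    return rho ** k * omega2

# The autocorrelation therefore decays geometrically with the lag:
rho = 0.8
acf = [ar1_autocovariance(k, rho, 1.0) / ar1_autocovariance(0, rho, 1.0)
       for k in range(4)]
print(acf)  # 1, 0.8, 0.64, 0.512 (up to floating-point rounding)
```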
Some Autocorrelations
[Figure: simulated AR(1) series z and their autocorrelation functions ρ^k, for different values of ρ]
Corr =
  1    ρ12  ρ13  ρ14
  ρ21  1    ρ23  ρ24
  ρ31  ρ32  1    ρ34
  ρ41  ρ42  ρ43  1
Simple estimates of the autocorrelation for observations one, two and three time-steps apart
are
γ̂(1) = (ρ12 + ρ23 + ρ34)/3
γ̂(2) = (ρ13 + ρ24)/2
γ̂(3) = ρ14
This creates the SAS dataset autocorr with autocorrelation and lag.
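The γ̂(k) construction extends directly to the 7 × 7 estimated correlation matrix shown earlier; a Python sketch (illustration, replacing the SAS data step):

```python
# Estimated 7x7 correlation matrix from the unstructured fit above.
corr = [
    [1.0000, 0.9602, 0.9246, 0.8716, 0.8421, 0.8091, 0.7968],
    [0.9602, 1.0000, 0.9396, 0.8770, 0.8596, 0.8273, 0.7917],
    [0.9246, 0.9396, 1.0000, 0.9556, 0.9372, 0.8975, 0.8755],
    [0.8716, 0.8770, 0.9556, 1.0000, 0.9601, 0.9094, 0.8874],
    [0.8421, 0.8596, 0.9372, 0.9601, 1.0000, 0.9514, 0.9165],
    [0.8091, 0.8273, 0.8975, 0.9094, 0.9514, 1.0000, 0.9531],
    [0.7968, 0.7917, 0.8755, 0.8874, 0.9165, 0.9531, 1.0000],
]

def gamma_hat(k):
    """Average of the correlations k time-steps apart (the k-th diagonal)."""
    t = len(corr)
    vals = [corr[i][i + k] for i in range(t - k)]
    return sum(vals) / len(vals)

for k in range(1, 7):
    print(k, round(gamma_hat(k), 4))
```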
[Figure: estimated autocorrelation plotted against lag]
Since all autocorrelations γ(k) are positive it is tempting to plot
log γ(k) against k as well.
[Figure: log autocorrelation plotted against lag]
Again, there is not any strong evidence against the AR(1) structure.
Compound Symmetry
The option type=cs specifies that the covariance matrix for each subject has a compound
symmetry structure:
  σ² + σw²   σw²        …    σw²
  σw²        σ² + σw²   …    σw²
  …          …          …    …
  σw²        σw²        …    σ² + σw²

From the SAS output one sees that the correlation between observations on the same subject
is estimated to
σw²/(σw² + σ²) ≈ 0.8892
Numerical Criteria
AIC and BIC are two criteria that can be used. Both combine the log-likelihood with a term
penalizing the number of parameters used in the model; BIC penalizes the use of many
parameters harder than AIC.
Structure CS AR(1) UN
AIC 1424.9 1270.8 1290.9
BIC 1428.9 1274.9 1348.1
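The criteria are computed from the log-likelihood; the sketch below uses the "smaller is better" convention (−2·loglik plus a penalty, consistent with the table, where the favoured AR(1) has the smallest values) and invented log-likelihoods, since the fitted values behind the table are not reproduced here:

```python
from math import log

def aic(loglik, n_params):
    """Akaike's criterion in the 'smaller is better' convention."""
    return -2 * loglik + 2 * n_params

def bic(loglik, n_params, n_obs):
    """Schwarz's criterion; the log(n_obs) factor penalizes extra
    parameters harder than AIC as soon as n_obs exceeds about 7.4."""
    return -2 * loglik + n_params * log(n_obs)

# Hypothetical numbers: with many observations BIC punishes a
# 28-parameter unstructured matrix far more than a 2-parameter AR(1).
ll_ar1, ll_un, n = -633.0, -617.0, 250
print(aic(ll_ar1, 2), aic(ll_un, 28))
print(bic(ll_ar1, 2, n), bic(ll_un, 28, n))
```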
Hence the result is in favor of using the AR(1)–structure.
For the Exercise Therapy the p–values for the test of no interaction
effect are:
Structure CS AR(1) UN
Program*Time 0.0005 0.3007 0.1297
15 Repeated Measurements: Covariance
structures
This lecture gives an overview of how to specify different covariance structures in SAS via the
REPEATED statement in PROC MIXED. The lecture is based on the description in the on-line SAS
manual1 .
The most important types of covariance structure are presented:
• Unstructured (UN)
• Autoregressive (AR(1)–SP(POW))
• Antedependence (ANTE(1))
• Toeplitz (TOEP)
1
http://dokumentation.agrsci.dk/sasdocv8/sasdoc/sashtml/onldoc.htm
2
http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/RepeatedType.f.pdf
Repeated statement
Y = Xβ + Zu + ε,  V(ε) = R
Missing data: example
Unstructured: type=UN
Parameters: t(t + 1)/2
Autoregressive: type=AR(1)
Y1 —ρ— Y2 —ρ— Y3 —ρ— Y4 —ρ— Y5
[Figure: autocovariance ρ^lag against lag]
March 21, 2001
Autoregressive: type=SP(POW)
Ante-Dependence: type=ANTE(1)
AR(1):    Y1 —ρ— Y2 —ρ— Y3 —ρ— Y4 —ρ— Y5
ANTE(1):  Y1 —ρ1— Y2 —ρ2— Y3 —ρ3— Y4 —ρ4— Y5
Toeplitz: type=TOEP
Heterogeneous variance
Conclusions
• Parsimony!
• Fixed observation times and similar intervals: AR(1) (2 parms)
• Slightly varying observation times and similar intervals: SP(POW) (2 parms)
• Fixed observation times but intervals of different type: ANTE(1) (2t − 1 parms, heterogeneous variance)
• Fixed observation times, similar intervals, no simple lag-structure: TOEP (t − 1 parms)
AR vs CS
AR(1): Y1 —ρ— Y2 —ρ— Y3 —ρ— Y4 —ρ— Y5 (neighbour correlation ρ, decaying with lag)
CS:    Y1, Y2, Y3, Y4, Y5 all share a common correlation
273
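The parameter counts behind the parsimony argument are easy to tabulate for a given number of time points t (a Python sketch; note that SAS's TOEP structure has σ² plus t − 1 bands, i.e. t parameters in total, while the slide counts only the off-diagonal bands):

```python
def n_cov_parms(structure, t):
    """Number of covariance parameters for t time points."""
    counts = {
        "CS": 2,                 # common variance + common covariance
        "AR(1)": 2,              # sigma^2 and rho
        "SP(POW)": 2,            # sigma^2 and rho
        "TOEP": t,               # sigma^2 plus t - 1 banded covariances
        "ANTE(1)": 2 * t - 1,    # t variances + t - 1 adjacent correlations
        "UN": t * (t + 1) // 2,  # all variances and covariances
    }
    return counts[structure]

for s in ("AR(1)", "ANTE(1)", "TOEP", "UN"):
    print(s, n_cov_parms(s, 5))
```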
16 Random Regression
The random regression model is discussed starting with an example from one of the exercises.
The presentation supplements chapter 7: Random Coefficients in LMSW (Littell et al., 1996).
The basic idea behind random regression and the implementation of the model in PROC MIXED are shown. Finally, the implications for the covariance structure of the observations are presented.
Link to full-screen presentation1
1
http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/RandomRegression.f.pdf
275
[Figure: Weight (40–100) plotted against time (weeks 4–12), one panel per treatment.]
Aims:
276
First idea: fit linear regression model (with random pig effect) and
treatment specific parameters:
[Figure: residuals (−10 to 5) plotted against time, one panel per treatment.]
277
Second idea: fit individual linear regression model (with random pig
effect):
yijt = αi + βij t + Uij + εijt
where i is treatment, j is subject (pig) within treatment, t is time, and Uij ∼ N(0, σu²) and εijt ∼ N(0, σ²), independent.
title ’Individual linear regressions (with random Pig effect)’;
proc mixed data=CuFeed;
class Cu Pig;
model Weight = Cu Cu*Pig*Time /noint solution outp=R2;
random Cu*Pig;
ods output solutionf=sf2;
proc gplot data=R2;
by Cu;
plot resid*Time=Pig;
run;
Cu = 1, Cu = 2, Cu = 3
[Figure: residuals (−4 to 4) plotted against time (4–12), one panel per Cu treatment.]
278
Analyzing the Individual Regression Coefficients
The analysis could then proceed by comparing β̄1. , β̄2. and β̄3. in
some way.
279
[Figure: density estimates and quantile plots of the estimated individual regression coefficients.]
Random Regression
280
Hence
281
Independence:
title ’Random regression model (with random Pig effect)’;
title2’Independent intercepts and slopes’;
proc mixed data=CuFeed;
class Cu Pig;
model Weight = Cu Cu*Time / ddfm=satterth noint solution outp=R3;
random int Time / sub=Pig type=vc solution;
ods output solutionf=sf3;
ods listing exclude solutionr;
ods output solutionr=sr3;
run;
Dependence:
title ’Random regression model (with random Pig effect)’;
title2’Dependent intercepts and slopes’;
proc mixed data=CuFeed;
class Cu Pig;
model Weight = Cu Cu*Time / ddfm=satterth noint solution outp=R4;
random int Time / sub=Pig type=un solution;
ods output solutionf=sf4;
ods listing exclude solutionr;
ods output solutionr=sr4;
run;
282
Inference
283
The model has a random regression coefficient Bij for t and a random intercept Uij; assume for simplicity that Uij and Bij are independent. Then
Var(Yijt) = σU² + σB² t² + σe²
For later use let Vt = σU² + σB² t².
285
In total
Var(Yijt) = Vt + σe²
Var(Yij(t+k)) = Vt + k(2t + k) σB² + σe²
Cov(Yijt, Yij(t+k)) = Vt + tk σB²
286
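The three formulas can be checked by simulating from the random-intercept, random-slope model (a Python sketch with arbitrary parameter values; only the random part is simulated, since the fixed part does not affect variances):

```python
import random

random.seed(1)
sU, sB, se = 2.0, 0.5, 1.0   # sd of random intercept, random slope, residual
t, k = 3.0, 2.0              # the two time points t and t + k
n = 200_000

y_t, y_tk = [], []
for _ in range(n):
    U = random.gauss(0.0, sU)          # random intercept U_ij
    B = random.gauss(0.0, sB)          # random slope B_ij
    y_t.append(U + B * t + random.gauss(0.0, se))
    y_tk.append(U + B * (t + k) + random.gauss(0.0, se))

def cov(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

Vt = sU ** 2 + sB ** 2 * t ** 2
print(cov(y_t, y_t), Vt + se ** 2)                                # Var(Y_t)
print(cov(y_tk, y_tk), Vt + k * (2 * t + k) * sB ** 2 + se ** 2)  # Var(Y_{t+k})
print(cov(y_t, y_tk), Vt + t * k * sB ** 2)                       # Cov
```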
Hence we know from high school mathematics that
Corr(Yijt , Yij(t+k) ) → 0
as k (i.e. the time span between Yijt and Yij(t+k)) goes to infinity.
287
17 Factor Structure Diagrams
The discussion with participants during the previous lectures had shown the need for an independent means of checking the degrees of freedom in the F-tests in PROC MIXED. The methods for calculating degrees of freedom (option ddfm) are not fool-proof. The containment method may lead to errors if the experimental design cannot be deduced from the model specification, and the Satterthwaite method is erroneous if one of the variance components is estimated as 0.
Therefore, the factor structure diagram method was presented, supplemented with an exercise.
Link to the full screen presentation1
1
http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/FactorStructure.f.pdf
289
However
290
Two–way ANOVA with Replicates
The diagram (superscripts give numbers of levels, subscripts give degrees of freedom) is
[ABR] (abr levels, DF abr − ab) → AB (ab, DF ab − a − b + 1) → A (a, DF a − 1), B (b, DF b − 1) → O (1, DF 1)
291
1. Fill in the levels of the factors as superscripts (i.e., the red symbols).
3. Proceed like this towards the left in the diagram: The DF for AB are
ab − (a − 1) − (b − 1) − 1 = ab − a − b + 1
Effect   Num DF   Den DF   F Value   Pr > F
A        3        16       0.56      0.6470
B        1        16       1.54      0.2329
A*B      3        16       1.72      0.2021
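The bookkeeping in the steps above can be done mechanically. From the Num DF column of the table the levels are apparently a = 4, b = 2 and r = 3 (an inference, not stated explicitly in the slides); a Python sketch:

```python
def twoway_df(a, b, r):
    """Degrees of freedom read off the two-way factor structure diagram."""
    return {
        "A": a - 1,
        "B": b - 1,
        "A*B": a * b - a - b + 1,    # ab - (a - 1) - (b - 1) - 1
        "[ABR]": a * b * r - a * b,  # residual stratum
    }

df = twoway_df(4, 2, 3)
print(df)  # {'A': 3, 'B': 1, 'A*B': 3, '[ABR]': 16}
```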
292
Two–way ANOVA without Replicates
If there are no replicates within each combination of A and B (i.e. r = 1), the model is
yab = µ + αa + βb + εab
[ABR] (ab levels, DF ab − ab = 0) → AB (ab, DF ab − a − b + 1) → A (a, DF a − 1), B (b, DF b − 1) → O (1, DF 1)
The residual stratum has 0 degrees of freedom, so the interaction term must serve as the residual:
[AB] (ab, DF ab − a − b + 1) → A (a, DF a − 1), B (b, DF b − 1) → O (1, DF 1)
293
Effect   Num DF   Den DF   F Value   Pr > F
A        3        3        0.45      0.7377
B        1        3        0.05      0.8414
The diagram is
[ABR] (abr levels, DF abr − ab) → [AB] (ab, DF ab − a − b + 1) → [A] (a, DF a − 1), B (b, DF b − 1) → O (1, DF 1)
294
Note:
Effect   Num DF   Den DF   F Value   Pr > F
B        1        3        14.99     0.0305
295
yab = µ + Ua + βb + εab
The diagram is
[AB] (ab levels, DF ab − a − b + 1) → [A] (a, DF a − 1), B (b, DF b − 1) → O (1, DF 1)
Effect   Num DF   Den DF   F Value   Pr > F
B        1        3        3.30      0.1671
296
Split Plot Experiment
Let A denote the whole–plot treatment and B the split–plot treatment.
Replicate units within A are denoted by R.
[ABR] (abr levels, DF a(b − 1)(r − 1)) → AB (ab, DF ab − a − b + 1) → A (a, DF a − 1), B (b, DF b − 1) → O (1, DF 1)
[ABR] → [AR] (ar, DF a(r − 1)) → A
Effect   Num DF   Den DF   F Value   Pr > F
A        3        8        0.68      0.5901
B        2        16       3.81      0.0444
A*B      6        16       2.57      0.0618
297
The diagram contains the terms (levels, DF):
[ECRM] (ecrm, DF ec(rm − r − m + 1)); [ECR] (ecr, DF ec(r − 1));
ECM (ecm, DF (e − 1)(c − 1)(m − 1)); EM (em, DF em − e − m + 1); CM (cm, DF cm − c − m + 1); EC (ec, DF ec − e − c + 1);
E (e, DF e − 1); C (c, DF c − 1); M (m, DF m − 1); O (1, DF 1)
298
“Proof that it works”
title ’Split plot experiment - homework - with 3 membranes’;
%let sigma_G = 2;
%let sigma_M = 6;
%let sigma_E = 1;
data mem;
do cu= 1 to 2;
do e_vit= 1 to 2;
do grnr= 1 to 8;
U_g = &sigma_G * rannor(0);
do membran= 1 to 3;
V_m = &sigma_M * rannor(0);
do muskel= 1 to 2;
E = &sigma_E * rannor(0);
y = U_g + V_m + E;
output;
end;
end;
end;
end;
end;
data mem1; set mem(where=(muskel=1));
Num Den
Effect DF DF F Value Pr > F
cu 1 28 0.05 0.8316
e_vit 1 28 0.10 0.7489
cu*e_vit 1 28 1.55 0.2230
membran 2 56 0.10 0.9091
cu*membran 2 56 0.57 0.5708
e_vit*membran 2 56 1.26 0.2904
cu*e_vit*membran 2 56 1.16 0.3198
299
18 Covariate Models and Multivariate
Response
The use of covariates in mixed models is discussed, initially based on chapter 5 in LMSW (Littell
et al., 1996), i.e., model specification, comparison, and reduction.
Then it is shown that the covariate model may be naturally modified to include several dependent variables, i.e., to a multivariate response model. The data manipulation steps in SAS are described and the necessary model specification is shown.
Link to full screen presentation1
1
http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/covariate.f.pdf
301
Plot
[Figure: Daily Gain (0.6–1.0) plotted against Start weight (15–35).]
April 17, 2001 2
302
Plot
[Figure: Daily Gain (0.6–1.0) plotted against Start weight (15–35).]
Model reduction ?
303
Model reduction
(αγ)ij = µ + αi + γj + (αγ)′ij
βij = β0 + β1i + β2j + β′ij
3. Test that the slopes are equal. If we fail to reject, use the common-slope model; if we reject, go to step 4.
304
SAS-code
Step 1:
Step 3:
SAS-Anova
305
Plot
[Figure: Daily Gain (0.6–1.0) plotted against Start weight (15–35).]
Final Model
[Figure: Daily Gain (0.6–1.0) plotted against Start weight (15–35).]
306
Feed per day
[Figure: Daily Gain (0.6–1.0) plotted against feed per day.]
307
[Figure: Daily Gain (0.6–1.0) plotted against feed per day.]
SAS-code
Test
Estimation:
308
Feed per day
[Figure: Daily Gain (0.6–1.0) plotted against feed per day.]
E(Xij ) = µx
E(Yij ) = µy = E(µ + βXij ) = µ + βµx
309
V(Xij) = σx²
V(Yij | Xij) = V(εij) = σy² − σyx (1/σx²) σxy
C(Xij, Yij) = C(Xij, µ + βXij + εij) = β V(Xij) = β σx²
V(Yij) = σε² + β² σx²
Multivariate Responses
Y 1 : Weight gain
Y 2 : Feed intake
310
Return to the feeding experiment.
Hence all parameters µr, αir, σr² are specific to the rth response.
In the example,
E(Yikr) = δr + αir
311
It is assumed that the mean value has the same structure for each
response r made on the same unit.
In the example,
If the vectors are regarded as row vectors, then it just looks like two
linear normal models appended to each other, with the extra finesse
that the two responses are allowed to be non–independent.
And - that is just what it is !
312
Such models can be dealt with in a mixed model setup.
Suppose there are two treatments, i.e. i = 1, 2 and two pigs per
treatment, i.e. j = 1, 2.
It is not very hard to see that the mean of each of these can be written in the matrix form
E(Y¹11, Y²11, Y¹12, Y²12, Y¹21, Y²21, Y¹22, Y²22)ᵀ = X (δ¹, α¹1, α¹2, δ², α²1, α²2)ᵀ
where, row by row,
X = [ 1 1 0 0 0 0
      0 0 0 1 1 0
      1 1 0 0 0 0
      0 0 0 1 1 0
      1 0 1 0 0 0
      0 0 0 1 0 1
      1 0 1 0 0 0
      0 0 0 1 0 1 ]
313
The covariance matrix is easy to specify too: The units are assumed
independent, and hence the covariance between measurements on
different units is zero.
314
How to ... In SAS
315
More generally,
E(Yjr) = xjᵀ βr
316
The previous considerations then give that
E(Y) = X B
(n × R) = (n × p)(p × R)
i.e. the mean is now organized as a matrix rather than as a vector.
317
19 Heterogeneous Variance
The purpose of this lecture was to present why it is important to recognize variance heterogeneity,
how to model such heterogeneity and consequences of different modelling approaches. The
lecture extends the description in chapter 8 in LMSW (Littell et al., 1996).
Graphical techniques for finding suitable models of variance heterogeneity are presented, and variance functions, including the power family, are introduced. In addition, the effect of transformation is illustrated.
Link to full-screen presentation1
1
http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/VarianceStructure.f.pdf
319
• extract more information from portions of the data which are more
precise.
October 18, 2001 Mixed Models Course 1
320
Graphical Investigation of the Variance Structure
Var(Y) = f(E(Y))
Ykl = αk + εkl
Good estimates for mean and variance in the kth group are
• Mean: ȳk·
• Variance: s²k = (1/(Lk − 1)) Σl (ykl − ȳk·)²
321
Variance Functions
After having found that the variance is non–constant, the next step
is to look for some structure in which it is non–constant.
Var(Y) = σ² µ^θ
With
Var(Y) = σ² µ^θ
we have a linear relationship on the log–scale:
log Var(Y) = log σ² + θ log µ
322
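The log-linear relationship suggests estimating θ as the slope of a regression of log s²k on log ȳk· over the groups. A simulated sketch (Python for illustration; the data and parameter values are invented):

```python
import math
import random

random.seed(2)
theta, sigma2 = 2.0, 0.04
group_means = [1.0, 2.0, 4.0, 8.0]

logm, logv = [], []
for mu in group_means:
    # Simulate a group with Var(Y) = sigma2 * mu**theta.
    sd = math.sqrt(sigma2 * mu ** theta)
    ys = [random.gauss(mu, sd) for _ in range(500)]
    ybar = sum(ys) / len(ys)
    s2 = sum((y - ybar) ** 2 for y in ys) / (len(ys) - 1)
    logm.append(math.log(ybar))
    logv.append(math.log(s2))

# Slope of the least-squares line of log s^2 on log ybar estimates theta.
mx, my = sum(logm) / len(logm), sum(logv) / len(logv)
theta_hat = (sum((x - mx) * (y - my) for x, y in zip(logm, logv))
             / sum((x - mx) ** 2 for x in logm))
print(theta_hat)  # close to theta = 2
```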
Example 2. A substance X14 has been added in the concentration fod ∈ {0.0, 4.4, 6.2, 9.3} to the food for some pigs. The pigs are fed this food until their weight is 60 kg. From then on, until they are slaughtered at 100 kg, their food does contain the substance.
[Figure: X14 muscle concentration plotted against dose fod (0–8), and the corresponding log group variances (logv).]
323
The Delta–method
324
Example 3. Let Y ∼ N (µ, σ 2). If h is linear, i.e. h(y) = α + βy,
then it is well known that
Z = h(Y ) ∼ N (α + βµ, β 2σ 2)
Taylor's Approximation
Let x0 and x be two numbers (not too far apart) and assume that h
is “nice” (i.e. differentiable).
325
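Example 3 extends to nonlinear h via the approximation Var(h(Y)) ≈ h′(µ)² Var(Y). A numeric check with h = log (Python for illustration; the values µ = 5 and σ = 0.1 are arbitrary):

```python
import math
import random

random.seed(3)
mu, sigma = 5.0, 0.1

def h(y):          # the transformation
    return math.log(y)

def dh(y):         # its derivative
    return 1.0 / y

# Simulate Z = h(Y) for Y ~ N(mu, sigma^2) and compare Var(Z) with
# the delta-method approximation h'(mu)^2 * sigma^2.
zs = [h(random.gauss(mu, sigma)) for _ in range(200_000)]
mz = sum(zs) / len(zs)
vz = sum((z - mz) ** 2 for z in zs) / len(zs)

print(vz, dh(mu) ** 2 * sigma ** 2)  # both are approximately 4e-4
```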
[Figure: a smooth increasing function f(x) plotted for x between 0 and 4.]
326
• From the approximation
327
Transformation of Data
σZ² = Var(h(Y)) ≈ h′(µ)² Var(Y) = h′(µ)² σ² µ^θ
• For later use let c = √(σZ²/σ²). Hence we look for a function h which satisfies that its derivative is
h′(µ) = c µ^(−θ/2)
328
For θ = 2 this gives
h(µ) = c log(µ)
and for θ ≠ 2
h(µ) = (2c/(2 − θ)) µ^((2−θ)/2).
329
With Var(Y) = σ² µ^θ there are some well known special cases: θ = 0 gives no transformation, θ = 1 the square root, θ = 2 the logarithm, and θ = 4 the reciprocal.
330
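For θ = 2 (standard deviation proportional to the mean) the derivation gives the log transform, and one can check empirically that the log-scale standard deviation is then roughly the same, namely σ, in every group (a Python sketch; all numbers are invented):

```python
import math
import random

random.seed(4)
sigma = 0.05   # Var(Y) = sigma**2 * mu**2, i.e. theta = 2

def group_sd(mu, transform, n=100_000):
    """Sd of transform(Y) for Y ~ N(mu, (sigma*mu)^2), by simulation."""
    zs = [transform(random.gauss(mu, sigma * mu)) for _ in range(n)]
    mz = sum(zs) / n
    return math.sqrt(sum((z - mz) ** 2 for z in zs) / n)

# After the log transform the sd is approximately sigma in every group.
print([round(group_sd(mu, math.log), 3) for mu in (1.0, 5.0, 25.0)])
```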
Consider the pig–feeding example from before and the model
where i is pig, s is sample and xi is the dose given to the ith pig.
• if εis ∼ N(0, σ1²) when xi = 0.0 and εis ∼ N(0, σ2²) when xi ≠ 0.0, there are two different variance parameters in the model.
331
332
Parts of the SAS output are:
Variance homogeneity: Residual 0.1262
-2 Res Log Likelihood 51.6
AIC (smaller is better) 53.6
AICC (smaller is better) 53.7
BIC (smaller is better) 55.4
333
• The dialyzers are evaluated in vitro using bovine blood and flow rates QB of either 200 or 300 ml/min.
[Figure: ultrafiltration rate ufr (0–60) plotted against transmembrane pressure tmp (0.5–3.0), one panel per QB level.]
• The plot also suggests that there might be individual curves for each membrane, i.e. to consider random regression coefficient models.
334
The starting point is the 4th degree polynomial model with
εim ∼ N(0, R)
With this program the data are treated as being equidistant in TMP, i.e. the actual difference between two TMP–measurements is not accounted for.
335
• Note that with the model above there are 7 × 8/2 = 28 parameters
in the covariance matrix.
• The variances increase with TMP, and hence the covariances increase
with the differences in TMP.
336
• A simple AR(1) model in which the ijth element of R is
Rij = σ² ρ^|i−j|
337
For the model with the unstructured covariance matrix, a plot of the
residuals against TMP gives some insight:
[Figure: residuals (−10 to 5) plotted against tmp (0.5–3.0), one panel per QB level.]
338
• This suggests that maybe we are not faced with variance
heterogeneity but rather with individual regression coefficients.
[Figure: residuals (−4 to 4) plotted against tmp (0.5–3.0), one panel per QB level.]
Yet, the curves are still somewhat “smooth” suggesting that some
within subject variation has yet to be accounted for.
339
• In the setup in this section the mean and variance parameters are
not estimated separately.
340
• We consider cases where the variance of the residuals is
Var(εi) = σ² |µi|^θ
• Since µi = xiᵀβ, the mixed model becomes complicated:
y = Xβ + Zu + ε
where
E(Y) = Xβ
Var(ε) = R(σ², β, θ) = diag(σ² |xiᵀβ|^θ)
341
• The trick is then to set β p equal to the new estimate for β and
repeat the iteration until the parameters stop changing.
342
2. Then the estimated parameters β are used as provisional parameters in the next iteration. (This happens in the repeated statement.) The estimated parameters of Var(u) are used as starting points for the maximization algorithm. (This happens in the parms statement.)
This step is not necessary, but it speeds up the procedure considerably:
343
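The alternation between estimating β and updating the variance weights can be sketched outside SAS for a fixed-effects-only analogue with Var(εi) = σ²|µi|^θ and θ known (Python; everything here is illustrative and much simpler than PROC MIXED's LOCAL=POM implementation):

```python
import random

random.seed(5)
theta = 2.0                       # assumed known in this sketch
beta_true, sigma = 3.0, 0.1
xs = [0.5 + 0.01 * i for i in range(500)]
ys = [beta_true * x + random.gauss(0.0, sigma * abs(beta_true * x) ** (theta / 2))
      for x in xs]

beta = 1.0                        # provisional parameter
for _ in range(20):
    # Weighted least squares with weights 1 / Var_i, Var_i ~ |x * beta|**theta.
    w = [1.0 / abs(x * beta) ** theta for x in xs]
    beta = (sum(wi * x * y for wi, x, y in zip(w, xs, ys))
            / sum(wi * x * x for wi, x in zip(w, xs)))
print(beta)  # settles close to beta_true = 3.0
```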
Also the curves are less smooth than before, suggesting that more of the within subject variation has now been accounted for.
Residuals, POM, QB = 200 and QB = 300
[Figure: residuals (−6 to 4) plotted against tmp (0.5–3.0), one panel per QB level.]
344
Something about transformations, the normal approximation and confidence intervals
yi = ki / li,   i = 1, . . . , 250
For various reasons one decides to sell the car in the USA, where fuel economy is usually stated as "gallons per 100 miles". To keep things simple we instead consider "litres per 100 km", namely
zi = 100 · li / ki = 100 · (1/yi).
That is, we transform the data as zi = h(yi) = 100/yi.
345
[Figure: histograms and normal QQ–plots of y (km per litre) and z = 100/y (litres per 100 km).]
One can trace a slightly right-skewed distribution for the zi's, but otherwise the data seem to be reasonably well described by a normal distribution. That is, with some justification one can work with 100/Yi being approximately normally distributed.
[Figure: further histograms and normal QQ–plots of y and z = 100/y.]
346
the variance of the transformed data.
It is seen that 100/µ̂ is a good approximation to E(Z) = η, and likewise that 10000 σ̂²/µ̂⁴ is a reasonable approximation to Var(Z) = τ², when the spread is small. It also appears that when the spread becomes large, the approximation to Var(Z) = τ² in particular becomes poor.
347
reciprocal data:
100/µ̂ = 100 ((1/n) Σi yi)⁻¹
η̂ = (1/n) Σi zi = 100 (1/n) Σi (1/yi)
If one decides that the unit "litres per 100 km" is the relevant quantity, there are thus two ways of obtaining it: either as an average of the transformed data, or as a transformation of the mean of the original data.
348
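That the two routes give different numbers follows from Jensen's inequality and is easy to see on a small data set (Python; the km-per-litre values are invented):

```python
y = [14.0, 12.5, 15.2, 13.1, 16.0]   # km per litre

via_mean = 100.0 / (sum(y) / len(y))                     # transform the mean
via_transformed = sum(100.0 / yi for yi in y) / len(y)   # mean of transformed

print(via_mean, via_transformed)
# The mean of 100/y is always >= 100/mean(y) (Jensen's inequality).
```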
Transformation and confidence intervals
Zi = α + θxi + εi.
349
increase of Y, that is, on the original scale, when x is increased by one unit. Popularly speaking, one wants to express θ "on the original scale".
This is often done as follows. Let h⁻¹ be the inverse function of h. Then h⁻¹(θ) is taken as an expression of θ "on the original scale".
One therefore applies h⁻¹ to the estimated value θ̂, which gives η̂ = h⁻¹(θ̂). The confidence limits on the transformed scale can also be transformed back with h⁻¹:
Ŷlow = h⁻¹(Ẑlow)
Ŷhigh = h⁻¹(Ẑhigh)
If h is decreasing, the limits are swapped:
Ŷlow = h⁻¹(Ẑhigh)
Ŷhigh = h⁻¹(Ẑlow)
350
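Back-transforming a confidence interval, including the swap of limits for a decreasing h, can be written directly (Python sketch; h(y) = 100/y as in the example, and the interval endpoints are made up):

```python
def back_transform_ci(z_low, z_high, h_inv, decreasing=False):
    """Map a CI on the transformed scale back with h^{-1}.

    For a decreasing h the back-transformed limits must be swapped.
    """
    a, b = h_inv(z_low), h_inv(z_high)
    return (b, a) if decreasing else (a, b)

# h(y) = 100/y is decreasing, and h^{-1}(z) = 100/z.
y_low, y_high = back_transform_ci(6.5, 7.5, lambda z: 100.0 / z, decreasing=True)
print(y_low, y_high)  # (100/7.5, 100/6.5) = (13.33..., 15.38...)
```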
the transformed scale is approximately given by
E(Y) ≈ h⁻¹(E(Z))
Var(Y) ≈ Var(Z) / [h′(E(Y))]² = Var(Z) / [h′(h⁻¹(E(Z)))]²
η̂ = h⁻¹(θ̂)
σ̃η = σ̂θ / |h′(η̂)|
There are, however, as far as is known, no good formal arguments for calling
351
zi = h(yi)
Zi = α + βxi + εi
352
For simplicity we assume that there are so many observations that the t distribution resembles a normal distribution. Thereby t1−α/2(d) ≈ 1.96 for α = 0.05.
Note first that
h(y) = √y = y^(1/2), whereby
h⁻¹(y) = y² and
h′(y) = 1/(2√y).
353
σ̃η = σ̂β / |h′(η̂)|
    = σ̂β / 2 = 0.015
since σ̂β = 0.03. We now get
354
20 Variance heterogeneity: Example of effect of transformation
Link to full-screen presentation1
1
http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/VariansHetero.f.pdf
355
Variance homogeneity
• Variance homogeneity
• Transformation as a solution
• Effect of back-transformation.
Variance homogeneity
Yij = µ + αi + εij
εij ∼ N (0, σ 2)
356
Variance homogeneity
357
Variance of an average
Ȳ = (1/nobs) Σi Yi
V(Ȳ) = (1/nobs) σY²
The magnitude of variance inhomogeneity can be assessed by using this as an analogue.
Example
358
Mean curve
[Figure: mean muscle concentration at 60 kg, y, plotted against feed contents x (0–8).]
October 12, 2001
Transformation ?
[Figure: group variance plotted against group mean (left), and log(variance) against log(mean) (right).]
359
Model of expectations
E(y) = µ + αi
E(√y) = µ + αi ⇒ E(y) = µ² + αi² + 2µαi
E(log(y)) = µ + αi ⇒ E(y) = exp(µ) exp(αi)
Curve fitting
360
Model comparison
Dependent variable          y                       √y
Parameter        Estimate   P-value      Estimate   P-value
β1               0.438      0.081***     0.242      0.026***
β2               −0.007     0.008        −0.010     0.003**
Sqrt transformed
[Figure: muscle concentration at 60 kg plotted against feed contents x, on the original scale y (left) and the square-root scale √y (right).]
361
Comparisons
[Figure: y, muscle concentration at 60 kg, plotted against feed contents x; two panels comparing the fitted models.]
Treatment differences
ESTIMATE.
How do we transform ??
362
Conclusion
Natural scales ?
• Geometric cell-count
363
21 Variance Homogeneity: Diurnal Variation
The purpose of this lecture was to illustrate the application and combination of some of the
advanced topics presented during the course.
A data set consisting of half-hourly observations of cortisol release in pigs was analysed using a
random regression model to capture the individual difference between pigs in diurnal variation.
The power-of-mean approach was used to model the variance heterogeneity.
The application of such a model requires iterative use of PROC MIXED.
The experience with the model was that it was possible to estimate the model parameters, but that it was necessary to 'nudge' the procedure to secure convergence of the iterative calculations, and that the calculations were very time-consuming. At the current state of the art, the application of such models is not a routine matter.
Link to full-screen presentation1
1
http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/PowerOfMean.f.pdf
365
Example
cos(2π tijk/24) and sin(2π tijk/24) are covariates for estimation of the diurnal variation. βk and Bkj are the corresponding regression parameters: βk is a systematic effect and Bkj a random deviation from the line. The random effects (Aij, B1k, B2k)ᵀ ∼ N3(0, V), where V is a 3 × 3 variance matrix, and εijk ∼ N(0, σ²).
May 2, 2001 2
366
Random regression model
*Initial model ;
data a ;
....
PI=3.141593 ;
sint=sin(time*2*pi/24) ;
cost=cos(time*2*pi/24) ;
Result examples
[Figure: log(Cortisol) (3.0–6.0) plotted against time in hours (15–35), one panel per example animal.]
367
Model of Mean ?
368
SAS Model
*Initial model ;
proc mixed CL data=a ;
class kuld beh dyr ;
model cortisol = beh sint cost /ddfm=satterth s;
random intercept sint cost / subject=dyr*kuld*beh type=un ;
repeated / subject=dyr*kuld*beh local ;
ods output SolutionF=sf ;
ods output Covparms=cp ;
run;
* Loop ;
proc mixed CL data=a maxiTER=100 CONVH=1e-8;
class kuld beh dyr ;
model cortisol = beh sint cost /ddfm=satterth s;
random intercept sint cost /
subject=dyr*kuld*beh type=un s ;
repeated /local=pom(sf) ;
parms /pdata=cp ;
ods output SolutionF=sf1 ;
ods output SolutionR=Coeff ;
ods output Covparms=cp1 ;
run ;
369
Experience
370
22 Links to supplementary material
In order to illustrate the underlying principles in linear algebra it was necessary to introduce
a method for performing the calculations. For that purpose the IML procedure of SAS was
introduced using the small program in ImlExample.sas1
Several SAS macros were introduced for performing standard calculations, e.g., a SAS macro
for calculation of autocorrelations2 . The biometry research unit has further SAS macros and
examples on this web-page3 .
The book used for the course, LMSW (Littell et al., 1996), contains a series of program examples. These examples may be downloaded from the SAS Institute's home pages, but can be found here4 as well. Another important link is the SAS online manual5.
Finally, most of the course participants used Word for text processing and SAS for making graphs. Getting these two programs to interact satisfactorily was clearly a problem. Therefore a short note, Eksport af grafer fra SAS til Word6, was made, with references to SAS tech. report ts252x7, where the export facilities are discussed in detail.
1
http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/ImlExample.sas
2
http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/SAS/autocorr.sas
3
http://www.jbs.agrsci.dk/Biometri/SASmateriale/SASmateriale.html
4
http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/SAS/sasmixed.sas
5
http://dokumentation.agrsci.dk/sasdocv8/sasdoc/sashtml/onldoc.htm
6
http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/SAS2Word.pdf
7
http://www.jbs.agrsci.dk/biometri/Courses/HSVmixed2001/ts252x.pdf
371
Bibliography
Littell, R.C., G.A. Milliken, W.W. Stroup, & R.D. Wolfinger (1996). SAS System for Mixed
Models. SAS Institute, Inc., Cary, NC.
373