Introduction
In this section we generalize the concept of t-tests to the multivariate situation. In
the two-sample problem, we will assume equality of covariance matrices. Later we will
develop tests for equality of covariance matrices and for other patterns on the covariance
matrix. Just as the t-test generalizes in the univariate case to the analysis of variance,
we will make that generalization in the multivariate case. We will see that
SAS does not have a PROC specifically devoted to the multivariate t-test, so we will
develop an IML program. Later we will see how to use the multivariate analysis of
variance procedure to perform the test.
Hotelling T2 Statistic
The problem is as follows: given a random sample of size $n$ from a $p$-variate normal
distribution, $N(\mu, D)$, that is, the data matrix $X$, we want to test the hypothesis

$$H\colon\ \mu = \mu_0.$$

For $p = 1$, the familiar test statistic is

$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}.$$
To motivate the statistic, note that this problem is similar to the classification problem:
did these data come from a population with mean $\mu_0$, or from a population with
some other mean? The decision is based on the distance from $\bar{x}$ to $\mu_0$. Equivalently, it is based on the ratio
of the likelihood functions for $\mu = \mu_0$ and $\mu = \bar{x}$. This leads to the Hotelling statistic

$$T^2 = n(\bar{x} - \mu_0)^T \hat{D}^{-1}(\bar{x} - \mu_0)$$

where $\hat{D}$ is the sample covariance matrix; under $H$, $\frac{(n-p)}{p(n-1)}T^2$ has the $F(p,\ n-p)$ distribution.
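The statistic is straightforward to compute in any matrix language; here is a minimal Python/numpy sketch (the function name and simulated data are illustrative, not part of the original notes):

```python
import numpy as np

def hotelling_t2_one_sample(X, mu0):
    """One-sample Hotelling T^2 for H: mu = mu0.

    Returns (T^2, F) where F = ((n - p) / (p (n - 1))) * T^2 is referred
    to an F(p, n - p) distribution.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    xbar = X.mean(axis=0)
    Xc = X - xbar
    D_hat = Xc.T @ Xc / (n - 1)          # sample covariance matrix
    d = xbar - np.asarray(mu0, dtype=float)
    t2 = float(n * d @ np.linalg.solve(D_hat, d))
    f_stat = (n - p) / (p * (n - 1)) * t2
    return t2, f_stat
```

For $p = 1$ this reduces to the square of the univariate t-statistic above.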
Example
Recall the SWEAT data from a previous exercise and suppose the question is raised as to
whether these 20 females were selected from a population of females whose mean vector
is known to be $\mu_0^T = (4\ \ 50\ \ 10)$ on the three variables, $x_1$ = sweat rate, $x_2$ = sodium
content and $x_3$ = potassium content. From the 20 observations, we compute

$$\bar{x}^T = (4.64\ \ 45.4\ \ 9.97), \qquad \hat{D} = \begin{pmatrix} 2.9 & 10.0 & -1.8 \\ 10.0 & 199.8 & -5.6 \\ -1.8 & -5.6 & 3.6 \end{pmatrix}.$$
Invariance
Since the variances in this example are quite different, it is natural to ask whether we should
have standardized the data before performing the test. The answer is that it does not
make any difference. In fact, if $C$ is any non-singular matrix of size $p$ and $d$ is any $p$-vector,
we can compute $y = Cx + d$ and the $T^2$ statistic based on $y$ will be identical to the
one based on $x$.
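This invariance is easy to check numerically. The following sketch (numpy assumed; the non-singular $C$ and shift $d$ are arbitrary choices for illustration) recomputes the statistic after the transformation $y = Cx + d$:

```python
import numpy as np

def t2_stat(X, mu0):
    """One-sample Hotelling T^2 statistic."""
    n, p = X.shape
    xbar = X.mean(axis=0)
    Xc = X - xbar
    D_hat = Xc.T @ Xc / (n - 1)
    d = xbar - mu0
    return float(n * d @ np.linalg.solve(D_hat, d))

rng = np.random.default_rng(1)
X = rng.normal(size=(20, 3))
mu0 = np.zeros(3)

C = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 1.0]])        # any non-singular p x p matrix
d = np.array([5.0, -2.0, 7.0])         # any p-vector

Y = X @ C.T + d                        # rows are y_i = C x_i + d
mu0_y = C @ mu0 + d                    # the hypothesis transforms the same way

# t2_stat(X, mu0) and t2_stat(Y, mu0_y) agree to machine precision
```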
To see why $T^2$ arises from the likelihood ratio, let $A(\mu) = \sum_{i=1}^{n}(x_i - \mu)(x_i - \mu)^T$
and note that $A(\mu_0) = A(\bar{x}) + n(\bar{x} - \mu_0)(\bar{x} - \mu_0)^T$. Consider the determinant

$$|A(\mu_0)| = \left|A(\bar{x}) + n(\bar{x} - \mu_0)(\bar{x} - \mu_0)^T\right| = |A(\bar{x})|\left|1 + n(\bar{x} - \mu_0)^T A(\bar{x})^{-1}(\bar{x} - \mu_0)\right|.$$

It follows that

$$\frac{|A(\bar{x})|}{|A(\mu_0)|} = \frac{1}{1 + \dfrac{T^2}{n-1}}$$

so the likelihood ratio is a decreasing function of $T^2$, and rejecting for small values of
the likelihood ratio is the same as rejecting for large values of $T^2$.
Union-Intersection Test
Let $a$ be an arbitrary $p$-vector and consider $y = a^T x \sim N(a^T\mu,\ a^T D a)$. Then consider the
hypothesis $H(a)\colon\ E[y] = \mu_{y0}$ based on the data $Y = Xa$. The t-statistic for this univariate
test is

$$t(a) = \frac{\bar{y} - \mu_{y0}}{s_y/\sqrt{n}}.$$

Noting that $\mu_{y0} = a^T\mu_0$, $\bar{y} = a^T\bar{x}$ and $s_y^2 = a^T\hat{D}a$, we see that

$$t^2(a) = \frac{n\, a^T(\bar{x} - \mu_0)(\bar{x} - \mu_0)^T a}{a^T\hat{D}a}$$

and we would accept the hypothesis $H(a)$ if $t^2(a) \le F(\alpha,\ 1,\ n-1)$. Since $\mu = \mu_0$ is
equivalent to $a^T\mu = a^T\mu_0$ for every vector $a$, acceptance of the hypothesis $H\colon\ \mu = \mu_0$
is equivalent to acceptance of the hypothesis $H(a)$ for every vector $a$. This implies that
the test should be based on $\max_a t^2(a)$. Maximizing the numerator subject to $a^T\hat{D}a = 1$
leads to the eigenvalue problem

$$\left[n\hat{D}^{-1}(\bar{x} - \mu_0)(\bar{x} - \mu_0)^T - \lambda I\right]a = 0.$$

Since the matrix has rank one, there is only one non-zero eigenvalue and eigenvector, given by

$$\lambda = n(\bar{x} - \mu_0)^T\hat{D}^{-1}(\bar{x} - \mu_0), \qquad a = \hat{D}^{-1}(\bar{x} - \mu_0).$$

Thus the maximum is given by the $T^2$ statistic and we are led to the same test.
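The union-intersection argument can be verified numerically: the squared univariate statistic $t^2(a)$ never exceeds $T^2$, and attains it at $a = \hat{D}^{-1}(\bar{x} - \mu_0)$. A sketch with simulated data (numpy assumed):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(25, 4)) + 0.3       # simulated data, true mean shifted
mu0 = np.zeros(4)

n, p = X.shape
xbar = X.mean(axis=0)
Xc = X - xbar
D_hat = Xc.T @ Xc / (n - 1)

def t2_a(a):
    """Squared univariate t-statistic for y = a^T x, testing E[y] = a^T mu0."""
    num = n * float(a @ (xbar - mu0)) ** 2
    return num / float(a @ D_hat @ a)

T2 = float(n * (xbar - mu0) @ np.linalg.solve(D_hat, xbar - mu0))
a_star = np.linalg.solve(D_hat, xbar - mu0)   # the maximizing direction

# t2_a(a_star) equals T2; any other direction a gives a smaller value
```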
Confidence Regions
Recall the univariate confidence interval

$$\mu\colon\ \bar{x} \pm t(\alpha/2,\ n-1)\, s/\sqrt{n}.$$

The analog of that in the multivariate case is the confidence region given by the $p$-dimensional
ellipsoid centered at $\bar{x}$ given by

$$n(\mu - \bar{x})^T\hat{D}^{-1}(\mu - \bar{x}) \le \frac{p(n-1)}{(n-p)}\, F(\alpha,\ p,\ n-p).$$
Thus, we accept as reasonable any vector $\mu$ that lies in this ellipsoid. Since this is hard
to visualize in more than two dimensions, a common practice is to write confidence
intervals on the individual components or on linear functions of them. Since we are writing
several intervals, the confidence coefficient based on a single interval is no longer
applicable, and alternatives have been suggested. For example, if we are writing $k$
intervals, a simple suggestion is to use $t\!\left(\frac{\alpha}{2k},\ n-1\right)$ in the above univariate expression
with $s$ replaced by $\sqrt{\hat{d}_{ii}}$. This is known as the Bonferroni method. In addition to the $p$
components of $\mu$, we may be interested in other linear functions, say $a^T\mu$, and $k$ can be
quite large, making these intervals very wide. An alternative, suggested by the Union-Intersection
principle, is to use the intervals

$$a^T\mu\colon\ a^T\bar{x} \pm \sqrt{\frac{a^T\hat{D}a}{n}\,\frac{p(n-1)}{(n-p)}\, F(\alpha,\ p,\ n-p)}.$$

These intervals, called simultaneous confidence intervals, although wider than the simple
t-statistic intervals, more correctly reflect the confidence coefficient. Since we often
write many such intervals, they are generally better than the Bonferroni intervals, which are
given by

$$a^T\mu\colon\ a^T\bar{x} \pm t\!\left(\frac{\alpha}{2k},\ n-1\right)\sqrt{\frac{a^T\hat{D}a}{n}}.$$
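Both families of component-wise intervals can be sketched as follows (numpy and scipy assumed; the function name is mine). The sketch writes one interval per component, so $k = p$:

```python
import numpy as np
from scipy import stats

def mean_intervals(X, alpha=0.05):
    """Bonferroni and T^2 simultaneous intervals for the components of mu.

    Returns (bonferroni, simultaneous), each a (p, 2) array of endpoints.
    """
    n, p = X.shape
    xbar = X.mean(axis=0)
    Xc = X - xbar
    D_hat = Xc.T @ Xc / (n - 1)
    se = np.sqrt(np.diag(D_hat) / n)       # standard errors sqrt(d_ii / n)

    k = p                                   # one interval per component
    bonf_mult = stats.t.ppf(1 - alpha / (2 * k), n - 1)
    simu_mult = np.sqrt(p * (n - 1) / (n - p) * stats.f.ppf(1 - alpha, p, n - p))

    bonf = np.column_stack([xbar - bonf_mult * se, xbar + bonf_mult * se])
    simu = np.column_stack([xbar - simu_mult * se, xbar + simu_mult * se])
    return bonf, simu
```

With $k = p$ small, the Bonferroni intervals are the narrower of the two; the simultaneous intervals win only when many linear functions are examined.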
More generally, we may consider hypotheses of the form $H\colon\ M\mu = m$, where $M$ is a matrix
of full row rank. For example, with $p = 4$, let

$$M = \begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & -1 \end{pmatrix}, \qquad m = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$

Thus,

$$M\mu = \begin{pmatrix} \mu_1 - \mu_2 \\ \mu_2 - \mu_3 \\ \mu_3 - \mu_4 \end{pmatrix}$$

and we see that this matrix is testing the hypothesis that all elements of the mean vector
are the same. The $T^2$ statistic generalizes as follows:

$$T^2 = n(M\bar{x} - m)^T\left(M\hat{D}M^T\right)^{-1}(M\bar{x} - m).$$

The concept of simultaneous confidence intervals extends directly. Thus we can write
intervals on linear functions of the form $a^T M\mu$.
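A sketch of the generalized statistic (numpy assumed; the $M$ below is the $p = 4$ equal-means example from the text):

```python
import numpy as np

def t2_contrast(X, M, m):
    """T^2 = n (M xbar - m)^T (M D_hat M^T)^{-1} (M xbar - m)."""
    n, p = X.shape
    xbar = X.mean(axis=0)
    Xc = X - xbar
    D_hat = Xc.T @ Xc / (n - 1)
    d = M @ xbar - m
    return float(n * d @ np.linalg.solve(M @ D_hat @ M.T, d))

# differencing matrix for p = 4: tests mu_1 = mu_2 = mu_3 = mu_4
M = np.array([[1.0, -1.0, 0.0, 0.0],
              [0.0, 1.0, -1.0, 0.0],
              [0.0, 0.0, 1.0, -1.0]])
m = np.zeros(3)
```

Equivalently, this is the plain one-sample $T^2$ computed from the transformed data $Y = XM^T$ with hypothesized mean $m$.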
Often, the $p$ responses on a subject are taken over time. That is, the $p$ columns of the $X$
matrix represent observations taken at various points in time. In this case, we may ask if
there is a difference in response at different points in time. The hypothesis of equal
means discussed above would be appropriate. If this hypothesis is rejected, we might ask
if there is a linear trend. That is, is the rate of change from one time to the next constant?
Assuming that the time periods are equally spaced, this says that the difference between
consecutive means is constant. The hypothesis of a linear trend is described (for $p = 4$)
as
$$H_0\colon\ \mu_2 - \mu_1 = \mu_3 - \mu_2 = \mu_4 - \mu_3$$

or, equivalently,

$$H_0\colon\ \mu_3 - 2\mu_2 + \mu_1 = 0, \qquad \mu_4 - 2\mu_3 + \mu_2 = 0.$$

In the form $H\colon\ M\mu = m$, this corresponds to

$$M = \begin{pmatrix} 1 & -2 & 1 & 0 \\ 0 & 1 & -2 & 1 \end{pmatrix}, \qquad m = \begin{pmatrix} 0 \\ 0 \end{pmatrix}.$$
A plot of the sample means over time helps to visualize the situation. If the hypothesis of
a linear trend is accepted, we would see that these means lie roughly on a straight line,
and we might be interested in the equation of that line. This suggests fitting a linear
regression model to the sample means. Thus, we would consider the linear model

$$\bar{x}_i = b_0 + b_1 t_i + e_i$$

where $t_i$ denotes the $i$th time period. Ordinary linear regression is not appropriate in this
case, since the $\bar{x}_i$ are not independent. Recall that, if $\bar{x}$ denotes the vector of means, then
$\mathrm{Var}(\bar{x}) = D/n$. We thus consider a weighted least squares estimator. To describe this,
let $B$ denote the matrix whose first column is all ones and whose second column is
$(t_1, t_2, \ldots, t_p)^T$. Then the estimator is given by

$$b = \begin{pmatrix} b_0 \\ b_1 \end{pmatrix} = \left(B^T\hat{D}^{-1}B\right)^{-1}B^T\hat{D}^{-1}\bar{x}.$$
Clearly, one might consider a more complex relation, for example, a quadratic in time.
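A sketch of the weighted (generalized) least squares fit (numpy assumed; the function name is mine). Note that the factor $1/n$ in $\mathrm{Var}(\bar{x}) = D/n$ cancels out of the estimator, so $\hat{D}^{-1}$ can be used directly as the weight matrix:

```python
import numpy as np

def gls_trend(X, t):
    """Weighted least squares fit of xbar_i = b0 + b1 * t_i.

    Weights come from D_hat^{-1}; the 1/n in Var(xbar) = D/n cancels.
    """
    n, p = X.shape
    xbar = X.mean(axis=0)
    Xc = X - xbar
    D_hat = Xc.T @ Xc / (n - 1)

    B = np.column_stack([np.ones(p), np.asarray(t, dtype=float)])
    W = np.linalg.inv(D_hat)
    b = np.linalg.solve(B.T @ W @ B, B.T @ W @ xbar)
    return b                              # (b0, b1)
```

When the sample means lie exactly on a line, the fit recovers the line regardless of the weights; for a quadratic in time, one would simply append a column of $t_i^2$ to $B$.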
Two-Sample Problem
Suppose we have data from two normal populations with the same covariance matrix $D$
but possibly different mean vectors, $\mu_1$ and $\mu_2$. We wish to test the hypothesis

$$H\colon\ \mu_1 = \mu_2.$$

Let the data matrices be $X_1$ of size $n_1 \times p$ and $X_2$ of size $n_2 \times p$. Define the sample mean
vectors, $\bar{x}_1$ and $\bar{x}_2$, and sample covariance matrices $\hat{D}_1$ and $\hat{D}_2$. Since we are assuming the
covariance matrices are equal, we compute the 'pooled' covariance matrix

$$\hat{D} = \frac{(n_1 - 1)\hat{D}_1 + (n_2 - 1)\hat{D}_2}{n_1 + n_2 - 2}.$$

The test statistic is

$$T^2 = \frac{n_1 n_2}{n_1 + n_2}(\bar{x}_1 - \bar{x}_2)^T\hat{D}^{-1}(\bar{x}_1 - \bar{x}_2)$$

and the hypothesis is rejected if

$$\frac{(n_1 + n_2 - p - 1)}{p(n_1 + n_2 - 2)}\, T^2 \ge F(\alpha,\ p,\ n_1 + n_2 - p - 1).$$
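A sketch of the two-sample computation (numpy assumed; the function name is mine):

```python
import numpy as np

def hotelling_t2_two_sample(X1, X2):
    """Two-sample Hotelling T^2 with pooled covariance.

    Returns (T^2, F) where F is referred to F(p, n1 + n2 - p - 1).
    """
    n1, p = X1.shape
    n2, _ = X2.shape
    xb1, xb2 = X1.mean(axis=0), X2.mean(axis=0)
    S1 = (X1 - xb1).T @ (X1 - xb1) / (n1 - 1)
    S2 = (X2 - xb2).T @ (X2 - xb2) / (n2 - 1)
    Dp = ((n1 - 1) * S1 + (n2 - 1) * S2) / (n1 + n2 - 2)   # pooled covariance
    d = xb1 - xb2
    t2 = float((n1 * n2) / (n1 + n2) * d @ np.linalg.solve(Dp, d))
    f_stat = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * t2
    return t2, f_stat
```

For $p = 1$ this reduces to the square of the pooled two-sample t-statistic.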
As in the one-sample case, we may test hypotheses of the form $H\colon\ M(\mu_1 - \mu_2) = 0$,
where $M$ is a $q \times p$ matrix. The statistic is

$$T^2 = \frac{n_1 n_2}{n_1 + n_2}\left(M(\bar{x}_1 - \bar{x}_2)\right)^T\left(M\hat{D}M^T\right)^{-1}M(\bar{x}_1 - \bar{x}_2)$$

and the simultaneous confidence intervals are

$$a^T M(\mu_1 - \mu_2)\colon\ a^T M(\bar{x}_1 - \bar{x}_2) \pm \sqrt{\frac{(n_1 + n_2)}{n_1 n_2}\, a^T M\hat{D}M^T a\ \frac{q(n_1 + n_2 - 2)}{(n_1 + n_2 - q - 1)}\, F(\alpha,\ q,\ n_1 + n_2 - q - 1)}.$$

For example, with $a^T = (1\ 0\ \cdots\ 0)$, we are interested in the first row of $M$. With
$a^T = (1\ {-1}\ 0\ \cdots\ 0)$ we are interested in the difference of the first two rows, etc.
Recall the concept of repeated measurements discussed earlier. That is, we now assume
that we have data on two populations taken over time and are interested in examining the
relation beyond simply asking if the mean vectors are the same. Several hypotheses are
of interest. Again these are motivated by a plot of the sample means for the two groups
as a function of time. This plot is sometimes called a "profile". Examining this plot
suggests three questions that we might ask.
1. Are the profiles for the two groups similar in the sense that the line segments of
adjacent tests are parallel?
2. If the profiles are parallel, are they at the same level?
3. If the profiles are parallel, are the means of the tests different?
Another way to look at the Profile Plot is to display the means for the two groups in a
two-way table as follows:

                            Time
                1            2            3            4           Mean
Group 1    $\mu_{11}$   $\mu_{12}$   $\mu_{13}$   $\mu_{14}$   $\bar{\mu}_{1.}$
      2    $\mu_{21}$   $\mu_{22}$   $\mu_{23}$   $\mu_{24}$   $\bar{\mu}_{2.}$
Mean       $\bar{\mu}_{.1}$   $\bar{\mu}_{.2}$   $\bar{\mu}_{.3}$   $\bar{\mu}_{.4}$
To answer the first question, we note that the parallelism condition is that the
difference of the means in consecutive time periods is the same in each group. Thus, we
want to test the hypothesis

$$H\colon\ M(\mu_1 - \mu_2) = 0, \qquad M = \begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & -1 \end{pmatrix}.$$
Letting $\mu_{ij}$ denote the mean response in group $i$, $i = 1, 2$, at time $j$, $j = 1, 2, \ldots, p$, suppose
we want to check that the average effect over time is the same in the two groups. The
hypothesis of interest is

$$H_0\colon\ \sum_{j=1}^{p}\mu_{1j} = \sum_{j=1}^{p}\mu_{2j}.$$

In our general notation, this corresponds to letting $M = (1, 1, \ldots, 1)$ and testing the
hypothesis

$$H_0\colon\ M(\mu_1 - \mu_2) = 0.$$
In analysis of variance terminology, this is known as the "marginal means, main effect
test". Note that we can test this hypothesis even if the parallelism hypothesis is rejected,
but the interpretation is not as strong.
Question 3. Are the means of the tests different? (Time Main Effect)
In terms of the two-way table, this is the hypothesis that the marginal time means,
$\mu_{.j} = \sum_{i=1}^{2}\mu_{ij}$, are all equal. It can be written as

$$H_0\colon\ M(\mu_1 + \mu_2) = 0$$

with

$$M = \begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & -1 \end{pmatrix}.$$

The statistic is

$$T^2 = \frac{n_1 n_2}{n_1 + n_2}\left(M(\bar{x}_1 + \bar{x}_2)\right)^T\left(M\hat{D}M^T\right)^{-1}M(\bar{x}_1 + \bar{x}_2)$$

and the hypothesis is rejected if

$$T^2 \ge \frac{q(n_1 + n_2 - 2)}{(n_1 + n_2 - q - 1)}\, F(\alpha,\ q,\ n_1 + n_2 - q - 1).$$
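The three profile questions can be sketched as a set of $T^2$ computations with the appropriate $M$ (numpy assumed; the function name is mine, and the time-effect test uses $M(\bar{x}_1 + \bar{x}_2)$ as above):

```python
import numpy as np

def profile_tests(X1, X2):
    """Profile-analysis sketch: parallelism, equal levels, and time effect."""
    n1, p = X1.shape
    n2, _ = X2.shape
    xb1, xb2 = X1.mean(axis=0), X2.mean(axis=0)
    S1 = (X1 - xb1).T @ (X1 - xb1) / (n1 - 1)
    S2 = (X2 - xb2).T @ (X2 - xb2) / (n2 - 1)
    Dp = ((n1 - 1) * S1 + (n2 - 1) * S2) / (n1 + n2 - 2)   # pooled covariance
    c = n1 * n2 / (n1 + n2)

    # (p-1) x p differencing matrix: rows (1, -1, 0, ...), (0, 1, -1, ...), ...
    M = np.eye(p - 1, p) - np.eye(p - 1, p, k=1)

    # 1. parallelism: M(mu1 - mu2) = 0
    d = M @ (xb1 - xb2)
    t2_par = float(c * d @ np.linalg.solve(M @ Dp @ M.T, d))

    # 2. equal levels: (1,...,1)(mu1 - mu2) = 0
    one = np.ones(p)
    t2_lev = float(c * (one @ (xb1 - xb2)) ** 2 / (one @ Dp @ one))

    # 3. time main effect: M(mu1 + mu2) = 0
    s = M @ (xb1 + xb2)
    t2_time = float(c * s @ np.linalg.solve(M @ Dp @ M.T, s))

    return t2_par, t2_lev, t2_time
```

Each statistic is then referred to its scaled $F$ distribution with the appropriate $q$ ($q = p - 1$ for tests 1 and 3, $q = 1$ for test 2).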
In the next section we will extend this concept to more than two groups.