Slides

Theory of the Multinormal
MVA: Humboldt-Universität zu Berlin

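Elementary Properties of the Multinormal
The pdf of $X \sim N_p(\mu, \Sigma)$ is given by:
$$f(x) = |2\pi\Sigma|^{-1/2} \exp\left\{ -\tfrac{1}{2}(x - \mu)^\top \Sigma^{-1} (x - \mu) \right\}$$
The expectation and variance are respectively given by:
$$E(X) = \mu, \qquad \operatorname{Var}(X) = \Sigma$$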
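
The density formula can be evaluated numerically. The following is a minimal sketch, assuming NumPy is available; the function name `multinormal_pdf` and the example parameters are illustrative and not part of the slides.

```python
import numpy as np

def multinormal_pdf(x, mu, sigma):
    """Density of N_p(mu, sigma) at the point x (sigma must be positive definite)."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    diff = x - mu
    # |2*pi*Sigma|^(-1/2): determinant of the scaled covariance matrix
    norm_const = np.linalg.det(2.0 * np.pi * sigma) ** -0.5
    # Quadratic form (x - mu)^T Sigma^{-1} (x - mu), computed with a linear solve
    quad = diff @ np.linalg.solve(sigma, diff)
    return norm_const * np.exp(-0.5 * quad)

# Hypothetical example values (not from the slides)
mu = np.array([0.0, 0.0])
sigma = np.array([[1.0, 0.5], [0.5, 2.0]])
density_at_mean = multinormal_pdf(mu, mu, sigma)
```

Using `np.linalg.solve` avoids forming the inverse of the covariance matrix explicitly.
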

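Theorem
Let $X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N_p(\mu, \Sigma)$, $X_1 \in \mathbb{R}^r$, $X_2 \in \mathbb{R}^{p-r}$. Define $X_{2.1} = X_2 - \Sigma_{21}\Sigma_{11}^{-1} X_1$ from the partitioned covariance matrix
$$\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}.$$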

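Theorem
Then $X_1 \sim N_r(\mu_1, \Sigma_{11})$ and $X_{2.1} \sim N_{p-r}(\mu_{2.1}, \Sigma_{22.1})$ are independent with
$$\mu_{2.1} = \mu_2 - \Sigma_{21}\Sigma_{11}^{-1}\mu_1, \qquad \Sigma_{22.1} = \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12}.$$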
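
The independence statement can be checked at the level of covariances: the map from $X$ to $(X_1, X_{2.1})$ is linear, and it turns $\Sigma$ into a block-diagonal matrix. A small sketch with a hypothetical $3 \times 3$ covariance matrix (values chosen only for illustration):

```python
import numpy as np

# Hypothetical covariance matrix with p = 3 and r = 1 (illustration only)
sigma = np.array([[2.0, 0.6, 0.4],
                  [0.6, 1.0, 0.3],
                  [0.4, 0.3, 1.5]])
r = 1
s11, s12 = sigma[:r, :r], sigma[:r, r:]
s21, s22 = sigma[r:, :r], sigma[r:, r:]

# (X1, X2.1) = T X with T = [[I, 0], [-Sigma21 Sigma11^{-1}, I]]
B = s21 @ np.linalg.inv(s11)
p = sigma.shape[0]
T = np.eye(p)
T[r:, :r] = -B

# Covariance of the transformed vector: T Sigma T^T
cov_t = T @ sigma @ T.T
s22_1 = s22 - s21 @ np.linalg.inv(s11) @ s12   # Sigma_{22.1}
```

The off-diagonal block of `cov_t` vanishes, and for jointly normal vectors zero covariance is equivalent to independence.
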

Corollary
Let $X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N_p(\mu, \Sigma)$. Then $\Sigma_{12} = 0$ if and only if $X_1$ is independent of $X_2$.

The independence of two linear transforms of a multinormal $X$ can be shown via the following corollary.
Corollary
If $X \sim N_p(\mu, \Sigma)$ and $A$ and $B$ are matrices, then $AX$ and $BX$ are independent if and only if $A\Sigma B^\top = 0$.
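
As an illustration of the criterion $A\Sigma B^\top = 0$, take the sum and the difference of two components. The matrices below are hypothetical and chosen only to keep the arithmetic transparent.

```python
import numpy as np

sigma = np.eye(2)                 # hypothetical: X ~ N_2(0, I_2)
A = np.array([[1.0, 1.0]])        # A X = X_1 + X_2
B = np.array([[1.0, -1.0]])       # B X = X_1 - X_2

cross_cov = A @ sigma @ B.T       # Cov(AX, BX) = A Sigma B^T

# With unequal variances the same A and B no longer give independent transforms
sigma_unequal = np.diag([1.0, 4.0])
cross_cov_unequal = A @ sigma_unequal @ B.T
```

With equal variances the cross-covariance is zero, so the sum and the difference are independent; with variances 1 and 4 it equals $1 - 4 = -3$.
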

Theorem
If $X \sim N_p(\mu, \Sigma)$, $A(q \times p)$, $c \in \mathbb{R}^q$ and $q \le p$, then $Y = AX + c$ is a $q$-variate normal, i.e.
$$Y \sim N_q(A\mu + c, A\Sigma A^\top).$$

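Theorem
The conditional distribution of $X_2$ given $X_1 = x_1$ is normal with mean $\mu_2 + \Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1)$ and covariance $\Sigma_{22.1}$, i.e.,
$$(X_2 \mid X_1 = x_1) \sim N_{p-r}\left(\mu_2 + \Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1), \Sigma_{22.1}\right).$$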

Theorem
$$(X_2 \mid X_1 = x_1) \sim N_{p-r}\left(\mu_2 + \Sigma_{21}\Sigma_{11}^{-1}(x_1 - \mu_1), \Sigma_{22.1}\right)$$
i.e. the conditional mean $E(X_2 \mid X_1 = x_1)$ is LINEAR in $x_1$!
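
The conditional parameters can be computed directly from a partitioned $\mu$ and $\Sigma$. This is a sketch under the assumption that $X_1$ consists of the first $r$ components; `conditional_normal` and the numbers are hypothetical.

```python
import numpy as np

def conditional_normal(mu, sigma, r, x1):
    """Mean and covariance of (X2 | X1 = x1), where X1 holds the first r components."""
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    x1 = np.atleast_1d(np.asarray(x1, dtype=float))
    s11, s12 = sigma[:r, :r], sigma[:r, r:]
    s21, s22 = sigma[r:, :r], sigma[r:, r:]
    # Regression matrix Sigma21 Sigma11^{-1}
    B = s21 @ np.linalg.inv(s11)
    cond_mean = mu[r:] + B @ (x1 - mu[:r])
    cond_cov = s22 - B @ s12
    return cond_mean, cond_cov

# Hypothetical parameters, chosen only for illustration
mu = [1.0, 2.0, 3.0]
sigma = [[4.0, 1.0, 2.0],
         [1.0, 3.0, 0.5],
         [2.0, 0.5, 5.0]]
m_a, c_a = conditional_normal(mu, sigma, 1, [0.0])
m_b, c_b = conditional_normal(mu, sigma, 1, [2.0])
```

Changing `x1` shifts the conditional mean linearly, while the conditional covariance stays the same.
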

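Example
Suppose $p = 2$, $r = 1$, $\mu = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$, $\Sigma = \begin{pmatrix} 1 & -0.8 \\ -0.8 & 2 \end{pmatrix}$. Then
$$\Sigma_{11} = 1, \qquad \Sigma_{21} = -0.8, \qquad \Sigma_{22.1} = \Sigma_{22} - \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12} = 2 - (0.8)^2 = 1.36.$$
$$f_{X_1}(x_1) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{x_1^2}{2}\right)$$
$$f(x_2 \mid x_1) = \frac{1}{\sqrt{2\pi(1.36)}} \exp\left\{-\frac{(x_2 + 0.8x_1)^2}{2(1.36)}\right\}$$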

[Figure: conditional normal densities $f(x_2 \mid x_1)$; axes: arguments for density $x_2$, conditional values $x_1$. Shifts in the conditional density. MVAcondnorm.xpl]
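
The shift of the conditional density can be reproduced numerically. The sketch below is not the MVAcondnorm.xpl quantlet; it assumes the off-diagonal entry of $\Sigma$ is $-0.8$ (the sign is not legible in the extracted slide text, but either sign gives the conditional variance 1.36).

```python
import math

# Example parameters; the negative sign of sigma21 is an assumption
sigma11, sigma21, sigma22 = 1.0, -0.8, 2.0
sigma22_1 = sigma22 - sigma21 * sigma21 / sigma11   # conditional variance

def cond_mean(x1):
    """E(X2 | X1 = x1) for mu = (0, 0)."""
    return sigma21 / sigma11 * x1

def cond_density(x2, x1):
    """f(x2 | x1): normal density with mean cond_mean(x1) and variance sigma22_1."""
    z = x2 - cond_mean(x1)
    return math.exp(-z * z / (2.0 * sigma22_1)) / math.sqrt(2.0 * math.pi * sigma22_1)

for x1 in (-2.0, 0.0, 2.0):
    print(f"x1 = {x1:+.1f}: conditional mean = {cond_mean(x1):+.2f}")
```

Each value of `x1` moves the centre of the conditional density along a straight line, while its spread stays fixed.
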

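Theorem
If $X_1 \sim N_r(\mu_1, \Sigma_{11})$ and $(X_2 \mid X_1 = x_1) \sim N_{p-r}(Ax_1 + b, \Omega)$ where $\Omega$ does not depend on $x_1$, then
$$X = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N_p(\mu, \Sigma),$$
where
$$\mu = \begin{pmatrix} \mu_1 \\ A\mu_1 + b \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{11}A^\top \\ A\Sigma_{11} & \Omega + A\Sigma_{11}A^\top \end{pmatrix}.$$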
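
A round-trip check makes the construction concrete: build the joint parameters from a marginal and a conditional, then condition again and recover $A$ and $\Omega$. All numbers below are hypothetical.

```python
import numpy as np

# Hypothetical inputs: X1 ~ N_1(mu1, S11), (X2 | X1 = x1) ~ N_2(A x1 + b, Omega)
mu1 = np.array([1.0])
S11 = np.array([[2.0]])
A = np.array([[0.5], [-1.0]])
b = np.array([0.0, 3.0])
Omega = np.array([[1.0, 0.2], [0.2, 0.5]])

# Joint parameters according to the theorem
mu = np.concatenate([mu1, A @ mu1 + b])
Sigma = np.block([[S11, S11 @ A.T],
                  [A @ S11, Omega + A @ S11 @ A.T]])

# Round trip: conditioning the joint on X1 should give back A, b and Omega
A_back = Sigma[1:, :1] @ np.linalg.inv(Sigma[:1, :1])
b_back = mu[1:] - A_back @ mu1
Omega_back = Sigma[1:, 1:] - A_back @ Sigma[:1, 1:]
```
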

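$X_2 \in \mathbb{R}^{p-r}$, $X_1 \in \mathbb{R}^r$
$$E(X_2 \mid X_1) = \mu_2 + \Sigma_{21}\Sigma_{11}^{-1}(X_1 - \mu_1)$$
The approximation is linear!
$$X_2 = E(X_2 \mid X_1) + U = \beta_0 + \mathcal{B}X_1 + U, \qquad U \sim N_{p-r}(0, \Sigma_{22.1}),$$
with $\mathcal{B} = \Sigma_{21}\Sigma_{11}^{-1}$ and $\beta_0 = \mu_2 - \mathcal{B}\mu_1$.
For $\Sigma_{22} = \operatorname{Var}(X_2)$ in $\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$:
$$\Sigma_{22} = \mathcal{B}\Sigma_{11}\mathcal{B}^\top + \Sigma_{22.1} = \Sigma_{21}\Sigma_{11}^{-1}\Sigma_{12} + \Sigma_{22.1}$$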

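Consider the case where $X_2 \in \mathbb{R}$ and $r = p - 1$. Now $\mathcal{B}$ is a row vector $\beta^\top$ of dimension $(1 \times r)$:
$$X_2 = \beta_0 + \beta^\top X_1 + U.$$
This means that the best MSE approximation of $X_2$ by a function of $X_1$ is a straight line.
The marginal variance of $X_2$ can be decomposed as
$$\sigma_{22} = \beta^\top \Sigma_{11} \beta + \sigma_{22.1} = \sigma_{21}\Sigma_{11}^{-1}\sigma_{12} + \sigma_{22.1}.$$
$$\rho^2_{2.1 \ldots r} = \frac{\sigma_{21}\Sigma_{11}^{-1}\sigma_{12}}{\sigma_{22}}$$
is the square of the multiple correlation between $X_2$ and the $r$ variables $X_1$.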

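Example
The classic blue pullover data. Suppose that $X_1$ (sales), $X_2$ (price), $X_3$ (advertisement) and $X_4$ (sales assistants) are normally distributed with
$$\mu = \begin{pmatrix} 172.7 \\ 104.6 \\ 104.0 \\ 93.8 \end{pmatrix} \quad \text{and} \quad \Sigma = \begin{pmatrix} 1037.21 & & & \\ -80.02 & 219.84 & & \\ 1430.70 & 92.10 & 2624.00 & \\ 271.44 & -91.58 & 210.30 & 177.36 \end{pmatrix}.$$
Only the lower triangle of the symmetric matrix $\Sigma$ is shown.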

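Example (continued)
The conditional distribution of $X_1$ given $(X_2, X_3, X_4)$ is univariate normal with mean
$$\mu_1 + \sigma_{12}\Sigma_{22}^{-1}\begin{pmatrix} X_2 - \mu_2 \\ X_3 - \mu_3 \\ X_4 - \mu_4 \end{pmatrix} = 65.7 - 0.2X_2 + 0.5X_3 + 0.8X_4$$
and variance
$$\sigma_{11.2} = \sigma_{11} - \sigma_{12}\Sigma_{22}^{-1}\sigma_{21} = 96.761.$$
The squared multiple correlation is
$$\rho^2_{1.234} = \frac{\sigma_{12}\Sigma_{22}^{-1}\sigma_{21}}{\sigma_{11}} = 0.907.$$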
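
The quoted figures can be recomputed from $\mu$ and $\Sigma$. In this sketch the negative signs of the covariances of price with sales and with sales assistants are an assumption (the extracted text does not show signs); with that choice the results agree with the values in the example up to rounding.

```python
import numpy as np

# Pullover example; the two negative covariances are an assumption
mu = np.array([172.7, 104.6, 104.0, 93.8])
Sigma = np.array([[1037.21,  -80.02, 1430.70,  271.44],
                  [ -80.02,  219.84,   92.10,  -91.58],
                  [1430.70,   92.10, 2624.00,  210.30],
                  [ 271.44,  -91.58,  210.30,  177.36]])

s11 = Sigma[0, 0]            # variance of X1 (sales)
s12 = Sigma[0, 1:]           # covariances of X1 with (X2, X3, X4)
S22 = Sigma[1:, 1:]          # covariance matrix of (X2, X3, X4)

beta = np.linalg.solve(S22, s12)      # coefficients sigma_12 Sigma_22^{-1}
beta0 = mu[0] - beta @ mu[1:]         # intercept of the conditional mean
cond_var = s11 - beta @ s12           # sigma_{11.2}
rho2 = (beta @ s12) / s11             # squared multiple correlation
```

Because `S22` is symmetric, solving `S22 @ beta = s12` gives the same coefficients as the row vector $\sigma_{12}\Sigma_{22}^{-1}$.
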

Example (continued)
The correlation matrix between the 4 variables is given by
$$P = \begin{pmatrix} 1 & & & \\ -0.168 & 1 & & \\ 0.867 & 0.121 & 1 & \\ 0.633 & -0.464 & 0.308 & 1 \end{pmatrix}.$$
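
The correlation matrix follows from the covariance matrix by dividing each entry by the two corresponding standard deviations. A short sketch, again assuming negative signs for the two price covariances:

```python
import numpy as np

# Covariance matrix of the pullover example (negative signs are an assumption)
Sigma = np.array([[1037.21,  -80.02, 1430.70,  271.44],
                  [ -80.02,  219.84,   92.10,  -91.58],
                  [1430.70,   92.10, 2624.00,  210.30],
                  [ 271.44,  -91.58,  210.30,  177.36]])

d = np.sqrt(np.diag(Sigma))        # standard deviations
P = Sigma / np.outer(d, d)         # P = D^{-1/2} Sigma D^{-1/2}
print(np.round(P, 3))
```
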

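Example (continued)
The conditional distribution of $(X_1, X_2)$ given $(X_3, X_4)$ is bivariate normal with mean:
$$\begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix} + \begin{pmatrix} \sigma_{13} & \sigma_{14} \\ \sigma_{23} & \sigma_{24} \end{pmatrix} \begin{pmatrix} \sigma_{33} & \sigma_{34} \\ \sigma_{43} & \sigma_{44} \end{pmatrix}^{-1} \begin{pmatrix} X_3 - \mu_3 \\ X_4 - \mu_4 \end{pmatrix} = \begin{pmatrix} 32.516 + 0.467X_3 + 0.977X_4 \\ 153.644 + 0.085X_3 - 0.617X_4 \end{pmatrix}$$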

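Example (continued)
and covariance matrix:
$$\begin{pmatrix} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \end{pmatrix} - \begin{pmatrix} \sigma_{13} & \sigma_{14} \\ \sigma_{23} & \sigma_{24} \end{pmatrix} \begin{pmatrix} \sigma_{33} & \sigma_{34} \\ \sigma_{43} & \sigma_{44} \end{pmatrix}^{-1} \begin{pmatrix} \sigma_{31} & \sigma_{32} \\ \sigma_{41} & \sigma_{42} \end{pmatrix} = \begin{pmatrix} 104.006 & -33.574 \\ -33.574 & 155.592 \end{pmatrix}.$$
This covariance matrix allows us to compute the partial correlation between $X_1$ and $X_2$ for a fixed level of $X_3$ and $X_4$:
$$\rho_{X_1 X_2 \mid X_3 X_4} = \frac{-33.574}{\sqrt{104.006 \cdot 155.592}} = -0.264.$$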
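
The conditional mean, the conditional covariance matrix and the partial correlation can be reproduced in a few lines. The sketch assumes negative signs for the covariances of price with sales and with sales assistants; under that assumption the output matches the numbers in the example up to rounding.

```python
import numpy as np

# Pullover example; the two negative covariances are an assumption
mu = np.array([172.7, 104.6, 104.0, 93.8])
Sigma = np.array([[1037.21,  -80.02, 1430.70,  271.44],
                  [ -80.02,  219.84,   92.10,  -91.58],
                  [1430.70,   92.10, 2624.00,  210.30],
                  [ 271.44,  -91.58,  210.30,  177.36]])

S11 = Sigma[:2, :2]          # covariance of (X1, X2)
S12 = Sigma[:2, 2:]          # cross-covariance with (X3, X4)
S22 = Sigma[2:, 2:]          # covariance of (X3, X4)

B = S12 @ np.linalg.inv(S22)             # slopes of the conditional mean
intercept = mu[:2] - B @ mu[2:]          # intercepts of the conditional mean
cond_cov = S11 - B @ S12.T               # conditional covariance matrix
partial_corr = cond_cov[0, 1] / np.sqrt(cond_cov[0, 0] * cond_cov[1, 1])
```
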


Mahalanobis Transform
If $X \sim N_p(\mu, \Sigma)$ then the Mahalanobis transform is
$$Y = \Sigma^{-1/2}(X - \mu) \sim N_p(0, I_p)$$
and it holds
$$Y^\top Y = (X - \mu)^\top \Sigma^{-1} (X - \mu) \sim \chi^2_p.$$
$Y$ is a random vector and $Y^\top Y$ is a scalar. $Y^\top Y$ can be used for testing (assuming that $\Sigma$ is known).
Normally, we do not know $\Sigma$. The tests in this situation can be carried out using Wishart and Hotelling distributions.
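
One way to compute $\Sigma^{-1/2}$ is the spectral decomposition of $\Sigma$. The helper `inv_sqrt` and the numbers below are hypothetical; the sketch only checks the two identities stated above for a single observation.

```python
import numpy as np

def inv_sqrt(sigma):
    """Symmetric inverse square root Sigma^{-1/2} via the spectral decomposition."""
    eigval, eigvec = np.linalg.eigh(sigma)
    return eigvec @ np.diag(eigval ** -0.5) @ eigvec.T

# Hypothetical parameters and observation
mu = np.array([1.0, -1.0])
sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
x = np.array([2.0, 0.5])

root = inv_sqrt(sigma)
y = root @ (x - mu)                                     # Mahalanobis transform
d2 = y @ y                                              # Y^T Y
d2_direct = (x - mu) @ np.linalg.solve(sigma, x - mu)   # (x - mu)^T Sigma^{-1} (x - mu)
```

Under the normal model with known $\Sigma$, `d2` would be compared with a quantile of the $\chi^2_p$ distribution.
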

Summary: Elementary Properties
If $X \sim N_p(\mu, \Sigma)$ then a linear transformation $AX + c$, $A(q \times p)$, $c \in \mathbb{R}^q$ has distribution $N_q(A\mu + c, A\Sigma A^\top)$.
Two linear transformations $AX$ and $BX$ of $X \sim N_p(\mu, \Sigma)$ are independent if and only if $A\Sigma B^\top = 0$.
If $X_1$ and $X_2$ are partitions of $X \sim N_p(\mu, \Sigma)$ then the conditional distribution of $X_2$ given $X_1 = x_1$ is normal again.

Summary: Elementary Properties
In the multivariate normal case, $X_1$ is independent of $X_2$ if and only if $\Sigma_{12} = 0$.
The conditional expectation of $(X_2 \mid X_1)$ is a linear function for $\begin{pmatrix} X_1 \\ X_2 \end{pmatrix} \sim N_p(\mu, \Sigma)$.
The multiple correlation coefficient is defined as $\rho^2_{2.1 \ldots r} = \frac{\sigma_{21}\Sigma_{11}^{-1}\sigma_{12}}{\sigma_{22}}$.
The multiple correlation coefficient is the percentage of the variance of $X_2$ explained by the linear approximation $\beta_0 + \beta^\top X_1$.
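
The Wishart Distribution

Wishart Distribution
Let $\mathcal{X}(n \times p)$ be a data matrix from $X \sim N_p(\mu, \Sigma)$ with $\mu = 0$. Then
$$\mathcal{M}(p \times p) = \mathcal{X}^\top \mathcal{X} \sim W_p(\Sigma, n).$$
Example
For $p = 1$ and $X \sim N_1(0, \sigma^2)$,
$$\mathcal{X} = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}, \qquad \mathcal{M} = \mathcal{X}^\top \mathcal{X} = \sum_{i=1}^n x_i^2 \sim \sigma^2 \chi^2_n.$$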
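
The $p = 1$ case lends itself to a quick Monte Carlo check: if $\mathcal{M} \sim \sigma^2\chi^2_n$, its mean is $n\sigma^2$ and its variance is $2n\sigma^4$. The sample size, seed and parameter values below are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma, reps = 5, 2.0, 20000

# Each row is one data matrix X (n x 1) with i.i.d. N(0, sigma^2) entries
samples = rng.normal(0.0, sigma, size=(reps, n))
m = (samples ** 2).sum(axis=1)          # M = X^T X for every replication

mean_m = m.mean()    # theory: n * sigma^2 = 20
var_m = m.var()      # theory: 2 * n * sigma^4 = 160
```
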

Wishart is a Generalisation of $\chi^2_p$
Theorem
If $\mathcal{M} \sim W_p(\Sigma, n)$ and $B(p \times q)$, then the distribution of $B^\top \mathcal{M} B$ is Wishart $W_q(B^\top \Sigma B, n)$.

With this theorem we can standardize Wishart matrices since with $B = \Sigma^{-1/2}$ the distribution of $\Sigma^{-1/2} \mathcal{M} \Sigma^{-1/2}$ is $W_p(I, n)$.

Theorem
If $\mathcal{M} \sim W_p(\Sigma, m)$ and $a \in \mathbb{R}^p$ with $a^\top \Sigma a \neq 0$, then the distribution of $\frac{a^\top \mathcal{M} a}{a^\top \Sigma a}$ is $\chi^2_m$.

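Theorem (Cochran)
Let $\mathcal{X}(n \times p)$ be a data matrix from $N_p(\mu, \Sigma)$ with sample covariance matrix $\mathcal{S}$. Then
$$n\mathcal{S} = \mathcal{X}^\top \mathcal{H} \mathcal{X} \sim W_p(\Sigma, n - 1),$$
and $\bar{x}$ and $\mathcal{S}$ are independent.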
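
The identity behind $n\mathcal{S} = \mathcal{X}^\top \mathcal{H} \mathcal{X}$ can be checked on any data matrix. The sketch assumes that $\mathcal{H}$ is the centering matrix $I_n - n^{-1} 1_n 1_n^\top$, which the slide does not spell out, and that $\mathcal{S}$ uses the divisor $n$.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 6, 3
X = rng.normal(size=(n, p))                    # hypothetical data matrix

H = np.eye(n) - np.ones((n, n)) / n            # centering matrix (assumed definition)
nS = X.T @ H @ X
S = nS / n                                     # empirical covariance with divisor n
S_numpy = np.cov(X, rowvar=False, bias=True)   # NumPy's covariance with divisor n
```

`H` is symmetric and idempotent, so `X.T @ H @ X` equals the cross-product of the column-centred data.
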
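
Summary: Wishart Distribution
The Wishart distribution is a generalization of the $\chi^2$-distribution. In particular $W_1(\sigma^2, n) = \sigma^2 \chi^2_n$.
The empirical covariance matrix $\mathcal{S}$ has a $\frac{1}{n} W_p(\Sigma, n - 1)$ distribution.
In the normal case, $\bar{x}$ and $\mathcal{S}$ are independent.
For $\mathcal{M} \sim W_p(\Sigma, m)$, $\frac{a^\top \mathcal{M} a}{a^\top \Sigma a} \sim \chi^2_m$.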

The Hotelling Distribution

Hotelling's $T^2$-Distribution
Assume that the random vector $Y \sim N_p(0, I)$ is independent of the random matrix $\mathcal{M} \sim W_p(I, n)$. Then
$$n\, Y^\top \mathcal{M}^{-1} Y \sim T^2(p, n).$$
Hotelling's $T^2$ is a generalisation of Student's $t$-distribution.
The critical values of Hotelling's $T^2$ can be calculated using the $F$-distribution:
$$T^2(p, n) = \frac{np}{n - p + 1} F_{p, n-p+1}$$

Summary: Hotelling's $T^2$-Distribution
Hotelling's $T^2$-distribution is a generalization of the $t$-distribution. In particular $T(1, n) = t_n$.
$(n - 1)(\bar{x} - \mu)^\top \mathcal{S}^{-1} (\bar{x} - \mu)$ has a $T^2(p, n - 1)$ distribution.
The relation between Hotelling's $T^2$ and Fisher's $F$-distribution is given by $T^2(p, n) = \frac{np}{n - p + 1} F_{p, n-p+1}$.
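
A sketch of the one-sample statistic and its conversion to an $F$ statistic. It assumes $\mathcal{S}$ is the empirical covariance matrix with divisor $n$; replacing $n$ by $n - 1$ in the relation above gives $T^2(p, n-1) = \frac{(n-1)p}{n-p} F_{p, n-p}$. The function name and the data are hypothetical.

```python
import numpy as np

def hotelling_t2(X, mu0):
    """Return (n - 1)(xbar - mu0)^T S^{-1} (xbar - mu0) and the matching F statistic."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    xbar = X.mean(axis=0)
    # Empirical covariance with divisor n, forced to shape (p, p) also for p = 1
    S = np.cov(X, rowvar=False, bias=True).reshape(p, p)
    diff = xbar - np.asarray(mu0, dtype=float)
    t2 = (n - 1) * diff @ np.linalg.solve(S, diff)
    # T^2(p, n-1) = (n-1) p / (n-p) * F_{p, n-p}, so rescale to an F statistic
    f_stat = t2 * (n - p) / (p * (n - 1))
    return t2, f_stat

# Hypothetical data, n = 5 observations of p = 2 variables
X_demo = np.array([[1.0, 2.0], [2.0, 1.5], [4.0, 3.5], [5.0, 3.0], [3.0, 4.0]])
t2_demo, f_demo = hotelling_t2(X_demo, [3.0, 3.0])
```

For $p = 1$ the statistic reduces to the square of the usual one-sample $t$ statistic, which is a convenient sanity check.
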

Spherical and Elliptical Distributions

Definition
A $(p \times 1)$ random vector $Y$ is said to have a spherical distribution $S_p(\phi)$ if its characteristic function $\psi_Y(t)$ satisfies
$$\psi_Y(t) = \phi(t^\top t)$$
for some scalar function $\phi(\cdot)$, which is then called the characteristic generator of the spherical distribution $S_p(\phi)$. We will write $Y \sim S_p(\phi)$.
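
As a quick illustration (a standard fact, not taken from the slide), the standard normal vector is spherical, and its characteristic generator can be read off from its characteristic function:

```latex
Y \sim N_p(0, I_p): \qquad
\psi_Y(t) = E\left(e^{i t^\top Y}\right)
          = \exp\left(-\tfrac{1}{2}\, t^\top t\right)
          = \phi(t^\top t),
\qquad \phi(u) = e^{-u/2}.
```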

Theorem
Spherical random variables have the following properties:

1. All marginal distributions of a spherical distributed random vector are spherical.
2. All the marginal characteristic functions have the same generator.

Theorem
Spherical random variables have the following properties:

1. Let $X \sim S_p(\phi)$, then $X$ has the same distribution as $r u^{(p)}$, where $u^{(p)}$ is a random vector distributed uniformly on the unit sphere surface in $\mathbb{R}^p$ and $r \ge 0$ is a random variable independent of $u^{(p)}$. If $E(r^2) < \infty$, then
$$E(X) = 0, \qquad \operatorname{Cov}(X) = \frac{E(r^2)}{p} I_p.$$
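
The representation $X = r u^{(p)}$ suggests a simple simulation. The sketch below draws $u^{(p)}$ by normalising a standard normal vector (a common construction, not stated in the slides) and uses $r^2 \sim \chi^2_p$ as a hypothetical radial law, which gives back $N_p(0, I_p)$.

```python
import numpy as np

rng = np.random.default_rng(2)
p, reps = 3, 20000

# Normalising a standard normal vector gives a uniform point on the unit sphere
z = rng.normal(size=(reps, p))
u = z / np.linalg.norm(z, axis=1, keepdims=True)

# Hypothetical radial part: r^2 ~ chi^2_p, independent of u
r = np.sqrt(rng.chisquare(p, size=reps))
x = r[:, None] * u

cov_u = np.cov(u, rowvar=False)      # theory: I_p / p  (the case r = 1)
cov_x = np.cov(x, rowvar=False)      # theory: E(r^2) / p * I_p = I_p
```
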

Definition
A $(p \times 1)$ random vector $X$ is said to have an elliptical distribution with parameters $\mu(p \times 1)$ and $\Sigma(p \times p)$ if $X$ has the same distribution as $\mu + A^\top Y$, where $Y \sim S_k(\phi)$ and $A$ is a $(k \times p)$ matrix such that $A^\top A = \Sigma$ with $\operatorname{rank}(\Sigma) = k$. We shall write $X \sim EC_p(\mu, \Sigma, \phi)$.

Example
The multivariate $t$-distribution.
Let $Z \sim N_p(0, I_p)$ and $s^2 \sim \chi^2_m$ be independent. The random vector
$$Y = \sqrt{m}\, \frac{Z}{s}$$
has a multivariate $t$-distribution with $m$ degrees of freedom. Moreover the $t$-distribution belongs to the family of $p$-dimensional spherical distributions.
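
A simulation sketch of this construction, assuming that $s$ is the square root of a $\chi^2_m$ variable. For $m > 2$ the covariance matrix of the multivariate $t$ with identity scale matrix is $\frac{m}{m-2} I_p$, which the sample covariance should approximate; seed and sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
p, m, reps = 2, 10, 50000

z = rng.normal(size=(reps, p))                 # Z ~ N_p(0, I_p)
s = np.sqrt(rng.chisquare(m, size=reps))       # s^2 ~ chi^2_m, independent of Z
y = np.sqrt(m) * z / s[:, None]                # Y = sqrt(m) Z / s

cov_y = np.cov(y, rowvar=False)                # theory: m / (m - 2) * I_p
```
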

Theorem
Elliptical random vectors $X$ have the following properties:

1. Any linear combination of elliptically distributed variables is elliptical.
2. Marginal distributions of elliptically distributed variables are elliptical.

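Theorem
1. A scalar function $\phi(\cdot)$ can determine an elliptical distribution $EC_p(\mu, \Sigma, \phi)$ for every $\mu \in \mathbb{R}^p$ and $\Sigma \ge 0$ with $\operatorname{rank}(\Sigma) = k$ iff $\phi(t^\top t)$ is a $p$-dimensional characteristic function.
2. Assume that $X$ is nondegenerate. If $X \sim EC_p(\mu, \Sigma, \phi)$ and $X \sim EC_p(\mu^*, \Sigma^*, \phi^*)$, then there exists a constant $c > 0$ such that
$$\mu = \mu^*, \qquad \Sigma^* = c\Sigma, \qquad \phi^*(\cdot) = \phi(c^{-1}\,\cdot).$$
In other words $\Sigma$, $\phi$, $A$ are not unique, unless we impose the condition that $\det(\Sigma) = 1$.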

Theorem
1. The characteristic function of $X$, $\psi(t) = E(e^{i t^\top X})$, is of the form
$$\psi(t) = e^{i t^\top \mu}\, \phi(t^\top \Sigma t)$$
for a scalar function $\phi$.

Theorem
1. $X \sim EC_p(\mu, \Sigma, \phi)$ with $\operatorname{rank}(\Sigma) = k$ iff $X$ has the same distribution as:
$$\mu + r A^\top u^{(k)} \qquad (1)$$
where $r \ge 0$ is independent of $u^{(k)}$, which is a random vector distributed uniformly on the unit sphere surface in $\mathbb{R}^k$, and $A$ is a $(k \times p)$ matrix such that $A^\top A = \Sigma$.

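Theorem
1. Assume that $X \sim EC_p(\mu, \Sigma, \phi)$ and $E(r^2) < \infty$. Then
$$E(X) = \mu, \qquad \operatorname{Cov}(X) = \frac{E(r^2)}{\operatorname{rank}(\Sigma)}\, \Sigma = -2\phi'(0)\, \Sigma.$$
2. Assume that $X \sim EC_p(\mu, \Sigma, \phi)$ with $\operatorname{rank}(\Sigma) = k$. Then $Q(X) = (X - \mu)^\top \Sigma^{-} (X - \mu)$, with $\Sigma^{-}$ a generalized inverse of $\Sigma$, has the same distribution as $r^2$ in equation (1).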
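
The representation $\mu + r A^\top u^{(k)}$ and the covariance formula can be checked together by simulation. In this sketch the matrix $A$, the vector $\mu$ and the radial law $r \sim U(0, 2)$ (so $E(r^2) = 4/3$) are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
k, reps = 2, 40000

mu = np.array([1.0, -2.0])
A = np.array([[1.0, 0.5],
              [0.0, 1.0]])                     # (k x p) matrix with k = p = 2
Sigma = A.T @ A

# u: uniform on the unit sphere in R^k, obtained by normalising normal vectors
z = rng.normal(size=(reps, k))
u = z / np.linalg.norm(z, axis=1, keepdims=True)

r = rng.uniform(0.0, 2.0, size=reps)           # hypothetical radial law, E(r^2) = 4/3
x = mu + (r[:, None] * u) @ A                  # each row is mu + r A^T u

cov_x = np.cov(x, rowvar=False)
cov_theory = (4.0 / 3.0) / k * Sigma           # E(r^2) / rank(Sigma) * Sigma
```
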
