7.1 Eigenvalues and Eigenvectors

The eigenvalue problem: given an $n \times n$ matrix $A$, find a scalar $\lambda$ and a nonzero vector $\mathbf{x}$ such that

$A\mathbf{x} = \lambda\mathbf{x}$

The scalar $\lambda$ is called an eigenvalue of $A$, and the nonzero vector $\mathbf{x}$ is called an eigenvector of $A$ corresponding to $\lambda$.
Ex 1: Verifying eigenvalues and eigenvectors

$A = \begin{bmatrix} 2 & 0 \\ 0 & -1 \end{bmatrix}$, $\mathbf{x}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$, $\mathbf{x}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$

$A\mathbf{x}_1 = \begin{bmatrix} 2 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix} = 2\begin{bmatrix} 1 \\ 0 \end{bmatrix} = 2\mathbf{x}_1$

so $\lambda_1 = 2$ is an eigenvalue of $A$ with corresponding eigenvector $\mathbf{x}_1$.

$A\mathbf{x}_2 = \begin{bmatrix} 2 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ -1 \end{bmatrix} = (-1)\begin{bmatrix} 0 \\ 1 \end{bmatrix} = (-1)\mathbf{x}_2$

so $\lambda_2 = -1$ is an eigenvalue of $A$ with corresponding eigenvector $\mathbf{x}_2$.

※ In fact, each eigenvalue has infinitely many eigenvectors. For $\lambda = 2$, $[3\ 0]^T$ or $[5\ 0]^T$ are both corresponding eigenvectors. Moreover, $[3\ 0]^T + [5\ 0]^T$ is still an eigenvector. The proof is in Thm. 7.1.
Thm. 7.1: The eigenspace corresponding to $\lambda$ of matrix $A$

If $A$ is an $n \times n$ matrix with an eigenvalue $\lambda$, then the set of all eigenvectors of $\lambda$ together with the zero vector is a subspace of $R^n$. This subspace is called the eigenspace of $\lambda$.

Pf:
Let $\mathbf{x}_1$ and $\mathbf{x}_2$ be eigenvectors corresponding to $\lambda$ (i.e., $A\mathbf{x}_1 = \lambda\mathbf{x}_1$ and $A\mathbf{x}_2 = \lambda\mathbf{x}_2$).
(1) $A(\mathbf{x}_1 + \mathbf{x}_2) = A\mathbf{x}_1 + A\mathbf{x}_2 = \lambda\mathbf{x}_1 + \lambda\mathbf{x}_2 = \lambda(\mathbf{x}_1 + \mathbf{x}_2)$
(i.e., $\mathbf{x}_1 + \mathbf{x}_2$ is also an eigenvector corresponding to $\lambda$)
(2) $A(c\mathbf{x}_1) = c(A\mathbf{x}_1) = c(\lambda\mathbf{x}_1) = \lambda(c\mathbf{x}_1)$
(i.e., $c\mathbf{x}_1$ is also an eigenvector corresponding to $\lambda$)
Since this set is closed under vector addition and scalar multiplication, it is a subspace of $R^n$ according to Theorem 4.5.
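※ A quick numerical check of Ex 1 and the closure property in Thm. 7.1 (a NumPy sketch for illustration; the code is not part of the original example):

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, -1.0]])
x1 = np.array([1.0, 0.0])          # eigenvector for lambda = 2
x2 = np.array([0.0, 1.0])          # eigenvector for lambda = -1

print(A @ x1, 2 * x1)              # both equal [2, 0]
print(A @ x2, -1 * x2)             # both equal [0, -1]

# Closure (Thm. 7.1): [3 0]^T + [5 0]^T is again an eigenvector for lambda = 2
v = np.array([3.0, 0.0]) + np.array([5.0, 0.0])
print(np.allclose(A @ v, 2 * v))   # True
```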
Ex 3: Examples of eigenspaces on the xy-plane

For the matrix $A$ below, the corresponding eigenvalues are $\lambda_1 = -1$ and $\lambda_2 = 1$:

$A = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$

Sol:
For the eigenvalue $\lambda_1 = -1$, the corresponding eigenvectors are the nonzero vectors on the x-axis; for the eigenvalue $\lambda_2 = 1$, they are the nonzero vectors on the y-axis. Indeed, for any vector $\mathbf{v} = \begin{bmatrix} x \\ y \end{bmatrix}$,

$A\mathbf{v} = A\begin{bmatrix} x \\ 0 \end{bmatrix} + A\begin{bmatrix} 0 \\ y \end{bmatrix} = -1\begin{bmatrix} x \\ 0 \end{bmatrix} + 1\begin{bmatrix} 0 \\ y \end{bmatrix} = \begin{bmatrix} -x \\ y \end{bmatrix}$
Thm. 7.2: Finding eigenvalues and eigenvectors of a matrix $A \in M_{n \times n}$

Let $A$ be an $n \times n$ matrix.
(1) An eigenvalue of $A$ is a scalar $\lambda$ such that $\det(\lambda I - A) = 0$
(2) The eigenvectors of $A$ corresponding to $\lambda$ are the nonzero solutions of $(\lambda I - A)\mathbf{x} = \mathbf{0}$

Note: following the definition of the eigenvalue problem,
$A\mathbf{x} = \lambda\mathbf{x} \Rightarrow A\mathbf{x} = \lambda I\mathbf{x} \Rightarrow (\lambda I - A)\mathbf{x} = \mathbf{0}$ (homogeneous system)
$(\lambda I - A)\mathbf{x} = \mathbf{0}$ has nonzero solutions for $\mathbf{x}$ iff $\det(\lambda I - A) = 0$
(The above iff result comes from the equivalent conditions on Slide 4.101)

Characteristic equation of $A$: $\det(\lambda I - A) = 0$
Characteristic polynomial of $A \in M_{n \times n}$: $\det(\lambda I - A) = \lambda^n + c_{n-1}\lambda^{n-1} + \cdots + c_1\lambda + c_0$
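※ Thm. 7.2 can be checked numerically: NumPy's np.poly returns the coefficients of the characteristic polynomial $\det(\lambda I - A)$ when given a square matrix (a sketch, not part of the original slides):

```python
import numpy as np

A = np.array([[2.0, -12.0],
              [1.0,  -5.0]])

# Coefficients of det(lambda*I - A), highest degree first:
# here lambda^2 + 3*lambda + 2
print(np.poly(A))             # [1. 3. 2.]
print(np.roots(np.poly(A)))   # eigenvalues -1 and -2 (order may vary)
```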
Ex 4: Finding eigenvalues and eigenvectors

$A = \begin{bmatrix} 2 & -12 \\ 1 & -5 \end{bmatrix}$

Sol: Characteristic equation:
$\det(\lambda I - A) = \begin{vmatrix} \lambda-2 & 12 \\ -1 & \lambda+5 \end{vmatrix} = \lambda^2 + 3\lambda + 2 = (\lambda+1)(\lambda+2) = 0$

Eigenvalues: $\lambda_1 = -1$, $\lambda_2 = -2$

(1) $\lambda_1 = -1$: $(\lambda_1 I - A)\mathbf{x} = \begin{bmatrix} -3 & 12 \\ -1 & 4 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$

$\begin{bmatrix} -3 & 12 \\ -1 & 4 \end{bmatrix} \xrightarrow{\text{G.-J. E.}} \begin{bmatrix} 1 & -4 \\ 0 & 0 \end{bmatrix} \Rightarrow \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 4t \\ t \end{bmatrix} = t\begin{bmatrix} 4 \\ 1 \end{bmatrix},\ t \ne 0$

(2) $\lambda_2 = -2$: $(\lambda_2 I - A)\mathbf{x} = \begin{bmatrix} -4 & 12 \\ -1 & 3 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}$

$\begin{bmatrix} -4 & 12 \\ -1 & 3 \end{bmatrix} \xrightarrow{\text{G.-J. E.}} \begin{bmatrix} 1 & -3 \\ 0 & 0 \end{bmatrix} \Rightarrow \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 3s \\ s \end{bmatrix} = s\begin{bmatrix} 3 \\ 1 \end{bmatrix},\ s \ne 0$
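※ The hand computation in Ex 4 can be cross-checked with np.linalg.eig (a NumPy sketch, not part of the original solution):

```python
import numpy as np

A = np.array([[2.0, -12.0],
              [1.0,  -5.0]])
eigvals, eigvecs = np.linalg.eig(A)
print(eigvals)                # [-1. -2.] (order may vary)

# np.linalg.eig returns unit-length columns; rescale so the last entry
# is 1 to compare with the hand-computed eigenvectors [4, 1] and [3, 1]
for lam, v in zip(eigvals, eigvecs.T):
    print(lam, v / v[1])      # [4. 1.] for -1, [3. 1.] for -2
```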
Ex 5: Finding eigenvalues and eigenvectors

Find the eigenvalues and corresponding eigenvectors for the matrix $A$. What is the dimension of the eigenspace of each eigenvalue?

$A = \begin{bmatrix} 2 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 2 \end{bmatrix}$

Sol: Characteristic equation:
$|\lambda I - A| = \begin{vmatrix} \lambda-2 & -1 & 0 \\ 0 & \lambda-2 & 0 \\ 0 & 0 & \lambda-2 \end{vmatrix} = (\lambda - 2)^3 = 0$

Eigenvalue: $\lambda = 2$

The eigenspace of $\lambda = 2$:
$(\lambda I - A)\mathbf{x} = \begin{bmatrix} 0 & -1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$

$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} s \\ 0 \\ t \end{bmatrix} = s\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + t\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$, $s$, $t$ not both zero

$\left\{ s\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + t\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} : s, t \in \mathbb{R} \right\}$ is the eigenspace of $A$ corresponding to $\lambda = 2$, so the dimension of this eigenspace is 2.
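※ The eigenspace dimension can also be computed as $n - \mathrm{rank}(\lambda I - A)$ (a NumPy sketch, not part of the original solution):

```python
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [0.0, 2.0, 0.0],
              [0.0, 0.0, 2.0]])
lam = 2.0
n = A.shape[0]

# dim(eigenspace of lambda) = n - rank(lambda*I - A)
rank = np.linalg.matrix_rank(lam * np.eye(n) - A)
print(n - rank)   # 2
```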
Ex 6: Find the eigenvalues of the matrix $A$ and find a basis for each of the corresponding eigenspaces

$A = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 5 & -10 \\ 1 & 0 & 2 & 0 \\ 1 & 0 & 0 & 3 \end{bmatrix}$

For the eigenvalue $\lambda_3 = 3$:
$\left\{ \begin{bmatrix} 0 \\ -5 \\ 0 \\ 1 \end{bmatrix} \right\}$ is a basis for the eigenspace corresponding to $\lambda_3 = 3$

※ The dimension of the eigenspace of $\lambda_3 = 3$ is 1
Thm. 7.3: Eigenvalues of triangular matrices

If $A$ is an $n \times n$ triangular matrix, then its eigenvalues are the entries on its main diagonal

Ex 7: Finding eigenvalues for triangular and diagonal matrices

(a) $A = \begin{bmatrix} 2 & 0 & 0 \\ -1 & 1 & 0 \\ 5 & 3 & -3 \end{bmatrix}$  (b) $A = \begin{bmatrix} -1 & 0 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -4 & 0 \\ 0 & 0 & 0 & 0 & 3 \end{bmatrix}$

Sol:
(a) $|\lambda I - A| = \begin{vmatrix} \lambda-2 & 0 & 0 \\ 1 & \lambda-1 & 0 \\ -5 & -3 & \lambda+3 \end{vmatrix} = (\lambda-2)(\lambda-1)(\lambda+3) = 0 \Rightarrow \lambda_1 = 2,\ \lambda_2 = 1,\ \lambda_3 = -3$

※ According to Thm. 3.2, the determinant of a triangular matrix is the product of the entries on the main diagonal

(b) $\lambda_1 = -1,\ \lambda_2 = 2,\ \lambda_3 = 0,\ \lambda_4 = -4,\ \lambda_5 = 3$
Eigenvalues and eigenvectors of linear transformations:

A number $\lambda$ is called an eigenvalue of a linear transformation $T: V \to V$ if there is a nonzero vector $\mathbf{x}$ such that $T(\mathbf{x}) = \lambda\mathbf{x}$. The vector $\mathbf{x}$ is called an eigenvector of $T$ corresponding to $\lambda$, and the set of all eigenvectors of $\lambda$ (together with the zero vector) is called the eigenspace of $\lambda$.

※ The definition of linear transformations should be introduced in Ch 6
※ Here I briefly introduce linear transformations and some of their basic properties
※ A typical example of a linear transformation is one in which each component of the resulting vector is a linear combination of the components of the input vector $\mathbf{x}$

$[T(\mathbf{x})]_{B'} = A'[\mathbf{x}]_{B'}$, where $A' = \big[[T(\mathbf{v}_1)]_{B'}\ [T(\mathbf{v}_2)]_{B'}\ \cdots\ [T(\mathbf{v}_n)]_{B'}\big]$ is the transformation matrix for $T$ relative to the basis $B' = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$

※ On the next two slides, an example is provided to verify numerically that this extension is valid
EX. Consider a nonstandard basis $B' = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\} = \{(1, 1, 0), (1, -1, 0), (0, 0, 1)\}$, and find the transformation matrix $A'$ such that $[T(\mathbf{x})]_{B'} = A'[\mathbf{x}]_{B'}$ for the linear transformation $T(x_1, x_2, x_3) = (x_1 + 3x_2,\ 3x_1 + x_2,\ -2x_3)$

$[T(\mathbf{v}_1)]_{B'} = [(4, 4, 0)]_{B'} = \begin{bmatrix} 4 \\ 0 \\ 0 \end{bmatrix}$, $[T(\mathbf{v}_2)]_{B'} = [(-2, 2, 0)]_{B'} = \begin{bmatrix} 0 \\ -2 \\ 0 \end{bmatrix}$, $[T(\mathbf{v}_3)]_{B'} = [(0, 0, -2)]_{B'} = \begin{bmatrix} 0 \\ 0 \\ -2 \end{bmatrix}$

$A' = \begin{bmatrix} 4 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & -2 \end{bmatrix}$
Consider $\mathbf{x} = (5, -1, 4)$, and check that $[T(\mathbf{x})]_{B'} = A'[\mathbf{x}]_{B'}$ for the linear transformation $T(x_1, x_2, x_3) = (x_1 + 3x_2,\ 3x_1 + x_2,\ -2x_3)$:

Since $\mathbf{x} = 2\mathbf{v}_1 + 3\mathbf{v}_2 + 4\mathbf{v}_3$, we have $[\mathbf{x}]_{B'} = (2, 3, 4)^T$, and $T(\mathbf{x}) = (2, 14, -8) = 8\mathbf{v}_1 - 6\mathbf{v}_2 - 8\mathbf{v}_3$, so

$A'[\mathbf{x}]_{B'} = \begin{bmatrix} 4 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & -2 \end{bmatrix}\begin{bmatrix} 2 \\ 3 \\ 4 \end{bmatrix} = \begin{bmatrix} 8 \\ -6 \\ -8 \end{bmatrix} = [T(\mathbf{x})]_{B'}$
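※ The same check in NumPy (a sketch; coordinates relative to $B'$ are obtained by solving $B\mathbf{c} = \mathbf{x}$ with the basis vectors as columns of $B$):

```python
import numpy as np

# Basis B' as columns; T in standard coordinates
B = np.array([[1.0,  1.0, 0.0],
              [1.0, -1.0, 0.0],
              [0.0,  0.0, 1.0]])
T = lambda x: np.array([x[0] + 3*x[1], 3*x[0] + x[1], -2*x[2]])
A_prime = np.diag([4.0, -2.0, -2.0])

x = np.array([5.0, -1.0, 4.0])
x_B = np.linalg.solve(B, x)        # [x]_B' = [2, 3, 4]
Tx_B = np.linalg.solve(B, T(x))    # [T(x)]_B'
print(Tx_B, A_prime @ x_B)         # both [ 8. -6. -8.]
```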
For a special basis $B' = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$, where the $\mathbf{v}_i$'s are eigenvectors of the standard matrix $A$, $A'$ is obtained immediately to be diagonal due to
$T(\mathbf{v}_i) = A\mathbf{v}_i = \lambda_i\mathbf{v}_i$
and
$[\lambda_i\mathbf{v}_i]_{B'} = [0\mathbf{v}_1 + 0\mathbf{v}_2 + \cdots + \lambda_i\mathbf{v}_i + \cdots + 0\mathbf{v}_n]_{B'} = [0\ \cdots\ 0\ \lambda_i\ 0\ \cdots\ 0]^T$

Then $A'$, the transformation matrix for $T$ relative to the basis $B'$, defined as $\big[[T(\mathbf{v}_1)]_{B'}\ [T(\mathbf{v}_2)]_{B'}\ [T(\mathbf{v}_3)]_{B'}\big]$ (see Slide 7.22), is diagonal, and the main diagonal entries are the corresponding eigenvalues (see Slide 7.23).

In the above example, $\mathbf{v}_1 = (1, 1, 0)$ is an eigenvector of $A$ for $\lambda_1 = 4$, and $\mathbf{v}_2 = (1, -1, 0)$ and $\mathbf{v}_3 = (0, 0, 1)$ are eigenvectors for $\lambda_2 = -2$. Thus $B' = \{(1, 1, 0), (1, -1, 0), (0, 0, 1)\}$ consists of eigenvectors of $A$, and
$A' = \begin{bmatrix} 4 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & -2 \end{bmatrix}$, whose main diagonal entries are eigenvalues of $A$.
Keywords in Section 7.1:
eigenvalue problem
eigenvalue
eigenvector
characteristic equation
characteristic polynomial
eigenspace
multiplicity
linear transformation
diagonalization
7.2 Diagonalization

Diagonalization problem:
For a square matrix $A$, does there exist an invertible matrix $P$ such that $P^{-1}AP$ is diagonal?

Diagonalizable matrix:
Definition 1: A square matrix $A$ is called diagonalizable if there exists an invertible matrix $P$ such that $P^{-1}AP$ is a diagonal matrix (i.e., $P$ diagonalizes $A$)
Definition 2: A square matrix $A$ is called diagonalizable if $A$ is similar to a diagonal matrix
※ In Sec. 6.4, two square matrices $A$ and $B$ are similar if there exists an invertible matrix $P$ such that $B = P^{-1}AP$.

Notes:
In this section, I will show that the eigenvalue and eigenvector problem is closely related to the diagonalization problem
Thm. 7.4: Similar matrices have the same eigenvalues

If $A$ and $B$ are similar $n \times n$ matrices, then they have the same eigenvalues

Pf:
$A$ and $B$ are similar $\Rightarrow B = P^{-1}AP$ for some invertible $P$. Then
$\det(\lambda I - B) = \det(\lambda I - P^{-1}AP) = \det(P^{-1}(\lambda I - A)P) = \det(P^{-1})\det(\lambda I - A)\det(P) = \det(\lambda I - A)$,
so $A$ and $B$ have the same characteristic polynomial and hence the same eigenvalues.
※ For any diagonal matrix in the form of $D = \lambda I$, $P^{-1}DP = D$

Thm. 7.5: Condition for diagonalization

An $n \times n$ matrix $A$ is diagonalizable if and only if it has $n$ linearly independent eigenvectors.

Pf:
($\Rightarrow$)
Since $A$ is diagonalizable, there exists an invertible $P$ such that $D = P^{-1}AP$ is diagonal:
$D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}$, and $PD = [\lambda_1\mathbf{p}_1\ \lambda_2\mathbf{p}_2\ \cdots\ \lambda_n\mathbf{p}_n]$
$AP = PD$ (since $D = P^{-1}AP$)
$[A\mathbf{p}_1\ A\mathbf{p}_2\ \cdots\ A\mathbf{p}_n] = [\lambda_1\mathbf{p}_1\ \lambda_2\mathbf{p}_2\ \cdots\ \lambda_n\mathbf{p}_n]$
$\Rightarrow A\mathbf{p}_i = \lambda_i\mathbf{p}_i,\ i = 1, 2, \ldots, n$
(The above equations imply that the column vectors $\mathbf{p}_i$ of $P$ are eigenvectors of $A$, and the diagonal entries $\lambda_i$ in $D$ are eigenvalues of $A$)
Because $A$ is diagonalizable, $P$ is invertible
$\Rightarrow$ the columns of $P$, i.e., $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$, are linearly independent (see Slide 4.101 in the lecture notes)
Thus, $A$ has $n$ linearly independent eigenvectors
($\Leftarrow$)
Since $A$ has $n$ linearly independent eigenvectors $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ with corresponding eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ (which could be the same), we have
$A\mathbf{p}_i = \lambda_i\mathbf{p}_i,\ i = 1, 2, \ldots, n$
Let $P = [\mathbf{p}_1\ \mathbf{p}_2\ \cdots\ \mathbf{p}_n]$. Then
$AP = A[\mathbf{p}_1\ \mathbf{p}_2\ \cdots\ \mathbf{p}_n] = [A\mathbf{p}_1\ A\mathbf{p}_2\ \cdots\ A\mathbf{p}_n] = [\lambda_1\mathbf{p}_1\ \lambda_2\mathbf{p}_2\ \cdots\ \lambda_n\mathbf{p}_n]$
$= [\mathbf{p}_1\ \mathbf{p}_2\ \cdots\ \mathbf{p}_n]\begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} = PD$
Since $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ are linearly independent, $P$ is invertible (see Slide 4.101 in the lecture notes)
$AP = PD \Rightarrow P^{-1}AP = D$
$\Rightarrow A$ is diagonalizable (according to the definition of a diagonalizable matrix on Slide 7.27)
※ Note that the $\mathbf{p}_i$'s are linearly independent eigenvectors and the diagonal entries $\lambda_i$ in the resulting diagonalized $D$ are eigenvalues of $A$
Ex 4: A matrix that is not diagonalizable

Show that the following matrix is not diagonalizable:
$A = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$

Sol: Characteristic equation:
$|\lambda I - A| = \begin{vmatrix} \lambda-1 & -2 \\ 0 & \lambda-1 \end{vmatrix} = (\lambda - 1)^2 = 0$

The eigenvalue is $\lambda_1 = 1$; then solve $(\lambda_1 I - A)\mathbf{x} = \mathbf{0}$ for the eigenvectors:
$\lambda_1 I - A = I - A = \begin{bmatrix} 0 & -2 \\ 0 & 0 \end{bmatrix} \Rightarrow$ eigenvector $\mathbf{p}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$

Since $A$ does not have two linearly independent eigenvectors, $A$ is not diagonalizable.
Steps for diagonalizing an $n \times n$ square matrix:

Step 1: Find $n$ linearly independent eigenvectors $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_n$ of $A$ with corresponding eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ (if $n$ linearly independent eigenvectors do not exist, $A$ is not diagonalizable)

Step 2: Let $P = [\mathbf{p}_1\ \mathbf{p}_2\ \cdots\ \mathbf{p}_n]$

Step 3: $P^{-1}AP = D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}$, where $A\mathbf{p}_i = \lambda_i\mathbf{p}_i,\ i = 1, 2, \ldots, n$
Ex 5: Diagonalizing a matrix

$A = \begin{bmatrix} 1 & -1 & -1 \\ 1 & 3 & 1 \\ -3 & 1 & -1 \end{bmatrix}$

Find a matrix $P$ such that $P^{-1}AP$ is diagonal.

Sol: Characteristic equation: $|\lambda I - A| = (\lambda - 2)(\lambda + 2)(\lambda - 3) = 0$, so $\lambda_1 = 2$, $\lambda_2 = -2$, $\lambda_3 = 3$.

$\lambda_1 = 2$: $\lambda_1 I - A = \begin{bmatrix} 1 & 1 & 1 \\ -1 & -1 & -1 \\ 3 & -1 & 3 \end{bmatrix} \xrightarrow{\text{G.-J. E.}} \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$
$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -t \\ 0 \\ t \end{bmatrix} \Rightarrow$ eigenvector $\mathbf{p}_1 = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}$

$\lambda_2 = -2$: $\lambda_2 I - A = \begin{bmatrix} -3 & 1 & 1 \\ -1 & -5 & -1 \\ 3 & -1 & -1 \end{bmatrix} \xrightarrow{\text{G.-J. E.}} \begin{bmatrix} 1 & 0 & -\frac{1}{4} \\ 0 & 1 & \frac{1}{4} \\ 0 & 0 & 0 \end{bmatrix}$
$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} \frac{1}{4}t \\ -\frac{1}{4}t \\ t \end{bmatrix} \Rightarrow$ eigenvector $\mathbf{p}_2 = \begin{bmatrix} 1 \\ -1 \\ 4 \end{bmatrix}$

$\lambda_3 = 3$: $\lambda_3 I - A = \begin{bmatrix} 2 & 1 & 1 \\ -1 & 0 & -1 \\ 3 & -1 & 4 \end{bmatrix} \xrightarrow{\text{G.-J. E.}} \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \end{bmatrix}$
$\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \begin{bmatrix} -t \\ t \\ t \end{bmatrix} \Rightarrow$ eigenvector $\mathbf{p}_3 = \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix}$

$P = [\mathbf{p}_1\ \mathbf{p}_2\ \mathbf{p}_3] = \begin{bmatrix} -1 & 1 & -1 \\ 0 & -1 & 1 \\ 1 & 4 & 1 \end{bmatrix}$, and it follows that

$P^{-1}AP = \begin{bmatrix} 2 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 3 \end{bmatrix}$
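※ Ex 5 can be verified numerically (a NumPy sketch, not part of the original solution):

```python
import numpy as np

A = np.array([[ 1.0, -1.0, -1.0],
              [ 1.0,  3.0,  1.0],
              [-3.0,  1.0, -1.0]])
P = np.array([[-1.0,  1.0, -1.0],
              [ 0.0, -1.0,  1.0],
              [ 1.0,  4.0,  1.0]])   # eigenvectors p1, p2, p3 as columns

D = np.linalg.inv(P) @ A @ P
print(np.round(D, 10))               # diag(2, -2, 3)
```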
Note: a quick way to calculate $A^k$ based on the diagonalization technique

(1) $D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \Rightarrow D^k = \begin{bmatrix} \lambda_1^k & 0 & \cdots & 0 \\ 0 & \lambda_2^k & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^k \end{bmatrix}$

(2) $D = P^{-1}AP \Rightarrow D^k = \underbrace{(P^{-1}AP)(P^{-1}AP)\cdots(P^{-1}AP)}_{\text{repeat } k \text{ times}} = P^{-1}A^kP$

$\Rightarrow A^k = PD^kP^{-1}$, where $D^k = \begin{bmatrix} \lambda_1^k & 0 & \cdots & 0 \\ 0 & \lambda_2^k & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n^k \end{bmatrix}$
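※ A small check of $A^k = PD^kP^{-1}$, reusing the matrices from Ex 5 (a NumPy sketch, not part of the original note):

```python
import numpy as np

A = np.array([[ 1.0, -1.0, -1.0],
              [ 1.0,  3.0,  1.0],
              [-3.0,  1.0, -1.0]])
P = np.array([[-1.0,  1.0, -1.0],
              [ 0.0, -1.0,  1.0],
              [ 1.0,  4.0,  1.0]])
D = np.diag([2.0, -2.0, 3.0])

k = 5
# A^k = P D^k P^{-1}; D^k is just the elementwise power of the diagonal
Ak = P @ np.diag(np.diag(D) ** k) @ np.linalg.inv(P)
print(np.allclose(Ak, np.linalg.matrix_power(A, k)))   # True
```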
Thm. 7.6: Sufficient condition for diagonalization

If an $n \times n$ matrix $A$ has $n$ distinct eigenvalues, then the corresponding eigenvectors are linearly independent and thus $A$ is diagonalizable according to Thm. 7.5.

Pf:
Let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be distinct eigenvalues with corresponding eigenvectors $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$. Suppose, for contradiction, that the first $m$ eigenvectors are linearly independent but the first $m+1$ are linearly dependent, i.e.,
$\mathbf{x}_{m+1} = c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + \cdots + c_m\mathbf{x}_m$,  (1)
where the $c_i$'s are not all zero. Multiplying both sides of Eq. (1) by $A$ yields
$A\mathbf{x}_{m+1} = c_1A\mathbf{x}_1 + c_2A\mathbf{x}_2 + \cdots + c_mA\mathbf{x}_m$
$\lambda_{m+1}\mathbf{x}_{m+1} = c_1\lambda_1\mathbf{x}_1 + c_2\lambda_2\mathbf{x}_2 + \cdots + c_m\lambda_m\mathbf{x}_m$  (2)

On the other hand, multiplying both sides of Eq. (1) by $\lambda_{m+1}$ yields
$\lambda_{m+1}\mathbf{x}_{m+1} = c_1\lambda_{m+1}\mathbf{x}_1 + c_2\lambda_{m+1}\mathbf{x}_2 + \cdots + c_m\lambda_{m+1}\mathbf{x}_m$  (3)

Subtracting (3) from (2) gives $c_1(\lambda_1 - \lambda_{m+1})\mathbf{x}_1 + \cdots + c_m(\lambda_m - \lambda_{m+1})\mathbf{x}_m = \mathbf{0}$. Since $\mathbf{x}_1, \ldots, \mathbf{x}_m$ are linearly independent and all the eigenvalues are distinct, it follows that all the $c_i$'s equal 0, which contradicts the assumption that $\mathbf{x}_{m+1}$ is a nontrivial linear combination of the first $m$ eigenvectors (and Eq. (1) would then force $\mathbf{x}_{m+1} = \mathbf{0}$, impossible for an eigenvector). So the set of $n$ eigenvectors is linearly independent given $n$ distinct eigenvalues, and according to Thm. 7.5, we can conclude that $A$ is diagonalizable.
Ex 7: Determining whether a matrix is diagonalizable

$A = \begin{bmatrix} 1 & -2 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & -3 \end{bmatrix}$

Sol: Since $A$ is triangular, its eigenvalues are the main diagonal entries, $\lambda_1 = 1$, $\lambda_2 = 0$, $\lambda_3 = -3$ (Thm. 7.3). These three eigenvalues are distinct, so $A$ is diagonalizable by Thm. 7.6.
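※ A quick numerical distinctness check (a NumPy sketch using the matrix reconstructed above, not part of the original example):

```python
import numpy as np

A = np.array([[1.0, -2.0,  1.0],
              [0.0,  0.0,  1.0],
              [0.0,  0.0, -3.0]])
eigvals = np.linalg.eig(A)[0]
print(eigvals)                                         # [ 1.  0. -3.]
# n distinct eigenvalues -> diagonalizable (Thm. 7.6)
print(len(set(np.round(eigvals, 9))) == A.shape[0])    # True
```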
Ex 8: Finding a diagonalized matrix for a linear transformation

Let $T: R^3 \to R^3$ be the linear transformation given by
$T(x_1, x_2, x_3) = (x_1 - x_2 - x_3,\ x_1 + 3x_2 + x_3,\ -3x_1 + x_2 - x_3)$
Find a basis $B'$ for $R^3$ such that the matrix for $T$ relative to $B'$ is diagonal.

Sol:
The standard matrix for $T$ is
$A = \begin{bmatrix} 1 & -1 & -1 \\ 1 & 3 & 1 \\ -3 & 1 & -1 \end{bmatrix}$

From Ex 5 you know that $\lambda_1 = 2$, $\lambda_2 = -2$, $\lambda_3 = 3$, and thus $A$ is diagonalizable. So, similar to the result on Slide 7.25, the three linearly independent eigenvectors found in Ex 5 can be used to form the basis $B'$. That is,
$B' = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\} = \{(-1, 0, 1), (1, -1, 4), (-1, 1, 1)\}$

The matrix for $T$ relative to this basis is
$A' = \big[[T(\mathbf{v}_1)]_{B'}\ [T(\mathbf{v}_2)]_{B'}\ [T(\mathbf{v}_3)]_{B'}\big] = \begin{bmatrix} 2 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 3 \end{bmatrix}$

※ Note that it is not necessary to calculate $A'$ through the above equation. According to the result on Slide 7.25, we already know that $A'$ is a diagonal matrix and its main diagonal entries are the corresponding eigenvalues of $A$.
Keywords in Section 7.2:
diagonalization problem
diagonalization
diagonalizable matrix
7.3 Symmetric Matrices and Orthogonal Diagonalization

Symmetric matrix:
A square matrix $A$ is symmetric if it is equal to its transpose: $A = A^T$

For a general $2 \times 2$ symmetric matrix $A = \begin{bmatrix} a & c \\ c & b \end{bmatrix}$, the discriminant of its characteristic polynomial is $(a-b)^2 + 4c^2 \ge 0$, so its eigenvalues are always real:

(1) If $a = b$ and $c = 0$, then $A = \begin{bmatrix} a & 0 \\ 0 & a \end{bmatrix}$ is itself a diagonal matrix
※ Note that in this case, $A$ has one eigenvalue, $a$, and its multiplicity is 2
(2) If $(a-b)^2 + 4c^2 > 0$, then $A$ has two distinct real eigenvalues
Ex: An orthogonal matrix (a square matrix $P$ is orthogonal if it is invertible and $P^{-1} = P^T$)

$P = \begin{bmatrix} \frac{1}{3} & -\frac{2}{\sqrt{5}} & -\frac{2}{3\sqrt{5}} \\ \frac{2}{3} & \frac{1}{\sqrt{5}} & -\frac{4}{3\sqrt{5}} \\ \frac{2}{3} & 0 & \frac{5}{3\sqrt{5}} \end{bmatrix}$

Moreover, let $\mathbf{p}_1 = \begin{bmatrix} \frac{1}{3} \\ \frac{2}{3} \\ \frac{2}{3} \end{bmatrix}$, $\mathbf{p}_2 = \begin{bmatrix} -\frac{2}{\sqrt{5}} \\ \frac{1}{\sqrt{5}} \\ 0 \end{bmatrix}$, and $\mathbf{p}_3 = \begin{bmatrix} -\frac{2}{3\sqrt{5}} \\ -\frac{4}{3\sqrt{5}} \\ \frac{5}{3\sqrt{5}} \end{bmatrix}$;

we can verify that $\mathbf{p}_1 \cdot \mathbf{p}_2 = \mathbf{p}_1 \cdot \mathbf{p}_3 = \mathbf{p}_2 \cdot \mathbf{p}_3 = 0$ and $\mathbf{p}_1 \cdot \mathbf{p}_1 = \mathbf{p}_2 \cdot \mathbf{p}_2 = \mathbf{p}_3 \cdot \mathbf{p}_3 = 1$, i.e., the column vectors of $P$ form an orthonormal set.
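※ The orthonormality of the columns is equivalent to $P^TP = I$, which is easy to check numerically (a NumPy sketch, not part of the original example):

```python
import numpy as np

s5 = np.sqrt(5.0)
P = np.array([[1/3, -2/s5, -2/(3*s5)],
              [2/3,  1/s5, -4/(3*s5)],
              [2/3,  0.0,   5/(3*s5)]])

print(np.allclose(P.T @ P, np.eye(3)))   # True: columns are orthonormal
```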
Thm. 7.9: Properties of symmetric matrices

Let $A$ be an $n \times n$ symmetric matrix. If $\lambda_1$ and $\lambda_2$ are distinct eigenvalues of $A$, then their corresponding eigenvectors $\mathbf{x}_1$ and $\mathbf{x}_2$ are orthogonal. (Thm. 7.6 only states that eigenvectors corresponding to distinct eigenvalues are linearly independent.)

Pf:
$\lambda_1(\mathbf{x}_1 \cdot \mathbf{x}_2) = (\lambda_1\mathbf{x}_1) \cdot \mathbf{x}_2 = (A\mathbf{x}_1) \cdot \mathbf{x}_2 = (A\mathbf{x}_1)^T\mathbf{x}_2 = (\mathbf{x}_1^TA^T)\mathbf{x}_2$
$= (\mathbf{x}_1^TA)\mathbf{x}_2$ (because $A$ is symmetric)
$= \mathbf{x}_1^T(A\mathbf{x}_2) = \mathbf{x}_1^T(\lambda_2\mathbf{x}_2) = \mathbf{x}_1 \cdot (\lambda_2\mathbf{x}_2) = \lambda_2(\mathbf{x}_1 \cdot \mathbf{x}_2)$

Hence $(\lambda_1 - \lambda_2)(\mathbf{x}_1 \cdot \mathbf{x}_2) = 0$, and since $\lambda_1 \ne \lambda_2$, we must have $\mathbf{x}_1 \cdot \mathbf{x}_2 = 0$, i.e., $\mathbf{x}_1$ and $\mathbf{x}_2$ are orthogonal.
Ex 7: Determining whether a matrix is orthogonally diagonalizable
(Only symmetric matrices are orthogonally diagonalizable; Thm. 7.10 guarantees that every symmetric matrix is.)

$A_1 = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 1 \end{bmatrix}$: symmetric $\Rightarrow$ orthogonally diagonalizable

$A_2 = \begin{bmatrix} 5 & 2 & 1 \\ 2 & 1 & 8 \\ 1 & 8 & 0 \end{bmatrix}$: symmetric $\Rightarrow$ orthogonally diagonalizable

$A_3 = \begin{bmatrix} 3 & 2 & 0 \\ 2 & 0 & 1 \end{bmatrix}$: not square, hence not symmetric $\Rightarrow$ not orthogonally diagonalizable

$A_4 = \begin{bmatrix} 0 & 0 \\ 0 & -2 \end{bmatrix}$: symmetric (diagonal) $\Rightarrow$ orthogonally diagonalizable
Ex 9: Orthogonal diagonalization

Find an orthogonal matrix $P$ that diagonalizes $A$.
$A = \begin{bmatrix} 2 & 2 & -2 \\ 2 & -1 & 4 \\ -2 & 4 & -1 \end{bmatrix}$

Sol:
(1) $|\lambda I - A| = (\lambda - 3)^2(\lambda + 6) = 0$
$\lambda_1 = -6$, $\lambda_2 = 3$ (with multiplicity 2)

(2) For $\lambda_1 = -6$: $\mathbf{v}_1 = (1, -2, 2) \Rightarrow \mathbf{u}_1 = \frac{\mathbf{v}_1}{\|\mathbf{v}_1\|} = \left(\frac{1}{3}, -\frac{2}{3}, \frac{2}{3}\right)$

(3) For $\lambda_2 = 3$: $\mathbf{v}_2 = (2, 1, 0)$, $\mathbf{v}_3 = (-2, 4, 5)$
※ Verify Thm. 7.9: $\mathbf{v}_1 \cdot \mathbf{v}_2 = \mathbf{v}_1 \cdot \mathbf{v}_3 = 0$ (eigenvectors for distinct eigenvalues are orthogonal)

※ If $\mathbf{v}_2$ and $\mathbf{v}_3$ were not orthogonal, the Gram-Schmidt process should be performed. Here $\mathbf{v}_2 \cdot \mathbf{v}_3 = 0$ already, so we simply normalize $\mathbf{v}_2$ and $\mathbf{v}_3$ to find the corresponding unit vectors:

$\mathbf{u}_2 = \frac{\mathbf{v}_2}{\|\mathbf{v}_2\|} = \left(\frac{2}{\sqrt{5}}, \frac{1}{\sqrt{5}}, 0\right)$, $\mathbf{u}_3 = \frac{\mathbf{v}_3}{\|\mathbf{v}_3\|} = \left(-\frac{2}{3\sqrt{5}}, \frac{4}{3\sqrt{5}}, \frac{5}{3\sqrt{5}}\right)$

$P = [\mathbf{u}_1\ \mathbf{u}_2\ \mathbf{u}_3] = \begin{bmatrix} \frac{1}{3} & \frac{2}{\sqrt{5}} & -\frac{2}{3\sqrt{5}} \\ -\frac{2}{3} & \frac{1}{\sqrt{5}} & \frac{4}{3\sqrt{5}} \\ \frac{2}{3} & 0 & \frac{5}{3\sqrt{5}} \end{bmatrix}$, $P^{-1}AP = \begin{bmatrix} -6 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 3 \end{bmatrix}$

※ Note that there are some calculation errors in the solution of Ex 9 in the textbook.
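※ For symmetric matrices, np.linalg.eigh returns real eigenvalues and an orthogonal eigenvector matrix directly, even with the repeated eigenvalue (a NumPy sketch, not part of the original solution):

```python
import numpy as np

A = np.array([[ 2.0,  2.0, -2.0],
              [ 2.0, -1.0,  4.0],
              [-2.0,  4.0, -1.0]])

eigvals, P = np.linalg.eigh(A)   # eigenvalues in ascending order
print(eigvals)                                        # [-6.  3.  3.]
print(np.allclose(P.T @ A @ P, np.diag(eigvals)))     # True
print(np.allclose(P.T @ P, np.eye(3)))                # True: P is orthogonal
```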
Keywords in Section 7.3:
symmetric matrix
orthogonal matrix
orthonormal set
orthogonal diagonalization
7.4 Applications of Eigenvalues and Eigenvectors

Rotation for quadratic equations: $ax^2 + bxy + cy^2 + dx + ey + f = 0$

Ex 5: Identify the graphs of the following quadratic equations
(a) $4x^2 + 9y^2 - 36 = 0$  (b) $13x^2 - 10xy + 13y^2 - 72 = 0$

Sol:
(a) In standard form, we obtain $\frac{x^2}{3^2} + \frac{y^2}{2^2} = 1$, an ellipse.

(b) $13x^2 - 10xy + 13y^2 - 72 = 0$
※ Since there is an xy-term, it is difficult to identify the graph of this equation. In fact, it is also an ellipse, but one that is oblique (rotated) in the xy-plane.

If we define $X = \begin{bmatrix} x \\ y \end{bmatrix}$ and $A = \begin{bmatrix} a & b/2 \\ b/2 & c \end{bmatrix}$, then $X^TAX = ax^2 + bxy + cy^2$ (the quadratic form), and the quadratic equation can be expressed in terms of $X$ as
$X^TAX + \begin{bmatrix} d & e \end{bmatrix}X + f = 0$
Principal Axes Theorem

For a conic whose equation is $ax^2 + bxy + cy^2 + dx + ey + f = 0$, the rotation to eliminate the xy-term is achieved by $X = PX'$, where $P$ is an orthogonal matrix that diagonalizes $A$. That is,
$P^{-1}AP = P^TAP = D = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix}$,
where $\lambda_1$ and $\lambda_2$ are eigenvalues of $A$. The equation of the rotated conic is given by
$\lambda_1(x')^2 + \lambda_2(y')^2 + \begin{bmatrix} d & e \end{bmatrix}PX' + f = 0$
Pf:
According to Thm. 7.10, since $A$ is symmetric, we can conclude that there exists an orthogonal matrix $P$ such that $P^{-1}AP = P^TAP = D$ is diagonal.
Replacing $X$ with $PX'$, the quadratic form becomes
$X^TAX = (PX')^TA(PX') = (X')^TP^TAPX' = (X')^TDX' = \lambda_1(x')^2 + \lambda_2(y')^2$

※ It is obvious that the new quadratic form in terms of $X'$ has no $x'y'$-term, and the coefficients of $(x')^2$ and $(y')^2$ are the two eigenvalues of the matrix $A$.
※ $X = PX' = [\mathbf{v}_1\ \mathbf{v}_2]\begin{bmatrix} x' \\ y' \end{bmatrix} = x'\mathbf{v}_1 + y'\mathbf{v}_2$. Since $\begin{bmatrix} x \\ y \end{bmatrix}$ and $\begin{bmatrix} x' \\ y' \end{bmatrix}$ are the original and new coordinates, the roles of $\mathbf{v}_1$ and $\mathbf{v}_2$ (the eigenvectors of $A$) are like the basis vectors (or axis vectors) of the new coordinate system.
Ex 6: Rotation of a conic

Perform a rotation of axes to eliminate the xy-term in the quadratic equation
$13x^2 - 10xy + 13y^2 - 72 = 0$

Sol:
The matrix of the quadratic form associated with this equation is
$A = \begin{bmatrix} 13 & -5 \\ -5 & 13 \end{bmatrix}$

The eigenvalues are $\lambda_1 = 8$ and $\lambda_2 = 18$, and the corresponding eigenvectors are
$\mathbf{x}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $\mathbf{x}_2 = \begin{bmatrix} -1 \\ 1 \end{bmatrix}$

After normalizing each eigenvector, we obtain the orthogonal matrix
$P = \begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} = \begin{bmatrix} \cos 45° & -\sin 45° \\ \sin 45° & \cos 45° \end{bmatrix}$

※ According to the results on p. 268 in Ch 4, $X = PX'$ is equivalent to rotating the xy-coordinates by 45 degrees to form the new x'y'-coordinates, which is also illustrated in the figure on Slide 7.62.

The equation of the rotated conic is therefore
$8(x')^2 + 18(y')^2 - 72 = 0$, i.e., $\frac{(x')^2}{3^2} + \frac{(y')^2}{2^2} = 1$, an ellipse in the x'y'-coordinate system.
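※ The eigenvalues and the diagonalized quadratic form in Ex 6 can be checked numerically (a NumPy sketch, not part of the original solution):

```python
import numpy as np

# 13x^2 - 10xy + 13y^2 - 72 = 0  ->  quadratic-form matrix A
A = np.array([[13.0, -5.0],
              [-5.0, 13.0]])

eigvals, P = np.linalg.eigh(A)
print(eigvals)                 # [ 8. 18.]

# In rotated coordinates X' (where X = PX'), the conic has no x'y'-term:
# 8(x')^2 + 18(y')^2 = 72, i.e. (x')^2/9 + (y')^2/4 = 1
D = P.T @ A @ P
print(np.round(D, 10))         # diag(8, 18)
```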
※ The same technique extends to quadratic equations in three variables: writing the equation as $X^TAX + \begin{bmatrix} g & h & i \end{bmatrix}X + j = 0$ with $X = [x\ y\ z]^T$, an analogous rotation eliminates the cross-product terms and identifies the quadric surface.
Keywords in Section 7.4:
quadratic form
principal axes theorem
7.5 Principal Component Analysis

Principal component analysis (PCA)
It is a way of identifying the underlying patterns in data
It can extract information from a large data set with many variables and approximate this data set with fewer factors
In other words, it can reduce the number of variables to a more manageable set

Steps of principal component analysis
Step 1: Get some data
Step 2: Subtract the mean
Step 3: Calculate the covariance matrix
Step 4: Calculate the eigenvectors and eigenvalues of the covariance matrix
Step 5: Derive the transformed data set
Step 6: Get the original data back
Step 1 (data) and Step 2 (subtract the mean):

   x      y    |  x (demeaned)   y (demeaned)
  2.5    2.4   |     0.69           0.49
  0.5    0.7   |    -1.31          -1.21
  2.2    2.9   |     0.39           0.99
  1.9    2.2   |     0.09           0.29
  3.1    3.0   |     1.29           1.09
  2.3    2.7   |     0.49           0.79
  2.0    1.6   |     0.19          -0.31
  1.0    1.1   |    -0.81          -0.81
  1.5    1.6   |    -0.31          -0.31
  1.1    0.9   |    -0.71          -1.01
 mean: 1.81, 1.91   demeaned means: 0, 0
Step 3: Calculate the covariance matrix of the demeaned data:

$\mathrm{var}(X) = E\left[\begin{bmatrix} x - \bar{x} \\ y - \bar{y} \end{bmatrix}\begin{bmatrix} x - \bar{x} & y - \bar{y} \end{bmatrix}\right] = \begin{bmatrix} \mathrm{var}(x) & \mathrm{cov}(x, y) \\ \mathrm{cov}(x, y) & \mathrm{var}(y) \end{bmatrix}$

$A = \begin{bmatrix} 0.616556 & 0.615444 \\ 0.615444 & 0.716556 \end{bmatrix}$
Step 4: Calculate the eigenvectors and eigenvalues of the covariance matrix A:

$\lambda_1 = 1.284028$, $\mathbf{v}_1 = \begin{bmatrix} -0.67787 \\ -0.73518 \end{bmatrix}$; $\lambda_2 = 0.049083$, $\mathbf{v}_2 = \begin{bmatrix} -0.73518 \\ 0.67787 \end{bmatrix}$

※ The sign of each eigenvector is arbitrary; the signs here are chosen consistently with the transformed data below.

Step 5: Derive the transformed data set: each demeaned data point $\mathbf{x}$ is mapped to $\mathbf{x}' = P^T\mathbf{x}$ with $P = [\mathbf{v}_1\ \mathbf{v}_2]$. Keeping only the first principal component amounts to setting $y' = 0$:

  both components    |  first component only
    x'        y'     |    x'        y'
 -0.82797  -0.17512  | -0.82797     0
  1.77758   0.14286  |  1.77758     0
 -0.99220   0.38437  | -0.99220     0
 -0.27421   0.13042  | -0.27421     0
 -1.67580  -0.20950  | -1.67580     0
 -0.91295   0.17528  | -0.91295     0
  0.09911  -0.34982  |  0.09911     0
  1.14457   0.04642  |  1.14457     0
  0.43805   0.01776  |  0.43805     0
  1.22382  -0.16268  |  1.22382     0
 means: 0, 0         |  means: 0, 0
 var = diag(1.284028, 0.049083) | var = diag(1.284028, 0)
Step 6: Get the original data back:
$X^T = (X')^TP^{-1} + \text{original mean} = (X')^TP^T + \text{original mean}$, where $P = [\mathbf{v}_1\ \mathbf{v}_2]$ is orthogonal

$P = \begin{bmatrix} v_{11} & v_{12} \\ v_{21} & v_{22} \end{bmatrix} = \begin{bmatrix} -0.67787 & -0.73518 \\ -0.73518 & 0.67787 \end{bmatrix}$, so each recovered pair is $(x, y) = (v_{11}x' + v_{12}y' + \bar{x},\ v_{21}x' + v_{22}y' + \bar{y})$

 case 1 (both x' and y')  |  case 2 (x' only)
    x       y             |     x       y
   2.5     2.4            |    2.37    2.52
   0.5     0.7            |    0.61    0.60
   2.2     2.9            |    2.48    2.64
   1.9     2.2            |    2.00    2.11
   3.1     3.0            |    2.95    3.14
   2.3     2.7            |    2.43    2.58
   2.0     1.6            |    1.74    1.84
   1.0     1.1            |    1.03    1.07
   1.5     1.6            |    1.51    1.59
   1.1     0.9            |    0.98    1.01
 mean: 1.81, 1.91         |  mean: 1.81, 1.91

※ We can recover the original data set exactly if we take both $\mathbf{v}_1$ and $\mathbf{v}_2$, and thus both $x'$ and $y'$, into account when deriving the transformed data (case 1).
※ Although only $\mathbf{v}_1$, and thus only $x'$, is kept in case 2, the recovered data are still close to the original. That means $x'$ can serve as a common factor that almost fully explains both series $x$ and $y$.
(Figure: the demeaned data plotted in the xy-plane together with the directions of the eigenvectors $\mathbf{v}_1$ and $\mathbf{v}_2$.)
Factor loadings: the correlations between the principal components ($F_1 = x'$ and $F_2 = y'$) and the original variables ($x_1 = x$ and $x_2 = y$):

$l_{ij} = \rho_{F_i, x_j} = \frac{v_{ij}\sqrt{\lambda_i}}{\mathrm{s.d.}(x_j)}$, where $v_{ij}$ denotes the $j$-th component of $\mathbf{v}_i$

$l_{11} = \rho_{x'x} = \frac{0.67787\sqrt{1.284028}}{0.785211} = 0.97824$,  $l_{21} = \rho_{y'x} = \frac{0.73518\sqrt{0.049083}}{0.785211} = 0.20744$
$l_{12} = \rho_{x'y} = \frac{0.73518\sqrt{1.284028}}{0.846496} = 0.98414$,  $l_{22} = \rho_{y'y} = \frac{0.67787\sqrt{0.049083}}{0.846496} = 0.177409$

(Here $0.785211 = \mathrm{s.d.}(x)$ and $0.846496 = \mathrm{s.d.}(y)$; the loadings are reported in absolute value since eigenvector signs are arbitrary.)
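※ The whole PCA recipe on this data set can be reproduced in a few lines (a NumPy sketch, not part of the original slides; np.cov uses the $n-1$ divisor, matching the covariance matrix above, and eigenvector signs may differ from the tables, which only flips the signs of the scores):

```python
import numpy as np

# Step 1: the example data
x = np.array([2.5, 0.5, 2.2, 1.9, 3.1, 2.3, 2.0, 1.0, 1.5, 1.1])
y = np.array([2.4, 0.7, 2.9, 2.2, 3.0, 2.7, 1.6, 1.1, 1.6, 0.9])
X = np.column_stack([x, y])

Xd = X - X.mean(axis=0)           # Step 2: subtract the mean
A = np.cov(Xd.T)                  # Step 3: covariance matrix
eigvals, V = np.linalg.eigh(A)    # Step 4: eigenvalues and eigenvectors

# Sort principal components by decreasing eigenvalue
order = np.argsort(eigvals)[::-1]
eigvals, V = eigvals[order], V[:, order]
print(eigvals)                    # approx [1.284028 0.049083]

scores = Xd @ V                   # Step 5: transformed data (x', y')

# Step 6: get the data back using only the first principal component
approx = scores[:, :1] @ V[:, :1].T + X.mean(axis=0)
print(np.round(approx, 2))        # close to the original data (case 2 table)
```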