
College of Engineering, Pune

MA101
Note 6 by K. D. Joshi

MATRIX REPRESENTATION OF LINEAR TRANSFORMATIONS


In the last note we saw that every m × n matrix A induces a linear transformation
T = TA : IR^n → IR^m whose properties are intimately related to those of A. For
example, the rank of TA , defined as the dimension of its range space, equals the rank of
the matrix A (which can be defined in three different, but ultimately equivalent ways).
We also saw (see the exercises of Note 5) that certain properties (e.g. nilpotency or
idempotency) carry over from A to TA and can be proved more naturally for linear
transformations than for the matrices which induce them. This happens because the
highly clumsy matrix multiplication corresponds to the very natural concept
of the composite of two functions. So, in this respect, the linear transformations
are superior. But in some other respects, the matrices are superior. For example, there
is a systematic procedure (based on row reduction) for calculating the rank of a matrix.
But there is no such procedure for calculating the rank of a linear transformation.
Summing up, linear transformations are more elegant but matrices are more practical.
To combine both the virtues, we must have some method to convert one to the other.
We already associated a linear transformation to a matrix. In this note, we reverse the
procedure. That is, given a linear transformation T : V → W , we shall associate a
matrix, say A, to it. Any calculations needed for T (such as finding its rank) will be
done in terms of the associated matrix A. This is a sort of a symbiosis, like a couple in
which one partner earns money and the other manages the house.

Associating a Matrix to a Linear Transformation


A serious limitation in associating a matrix to a linear transformation is that its
domain and codomain both have to be finite dimensional, because these dimensions
become the number of columns and the number of rows respectively of the associated matrix.
(If one wants, one can define infinite matrices, i.e. matrices with infinitely many rows
and columns. But they are not easy to handle. And, in any case, even these infinite
matrices, are inadequate to represent certain transformations.)
So, let us suppose that T : V → W is a linear transformation where V and W
are vector spaces of dimensions n and m respectively. We shall associate an m × n matrix
to T as follows. First, fix an ordered basis (v1 , v2 , . . . , vn ) for V and an ordered basis
(w1 , w2 , . . . , wm ) for W . (Note that the matrix to be associated to T will depend on the
order of the vectors in both the bases too and that is why we take ordered bases, i.e. bases
whose elements are ordered in a particular way.) The elements T (v1 ), T (v2 ), . . . , T (vn )

are in W and so can be expressed uniquely as linear combinations of the basis elements
w1 , w2 , . . . , wm , say,


T (v1 ) = a11 w1 + a21 w2 + . . . + am1 wm
T (v2 ) = a12 w1 + a22 w2 + . . . + am2 wm
  ...                                                            (1)
T (vn ) = a1n w1 + a2n w2 + . . . + amn wm

Then the m × n matrix A = (aij ) is called the matrix of T w.r.t. the ordered
bases (v1 , v2 , . . . , vn ) for V and (w1 , w2 , . . . , wm ) for W . (Note that, unlike in the
case of writing a system of linear equations where the first suffix is common for all
the coefficients appearing in the same row, here the second suffix is common to all the
coefficients appearing in the same row.) The system (1) can be written compactly as


[T(v1) T(v2) . . . T(vn)]^t = A^t [w1 w2 . . . wm]^t                  (2)
where on the R. H. S. we are multiplying a scalar matrix with a vector matrix. A
good way to remember the matrix A associated to a linear transformation T (with given
ordered bases for the domain and the codomain) is to note that for every j = 1, 2, . . . , n,
the j-th column of A consists of the coefficients when the vector T (vj ) is expressed as a
linear combination of the basis elements of W .
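
This column-by-column rule is easy to mechanise. The short sketch below is an editorial illustration and is not part of the original note: the transformation T and the two bases are made up, and each column of A is found by solving a small linear system that expresses T(vj) in the codomain basis.

    # Illustrative sketch: build the matrix of a linear transformation column by column.
    # The transformation T and the bases below are hypothetical, not taken from the note.
    import numpy as np

    def matrix_of(T, domain_basis, codomain_basis):
        """Return the matrix of T w.r.t. the given ordered bases.

        The j-th column holds the coefficients of T(v_j) written as a linear
        combination of the codomain basis vectors w_1, ..., w_m.
        """
        W = np.column_stack(codomain_basis)                  # columns are w_1, ..., w_m
        return np.column_stack([np.linalg.solve(W, T(v)) for v in domain_basis])

    # T : IR^3 -> IR^2 given by T(x, y, z) = (x + y, 2x + z)
    T = lambda v: np.array([v[0] + v[1], 2 * v[0] + v[2]])

    domain_basis   = list(np.eye(3))                                 # standard basis of IR^3
    codomain_basis = [np.array([1.0, 0.0]), np.array([1.0, 1.0])]    # a non-standard basis of IR^2

    print(matrix_of(T, domain_basis, codomain_basis))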
Let us see how this matrix A helps us write down the image of a typical element, say
v of the domain V . First write v as c1 v1 + c2 v2 + . . . + cn vn . Similarly, write T (v) as
b1 w1 + b2 w2 + . . . + bm wm . We shall show how these coefficients b1 , b2 , . . . , bm can be calculated
from the coefficients c1 , c2 , . . . cn if we know the matrix A. For this, we express linear
combinations of vectors as products of scalar and vector matrices. By linearity of T and
using (2) above, we have

[b1 b2 . . . bm ] [w1 w2 . . . wm ]^t = T (v) = c1 T (v1 ) + c2 T (v2 ) + . . . + cn T (vn )
                                      = [c1 c2 . . . cn ] [T (v1 ) T (v2 ) . . . T (vn )]^t
                                      = [c1 c2 . . . cn ] A^t [w1 w2 . . . wm ]^t

As the vectors w1 , w2 , . . . , wm are linearly independent, the preceding equality means
[b1 b2 . . . bm ] = [c1 c2 . . . cn ]A^t . Taking transposes, we have

b = Ac (3)

where b and c are the column vectors [b1 b2 . . . bm ]^t and [c1 c2 . . . cn ]^t respectively.
Worded differently, once we fix the ordered bases (v1 , v2 , . . . , vn ) and (w1 , w2 , . . . , wm ),
elements of V correspond to column vectors of length n while those of W to column
vectors of length m. With these identifications (3) says that the transformation T
behaves very much like the linear transformation induced by the matrix A. As we shall
see later, this fact is crucial in converting concepts about abstract linear transformations
to the corresponding concepts about matrices which are often easier to handle.
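
Relation (3) can be checked numerically for any concrete example. The sketch below is an editorial illustration with made-up data, not part of the note: it computes the coordinate column c of a vector v, the coordinate column b of T(v), and confirms that b = Ac.

    # Illustrative check of b = Ac; the transformation and bases are hypothetical.
    import numpy as np

    T = lambda v: np.array([v[0] + v[1], 2 * v[0] + v[2]])           # T : IR^3 -> IR^2

    V = np.array([[1., 1., 1.], [0., 1., 1.], [0., 0., 1.]])         # columns: v_1, v_2, v_3
    W = np.array([[1., 1.], [0., 1.]])                               # columns: w_1, w_2

    # Matrix of T: j-th column = coordinates of T(v_j) w.r.t. (w_1, w_2).
    A = np.column_stack([np.linalg.solve(W, T(V[:, j])) for j in range(3)])

    v = np.array([3., -1., 2.])
    c = np.linalg.solve(V, v)                # coordinates of v w.r.t. (v_1, v_2, v_3)
    b = np.linalg.solve(W, T(v))             # coordinates of T(v) w.r.t. (w_1, w_2)
    print(np.allclose(b, A @ c))             # True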

Matrices Associated to Composites


It is obvious that if T1 , T2 are both linear transformations from V to W and their
associated matrices (w.r.t. the same bases) are A1 and A2 respectively, then T1 +T2 is also
a linear transformation from V to W and further that its matrix (w.r.t. the same bases)
is A1 + A2 . In other words, the matrix of a sum of two linear transformations is the sum
of their matrices. What is not so obvious is that the matrix of the composite of two linear
transformations is the matrix product of their matrices. Specifically, suppose in addition
to the notation above, we have a third vector space U and a linear transformation
S : U → V . (Such composites are often written more vividly as U --S--> V --T--> W .)
Now let (u1 , u2 , . . . , up ) be an ordered basis for U. Let B be the matrix of S w.r.t. the
basis (u1 , u2 , . . . , up ) for the domain U and the basis (v1 , v2 , . . . , vn ) for the codomain
V . The composite function T ∘ S : U → W is also a linear transformation and let C be
its matrix w.r.t. the basis (u1 , u2 , . . . , up ) for U and the basis (w1 , w2 , . . . , wm ) for W . Then
the matrix product AB is defined and we claim that it equals C. This can be done by
actually computing its entries. We begin by writing

[S(u1 ) S(u2 ) . . . S(up )]^t = B^t [v1 v2 . . . vn ]^t                  (4)
In essence this equation expresses each S(ui ) as a linear combination of the vs so by
linearity of T we get

[T (S(u1 )) T (S(u2 )) . . . T (S(up ))]^t = B^t [T (v1 ) T (v2 ) . . . T (vn )]^t                  (5)

Combining this with (2) we have

[T (S(u1 )) T (S(u2 )) . . . T (S(up ))]^t = B^t A^t [w1 w2 . . . wm ]^t                  (6)

But on the other hand, we also have



[T (S(u1 )) T (S(u2 )) . . . T (S(up ))]^t = C^t [w1 w2 . . . wm ]^t                  (7)

since C is the matrix of the composite transformation T ∘ S. So we must have B^t A^t = C^t .


Taking transposes, we get C = AB as was to be proved.
So far we showed how a matrix can be associated to a linear transformation. Earlier,
to an m × n matrix A we associated the linear transformation TA : IR^n → IR^m defined
by TA (v) = Av for v ∈ IR^n . It is easy to see that the matrix of TA w.r.t. the standard
bases of IR^n and IR^m (both ordered in the usual way) is precisely A, because for every
j = 1, 2, . . . , n, the image under TA of the j-th basis vector ej is the vector Aej , which is
simply the j-th column of A, viz. [a1j a2j . . . amj ]^t . This equals a1j e1 + a2j e2 + . . . + amj em
where (e1 , e2 , . . . , em ) is the standard ordered basis for IR^m .
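
Both facts in this section, that the matrix of TA w.r.t. the standard bases is A itself and that composition corresponds to matrix multiplication, can be confirmed with a few lines of numerical code. The sketch below is an editorial illustration; the matrices A and B are made up.

    # Illustrative check: matrices of T_A and of a composite, w.r.t. standard bases.
    import numpy as np

    A = np.array([[1., 1., 0.],
                  [2., 0., 1.]])          # induces T : IR^3 -> IR^2
    B = np.array([[1., 2.],
                  [0., 1.],
                  [3., 1.]])              # induces S : IR^2 -> IR^3

    T = lambda v: A @ v
    S = lambda u: B @ u

    # Matrix of T w.r.t. the standard bases: j-th column is T(e_j); it is A itself.
    print(np.allclose(np.column_stack([T(e) for e in np.eye(3)]), A))         # True

    # Matrix of the composite T o S : IR^2 -> IR^2 equals the product AB.
    print(np.allclose(np.column_stack([T(S(e)) for e in np.eye(2)]), A @ B))  # True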

Studying Transformations through Matrices


Because we can associate a matrix to a linear transformation, certain concepts about
linear transformations can be defined in terms of the corresponding concepts about
matrices. This approach is not necessary for concepts like the rank because as we saw
in the last note, this concept can be defined directly for a transformation (i.e. without
taking recourse to any matrix). Later on we shall show how the concept of an eigenvalue
can be defined either through matrices or directly. However, there are some concepts
such as the trace and the determinant which cannot be defined directly for a linear
transformation. They have to be defined by first associating some square matrix to the
transformation. Note that when the dimensions of V and W are equal, the matrix, say
A, associated with a transformation T : V → W is a square matrix. It is tempting to
define the trace of T as the trace of A (with a similar definition for the determinant of T ).
But there is a catch. The matrix associated with a linear transformation depends not
only on that transformation but also on which bases are chosen for the domain and the
codomain. When these bases change so does the associated matrix. So unless we make

some canonical choice of bases or unless we show that the concept we are associating is
independent of the choice of the bases, our definitions are ambiguous.
There is no satisfactory way out of this difficulty when V and W are different vector
spaces. But when W = V , the situation can be salvaged. So, suppose T : V → V is
a linear transformation where V is a vector space of dimension n (say). In such a case,
we can take two different bases for V , one for the domain and the other for the codomain,
and form the matrix, say A, of T w.r.t. these bases. This is rarely done. So, when we
talk about the matrix of a linear transformation from V to V w.r.t. some basis for V
it is tacitly assumed that the same ordered basis is used for both the domain and the
codomain. Of course, when this basis is changed, the same linear transformation may
have a different matrix. But these matrices are related in a particular way. To see what
it is, suppose (v1 , v2 , . . . , vn ) and (w1 , w2 , . . . , wn ) are two ordered bases for V and let
A and B be the matrices of T w.r.t. these bases respectively. Both are n n matrices
and we want to see how A and B are related to each other. We already know that

[T (v1 ) T (v2 ) . . . T (vn )]^t = A^t [v1 v2 . . . vn ]^t   and   [T (w1 ) T (w2 ) . . . T (wn )]^t = B^t [w1 w2 . . . wn ]^t                  (8)
We now introduce one more square matrix C called the change of basis matrix.
Each wi is a unique linear combination of the basis elements v1 , v2 , . . . , vn . The matrix
C is obtained by collecting the coefficients of the vs in these n linear combinations. We
can therefore write

[w1 w2 . . . wn ]^t = C [v1 v2 . . . vn ]^t                  (9)
In essence this matrix C expresses the new basis elements in terms of the old ones and
hence is called the change of basis matrix. It is clear that the change of basis matrix
which expresses the vs in terms of ws will be the inverse of C. So we have

[v1 v2 . . . vn ]^t = C^{-1} [w1 w2 . . . wn ]^t                  (10)
Consider v1 . It is a linear combination of the vectors w1 , w2 , . . . , wn with the coefficients
coming from the first row of the matrix C^{-1} . By linearity of T , T (v1 ) will be a linear

combination of the vectors T (w1 ), T (w2 ), . . . , T (wn ) with the same coefficients, viz. the
entries in the first row of C^{-1} . A similar argument applies to T (v2 ), T (v3 ), . . . , T (vn ).
Putting these expressions together, and further using (9), we get

[T (v1 ) T (v2 ) . . . T (vn )]^t = C^{-1} [T (w1 ) T (w2 ) . . . T (wn )]^t = C^{-1} B^t [w1 w2 . . . wn ]^t = C^{-1} B^t C [v1 v2 . . . vn ]^t                  (11)
The first part of (8) and (11) together imply

C^{-1} B^t C = A^t                  (12)

If the matrix C commuted with B^t , then we could simplify (12) as C^{-1} CB^t = A^t ,
which would mean that B^t = A^t and hence B = A. That would prove that no matter
which basis we choose to represent the transformation T , the matrix associated with it
is the same. This will then enable us to define, for example, the trace of T as the trace
of the (unique) matrix A associated with T .
Unfortunately, this may not always happen. The ordered bases (v1 , v2 , . . . , vn ) and
(w1 , w2 , . . . , wn ) are totally independent of the matrix A and so there is no reason to
expect that the change of basis matrix should commute with A. So we have to accept
(12) as the best we can do. Even then, it turns out that it does a good job because
it expresses an important relationship between the matrices A^t and B^t (which then
translates into a similar relationship between A and B). We digress a little to define
this relationship and study some of its consequences.

Similarity of Matrices
In Note 4, we commented that the word 'faithful', which is generally used socially,
is also used technically in mathematics. The same is true of the word 'similar'. Many
times we make a statement like 'the second assertion follows by a similar argument'.
This is the common, non-technical use of the word. But when we say that two triangles
are similar, we mean that their respective sides are proportional. This is a technical use
of the word.
For matrices, too, similarity is a technical concept. The formal definition is as follows.

Definition 1: Suppose A and B are square matrices of the same order, say n. Then B
is said to be similar to A if there exists some non-singular matrix C (of order n) such
that C^{-1} AC = B.

If A is the identity matrix In , then it commutes with every matrix C and so it follows
that the only matrix similar to In is In itself. The same is true of the zero matrix. But

in general, every matrix has many other matrices similar to it. For example the matrix
B = [4 3; 2 1] (writing a 2 × 2 matrix by its rows, separated by a semicolon) is similar
to the matrix A = [1 2; 3 4], as we see by taking C = [0 1; 1 0]. (Note that C is its own
inverse.) By changing C to [1 0; 0 2], we get that [1 1; 6 4] is also similar to A.
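
The example is easy to verify by machine; the sketch below is an editorial aside, not from the note. It checks the first similarity directly and shows how the second matrix arises from A by conjugating with the diagonal matrix.

    # Illustrative verification of the similarity example above.
    import numpy as np

    A = np.array([[1., 2.], [3., 4.]])
    B = np.array([[4., 3.], [2., 1.]])
    C = np.array([[0., 1.], [1., 0.]])                    # its own inverse

    print(np.allclose(np.linalg.inv(C) @ A @ C, B))       # True: B = C^{-1} A C

    # Conjugating A by diag(1, 2) (equivalently, taking D = diag(1, 1/2) in the
    # definition of similarity) produces the matrix [1 1; 6 4].
    C2 = np.diag([1., 2.])
    print(C2 @ A @ np.linalg.inv(C2))                     # [[1. 1.] [6. 4.]]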
A few elementary properties of the relation just defined are listed below.

Theorem 1: (Elementary Properties of Similarity)

(i) Similarity is an equivalence relation on the set Mn of all matrices of order n.

(ii) Similar matrices have the same rank, the same determinant and the same trace.

(iii) If two matrices are similar to each other, then so are their transposes.

Proof: For (i) we have to verify that similarity is a binary relation which is reflexive,
symmetric and transitive. Taking C = In , we get C^{-1} AC = In A In = A which shows
that A is similar to itself. Next, suppose B is similar to A. Then there is some C such
that B = C^{-1} AC. From this we get CBC^{-1} = CC^{-1} ACC^{-1} = A which shows that A
is similar to B because C can be written as (C^{-1})^{-1} . Finally, for transitivity, assume
B = C^{-1} AC and P = Q^{-1} BQ. Then P = Q^{-1} (C^{-1} AC)Q = (CQ)^{-1} A(CQ), which
shows that P is similar to A.
For (ii), assume B = C^{-1} AC. Since C and C^{-1} are both non-singular, by Part (ii)
of Exercise (5.12), B has the same rank as A. Also, from the multiplicative property of
determinants, it follows that det(B) = det(C^{-1}) det(A) det(C) = (1/det(C)) det(A) det(C).
Even if the matrices A and C^{-1} or the matrices A and C may not commute with
each other, their determinants are real (or complex) numbers and they always commute
with each other. This gives det(B) = det(A). Thus we have shown that ranks and
determinants are invariant under similarity. The invariance of the trace is a bit tricky
to prove and will be given as an exercise.
For (iii) assume B = C^{-1} AC. Taking transposes, B^t = C^t A^t (C^{-1})^t . But (C^{-1})^t
is the same as (C^t)^{-1} as one sees by directly multiplying (C^{-1})^t and C^t . So, if we let
P = (C^{-1})^t = (C^t)^{-1} we have B^t = P^{-1} A^t P which shows that B^t is similar to A^t .

Given two matrices A and B, in general there is no easy way to tell if they are similar
to each other. But Part (ii) of the Theorem above gives some necessary conditions.
Going back to (12), we now see that the matrices B^t and A^t are similar to each
other. By Part (iii) of the theorem above, we see that A and B are also similar to each

other. Put differently, for a linear transformation T : V → V , even though its
matrix w.r.t. a basis may change as the basis changes, its similarity class
will not change. Therefore properties of matrices which are invariant under similarity
can be defined unambiguously for such transformations. For example we can talk about
the determinant of a linear transformation as the determinant of its matrix w.r.t. any
basis. This is well defined because similar matrices have the same determinant. The
same holds for trace and, more generally, for eigenvalues as we shall show later. As with
ranks, we shall also give a direct definition of eigenvalues of linear transformations. It
will then be seen that the two definitions coincide. In fact, the direct definition has the
advantage that it is applicable even when V is infinite dimensional. Matrices can handle
only finite dimensional vector spaces.

Some Numerical Examples


As a numerical illustration of how a change of basis affects the matrix of a transformation,
suppose V is the four dimensional vector space of polynomials of degree 3 or
less (including the zero polynomial). The differential operator D defines a linear transformation
from V to itself by D(p(x)) = p'(x). Take the basis {1, x, x^2, x^3} for V . Then
the matrix, say A, of D w.r.t. this basis is

        [ 0  1  0  0 ]
    A = [ 0  0  2  0 ]
        [ 0  0  0  3 ]
        [ 0  0  0  0 ]
Suppose, however, that we take a different basis for V , viz. {P0 , P1 , P2 , P3 } where
P0 (x) = 1, P1 (x) = x, P2 (x) = (3/2)x^2 - 1/2 and P3 (x) = (5/2)x^3 - (3/2)x. (Recall from Exercise
(2.6)(d) that this basis is obtained from the earlier basis (1, x, x^2, x^3) by the process of
Gram-Schmidt orthogonalisation. But that has no role here.) By a calculation which
we leave as an exercise, the matrix of D w.r.t. this new basis comes out to be

        [ 0  1  0  1 ]
    B = [ 0  0  3  0 ]
        [ 0  0  0  5 ]
        [ 0  0  0  0 ]

It is not immediately clear that A and B are similar to each other.
(They do have the same rank, the same trace and the same determinant. But that is only a
necessary condition for similarity.)
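
This can be verified by machine. In the sketch below, an editorial illustration that is not part of the note, polynomials are stored as coefficient vectors w.r.t. the old basis (1, x, x^2, x^3); the matrix P whose columns express P0, P1, P2, P3 in the old basis is C^t in the notation of (9), and relation (12) then reads B = P^{-1}AP.

    # Illustrative check that the two matrices of D above are similar via a change of basis.
    import numpy as np

    A = np.array([[0., 1., 0., 0.],
                  [0., 0., 2., 0.],
                  [0., 0., 0., 3.],
                  [0., 0., 0., 0.]])             # D w.r.t. the basis (1, x, x^2, x^3)

    # Columns of P: P0, P1, P2, P3 written in the basis (1, x, x^2, x^3).
    P = np.array([[1., 0., -0.5,  0. ],
                  [0., 1.,  0. , -1.5],
                  [0., 0.,  1.5,  0. ],
                  [0., 0.,  0. ,  2.5]])

    B = np.linalg.inv(P) @ A @ P                 # D w.r.t. the basis (P0, P1, P2, P3)
    print(np.round(B, 10))                       # rows: (0 1 0 1), (0 0 3 0), (0 0 0 5), (0 0 0 0)

    # Similar matrices necessarily share rank, trace and determinant.
    print(np.linalg.matrix_rank(A) == np.linalg.matrix_rank(B),
          np.isclose(np.trace(A), np.trace(B)),
          np.isclose(np.linalg.det(A), np.linalg.det(B)))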
As another example, consider the transformation T : IR^2 → IR^2 defined by

T ([x y]^t) = [(16/25)x - (12/25)y,  -(12/25)x + (9/25)y]^t                  (13)

This linear transformation is induced by the 2 × 2 matrix A = [16/25 -12/25; -12/25 9/25]. So

the matrix of T w.r.t. the usual ordered basis (e1 , e2 ) is simply A. Suppose, however,
that we take a different ordered basis (v1 , v2 ) where v1 = [4/5 -3/5]^t and v2 = [3/5 4/5]^t .
Let us find the matrix, say B, of T w.r.t. this ordered basis. By a direct calculation, we
have

T (v1 ) = [16/25 -12/25; -12/25 9/25] [4/5 -3/5]^t = [4/5 -3/5]^t = v1 = 1v1 + 0v2                  (14)

and T (v2 ) = [16/25 -12/25; -12/25 9/25] [3/5 4/5]^t = [0 0]^t = 0 = 0v1 + 0v2                  (15)
Hence B, the matrix of T w.r.t. the ordered basis (v1 , v2 ) is simply

B = [1 0; 0 0]                  (16)
which is considerably simpler than the original matrix A. In fact, it is a diagonal matrix.
And, diagonal matrices are the next simplest matrices to matrices which are multiples
of the identity matrix.
We already know that A and B are similar to each other. An explicit matrix P
for which B will equal P^{-1} AP is easy to obtain. It is the matrix whose columns are the
vectors v1 , v2 . That is, P = [4/5 3/5; -3/5 4/5]. We leave it as an exercise to find P^{-1} and
verify that P^{-1} AP indeed equals the diagonal matrix B.
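
The verification suggested here takes only a few lines; the sketch below is an editorial aside, not part of the note.

    # Illustrative check of the projection example: T(v1) = v1, T(v2) = 0 and P^{-1}AP = B.
    import numpy as np

    A  = np.array([[16/25, -12/25],
                   [-12/25,  9/25]])
    v1 = np.array([4/5, -3/5])
    v2 = np.array([3/5,  4/5])

    print(np.allclose(A @ v1, v1), np.allclose(A @ v2, 0))     # True True

    P = np.column_stack([v1, v2])
    print(np.round(np.linalg.inv(P) @ A @ P, 12))              # [[1. 0.] [0. 0.]]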

Eigenvectors, the Magic Vectors


The example above shows that by resorting to a suitable basis, we may be able to
drastically simplify the matrix of a given linear transformation. The crucial question,
of course, is how to find this suitable basis. We shall give the general answer shortly.
But in the present example, let us see what role these magic vectors v1 and v2 have in
relation to the transformation T . We can get the answer by looking at the geometric
interpretation of the transformation T . If we take θ = -tan^{-1}(3/4) in Exercise (5.6)(b),
then T is simply the orthogonal projection of the plane IR^2 onto the line L making angle
θ with the positive x-axis. (See the figure below.)
The vectors v1 , v2 are unit vectors along and perpendicular to L. So it is obvious
that T (v1 ) = v1 and T (v2 ) = 0. Had we taken 2A instead of A, then T would have
been the orthogonal projection onto L followed by a dilation (i.e. a radial stretching)
by a factor of 2 and its matrix w.r.t. (v1 , v2 ) would have been 2B. More generally if we
replace A by λA for any real number λ, then we would have T (v1 ) = λv1 , T (v2 ) would
still be 0, and the matrix of T w.r.t. the ordered basis (v1 , v2 ) would be [λ 0; 0 0].

[Figure: the line L through the origin along v1 , the unit vector v2 perpendicular to L, a point (x, y), and its orthogonal projection T (x, y) on L.]

Summing up, the magical feature of the vectors v1 , v2 was that their images under
the transformation were their own scalar multiples. Normally, if T : V → V is a linear
transformation and v is some element of V , there is no reason why T (v) should have the
same (or the opposite) direction as v. Vectors for which this happens are very special
and are given a special name.

Definition 2: Let T : V → V be a linear transformation. Then a non-zero vector
v ∈ V is said to be an eigenvector of T if there exists some real (or complex) scalar λ
(which could be 0), such that

T (v) = λv                  (17)

This scalar λ (which is unique since v is a non-zero vector) is called an eigenvalue of T
corresponding to the eigenvector v.
(We remark that it is more standard to first define an eigenvalue of T as any number
λ for which (17) holds for some non-zero v and then call any such v an eigenvector
corresponding to the eigenvalue. Our approach is slightly better because while the eigen-
value corresponding to a given eigenvector is unique, many different vectors can be the
eigenvectors for the same eigenvalue. In particular if v is an eigenvector corresponding
to an eigenvalue λ, so is αv for any scalar α ≠ 0. These extra eigenvectors, of course, tell us
nothing new. Sometimes, as a normalisation measure, eigenvectors are assumed to be
unit vectors. But that is not very crucial.)
For the identity transformation, every non-zero vector is an eigenvector and the
corresponding eigenvalue is the same, viz. 1, for every one of them. The same holds
for the zero transformation except that the common eigenvalue is 0. These are extreme
examples. In between we have all kinds of variations. In the example given above,
the orthogonal projection T onto the line L had two eigenvectors v1 (along L) and
v2 (perpendicular to L). The corresponding eigenvalues were 1 and 0 respectively. A
rotation of the plane around the origin through an angle θ (see Exercise (5.3)) has no

eigenvectors (except when θ is a multiple of 2π, in which case the rotation is just the
identity transformation). As in the case of an orthogonal projection, a reflection of
the plane in a line L through the origin has two eigenvectors, v1 (along L) and v2
(perpendicular to L). Note, however, that the eigenvalue corresponding to v2 is -1 and
not 0. So, the matrix of this reflection w.r.t. the basis (v1 , v2 ) will be [1 0; 0 -1].
Note that the vector space appearing in Definition 2 does not have to be finite
dimensional. For example, let V be the vector space of all infinitely differentiable real
valued functions on IR (i.e. functions f : IR → IR which have derivatives of all orders).
We already saw that the differential operator D : V → V (defined by D(f (x)) = f '(x))
is a linear transformation. Since D(e^{λx}) = λe^{λx} for every real number λ, every such
function is an eigenvector of D and the corresponding eigenvalue is λ. Since here the
elements of the vector space V are functions, it is more customary to call the eigenvectors
eigenfunctions. We leave it as an exercise to find the eigenfunctions of the operator
D^2 and the corresponding eigenvalues.
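
The eigenfunction property of e^{λx} can also be confirmed symbolically; the snippet below is an editorial aside using sympy, not part of the note.

    # Symbolic check (editorial aside) that exp(lam*x) is an eigenfunction of D.
    import sympy as sp

    x, lam = sp.symbols('x lam')
    f = sp.exp(lam * x)
    print(sp.simplify(sp.diff(f, x) - lam * f))    # 0, i.e. D(f) = lam * f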

Eigenvector Bases for Transformations


For a linear transformation T : V → V where V is finite dimensional, the work we
did above for the orthogonal projection of IR^2 onto L can be generalised. Nothing new
is really involved. But we record it as a theorem.

Theorem 2: Suppose T : V → V is a linear transformation where V is a finite dimen-
sional vector space of dimension n. Suppose V has an eigenvector basis, i.e. a basis,
say {v1 , v2 , . . . , vn } where each vi is an eigenvector of T . Then the matrix of T w.r.t.
this basis (ordered in any manner we like) is a diagonal matrix. Conversely, if there is
some basis for V w.r.t. which the matrix of T is a diagonal matrix, then all elements of
this basis are eigenvectors of T .

Proof: Assume first that {v1 , v2 , . . . , vn } is an eigenvector basis for T . For i =
1, 2, . . . , n, let λi be the eigenvalue corresponding to vi . (Note that it may happen
that for some i ≠ j, λi = λj . In fact, for T = 1V , all λ's equal 1.) Then,

T (vi ) = λi vi = 0v1 + 0v2 + . . . + 0v_{i-1} + λi vi + 0v_{i+1} + . . . + 0vn                  (18)
for i = 1, 2, . . . , n. Put together,

[T (v1 ) T (v2 ) . . . T (vn )]^t = [λ1 0 . . . 0; 0 λ2 . . . 0; . . . ; 0 0 . . . λn] [v1 v2 . . . vn ]^t = D [v1 v2 . . . vn ]^t                  (19)

where D = D(λ1 , λ2 , . . . , λn ) is the diagonal matrix whose (i, i)-th entry is λi for i =
1, 2, . . . , n. Since D is its own transpose, it follows from (19) that it is the matrix of T
w.r.t. the ordered basis (v1 , v2 , . . . , vn ).
Conversely, suppose {v1 , v2 , . . . , vn } is a basis w.r.t. which the matrix of T is a
diagonal matrix D. Let λi be the (i, i)-th entry of D for i = 1, 2, . . . , n. Then (19) holds
with D replaced by D^t . But D^t is the same as D. So, (19) holds as it is. But then (18)
also holds for every i = 1, 2, . . . , n. And that means each vi is an eigenvector of T (with
λi as the corresponding eigenvalue).

How to Find Eigenvectors?


Theorem 2 will be of some use only if we have some method for finding an eigenvector
basis for a given transformation T : V → V . For this there is no easy answer. Forget
about a whole eigenvector basis. How do we find even one eigenvector of T ? This cannot
be done simply by trial and error. That is, we cannot just go on trying all possible non-
zero vectors v V one by one, calculate T (v) and see if it is a multiple of v. Even if
V has a finite basis, there are infinitely many possible linear combinations of the basis
elements and so this process may never end. And, as we saw above in the case of a
rotation of the plane, even if we have infinite patience we may not get an eigenvector,
for the simple reason that for a rotation no eigenvector exists! (Such trial and error
method of looking for a desired element from a set S of possible choices is called an
exhaustive search. Its termination can be guaranteed only if the set S is finite and
even then it is not considered a very good method. It is resorted to only when there is
no alternative.)
In all the examples given above where we could identify some eigenvectors, we could
do so only because of some special knowledge of the transformation T . For example, we
were able to find the eigenvectors for the transformation T in (13) by recognising it as
an orthogonal projection. Without such a knowledge, if we were given (13) merely as it
is, what would we do?
This is where matrices come to help. If the vector space V happens to be IR^n (or C^n)
and the transformation T : IR^n → IR^n (or T : C^n → C^n) happens to be induced by
some n × n real (or complex) matrix A, then there is a systematic procedure for finding
eigenvectors of T = TA . Before describing it, we first define the concept of an eigenvalue
for a (square) matrix A.

Definition 3: A (real or complex) number λ is called an eigenvalue of an n × n matrix
A if there exists some non-zero v ∈ IR^n (or v ∈ C^n) such that

Av = λv                  (20)

Any such v is called an eigenvector corresponding to the eigenvalue λ.

Note the striking resemblance between (17) and (20). But there are two subtle
differences. First, in Definition 2 we defined eigenvectors first and then the eigenvalues
corresponding to them. Here we are turning the tables around. More importantly,
Definition 2 is applicable even when V is infinite dimensional while in Definition 3, the
vector space involved is finite dimensional (and further restricted to a euclidean space).
Despite these differences, we have the following expected relationship.

Theorem 3: Suppose A is an n × n matrix and TA is the linear transformation induced
by A. Then v is an eigenvector of TA if and only if it is an eigenvector of the matrix A.
When this happens, the corresponding eigenvalues are equal. Conversely, if T : V → V
is a linear transformation where V is finite dimensional and A is the matrix of T w.r.t.
some basis of V , then the eigenvalues of T are the same as those of the matrix A.

Proof: For the first part, recall that TA is defined by TA (v) = Av. So, it is clear that
(17) holds (with T replaced by TA ) if and only if (20) holds with the same value of λ.
For the converse, suppose A is the matrix of T w.r.t. the ordered basis (v1 , v2 , . . . , vn ).
Suppose v = c1 v1 + c2 v2 + . . . + cn vn ∈ V and T (v) = b1 v1 + b2 v2 + . . . + bn vn . Then by
Equation (3) above when we represented linear transformations by matrices, we have

b = Ac (21)

where b, c are the column vectors [b1 b2 . . . bn ]^t and [c1 c2 . . . cn ]^t respectively. There-
fore, the equality T (v) = λv translates into Ac = λc. Note also that v ≠ 0 if and only
if c ≠ 0. So, λ is an eigenvalue of T if and only if it is an eigenvalue of the matrix A
associated to T . We see further that the eigenvector v corresponding to λ corresponds
to the eigenvector c of A.

This theorem converts the problem of finding the eigenvalues and eigenvectors of
T : V → V (when V is finite dimensional) to the problem of finding the eigenvalues
and eigenvectors of the matrix A associated to T . Incidentally, this also shows that it
does not matter which basis is used for the conversion. If we take a different basis for
V , we may get a different matrix, say B. But its eigenvalues will be the same because
they coincide with those of the transformation T (which can be defined directly without
involving any basis). It is like this. The year of birth as well as the year of death of a
person can change if you change the calendar. But the duration of his life span will be
independent of which calendar is used. A mathematical analogy would be that even if
the coordinates of the three vertices of a triangle may change if we change the coordinate
frame, its area will not change because it is an intrinsic geometric attribute.

How to Find Eigenvalues of a Matrix?
The conversion achieved by Theorem 3 will be of some use only if we have some easy
way of finding the eigenvalues and corresponding eigenvectors of matrices. Fortunately,
this turns out to be the case. In fact, this is one respect in which the matrices are
superior. (Another one is the calculation of the rank of a linear transformation.)
We begin by rewriting (20) as

(A - λIn )v = 0                  (22)

where In is the identity matrix of order n. This is a homogeneous system of n linear


equations in n unknowns. To say that it has a non-trivial solution v is equivalent
to saying that the coefficient matrix A - λIn is singular. Since singular matrices are
precisely those whose determinants vanish, we have the following simple but important
result which is the starting point for the search of an eigenvector basis.

Theorem 4: Let A = (aij ) be an n × n matrix. Then λ is an eigenvalue of A if and
only if |A - λIn | = 0, or, in a fuller form, p(λ) = 0 where

           | a11 - λ    a12        a13      . . .   a1n      |
           | a21        a22 - λ    a23      . . .   a2n      |
    p(λ) = |  ..         ..         ..               ..      |                  (23)
           | an1        an2        an3      . . .   ann - λ  |

Clearly, p(λ) is a polynomial in λ of degree n with leading coefficient (-1)^n . It is called the
characteristic polynomial of A, its roots the characteristic roots and the equation
p(λ) = 0 the characteristic equation of A. The theorem above says that the char-
acteristic roots of A are the same as its eigenvalues. But conceptually, they are very
different things. The characteristic roots are the roots of a certain polynomial associated
with the matrix while an eigenvalue is a special scalar for which something special (viz.
an eigenvector, i.e. a vector whose direction remains unchanged) exists.
Note that if A is a real matrix, then p(λ) is a real polynomial. Still it may have
complex roots. We can consider them as eigenvalues provided we allow complex eigen-
vectors. This is what happens for rotations as will be pointed out in the exercises. A
few interesting results about eigenvalues will also be given as exercises.
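
Numerically, the characteristic polynomial and its roots are readily available. The sketch below is an editorial illustration with a made-up matrix; it uses numpy, whose np.poly returns the coefficients of det(λI - A), which differs from p(λ) above only by the factor (-1)^n.

    # Illustrative computation of a characteristic polynomial and eigenvalues (made-up matrix).
    import numpy as np

    A = np.array([[2., 1.],
                  [1., 2.]])

    coeffs = np.poly(A)              # coefficients of det(lambda*I - A) = (-1)^n p(lambda)
    print(coeffs)                    # [ 1. -4.  3.]  i.e. lambda^2 - 4*lambda + 3
    print(np.roots(coeffs))          # [3. 1.]
    print(np.linalg.eigvals(A))      # the same two eigenvalues, computed directly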
Now that we know how to find eigenvalues, let us go back to the orthogonal projection
T defined by (13) above. The matrix of T w.r.t. the standard basis is A = [16/25 -12/25; -12/25 9/25].
Hence

    p(λ) = | 16/25 - λ    -12/25     |
           | -12/25        9/25 - λ  |

Upon expansion, this becomes λ^2 - λ. So the eigenvalues are 1 and 0. To find the eigenvector corresponding


to λ = 1, we have to solve the system (A - I2 )[x1 x2 ]^t = [0 0]^t , i.e. the system

-(9/25)x1 - (12/25)x2 = 0   and   -(12/25)x1 - (16/25)x2 = 0                  (24)

Both the equations are the same and there are infinitely many solutions. This is to
be expected because any (non-zero) multiple of an eigenvector is also an eigenvector,
corresponding to the same eigenvalue. One possible solution is x1 = 4, x2 = -3. So an
eigenvector corresponding to the eigenvalue 1 is [4 -3]^t . By a similar calculation, which
we skip, an eigenvector corresponding to the other eigenvalue, viz. 0, is [3 4]^t . These
are multiples of the unit vectors v1 , v2 we obtained earlier. But that time we had to
rely on the geometric interpretation of the transformation T . This time we found the
answer in a purely self-contained manner.
It would thus appear that in order to find an eigenvector basis for any linear trans-
formation T : V → V where V is n-dimensional, we should first associate an n × n
matrix A to T , then find the characteristic equation of A, find all characteristic roots
(of which there would be n in all, counting multiplicities) and then, for each eigenvalue λ,
find a corresponding eigenvector v by solving the system Av = λv. This way we should
get n eigenvectors and they would form a desired eigenvector basis. The matrix of T
w.r.t. this basis would be a diagonal matrix by Theorem 2.
While this procedure is basically correct, it turns out that it does not always work.
Just exactly what the difficulties are and under what conditions the procedure will work
will be taken up in the next note, called diagonalisation.
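
For a matrix that does admit an eigenvector basis, the whole procedure is a few lines of code. The sketch below is an editorial illustration that runs it on the projection matrix of (13); it silently assumes such a basis exists, which is exactly what the next note examines.

    # Illustrative run of the procedure on the projection matrix of (13).
    import numpy as np

    A = np.array([[16/25, -12/25],
                  [-12/25,  9/25]])

    eigenvalues, V = np.linalg.eig(A)      # columns of V are eigenvectors
    print(eigenvalues)                     # the eigenvalues 1 and 0 (in some order)

    D = np.linalg.inv(V) @ A @ V           # matrix of T_A w.r.t. the eigenvector basis
    print(np.round(D, 12))                 # diagonal, with the eigenvalues on the diagonal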

Exercises

(6.1) Suppose A is the matrix of a linear transformation w.r.t. some ordered bases for
the domain and the codomain. If we permute the elements of these bases, how
does it affect A?

(6.2) For any two square matrices P and Q of the same order, prove that P Q and QP
have the same trace. Hence show that the trace is a similarity invariant. [Hint:
Given B = C^{-1} AC, choose P and Q cleverly.]

(6.3) Suppose T : V → W has matrix A w.r.t. the ordered bases (v1 , v2 , . . . , vn )
and (w1 , w2 , . . . , wm ) for V and W respectively, and also that it has matrix B
w.r.t. some other ordered bases (v1', v2', . . . , vn') and (w1', w2', . . . , wm') for V and
W respectively. Prove that there are invertible matrices P and Q of sizes n and
m respectively such that B = QAP .

(6.4) Suppose a linear transformation T : V → V has matrix A w.r.t. some ordered


basis for V . Let B be any matrix which is similar to A. Prove that there is an
ordered basis for V w.r.t. which the matrix of T is B.

(6.5) Verify that the matrix of the differential operator w.r.t. the alternate basis given
for the vector space of all polynomials of degree 3 or less, indeed comes out as
claimed.

(6.6) For the differential operator D^2 defined on the vector space of all infinitely differ-
entiable functions from IR to IR, prove that every real number is an eigenvalue.
Identify two linearly independent eigenvectors corresponding to it. (This is more
a question about solving differential equations.)

(6.7) Let A = [cos θ  -sin θ; sin θ  cos θ] be the matrix of the counterclockwise rotation through
an angle θ. (See Exercise (5.3).) Prove that the eigenvalues of A are e^{iθ} and e^{-iθ} .
Find the corresponding (complex) eigenvectors.

(6.8) Prove that similar matrices have the same eigenvalues, but that the corresponding
eigenvectors may be different.

(6.9) Prove that the determinant and the trace of an n × n matrix equal, respectively, the
product and the sum of its n eigenvalues (some of which may be complex and some
may be repeated). [Hint: Using the fundamental theorem of algebra factorise the
characteristic polynomial completely as (-1)^n (λ - λ1 )(λ - λ2 ) . . . (λ - λn ). Consider
the coefficients of λ^0 and λ^{n-1} . This and the last exercise give alternate proofs
that the trace and the determinant are similarity invariant.]

(6.10) Suppose A is a square matrix and λ is an eigenvalue of A.

(a) Prove that λ^2 is an eigenvalue of A^2 , λ^3 is an eigenvalue of A^3 and more
generally, for any polynomial f (x), f (λ) is an eigenvalue of the matrix f (A).
[Hint: Factorise f (A) - f (λ).]
(b) With f (x) as in (a), prove that the only eigenvalues of f (A) are of the form
f (λ) for some eigenvalue λ of A. [Hint: If μ is an eigenvalue of f (A), factorise
f (x) - μ completely using the fundamental theorem of algebra.]

(6.11) Prove that 0 is the only eigenvalue of a nilpotent matrix and the only possible
eigenvalues of an idempotent transformation are 0 and 1.


(6.12) Show that the matrix [0 1; 0 0] has 0 as its only eigenvalue but no eigenvector
basis. (This is the prototype which illustrates what can go wrong in finding an
eigenvector basis.)

(6.13) Let T : IR^3 → IR^2 be TA where A = [1 1 0; 2 0 1]. Compute the matrix of T
w.r.t.

(i) the standard bases (e1 , e2 , e3 ) for IR^3 and (e1 , e2 ) for IR^2
(ii) the ordered basis (e2 , e1 + e3 , e3 ) for IR^3 and the standard basis for IR^2
(iii) the standard basis for IR^3 and the ordered basis (e1 - e2 , 2e1 + e2 ) for IR^2
(iv) the ordered basis (e2 , e1 + e3 , e3 ) for IR^3 and the ordered basis (e1 - e2 , 2e1 + e2 )
for IR^2 .

(6.14) Suppose (v1 , v2 , . . . , vn ) is an ordered basis for a vector space V and T : V → V


is a linear transformation with the property that for every i = 1, 2, 3, . . . , n, T (vi )
lies in the span of {v1 , . . . , vi }. What can be said about the matrix of T w.r.t.
this basis?

(6.15) In the last exercise, suppose that T (vi ) lies in the span of {v1 , v2 , . . . , v_{i-1} } for
i = 1, 2, . . . , n (for i = 1 this means T (v1 ) = 0). Prove that T is nilpotent. Use this
to give an easier proof that a square matrix in which all the entries on and below
the diagonal vanish is nilpotent.

(6.16) (Optional) Let q(x) = a0 + a1 x + a2 x^2 + . . . + a_{n-1} x^{n-1} + x^n be a monic polynomial and
let A be its companion matrix as defined in Exercise (1.14). Prove that the char-
acteristic polynomial of A is (-1)^n q(λ). (In other words, every monic polynomial
can be expressed as the characteristic polynomial of some matrix. The theoretical
significance of this result is that the general problem of finding the eigenvalues of a
matrix cannot be easier than the general problem of solving polynomial equations.
Of course, for a particular matrix, there may be some short cuts, as there indeed
are for the matrices that represent an orthogonal projection or a reflection.)

