
Name: _________________________________

EE/ME 701 Advanced Linear Systems


Lecture Notes

Prof. Brian Armstrong


September 11, 2012

EE/ME 701: Advanced Linear Systems

Linear Algebra

State Space Modeling, Basics of Linear Algebra

EE/ME 701: Advanced Linear Systems

Contents

1 Elements of Linear Algebra . . . 3
    1.1 Scalars, Matrices and Vectors . . . 3
    1.2 Transpose . . . 3
    1.3 Basic arithmetic: +, - and * are well defined . . . 4
        1.3.1 Multiplication . . . 5
    1.4 Commutative, Associative, Distributive and Identity Properties . . . 6
        1.4.1 Commutative property . . . 6
        1.4.2 Associative property . . . 6
        1.4.3 Distributive property . . . 7
        1.4.4 Identity Matrix . . . 7
        1.4.5 Doing algebra with vectors and matrices . . . 8
    1.5 Linear Independence of Vectors . . . 9
    1.6 Determinant . . . 10
        1.6.1 Computing the determinant . . . 11
        1.6.2 Some relations involving determinants . . . 12
        1.6.3 The determinant of triangular and diagonal matrices . . . 12
    1.7 Rank . . . 13
    1.8 The norm of a vector . . . 14
        1.8.1 Example norms . . . 15
    1.9 Basic properties of the singular value decomposition (svd) . . . 16
        1.9.1 SVD, condition number and rank . . . 17
    1.10 The condition number of a matrix . . . 17
        1.10.1 Condition number and error . . . 18
        1.10.2 Example of a badly conditioned matrix . . . 20
        1.10.3 How condition number is determined . . . 21
    1.11 Eigenvalues and eigenvectors . . . 22
        1.11.1 Some properties of Eigenvectors and Eigenvalues . . . 24
        1.11.2 Additional notes on Eigenvectors and Eigenvalues . . . 28
        1.11.3 One final fact about the eigensystem: V diagonalizes A . . . 32

2 Two equations of interest . . . 35
    2.1 Algebraic Equations, y = A b . . . 36
        2.1.1 Case 1: Where we have n = p independent equations, the exactly constrained case . . . 38
        2.1.2 Case 2: Where n > p, we have more equations than unknowns, the over constrained case (Left pseudo-inverse case) . . . 39
        2.1.3 Case 3: Where n < p, we have fewer equations than unknowns, the under constrained case (Right pseudo-inverse case) . . . 41
    2.2 Differential Equations . . . 43

3 Summary . . . 45

Lecture: Basics of Linear Algebra    (Revised: Sep 06, 2012)    Page 1

1 Elements of Linear Algebra

1.1 Scalars, Matrices and Vectors

- Matrices are rectangular sets of numbers or functions. Matrices have zero or more rows and columns. Examples:

      x(t) = [v(t); w(t); z(t)] ∈ R^3,    A = [1 0 7; 4 5 0; 2 3 6] ∈ R^(3x3)

- Vectors are special cases of matrices, with only one row or column:

      x = [2; 3] is a column vector,    w = [3 4] ∈ R^(1x2) is a row vector

- Scalar values (numbers, or functions with one output variable) can also be treated as matrices or vectors:

      3 = [3] ∈ R^(1x1)

- Array is a synonym for Matrix.

1.2 Transpose

Transposing an array rearranges each column to a row:

      C = [3 1; 4 1; 5 2],    C^T = [3 4 5; 1 1 2],    C ∈ R^(3x2), C^T ∈ R^(2x3)

1.3 Basic arithmetic: +, - and * are well defined

Define:

      A = [1 2; 3 4],  B = [-1 1; 2 3],  x = [2; 3],  w = [3 4],  C = [3 1; 4 1; 5 2]

- Operations +, - and * are well defined. The dimensions of the operands must be compatible.

- For addition and subtraction, the operation is element-wise, and the operands must be the same size:

      A + B = [0 3; 5 7],    A - B = [2 1; 1 1]

- If the operands are not the same size, there is no defined result (the operation is impossible):

      A + C = undefined
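The element-wise rules above are easy to check in code. A minimal sketch in Python (not part of the original notes; helper names `mat_add` and `transpose` are mine), with matrices stored as lists of rows:

```python
def mat_add(X, Y):
    """Element-wise sum; only defined when X and Y are the same size."""
    if len(X) != len(Y) or len(X[0]) != len(Y[0]):
        raise ValueError("operands must be the same size")
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def transpose(X):
    """Rearrange each column of X into a row."""
    return [list(col) for col in zip(*X)]

A = [[1, 2], [3, 4]]
B = [[-1, 1], [2, 3]]
C = [[3, 1], [4, 1], [5, 2]]

print(mat_add(A, B))   # [[0, 3], [5, 7]], matching A + B in the text
print(transpose(C))    # [[3, 4, 5], [1, 1, 2]], i.e. C^T
```

Attempting `mat_add(A, C)` raises an error, mirroring "A + C = undefined".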

1.3.1 Multiplication

- For multiplication, the operation is by row on the left and column on the right. To produce one element of the result, go across each row on the left and multiply with elements of the column on the right:

      A B = [ (1*(-1) + 2*2)  (1*1 + 2*3)
              (3*(-1) + 4*2)  (3*1 + 4*3) ] = [3 7; 5 15]

- A and B don't have to be the same size to multiply:

      A x = [1 2; 3 4] [2; 3] = [ (1*2 + 2*3); (3*2 + 4*3) ] = [8; 18]

- For A B = C, given A ∈ R^(n x m) and B ∈ R^(j x k):

  - The number of columns in A must match the number of rows in B, that is, m = j.
  - To multiply A and B, we must have m = j; then C ∈ R^(n x k).
  - The size of the result matrix is determined by the number of rows in A and columns in B.

  Examples:

      A C = [1 2; 3 4] [3 1; 4 1; 5 2] = undefined    (m = 2 does not match j = 3)

      C A = [3 1; 4 1; 5 2] [1 2; 3 4] = [6 10; 7 12; 11 18]    (n = 3, m = 2, j = 2, k = 2)

1.4 Commutative, Associative, Distributive and Identity Properties

1.4.1 Commutative property

- Like scalar algebra, addition commutes: A + B = B + A

- Like scalar algebra, subtraction commutes with a minus sign: A - B = -(B - A)

- Unlike scalar algebra, multiplication does not generally commute: A B ≠ B A

      A B = [3 7; 5 15],    B A = [2 2; 11 16]

- Generally: there are many properties of linear algebra which may or may not be true for some special cases. "Does not generally commute" means that perhaps special matrices can be found that do commute, but that not all matrices commute.

1.4.2 Associative property

Like scalar algebra, +, - and * have the associative property:

      (A + B) + C = A + (B + C)
      (A + B) - C = A + (B - C)
      (A B) C = A (B C)
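The row-by-column rule can be sketched directly in code. A minimal Python version (mine, not from the notes), reproducing the products computed above:

```python
def mat_mul(X, Y):
    """Row-by-column product; requires cols(X) == rows(Y), i.e. m = j."""
    n, m = len(X), len(X[0])
    j, k = len(Y), len(Y[0])
    if m != j:
        raise ValueError("inner dimensions must agree (m = j)")
    return [[sum(X[r][t] * Y[t][c] for t in range(m)) for c in range(k)]
            for r in range(n)]

A = [[1, 2], [3, 4]]
B = [[-1, 1], [2, 3]]
C = [[3, 1], [4, 1], [5, 2]]

print(mat_mul(A, B))   # [[3, 7], [5, 15]]
print(mat_mul(B, A))   # [[2, 2], [11, 16]]  -- A B != B A
print(mat_mul(C, A))   # [[6, 10], [7, 12], [11, 18]], a 3x2 result
```

`mat_mul(A, C)` raises an error, since m = 2 does not match j = 3.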

1.4.3 Distributive property

Like scalar algebra, * distributes over + and - :

      (A + B) C = A C + B C
      C (A - B) = C A - C B

Note the left-to-right order of multiplications in the second example.

1.4.4 Identity Matrix

Like scalar algebra, linear algebra has a multiplicative identity:

      I_L C = C I_R = C

Examples, with C ∈ R^(3x2), so that I_L is 3x3 and I_R is 2x2:

      [1 0 0; 0 1 0; 0 0 1] [3 1; 4 1; 5 2] = [3 1; 4 1; 5 2]

      [3 1; 4 1; 5 2] [1 0; 0 1] = [3 1; 4 1; 5 2]

If A is square, then I_L = I_R, and we just call it I, the identity matrix:

      [1 0; 0 1] [1 2; 3 4] = [1 2; 3 4] [1 0; 0 1] = [1 2; 3 4]

1.4.5 Doing algebra with vectors and matrices

Starting with an equation, we can add, subtract or multiply on the left or right by any allowed term and get a new equation. Examples, given:

      A + B = C

Then

      (A + B) + D = C + D
      E (A + B) = E A + E B = E C
      (A + B) F = C F

where A, B, C, D, E and F are compatible sizes.

- Matrices that are the appropriate size for an operation are called commensurate.

1.5 Linear Independence of Vectors

Linear Dependence: A set of p n-dimensional vectors

      { v1, v2, ..., vp },  vi ∈ R^n

is linearly dependent if there exists a set of scalars {ai}, i = 1...p, not all of which are zero, such that:

      a1 v1 + a2 v2 + ... + ap vp = Σ_{i=1}^p ai vi = 0    (1)

where 0 is the n-vector of all zeros.

Linear Independence: The set of vectors is said to be linearly independent if there is no set of values {ai} which satisfies the conditions for Linear Dependence. In other words,

      Σ_{i=1}^p ai vi = 0    (2)

implies that the {ai} are all zero. Written another way, vectors {vi} are linearly independent if and only if (iff)

      Σ_{i=1}^p ai vi = 0  ⇒  ai = 0 ∀ i    (3)

or, written in the form of Eqn (1):

      [ v1 v2 ... vp ] [a1; a2; ...; ap] = 0   iff   [a1; a2; ...; ap] = 0    (4)

1.6 Determinant

The determinant is a scalar measure of the size of a square matrix:

      det(A) = |A| ∈ R^1    (5)

- The determinant is not defined for a non-square matrix.

- The determinant of a matrix will be non-zero if and only if the rows (or, equivalently, columns) of the matrix are linearly independent. Examples:

      (a)  det [1 2 3; 4 5 6; 10 14 18] = 0        (b)  det [1 2 3; 4 5 6; 7 8 -9] = 54    (6)

- In case (a), the third column is given by 2 x Col2 - 1 x Col1. Notice also that the third row is given by 2 x Row1 + 2 x Row2.

- In case (b), the three columns are independent.

- Always, for a square matrix: if the columns are dependent, the rows will be dependent.

1.6.1 Computing the determinant

The determinant of a square matrix (the only kind !) is defined by Laplace's expansion (following Franklin et al.):

      det A = Σ_{j=1}^n aij γij    for any i = 1, 2, ..., n    (7)

where aij is the element from the ith row and jth column of A, and γij is called the cofactor, given by:

      γij = (-1)^(i+j) det Mij    (8)

where Mij is called a minor. Mij is the same matrix as A, with the ith row and jth column removed.

For example, with i = 1:

      det [a b c; d e f; g h i] = a det[e f; h i] - b det[d f; g i] + c det[d e; g h]

Closed-form expressions for 1x1, 2x2 and 3x3 matrices are sometimes handy. They are:

      det [a] = a

      det [a b; c d] = ad - bc

      det [a b c; d e f; g h i] = aei - ahf - bdi + bgf + cdh - cge

1.6.2 Some relations involving determinants

1. det(I) = 1, where I is the identity matrix.

2. det(A B) = det(A) det(B)

3. Invertibility of a matrix: given M ∈ R^(n x n), M is invertible iff det(M) ≠ 0.

4. Given M, an invertible matrix:

   (a) det(M^-1) = 1 / det(M)

   (b) Similarity relation:

       det( M A M^-1 ) = det(M) det(A) (1 / det(M)) = det(A)

1.6.3 The determinant of triangular and diagonal matrices

1. If a matrix has the upper-triangular form Au or the lower-triangular form Al, with diagonal elements d1, d2, ..., dn, then

      det(Au) = Π_{k=1}^n dk,   and   det(Al) = Π_{k=1}^n dk

   Example:

      >> A = [ 1 2 3 ; 0 4 5 ; 0 0 6]
      A = 1   2   3
          0   4   5
          0   0   6
      >> det(A)
      ans = 24

2. A diagonal matrix is a special case of 1).
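Laplace's expansion (7)-(8) translates directly into a short recursive function. A Python sketch (mine, not from the notes), expanding along the first row and checked against case (a) and the triangular example above:

```python
def det(M):
    """Determinant by Laplace expansion along the first row (Eqn (7), i = 1)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # minor M_{1j}: matrix M with row 1 and column j removed
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

print(det([[1, 2, 3], [4, 5, 6], [10, 14, 18]]))  # 0 -- case (a), dependent rows
print(det([[1, 2, 3], [0, 4, 5], [0, 0, 6]]))     # 24 -- product of the diagonal
```

The recursion costs O(n!) operations, which is one reason the determinant is a theoretical tool rather than a practical algorithm for large matrices.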

1.7 Rank

- The rank of a matrix is the number of independent rows (or columns):

      r = rank(A),   A ∈ R^(n x m)    (9)

- The number of independent rows is always equal to the number of independent columns.

Example:

      A = [r1^T; r2^T; r3^T] = [1 2 3 4; 5 6 7 8; 6 8 10 12]

r3 = 1.0 r1 + 1.0 r2, so the set of 3 row vectors is linearly dependent. Because there are 2 independent vectors,

      rank(A) = 2    (10)

- The third row is the sum of the first two. Notice also that the 3rd column is 2 x Col2 - Col1, and the 4th column is 3 x Col2 - 2 x Col1.

- The rank of a matrix can not be greater than the number of rows or columns: rank(A) ≤ min(n, m).

- A matrix is said to be full rank if rank(A) takes its maximum possible value, that is, rank(A) = min(n, m). Otherwise the matrix is said to be rank deficient.

- A square matrix is invertible if it is full rank.

- The determinant of a square matrix is zero if the matrix is rank deficient.

1.8 The norm of a vector

The norm of a vector, written ||x||, is a measure of the size of the vector. A norm is any function ||.||: R^n -> R with these properties, for any vectors x, v and scalar a:

1. Positivity: the norm of any vector x is a non-negative real number,

      ||x|| ≥ 0

2. Triangle inequality:

      ||x + v|| ≤ ||x|| + ||v||

3. Positive homogeneity, or positive scalability:

      ||a x|| = |a| ||x||

   where |a| is the absolute value of a.

Actually, property 1 follows from properties 2 and 3, so 2 and 3 are sufficient for the definition.

Additionally, the norms we will use have the property:

4. Positive definiteness:

      ||x|| = 0   if and only if   x = 0

That is, the norm is zero only if the vector is a vector of all zeros (called the null vector). Technically, a function satisfying properties 1-3 but not property 4 is called a seminorm, but this distinction will not be important for us.
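The defining properties can be spot-checked numerically. A small Python sketch (mine, not from the notes) that samples random vectors and verifies properties 2 and 3 for the Euclidean norm:

```python
import math
import random

def norm2(x):
    """Euclidean norm, ||x||_2 = sqrt(sum of x_i^2)."""
    return math.sqrt(sum(xi * xi for xi in x))

random.seed(0)
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(3)]
    v = [random.uniform(-5, 5) for _ in range(3)]
    a = random.uniform(-5, 5)
    # Property 2: triangle inequality (small tolerance for float rounding)
    assert norm2([xi + vi for xi, vi in zip(x, v)]) <= norm2(x) + norm2(v) + 1e-12
    # Property 3: positive homogeneity
    assert abs(norm2([a * xi for xi in x]) - abs(a) * norm2(x)) < 1e-9
print("properties 2 and 3 hold on 1000 random samples")
```

Of course a finite sample proves nothing; the check is only a sanity test of the definitions.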

1.8.1 Example norms

- The 2-norm (a.k.a. the Euclidean norm):

      ||x||_2 = sqrt( Σ_{i=1}^n xi^2 )    (11)

  Example:

      x^T = [2 -3 -4],   ||x||_2 = sqrt( 2^2 + (-3)^2 + (-4)^2 ) = 5.39    (12)

- Other common norms are the 1-norm and the ∞-norm; these are all special cases of the p-norm. We can write the 2-norm as:

      ||x||_2 = ( Σ_i |xi|^2 )^(1/2)    (13)

  The p-norm is given as:

      ||x||_p = ( Σ_i |xi|^p )^(1/p)    (14)

- The 1-norm is the sum of the absolute values (also called the Manhattan metric):

      ||x||_1 = Σ_i |xi|    (15)

- The ∞-norm is the maximum absolute value:

      ||x||_∞ = lim_{p -> ∞} ( Σ_i |xi|^p )^(1/p) = max_i |xi|    (16)

  Examples:

      || [2 -3 -4] ||_1 = 9,    || [2 -3 -4] ||_∞ = 4

1.9 Basic properties of the singular value decomposition (svd)

- A matrix A ∈ R^(n x m) has p singular values, where p = min(n, m). For example:

      A = [1 2 3; 4 5 6; 7 6 5; 2 1 1]

      >> S = svd(A)
      S = 14.1542
           2.5515
           0.3857

- The σi, the singular values, are a measure of how a matrix scales a vector. For example, for matrix A there is a vector v1 so that A v1 is scaled by 14.1542, and another vector v3 so that A v3 is scaled by 0.3857.

- With example matrix A, choosing v1 and v3 to illustrate the largest and smallest singular values (more later on choosing v1 and v3) gives:

      v1 = [-0.5763; -0.5735; -0.5822],    v3 = [0.3724; -0.8184; 0.4375]

      >> y1 = A*v1          >> y3 = A*v3
      y1 = -3.4700          y3 =  0.0482
           -8.6660                0.0227
          -10.3862               -0.1159
           -2.3083                0.3640

  Now ||y1|| = 14.1542 ||v1|| and ||y3|| = 0.3857 ||v3||.

- Since σ1 = 14.1542 is the largest singular value, there is no vector v that gives a y = A v larger than y1 (relative to ||v||). Since σ3 = 0.3857 is the smallest singular value, there is no vector v that gives a y = A v smaller than y3.
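The claim that no vector is scaled by more than σ1 can be probed numerically without an SVD routine. A Python sketch (mine, not from the notes; `mat_vec`/`norm2` are my helper names) that samples random directions and watches the ratio ||A v|| / ||v|| approach, but never exceed, the largest singular value reported above:

```python
import math
import random

# The 4x3 example matrix from the text, and its largest singular value per svd(A).
A = [[1, 2, 3], [4, 5, 6], [7, 6, 5], [2, 1, 1]]
sigma_max = 14.1542

def mat_vec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def norm2(v):
    return math.sqrt(sum(x * x for x in v))

random.seed(1)
best = 0.0
for _ in range(20000):
    v = [random.gauss(0, 1) for _ in range(3)]
    best = max(best, norm2(mat_vec(A, v)) / norm2(v))
print(best)  # close to sigma_max, never above it (up to rounding)
```

Random sampling gets close to σ1 because the maximizing direction v1 occupies a sizable cone of directions in R^3; the exact maximizer is what the SVD computes.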

1.9.1 SVD, condition number and rank

- The SVD is used to determine the rank and condition number of a matrix.

- rank(A) = number of singular values > 0. For the example above, rank(A) = 3.
  (Read "number of singular values significantly different from zero".)

- cond(A) = ratio of largest to smallest singular value. For the example, cond(A) = 36.70.

- For square matrices, |det(A)| = Π_{i=1}^n σi. For the 4x3 example, det(A) is undefined.

- We will be looking at the SVD in detail in several weeks (cf. Bay, Chap. 4).

1.10 The condition number of a matrix

The condition number of a matrix indicates how errors may be scaled when the matrix inverse is used. For example, if we have these two equations in two unknowns,

      7 = 3 x1 + 4 x2
     -9 = -1 x1 + 2 x2    (17)

writing in the matrix-vector form, y = A x becomes

      [7; -9] = [3 4; -1 2] [x1; x2] = A1 x    (18)

We can solve for unknowns x1, x2 with

      x = A1^-1 y = [0.2 -0.4; 0.1 0.3] [7; -9] = [5; -2]    (19)

1.10.1 Condition number and error

But suppose the ideal values y* are unknown, and instead we have measurements ȳ given by:

      ȳ = y* + ỹ    (20)

where

      ȳ is a measurement of y,
      y* is the ideal value of y (generally unknown), and
      ỹ represents measurement noise.

For example, ỹ = N(0, σy) are samples with a normal (Gaussian) random distribution, with zero mean and σy standard deviation.

Then what we can compute is

      x̂ = A1^-1 ȳ    (21)

where x̂ is an estimate of x*. Notation:

      x  : a vector in R^p (to be estimated)
      x* : the true value of x
      x̂  : the estimated value of x
      x̃ = x̂ - x* : misadjustment of the estimates

An important question is: how much error does measurement noise ỹ introduce into the calculation of x̂ ?

One answer: in the worst case, the error is amplified by the condition number of A1,

      max ( ||x̃|| / ||x*|| ) = cond(A1) ||ỹ|| / ||y*||,   or   ||x̃|| ≤ ||x*|| cond(A1) ||ỹ|| / ||y*||    (22)

Example:

      >> A1 = [3 4 ;-1 2]
      A1 =  3   4
           -1   2
      >> ystar = [7; -9]
      ystar =  7
              -9
      >> xhat0 = A1 \ ystar      %% Notice left-division operator
      xhat0 =  5
              -2                 %% Given by calculation with no noise
      >> cond(A1)
      ans = 2.6180               %% This is the condition number.

Example numerical values: if ||ỹ|| = 0.01, then

      max ( ||x̃|| / ||x*|| ) = 2.6180 (0.01/11.4) = 0.0023,   or   ||x̃|| ≤ 2.6180 (0.01/11.4) 5.39 = 0.012    (23)

      >> ytilde1 = 0.01 * rand(2,1)      %% An example sample of noise
      ytilde1 =  0.0087
                -0.0093
      >> xhat1 = A1 \ (ystar + ytilde1)
      xhat1 =  5.0055       %% Errors of about 0.0023 * ||x*||
              -2.0019       %%   = 0.0023 * 5.39 = 0.012

1.10.2 Example of a badly conditioned matrix

Rather than Eqn (17), suppose we have the data

        2 = 2 x1 + 4 x2
      198 = 200 x1 + 401 x2

Then x is estimated by x̂2 = A2^-1 ȳ2:

      >> A2 = [2 4 ;200 401], ystar2 = [2; 198]
      A2 =   2     4
           200   401
      ystar2 =   2
               198

- The condition number of A2 is large:

      >> cond(A2)
      ans = 1.0041e+05 = 100410.

      >> xstar2 = A2 \ ystar2
      xstar2 =  5      %% Calculating with ideal data still gives
               -2      %% ideal results

- Estimating x̂2 using A2:

      >> ytilde2 = 0.01 * rand(2,1)      %% An example sample of noise
      ytilde2 =  0.0121
                -0.0140
      >> ybar2 = ystar2 + ytilde2        %% An example measurement of y
      ybar2 =   2.0121
              197.9860
      >> xhat2 = A2 \ ybar2
      xhat2 =  7.4592      %% This is way off !
              -3.2266
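The contrast between the two systems can be reproduced without MATLAB. A Python sketch (mine, not from the notes) that solves both 2x2 systems by the explicit inverse and applies the same noise sample to each:

```python
import math

def solve2(A, y):
    """Solve a 2x2 system via the explicit inverse (Cramer's rule)."""
    (a, b), (c, d) = A
    det = a * d - b * c
    return [(d * y[0] - b * y[1]) / det, (a * y[1] - c * y[0]) / det]

def norm2(v):
    return math.sqrt(sum(x * x for x in v))

noise = [0.0087, -0.0093]   # the noise sample printed in the notes
errs = []
for A, ystar in [([[3, 4], [-1, 2]], [7, -9]),        # cond ~ 2.62
                 ([[2, 4], [200, 401]], [2, 198])]:   # cond ~ 1.0e5
    xstar = solve2(A, ystar)                          # exact data -> [5, -2]
    xhat = solve2(A, [y + e for y, e in zip(ystar, noise)])
    errs.append(norm2([h - s for h, s in zip(xhat, xstar)]))
print(errs)  # small misadjustment for A1, misadjustment near 2 for A2
```

The same +-0.01 noise barely moves the well-conditioned estimate, but perturbs the badly conditioned one by an amount comparable to the answer itself.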

1.10.3 How condition number is determined

- The condition number is given as the ratio of the largest to smallest singular value:

      cond(A) = σmax(A) / σmin(A)    (24)

  where σmax(A) is the largest singular value of A, and σmin(A) is the smallest singular value of A. We will learn about the singular values later in this course.

      >> SingularValues = svd(A1)
      SingularValues = 5.117
                       1.954        %% cond(A1) = 2.618

      >> SingularValues = svd(A2)
      SingularValues = 448.1306
                         0.0045     %% cond(A2) = 1.0041 x 10^5

- Which matrices have condition numbers: singular values are defined for any matrix, and so the condition number can be computed for any matrix.

- A rank-deficient matrix B has σmin(B) = 0, and so cond(B) = ∞. (The error multiplier goes to infinity !)

1.11 Eigenvalues and eigenvectors

- A square matrix has eigenvectors and eigenvalues, making up the eigensystem of the matrix. The main properties of eigenvectors and eigenvalues are introduced here.

- Consider: a vector has both a direction and a magnitude. For example:

      v1 = [1; 2],    v2 = [1.5; 3.0],    v3 = [1.58; 1.58]

  v1 has the same direction as v2, but a different magnitude. v3 has the same magnitude as v1, but a different direction. (Figure: v1, v2 and v3 drawn in the x1-x2 plane.)

- In general, multiplying a vector v by a matrix A introduces both a change of magnitude and a change of direction. For example:

      y4 = A v4 = [2.0 -0.5; 1.0 0.5] [1; -1] = [2.5; 0.5]

  (Figure: v4 and y4 = A v4 drawn in the x1-x2 plane.)

- For any square matrix A, certain vectors have the property that the matrix changes magnitude only, not direction. That is, writing

      y = A v

  if v is one of these special vectors, then y and v have the same direction, but possibly different magnitude. If the directions are the same,

      y = λ v,   or   λ v = A v

  These special vectors are called the eigenvectors.

- For example matrix A, the special vectors are:

      [2.0 -0.5; 1.0 0.5] [1; 1] = [1.5; 1.5] = 1.5 [1; 1],   so v1 = [1; 1], λ1 = 1.5

      [2.0 -0.5; 1.0 0.5] [1; 2] = [1.0; 2.0] = 1.0 [1; 2],   so v2 = [1; 2], λ2 = 1.0

  An eigenvector is scaled by the matrix. (Student exercise: verify these products.)

- In general, we can write:

      λi vi = A vi    (25)

  where the λi ∈ C are the eigenvalues and the vi are the eigenvectors.

- Notice that, for a general vector x, there is no scalar a with a x = A x, because multiplication by matrix A will rotate a general vector x. (Choosing vectors x at random, what is the probability of selecting an eigenvector ?)

1.11.1 Some properties of Eigenvectors and Eigenvalues

1. Only square matrices have an eigensystem.

2. The eigenvalues are the solutions to the equation

      det(A - λi I) = 0    (26)

   Proof: From Eqn (25),

      A vi - λi vi = 0,   so   (A - λi I) vi = 0

   But, given that vi ≠ 0, the second expression is only possible if (A - λi I) is rank deficient, that is, if det(A - λi I) = 0. QED

Example using det(A - λI) = 0: Starting with

      det [a b; c d] = ad - bc

and plugging in,

      det( [2.0 -0.5; 1.0 0.5] - λ [1 0; 0 1] ) = det [2.0-λ  -0.5; 1.0  0.5-λ]

          = (2.0 - λ)(0.5 - λ) - 1.0 (-0.5) = λ^2 - 2.5 λ + 1.5 = 0

The expression leads to a polynomial equation in λ; its roots are the eigenvalues, here λ1 = 1.5 and λ2 = 1.0.

3. The polynomial equation given by Eqn (26) is called the characteristic equation of the matrix A.

   (a) A 3x3 matrix gives an equation in λ^3, a 4x4 gives an equation in λ^4, etc.

   (b) Abel's theorem states that there is no closed-form solution for the roots of a polynomial of 5th order and above; therefore there is no closed-form solution for the eigenvalues of a 5x5 matrix or larger.

4. Special case: when A is upper-triangular, lower-triangular or diagonal,

      det(A - λI) = Π_{k=1}^n (dk - λ)    (27)

   where dk is the kth diagonal element. Eqn (27) shows that for upper-triangular, lower-triangular and diagonal matrices, the eigenvalues are the diagonal elements.

5. A complete set of eigenvectors exists when an n x n matrix has n linearly independent eigenvectors (so V, the matrix of eigenvectors, is invertible). For the example matrix

      A = [2.0 -0.5; 1.0 0.5]

   the matrix of eigenvectors is

      V = [v1 v2] = [1 1; 1 2]

   and the matrix of eigenvalues is

      U = [λ1 0; 0 λ2] = [1.5 0; 0 1.0]

6. A matrix that lacks a complete set of eigenvectors is said to be defective, but this name is meaningful only from a mathematical perspective. For many control systems, some combinations of parameter values will give a defective system matrix.

   - A defective matrix can only arise when there are repeated eigenvalues.

   - This case corresponds to the case of repeated roots in the study of ordinary differential equations, where solutions of the form y(t) = t e^(λt) arise.

   - In this case, a special tool called the Jordan Form is required to solve the equation ẋ(t) = A x(t) + B u(t).
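For a 2x2 matrix the characteristic equation is just a quadratic, so the eigenvalues come straight from the quadratic formula. A Python sketch (mine, not from the notes; real eigenvalues assumed), using the example matrix whose characteristic equation λ^2 - 2.5 λ + 1.5 = 0 was worked out above:

```python
import math

def eig2(A):
    """Eigenvalues of a 2x2 matrix from det(A - lambda I) = 0, i.e.
    lambda^2 - (a+d) lambda + (ad - bc) = 0. Assumes real roots."""
    (a, b), (c, d) = A
    tr = a + d            # trace, the sum of the eigenvalues
    det = a * d - b * c   # determinant, the product of the eigenvalues
    disc = math.sqrt(tr * tr - 4 * det)
    return (tr + disc) / 2, (tr - disc) / 2

A = [[2.0, -0.5], [1.0, 0.5]]
print(eig2(A))  # (1.5, 1.0), the eigenvalues found in the text
```

This is exactly the closed-form route that, by Abel's theorem, stops being available at 5x5 and above.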

7. The eigensystem is computed in Matlab with the eig() command:

      >> [V, U] = eig(A)

   V is the matrix of eigenvectors; U is a matrix with the corresponding eigenvalues on the main diagonal.

      >> A = [ 1  2  3
              -2  4  1
               5  6  3 ]
      >> [V, U] = eig(A)
      V =  0.5276    0.5267 - 0.0250i    0.5267 + 0.0250i
           0.2873   -0.1294 + 0.1030i   -0.1294 - 0.1030i
          -0.7994    0.8335              0.8335
      U = -2.4563    0                   0
           0         5.2282 + 0.5919i    0
           0         0                   5.2282 - 0.5919i

8. Notice that eigenvalues can be real or complex. The characteristic equation corresponding to the 3x3 matrix A is

      λ^3 - 8 λ^2 + 2 λ + 68 = 0    (28)

      >> Poly = poly(diag(U))
      Poly = 1.0000   -8.0000    2.0000   68.0000

As with many things, Wikipedia has a very nice article on eigenvectors and eigenvalues: everything you wanted to know about the eigensystem of A, in 3 pages.

1.11.2 Additional notes on Eigenvectors and Eigenvalues

1. As mentioned, some matrices do not have a complete set of eigenvectors. (Student Exercise: What is the name for this ?)

2. Symmetric matrices are a special case. For a symmetric real matrix (or Hermitian complex matrix), it is guaranteed that all eigenvalues are real. Furthermore, it is guaranteed that a complete set of eigenvectors exists !

   System matrices (A in ẋ = A x) are rarely symmetric, but certain other important matrices are.

3. The determinant of a matrix is equal to the product of the eigenvalues of the matrix:

      det(A) = Π_{i=1}^n λi

   Example:

      >> aa = [ 1 2 3 ; 4 -1 -3; 5 2 1]
      aa = 1   2   3
           4  -1  -3
           5   2   1
      >> [Evecs, Evals] = eig(aa)           %% Find the eigensystem
      Evecs = 0.6128    0.4667    0.2146
              0.0136   -0.8751   -0.8542
              0.7901   -0.1276    0.4734
      Evals = 4.9126    0.0000    0.0000
              0.0000   -3.5706    0.0000
              0.0000    0.0000   -0.3421
      >> prod(diag(Evals))                  %% Product of the eigenvalues
      ans = 6.0000
      >> det(aa)                            %% Equals the determinant
      ans = 6.0000

4. The determinant equation

      det(A - λI) = 0    (29)

   gives a polynomial in λ:

      λ^n + a_{n-1} λ^(n-1) + ... + a1 λ + a0 = 0    (30)

   for the example, 1 λ^3 - 8 λ^2 + 2 λ + 68 = 0.

   Eqn (30) is the characteristic equation of matrix A. The eigenvalues of A are the roots of Eqn (30).

   Eqn (29) is useful for theoretical results, but generally not a practical method for calculating the eigenvalues, for n > 2:

   - Once you get Eqn (30), how are you going to find the roots of the polynomial ? (Answer below.)

   - Working with the determinant does not lead to a numerically stable algorithm.

   - The determinant does not give the eigenvectors.

   - Going from Eqn (29) to Eqn (30) by solving the determinant involves symbolic manipulation of matrix A. It is a lot of work.

5. Matlab does not create the characteristic equation using det(A - λI) = 0 and then solve for the polynomial roots.

   Matlab actually goes the other way, and solves for the roots of a polynomial by forming a matrix and finding the eigenvalues of that ! The polynomial λ^3 - 8 λ^2 + 2 λ + 68 = 0 is represented in Matlab as the vector

      Poly = [ 1, -8, 2, 68]

   and the command roots() finds the roots:

      >> roots(Poly)
      ans = [ 5.2282 + 0.5919i
              5.2282 - 0.5919i
             -2.4563 ]

   Under the hood, Matlab forms a matrix (called the companion matrix) and applies the eigenvalue routine to that:

      >> compan(Poly)
      ans = 8.0000   -2.0000  -68.0000
            1.0000    0         0
            0         1.0000    0

      >> eig(compan(Poly))
      ans = [ 5.2282 + 0.5919i
              5.2282 - 0.5919i
             -2.4563 ]

   In general, for λ^3 + a2 λ^2 + a1 λ + a0 = 0, the companion matrix is

      C = [ -a2  -a1  -a0
             1    0    0
             0    1    0 ]
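Building the companion matrix is purely mechanical. A Python sketch (mine, not from the notes) that reproduces the layout MATLAB's compan() prints for Poly = [1, -8, 2, 68]:

```python
def companion(poly):
    """Companion matrix of a polynomial given as [1, a_{n-1}, ..., a_0]
    (highest coefficient first), in the compan()-style layout with the
    negated, normalized coefficients across the first row."""
    n = len(poly) - 1
    C = [[0.0] * n for _ in range(n)]
    C[0] = [-c / poly[0] for c in poly[1:]]
    for i in range(1, n):
        C[i][i - 1] = 1.0   # ones on the subdiagonal
    return C

print(companion([1, -8, 2, 68]))
# [[8.0, -2.0, -68.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
```

Feeding this matrix to an eigenvalue routine returns the roots of the polynomial, which is exactly the trick roots() uses.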

Relationship:

      Roots of the Characteristic Equation <-> Eigenvalues of the Companion Matrix

- For a state-variable model, the poles of the system are the eigenvalues of the A matrix. If the A matrix has poles in the right half plane, the system is unstable.

- Note: I haven't told you an algorithm for eig(A). The algorithm is: Use Matlab !

  Matlab's eig() command uses one from a library of algorithms, depending on the details of the matrix. The study of efficient algorithms to find eigenvectors and eigenvalues has been an active area of research for at least 200 years.

1.11.3 One final fact about the eigensystem: V diagonalizes A

- When a complete set of eigenvectors exists, the matrix of the eigenvectors can be used to transform A into a diagonal matrix, by similarity transform. The general form for the similarity transform is:

      Â = T A T^-1,   T must be invertible    (31)

THEOREM: The similarity transform preserves eigenvalues, that is,

      eig(A) = eig(Â)

Proof: Starting with Eqn (26),

      det(A - λI) = 0

Left multiplying by T and right multiplying by T^-1:

      det( T (A - λI) T^-1 ) = det(T) det(A - λI) (1 / det(T)) = 0    (32)

where the right-hand equality arises because det(A - λI) = 0 and det(T) is finite. From (32) it follows that

      det( T (A - λI) T^-1 ) = det( T A T^-1 - λ T I T^-1 ) = det( Â - λI ) = 0

Therefore any value λ that is an eigenvalue of A is an eigenvalue of Â. QED

THEOREM: When a square matrix A has a complete set of independent eigenvectors

      V = [ v1, v2, ..., vn ],

the array V of eigenvectors provides a similarity transform

      U = V^-1 A V    (33)

where U is a diagonal matrix with the eigenvalues on the main diagonal.

Proof: Since the vectors vi are eigenvectors,

      A vi = λi vi

So

      A V = A [ v1, v2, ..., vn ] = [ λ1 v1, λ2 v2, ..., λn vn ]

The matrix on the right can be expressed:

      [ λ1 v1, λ2 v2, ..., λn vn ] = [ v1, v2, ..., vn ] diag(λ1, λ2, ..., λn) = V U

So

      V U = A V

Left multiplying both sides by V^-1 gives:

      V^-1 V U = U = V^-1 A V

Equation (33) is a similarity transform with T = V^-1 and Â = U. (And, of course, λi vi = A vi.) QED

V diagonalizes A (continued). A numerical example:

      A = [1 2 3; 4 5 6; 7 8 9]

      >> [V, U] = eig(A)
      V = -0.2320   -0.7858    0.4082
          -0.5253   -0.0868   -0.8165
          -0.8187    0.6123    0.4082
      U = 16.1168    0          0
           0        -1.1168     0
           0         0         -0.0000

We have:

      V^-1 A V = diag(16.1168, -1.1168, -0.0000) = U

and, of course,

      V U V^-1 = [1 2 3; 4 5 6; 7 8 9] = A
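The diagonalization U = V^-1 A V can be verified by hand for the 2x2 example from Section 1.11, whose eigenvectors [1; 1] and [1; 2] were found above. A Python sketch (mine, not from the notes):

```python
def mm(X, Y):
    """2x2 matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(X):
    """Explicit inverse of a 2x2 matrix."""
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2.0, -0.5], [1.0, 0.5]]   # example matrix from Section 1.11
V = [[1.0, 1.0], [1.0, 2.0]]    # columns are the eigenvectors v1, v2
U = mm(inv2(V), mm(A, V))       # Eqn (33): U = V^-1 A V
print(U)  # [[1.5, 0.0], [0.0, 1.0]] -- diagonal matrix of the eigenvalues
```

The off-diagonal entries come out exactly zero here because the arithmetic happens to be exact in floating point; in general they would be zero up to rounding.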

2 Two equations of interest

1. Solving for unknown constants (an algebraic equation):

      y = A b;    y, A : known;   solve for b    (34)

2. Solving a differential equation:

      ẋ(t) = A x(t) + B u(t);    A, B, x(t = 0) : known;   solve for x(t)

See Strang for a nice treatment of the two problems of linear algebra.

2.1 Algebraic Equations, y = A b

- Think of n equations in p unknowns, for example an experiment with a process:

      y(k) = b1 v(k) + b2 v^2(k)    (35)

  where v(k) is an independent variable and y(k) is the measurement. The objective is to determine the elements of the b vector from Eqn (35), with v(k) known and y(k) measured.

- Introducing some notation, we can write:

      y(k) = φ^T(k) b = [ v(k)  v^2(k) ] [b1; b2]    (36)

  where (in this example)

      φ(k) = [ v(k); v^2(k) ]    (37)

- For example, if b = [2 3]^T and v(1) = 2, v(2) = 3, ..., v(n) = -1, then

      y(1) = φ^T(1) b = [ 2  4] [2; 3] =  2*2 + 4*3 = 16
      y(2) = φ^T(2) b = [ 3  9] [2; 3] =  3*2 + 9*3 = 33
      ...
      y(n) = φ^T(n) b = [-1  1] [2; 3] = -1*2 + 1*3 = 1

With several measurements (equations) we get a matrix-vector equation (writing φ^T(k) to emphasize that each row of A is the row vector φ^T(k)):

      y = [ y(1) ; y(2) ; ... ; y(n) ] = [ φ^T(1) ; φ^T(2) ; ... ; φ^T(n) ] b = A b        (38)

Vocabulary

  - b ∈ R^p is the parameter vector, b = [ b1  b2 ]^T
  - y ∈ R^n is the data vector
  - v(k) ∈ R^1 is an independent variable that determines the regressor vector,
    φ(k) = [ v(k)  v^2(k) ]^T
  - φ(k) ∈ R^p is the regressor vector
  - A ∈ R^(n×p) is the regressor matrix
  - The columns of A, which correspond to the elements of φ(·), are called the
    basis vectors or basis functions of model A

Notation

  - y, b : true values
  - ŷ, b̂ : estimated values
  - ỹ, b̃ : misadjustment, e.g. ỹ = y - ŷ

Generally, the true values are unknown.

2.1.1 Case 1: Where we have n = p independent equations, the exactly constrained case

In this case A ∈ R^(n×n) is a square matrix. When A is full rank an inverse matrix exists so that

      A^-1 A = A A^-1 = I

The solution to Eqn (34) is found by left multiplying by A^-1:

      y = A b̂                           (original equation)
      A^-1 y = A^-1 A b̂ = I b̂ = b̂       (left multiply by A^-1)    (39)

which gives the solution

      b̂ = A^-1 y                                                   (40)

Example: take φ^T(k) = [ v(k)  v^2(k) ], based on Eqn (36). Suppose we have data for k = 1, 2, with v(1) = 1.5, v(2) = 3 and y(1) = 12.75, y(2) = 43.5. Then

      A = [ φ^T(1) ] = [ 1.5  2.25 ] ,     y = [ y(1) ] = [ 12.75 ]
          [ φ^T(2) ]   [ 3.0  9.0  ]           [ y(2) ]   [ 43.5  ]

and so

      b̂ = A^-1 y = [ 1.5  2.25 ]^-1 [ 12.75 ] = [ 2.5 ]
                   [ 3.0  9.0  ]    [ 43.5  ]   [ 4.0 ]
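The exactly constrained example can be checked in a few lines. A Python/NumPy sketch (NumPy substituted for the notes' Matlab):

```python
import numpy as np

# Case 1 (n = p): the 2x2 example, b_hat = A^-1 y
A = np.array([[1.5, 2.25],
              [3.0, 9.0]])
y = np.array([12.75, 43.5])

# np.linalg.solve is preferred over forming inv(A) explicitly
b_hat = np.linalg.solve(A, y)       # gives [2.5, 4.0]
```

`solve` factors A rather than inverting it, which is numerically better behaved; the result is the same b̂ = A^-1 y as in the derivation.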

2.1.2 Case 2: Where n > p, we have more equations than unknowns, the over-constrained case (left pseudo-inverse case)

This is a common case. For example, a model has 3 parameters, and we run an experiment and collect 20 data. We have 20 equations in 3 unknowns.

In this case there is generally no value of b̂ which exactly satisfies y = A b̂. Define ỹ = y - A b̂; the left pseudo-inverse gives the solution minimizing ||ỹ||_2.

To derive the left pseudo-inverse, start with:

      y = A b̂

1. Left multiply each side by A^T:

      A^T y = A^T A b̂

   Note that A^T A is a p × p matrix, for example a 3 × 3 matrix when there are
   3 unknowns.

2. When A^T A ∈ R^(p×p) is full rank, left multiplying each side by (A^T A)^-1
   gives:

      (A^T A)^-1 A^T y = (A^T A)^-1 (A^T A) b̂ = I b̂                (41)

3. Which gives

      b̂ = (A^T A)^-1 A^T y = A# y                                  (42)

where A# = (A^T A)^-1 A^T is called the left pseudo-inverse of A.

Consider the dimensions of A and A#: A is n × p (tall), while A# is formed from a p × p inverse times the p × n matrix A^T.

Student exercise: what are the dimensions of A# ?
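The left pseudo-inverse can be sketched numerically. The data below are assumed for illustration, generated from the quadratic regressor model with b = [2, 3]:

```python
import numpy as np

# Case 2 (n > p): 4 equations, 2 unknowns, noise-free data
v = np.array([1.0, 2.0, 3.0, 4.0])
A = np.column_stack([v, v**2])          # 4x2 regressor matrix
y = A @ np.array([2.0, 3.0])

A_sharp = np.linalg.inv(A.T @ A) @ A.T  # left pseudo-inverse A# = (A^T A)^-1 A^T
b_hat = A_sharp @ y                     # recovers [2, 3]
```

Note that A# A = I (a 2 × 2 identity) even though A A# is not the 4 × 4 identity; that asymmetry is exactly why A# is called a *left* pseudo-inverse.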

2.1.3 Case 3: Where n < p we have fewer equations than unknowns, the under-constrained case (right pseudo-inverse case)

In the under-constrained case, A is a wide matrix (more columns than rows), so in y = A b̂ the data vector y is short and the parameter vector b̂ is long.

In this case there are generally many values of b̂ which exactly satisfy y = A b̂; the right pseudo-inverse gives the solution with minimum ||b̂||_2.

The right pseudo-inverse is given by:

      A+_R = A^T (A A^T)^-1                                        (43)

When A A^T ∈ R^(n×n) is full rank then:

      A A+_R = A A^T (A A^T)^-1 = I                                (44)

In this case, one solution for b̂ is given by:

      b̂ = A+_R y = A^T (A A^T)^-1 y

Plugging Eqn (43) into y = A b̂ gives:

      A b̂ = A A^T (A A^T)^-1 y = I y = y

demonstrating that b̂ = A+_R y is a solution.

Student exercise: what are the dimensions of A+_R ?

Remarks

1. For each of cases 1, 2 and 3 we had the requirement that A, A^T A or A A^T
   must be full rank. When these matrices are not full rank, a more general
   method based on the Singular Value Decomposition (SVD) is needed.

2. With the SVD and the four fundamental spaces of matrix A we will be able to
   find the set of all vectors b̂ which solve y = A b̂. We will see the SVD and
   the four fundamental spaces of a matrix in Bay, Chapter 4.
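A small sketch of the right pseudo-inverse, with an example system assumed for illustration (one equation, three unknowns):

```python
import numpy as np

# Case 3 (n < p): one equation in three unknowns
A = np.array([[1.0, 2.0, 3.0]])            # 1x3
y = np.array([14.0])

A_plus_R = A.T @ np.linalg.inv(A @ A.T)    # A+_R = A^T (A A^T)^-1, shape 3x1
b_hat = A_plus_R @ y                       # minimum-norm exact solution

b_other = np.array([14.0, 0.0, 0.0])       # another exact solution, larger norm
```

Both `b_hat` and `b_other` satisfy y = A b̂ exactly, but `b_hat` has the smaller 2-norm, which is the defining property of the right pseudo-inverse solution.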

2.2 Differential Equations, ẋ(t) = A x(t) + B u(t)

The linear state-variable model is a very general form for modeling dynamic systems in control theory, economics and operations research, biology and other fields. The state-variable model is a differential equation with the form:

      ẋ(t) = A x(t) + B u(t)                                       (45)

where

  - x(t) ∈ R^n is the state vector,
  - A ∈ R^(n×n) is the system matrix,
  - B ∈ R^(n×m) is the input matrix, and
  - u(t) ∈ R^m is the input vector.

The state-variable model is the topic of the second half of the course. Here we only note that solving the differential equation

      ẋ(t) = A x(t) + B u(t)

is fundamentally different from solving the algebraic equation

      y = A b

and involves the eigensystem of the system matrix A, using a similarity transform and transformation to modal coordinates to determine the solution.

Just as solving an algebraic equation,

      3 x = 7

is very different from solving a scalar differential equation,

      3 ẋ(t) = 2 x(t) + 7 u(t)

solving a matrix differential equation requires tools very different from those for solving a matrix algebraic equation. The differences between the solution of algebraic and differential equations are summarized in Table 1.

    -------------------------------------------------------------------------
    Equation Type                  Algebraic               Differential
    -------------------------------------------------------------------------
    Nature of solution             A value                 A function of time

    Main tools                     Gaussian elimination,   Eigenvalues,
                                   matrix inverse, SVD     eigenvectors

    Singular matrix                Problems; solution      O.K.
                                   requires SVD

    Complete set of eigenvectors   Not important           Important

    Rectangular matrix             OK                      Impossible
    -------------------------------------------------------------------------

    Table 1: Differences in the approach to algebraic and differential
    equations. A singular matrix is rank deficient.
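To preview why eigenvectors matter for the differential equation, here is a sketch of solving the homogeneous case ẋ(t) = A x(t) through the eigendecomposition, x(t) = V e^(Λt) V^-1 x(0). The example matrix and initial condition are assumptions for illustration only:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])      # example system matrix, eigenvalues -1 and -2
x0 = np.array([1.0, 0.0])         # example initial condition

lam, V = np.linalg.eig(A)

def x(t):
    # x(t) = V exp(Lambda t) V^-1 x0, using the diagonalization A = V Lambda V^-1
    return (V @ np.diag(np.exp(lam * t)) @ np.linalg.inv(V) @ x0).real

# Sanity check: the derivative of x(t) should equal A x(t)
h = 1e-6
t = 0.5
deriv = (x(t + h) - x(t - h)) / (2 * h)
```

In modal coordinates the coupled system decouples into scalar equations, each solved by a simple exponential; that is the payoff of the similarity transform previewed here.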

3 Summary

The basics of linear algebra have been reviewed, and the two problems of linear algebra have been discussed.

  - Algebraic equations

        y = A b                                                    (46)

    take us into vector subspaces, projection and the Singular Value
    Decomposition (SVD).

  - Differential equations

        ẋ(t) = A x(t) + B u(t)                                     (47)

    take us into the eigensystem and coordinate transformations.

EE/ME 701: Advanced Linear Systems

Part 2: Vectors and Vector Spaces

Contents

1 Introduction .................................................... 5
  1.1 Definitions: three ways to multiply vectors .................. 6
  1.2 Definitions: Properties of Vectors ........................... 7
      1.2.1 Parallel ............................................... 7
      1.2.2 Perpendicular (or Orthogonal) .......................... 8
      1.2.3 Norm (magnitude) ....................................... 9
  1.3 Direction Cosine ............................................ 10
  1.4 Parallel, Perpendicular and the Zero Vector ................. 11

2 Vector Spaces .................................................. 12
  2.1 Properties the scalars in Eqn (12) must have ................ 14
  2.2 Properties a Vector Space must have ......................... 15

3 Properties of ensembles of vectors ............................. 17
  3.1 Defining a vector space ..................................... 19
      3.1.1 A vector space defined by a spanning set of vectors ... 20
      3.1.2 A vector space defined by a set of basis vectors ...... 20
  3.2 Dimension of a Vector Space ................................. 21

4 Basis of a Vector Space ........................................ 23
  4.1 Representation of vector x on basis V ....................... 24
      4.1.1 Introducing notation to help keep track of vectors and bases 26
      4.1.2 Sn: the standard basis ................................ 28
      4.1.3 Representations on vector spaces other than Rn ........ 30
  4.2 A Span of a Vector Space .................................... 31
  4.3 Change of basis ............................................. 32
      4.3.1 Notation for a change of basis ........................ 33
      4.3.2 Chaining changes of basis ............................. 34
      4.3.3 Change of basis example ............................... 34
  4.4 Change of basis viewed geometrically (this section is connected
      to Bay, section 2.2.4) ...................................... 36
      4.4.1 Example, change of basis viewed geometrically ......... 37
      4.4.2 Numerical example based on representing "from" basis
            vectors on the "to" basis vectors ..................... 38
      4.4.3 Summary ............................................... 40

5 Vector Subspace (following Bay 2.4) ............................ 41
  5.1 Example proper vector subspace .............................. 43
      5.1.1 Observations on subspaces of Euclidean spaces ......... 45
  5.2 What about other dimensions of A ? .......................... 46
      5.2.1 The set-theoretic meaning of "almost all" ............. 46
      5.2.2 A vector y orthogonal to a proper subspace B .......... 47

6 Projection Theorem ............................................. 48
  6.1 Projection Theorem .......................................... 50
  6.2 Projection of a vector onto a proper subspace ............... 51
      6.2.1 First projection example, projection onto a 1-D subspace 51
  6.3 Normalization of the basis vectors .......................... 54
  6.4 Projection Matrices ......................................... 56
      6.4.1 Bay example 2.10, projecting f onto a 2-D subspace .... 57
      6.4.2 Projection matrix for the orthogonal complement ....... 60
      6.4.3 Projection with normalized basis vectors .............. 61

7 Gram-Schmidt ortho-normalization ............................... 62
  7.1 Process of Gram-Schmidt Ortho-normalization ................. 63
      7.1.1 Example Gram-Schmidt Ortho-normalization .............. 63
      7.1.2 Tolerance value for Gram-Schmidt algorithm ............ 67
  7.2 Projection matrix with GS Ortho-normalization ............... 68
  7.3 Projection Coefficients ..................................... 69
  7.4 Projection onto the orthogonal complement ................... 70
  7.5 Projection and fitting parameters to experimental data ...... 71

8 Additional Topics .............................................. 72
  8.1 The Four Fundamental Spaces of a Matrix ..................... 72
      8.1.1 Numerical Examples of the four fundamental spaces ..... 74
      8.1.2 Computing bases for the four fundamental spaces ....... 78
      8.1.3 Bases for the Four Fundamental Spaces, Numerical Example 80
      8.1.4 The Four Fundamental Spaces of a Matrix, revisited .... 81
      8.1.5 Questions that can be answered with the four fundamental
            spaces ................................................ 83
      8.1.6 Two ways to determine the four fundamental spaces ..... 83
  8.2 Rank and degeneracy ......................................... 84

9 Summary and Review ............................................. 86
  9.1 Important properties of Inner Products ...................... 87
  9.2 Important properties of Norms ............................... 88
  9.3 Study tip ................................................... 89

Part 2: Vectors and Vector Spaces        (Revised: Sep 10, 2012)        Page 1

1 Introduction

A vector: an n-tuple of numbers or functions.

      A 3-vector:  x = [ 1 ; 2 ; 3 ]

      An n-vector: z = [ z1 ; z2 ; ... ; zn ]

      A 4-vector:  x(k) = [ sin(2πk/4) ; cos(2πk/4) ; sin(4πk/4) ; cos(4πk/4) ]

      A 2-vector:  u(t) = [ 2.0 sin(t) ; 2.0 ]

Euclidean Vector: A Euclidean n-space corresponds to our intuitive notion of 2-D or 3-D space:

  - Axes are orthogonal
  - Each vector corresponds to a point
  - Intuitive because we live in Euclidean 3-space, R^3
  - Length (the L2-norm) corresponds to our every-day notion of length.

1.1 Definitions: three ways to multiply vectors

Three types of vector products are:

  - Inner product (also scalar or dot product): gives a scalar value,
  - Outer product (also tensor or matrix product): gives a matrix, and
  - Vector product (also cross product): gives a vector result.

Inner (scalar) product (also dot product):

A scalar measure of the interaction of two vectors of the same dimension:

      <v, w> = v^T w = w^T v = Σ_{i=1}^{n} v_i w_i ,   where v, w ∈ R^n      (1)

If v is complex, then v^T is the complex-conjugate transpose (note: in Matlab the transpose operation ' forms the complex conjugate).

The inner product of two vectors gives a scalar:

      <v, w> = v^T w = (a scalar)

Outer (matrix) product:

The outer product gives a matrix:

      v >< w = v w^T = (a matrix)

Example:

      v = [ 1 ; 2 ; 3 ] ,   w = [ 2 ; 3 ; 1 ]

      <v, w> = 11 ,    v >< w = [ 2 3 1
                                  4 6 2
                                  6 9 3 ]

Example of using the outer product: each contribution to the covariance of parameter estimates is given by:

      Σ²_b̃ = b̃ b̃^T

Cross (vector) product:

      v × w = z                                                    (2)

  - Produces a vector perpendicular to each of v and w.
  - The magnitude of the vector is given by:

        ||z|| = ||v|| ||w|| sin(θ)                                 (3)

    where θ is the angle between v and w.

  - The cross-product is only defined for vectors in 3-space. It is used, for
    example, in electro-magnetics, 3-D geometry and image metrology.

The cross product and outer product are not much used in EE/ME 701; the inner (or dot) product will be used all the time.
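The three products can be computed directly. A Python/NumPy sketch using the example vectors above (NumPy substituted for the notes' Matlab):

```python
import numpy as np

v = np.array([1, 2, 3])
w = np.array([2, 3, 1])

inner = v @ w              # <v, w> = 1*2 + 2*3 + 3*1 = 11, a scalar
outer = np.outer(v, w)     # v w^T, a 3x3 matrix
cross = np.cross(v, w)     # a vector perpendicular to both v and w
```

The perpendicularity of the cross product can be confirmed by checking that its inner product with both v and w is zero.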

1.2 Definitions: Properties of Vectors

1.2.1 Parallel

Two vectors v and w are parallel if they can be written

      w = a v                                                      (4)

where a is a scalar.

Example:

      v = [ 1 ; 2 ; 3 ] ,   w = [ 3 ; 6 ; 9 ] = 3 [ 1 ; 2 ; 3 ] ,   so  v ∥ w

"Co-linear" is another term for parallel vectors, as in "vectors v and w are co-linear."

1.2.2 Perpendicular (or Orthogonal)

If

      <v, w> = 0                                                   (5)

then v and w are said to be perpendicular (or, equivalently, orthogonal).

Example:

      v = [ 1 ; 2 ; 3 ] ,   w = [ 1 ; -5 ; 3 ]

      v^T w = [ 1 2 3 ] [ 1 ; -5 ; 3 ] = 1 - 10 + 9 = 0 ,   so  v ⊥ w      (6)
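The parallel and perpendicular tests can be sketched numerically, using the example vectors above:

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])
w_par = np.array([3.0, 6.0, 9.0])      # w = 3 v, so v and w_par are parallel
w_perp = np.array([1.0, -5.0, 3.0])    # <v, w> = 1 - 10 + 9 = 0, perpendicular

def direction_cosine(a, b):
    # <a, b> / (||a|| ||b||): +/-1 for parallel vectors, 0 for perpendicular
    return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
```

The `direction_cosine` helper anticipates Section 1.3: its value is the cosine of the angle between the two vectors.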

1.2.3 Norm (magnitude)

  - The norm of a vector x is a non-negative real number, ||x|| ≥ 0.
  - A norm is a measure of size (of a vector or matrix).
  - The Lp norms are most familiar. (L: Henri Lebesgue (1875-1941), who set
    calculus (integration and differentiation) on a rigorous foundation based
    on set theory.)

The L2 norm, ||v||_L2 (or ||v||_2, or simply ||v||), is the familiar notion of distance:

      ||v||_2 = sqrt( Σ_i v_i² )                                   (7)

The general Lp norm, ||v||_Lp (or simply ||v||_p):

      ||v||_p = ( Σ_i |v_i|^p )^(1/p) = ( |v1|^p + |v2|^p + ... + |vn|^p )^(1/p)      (8)

The L2 norm is the default; throughout the literature, ||v|| refers to ||v||_2 unless expressly defined to be a different norm (or any norm).

The L2 norm is related to the dot product of a vector with itself:

      ||v||_2 = <v, v>^(1/2) = ( v^T v )^(1/2)                     (9)

In a general framework, vectors may be made with elements which are not numbers, but so long as the dot product is defined, the 2-norm is still given by Eqn (9). The L2-norm is said to be induced by the dot product; it is called the induced norm.

1.3 Direction Cosine

The inner product indicates how closely two vectors are related.

If two vectors are parallel:

      |<v, w>| = ||v|| ||w||    or    <v, w> / ( ||v|| ||w|| ) = ±1.0

(This is the correct test for ∥.)

If two vectors are perpendicular:

      <v, w> = 0    or    <v, w> / ( ||v|| ||w|| ) = 0

(This is the correct test for ⊥.)

For vectors forming angles in between ∥ and ⊥, define θ:

      <v, w> / ( ||v|| ||w|| ) = cos(θ)                            (10)

A handy corollary is:

      θ = cos^-1( <v, w> / ( ||v|| ||w|| ) )                       (11)

[Figure 1: Direction cosine.]
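Eqns (7)-(11) can be sketched with NumPy (example vectors assumed for illustration):

```python
import numpy as np

v = np.array([3.0, -4.0])

l1 = np.linalg.norm(v, 1)       # L1 norm: |3| + |-4| = 7
l2 = np.linalg.norm(v)          # L2 norm (the default): sqrt(9 + 16) = 5
l2_induced = np.sqrt(v @ v)     # same value via the dot product, Eqn (9)

# Direction cosine and angle between two vectors, Eqns (10)-(11)
a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])
theta = np.arccos((a @ b) / (np.linalg.norm(a) * np.linalg.norm(b)))   # pi/4
```

`np.linalg.norm` computes ||v||_2 by default, matching the convention in the notes that ||v|| means the 2-norm unless stated otherwise.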

1.4 Parallel, Perpendicular and the Zero Vector

Looking at Eqn (4):

      w = a v                                                      (4, repeated)

If w = 0 (the zero vector) then

      0 = [ 0 ; 0 ; ... ; 0 ] = w = 0 · v

with a = 0.

And likewise, if w = 0, then clearly

      <v, w> = 0                                                   (5, repeated)

So in a formal sense, we can say that every vector is both parallel and perpendicular to the zero vector.

2 Vector Spaces

When we talk about vector spaces we talk about sets of vectors. Closure is the first important property of sets.

R^n is the most basic vector space. It is the set of all vectors with n elements.

Closure: A set is said to be closed under an operation if the operation is guaranteed to return an element of the set.

Examples:
  - The integers are closed under addition, subtraction and multiplication.
    The integers are not closed under division.
  - The positive integers are closed under addition and multiplication.
    The positive integers are not closed under subtraction or division.

Vector spaces are collections of vectors which are closed under scaled addition (like superposition). For a vector space S:

      S : { a set of vectors }
      if    x1, x2 ∈ S
      then  x3 = a1 x1 + a2 x2 ∈ S                                 (12)

The central topic of Bay, chapter 2, is describing the ai, xi and S in the most general possible way. Let's start with a specific example, S = R^3, the space in which we live.

Defining a vector x:

      x = [ x1 ; x2 ; x3 ] ,   x1, x2, x3 ∈ R                      (13)

Then S = {x} = R^3: the vector space S is the set of all vectors x comprised of 3 real numbers.

Examples:

      [ 2 ; 1 ; 5 ] ,   [ 0 ; 0 ; 0 ]   ∈ S

Verifying that a1 x1 + a2 x2 ∈ S with an example (choosing the ai to be real numbers):

      2.0 [ 1 ; 2 ; -1 ] + 1.0 [ -2 ; 1 ; 5 ] = [ 0 ; 5 ; 3 ]      (14)

2.1 Properties the scalars in Eqn (12) must have

Eqn (12) comprises scalars and vectors. The scalars must be elements of a field. For a set with operations to form a field, it must have:

1. An additive identity 0:

      a + 0 = a                                                    (15)

2. A multiplicative identity 1:

      a · 1 = a                                                    (16)

3. Every element must have a negative:

      if a ∈ F, then -a ∈ F,   with   a + (-a) = 0                 (17)

4. Operations of addition, multiplication and division must be defined, and
   the set must be closed under these operations.

Examples:
  - The integers do not form a field (not closed under division).
  - The rational numbers do form a field (+, -, ×, ÷ all work).
  - The field we will use most often is the real numbers, a ∈ R.
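The scaled-addition closure property can be sketched numerically, using example vectors in R^3:

```python
import numpy as np

# Closure under scaled addition in S = R^3: a1 x1 + a2 x2 is again a 3-vector
x1 = np.array([1.0, 2.0, -1.0])
x2 = np.array([-2.0, 1.0, 5.0])

x3 = 2.0 * x1 + 1.0 * x2        # [0, 5, 3], still an element of R^3
```

The result has the same shape as x1 and x2, so the set R^3 is closed under this operation; the scalars 2.0 and 1.0 are drawn from the field R.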

2.2 Properties a Vector Space must have

These are the fundamental properties of vector spaces and operations on vectors. Other important properties, quantities and calculations are built out of these basics.

A vector space is a set of vectors. It must have at least these properties:

1. S must have at least 2 elements.

2. The linear vector space S must contain a zero element, 0, such that:

      x + 0 = x ,   ∀ x ∈ S                                        (18)

   (∀ : "for all")

3. Vector space S must contain the additive inverse of each element:

      if x ∈ S, then ∃ y ∈ S s.t. x + y = 0                        (19)

   (∃ : "there exists")

The vector space (a set of vectors) and its operations must also have the properties:

1. The vector space S must be closed under addition:

      if x + y = v, then v ∈ S                                     (20)

2. Commutative property for addition:

      x + y = y + x                                                (21)

3. Associative property for addition:

      (x + y) + z = x + (y + z)                                    (22)

4. Closure under scalar multiplication: for all x ∈ S and for all a ∈ F, the
   vector y = a x must be an element of S:

      ∀ x ∈ S, ∀ a ∈ F ,   y = a x ∈ S                             (23)

5. Associativity of scalar multiplication: for a, b ∈ F and x ∈ S,

      a (b x) = b (a x)                                            (24)

6. Distributivity of scalar multiplication over vector addition:

      (a + b) x = a x + b x                                        (25)

      a (x + y) = a x + a y                                        (26)

The fundamental properties of fields are such standard components of ordinary algebra that we forget to think about them.

Any vector space (that is, any set of vectors which satisfies the fundamental properties above) will have the derived properties of vector spaces which we will develop, such as orthogonality, projection, length, etc.

For 99% of what we will do in EE/ME 701, we will work with vectors in R^n over the field of the real numbers, which is to say the familiar vectors and scalars, such as in Eqn (14).

3 Properties of ensembles of vectors

In many cases it is interesting to consider a finite set of vectors, such as:

      v1, v2, ..., vp ∈ S                                          (27)

Example:

      V = { [ 1 ; 2 ; 0 ] , [ 0 ; 0 ; 1 ] , [ 1 ; 1 ; 1 ] } ⊂ R^3

We shall shortly see that a finite set of vectors can define a vector space.

Notation for a finite set of vectors: let V connote a set with p vectors:

      V = { v1, v2, ..., vp } ,   or equivalently,   V = [ v1 v2 ... vp ]      (28)

New vectors can be formed as linear combinations of vectors v1 ... vp:

      w = a1 v1 + a2 v2 + ... + ap vp

Recall that a set of vectors is linearly dependent if one or more of the vectors can be expressed as a linear combination of the others, such as:

      vp = b1 v1 + b2 v2 + ... + b_{p-1} v_{p-1}

The expression for linear dependence can also be written:

      vp = [ v1 v2 ... v_{p-1} ] [ b1 ; b2 ; ... ; b_{p-1} ] = V b      (29)

And recall that a set of vectors is linearly independent if there is no set of values {bi} which satisfies the conditions for linear dependence. In other words,

      Σ_{i=1}^{p} bi vi = 0    implies that the {bi} are all zero.
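Linear independence can be tested numerically with the matrix rank. A sketch, with example vectors assumed for illustration:

```python
import numpy as np

v1 = np.array([2.0, -1.0, 1.0])
v2 = np.array([-1.0, 0.0, 2.0])
v3 = v1 + v2                    # deliberately a linear combination of v1 and v2

V2 = np.column_stack([v1, v2])
V3 = np.column_stack([v1, v2, v3])

rank2 = np.linalg.matrix_rank(V2)   # 2: {v1, v2} is linearly independent
rank3 = np.linalg.matrix_rank(V3)   # still 2: {v1, v2, v3} is linearly dependent
```

When the rank of the matrix of stacked column vectors is less than the number of columns, some {bi}, not all zero, satisfies Σ bi vi = 0, i.e. the set is dependent.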

3.1 Defining a vector space

In many cases, vector spaces contain an infinite number of vectors, so it is not possible to define a vector space by writing down all its elements. However, we can define a vector space in terms of a finite set of vectors.

Most often, we define a vector space as the set of all vectors which can be created by linear combinations of specified vectors. For example:

      S1 = { x : x = a1 [ 2 ; -1 ; 1 ] + a2 [ -1 ; 0 ; 2 ] ,  a1, a2 ∈ R }      (30)

Vector space S1 is the set of all vectors x such that

      x = a1 [ 2 ; -1 ; 1 ] + a2 [ -1 ; 0 ; 2 ]

for any values of a1 and a2, which are real numbers.

Read Eqn (30) as saying:

      S1        Vector space S1 is
      { }       the set
      x         of vectors x
      :         such that
      x = ...   x satisfies the conditions given

3.1.1 A vector space defined by a spanning set of vectors

Given an arbitrary set of p n-dimensional vectors V = { v1, v2, ..., vp }, vi ∈ R^n, the set of vectors defines a vector space

      S = { x : x = a1 v1 + a2 v2 + ... + ap vp }                  (31)

where S is a vector space; S is the set of all vectors formed as linear combinations of the spanning vectors v1, v2, ..., vp. We can say that S is the space spanned by v1, v2, ..., vp, and that vectors v1, v2, ..., vp span vector space S.

Any set of vectors, at least one of which is non-zero, spans a vector space.

3.1.2 A vector space defined by a set of basis vectors

A span is a very general concept, perhaps a bit too general. The concept of a basis for a vector space is more restrictive. It works like a span: we write that the set of basis vectors {vi} defines a vector space according to Eqn (31). But for a set {vi} to be a set of basis vectors, the vectors must additionally be linearly independent. When the vectors vi are independent and we write

      x = a1 v1 + a2 v2 + ... + ar vr                              (32)

there is a unique solution {a1, a2, ..., ar} for every vector x ∈ S. Because the basis vectors vi are linearly independent, there is exactly one solution for Eqn (32).

  - A set of basis vectors is a minimum set of vectors that spans a space.
  - For a given vector space S, the choice of basis vectors is not unique.
  - Any basis set for S is also a span for S. (Is any span of S also a basis
    for S?)

3.2 Dimension of a Vector Space

Dimension: The dimension of a linear vector space is the largest number of linearly independent vectors that can be taken from that space.

Examples:
  - The dimension of vector space R^n is n.
  - The dimension of the vector space given by Eqn (33) is 2.

Definition of vector space S1:

      S1 = { x : x = a1 [ 2 ; -1 ; 1 ] + a2 [ -1 ; 0 ; 2 ] ,
                 a1, a2 ∈ R } ,   x ∈ R^3                           (33)

Set S1 is a vector space. Thus, if v and w are elements of S1, then

      z = b1 v + b2 w ,   z ∈ S1 ,   b1, b2 ∈ R

Student exercise: consider the properties of a vector space one by one; you will see that they all hold.

A vector subspace is itself a vector space (if we call S1 a subspace, we are emphasizing that it is embedded in a larger vector space). Example: the universe is R^3, while S1 is 2-D.

The dimension of S1 is the number of linearly independent vectors that can be selected from S1. Note that even though v, w, z ∈ R^3, it is still true that dim(S1) = 2.

[Figure 2: S1, a 2-dimensional vector subspace embedded in R^3.]

4 Basis of a Vector Space

Working with a particular vector space V of vectors v, x, y ∈ R^n, if we select a set of vectors BV = { v1, v2, ..., vr } from V such that

1. we have r vectors, where r is the dimension of vector space V, and
2. the vectors are linearly independent,

then we have created a basis for vector space V.

Basis: A set of linearly independent vectors, BV, in vector space V is a basis for V iff every vector in V can be written as a unique linear combination of vectors from this set.

A more formal definition is given: BV = { v1, v2, ..., vr } is a basis for V iff

      ∀ x ∈ V, exactly one a = { a1, ..., ar } s.t.
      x = a1 v1 + a2 v2 + ... + ar vr                              (34)

Example basis:

      BV = { [ 2 ; -1 ; 1 ] , [ -1 ; 0 ; 2 ] }

BV is a basis for a vector space V which has dimension 2. Basis BV is illustrated by the two vectors in figure 2.

A basis for V is not unique; any set of r independent vectors in V is a basis for V.

4.1 Representation of vector x on basis V

VECTOR REPRESENTATION THEOREM: Given a set of vectors BV = {vi} = { v1, v2, ..., vr } that are basis vectors for r-dimensional vector space V embedded in R^n, then any vector sx ∈ V can be represented as a unique linear combination of basis vectors

      sx = a1 v1 + a2 v2 + ... + ar vr ,   x, v1, ..., vr ∈ R^n     (35)

Proof:

Existence: Since V is r-dimensional and the set of vectors { x, v1, v2, ..., vr } contains r + 1 vectors, the set must be linearly dependent. Since the set is linearly dependent there exists a set of scalars ai (called the basis coefficients) such that -x + a1 v1 + a2 v2 + ... + ar vr = 0, which gives

      x = a1 v1 + a2 v2 + ... + ar vr                              (36)

Uniqueness: Suppose the ai giving vector x were not unique. This means there is a second set of basis coefficients {bi}, distinct from the values {ai}, such that

      x = b1 v1 + b2 v2 + ... + br vr                              (37)

Subtracting Eqn (37) from Eqn (36) gives

      0 = (a1 - b1) v1 + (a2 - b2) v2 + ... + (ar - br) vr         (38)

But since the vi are linearly independent, Eqn (38) is possible only if each of the (ai - bi) = 0, which is to say that {ai} = {bi}, and so the basis coefficients representing vector x on {vi} are unique. QED

Eqn (36) gives us a way to represent a vector x on the basis vectors.

Example:

      V = { v1, v2 } = { [ 2 ; -1 ; 1 ] , [ -1 ; 0 ; 2 ] } ,   x = [ 1 ; -2 ; 8 ]

Inspection shows that

      x = 2 v1 + 3 v2 = [ v1 v2 ] [ 2 ; 3 ]

Definition: The vector a = [ 2 ; 3 ] is the representation of the vector x on basis V.

Define V to be the vector subspace spanned by V. The dimension of vector subspace V is 2. Notice that x, v1, v2 ∈ R^3 while a ∈ R^2:

  - Vectors x, v1, v2 have the dimension of the universe in which vector
    subspace V is embedded.
  - The representation of a vector on a subspace always has the dimension of
    the subspace (which may be less than the dimension of the universe).

4.1.1 Introducing notation to help keep track of vectors and bases

Sometimes we are interested in multiple bases. In some cases a vector x will have a representation on each basis. It helps to have some notation to keep track of things.

Given a vector space S1 of vectors sx ∈ R^n with a basis V = [ v1, v2 ] and also with a different basis W = [ w1, w2 ], then a vector sx ∈ S1 can be represented 3 ways:

      sx ,   Vx ,   Wx                                             (39)

where the left superscript indicates the basis on which the vector is represented. The vector sx is represented on the standard or Euclidean basis. This is the ordinary representation of the vector (more on this later).

Example: given the bases V and W for vector subspace S1, and given the vector sx below, find the representation of sx on each of bases V and W.

      sx = [ 1 ; -2 ; 8 ] ,   V = [ v1 , v2 ] = [  2  -1
                                                  -1   0
                                                   1   2 ]

Solution: since the columns of V are one basis for S1, we know there exist basis coefficients a1 and a2 to represent sx on V. The relationships can be written:

      sx = a1 v1 + a2 v2 = [ v1 , v2 ] [ a1 ; a2 ] = V Vx ,   so  Vx = [ a1 ; a2 ]

When r < n, we can solve for the basis coefficients with the left pseudo-inverse. Recalling that it is given that sx ∈ S1, then since sx = V Vx:

      Vx = ( V^T V )^-1 V^T sx                                     (40)

Solving, we find Vx = [ 2 ; 3 ].

Example (Matlab):

>> V
V =
     2    -1
    -1     0
     1     2
>> x
x =
     1
    -2
     8
>> Vx = inv(V'*V) * V' * x
Vx =
     2
     3

Confirm that x is represented by Vx = [ 2 ; 3 ]:

>> V * Vx        %% should give sx
ans =
     1
    -2
     8

so Vx = [ 2 ; 3 ] is the representation of vector x ∈ R^3 on basis V.

The calculation of Eqn (40) always gives a well-defined result. The only calculation that might not give a valid result is ( V^T V )^-1. We are assured the solution exists by the fact that the columns of V (the basis vectors) are linearly independent.

Notice that x ∈ R^3 while Vx ∈ R^2: the vector has the dimension of the universe, while the representation of the vector has the dimension of the subspace.

We can also represent vector x on basis W:

      x = b1 w1 + b2 w2 = [ w1 , w2 ] [ b1 ; b2 ] = W Wx

      Wx = ( W^T W )^-1 W^T x = [ 2.4 ; -0.2 ]

4.1.2 Sn: the standard basis

A basis that we have been using all along is the standard basis in R^n. For R^3 the basis vectors are:

      S3 = [ 1 0 0
             0 1 0
             0 0 1 ] = [ s1 s2 s3 ]                                (41)

So we might write x = S3 sx. In words: every vector x is represented on the standard basis by representation sx, with sx = x, and Sn = I (the n × n identity matrix) is the set of standard basis vectors.
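The Matlab computation above has a direct Python/NumPy equivalent (NumPy is an assumption; the basis V and vector x are those of the example):

```python
import numpy as np

V = np.array([[2.0, -1.0],
              [-1.0, 0.0],
              [1.0, 2.0]])       # basis vectors as the columns of V
x = np.array([1.0, -2.0, 8.0])   # sx, the vector on the standard basis

# Eqn (40): representation of x on basis V via the left pseudo-inverse
Vx = np.linalg.inv(V.T @ V) @ V.T @ x     # gives [2, 3]
```

Reconstructing `V @ Vx` returns the original 3-vector, confirming that the 2-dimensional representation loses nothing for vectors that lie in the subspace.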

EE/ME 701: Advanced Linear Systems

Section 4.1.2

EE/ME 701: Advanced Linear Systems

Section 4.1.3

4.1.3 Representations on vector spaces other than R^n

Thus we have found three ways to represent vector x:

1. On the standard basis:  x = S3 sx
2. On the V basis:         x = V Vx = a1 v1 + a2 v2
3. And on the W basis:     x = W Wx = b1 w1 + b2 w2

Basis Facts:

- A representation is always on a specific basis: Vx is the representation of vector x on basis V.
- You can't talk about the representation of a vector without specifying the basis.
- The representation of a given vector on a given basis is unique.

The dimension of a representation:

- Notice that vi, wi are in R^3, whereas Vx, Wx are in R^2. This is because, for this example, the universe has dimension n = 3, while the vector subspace has dimension r = 2.
- Each vector in a universe has the dimension of the universe, but a representation of a vector has the dimension of the basis on which it is represented.

We are chiefly concerned with vectors of numbers, but there are other types of vectors and vector spaces, and our notions of basis and representation carry over.

Consider, for example, the set of polynomials in s of degree up to 2, such as p(s) = 3 + 7 s + s^2. It turns out that this set is a vector space. We can define the vector space P by

    P = { p(s) : p(s) = a1 + a2 s + a3 s^2 }

If we choose the set of basis functions

    P = { p1(s) = 1 , p2(s) = s , p3(s) = s^2 }

then the representation of a polynomial p(s) is simply

    p(s) = P px ,   with   px = [a1; a2; a3]

- Even when the vectors of the vector space are not vectors of numbers, a representation of a vector will be (because the representation is always an r-dimensional vector of basis coefficients).
- There is a one-to-one correspondence between a vector p in P and its representation px in R^r on basis P. The representation is the vector of basis coefficients. Equivalently, the vector and its representation are isomorphic.
- Useful tools, such as dot product and norm, can be applied to px.
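The polynomial example can be checked numerically. The sketch below is in Python/NumPy (the notes' examples use MATLAB), and the helper name p_of_s is just for illustration: it stores the representation px on the basis {1, s, s^2} and applies ordinary vector tools to it.

```python
import numpy as np

# p(s) = 3 + 7 s + s^2 represented on the basis P = {1, s, s^2}
px = np.array([3.0, 7.0, 1.0])        # representation px = [a1; a2; a3]

def p_of_s(s):
    # evaluate p(s) from its representation on the basis {1, s, s^2}
    return px @ np.array([1.0, s, s ** 2])

# px is an ordinary vector in R^3, so vector tools apply to it
norm_px = np.linalg.norm(px)
```

Because the representation is just a vector of numbers, the norm computed on px can serve as a norm for the polynomial itself.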

4.2 A Span of a Vector Space

Just as a set of basis vectors defines a vector space, a set of spanning vectors {v1, v2, ..., vp} defines a vector space:

    Vector Space V = { x : x = a1 v1 + a2 v2 + ... + ap vp }            (42)

- The vector space spanned by vectors v1, v2, ..., vp is the set of all vectors given as linear combinations of the spanning vectors.
- A vector space V is spanned by a set of vectors {vi} if every x in V can be written as a linear combination of the vi's.
- We can say that the vi's span Vector Space V. Write V = span{vi} (V is the vector space spanned by {vi}).

All bases are spans, but not all spans are bases. The difference: a span is any set of vectors, for example

    V = span{ [1; 2] , [0; 5] , [3; 1] , [1; 1] }                       (43)

defines vector space V.

- Compare Eqn (42) with Eqn (43); both define vector space V.
- The example shows that a span may have redundant vectors. Equivalently, the vectors of the spanning set may be linearly dependent.
- Shortly, we will learn how to form a basis from any span.

4.3 Change of basis

It may be interesting to transform a vector from its representation on one basis to its representation on another basis. A change of basis is important, for example, for exploring some properties of state variable models.

Example: with

              [ 2  -1 ]           [  3 ]
    V Vx  =   [-1   0 ] [4; 5] =  [ -4 ]  = sx
              [ 1   2 ]           [ 14 ]

find Wx, the representation of sx on basis W, with

          [ 1  -7 ]
    W  =  [-1   2 ]
          [ 3   4 ]

We actually know how to solve this problem, since we know how to find Wx from sx using W (see Eqn (40) in section 4.1.1). However, it will be handy to introduce a more general notation that expresses from what basis and to what basis a vector is being transformed.

The solution is: since sx = V Vx, Eqn (40) gives

    Wx = (W^T W)^{-1} W^T sx = (W^T W)^{-1} W^T V Vx                    (44)

so

                                  [ 0.60   0.40 ]        [ 4.4 ]
    Wx = (W^T W)^{-1} W^T V Vx =  [              ] Vx =  [     ]
                                  [-0.20   0.20 ]        [ 0.2 ]
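The change-of-basis computation above can be reproduced numerically. A minimal sketch in Python/NumPy (the notes use MATLAB), with the same V, W and Vx as the example; np.linalg.solve is used in place of an explicit inverse, which is a standard numerical practice rather than the notes' notation.

```python
import numpy as np

# bases for the same 2-D subspace of R^3, as in the example above
V = np.array([[2., -1.],
              [-1., 0.],
              [1.,  2.]])
W = np.array([[1., -7.],
              [-1., 2.],
              [3.,  4.]])

Vx = np.array([4., 5.])              # representation of x on basis V
sx = V @ Vx                          # x on the standard basis

# Eqn (44): Wx = (W'W)^{-1} W' V Vx, without forming the inverse
WVT = np.linalg.solve(W.T @ W, W.T @ V)
Wx = WVT @ Vx
```

Reconstructing W @ Wx recovers the same standard-basis vector sx, confirming that Vx and Wx represent the same x on different bases.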

4.3.1 Notation for a change of basis

To change from one representation to another requires a transformation matrix:

    Wx = W_V T  Vx                                                      (45)

where Wx is the representation on basis W, and Vx is the representation on V.

The symbol W_V T indicates that transformation matrix T converts a vector from its representation on basis V (left subscript) to basis W (left superscript). Using the left sub- and super-scripts leaves the right positions open; for example, W_V T1 and W_V T2 might be the transformations from V to W at t1 and t2.

Transformation from basis V to the standard basis: Since, by definition (see Eqn (39)), sx = V Vx, the transformation from basis V to the standard basis can be written

    sx = s_V T  Vx ,   with   s_V T = V

Note: when V is a square matrix, sx = V Vx can be inverted directly:

    V_s T = V^{-1}                                                      (46)

For the transformation from s to V when V is not square, we've seen that the basis coefficients are given by Vx = (V^T V)^{-1} V^T sx, so

    Vx = V_s T  sx ,   with   V_s T = (V^T V)^{-1} V^T                  (47)

4.3.2 Chaining changes of basis

We can chain together changes of basis. For example, given F_s T and G_F T, it follows that

    Gx = G_F T  F_s T  sx

which gives:

    G_s T = G_F T  F_s T                                                (48)

4.3.3 Change of basis example

Since

                         [ 2  2 ]        [ 2  4 ]
    sx = [10; 4; 11] ,   F = [-1  2 ] ,  G = [-4  1 ]
                         [ 1  3 ]        [-1  4 ]

find the transformation matrix to convert vector sx to its representation on basis F, and find the transformation matrix to convert this to basis G; that is, find F_s T and G_F T.

Solution: find the two transformation matrices F_s T and G_F T, and then evaluate

    Fx = F_s T  sx    and    Gx = G_F T  Fx

    %% Find the two transformation matrices
    >> FsT = inv(F'*F)*F'
    FsT =
        0.3117   -0.3506    0.0260
        0.0260    0.2208    0.1688
    >> GFT = inv(G'*G)*G'*F
    GFT =
        0.3333   -0.3333
        0.3333    0.6667

Now, evaluating Fx and Gx:

    >> sx = [10; 4; 11]
    sx =
        10
         4
        11
    >> Fx = FsT * sx
    Fx =
         2
         3
    >> Gx = GFT * Fx
    Gx =
       -0.3333
        2.6667

Double check that both Fx and Gx give the original sx:

    >> F*Fx
    ans =
        10
         4
        11
    >> G*Gx
    ans =
       10.0000
        4.0000
       11.0000

4.4 Change of basis viewed geometrically (this section is connected to Bay, section 2.2.4)

In section 4.3 we saw vectors represented on the standard basis and two particular bases: sx, Fx, and Gx, with

        [ 2  2 ]        [ 2  4 ]
    F = [-1  2 ] ,  G = [-4  1 ]                                        (49)
        [ 1  3 ]        [-1  4 ]

And we found the coordinate transformation matrix to transform between bases (see Eqn (44)):

    G_F T = (G^T G)^{-1} G^T F                                          (50)

which was derived by solving for the basis coefficients.

Bay presents a second way to derive the coordinate transformation in section 2.2.4, based on representing the basis vectors of the first basis in the second basis. Since F and G are sets of basis vectors for the same vector space, we can represent each basis vector in the other basis.
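The same calculation can be sketched in Python/NumPy (the notes use MATLAB); the variable names mirror the transformation-matrix notation of section 4.3.1.

```python
import numpy as np

F = np.array([[2., 2.],
              [-1., 2.],
              [1., 3.]])
G = np.array([[2., 4.],
              [-4., 1.],
              [-1., 4.]])
sx = np.array([10., 4., 11.])

FsT = np.linalg.solve(F.T @ F, F.T)        # F_s T = (F'F)^{-1} F'
GFT = np.linalg.solve(G.T @ G, G.T @ F)    # G_F T = (G'G)^{-1} G' F

Fx = FsT @ sx          # representation of x on F
Gx = GFT @ Fx          # representation of x on G, chained from Fx
```

Both F @ Fx and G @ Gx reconstruct the original sx, which is the double check performed in the MATLAB session above.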

Writing the basis vectors, explicitly showing them to be represented on the standard basis (these are just the basis vectors shown in Eqn (49); for example, sf1 = [2 -1 1]^T):

    F = [ sf1  sf2 ] ,    G = [ sg1  sg2 ]                              (51)

We can form G_F T by representing the F basis vectors on the G basis:

    F = G [ Gf1  Gf2 ]

Now

    x = F Fx = ( G [ Gf1  Gf2 ] ) Fx = G ( [ Gf1  Gf2 ] Fx ) = G Gx     (52)

Using the uniqueness of the representation of the vector sx,

    Gx = [ Gf1  Gf2 ] Fx = G_F T  Fx                                    (53)

So

    G_F T = [ Gf1  Gf2 ]                                                (54)

The transformation from one coordinate frame to another is given as the representation of the "from" basis vectors on the "to" basis vectors.

4.4.1 Example, change of basis viewed geometrically

In robotics, rotations are used to change the expression of a vector from one coordinate frame to another (that is, a change of basis). The rotation matrix from B to A is given by the axes of B expressed in A:

             [   |      |      |   ]
    A_B R =  [ A X_B  A Y_B  A Z_B ]                                    (55)
             [   |      |      |   ]

4.4.2 Numerical example based on representing the "from" basis vectors on the "to" basis vectors

    >> F = [ 2  2
            -1  2
             1  3 ]
    >> G = [ 2  4
            -4  1
            -1  4 ]

    >> sf1 = F(:,1)
    sf1 =
         2
        -1
         1
    >> sf2 = F(:,2)
    sf2 =
         2
         2
         3

    %% Find the representation of the F basis vectors on G
    >> Gf1 = inv(G'*G) * G' * sf1
    Gf1 =
        0.3333
        0.3333
    >> Gf2 = inv(G'*G) * G' * sf2
    Gf2 =
       -0.3333
        0.6667

4.4.3 Summary

    %% Transformation from F to G
    >> GFT = [Gf1 Gf2]
    GFT =
        0.3333   -0.3333
        0.3333    0.6667

    %% Representation of x on F
    >> Fx = [ 2; 3 ]
    Fx =
         2
         3

    %% Representation of x on G
    >> Gx = GFT * Fx
    Gx =
       -0.3333
        2.6667

    %% Finding sx from each of the representations on F and on G
    >> sx = F * Fx
    sx =
        10
         4
        11
    >> sx = G * Gx
    sx =
       10.0000
        4.0000
       11.0000

Looking at the way G_F T is given,

    G_F T = [ Gf1  Gf2 ]

a coordinate transformation is given by representing the "from" basis vectors on the "to" basis.

Note that the equations are, of course, the same. In section 4.3 we wrote

    G_F T = (G^T G)^{-1} G^T F

In section 4.4 we wrote

    G_F T = [ Gf1  Gf2 ]
          = [ (G^T G)^{-1} G^T sf1 ,  (G^T G)^{-1} G^T sf2 ]
          = (G^T G)^{-1} G^T [ sf1  sf2 ]
          = (G^T G)^{-1} G^T F

By representing the basis vectors of F in G, we can write the transformation from F to G.
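The equality of the two derivations can be checked numerically. A sketch in Python/NumPy assembling G_F T column-by-column from the representations of the F basis vectors on G, then comparing with the section-4.3 formula:

```python
import numpy as np

F = np.array([[2., 2.],
              [-1., 2.],
              [1., 3.]])
G = np.array([[2., 4.],
              [-4., 1.],
              [-1., 4.]])

# represent each F basis vector on G: the columns of G_F T (Eqn (54))
Gf1 = np.linalg.solve(G.T @ G, G.T @ F[:, 0])
Gf2 = np.linalg.solve(G.T @ G, G.T @ F[:, 1])
GFT_cols = np.column_stack([Gf1, Gf2])

# the same matrix from the section-4.3 formula, Eqn (50)
GFT_formula = np.linalg.solve(G.T @ G, G.T @ F)
```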

5 Vector Subspace (following Bay 2.4)

Recall the terminology from set theory: subset and proper subset. From the set of colors (the universe) {blue, green, red}:

- {blue, green, red} is a subset but not a proper subset: {blue, green, red} is a subset of Colors.
- {blue, red} is both a subset and a proper subset: {blue, red} is a proper subset of Colors.

Vector subspace: A vector space of dimension r, comprising vectors of n-dimensional elements, where r <= n.

- A vector subspace is a type of vector space; it must demonstrate all of the properties of a vector space.

Proper vector subspace: A vector space of dimension r, comprising vectors of n elements, where r < n.

A vector subspace is defined by a basis or a span, for example

    S1 = { x : x = a1 [2; -1; 1] + a2 [-1; 0; 2] ,  a1, a2 in R }       (56)

Then B = { [2; -1; 1] , [-1; 0; 2] } is a basis for S1.

- B = { [2; -1; 1] , [1; -1; 3] } is also a basis for S1.
- { [-1; 0; 2] , [1; -1; 3] , [2; -1; 1] } is one possible span for S1.

Exercise: Make up another span and basis for S1.

A vector space is a subset of the universe in which it is embedded. For example

    S2 = { x : x = a1 [1; 0; 0] + a2 [0; 1; 0] + a3 [0; 0; 1] ,  a1, a2, a3 in R }

with S2 a subset of R^3.

A proper vector subspace is a proper subset of the universe in which it is embedded. For example

    S1 = { x : x = a1 [2; -1; 1] + a2 [-1; 0; 2] ,  a1, a2 in R }

with S1 a proper subset of R^3.

5.1 Example proper vector subspace

An example 2-D subspace embedded in R^3:

    S1 = { x : x = a1 [2; -1; 1] + a2 [-1; 0; 2] ,  a1, a2 in R }       (57)

The set given by Eqn (57) is illustrated in figure 3.

[Figure 3: 2-D vector subspace embedded in R^3.]

Recall that vector space S1

- must include the zero element,
- must be closed under scalar multiplication,
- must be closed under vector addition.

Terminology:

- Given S and B are vector spaces, if B is a proper subset of S, then we can say that vector space B is embedded in vector space S. Such as: "Vector space S1 is embedded in R^3."
- "Proper vector subspace" is used to emphasize that r = dim(B) < dim(S).
- Often we omit the words "proper" and "vector", such as saying: "Subspace B is embedded in S, which is isomorphic to R^3." "Sub" can also be omitted: "Vector space B is embedded in space S."
- If a vector space is 2 dimensional, then dim(B) = 2 and we say that B is 2-D, such as: "B is a 2-D subspace in R^7." If dim(B) = r: "B is an r-D subspace in R^n."
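Membership in a subspace such as S1 can be tested numerically by projecting onto a basis and checking the residual. A sketch in Python/NumPy (in_subspace is a hypothetical helper, not from the notes), which also exercises the closure properties listed above:

```python
import numpy as np

# basis for the 2-D subspace S1 embedded in R^3, Eqn (57)
B = np.array([[2., -1.],
              [-1., 0.],
              [1.,  2.]])

def in_subspace(x, B=B, tol=1e-9):
    # least-squares coefficients of x on the basis; x lies in the
    # subspace exactly when the reconstruction residual vanishes
    a, *_ = np.linalg.lstsq(B, x, rcond=None)
    return float(np.linalg.norm(B @ a - x)) < tol

v = B @ np.array([1.0, -2.0])    # a member of S1 by construction
```

Scaling or adding members of S1 keeps the result inside S1, while a generic vector of R^3 fails the test.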

5.1.1 Observations on subspaces of Euclidean spaces

- Proper subspaces of R^3 have dimension 1 or 2.
- B with dim(B) = 0 contains only one vector: the zero vector. (Student exercise: does the zero vector, 0 in R^n, by itself satisfy the conditions to be a vector space?)
- B with dim(B) = 1: the vectors in B lie on a line passing through the origin.
- B with dim(B) = 2: the vectors in B lie in a plane which includes the origin.
- Proper subspaces of R^n have dimension 1 ... (n-1).
- A 2-D vector subspace can be referred to as a plane.
- An r-D vector subspace, with r < n, can be referred to as a hyperplane or hypersurface, as in: "The set of states of the system forms a 4-D hyperplane in R^12."
- A vector space of dim(S) > 3 can be referred to as a hyperspace, as in: "The points lie on the surface of a 6-dimensional hyper-cube lying in an 8-dimensional hyper-space."

5.2 What about other dimensions of A?

When B is a proper subspace of A, and thus r = dim(B) < dim(A), there must be vectors in A not lying in B.

- Actually, when dim(B) < dim(A), almost all vectors in A are outside B.
- Consider a class room with the origin in the center of the top of the desk in front. Almost all vectors in R^3 (the class room) do not lie in the surface of the desktop (a 2-D subspace). The surface of the desktop has zero volume, and so occupies 0% of the volume of the room.

5.2.1 The set-theoretic meaning of "almost all"

"Almost all y in A have property w" has a precise meaning in set theory: the elements of set A which do not have property w have measure 0.

Measure theory is a whole subject unto itself (remember Lebesgue). Think of it this way:

- If you are choosing y randomly from A, the chance of getting one with property w is 100% ("almost all y have property w").
- There are elements of A which do not have property w, but the chance of choosing one is infinitesimal. In set theory we say "the set of y in A without property w has zero measure" or, equivalently, "almost all y in A have property w."

5.2.2 A vector y orthogonal to a proper subspace B

When proper subspace B is embedded in A, a vector y in A but outside subspace B can have two relationships to B:

1. Vector y can partially overlap subspace B. Formally:

       there exists x in B  s.t.  ⟨x, y⟩ ≠ 0                            (58)

2. Vector y can be orthogonal to subspace B, which is to say that y does not overlap B at all. Formally:

       if  ⟨x, y⟩ = 0  for all x in B ,  then  y ⊥ B                    (59)

Or, equivalently,

       y ⊥ B   iff   ⟨x, y⟩ = 0  for all x in B                         (60)

Student exercise: show that if

       y ⊥ bi ,   {bi} a basis for B                                    (61)

then

       y ⊥ B                                                            (62)

6 Projection Theorem

Before introducing the projection theorem, we need to introduce a generalized notion of orthogonality.

Generalized Orthogonality: three flavors of orthogonality. For this discussion (as elsewhere) b, u, v, w, x, y and z are vectors in R^n; bold capitals, such as B, U, S and S1, refer to subspaces.

Case 1: A vector is orthogonal to a vector. This is our familiar case, based on the inner product:

       v ⊥ w   iff   ⟨v, w⟩ = 0                                         (63)

Case 2: A vector is orthogonal to a subspace. This implies that the vector is orthogonal to each vector in the subspace:

       v ⊥ B   iff   ⟨v, b⟩ = 0  for all b in B                         (64)

Case 3: A subspace is orthogonal to a subspace. This implies that each vector in the first subspace is orthogonal to each vector in the second subspace:

       B ⊥ U   iff   ⟨b, u⟩ = 0  for all b in B and all u in U          (65)
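The three flavors of orthogonality reduce to finite checks on basis vectors. A small Python/NumPy sketch with toy subspaces B = span{e1, e2} and U = span{e3} in R^4 (these particular subspaces are assumptions chosen for illustration, not from the notes):

```python
import numpy as np

B = np.array([[1., 0.],      # basis of B = span{e1, e2} in R^4
              [0., 1.],
              [0., 0.],
              [0., 0.]])
U = np.array([[0.],          # basis of U = span{e3}
              [0.],
              [1.],
              [0.]])
v = np.array([0., 0., 0., 2.])

# Case 1: vector-vector, v perpendicular to b1
case1 = np.isclose(v @ B[:, 0], 0.0)
# Case 2: vector-subspace, v perpendicular to B: check every basis vector
case2 = np.allclose(B.T @ v, 0.0)
# Case 3: subspace-subspace, B perpendicular to U: check every basis pair
case3 = np.allclose(B.T @ U, 0.0)
```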

6.1 Projection Theorem

Now let's define a projection:

Projection: A projection of a vector onto a subspace is a vector. It is the component of the original vector lying in the subspace. The remainder of the original vector must be orthogonal to the subspace.

- Looking at figure 4, u is the component of x lying in subspace U. This means that the other part of x (that part not in subspace U, given by w = x - u) is orthogonal to U, or w ⊥ U.
- Observation: the subspace can have dimension 1; that is, we can project a vector onto a vector.

[Figure 4: Orthogonal projection of vector x onto subspace U (from Bay): u in U, w = x - u, w ⊥ U.]

PROJECTION THEOREM: Given vector spaces U and S with U a proper subspace of S, then for all vectors x in S there exists a unique vector u in U such that w = x - u and w ⊥ U. Alternatively, in formal notation, given U a proper subspace of S,

    for all x in S, there exists a unique u in U  s.t.  w = x - u , and w ⊥ U      (66)

The projection theorem is illustrated in figure 4. It tells us that given any vector and subspace (not necessarily proper!) the vector can be broken down into 2 parts:

1. A part lying in the subspace, and
2. A part that is orthogonal to the subspace.

Notation: Introduce the notation P_U x for the projection of x onto U. When u is the projection of x onto U, we may write:

    u = P_U x

Consider the part which does not lie in subspace U. Define w = x - u; we may say that vector w is orthogonal to subspace U.

- The set of all possible w's forms a vector subspace!

Orthogonal Complement: if U is a subspace of S, the set {w : w in S, w ⊥ U} is the orthogonal complement of U in S, written U⊥.

6.2 Projection of a vector onto a proper subspace

In section 6.1 the projection theorem is laid out. The question arises: given S, U and x, how can u and w be determined? This is the problem of projecting a vector onto a subspace.

6.2.1 First projection example, projection onto a 1-D subspace

Consider vectors in R^4,

    f = [4; 0; 2; -1] ,    u1 = [1; 2; 2; 1]                            (67)

with a 1-D subspace U = span{u1}.

Suppose we want to find the projection of f onto U. Whatever is left over must be orthogonal to U, so:

    P_U f = α1 u1                                                       (68)

    w = f - P_U f ,    ⟨w, u1⟩ = 0                                      (69)

Because the inner product operation is linear, Eqn (69) can be broken into two parts. Keeping in mind that w = (f - P_U f), and that for the 1-D case P_U f = α1 u1:

    ⟨w, u1⟩ = ⟨(f - P_U f), u1⟩ = ⟨f, u1⟩ - ⟨P_U f, u1⟩ = ⟨f, u1⟩ - α1 ⟨u1, u1⟩ = 0     (70)

Starting with ⟨f, u1⟩ - α1 ⟨u1, u1⟩ = 0, we can solve for α1:

    α1 = ⟨f, u1⟩ / ⟨u1, u1⟩ = ⟨f, u1⟩ / ||u1||²                          (71)

The inner product gives us the projection of one vector onto another. Projection onto a vector is illustrated in figure 5.

Notice that the magnitude of the projected vector (P_U f) is independent of the length of the basis vector (u1).

[Figure 5: (a) Projection of f onto u1; (b) projection of f onto a rescaled version of u1. The projection P_U f is the same.]

Putting numbers to the example. Using f and u1 as given, Eqn (71) gives:

    P_U f = ( ⟨f, u1⟩ / ||u1||² ) u1 = ( 7 / 10 ) [1; 2; 2; 1] = [0.7; 1.4; 1.4; 0.7]      (72)

which has 2 parts:

1. The direction, given by u1. The projection of f must lie in the subspace spanned by u1.
2. The magnitude, given by the scaled inner product of f and u1. Call this magnitude the projection coefficient.

Using f and a shorter version of u1, u1' = [0.1; 0.2; 0.2; 0.1]:

    P_U f = ( ⟨f, u1'⟩ / ||u1'||² ) u1' = ( 0.7 / 0.1 ) [0.1; 0.2; 0.2; 0.1] = [0.7; 1.4; 1.4; 0.7]

Of course, the projection of f onto u1' is the same as onto u1.

6.3 Normalization of the basis vectors

If the length of u1 is one, then Eqn (72) reduces to:

    P_U f = ⟨f, û1⟩ û1                                                  (73)

and the projection coefficient is simply the inner product. This property is sufficiently handy that we need to call it something:

- Normal basis vector: a basis vector with length 1 is said to be normal, and is sometimes written with a hat: û1.
- Normalization: the process of making a vector normal.

Matlab example:

    >> u1 = [ 1 2 2 1 ]';
    >> u1hat = u1 / norm(u1);
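Eqns (71)-(72) and the rescaling observation can be verified directly. A Python/NumPy sketch (the notes use MATLAB) with the f and u1 of the example:

```python
import numpy as np

f = np.array([4., 0., 2., -1.])
u1 = np.array([1., 2., 2., 1.])

alpha1 = (f @ u1) / (u1 @ u1)       # Eqn (71): projection coefficient
PUf = alpha1 * u1                    # Eqn (72)

# rescaling the basis vector leaves the projection unchanged
u1s = 0.1 * u1
PUf_s = ((f @ u1s) / (u1s @ u1s)) * u1s
```

The leftover part f - PUf is orthogonal to u1, as Eqn (69) requires.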

6.4 Projection Matrices

Consider example 1 one more time, with û1 normalized:

    û1 = u1 / ||u1|| = [0.316; 0.632; 0.632; 0.316] ,   ||û1|| = 1      (74)

And so

    P_U f = ( ⟨f, û1⟩ / ||û1||² ) û1 = ⟨f, û1⟩ û1                        (75)

In the example, this gives

    P_U f = ⟨f, û1⟩ û1 = (2.214) [0.316; 0.632; 0.632; 0.316] = [0.70; 1.40; 1.40; 0.70]

One more interesting fact about projecting f onto U: since ⟨f, û1⟩ is a scalar, Eqn (73) can be written:

    P_U f = ⟨f, û1⟩ û1 = û1 ⟨û1, f⟩ = û1 ( û1^T f ) = ( û1 û1^T ) f      (76)

M = û1 û1^T is thus a term that multiplies a vector to give the projection of the vector onto a subspace:

        [ 0.316 ]                                [ 0.1  0.2  0.2  0.1 ]
    M = [ 0.632 ] [ 0.316 0.632 0.632 0.316 ] =  [ 0.2  0.4  0.4  0.2 ]
        [ 0.632 ]                                [ 0.2  0.4  0.4  0.2 ]
        [ 0.316 ]                                [ 0.1  0.2  0.2  0.1 ]

Now

          [ 0.1  0.2  0.2  0.1 ] [  4 ]   [ 0.70 ]
    M f = [ 0.2  0.4  0.4  0.2 ] [  0 ] = [ 1.40 ]
          [ 0.2  0.4  0.4  0.2 ] [  2 ]   [ 1.40 ]
          [ 0.1  0.2  0.2  0.1 ] [ -1 ]   [ 0.70 ]

Now for any vector f, the projection is given as: P_U f = M f.

- M is pretty handy also, and so let's call it a projection matrix or projection operator.

Also, defining W = U⊥, notice that

    w = f - P_U f = [3.3; -1.4; 0.6; -1.7] ,   w in W

Student exercise: what is the dimension of W?
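The rank-1 projection matrix M = û1 û1^T can be built and checked in a few lines. A Python/NumPy sketch; the symmetry and idempotence checks (M = M^T, M M = M) are added here as properties of projection matrices, not derivations from the notes:

```python
import numpy as np

f = np.array([4., 0., 2., -1.])
u1 = np.array([1., 2., 2., 1.])
u1hat = u1 / np.linalg.norm(u1)     # normalized basis vector

M = np.outer(u1hat, u1hat)          # projection matrix onto span{u1}
PUf = M @ f                          # projection of f
w = f - PUf                          # remainder, orthogonal to u1
```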

6.4.1 Bay example 2.10, projecting f onto a 2-D subspace

Consider again these vectors in R^4, and project f onto the subspace U = span{u1, u2}:

    f = [4; 0; 2; -1] ,   u1 = [1; 2; 2; 1] ,   u2 = [0; 0; 1; 1]       (77)

Solution: Since P_U f lies in U, P_U f can be written:

    P_U f = α1 u1 + α2 u2 ,   α1, α2 in R ,   u1, u2 in R^4 ,   U = span{u1, u2}      (78)

Writing f = α1 u1 + α2 u2 + w, and using Eqn (78) and the linearity of the inner product:

    ⟨f, u1⟩ = ⟨α1 u1 + α2 u2 + w , u1⟩ = α1 ⟨u1, u1⟩ + α2 ⟨u2, u1⟩ + 0      (79)

    ⟨f, u2⟩ = ⟨α1 u1 + α2 u2 + w , u2⟩ = α1 ⟨u1, u2⟩ + α2 ⟨u2, u2⟩ + 0      (80)

Eqns (79) and (80) may be written:

    [ ⟨f, u1⟩ ]   [ ⟨u1, u1⟩  ⟨u2, u1⟩ ] [ α1 ]
    [          ] = [                    ] [    ]                        (81)
    [ ⟨f, u2⟩ ]   [ ⟨u1, u2⟩  ⟨u2, u2⟩ ] [ α2 ]

Which solves to give:

    [ α1 ]   [ ⟨u1, u1⟩  ⟨u2, u1⟩ ]^{-1} [ ⟨f, u1⟩ ]
    [    ] = [                    ]      [          ]                   (82)
    [ α2 ]   [ ⟨u1, u2⟩  ⟨u2, u2⟩ ]      [ ⟨f, u2⟩ ]

Notice that if we write U = [u1 u2], then:

    [ ⟨u1, u1⟩  ⟨u2, u1⟩ ]
    [                    ] = U^T U                                      (83)
    [ ⟨u1, u2⟩  ⟨u2, u2⟩ ]

and also:

    [ ⟨u1, f⟩ ]
    [          ] = U^T f                                                (84)
    [ ⟨u2, f⟩ ]

and so:

    [ α1 ]
    [    ] = (U^T U)^{-1} U^T f                                         (85)
    [ α2 ]

The projection coefficients are given by Eqn (85). (Student exercise: Eqn (85), for projecting f onto basis vectors U, is related to what equation that we have seen before?)

Finally, the projection is given by:

    P_U f = α1 u1 + α2 u2 = U [α1; α2] = U (U^T U)^{-1} U^T f           (86)

And in projection matrix form:

    P_U f = M f    with    M = U (U^T U)^{-1} U^T                       (87)

Running the numbers:

        [ 1  0 ]
    U = [ 2  0 ]
        [ 2  1 ]
        [ 1  1 ]

            [ 1  2  2  1 ]       [ 10  3 ]
    U^T U = [ 0  0  1  1 ] U  =  [  3  2 ]

    [ ⟨f, u1⟩ ]           [ 1  2  2  1 ]                  [ 7 ]
    [          ] = U^T f = [ 0  0  1  1 ] [4; 0; 2; -1] =  [ 1 ]
    [ ⟨f, u2⟩ ]

    [ α1 ]   [ 10  3 ]^{-1} [ 7 ]   [  1 ]
    [    ] = [  3  2 ]      [ 1 ] = [ -1 ]
    [ α2 ]

so

    P_U f = 1.0 u1 - 1.0 u2 = [1; 2; 1; 0]

Since M = U (U^T U)^{-1} U^T,

        [  0.182   0.364   0.091  -0.091 ]                    [ 1.0 ]
    M = [  0.364   0.727   0.182  -0.182 ] ,   P_U f = M f =  [ 2.0 ]
        [  0.091   0.182   0.545   0.455 ]                    [ 1.0 ]
        [ -0.091  -0.182   0.455   0.545 ]                    [ 0.0 ]

6.4.2 Projection matrix for the orthogonal complement

Given a vector subspace U with an array of basis vectors U, using the projection P_U f we can find the orthogonal complement:

    w = f - P_U f = (I - M) f                                           (88)

where M is the projection matrix. We can say that

    w = M⊥ f

where M⊥ = I - M is the projection matrix for projecting a vector onto the orthogonal complement, and w lies in U⊥.

Using the data of the example,

                 [  0.818  -0.364  -0.091   0.091 ]
    M⊥ = I - M = [ -0.364   0.273  -0.182   0.182 ]
                 [ -0.091  -0.182   0.455  -0.455 ]
                 [  0.091   0.182  -0.455   0.455 ]

and

    w = M⊥ f = [3; -2; 1; -1]
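Bay example 2.10 and the complement projector can be reproduced numerically. A Python/NumPy sketch (the notes use MATLAB) with the U and f above:

```python
import numpy as np

f = np.array([4., 0., 2., -1.])
U = np.array([[1., 0.],
              [2., 0.],
              [2., 1.],
              [1., 1.]])

alpha = np.linalg.solve(U.T @ U, U.T @ f)    # Eqn (85): coefficients
M = U @ np.linalg.solve(U.T @ U, U.T)        # Eqn (87): projection matrix
Mp = np.eye(4) - M                            # complement projector, Eqn (88)

PUf = M @ f          # part of f lying in U
w = Mp @ f           # part of f orthogonal to U
```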

6.4.3 Projection with normalized basis vectors

It would be nice to get rid of the matrix inversion step in projecting a vector onto a subspace, for many reasons, not least because the matrix inverse can be badly conditioned.

With normalized vectors:

         [ 0.316   0     ]
    Û =  [ 0.632   0     ]
         [ 0.632   0.707 ]
         [ 0.316   0.707 ]

Starting with the basis coefficients,

    [ ⟨f, û1⟩ ]           [ 0.316  0.632  0.632  0.316 ]                  [ 2.214 ]
    [          ] = Û^T f = [                            ] [4; 0; 2; -1] =  [ 0.707 ]
    [ ⟨f, û2⟩ ]           [ 0      0      0.707  0.707 ]

    [ α1 ]                     [ 1      0.671 ]^{-1} [ 2.214 ]   [  3.162 ]
    [    ] = (Û^T Û)^{-1} Û^T f = [              ]      [       ] = [        ]
    [ α2 ]                     [ 0.671  1     ]      [ 0.707 ]   [ -1.414 ]

    P_U f = 3.162 û1 - 1.414 û2 = [1.0; 2.0; 1.0; 0.0]

Of course, the projection matrix does not change by rescaling vectors:

                              [  0.182   0.364   0.091  -0.091 ]
    M = Û (Û^T Û)^{-1} Û^T =  [  0.364   0.727   0.182  -0.182 ]
                              [  0.091   0.182   0.545   0.455 ]
                              [ -0.091  -0.182   0.455   0.545 ]

7 Gram-Schmidt ortho-normalization

Since

            [ ⟨u1, u1⟩  ⟨u2, u1⟩  ... ]
    U^T U = [ ⟨u1, u2⟩  ⟨u2, u2⟩  ... ]                                 (91)
            [    ...       ...    ... ]

computing a projection without a matrix inverse requires that

    ⟨u1, u1⟩ = ⟨u2, u2⟩ = ... = ⟨ur, ur⟩ = 1
    ⟨ui, uj⟩ = 0 ,   i ≠ j

In other words, the basis vectors must be normal and orthogonal. If U^T U = I, then

    M = U (U^T U)^{-1} U^T = U I^{-1} U^T = U U^T                       (92)

Orthonormal: Basis vectors which are both orthogonal and normal are called orthonormal.
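The point of Eqns (91)-(92) is that normalization alone is not enough; the basis must also be orthogonal. A Python/NumPy sketch using the u1, u2 of example 2.10 (np.linalg.qr is used here as one convenient way to obtain an orthonormal basis; it is not the notes' method, which is developed below):

```python
import numpy as np

U = np.array([[1., 0.],
              [2., 0.],
              [2., 1.],
              [1., 1.]])

# normalizing the columns alone does not make U'U the identity,
# because u1 and u2 are not orthogonal
Uhat = U / np.linalg.norm(U, axis=0)

# an orthonormal basis for the same span
Q, _ = np.linalg.qr(U)

M_exact = U @ np.linalg.solve(U.T @ U, U.T)   # Eqn (87)
M_qq = Q @ Q.T                                 # Eqn (92): M = U U'
```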

7.1 Process of Gram-Schmidt Ortho-normalization

Gram-Schmidt ortho-normalization is a process that starts with any span of a vector subspace and produces an orthonormal basis, V.

In broad terms, the process to make orthogonal basis vectors works this way:

1. Start with zero basis vectors.
2. Choose any vector u in the subspace, choosing, for example, a vector from any spanning set.
3. Subtract from u the projection of u onto the existing basis vectors.
4. If a sufficient vector remains after the subtraction, normalize the remaining vector and add it to the set of basis vectors.
5. Repeat steps 2-4 until the basis set is complete. The basis set is complete when r = n, or when all of the spanning vectors have been used.

7.1.1 Example Gram-Schmidt Ortho-normalization

Starting with the spanning set, let subspace S be spanned by vectors A = [u1, u2, u3]:

    S = span{ [1; 2; 2; 1] , [0; 0; 1; 1] , [1; 2; 3; 2] }

Find a set of ortho-normal basis vectors for S.

1. Start with zero basis vectors, V = {}.

2. Take any vector from the subspace:

       u1 = [1; 2; 2; 1]

3. Subtract off the projection onto the existing basis vectors:

       u1' = u1 - 0 = [1; 2; 2; 1]

4. If a sufficient vector remains after the subtraction (if ||u1'|| > tol), normalize the remaining bit and add it to the set of basis vectors:

       v1 = u1' / ||u1'|| = [0.316; 0.632; 0.632; 0.316]

       V = {v1} = { [0.316; 0.632; 0.632; 0.316] }

5. Repeat.

2. Take any vector from the subspace:

       u2 = [0; 0; 1; 1] ,   with   V = { [0.316; 0.632; 0.632; 0.316] }

3. Subtract off the projection onto the existing basis vectors:

                                        [ 0.1  0.2  0.2  0.1 ]        [ -0.3 ]
       u2' = u2 - V V^T u2  =  u2  -    [ 0.2  0.4  0.4  0.2 ] u2  =  [ -0.6 ]     (93)
                                        [ 0.2  0.4  0.4  0.2 ]        [  0.4 ]
                                        [ 0.1  0.2  0.2  0.1 ]        [  0.7 ]

   - In general, vector u2' is not parallel to u2. If ||u2'|| > tol, u2' can not be perpendicular to u2 either.
   - Note that u2' is the projection of u2 onto the orthogonal complement of V.

4. If a sufficient vector remains after the subtraction (if ||u2'|| > tol), normalize the remaining bit and add it to the set of basis vectors:

       v2 = u2' / ||u2'|| = [-0.286; -0.572; 0.381; 0.667]

       V = {v1, v2} = { [0.316; 0.632; 0.632; 0.316] , [-0.286; -0.572; 0.381; 0.667] }

5. Repeat.

2. Take any vector from the subspace:

       u3 = [1; 2; 3; 2]

3. Subtract off the projection onto the existing basis vectors:

                                        [  0.182   0.364   0.091  -0.091 ]
       u3' = u3 - V V^T u3  =  u3  -    [  0.364   0.727   0.182  -0.182 ] u3  =  10^{-15} [ 0.444; 0.888; 0.444; 0.444 ]
                                        [  0.091   0.182   0.545   0.455 ]
                                        [ -0.091  -0.182   0.455   0.545 ]

4. If a sufficient vector remains, normalize the remaining bit and add it to the set of basis vectors. Here ||u3'|| < tol: u3 lies in span(v1, v2), so no vector is added to V.

5. All out of vectors in the original spanning set. Done.
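The steps above can be sketched as a function. A Python/NumPy sketch (gram_schmidt is a hypothetical helper, not from the notes; its default tolerance follows the scaling of Eqn (94) in section 7.1.2, and an explicit tol is passed in the usage below to keep the example deterministic):

```python
import numpy as np

def gram_schmidt(A, tol=None):
    # steps 1-5 above: start with no basis vectors, subtract the
    # projection onto the basis so far, keep what remains if it is
    # larger than the tolerance
    A = np.asarray(A, dtype=float)
    n, p = A.shape
    if tol is None:
        # Eqn (94)-style default: max(n,p) * max_i ||u_i|| * eps
        tol = max(n, p) * np.linalg.norm(A, axis=0).max() * np.finfo(float).eps
    V = np.zeros((n, 0))
    for j in range(p):
        u = A[:, j] - V @ (V.T @ A[:, j])   # projection onto complement of V
        nu = np.linalg.norm(u)
        if nu > tol:
            V = np.column_stack([V, u / nu])
    return V

A = np.array([[1., 0., 1.],
              [2., 0., 2.],
              [2., 1., 3.],
              [1., 1., 2.]])
V = gram_schmidt(A, tol=1e-9)   # u3 = u1 + u2 contributes no new direction
```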

7.1.2 Tolerance value for the Gram-Schmidt algorithm

In step 4 of the Gram-Schmidt algorithm, a vector ui' is accepted to be a basis vector if sufficient magnitude remains after projection onto the orthogonal complement of V:

    if ||ui'|| > tol ,   include   ui' / ||ui'||

This begs the question: what should be the value of tol? One possible answer is a value that scales with the number and size of the vectors and the machine precision:

    tol = max(n, p) * max_i ||ui|| * eps                                (94)

where

- A = [u1 u2 ... up] in R^{n x p} is the initial set of p spanning vectors,
- max_i ||ui|| is the maximum of the norms of the columns ui in R^n,
- eps is the machine precision.

tol given by Eqn (94) reflects the fact that round-off errors in step 4 depend on the dimensions of A and the magnitude of the vectors that make up the calculation.

7.2 Projection matrix with GS Ortho-normalization

Using

    S = span{ [1; 2; 2; 1] , [0; 0; 1; 1] , [1; 2; 3; 2] }

and the orthonormal basis found above,

        [ 0.316  -0.286 ]
    V = [ 0.632  -0.572 ]
        [ 0.632   0.381 ]
        [ 0.316   0.667 ]

the projection matrix onto the subspace is given by:

                [  0.182   0.364   0.091  -0.091 ]
    M = V V^T = [  0.364   0.727   0.182  -0.182 ]                      (95)
                [  0.091   0.182   0.545   0.455 ]
                [ -0.091  -0.182   0.455   0.545 ]

And (compare with Eqn (88)), with f = [4; 0; 2; -1],

    P_U f = M f = [1.0; 2.0; 1.0; 0.0]                                  (96)

Importantly, nowhere in the whole process is a matrix inversion required!

7.3 Projection Coefficients

Note that when the set of basis vectors V is orthonormal, then:

    P_U f = a_1 v_1 + a_2 v_2 = V [ a_1 ; a_2 ] = V (V^T f)            (97)

and so the projection coefficients to project f onto V are given simply by:

    [ a_1 ; a_2 ] = V^T f                                              (98)

Compare with Eqn (81) (repeated here):

    [ a_1 ]   [ <u_1, u_1>  <u_2, u_1> ]^-1 [ <f, u_1> ]   [ <u_1, u_1>  <u_2, u_1> ]^-1
    [ a_2 ] = [ <u_1, u_2>  <u_2, u_2> ]    [ <f, u_2> ] = [ <u_1, u_2>  <u_2, u_2> ]    U^T f

Eqn (98) requires no matrix inverse operation.

We are accustomed to thinking inv(X'*X) can be computed in Eqn (81), and this
is true for reasonably conditioned matrices that aren't too big. But for
singular, poorly conditioned, or even just large matrices (50x50 or larger),
inv(X'*X) may not exist, or the computation may lead to large errors.

Gram-Schmidt is one of the most scalable and robust algorithms in linear
algebra.

7.4 Projection onto the orthogonal complement

At each iteration of the GS ortho-normalization, in step 3, we are subtracting
the projection onto the existing basis vectors from the candidate vector. This
is the same as taking the projection onto the orthogonal complement of the
existing basis vectors. For example:

    x~_2 = x_2 - V V^T x_2 = (I - V V^T) x_2

where V V^T x_2 is the projection of x_2 onto V.

Given a basis set V, the projection matrix to project onto the orthogonal
complement is given by:

    M_perp = I - V V^T ,    x~_2 = M_perp x_2                          (99)

Eqn (99) is not too surprising; it is saying:

    The portion of x_2 not lying in V is the total, minus the bit that
    is lying in V.

Eqn (99) can be quite handy.
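A small Python sketch of Eqn (99), using an assumed toy subspace span{[1,1,1]} in R^3: the in-subspace and complement parts recombine to the original vector, and the complement part is orthogonal to the basis.

```python
import math

# Orthonormal basis for a 1-D subspace of R^3 (a toy example).
v = [1 / math.sqrt(3)] * 3                       # spans span{[1, 1, 1]}

# M projects onto span{v}; M_perp = I - M projects onto the complement.
M = [[v[i] * v[j] for j in range(3)] for i in range(3)]
M_perp = [[(1.0 if i == j else 0.0) - M[i][j] for j in range(3)]
          for i in range(3)]

x = [1.0, 2.0, 3.0]
in_v = [sum(M[i][j] * x[j] for j in range(3)) for i in range(3)]
in_perp = [sum(M_perp[i][j] * x[j] for j in range(3)) for i in range(3)]
recombined = [a + b for a, b in zip(in_v, in_perp)]
```

Here in_v comes out as [2, 2, 2] and in_perp as [-1, 0, 1], which sum back to x.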


7.5 Projection and fitting parameters to experimental data

Think back to y = A b and modeling experimental data. When we write a model:

    y(k) = phi^T(k) b

and then have

    y^ = A b^ ,   with y, y^ in R^n , b^ in R^p

where p is the number of parameters, n is the number of data points, y is the
measured data, and y^ is estimated from the model, then

    b^ = (A^T A)^-1 A^T y ,    eps = y - y^                            (100)

The model set (also called the reachable set) is the set of outputs y^ given
by the model for any possible tuning of the parameters b^. For a linear model,
the model set forms a linear vector space:

  - The columns of A are the spanning set for the r-dimensional model set
    in the n-dimensional output space.
  - The parameters b^ are the projection coefficients of the data y onto the
    basis vectors of the model set (namely the columns of A).
  - The G-S algorithm applied to the columns of A gives a basis for the
    model set.

The projection theorem tells us that with b^ given by Eqn (100):

    y^ = A b^                                                          (101)

and eps = y - y^ then lies in the orthogonal complement of the model set.
Said another way, there is no signal power remaining in eps which can
possibly be modeled by y(k) = phi^T(k) b.

8 Additional Topics

8.1 The Four Fundamental Spaces of a Matrix

Consider matrix A and the operation y = A b. Two vector spaces associated with
multiplying a vector by matrix A are:

  1. Input space: b in R^p
  2. Output space: y in R^n

It turns out that each of these spaces is further divided in 2, so there are a
total of four fundamental spaces of a matrix. For an n x p matrix:

Input Space:

  1. Null Space: the set of vectors b in R^p such that A b = 0.
  2. Row Space: the orthogonal complement of the Null Space.
     The row space is spanned by the columns of A^T.
     A basis for the row space is given by V_row = GS(A^T).

Output Space:

  3. Column space: the set of y^ given by the equation y^ = A b , b in R^p.
     A basis for the column space is given by V_col = GS(A).
  4. Left null space: the orthogonal complement of the column space.
     The set of all y~ such that y~ is orthogonal to A b for all b in R^p.
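The fitting procedure of Eqn (100) can be sketched in Python on assumed toy data (three points, a line model y = b0 + b1 t): the projection theorem predicts a residual orthogonal to every column of A.

```python
# Fit y(k) = b0 + b1 * t(k) to three data points (toy data).
t = [0.0, 1.0, 2.0]
y = [1.0, 2.0, 2.0]
A = [[1.0, tk] for tk in t]            # columns span the 2-D model set

# Normal equations (A^T A) b = A^T y, solved here with a 2x2 inverse.
ata = [[sum(A[k][i] * A[k][j] for k in range(3)) for j in range(2)]
       for i in range(2)]
aty = [sum(A[k][i] * y[k] for k in range(3)) for i in range(2)]
det = ata[0][0] * ata[1][1] - ata[0][1] * ata[1][0]
b = [(ata[1][1] * aty[0] - ata[0][1] * aty[1]) / det,
     (-ata[1][0] * aty[0] + ata[0][0] * aty[1]) / det]

y_hat = [b[0] + b[1] * tk for tk in t]
eps = [yk - yh for yk, yh in zip(y, y_hat)]

# Projection theorem: the residual is orthogonal to each column of A.
r0 = sum(eps[k] * A[k][0] for k in range(3))
r1 = sum(eps[k] * A[k][1] for k in range(3))
```

Both inner products r0 and r1 come out zero: no remaining signal power in eps can be modeled.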

[Figure 6: Pictorial representation of the four fundamental spaces of a
matrix. The Row space and Null space lie in the input space R^p; the Column
space and Left-null space lie in the output space R^n. An input decomposes as
b = b_r + b_n, with y^ = A b_r and 0 = A b_n. (Adapted from Strang.)]

The equation illustrated in figure 6 is:

    y = A b = A b_r + A b_n = y^ + 0

Consider for the Row space:

  - Inputs comprise a part from the Row space and a part from the Null space.
  - All vectors which give a non-zero output contain a component from the
    row space.
  - Outputs lie in the Column space.
  - The Left-Null space is unreachable by y = A b.
  - Any component from the null space adds to the length ||b||_2 , but
    contributes nothing to the output.
  - So the minimum b vector giving y^ lies entirely in the row space.

8.1.1 Numerical Examples of the four fundamental spaces

        [ 1  0  1 ]
    A = [ 2  0  2 ]
        [ 2  1  3 ]
        [ 1  1  2 ]

Null Space: the set of input vectors that give no output.

    [ 0 ]       [  1 ]            [  1 ]
    [ 0 ] = A   [  1 ]  ,   so    [  1 ]  is in null(A)                (102)
    [ 0 ]       [ -1 ]            [ -1 ]
    [ 0 ]

Row Space: vectors with a component from the row space give a non-zero output.

    [ 3 ]       [ 1 ]         [ 1 ]
    [ 6 ] = A   [ 1 ]  ,      [ 1 ]  contains a component from row(A)  (103)
    [ 9 ]       [ 2 ]         [ 2 ]
    [ 6 ]
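Eqns (102) and (103) are easy to confirm numerically; a small Python check of the two products:

```python
# The example matrix A from the text.
A = [[1, 0, 1],
     [2, 0, 2],
     [2, 1, 3],
     [1, 1, 2]]

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

out_null = matvec(A, [1, 1, -1])   # Eqn (102): a null-space vector
out_row = matvec(A, [1, 1, 2])     # Eqn (103): a row-space vector
```

out_null is the zero vector, while out_row is [3, 6, 9, 6].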

Row and Null Space example

          [ 2 ]               [ 1 ]   [  1 ]
    b_1 = [ 2 ] = b_r + b_n = [ 1 ] + [  1 ]  ,
          [ 1 ]               [ 2 ]   [ -1 ]

    [ 3 ]       [ 2 ]       [ 1 ]
    [ 6 ] = A   [ 2 ] = A   [ 1 ]
    [ 9 ]       [ 1 ]       [ 2 ]
    [ 6 ]

Vector b_1 has a component from the row space and a component from the null
space:

  - The contribution b_1r = [1, 1, 2]^T determines the output.
  - The contribution b_1n = [1, 1, -1]^T makes no contribution to the output.

If b has the minimum ||b||_2 such that A b = y^ , then b lies in row(A).

Column Space example

    [ 3 ]       [ 1 ]        [ 3 ]
    [ 6 ] = A   [ 1 ]  ,     [ 6 ]  is in col(A)                       (104)
    [ 9 ]       [ 2 ]        [ 9 ]
    [ 6 ]                    [ 6 ]

The Column Space is the set of output vectors possible from A:

    col(A) = { y^ : y^ = A b }

The columns of A span the column space, and the dimension of the column space
is equal to the rank of A:

    dim col(A) = rank(A)

Left-Null Space

The Left-Null Space is the set of vectors in the output space that can not be
reached by y^ = A b for any value of b. It is the orthogonal complement of
col(A). If A b = y^ != 0 , and b has the minimum ||b||_2 such that A b = y^ ,
then b lies in row(A).
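The decomposition b_1 = b_r + b_n can be computed by projecting b_1 onto the null-space direction; a Python sketch using the null(A) = span{[1, 1, -1]} found above:

```python
import math

b1 = [2.0, 2.0, 1.0]

# Unit vector spanning null(A) for the example matrix.
n = [1 / math.sqrt(3), 1 / math.sqrt(3), -1 / math.sqrt(3)]

c = sum(bi * ni for bi, ni in zip(b1, n))        # projection coefficient
b_n = [c * ni for ni in n]                       # null-space part
b_r = [bi - bni for bi, bni in zip(b1, b_n)]     # row-space part
```

This recovers b_r = [1, 1, 2] and b_n = [1, 1, -1] as in the example.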

Left-Null Space example: the Left-Null Space is the orthogonal complement of
the column space. For example,

           [  1 ]
    y_ln = [ -1 ]  is in lnull(A)                                      (105)
           [  1 ]
           [ -1 ]

Since [1, -1, 1, -1]^T lies in the Left-Null space of A, the choice for b
which minimizes

    ||eps|| = || y_ln - A b ||

is

    b = [ 0, 0, 0 ]^T ,   giving   y^ = A b = 0                        (106)

What Eqns (105) and (106) are saying is that there is no choice for b which
gets any closer to y_ln than b = 0.

8.1.2 Computing bases for the four fundamental spaces

Note: Recall that the GS algorithm takes any set of vectors, and returns an
ortho-normal basis on the space spanned by the vectors.

Row Space:

The Row Space is spanned by the rows of A, therefore an ortho-normal basis on
the row space is given by:

    R = GS(A^T)                                                        (107)

where R is a set of basis vectors spanning the Row Space of A, and GS(A^T)
indicates applying the Gram-Schmidt algorithm to A^T. Since the Row Space is
spanned by the rows of A, every vector b in the row space is given by:

    b = A^T y

with a suitable choice of y.

Column Space:

The Column Space is spanned by the columns of A:

    C = GS(A)                                                          (108)

where C is a set of basis vectors spanning the column space of A. Since the
Column Space is spanned by the columns of A, every vector y^ in the column
space is given by:

    y^ = A b

with a suitable choice of b.
Page 78

Null Space:

Note that

    M_r = R R^T                                                        (109)

is a projection matrix, projecting any vector b onto the Row Space. Since the
null space is the orthogonal complement of the row space, the projection
matrix onto the null space is given by:

    M_n = I - M_r = I - R R^T                                          (110)

Since the columns of any projection matrix span the space onto which the
matrix projects, a basis set for the null space is given by:

    null(A) = GS(M_n) = GS(I - R R^T)                                  (111)

Left-Null Space:

The Left-Null Space is the orthogonal complement of the Column Space. The
projection matrix onto the column space is given as

    M_c = C C^T                                                        (112)

and so the projection matrix onto the left-null space is given as:

    M_ln = I - M_c                                                     (113)

And finally, a set of basis vectors for the left-null space is given by:

    lnull(A) = GS(M_ln) = GS(I - C C^T)                                (114)

8.1.3 Bases for the Four Fundamental Spaces, Numerical Example

Using the GS algorithm, we can determine bases for the four fundamental
spaces:

>> A = [ 1 0 1; 2 0 2; 2 1 3; 1 1 2]
A =
     1     0     1
     2     0     2
     2     1     3
     1     1     2

n = 4, p = 3, r = 2

>> Col = GramSchmidt(A)
Col =
    0.3162   -0.2860
    0.6325   -0.5721
    0.6325    0.3814
    0.3162    0.6674

dim Col = dim Row = rank A = 2

>> Row = GramSchmidt(A')
Row =
    0.7071   -0.4082
         0    0.8165
    0.7071    0.4082

>> Null = GramSchmidt( eye(3) - Row*Row' )
Null =
    0.5774
    0.5774
   -0.5774

dim Null = p - dim Row = 1

>> LNull = GramSchmidt( eye(4) - Col*Col' )
LNull =
    0.9045   -0.0000
   -0.4020    0.3333
   -0.1005   -0.6667
    0.1005    0.6667

dim LNull = n - dim Col = 2
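The same computation can be sketched in Python (a stand-in for the course's GramSchmidt MATLAB function): bases for all four spaces come from GS on the columns, GS on the rows, and GS on the complement projectors of Eqns (110) and (113).

```python
import math

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def gs(cols, tol=1e-8):
    # Ortho-normalize a list of vectors, discarding dependent ones.
    basis = []
    for u in cols:
        r = list(u)
        for v in basis:
            c = dot(v, r)
            r = [ri - c * vi for ri, vi in zip(r, v)]
        m = math.sqrt(dot(r, r))
        if m > tol:
            basis.append([ri / m for ri in r])
    return basis

A = [[1, 0, 1], [2, 0, 2], [2, 1, 3], [1, 1, 2]]
n, p = 4, 3
cols = [[A[i][j] for i in range(n)] for j in range(p)]   # columns of A
rows = [list(r) for r in A]                              # columns of A^T

Col = gs(cols)                       # basis for col(A)
Row = gs(rows)                       # basis for row(A)

def complement_basis(basis, dim):
    # Columns of I - V V^T span the orthogonal complement (Eqns 110, 113).
    M = [[(1.0 if i == j else 0.0) - sum(v[i] * v[j] for v in basis)
          for j in range(dim)] for i in range(dim)]
    return gs([[M[i][j] for i in range(dim)] for j in range(dim)])

Null = complement_basis(Row, p)      # null(A) = GS(I - R R^T)
LNull = complement_basis(Col, n)     # lnull(A) = GS(I - C C^T)
```

The dimensions match the MATLAB session: 2, 2, 1 and 2 respectively.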

8.1.4 The Four Fundamental Spaces of a Matrix, revisited

[Figure 7: Pictorial representation of the four fundamental spaces of a
matrix (repeated).]

Looking back to figure 6:

  - The row space and null space are orthogonal complements. The input
    space, R^p for an n x p matrix, is the direct sum of the row and null
    spaces:

        p = dim row(A) + dim null(A)                                   (115)

  - The column space and left-null space are orthogonal complements. The
    output space, R^n for an n x p matrix, is the direct sum of the column
    and left-null spaces:

        n = dim col(A) + dim lnull(A)                                  (116)

  - Additionally, the dimensions of the row and column spaces must be equal,
    and are equal to the rank of A:

        dim col(A) = dim row(A) = rank(A)                              (117)

8.1.5 Questions that can be answered with the four fundamental spaces

Given y = A b :

  - What is the set of all possible b^ which give a specific y^ ?
  - When y is specified, is there an exact solution for b ?
  - If there is no exact solution, so eps = y - A b^ != 0 :
      What is the set of all possible eps ?
      What is the dimension and a basis for this set ?
  - Is there any non-zero value b~ such that A b~ = 0 ?
      What is the set of all possible b~ ?
      What is the dimension and a basis for this set ?
  - Given y, what is the smallest || y - A b^ || ?

8.1.6 Two ways to determine the four fundamental spaces

1. With Gram-Schmidt ortho-normalization:

   Ortho-normalize the columns of A to get the column space; the left-null
   space is the orthogonal complement.

   Ortho-normalize the rows of A to get the row space; the null space is the
   orthogonal complement.

2. With the singular value decomposition (SVD):

   Gives additional insight.

8.2 Rank and degeneracy

Rank: In matrix theory the rank of a matrix is defined as the size of the
largest sub-array that gives a non-zero determinant.

  - This is satisfactory as a formal, mathematical definition.
  - But the determinant is an unsatisfactory numerical calculation, because
    it is numerically very sensitive, and can't handle non-square matrices.

Using the Gram-Schmidt algorithm, we can find a set of basis vectors for the
column space (or row space) of A, to determine the rank of A.

In Matlab, rank is obtained by the singular value decomposition, and counting
the number of singular values larger than a tolerance value. The help message
of rank() is instructive:

>> help rank
 RANK   Matrix rank.
    RANK(A) provides an estimate of the number of linearly
    independent rows or columns of a matrix A.
    RANK(A,tol) is the number of singular values of A
    that are larger than tol.
    RANK(A) uses the default tol = max(size(A)) * norm(A) * eps.
(emphasis added)

Range: The range of any function is the set of all possible outputs of that
function. The range space of matrix A is another name for the column space
of A. It is the range of the function y = A b ; it is the vector space
spanned by the columns of A.
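Determining rank with Gram-Schmidt amounts to counting the basis vectors that survive the tolerance test; a Python sketch of that idea:

```python
import math

def rank_by_gs(A, tol=1e-8):
    # Count the ortho-normal basis vectors that survive Gram-Schmidt
    # on the columns of A; that count is the rank.
    n, p = len(A), len(A[0])
    basis = []
    for j in range(p):
        r = [A[i][j] for i in range(n)]
        for v in basis:
            c = sum(vi * ri for vi, ri in zip(v, r))
            r = [ri - c * vi for ri, vi in zip(r, v)]
        m = math.sqrt(sum(ri * ri for ri in r))
        if m > tol:
            basis.append([ri / m for ri in r])
    return len(basis)

A = [[1, 0, 1], [2, 0, 2], [2, 1, 3], [1, 1, 2]]
```

For the running example, rank_by_gs(A) returns 2, matching Matlab's rank(A). (Matlab's rank() uses singular values instead, which is more robust for nearly dependent columns.)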

The nullity of a matrix is the dimension of its null space, denoted by q(A).
With A in R^(n x p) :

    q(A) = p - r(A)                                                    (118)

where r(A) is the rank of A.

Degeneracy

  - If rank(A) = min(n, p) we say the matrix is full rank. (It has the
    greatest possible rank.)
  - If rank(A) < min(n, p) we say:
      The matrix is rank deficient.
      The matrix has lost rank (if something happened that made it rank
      deficient, such as a robot has reached a singular pose).
      The matrix is degenerate.

THEOREM: The rank of a matrix product.

Given A in R^(n x m) and B in R^(m x p), and forming C = A B in R^(n x p),
the following properties hold:

    rank(C) + q(C) = p                                                 (119)

    rank(C) <= min( rank(A), rank(B) )                                 (120)

    q(C) <= q(A) + q(B)                                                (121)

The rank and the dimension of the Null Space of C are determined by how the
column space of B falls on the row space of A:

    rank(C) = dim intersection( col(B), row(A) )

(Student thought problem.)

9 Summary and Review

Part 2 offers many definitions and concepts. However, as is often the case
with mathematical domains, there are only a few essential ideas:

  - A vector space is a set of vectors, and in general will not include all
    vectors of the universe.
  - Simple operations, such as y = A b , lead naturally to vector spaces,
    and our understanding of the solution can be in terms of vector spaces.
  - A vector space is defined by a span or basis.
  - The inner product is a measure of the degree of overlap (parallelism) of
    two vectors.
  - The norm is a measure of the length of a vector.
  - Vectors and a vector space can be parallel, orthogonal, or somewhere in
    between (include a component of each).
  - The projection operation determines the components of a vector lying in
    or orthogonal to a vector space.
  - Gram-Schmidt ortho-normalization produces a basis that is handy for
    determining projections.

Naturally, there are a variety of details fleshing out each of these
essential ideas.
Page 86

9.1 Important properties of Inner Products

An inner product is an operation on two vectors producing a scalar result.
The following properties hold for inner products:

  1. Commutativity: <x, y> = conj(<y, x>) (conj indicates the complex
     conjugate).
  2. Distributivity (linearity): <x, y_1 + y_2> = <x, y_1> + <x, y_2>
  3. Induces a norm: <x, x> >= 0 for all x, and <x, x> = 0 iff x = 0.

It follows that:

  1. Scalar multiplication, right term: <x, a y> = a <x, y>
  2. Scalar multiplication, left term: <a x, y> = conj(a) <x, y>
  3. Additionally: <x, y> may be zero, positive or negative.
  4. Cauchy-Schwarz Inequality: |<x, y>| <= ||x||_2 ||y||_2
     Two vectors can not be more parallel than fully parallel.

9.2 Important properties of Norms

A norm is a measure (in a general sense) of the size of something. A norm is
a function of a single vector that produces a real scalar result:

    ||x|| in R

A norm must have the following properties:

  1. Positive definiteness: ||x|| >= 0 , and ||x|| = 0 if and only if x = 0.
  2. Scalar multiplication: ||a x|| = |a| ||x||
  3. Triangle Inequality: ||x + y|| <= ||x|| + ||y||
     The length of the sum of two vectors can not be greater than the sum of
     the individual lengths of the vectors.

Technically, a vector space can be a vector space without having any norm
defined. To be a vector space requires only a set of elements and the 8
properties described in section 2.2. But for the familiar vector spaces of
R^n we have seen several norms: ||x||_1 , ||x||_2 , ||x||_inf , etc. A
vector space with a norm defined is specifically called a normed vector
space.

9.3 Study tip: Learn the definitions!

Terms appear in italic bold where they are introduced and defined. Working
together and talking about the subject matter will help toward this goal.
Flash cards and drill may also be useful.

Vocabulary:

  - Vector
  - Euclidean vector
  - Inner product (dot product)
  - Outer product (matrix product)
  - Cross product (vector product)
  - Orthogonality
  - Norm, Induced norm
  - Parallel, Co-linear, Orthogonal, Direction cosine
  - Definition of a Vector space
  - Scalar, Vector, Closure, Linear combination of vectors
  - Additive identity, Multiplicative identity
  - Span, Spanning set
  - Basis vectors
  - Dimension of a vector space
  - Standard basis
  - Vector universe
  - Embedded vector subspace
  - Proper vector subspace
  - Hyperplane, Hypersurface
  - Orthogonality of subspaces
  - Representation of a vector
  - Transformation from one representation to another
  - Projection
  - Projection operator, Projection matrix, Projection coefficients
  - Non-orthogonal projection
  - Normalization of a vector, Normalized vector
  - Ortho-normal vectors
  - Gram-Schmidt orthogonalization
  - Model set, reachable set
  - Four fundamental spaces
  - Row, Column, Null, Left-null Spaces
  - Range, Range space
  - Rank, degeneracy
  - Full rank, Rank deficient, degenerate

EE/ME 701: Advanced Linear Systems

Linear Operators on Vector Spaces

Contents

1 Linear Operator . . . 3

2 Rotation and Reflection Matrices . . . 4
  2.1 Example Rotation and Reflection Matrices . . . 5
  2.2 Three theorems and a corollary defining rotation matrices . . . 8
  2.3 Summary of mathematical properties of rotation matrices . . . 14
  2.4 Multi-axis rotations comprise rotations about each axis . . . 15
    2.4.1 Rotation matrix in terms of the from-frame axes expressed in
          to-frame coordinates . . . 18
    2.4.2 Example: Photogrammetry, measurement from images . . . 20

3 Linear Operators in Different Bases, or A Change of Basis
  (Bay section 3.1.3) . . . 21
  3.1 Linear Operators and a change of Bases . . . 22

4 Change of Basis as a Tool for Analysis . . . 25
  4.1 Example: non-orthogonal projection onto a plane . . . 28
  4.2 Looking at the Fourier Transform as a change of basis . . . 37
    4.2.1 The Fourier transform as a change of basis . . . 42
    4.2.2 Using the Fourier transform . . . 43
  4.3 Additional examples using change of basis . . . 44
    4.3.1 Matching input and output data to discover an operator . . . 44
    4.3.2 Operator from data, example . . . 46
  4.4 Conclusions: change of basis as a tool for analysis . . . 47

5 Operators as Spaces (Bay section 3.2) . . . 48
  5.1 Operator Norms . . . 49
    5.1.1 Operator norm properties . . . 49
  5.2 Determining the value of Operator norms . . . 50
    5.2.1 The L1 norm of an operator . . . 50
    5.2.2 The L2-norm of an operator . . . 52
    5.2.3 The L-infinity norm of an operator . . . 53
    5.2.4 The Frobenius norm . . . 54
  5.3 Boundedness of an operator . . . 55
  5.4 Operator Norms, conclusions . . . 55
  5.5 Adjoint Operators . . . 56

6 Bay Section 3.3 . . . 57

7 Forming the intersection of two vector spaces . . . 58
  7.1 Example . . . 59

8 Conclusion . . . 60

Part 3: Linear Operators (Revised: Sep 10, 2012)

1 Linear Operator

An operator is a generalization of the notion of a function. Operators are
functions of numerical arguments, and also functions of functions.
See: http://mathworld.wolfram.com/Operator.html

We'll be focusing on functions of numerical arguments, so an operator is
essentially a synonym for a function.

Linear Operator: An operator A from vector space X to vector space Y, denoted
A : X -> Y, is linear if it verifies superposition:

    A(a_1 x_1 + a_2 x_2) = a_1 A x_1 + a_2 A x_2                       (1)

For vectors x in R^m and y in R^n, Eqn (1) corresponds to ordinary
matrix-vector multiplication with A in R^(n x m).

Bay discusses linear operators using examples 3.1 - 3.13.

Example: The projection operator is a linear operator (mapping a vector from
R^n to a subspace of dimension <= n).

2 Rotation and Reflection Matrices

Rotation and Reflection matrices are good examples of linear operators. They
are used extensively and will play an important role in the Singular Value
Decomposition.

Rotation Matrix: A rotation matrix R is an n x n matrix with the properties:

  1. The length of vectors is preserved:

        ||R x|| = ||x||   for all x in R^n, R in R^(n x n) a rotation
        matrix                                                         (2)

  2. Angles between vectors are preserved:

        <x_1, x_2> / ( ||x_1|| ||x_2|| )
            = <R x_1, R x_2> / ( ||R x_1|| ||R x_2|| ) ,
        for all x_1, x_2 in R^n                                        (3)

     (recall the direction cosine). Since the denominators are the same,
     Eqn (3) implies

        <x_1, x_2> = <R x_1, R x_2>                                    (4)

  3. Handedness: Rotation matrices preserve handedness; equivalently, for a
     rotation matrix R,

        det(R) = +1

Reflection Matrix:

  1. Reflection matrices preserve length.
  2. Reflection matrices preserve angles.
  3. Handedness: Reflection matrices reverse handedness; for a reflection
     matrix Q,

        det(Q) = -1
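The three defining properties are quick to check numerically; a Python sketch for a 2-D rotation and a simple reflection (an assumed axis-flip):

```python
import math

theta = math.radians(40)
R = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]   # rotation by 40 deg
Q = [[-1, 0], [0, 1]]                        # reflection (X inverted)

def apply(M, x):
    return [M[0][0] * x[0] + M[0][1] * x[1],
            M[1][0] * x[0] + M[1][1] * x[1]]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

x = [3.0, 4.0]
Rx = apply(R, x)
len_x = math.hypot(*x)        # length before
len_Rx = math.hypot(*Rx)      # length after: unchanged
```

The length is preserved (both are 5.0), det(R) = +1 and det(Q) = -1.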

2.1 Example Rotation and Reflection Matrices

2-D Rotation Matrix (determinant will equal +1):

    R = [ cos(theta)  -sin(theta) ]                                    (5)
        [ sin(theta)   cos(theta) ]

2-D Rotation Matrix with Reflection (determinant will equal -1):

    Q = [ 1   0 ] [ cos(theta)  -sin(theta) ]
        [ 0  -1 ] [ sin(theta)   cos(theta) ]

      = [  cos(theta)  -sin(theta) ]                                   (6)
        [ -sin(theta)  -cos(theta) ]

Setup rotation matrix R1 with theta = -40 deg, det(R1) = +1:

%% Setup a -40 deg rotation matrix
>> theta = -40;
>> Ct = cosd(theta); St = sind(theta);
>> Rot1 = [ Ct, -St; St, Ct ]
Rot1 =
    0.7660    0.6428
   -0.6428    0.7660
>> det(Rot1)
ans =
    1.0000

Setup rotation matrix R2 with theta = +40 deg, det(R2) = +1:

%% Setup a +40 deg rotation matrix
>> theta2 = 40;
>> Ct2 = cosd(theta2); St2 = sind(theta2);
>> Rot2 = [ Ct2, -St2; St2, Ct2 ]
Rot2 =
    0.7660   -0.6428
    0.6428    0.7660
>> det(Rot2)
ans =
    1.0000

Add a reflection to R2, det(Q) = -1:

%% A reflection operator can be an identity matrix
%% with an odd number of -1 elements.
>> Rot2PlusReflection = [ -1 0; 0 1 ] * Rot2
Rot2PlusReflection =
   -0.7660    0.6428
    0.6428    0.7660
>> det(Rot2PlusReflection)
ans =
   -1.0000

Process points with the rotation and reflection matrices. The rotation and
reflection matrices are linear operators; they map points in R^2 onto points
in R^2.

>> P1 = [ 0.5000    0.5000    0.7000
          0.7500    0.2500    0.2500 ]
>> P2 = Rot1 * P1
P2 =
    0.8651    0.5437    0.6969
    0.2531   -0.1299   -0.2584
>> P3 = Rot2 * P1
P3 =
   -0.0991    0.2223    0.3755
    0.8959    0.5129    0.6415
>> P3b = Rot2PlusReflection * P1
P3b =
    0.0991   -0.2223   -0.3755
    0.8959    0.5129    0.6415

[Figure 1: Example of rotation and reflection. The original points are shown
rotated by 40 deg, and also rotated by 40 deg and reflected over the Y axis
(X values inverted).]

2.2 Three theorems and a corollary defining rotation matrices

THEOREM 3.1: A rotation matrix must have the property that R^T = R^-1.

Proof: Because lengths must be preserved, the angles condition can be
rewritten:

    <x_1, x_2> = x_1^T x_2 = x_1^T R^T R x_2 = <R x_1, R x_2>          (7)

For Eqn (7) to be true for all x_1, x_2 in R^n, R^T R must be the identity
matrix, ergo R^T = R^-1.   QED

Example: Rotation matrices are linear operators. For a rotation of theta
[degrees] in R^2, the rotation matrix is:

    R_theta = [ C_theta  -S_theta ]                                    (8)
              [ S_theta   C_theta ]

where C_theta = cos(theta) and S_theta = sin(theta). As an example, consider
theta = 20 deg; then

    R = [ 0.94  -0.34 ]
        [ 0.34   0.94 ]

Student Exercise: Verify that R_theta given by Eqn (8) is an ortho-normal
matrix for any value of theta.
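As a numerical companion to the student exercise (not a substitute for the algebraic verification), a Python sketch that forms R_theta^T R_theta for a sample of angles and measures its worst deviation from the identity:

```python
import math

def rtr(theta_deg):
    # R_theta^T R_theta for the 2-D rotation matrix of Eqn (8).
    c = math.cos(math.radians(theta_deg))
    s = math.sin(math.radians(theta_deg))
    R = [[c, -s], [s, c]]
    return [[sum(R[k][i] * R[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

worst = 0.0
for th in [0, 20, 45, 90, 137, 260]:
    P = rtr(th)
    worst = max(worst,
                max(abs(P[i][j] - (1 if i == j else 0))
                    for i in range(2) for j in range(2)))
```

For every angle tested, R_theta^T R_theta is the identity to machine precision, consistent with C_theta^2 + S_theta^2 = 1.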

THEOREM 3.2: For any square ortho-normal matrix A:

  i)   The transpose of the square ortho-normal matrix is the matrix
       inverse, A^T = A^-1.
  ii)  Square ortho-normal matrices preserve length: ||A x|| = ||x|| for
       all x in R^n.
  iii) Square ortho-normal matrices preserve angles:
       <x_1, x_2> = <A x_1, A x_2>.
  iv)  The determinant of A is either det(A) = +1 or det(A) = -1.

Proof:

i) To show that A^T A = I, consider the matrix A = [ v_1 v_2 v_3 ]; then

    A^T A = [ <v_1, v_1>  <v_1, v_2>  <v_1, v_3> ]
            [ <v_2, v_1>  <v_2, v_2>  <v_2, v_3> ]                     (9)
            [ <v_3, v_1>  <v_3, v_2>  <v_3, v_3> ]

If the v_i are orthogonal, then

    A^T A = [ <v_1, v_1>      0           0      ]
            [     0       <v_2, v_2>      0      ]                     (10)
            [     0           0       <v_3, v_3> ]

If the v_i are also normal, then

    A^T A = [ 1  0  0 ]
            [ 0  1  0 ] = I                                            (11)
            [ 0  0  1 ]

and A^T = A^-1. The argument of this example extends directly to
A in R^(n x n).

ii) Let x_2 = A x_1; then

    ||x_2|| = sqrt(<x_2, x_2>) = sqrt(x_1^T A^T A x_1)
            = sqrt(x_1^T x_1) = ||x_1||

where A^T A is eliminated in the third step because A^T A = I.

iii) <A x_1, A x_2> = x_1^T A^T A x_2 = x_1^T x_2 = <x_1, x_2>.

iv) Two properties of the determinant are:

  1. For square matrices B, Q and R with B = Q R,
     det(B) = det(Q) det(R),
  2. For any square matrix B, det(B) = det(B^T).

Since A^T A = I, it follows that det(A^T) det(A) = det(I) = 1. But since
det(A^T) = det(A), it follows that det(A)^2 = 1; thus, given that A is a
square ortho-normal matrix, det(A) = +1 or -1.   QED

THEOREM 3.3: Any square ortho-normal matrix A is a rotation matrix if
det(A) = +1 and a reflection matrix if det(A) = -1, and any rotation or
reflection matrix is a square ortho-normal matrix.

Proof:

i) Any square ortho-normal matrix is either a rotation or reflection matrix.

Theorems 3.1 and 3.2 establish that a square ortho-normal matrix preserves
lengths and angles, therefore it is either a rotation or reflection matrix.

ii) Any rotation or reflection matrix R is a square ortho-normal matrix;
proof by contradiction.

Assume there is a rotation or reflection matrix R which is not a square
ortho-normal matrix. That would imply that either the columns of R are not
orthogonal, or that the columns of R are not normalized. Show that each leads
to a contradiction.

ii.a) If the columns are not orthogonal, show that R can not preserve angles.

First, looking back to Eqn (9), if the columns are not orthogonal, there must
be at least one pair (v_i, v_j), i != j, such that <v_i, v_j> = a_ij != 0.
Therefore R^T R != I.

Next we need to show that since R^T R != I, there exist x_1, x_2 such that

    <x_1, x_2> = x_1^T x_2 != x_1^T R^T R x_2 = <R x_1, R x_2>         (12)

Note: In a proof, it is not enough to simply assert Eqn (12). Even though
R^T R != I, how do we know there are allowed choices for x_1 and x_2 such
that x_1^T x_2 != x_1^T R^T R x_2 ? The last step of the proof gives a
prescription to construct such an x_1 and x_2.

Because R^T R != I, there exists at least one x_2 such that
R^T R x_2 = x_3 != x_2. Because x_3 != x_2, there must be at least one
element of x_3 which does not equal the corresponding element of x_2; call
this the kth element, and choose

    x_1 = [ 0 ... 0 1 0 ... 0 ]^T

with the 1 in the kth element. Now

    <x_1, x_2> = x_1^T x_2 != x_1^T x_3 = x_1^T R^T R x_2
               = <R x_1, R x_2>                                        (13)

This contradicts the hypothesis that R is a rotation or reflection matrix.

ii.b) If one or more columns are not normalized, show that R can not
preserve angles.

In this case there is no <v_i, v_j> = a_ij != 0 , i != j ; but there is
<v_i, v_i> = a_ii != 1. Select x_2 = v_i (and, say, x_1 = x_2, so that
x_1^T x_2 != 0); then R^T R x_2 = a_ii x_2, and

    <x_1, x_2> = x_1^T x_2 != a_ii x_1^T x_2 = x_1^T R^T R x_2
               = <R x_1, R x_2>

This contradicts the hypothesis that R is a rotation or reflection matrix.

Thus, it is shown that given R is a rotation or reflection matrix, the
assumption that R is not ortho-normal leads to a contradiction.   QED

Thus all rotation and reflection matrices are ortho-normal matrices.

THEOREM 3.4: Any two ortho-normal coordinate frames (sets of basis vectors)
A and B in R^n are related by a rotation matrix R and at most one reflection.

Proof: We can transform vectors represented in either coordinate frame to the
standard frame by

    s_x = A a_x ,    s_x = B b_x

and so the transformation from B coordinates to A coordinates is

    aT_b = A^-1 B

and so

    det( aT_b ) = ( 1 / det(A) ) det(B)

Note: with A, Q, R in R^(n x n):

    det(A^-1) = 1 / det(A) ,    det(Q R) = det(Q) det(R)

Since A and B are ortho-normal bases, det(A) = +/-1 and det(B) = +/-1, which
shows that det(aT_b) = +/-1, and so aT_b incorporates at most one
reflection. Even in R^n!

COROLLARY 3.4: The action of any two reflections in R^n is to restore the
original handedness.

2.3 Summary of mathematical properties of rotation matrices

  - Rotations and reflections are linear operators; they map from R^n to
    R^n by a matrix multiplication.
  - Matrices that preserve lengths and angles are either rotation or
    reflection matrices.
  - All rotation and reflection matrices are square ortho-normal matrices,
    giving R^T R = I ; thus R^T = R^-1.
  - If R is a rotation matrix, R^T is a rotation matrix.
  - If R_1 and R_2 are rotation matrices, R_3 = R_1 R_2 is also a rotation
    matrix.
  - All square ortho-normal matrices are either rotation or reflection
    matrices:
      det(R) = +1 : the matrix is a rotation matrix,
      det(R) = -1 : the matrix is a reflection matrix.
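The closure property (rotation times rotation is a rotation) and Corollary 3.4 (two reflections restore handedness) can be checked with the determinant; a short Python sketch in 2-D:

```python
import math

def rot(theta_deg):
    c = math.cos(math.radians(theta_deg))
    s = math.sin(math.radians(theta_deg))
    return [[c, -s], [s, c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

reflect = [[1, 0], [0, -1]]

R3 = matmul(rot(30), rot(40))      # rotation * rotation: a rotation
QQ = matmul(reflect, reflect)      # reflection * reflection: det = +1
```

R3 equals the single rotation by 70 deg, and both products have determinant +1.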

2.4 Multi-axis rotations comprise rotations about each axis.

- Bay gives the example of a rotation operator. In Robotics and elsewhere, the
  rotation operator is called simply a rotation matrix.

- Pitch, roll and yaw rotations are seen in figure 2. These are the rotations
  about the three axes, and can also be referred to as:

      Rotation about the X-axis, Rx (pitch)
      Rotation about the Z-axis, Rz (roll)
      Rotation about the Y-axis, Ry (yaw)

- The terms pitch, roll and yaw are assigned to different axes by different
  authors. Rx, Ry, Rz have the advantage of being unambiguous.

- An example multi-axis rotation is seen in figure 3.

- Multi-axis rotations preserve length and angles (so a Euclidean basis set,
  such as the X, Y, Z axes, remains orthogonal).

[Figure 2 panels: Original Coordinates; Pitch 30.00 [deg]; Roll 30.00 [deg];
Yaw 30.00 [deg]]

Figure 2: 3D rotations, illustrating individual rotations Rx (pitch), Rz (roll),
and Ry (yaw). Note right-hand rule for rotation direction.


2.4.1 Rotation matrix in terms of the from-frame axes expressed in to-frame
coordinates

- A rotation matrix provides a transformation from one coordinate frame to
  another; we can call these the "from" frame and the "to" frame.

- For example, the rotation from B to A in figure 4 (written here as A_B R,
  the rotation taking B-frame coordinates to A-frame coordinates) is given as

      A_B R = [ C -S ]  =  [ 0.866  -0.500 ]
              [ S  C ]     [ 0.500   0.866 ]

- Point Pa is given as

      bPa = [ 0.7 ]          aPa = A_B R bPa = [ 0.356 ]
            [ 0.5 ] ,                          [ 0.783 ]

- The rotation matrix from B to A is given by the axes of B expressed in A.
  Look at figure 4,

      A_B R = [  |      |   ]
              [ AX_B  AY_B  ]
              [  |      |   ]

- Look at the B axes in figure 4, expressed in the A coordinate frame.

[Figure 3 panels: Side View; Above view; Top View; Pitch: 30.00, Roll: 20.00,
Yaw: 25.00]

Figure 3: General 3D rotation, combining rotations about all three axes.

Figure 4: Rotation from the A frame to the B frame, A_B R.

3-D example

- The rotation from the B frame to the A frame in figure 3 is given by:

      A_B R = [ 0.85  0.49  0.17 ]
              [ 0.31  0.74  0.60 ]
              [ 0.42  0.45  0.78 ]

- In general, the rotation from a B coordinate frame to an A coordinate frame
  is given by:

      A_B R = [ AX_B  AY_B  AZ_B ]

- The X-axis of the B frame, expressed in A coordinates, is:

      AX_B = [ 0.85 ]
             [ 0.31 ]
             [ 0.42 ]

2.4.2 Example: Photogrammetry, measurement from images

- A typical application comes from photogrammetry, where it is often necessary
  to shift vectors from target to camera coordinates.

- To shift coordinates from camera to target coordinates:

      tPa = t_c R cPa + tPc                                            (14)

  where tPc in R^3 is an offset vector.

- By convention, the rotation from the camera frame to the target frame is
  given by rotations about the three axes, corresponding to three angles
  pitch, roll and yaw: Rx (pitch), Rz (roll), Ry (yaw).

- The rotation is the product of the three single-axis rotations. Writing
  Cp = cos(pitch), Sp = sin(pitch), Cr = cos(roll), Sr = sin(roll),
  Cy = cos(yaw), Sy = sin(yaw):

      t_c R = Rz Ry Rx

            = [ Cr -Sr 0 ] [  Cy 0 Sy ] [ 1  0   0  ]
              [ Sr  Cr 0 ] [  0  1 0  ] [ 0  Cp -Sp ]
              [ 0   0  1 ] [ -Sy 0 Cy ] [ 0  Sp  Cp ]

            = [ Cr Cy    -Sr Cp + Cr Sy Sp     Sr Sp + Cr Sy Cp ]
              [ Sr Cy     Cr Cp + Sr Sy Sp    -Cr Sp + Sr Sy Cp ]      (15)
              [ -Sy       Cy Sp                Cy Cp            ]

- Compare Eqn (15) with Bay, Eqn (3.16). Bay uses a different ordering for
  the rotations. Because matrix multiplication does not commute, Bay's 3-axis
  rotation matrix, while similar, is not exactly the same as t_c R.

- There are at least 48 ways to put together a 3-axis rotation matrix, and
  they are all found somewhere in the literature.
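The single-axis factors and the expanded product can be sketched numerically.
In this illustration (not from the notes) the angles are the example values
from figure 3, and the second ordering is an arbitrary alternative among the
48 possibilities:

```python
import numpy as np

def Rx(a):  # rotation about the X-axis (pitch in these notes)
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def Ry(a):  # rotation about the Y-axis (yaw in these notes)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def Rz(a):  # rotation about the Z-axis (roll in these notes)
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

pitch, roll, yaw = np.deg2rad([30.0, 20.0, 25.0])
Cp, Sp = np.cos(pitch), np.sin(pitch)
Cr, Sr = np.cos(roll),  np.sin(roll)
Cy, Sy = np.cos(yaw),   np.sin(yaw)

R_a = Rz(roll) @ Ry(yaw) @ Rx(pitch)   # the ordering of Eqn (15)
R_b = Rx(pitch) @ Rz(roll) @ Ry(yaw)   # a different ordering

# The closed-form product (Eqn 15) matches the matrix product
M = np.array([[Cr*Cy, -Sr*Cp + Cr*Sy*Sp,  Sr*Sp + Cr*Sy*Cp],
              [Sr*Cy,  Cr*Cp + Sr*Sy*Sp, -Cr*Sp + Sr*Sy*Cp],
              [-Sy,    Cy*Sp,             Cy*Cp           ]])
assert np.allclose(R_a, M)

# Both orderings give valid rotation matrices ...
assert np.allclose(R_a.T @ R_a, np.eye(3))
# ... but, because matrix multiplication does not commute, they differ
assert not np.allclose(R_a, R_b)
```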


3 Linear Operators in Different Bases, or A Change of Basis (Bay section 3.1.3)

- When we make a change of basis, we change the axes on which a vector is
  represented. For example, given two bases on R^2,

      U = [ u1 u2 ] = [ 1  0   ]         V = [ v1 v2 ] = [ 0.8  1 ]
                      [ 0  0.5 ] ,                       [ 0.8  0 ]

- The vector f = [ 0.5  1.0 ]^T can be represented

      f = [ 0.5 ] = 0.5 [ 1 ] + 2.0 [ 0   ]       or   f = U f~ = [ u1 u2 ] [ 0.5 ]
          [ 1.0 ]       [ 0 ]       [ 0.5 ] ,                               [ 2.0 ]

  or

      f = 1.25 [ 0.8 ] - 0.5 [ 1 ]                or   f = V f~ = [ v1 v2 ] [  1.25 ]
               [ 0.8 ]       [ 0 ] ,                                        [ -0.5  ]

- With a change of basis from U to V, the representation of a vector changes,
  but the vector itself (vector f) remains the same.

3.1 Linear Operators and a change of Bases

- A linear operator is just a map from one vector onto another

      y1 = A x1                                                  (16)

- If we change the representation of the vectors, we have to make a suitable
  change in the linear operator.

- Change the representation of the vectors

      x2 = Bx x1 ,    y2 = By y1                                 (17)

  where Bx and By are basis transformations in the input and output spaces,
  respectively, and must be square and invertible.

- Rewriting Eqn (16) with the change of basis

      y1 = A x1                                   (Eqn 16, repeated)
      y2 = A~ x2                                                 (18)

- Relating Eqn (16) to Eqn (18) gives

      By y1 = A~ Bx x1 ,    or    y1 = By^-1 A~ Bx x1            (19)

- Using the uniqueness of Eqns (16) and (18), Eqn (19) implies that

      A = By^-1 A~ Bx     or, equivalently     A~ = By A Bx^-1

- We started with the equation y1 = A x1 and ended up with the equation
  y2 = A~ x2, where the input and output bases of linear operator A have
  changed.

- Which brings up this point: implicit in any linear operator are the bases
  in which the input and output are expressed. We normally assume these to
  be the standard Euclidean bases for R^m and R^n.

- Notice that when we write

      y2 = By y1 ,    x2 = Bx x1

  the transformation matrices By and Bx must be square matrices, and full
  rank to be invertible: By in R^(n x n), Bx in R^(m x m).

- A special case of basis transformation arises when A is a square matrix, or
  A : R^n -> R^n. In this case the input and output transformations can be
  the same,

      y2 = B y1 ,    x2 = B x1

  Combining with Eqns (16)-(19) above, we can write

      y1 = A x1 = B^-1 A~ B x1

  which gives:

      A = B^-1 A~ B                                              (20)

      B A B^-1 = A~                                              (21)

- When A and B are square and B is invertible, Eqn (20) has a special name:
  it is called a similarity transformation.

- Similarity transformations preserve the eigenvalues; in other words

      eig( A ) = eig( A~ ) ,  where eig( A ) is the vector of eigenvalues of A

- Similarity transformations are going to give us the freedom to re-write
  system models from one basis to another, to explore model properties such
  as modal response, controllability and observability.
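The similarity transformation and its eigenvalue-preserving property can be
checked numerically. A small sketch (the random matrices are arbitrary; any
square invertible B works):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))      # operator on the original basis
B = rng.standard_normal((3, 3))      # basis change (invertible w.p. 1)

A_tilde = B @ A @ np.linalg.inv(B)   # similarity transformation, Eqn (21)

# Eigenvalues are preserved under the similarity transformation
eig_A  = np.sort_complex(np.linalg.eigvals(A))
eig_At = np.sort_complex(np.linalg.eigvals(A_tilde))
assert np.allclose(eig_A, eig_At)

# And B^{-1} A~ B recovers A, Eqn (20)
assert np.allclose(np.linalg.inv(B) @ A_tilde @ B, A)
```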

4 Change of Basis as a Tool for Analysis

- We have seen the basic mechanics of:

      Representation of a vector on a specific basis,
      How to change the basis of vectors, and finally
      How to adapt an operator, when the vectors it operates on change basis.

- These are powerful tools for several types of problems. Bay homework
  problems 3.4, 3.5, 3.6, 3.7, 3.9, 3.10 and 3.11 are all addressed by a
  change of basis.

- A change of basis is the mathematical foundation for many common methods

      Fourier transform
      Wavelet transform
      Expansion onto Lebesgue and Legendre polynomials

  and applications

      Reconstructing MR images
      jpeg image compression
      Speech compression for cell-phone transmission

- And for control systems, a change of basis is necessary to solve

      ẋ(t) = A x(t) + B u(t)

  at all.

The basic architecture of an application of change of basis is this. Given an
application:

1. The application domain has a natural basis.

   Example: time-domain, for waveform compression.
   Call this basis s, for the standard basis.

2. The application is difficult in the original basis.

   Example: if we just throw out 9 out of 10 samples in the time domain, the
   reconstruction will be very bad.

3. There is an alternative basis on which the application is easy.

   Example: if the data vector (a voice signal) is represented on a set of
   wavelet vectors (discrete wavelet transform), the application is easy:
   just throw out the coefficients for basis vectors that make little
   difference for human perception.

4. We solve the problem or achieve the application in 3 steps:

   Step 1. Transform the data from the s basis into a basis where the
           application is easy; call this the F basis (it can have a
           different name for each application).

   Step 2. Solve the application with data in the new F basis.

   Step 3. Transform the results back to the s basis, for utilization.

   For a linear operator, the three steps combine as

       sA = sFT FA FsT                                           (22)

[Figure 5 diagram: Problem expressed in s basis (problem is generally
unsolvable on the s basis) --FsT--> Alternative, F, basis: solve problem
(problem is solvable on the F basis) --sFT--> Solution expressed in s basis]

Figure 5: Problem solving steps using a change of basis.

[Figure 6 diagram: Problem: cannons blowing up. Heat distribution known in s
basis (s: just x, y, z); no solution to the heat-conduction equation for a
general h0(x,y,z). Alternative, F, basis is sine and cosine functions; heat
conduction can be solved for a sine initial distribution; by superposition,
the solution for many sine functions is the sum of the solutions for each
individual sine function. Solution expressed in s basis: hot points
determined, cannon redesigned.]

Figure 6: Problem solving steps using a change of basis.

4.1 Example: non-orthogonal projection onto a plane

- Bay problem 3.9 raises an interesting challenge from computer graphics.

  Problem 3.9 (adjusted values):
  Let P be the plane in R^3 defined by 1x - 1y + 2z = 0, and let l be the
  vector

      l = [ 2 ]
          [ 1 ]
          [ 1 ]

- Denote by A : R^3 -> R^3 the projection operator that projects vectors in
  R^3 onto the plane P, not orthogonally, but along vector l.

- Projected points can be pictured as shadows of the original vectors onto P,
  where the light source is at an infinite distance in the direction l.

- Non-orthogonal projection is a standard operation in computer graphics.
  Rendering systems do shading and compute reflections by representing
  complex surfaces as many flat triangular tiles, and computing the
  intersection point of many rays with these tiles.

  1. Find the matrix of operator FA which projects a vector represented in
     basis F = [ f1, f2, f3 ] onto plane P, where f1 and f2 are any
     non-collinear vectors in the plane P, and f3 = l.

  2. Find the matrix of operator sA which projects a vector represented in
     the standard R^3 basis onto plane P.

Definitions and background details:

Surface Normal  The plane is specified by its surface normal, a vector that
   is orthogonal to the plane. This is a standard way to specify an (n-1)-
   dimensional hyper-surface in R^n.

The projection  The calculation

       y = A x

   is a projection of a point (x) onto the surface (to point y). Since x and
   y are both 3-vectors, A is a 3x3 matrix.

   The problem can be stated: Given a ray originating at point

       x = [ x ]
           [ y ]
           [ z ] ,

   following the line

       l = [ 2 ]
           [ 1 ]
           [ 1 ] ,

   where does the ray strike the plane defined by the surface normal

       n = [  1 ]
           [ -1 ]
           [  2 ]  ?

   Being able to cast the relation as a linear operator, of course, greatly
   simplifies and accelerates the calculation. A Play Station 3 can perform
   this calculation several billion times per second.

The origin  For the plane to be a 2-D vector subspace, the origin must lie in
   the plane. In practical systems with many tiles (planes), the origin is
   offset to a point in each plane as it is processed, so that a projection
   operator can be used.

- It is necessary to find basis vectors for plane P. We can use the fact that
  all vectors in plane P are orthogonal to n.

- Note: up to now we have considered only orthogonal projections; that is, if
  g = P f is the projection of f onto a subspace, and w = f - g, then w ⊥ g.

- But in this case, as with many projections in computer graphics, the
  projection is not orthogonal. That is: the line l is not orthogonal to the
  plane or, equivalently, l is not parallel to n.

Bases: This problem is approached in problem 3.9 (a) on a basis F. In this
   note, points in the standard space (x, y, z) are labeled sx and sy.
   Vectors expressed on basis F are labeled Fx and Fy.

[Figure 7 diagram: points x1 and x2 project along the shadow ray l to points
y1 and y2 in the plane; n is the surface normal.]

Figure 7: Illustration of projection along ray l. In computer graphics,
complex surfaces can be represented as a mesh of triangular tiles.

Suggested Approach: With many engineering challenges, it is good to ask the
question: when I have an answer, how can I verify that it is the correct
answer? Turn problem 3.9 around and ask:

    given a point sxi in R^3 and its shadow in plane P (call this syi), how
    can I independently verify that syi is correct?

For this problem, the reverse problem, verification, may be easier than the
analysis to determine sy, and thinking through how to verify a correct answer
may help find how to determine sy.

Considering verification of the correctness of a solution:

1. Given a point sxi and a point syi in the plane, derive the calculation to
   verify:

   (a) that syi is in plane P, and
   (b) that syi is the shadow of sxi.

(a): to verify that point sy1 lies in plane P we need basis vectors for P.
These will be any two independent vectors lying in P; that is, any two
independent vectors, each orthogonal to n. Considering that n = [ 1 -1 2 ]^T,
one choice is:

       P = [  2  1 ]
           [  0  1 ]                                             (23)
           [ -1  0 ]

Verifying that syi lies in plane P: One way to check that sy1 lies in P is to
form the projection of sy1 onto P, and verify that it equals sy1. The
projection of syi onto P is given by:

       yhat = P ( P^T P )^-1 P^T syi

If yhat = syi, then syi lies in P.

(b): to verify that syi is the shadow of sxi, consider what it means to be the
shadow:

       syi - sxi = gamma l                                       (24)

If syi - sxi = gamma l for some scalar gamma, then syi is the shadow of sxi.

Example Data: Consider the point

       sx1 = [ 2 ]                           sy1 = [ -2.67 ]
             [ 3 ]    which projects onto          [  0.67 ]
             [ 4 ]                                 [  1.67 ]

    >> P = [ 2 1 ; 0 1 ; -1 0 ]
    P =  2     1
         0     1
        -1     0

    >> sy1 = [ -2.67; 0.67; 1.67 ];

    >> %% Verify sy1 lies in P
    >> sy1hat = P * inv(P'*P) * P' * sy1
    sy1hat = -2.67
              0.67
              1.67

    >> sy1hat - sy1
    ans = 1.0e-15 *
         -0.8882
          0
         -0.1665

Verifying that syi is the shadow of sxi: To show that sy1 is the shadow of
sx1, show that ( sy1 - sx1 ) is parallel to l. For the example data we find:

    %% Projection ray l
    l = [ 2; 1; 1 ];

    %% Difference vector
    >> ll = sy1 - sx1
    ll = -4.6667
         -2.3333
         -2.3333

    %% Term-by-term ratio
    >> ll ./ l
    ans = -2.3333   -2.3333   -2.3333

so sy1 - sx1 = -2.3333 l, and sy1 is the shadow of sx1.

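The same two verification checks can be sketched in Python (a translation of
the MATLAB verification above, using the example data; the exact fractions
-8/3, 2/3, 5/3 are the values that print as -2.67, 0.67, 1.67):

```python
import numpy as np

n = np.array([1.0, -1.0, 2.0])            # surface normal of plane P
l = np.array([2.0, 1.0, 1.0])             # projection ray
P = np.array([[ 2.0, 1.0],
              [ 0.0, 1.0],
              [-1.0, 0.0]])               # basis vectors for the plane

sx1 = np.array([2.0, 3.0, 4.0])
sy1 = np.array([-8/3, 2/3, 5/3])          # claimed shadow of sx1

# (a) sy1 lies in P: its orthogonal projection onto P equals sy1
sy1_hat = P @ np.linalg.solve(P.T @ P, P.T @ sy1)
assert np.allclose(sy1_hat, sy1)

# (b) sy1 is the shadow of sx1: the difference is parallel to l
gamma = (sy1 - sx1) / l                   # term-by-term ratio
assert np.allclose(gamma, gamma[0])       # all ratios equal => parallel to l
```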

Now considering text problem 3.9. (Problem 3.9 part a)

1. Construct the set of basis vectors

       F = [ f1 f2 l ]

   where f1 and f2 are basis vectors in the plane, and l is the projection
   ray. Given P above, the set of basis vectors for F is:

       F = [ f1 f2 l ] = [  2  1  2 ]
                         [  0  1  1 ]                            (25)
                         [ -1  0  1 ]

   Many F matrices are possible, with f1 and f2 as basis vectors for P.

2. Given a vector Fx1 represented on basis F, find the operator FA that
   determines Fy1, that is, determines the shadow point represented on
   basis F.

Discussion: Using F as a set of basis vectors, any point sxi is represented as

       sxi = a1 [  2 ] + a2 [ 1 ] + a3 [ 2 ]  =  F Fxi ,  with  Fxi = [ a1 ]
                [  0 ]      [ 1 ]      [ 1 ]                          [ a2 ]
                [ -1 ]      [ 0 ]      [ 1 ]                          [ a3 ]

Find yi on the F basis. Given

       Fyi = [ b1 ]
             [ b2 ]
             [ b3 ]

so that

       syi = b1 [  2 ] + b2 [ 1 ] + b3 [ 2 ]  =  F Fyi
                [  0 ]      [ 1 ]      [ 1 ]
                [ -1 ]      [ 0 ]      [ 1 ]

To find syi in plane P, set b3 to 0 !

This gives:

       Fyi = [ b1 ]   [ 1 0 0 ]
             [ b2 ] = [ 0 1 0 ] Fxi
             [ 0  ]   [ 0 0 0 ]

- For yi to be the projection of xi along l, in the F basis vectors, we can
  only modify the 3rd coefficient, giving b1 = a1, b2 = a2:

       [ b1 ]   [ 1 0 0 ] [ a1 ]
       [ b2 ] = [ 0 1 0 ] [ a2 ]                                 (26)
       [ 0  ]   [ 0 0 0 ] [ a3 ]

  or

       Fyi = FA Fxi ,   with   FA = [ 1 0 0 ]
                                    [ 0 1 0 ]                    (27)
                                    [ 0 0 0 ]

- Properties (a) yi lies in P, and (b) yi is the projection along l of xi,
  determine the projection in basis F: FA is the projection operator on the
  F basis.

(Problem 3.9 part b)

1. Starting with FA, find the operator sA that determines sy1 corresponding
   to a point sx represented on the standard basis.

Discussion: We have these three relationships:

- Standard map from F coordinates to standard coordinates:

       syi = F Fyi ,   so   sFT = F                              (28)

- The standard map from s coordinates to F coordinates (just the inverse of
  Eqn (28)):

       Fxi = F^-1 sxi ,   note   FsT = F^-1                      (29)

- And we have the operator in F coordinates, FA.

Answer: put the three pieces together

       sy = sFT FA FsT sxi = sA sx                               (30)

    >> sA = F*[1 0 0; 0 1 0; 0 0 0] * inv(F)
    sA =  0.3333    0.6667   -1.3333
         -0.3333    1.3333   -0.6667
         -0.3333    0.3333    0.3333

Note: there is only one operator sA that projects along line l to plane P in
standard coordinates. The numerical values come out to those given, whatever
basis is chosen for P.
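The construction of Eqn (30) translates directly to numpy; a minimal sketch
with the example data from these notes:

```python
import numpy as np

F = np.array([[ 2.0, 1.0, 2.0],
              [ 0.0, 1.0, 1.0],
              [-1.0, 0.0, 1.0]])       # F = [ f1 f2 l ], Eqn (25)
FA = np.diag([1.0, 1.0, 0.0])          # projection on the F basis, Eqn (27)

sA = F @ FA @ np.linalg.inv(F)         # sA = sFT FA FsT, Eqn (30)

# The example point sx1 = [2 3 4]^T projects to sy1 = [-8/3, 2/3, 5/3]
sx1 = np.array([2.0, 3.0, 4.0])
sy1 = sA @ sx1                         # -> approximately [-2.67, 0.67, 1.67]
assert np.allclose(sy1, [-8/3, 2/3, 5/3])

# sA is a projection: applying it twice changes nothing
assert np.allclose(sA @ sA, sA)
```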

4.2 Looking at the Fourier Transform as a change of basis

- The Fourier transform is written as a convolution integral. Given f(t):

      Fourier Transform:          F(w) = Integral f(t) e^(-2 pi j w t) dt       (31)

      Inverse Fourier Transform:  f(t) = Integral F(w) e^(+2 pi j w t) dw       (32)

- For the Discrete Fourier Transform (DFT), the DFT and Inverse DFT are given
  by summations in the place of integrals:

      DFT:          F(k) =       sum_{j=1}^{N} f(j) e^(-(2 pi i / N)(j-1)(k-1))   (33)

      Inverse DFT:  f(j) = (1/N) sum_{k=1}^{N} F(k) e^(+(2 pi i / N)(j-1)(k-1))   (34)

  where f(j) is the time-domain signal, F(k) is the frequency-domain signal,
  and DFT is the Discrete Fourier Transform.

- The DFT gives a signal in the frequency domain, w_k = (k-1)/N [cycles/sample].

    >> w1 = 1/16; w2 = 1/32;
    >> jjs = 1:128;
    >> for j1 = jjs,
         f(j1) = cos(2*pi*w1*j1) ...
               + cos(2*pi*w2*j1);
       end
    >> F = fft(f);
    >>
    >> kk = (0:127)/2;
    >> figure(1),
    >>   subplot(3,1,1)
    >>   plot(jjs, f)
    >>   subplot(3,1,2)
    >>   plot(kk, real(F))
    >>   subplot(3,1,3)
    >>   plot(kk, imag(F))

[Figure 8 panels: f(j) vs j [sample number]; real F(k) and imag F(k) vs
k [wave number]]

Figure 8: Signal and Discrete Fourier Transform.

- Consider the inverse DFT

      f(j) = (1/N) sum_{k=1}^{N} F(k) e^((2 pi i / N)(j-1)(k-1))   (Eqn 34, repeated)

- Defining basis vectors

      vk = [ e^((2 pi i / N)(1-1)(k-1)) ]   [ e^(i w_k (1-1)) ]
           [ e^((2 pi i / N)(2-1)(k-1)) ] = [ e^(i w_k (2-1)) ]     (35)
           [             ...            ]   [       ...       ]
           [ e^((2 pi i / N)(N-1)(k-1)) ]   [ e^(i w_k (N-1)) ]

  where

      w_k = 2 pi (k-1) / N                                          (36)

  is the frequency corresponding to component F(k).

- Using the vk, the inverse DFT of Eqn (34) takes the form:

      f = (1/N) ( F(1) v1 + F(2) v2 + ... + F(N) vN )               (37)

  where f = [ f(1) f(2) ... f(N) ]^T is the time-domain signal, and
  F = [ F(1) F(2) ... F(N) ]^T is the frequency-domain signal.

- Eqn (37) has the form of expanding f on a set of basis vectors. The 1/N
  term is a normalization.

- Putting the basis vectors together in an array

      V = [ v1 v2 ... vN ]                                       (38)

  we find

      f = (1/N) V F                                              (39)

  (where F is the Fourier transform of f).

- An important property of the Fourier basis functions is that they are
  orthogonal. That is, for Eqns (35) and (38):

      Vbar^T V = N I ,   or   Vbar^T = N V^-1                    (40)

  where the overbar indicates the complex conjugate. (The Fourier basis
  vectors require the term N to be normalized.)

- Looking at Eqn (33), and considering that vbar_k^T is the complex conjugate
  transpose of v_k, we find that the DFT is given by:

      F = Vbar^T f                                               (41)

- Putting together Eqns (39) and (41) gives

      f = (1/N) V Vbar^T f                                       (42)

  where V Vbar^T is a projection operation, when the basis vectors are
  orthogonal.

- Let's look at an example, with N = 6. Eqn (35) gives the code:

    >> w0 = 2*pi/N;              %% Fundamental frequency
    >> for kk = 1:N,             %% Build the N basis vectors
    >>   for jj = 1:N,           %% The N elements of each vector
    >>     V(jj,kk) = exp(j*w0*(jj-1)*(kk-1));
    >>   end
    >> end

    V =
     1.00   1.00           1.00          -1.00 -> 1.00   1.00           1.00
     1.00   0.50 + 0.87i  -0.50 + 0.87i  -1.00         -0.50 - 0.87i    0.50 - 0.87i
     1.00  -0.50 + 0.87i  -0.50 - 0.87i   1.00         -0.50 + 0.87i   -0.50 - 0.87i
     1.00  -1.00           1.00          -1.00          1.00           -1.00
     1.00  -0.50 - 0.87i  -0.50 + 0.87i   1.00         -0.50 - 0.87i   -0.50 + 0.87i
     1.00   0.50 - 0.87i  -0.50 - 0.87i  -1.00         -0.50 + 0.87i    0.50 + 0.87i

Summary:

- Eqn (35) defines an orthogonal basis of N-element vectors (or functions,
  for the continuous-time FT).

- Because the Fourier basis functions are orthogonal, no matrix inverse is
  required to compute the change of basis (Fourier would have had a difficult
  time doing a large matrix inverse in 1805!).

- The DFT, Eqn (41), is a change of basis, from the standard basis of f(j) to
  the Fourier basis F(k).

- The IDFT is the change of basis back.

- Now consider the FFT

    >> f = [ 1 2 3 4 5 6 ]'
    f =  1
         2
         3
         4
         5
         6

    %% Using the standard FFT function
    >> F1 = fft(f)
    F1 = 21.0000
         -3.0000 + 5.1962i
         -3.0000 + 1.7321i
         -3.0000
         -3.0000 - 1.7321i
         -3.0000 - 5.1962i

    %% Using a matrix multiply, for a change of basis
    >> F2 = V'*f
    F2 = 21.0000
         -3.0000 + 5.1962i
         -3.0000 + 1.7321i
         -3.0000 - 0.0000i
         -3.0000 - 1.7321i
         -3.0000 - 5.1962i

    %% The inverse Fourier transform
    >> f3 = (1/N)* V * F2
    f3 = 1
         2
         3
         4
         5
         6

- Finally, V is a set of orthogonal basis vectors (and 1/N is required to
  normalize); in Matlab, V' is the complex conjugate transpose Vbar^T:

    >> V'*V
    ans =
        6   0   0   0   0   0
        0   6   0   0   0   0
        0   0   6   0   0   0
        0   0   0   6   0   0
        0   0   0   0   6   0
        0   0   0   0   0   6

4.2.1 The Fourier transform as a change of basis

- If the Fourier Transform is just a matrix multiplication for a change of
  basis, why are textbook derivations based on summations and integrals?

  1. The matrix multiply form only works for sampled signals (discrete
     signals, or the discrete Fourier transform). Continuous signals require
     integrals (and can be defined in terms of inner products).

  2. Matrix multiplication is conceptually simple, but computationally
     inefficient. For a 1024-element FFT, a 1024x1024 matrix V would be
     required.

- Fourier actually focused on the columns of V. Fourier's insight contained
  2 parts:

  1. He could solve the problem of heat conduction for an initial heat
     distribution given by a basis vector vk.

  2. The basis vectors of the Fourier transform are orthogonal!

- Ordinarily, given basis vectors V and data f, to find the basis
  coefficients we need to solve

      F = V^-1 f ,   or   F = ( V^T V )^-1 V^T f

  But the Fourier basis vectors are orthogonal, with a very simple
  normalizing factor,

      F = Vbar^T f

  so no matrix inversion is required!
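The same experiment translates to numpy; a sketch of the DFT as a change of
basis (Eqns 39-41), checked against the library FFT:

```python
import numpy as np

N = 6
jj, kk = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
V = np.exp(2j * np.pi * jj * kk / N)    # V(j,k) = e^{2 pi i (j-1)(k-1)/N}, Eqn (35)

f = np.arange(1.0, N + 1)               # f = [1 2 3 4 5 6]

F_fft = np.fft.fft(f)                   # standard FFT
F_mat = V.conj().T @ f                  # change of basis, Eqn (41)
assert np.allclose(F_fft, F_mat)

f3 = (V @ F_mat) / N                    # inverse DFT, Eqn (39)
assert np.allclose(f3, f)

# Orthogonality of the Fourier basis, Eqn (40)
assert np.allclose(V.conj().T @ V, N * np.eye(N))
```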

4.2.2 Using the Fourier transform

- Suppose we had an operator FA that operates on frequency data, not on
  time-domain data, and given f, time-domain data:

      y = (1/N) V FA Vbar^T f                                    (43)

  where the logic is:

      Convert f to the frequency domain             (action of Vbar^T)
      Apply the operator in the frequency domain    (action of FA)
      Convert the answer back to the time domain    (action of V)

- Now, using Fourier's technique, we can just make a time-domain operator

      sA = (1/N) V FA Vbar^T                                     (44)

  and so

      y = sA f                                                   (45)

4.3 Additional examples using change of basis

4.3.1 Matching input and output data to discover an operator

- Given the data

      { x1, x2, x3 } = { [ 1 ]   [ 4 ]   [ -1 ] }
                       { [ 2 ] , [ 5 ] , [  2 ] }
                       { [ 3 ]   [ 6 ]   [  0 ] }

      { y1, y2, y3 } = { [  0 ]   [ 4 ]   [ -1 ] }
                       { [  2 ] , [ 1 ] , [  0 ] }
                       { [ -2 ]   [ 6 ]   [  1 ] }

  find the operator A to solve

      y = A x

- The key to finding a good basis for solving the problem is to realize that
  a basis F is found so that the x data are

      { Fx1, Fx2, Fx3 } = { [ 1 ]   [ 0 ]   [ 0 ] }
                          { [ 0 ] , [ 1 ] , [ 0 ] }
                          { [ 0 ]   [ 0 ]   [ 1 ] }

  with corresponding { Fy1, Fy2, Fy3 }.

- Then the operator would simply be:

      FA = [ Fy1, Fy2, Fy3 ]

  so that

      Fy1 = FA Fx1 = FA [ 1 ]        Fy2 = FA Fx2 = FA [ 0 ]
                        [ 0 ] ,                        [ 1 ] ,    etc.
                        [ 0 ]                          [ 0 ]

- So the trick is to find a basis F on which

      Fx1 = [ 1 ]        Fx2 = [ 0 ]
            [ 0 ] ,            [ 1 ] ,    etc.
            [ 0 ]              [ 0 ]

- The solution is to choose as basis vectors the xi vectors themselves. Then

      F = [ x1, x2, x3 ]                                         (46)

- With Eqn (46)

      sx1 = F Fx1 = [ x1, x2, x3 ] [ 1 ]
                                   [ 0 ] ,    etc.
                                   [ 0 ]

  which gives

      sFT = F ,   FsT = F^-1                                     (47)

- On the F basis:

      Fy = FA Fx ,  with  FA = [ Fy1, Fy2, Fy3 ] = FsT [ sy1, sy2, sy3 ]

- Now, to find the operator in standard coordinates, it is the usual
  equation:

      sA = sFT FA FsT                                            (48)

  and the operator gives y's from x's in the standard coordinates:

      sy = sA sx

4.3.2 Operator from data, example

- Setting the X and Y data

    >> X = [ 1  4 -1
             2  5  2
             3  6  0 ]

    >> Y = [ 0  4 -1
             2  1  0
            -2  6  1 ]

- Now setting up the change of bases

    >> sFT = X;
    >> FsT = inv(sFT)
    FsT = -0.8000   -0.4000    0.8667
           0.4000    0.2000   -0.2667
          -0.2000    0.4000   -0.2000

- The operator on the F basis

    >> FA = FsT * Y
    FA = -2.5333    1.6000    1.6667
          0.9333    0.2000   -0.6667
          1.2000   -1.6000         0

- Now convert the operator to the standard basis

    >> sA = sFT * FA * FsT
    sA =  1.8000    0.4000   -0.8667
         -1.2000   -0.6000    1.4667
          3.8000    2.4000   -3.5333

- Double checking:

    >> sA * X(:,1)
    ans =  0.0000
           2.0000
          -2.0000

    >> sA * X(:,2)
    ans =  4.0000
           1.0000
           6.0000

    >> sA * X(:,3)
    ans = -1.0000
          -0.0000
           1.0000
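The three steps of Eqns (46)-(48) collapse algebraically to sA = Y X^-1; a
numpy sketch of the example, verifying both routes agree:

```python
import numpy as np

X = np.array([[ 1.0, 4.0, -1.0],
              [ 2.0, 5.0,  2.0],
              [ 3.0, 6.0,  0.0]])     # columns are x1, x2, x3
Y = np.array([[ 0.0, 4.0, -1.0],
              [ 2.0, 1.0,  0.0],
              [-2.0, 6.0,  1.0]])     # columns are y1, y2, y3

FsT = np.linalg.inv(X)                # Eqn (47): FsT = F^{-1}, with F = X
FA = FsT @ Y                          # operator on the F basis
sA = X @ FA @ FsT                     # Eqn (48), back to standard coordinates

# Collapsing the three steps: sA = Y X^{-1}
assert np.allclose(sA, Y @ np.linalg.inv(X))

# Double check: sA maps each xi to the corresponding yi
assert np.allclose(sA @ X, Y)
```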

4.4 Conclusions: change of basis as a tool for analysis

- Some problems are much more easily solved in a special coordinate frame
  that is different from the natural (standard) coordinates of the problem.

- In these cases the best way, and possibly the only way, to solve the
  problem is to transform the data into the special coordinates, solve the
  problem, and transform the answer back out again.

- With linear transformations and operators, we can then build the operator
  directly in standard coordinates with Eqn (22).

- Since the time of Fourier, coordinate transformation has been a standard
  tool, but we might call it by other names, or never explicitly compute the
  matrices F and F^-1.

- Coordinate transformation is the only general way to solve the equation

      ẋ(t) = A x(t) + B u(t)

5 Operators as Spaces (Bay section 3.2)

- The set of linear operators from one vector space into another (or into
  itself) also forms a vector space.

- Working with the space of all operators A : R^m -> R^n, and with

      y1 = A1 x ,   y2 = A2 x

  then

      A1 x + A2 x = ( A1 + A2 ) x

  and all the required properties of a vector space are satisfied:

  1. The 0 operator is included in the set
  2. For every operator there is the additive inverse operator
  3. Closure under addition
  4. Commutativity of addition
  5. Associativity of addition
  6. Closure under scalar multiplication
  7. Associativity of scalar multiplication
  8. Distributivity of scalar multiplication over operator addition

5.1 Operator Norms

- Operators also have norms. Intuitively, the size of an operator relates to
  how much it changes the size of a vector. Given:

      y = A x

  with suitable norms ||y|| and ||x||, the operator norm ||A||op is defined
  by:

      ||A||op = sup_{x != 0} ||y|| / ||x|| = sup_{||x|| = 1} ||y||      (49)

  where sup, supremum, indicates the least upper bound.

- The operator norm is induced by the vector norms ||y|| and ||x||.

- An operator matrix does not have to be square or full rank; it can be any
  matrix, A in R^(n x m).

5.1.1 Operator norm properties

Operator norms have these properties:

  1. ||A x|| <= ||A||op ||x||               (from the definition of the operator norm)
  2. ||A1 + A2||op <= ||A1||op + ||A2||op   (triangle inequality)
  3. ||A1 A2||op <= ||A1||op ||A2||op       (Cauchy-Schwarz inequality)
  4. ||a A||op = |a| ||A||op                (scalar multiplication)

5.2 Determining the value of Operator norms

Just as there are several vector norms, each induces an operator norm.

5.2.1 The L1 norm of an operator

- The L1 norm of an operator answers this question: Given a vector x with
  ||x||1 = 1, and y = A x, what is the largest possible value of ||y||1?

- Or said another way, what is the maximum (L1) gain of operator A?

- Example

      y = A x = [ 1 4 ]
                [ 2 5 ] x ,     what is the max. possible ||y||1 / ||x||1 ?
                [ 3 6 ]

- For the moment let's consider x with ||x||1 = 1. (Through linearity, ||y||1
  just scales with ||x||1.)

- It turns out that the choice for x that gives the largest possible 1-norm
  for the output is to put all of ||x||1 on the element corresponding to the
  largest column vector of A. (Largest in the L1 sense.)

  Choose:

      x = [ 0 ]         then   ||x||1 = 1 ,   ||y||1 = 15 ,
          [ 1 ] ,

  so

      ||A||1op = 15
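The "largest column" rule of Eqn (50) is easy to check numerically; a small
sketch for the example matrix:

```python
import numpy as np

A = np.array([[1.0, 4.0],
              [2.0, 5.0],
              [3.0, 6.0]])

# ||A||_1 is the largest column sum of absolute values, Eqn (50)
l1_by_columns = np.abs(A).sum(axis=0).max()
assert np.isclose(l1_by_columns, 15.0)

# numpy's induced matrix 1-norm agrees
assert np.isclose(np.linalg.norm(A, 1), l1_by_columns)

# Achieved by putting all of ||x||_1 on the largest column
x = np.array([0.0, 1.0])
assert np.isclose(np.abs(A @ x).sum(), 15.0)
```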

- Example:

      A = [ a1 a2 ] = [ 1 4 ]
                      [ 2 5 ] ,       y = A x ,
                      [ 3 6 ]

  so

      for x = [ 1 ]          y = [ 1 ]
              [ 0 ] ,            [ 2 ] ,      ||y||1 = 6
                                 [ 3 ]

      for x = [ 0.5 ]        y = [ 2.5 ]
              [ 0.5 ] ,          [ 3.5 ] ,    ||y||1 = 10.5
                                 [ 4.5 ]

      for x = [ 0 ]          y = [ 4 ]
              [ 1 ] ,            [ 5 ] ,      ||y||1 = 15
                                 [ 6 ]

  and ||a1||1 = 6, ||a2||1 = 15.

- Given

      A = [ a1, a2 ... , am ]

  the L1 norm of matrix A is the largest L1 norm of any column vector of A:

      ||A||1 = max_j || a_j ||1                                  (50)

5.2.2 The L2-norm of an operator

- The L2-norm of an operator gives the maximum L2 length of an output vector
  for a unit input vector:

      ||A||2 = sup_{x != 0} ||y||2 / ||x||2
             = sup_{x != 0} ||A x||2 / ||x||2 = sup_{||x||2 = 1} ||A x||2     (51)

- Bay defines the L2 norm of a matrix operator as:

      ||A||2 = max_{||x||2 = 1} { x^T Abar^T A x }^(1/2)                      (52)

  where the overbar indicates the complex conjugate. Eqn (52) is just a
  re-statement of the definition of an operator norm, with
  ||y||2 = ( ybar^T y )^(1/2) expanded inside the brackets.

- The L2 norm of a matrix is the largest singular value of the matrix

      ||A||2 = sigma_max( A )                                    (53)

- It is determined using the singular value decomposition (a topic of Bay
  chapter 4). For example:

    >> [U,S,V] = svd(A)
    U = -0.429    0.806    0.408
        -0.566    0.112   -0.816
        -0.704   -0.581    0.408

    S =  9.508        0
             0    0.773
             0        0

    V = -0.386   -0.922
        -0.922    0.386

    >> x2 = [ .386; .922 ];
    >> y2 = A*x2
    y2 = [ 4.0740, 5.3820, 6.6900 ]

    >> norm(y2) / norm(x2)
    ans = 9.5080

EE/ME 701: Advanced Linear Systems                        Section 5.2.3

5.2.3 The L∞-norm of an operator

The L∞-norm of an operator gives the maximum L∞ length of an output for
a unit input. Bay defines the L∞ norm of a matrix operator as:

    ||A||_∞ = sup_{x ≠ 0} ||y||_∞ / ||x||_∞
            = sup_{x ≠ 0} ||A x||_∞ / ||x||_∞
            = sup_{||x||_∞ = 1} ||A x||_∞                                (54)

which evaluates to

    ||A||_∞ = max_i Σ_{j=1}^{m} |a_ij|                                   (55)

The L∞-norm is given by the row vector of A with the greatest L1-norm.
For the example matrix:

    A = [r1; r2; r3] = [1 4; 2 5; 3 6]

    r1 = [1 4],   ||r1||_1 = 5
    r2 = [2 5],   ||r2||_1 = 7
    r3 = [3 6],   ||r3||_1 = 9

So ||A||_∞ = 9.

To see why ||A||_∞ is given by the 1-norms of the rows (rather than the
∞-norms of anything!) consider the unit-∞-norm input vectors with
entries ±1:

    x = [1; 1],  [1; -1],  [-1; 1],  [-1; -1]

One of the above vectors, multiplying A, gives the maximum ∞-norm
output. And that max is the greatest 1-norm of a row.

5.2.4 The Frobenius norm

As an alternative to the operator norms induced by the vector norms (the
1-, 2- and ∞-norms, above), one can define a norm directly on the
elements of the matrix. The entry-wise norms are given by:

    ||A||_p = ( Σ_{i=1}^{n} Σ_{j=1}^{m} |a_ij|^p )^{1/p}                 (56)

Note, care must be taken with notation, because the normal notion of the
operator norm is induced, as in section 5.2.2, not given by Eqn (56).

A special case of these so-called entry-wise norms is the Frobenius
norm:

    ||A||_F = ( Σ_{i=1}^{n} Σ_{j=1}^{m} |a_ij|² )^{1/2}                  (57)

The Frobenius norm has the properties of a norm (described in section
5.1).

The Frobenius norm has the additional advantage of being easy to compute
(recall that finding ||A||_2 requires solving the singular value
decomposition).

Additional relationships for the Frobenius norm are given by:

    ||A||_F = [ tr( Aᵀ A ) ]^{1/2}                                       (58)

    ||A||_F = ( Σ_{i=1}^{min(n,m)} σ_i²(A) )^{1/2}                       (59)

where tr() denotes the trace of a matrix, which is the sum of the
elements on the diagonal; and the σ_i in Eqn (59) are the singular
values.
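The three Frobenius-norm identities, Eqns (57)-(59), and the row-sum formula of Eqn (55) can all be checked on the example matrix. A NumPy sketch (not part of the notes):

```python
import numpy as np

A = np.array([[1.0, 4.0],
              [2.0, 5.0],
              [3.0, 6.0]])

# Eqn (55): ||A||_inf is the largest row sum of absolute values
norm_inf = max(np.abs(A).sum(axis=1))

# Eqn (57): entry-wise definition of the Frobenius norm
f_entry = np.sqrt((np.abs(A) ** 2).sum())

# Eqn (58): trace form
f_trace = np.sqrt(np.trace(A.T @ A))

# Eqn (59): sum of squared singular values
f_sv = np.sqrt((np.linalg.svd(A, compute_uv=False) ** 2).sum())

print(norm_inf)                                 # 9.0
print(np.allclose([f_entry, f_trace], f_sv))    # True
```
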

See, for example:

    J-C Lo and M-L Lin, "Robust H∞ Control for Fuzzy Systems with
    Frobenius Norm-Bounded Uncertainties," IEEE Transactions on Fuzzy
    Systems 14(1):1-15.

5.3 Boundedness of an operator

An operator is said to be bounded if there exists a finite γ such that

    ||A x|| ≤ γ ||x||

which is equivalent to saying ||A|| ≤ γ.

5.4 Operator Norms, conclusions

All 4 matrix norm definitions satisfy properties 1-4 of operator norms
(section 5.1.1).

5.5 Adjoint Operators

Adjoint Operator: The adjoint of a linear operator A is denoted A* and
must satisfy the relationship

    <A x, y> = <x, A* y>    for all x and y

The transpose is a special kind of operator, called an adjoint operator.

The Adjoint Operator is a general concept that applies to all types of
vector spaces (such as vectors of polynomials).

For finite vectors on an ortho-normal basis (e.g., x ∈ Rⁿ), a linear
operator is just a matrix, A.

The adjoint of a real-valued operator is just the matrix transpose

    A* = Aᵀ

The adjoint of a complex-valued operator is just the complex-conjugate
transpose, A* = Āᵀ.

Example:

    A = [ 1     2   3+j        A* = [ 1     4+j
          4-j   5   6   ]             2     5
                                      3-j   6   ]

Matlab: A' is A*.
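The defining relation ⟨Ax, y⟩ = ⟨x, A*y⟩ can be checked numerically for the complex example above. This NumPy sketch (not part of the notes) uses the complex inner product ⟨u, v⟩ = ūᵀv:

```python
import numpy as np

A = np.array([[1, 2, 3 + 1j],
              [4 - 1j, 5, 6]])

# For matrices, the adjoint is the conjugate transpose (A' in Matlab)
A_star = A.conj().T

def inner(u, v):
    # complex inner product <u, v> = conj(u)^T v
    return np.vdot(u, v)

rng = np.random.default_rng(0)
x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(2) + 1j * rng.standard_normal(2)

# <A x, y> == <x, A* y> for all x and y
print(np.isclose(inner(A @ x, y), inner(x, A_star @ y)))   # True
```
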

When A = A*, A is a symmetric matrix for the real case; for the complex
case, e.g.,

    A = [ 2     3+j
          3-j   7   ]

A is a Hermitian matrix (Hermitian = complex conjugate transpose, A' in
Matlab).

We can also say that A is a self-adjoint operator.

Hermitian matrices have two important properties that we will use to
develop the singular value decomposition:

  - They always have a complete set of eigenvectors
  - The eigenvalues are always real (even if the matrix is complex)

6 Bay Section 3.3

A system of simultaneous linear equations can be written in the form

    y = A b

Bay addresses the various possibilities of the equation y = A b. We will
come back to this topic after covering the singular value decomposition.

7 Forming the intersection of two vector spaces

Given vector space U, with basis vectors {u1, u2, ..., u_nu}, and vector
space V, with basis vectors {v1, v2, ..., v_nv}, where nu is the
dimension of U and nv is the dimension of V.

Sometimes it is interesting to find the intersection of the two vector
spaces, which is a vector space W given by

    W = U ∩ V                                                            (60)

When U = [u1, u2, ..., u_nu] and V = [v1, v2, ..., v_nv], for vectors
lying in W it must be the case that there are representations on both U
and V, that is

    ∀ w ∈ W,
    w = a1 u1 + a2 u2 + ... + a_nu u_nu
      = b1 v1 + b2 v2 + ... + b_nv v_nv                                  (61)

Eqn (61) can be rewritten as Eqn (62):

    [U, -V] [ a1; a2; ...; a_nu; b1; b2; ...; b_nv ] = 0                 (62)

Which is to say, vectors made of the U and V basis coefficients of
vectors in the intersection must lie in the null space of the matrix
[U, -V].

The dimension of the intersection is the number of linearly independent
solutions to Eqn (62), which is the dimension of the null space of
[U, -V].

Vectors

    w_i = U [ a1; ...; a_nu ]   or equivalently   w_i = V [ b1; ...; b_nv ]   (63)

are basis vectors for W.

7.1 Example

    >> U = [ 1  4          >> V = [ 2  3
             2  5                   3  4
             3  6 ]                 5  4 ]

    >> Null = null([U, -V])
    Null = 0.5000
           0.5000
           0.5000
           0.5000

    >> w1 = U*Null(1:2)
    w1 = 2.5000
         3.5000
         4.5000

Verify that w1 lies in U and V by checking the projection onto each.
(Notice the projection calculation for a non-orthogonal basis set.)

    >> U*inv(U'*U)*U'*w1          >> V*inv(V'*V)*V'*w1
    ans = 2.5000                  ans = 2.5000
          3.5000                        3.5000
          4.5000                        4.5000

8 Conclusion

For finite vectors, linear operators are matrices.

  - The rotation matrix is an example operator
  - Rotation + Reflection matrices → Ortho-normal matrix

Linear Operators have a number of characteristics:

  - They change when we make a change of basis
  - They form a vector space
  - The norm is well defined
  - For any operator, there is an adjoint operator
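The example above can be reproduced with NumPy. This sketch (a translation of the Matlab session, not the notes' code) finds the null space of [U, -V] via the SVD and recovers the intersection basis vector:

```python
import numpy as np

U = np.array([[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]])
V = np.array([[2.0, 3.0], [3.0, 4.0], [5.0, 4.0]])

# Null space of [U, -V] from the SVD: rows of vt whose singular
# values are negligible span the null space
M = np.hstack([U, -V])
u, s, vt = np.linalg.svd(M)
tol = max(M.shape) * np.finfo(float).eps * s[0]
null_basis = vt[int(np.sum(s > tol)):].T    # columns span null([U, -V])

# Split the null vector into U-coefficients a and V-coefficients b
coeffs = null_basis[:, 0]
a, b = coeffs[:2], coeffs[2:]

w_from_U = U @ a
w_from_V = V @ b
print(np.allclose(w_from_U, w_from_V))      # True: both give the same w
print(w_from_U / w_from_U[0] * 2.5)         # proportional to [2.5, 3.5, 4.5]
```
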

EE/ME 701: Advanced Linear Systems

Singular Value Decomposition

Contents

1 Introduction                                                           3
  1.1 Rotation+ Matrices                                                 4
  1.2 Scaling Matrices                                                   5
2 The Singular Value Decomposition                                       7
  2.1 Numerical Example:                                                 8
  2.2 Forms for the SVD                                                  9
      2.2.1 Demonstrating that A = Σ_{k=1}^r u_k σ_k v_kᵀ                9
3 Proof of the SVD theorem (following Will)                             12
  3.1 Solving for the σ_j and u_j, given the v_j                        12
  3.2 Example, good choices for v_j                                     13
  3.3 Example, bad choices for v_j                                      15
  3.4 Choosing the v_j correctly                                        16
  3.5 Proof by construction of the SVD                                  19
4 The generalized inverse of A, A#                                      20
  4.1 Using generalized inverse of A                                    23
      4.1.1 A numerical example using the generalized inverse           24
5 SVD Conclusion                                                        27
6 Determining the rank of A and the four fundamental spaces
  with the SVD                                                          28
  6.1 Example:                                                          31
7 Exact, Homogeneous, Particular and General Solutions to a
  Linear Equation                                                       34
  7.1 The four types of the solution for y = A x                        34
  7.2 Numerical example showing the generalized inverse                 36
8 Additional Properties of the singular values                          40
9 History of the Singular Value Decomposition                           42
10 Conclusions                                                          43

Part 4a: Singular Value Decomposition   (Revised: Sep 10, 2012)   Page 1

1 Introduction

Gilbert Strang calls the SVD:

    "Absolutely a high point of linear algebra."

For a nice explanation of the SVD by Todd Will, see:

    http://www.uwlax.edu/faculty/will/svd

An n × m matrix maps a vector from Rᵐ to Rⁿ:

    y = A x ,    y ∈ Rⁿ ,    x ∈ Rᵐ ,    A ∈ Rⁿˣᵐ                        (1)

[Figure 1: Action of a matrix: A maps x, in the Input Space Rᵐ, onto y,
in the Output Space Rⁿ.]

The relation y = A x is the general linear operator from an Input Vector
Space to an Output Vector Space.

As we shall see, in all cases, the linear transformation from Input
Space to Output Space is made up of 3 parts:

  1. An ortho-normal transformation from input coordinates to singular
     coordinates
  2. Scaling (in singular coordinates)
  3. An ortho-normal transformation from singular to output coordinates

As we saw in the previous section, an orthonormal matrix gives either a
rotation (det R = +1) or a reflection (det R = -1). Some authors use the
term rotation matrix for actions 1 and 3, because it captures the
intuition of what is going on, but rotation alone is not sufficiently
complete.

1.1 Rotation+ Matrices

As we have seen, rotation+ matrices preserve lengths and angles. When
lengths and angles are preserved, shapes are preserved.

[Figure 2: Example: a 45° rotation.]

All full-rank orthonormal matrices are rotation+ matrices. We will use
rotation+ to refer to the orthonormal matrices.

If we can understand y = A x in terms of rotations+ and scaling, many
interesting results and practical methods will follow directly.

1.2 Scaling Matrices

Scaling matrices scale the Euclidean axes of a space. Example scaling
matrices are:

    Σ = [ 1  0        Σ = [ 0.5  0
          0  2 ]            0    3 ]                                     (2)

[Figure 3: Action of a scaling matrix: the X axis is scaled by 0.5 and
the Y axis by 3.0.]

Figure 3 was generated with:

    FF = [ A collection of 2-vectors (specifying points) ]
    Scale = [ 0.5, 0; 0 3.0];
    FF1 = Scale * FF;

A scaling value is also called a singular value, and can be zero:

    Σ = [ 0.5  0
          0    0 ]                                                       (3)

[Figure 4: Action of a scaling matrix with a singular value of zero for
the Y axis: the X axis is scaled by 0.5, the Y axis by 0.0.]

Scaling Matrices:

  1. Are diagonal
  2. Have zero or positive scaling values

A negative value flips the configuration; we'll do this with rotations+.

A scaling matrix does not have to be square. The matrices

    Σ = [ 0.5  0  0          Σ = [ 1  0
          0    3  0 ]              0  2
                                   0  0 ]                         (4), (5)

are also scaling matrices, in operators which map from R³ → R² or
R² → R³, respectively.
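A non-square scaling matrix scales the coordinates it keeps and drops (or zero-pads) the rest. This NumPy sketch (not part of the notes) applies the 2×3 matrix of Eqn (4) to map R³ → R²:

```python
import numpy as np

# 2x3 scaling matrix: maps R^3 -> R^2, scaling x by 0.5 and y by 3,
# while the third coordinate multiplies a zero column and vanishes
Sigma = np.array([[0.5, 0.0, 0.0],
                  [0.0, 3.0, 0.0]])

v = np.array([2.0, 1.0, 7.0])
print(Sigma @ v)          # [1. 3.]  -- the z component is annihilated
```
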

2 The Singular Value Decomposition

The SVD THEOREM:

Every matrix A can be decomposed into:

  1. A rotation+ from input coordinates onto Singular Coordinates (the
     coordinates in which the scaling occurs).
  2. Scaling in singular coordinates
  3. A rotation+ from singular coordinates onto output coordinates.

Said another way:

    ∀ A ∈ Rⁿˣᵐ ,    A = U Σ Vᵀ                                           (6)

and so

    y = A x = U Σ Vᵀ x                                                   (7)

where:

  1. Vᵀ is a rotation+ from input coordinates onto singular coordinates.
  2. Σ is a scaling matrix.
  3. U is a rotation+ from singular coordinates onto output coordinates.

Remarks:

  1. The columns of U form an ortho-normal base for Rⁿ (the output
     space). U is a rotation+ matrix.
  2. The columns of V form an ortho-normal base for Rᵐ (the input
     space). V is a rotation+ matrix.
  3. The scaling matrix Σ is zero everywhere except the main diagonal.

2.1 Numerical Example:

    %% First SVD example
    >> A = [ 2  3 ;
             3  4 ;
             4 -2 ]

    >> [U, S, V] = svd(A)
    U = -0.5661   -0.1622   -0.8082
        -0.7926   -0.1622    0.5878
        -0.2265    0.9733   -0.0367

    S =  6.2450        0
             0    4.3589
             0         0

    V = -0.7071    0.7071
        -0.7071   -0.7071

    %% A can be remade from U, S, V
    >> A1 = U*S*V'
    A1 = 2.0000    3.0000
         3.0000    4.0000
         4.0000   -2.0000

Uniqueness:

  - The singular values are unique.
  - The columns of U and V corresponding to distinct singular values are
    unique (up to scaling by ±1).
  - The remaining columns of U and V must lie in specific vector
    subspaces, as described below.
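The same decompose-and-rebuild check can be done in NumPy (a translation of the Matlab session, not the notes' code). Note that numpy returns Vᵀ directly, where Matlab returns V:

```python
import numpy as np

A = np.array([[2.0, 3.0],
              [3.0, 4.0],
              [4.0, -2.0]])

U, s, Vt = np.linalg.svd(A)
print(np.round(s, 4))                  # [6.245  4.3589]

# Rebuild A: embed the singular values in a 3x2 scaling matrix
Sigma = np.zeros(A.shape)
Sigma[:2, :2] = np.diag(s)
print(np.allclose(U @ Sigma @ Vt, A))  # True

# U and V are orthonormal (rotation+ matrices)
print(np.allclose(U.T @ U, np.eye(3)))  # True
print(np.allclose(Vt @ Vt.T, np.eye(2)))  # True
```
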

2.2 Forms for the SVD

The singular value decomposition can be written

    A = U Σ Vᵀ                                                           (8)

or

    A = Σ_{k=1}^{r} u_k σ_k v_kᵀ                                         (9)

where u_k are the columns of U, v_k are the columns of V, and r is the
rank of A. Eqn (8) is a decomposition into matrices, while Eqn (9) is a
decomposition into individual component vectors.

2.2.1 Demonstrating that A = Σ_{k=1}^r u_k σ_k v_kᵀ

The development of the proof of the SVD and the generalized inverse
rests on the decomposition

    A = Σ_{k=1}^{r} u_k σ_k v_kᵀ                                        (10)

which is the relationship that tells us the connection between a
specific vector v_k in the input space and a specific vector u_k in the
output space.

A general proof of the singular value decomposition theorem follows in
section 3. Here we show that,

    given that a decomposition exists with the form

        A = U Σ Vᵀ                                                      (11)

    where U and V are ortho-normal matrices and Σ is a scaling matrix,
    then a decomposition according to Eqn (9) follows from Eqn (8).

With the SVD

    A = U Σ Vᵀ = [ u1 u2 ... un ] [ σ1  0   0    ] [ v1ᵀ ]
                                  [ 0   σ2  0    ] [ v2ᵀ ]              (12)
                                  [ 0   0   ...  ] [ ... ]
                                                   [ vmᵀ ]

Recall:   p = min(n, m) ,   r = rank(A)

Because of the zeros in the scaling matrix, Eqn (12) reduces to one of
Eqn (13) or (14), below.

When m ≤ n, A is given by Eqn (13), where the columns of U for
j = m+1 .. n multiply zero rows in Σ, and are omitted:

    A = [ u1 u2 ... um ] [ σ1 v1ᵀ ]      when p = m ≤ n,
                         [ σ2 v2ᵀ ]      keep only p columns of U       (13)
                         [ ...    ]
                         [ σm vmᵀ ]

And when n < m, A is given by Eqn (14), where the rows of Vᵀ for
i = n+1 .. m multiply zero columns in Σ, and are omitted:

    A = [ σ1 u1  σ2 u2  ...  σn un ] [ v1ᵀ ]   when p = n < m,
                                     [ v2ᵀ ]   keep only p rows of Vᵀ   (14)
                                     [ ... ]
                                     [ vnᵀ ]

Eqns (13) and (14) both reduce to Eqn (15)

    A = [ σ1 u1  σ2 u2  ...  σp up ] [ v1ᵀ ]
                                     [ v2ᵀ ]                            (15)
                                     [ ... ]
                                     [ vpᵀ ]

Going across the rows on the right, and down the columns on the left,
the elements of σ1 u1 form a vector outer product with the elements of
v1ᵀ, σ2 u2 with v2ᵀ, etc. The mechanics of multiplying the terms of
Eqn (15) give the result

    A = Σ_{k=1}^{p} σ_k u_k v_kᵀ                                        (16)

If r < p then the σ_k , k = {r+1, ..., p} are zero, and those terms can
be dropped from Eqn (16), giving the sought-after form:

    A = Σ_{k=1}^{r} σ_k u_k v_kᵀ                                        (17)

3 Proof of the SVD theorem (following Will)

Following the demonstration given by Will (see:
http://www.uwlax.edu/faculty/will/svd), we first show that given the
v_j, the decomposition

    A = Σ_{k=1}^{r} u_k σ_k v_kᵀ = U Σ Vᵀ                               (18)

can be derived. Then we show how to determine the v_j satisfying a
necessary condition.

3.1 Solving for the σ_j and u_j , given the v_j

Given Eqn (18) we find that

    A v_j = ( Σ_{k=1}^{r} u_k σ_k v_kᵀ ) v_j = u_j σ_j v_jᵀ v_j = u_j σ_j   (19)

where all the terms k ≠ j drop out because v_kᵀ v_j = 0 for k ≠ j, and
v_jᵀ v_j = 1.

Thus, the u_j and v_j vectors and σ_j must satisfy

    A v_j = u_j σ_j                                                     (20)

Eqn (20) immediately solves for all of the σ_j and vectors u_j
corresponding to σ_j > 0:

    σ_j = || A v_j ||                                                   (21)

    u_j = (1/σ_j) A v_j                                                 (22)
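Eqns (21) and (22) give a direct recipe: feed in a right singular vector, read off σ_j as the output length and u_j as the output direction. A NumPy sketch (not part of the notes):

```python
import numpy as np

A = np.array([[1.0, 4.0],
              [2.0, 5.0],
              [3.0, 6.0]])

# Take the v_j from the SVD (rows of Vt are the v_j^T)
U, s, Vt = np.linalg.svd(A)

for j in range(Vt.shape[0]):
    Av = A @ Vt[j]
    sigma_j = np.linalg.norm(Av)       # Eqn (21): sigma_j = ||A v_j||
    u_j = Av / sigma_j                 # Eqn (22): u_j = (1/sigma_j) A v_j
    print(j, round(float(sigma_j), 3), np.round(u_j, 2))
```

Because A v_j = σ_j u_j exactly for the SVD's own singular vectors, the computed σ_j and u_j reproduce the diagonal of S and the first columns of U.
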

3.2 Example, good choices for v_j

Consider

    A = [ 1  4
          2  5
          3  6 ]

Choose:

    V = [ 0.39  -0.92
          0.92   0.39 ]

Doing the calculations of Eqns (21) and (22):

    A v1 = [ 4.08              σ1 = || [4.08; 5.38; 6.69] || = 9.51
             5.38
             6.69 ]

    u1 = (1/9.51) [ 4.08       = [ 0.43
                    5.38           0.57
                    6.69 ]         0.70 ]

Repeating steps (21) and (22) for v2 gives:

    Σ = [ 9.51  0              U = [ 0.43   0.81  |
          0     0.77               [ 0.57   0.11  ?
          0     0    ] ,           [ 0.70  -0.58  | ]

Then A = U Σ Vᵀ gives:

    A = [ 0.43   0.81  |   [ 9.51  0
        [ 0.57   0.11  ?   [ 0     0.77    [  0.39  0.92
        [ 0.70  -0.58  | ] [ 0     0    ]  [ -0.92  0.39 ]

      = [ 1.00  4.00
          2.00  5.00                                                    (23)
          3.00  6.00 ]

For U to be an ortho-normal matrix, u1 and u2 must be orthogonal and
normal.

  - They are automatically normalized by the calculation of Eqn (22).
  - Check that they are orthogonal:

    >> u1 = U(:,1)               >> u2 = U(:,2)
    u1 = -0.4287                 u2 =  0.8060
         -0.5663                       0.1124
         -0.7039                      -0.5812

    >> u1'*u2
    ans = 1.1102e-16

Remarks:

  - Our algorithm does not tell us how to choose u_j if σ_j is zero or
    non-existent. But either way, it doesn't matter for Eqn (23).
  - Since U is an ortho-normal matrix, the missing columns (marked "?")
    must be an orthonormal basis for the orthogonal complement of the
    u_j , j = 1..r.

3.3 Example, bad choices for v_j

What happens if we don't choose the correct ortho-normal matrix V?

For example, choose the V corresponding to a rotation of 20°:

    V = [ 0.94  -0.34
          0.34   0.94 ]

Doing the calculations of Eqns (21) and (22):

    A v1 = [ 2.31              σ1 = || [2.31; 3.59; 4.87] || = 6.48
             3.59
             4.87 ]

    u1 = (1/6.48) [ 2.31       = [ 0.36
                    3.59           0.55
                    4.87 ]         0.75 ]

Repeating steps (21) and (22) for v2 gives:

    Σ = [ 6.48  0              U = [ 0.36  0.49  |
          0     7.00               [ 0.55  0.57  ?
          0     0    ] ,           [ 0.75  0.66  | ]

For U to be an ortho-normal matrix, its columns must be orthonormal. The
columns are normal automatically, by step (22), but orthogonal?

    < [ 0.36        [ 0.49
        0.55    ,     0.57   >  =  0.99  ≠  0
        0.75 ]        0.66 ]

No! If V is not correctly chosen, U does not turn out to be an
ortho-normal matrix! (Yet to be proven.)

3.4 Choosing the v_j correctly

For finding the σ_j and u_j according to Eqns (21) and (22), the
challenge is to find v_j which are orthonormal, such that the u_j are
orthogonal.

This requires, for j ≠ i, uᵀ_j u_i = 0, and so

    uᵀ_j u_i = (1/σ_j)(1/σ_i) vᵀ_j Aᵀ (A v_i) = 0                       (24)

When σ_j is zero or non-existent we are free to select u_j in the
orthogonal complement of {u_i} (the left-null space of A), so we are
interested in the case σ_j , σ_i > 0.

In this case (1/σ_j)(1/σ_i) ≠ 0, so Eqn (24) is verified if and only if

    vᵀ_j Aᵀ (A v_i) = vᵀ_j Aᵀ A v_i = 0                                 (25)

Lemma 1: When the vectors v_i are ortho-normal and span Rᵐ, then

    vᵀ_j Aᵀ A v_i = 0  ∀ j ≠ i   if and only if   Aᵀ A v_i = λ_i v_i    (26)

in which case vᵀ_j Aᵀ A v_i = λ_i vᵀ_j v_i = 0.

Proof:

  - Necessary: Because the v_i are ortho-normal, the vectors v_j , j ≠ i
    span the entire (m-1)-dimensional space orthogonal to v_i, therefore
    Aᵀ A v_i must lie in the 1-D space spanned by v_i.
  - Sufficient: Since Aᵀ A v_i = λ_i v_i , then vᵀ_j Aᵀ A v_i =
    vᵀ_j λ_i v_i = 0, because v_j ⊥ v_i.                             QED

The solutions to Eqn (26) are the eigenvalues and eigenvectors of
Q = Aᵀ A.

Quick review of eigenvalues and eigenvectors:

  - Square matrices have special vectors which give a scaled version of
    themselves back under multiplication. For example:

        [ 1  1 ] [ 1 ]  =  [ 2 ]  =  2 [ 1 ]
        [ 1  1 ] [ 1 ]     [ 2 ]       [ 1 ]                            (27)

  - These special vectors are called eigenvectors and the scale factors
    are called the eigenvalues.
  - For an m × m matrix, there can be at most m linearly independent
    eigenvectors.
  - Eigenvectors can be scaled (consider multiplying left and right of
    Eqn (26) by a scale factor). We usually normalize eigenvectors.

In the general case of an m × m matrix, eigenvalues and eigenvectors can
be a bit messy:

  - They can be complex.
  - The eigenvectors will not, in general, be orthogonal, and
  - There may not even be a complete set of eigenvectors (in which case
    we say the matrix is defective).

However, when Q is a Hermitian matrix (as Q = Aᵀ A must be) we are in
luck.

Lemma 2: When Q is Hermitian (that is, when Q = Q̄ᵀ, where the overbar
is the complex conjugate), then

  1. There is a complete set of eigenvectors, and furthermore, they are
     orthogonal (ortho-normal, when we normalize them), and
  2. The eigenvalues are always real (even if Q is complex).

Remarks:

  1. A symmetric matrix, such as Q = Aᵀ A , is a real Hermitian matrix.
  2. Further discussion is deferred until our in-depth discussion of the
     eigen-system of a matrix.
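Lemma 2 is easy to check numerically: for Q = AᵀA the eigenvalues come out real and non-negative and the eigenvectors orthonormal. A NumPy sketch (not part of the notes), using eigh, which is specialized for Hermitian matrices:

```python
import numpy as np

A = np.array([[1.0, 4.0],
              [2.0, 5.0],
              [3.0, 6.0]])
Q = A.T @ A                      # Hermitian (symmetric, since A is real)

# eigh returns real eigenvalues (ascending) and orthonormal eigenvectors
lam, V = np.linalg.eigh(Q)

print(np.round(lam, 3))                    # [ 0.597 90.403]
print(np.allclose(V.T @ V, np.eye(2)))     # True: eigenvectors orthonormal

# Square roots of the eigenvalues are the singular values of A
print(np.allclose(np.sqrt(lam[::-1]),
                  np.linalg.svd(A, compute_uv=False)))   # True
```
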

3.5 Proof by construction of the SVD

With the above lemmas taken together, for any matrix A:

  - Aᵀ A is Hermitian (symmetric if A is real).
  - Lemma 2 shows that Aᵀ A has a complete set of real, orthogonal
    eigenvectors.
  - Lemma 1 shows that these eigenvectors satisfy the condition to be
    the input basis vectors of the SVD, the v_j in

        A = Σ_{k=1}^{r} u_k σ_k v_kᵀ = U Σ Vᵀ                           (28)

  - Once the v_j are known, determine the σ_j and u_j corresponding to
    σ_j ≠ 0 according to

        σ_j = || A v_j ||                                               (29)

        u_j = (1/σ_j) A v_j                                             (30)

  - The remaining u_j are determined as basis vectors for the orthogonal
    complement of the u_j corresponding to σ_j ≠ 0.

Following these steps, the singular value decomposition of A is
constructed.                                                         QED

4 The generalized inverse of A, A#

Given the equation

    y = A x

with A and y given, we are looking for a particular solution that solves

    ŷ = A x_p

where

  - ŷ is the projection of y onto the column space of A
  - x_p lies in the row space of A

Whenever we have an ortho-normal basis for the column space of A

    Uc = [ u1  u2  ...  ur ]                                            (31)

we may find ŷ by projection:

    ŷ = Uc Ucᵀ y = Σ_{k=1}^{r} u_k u_kᵀ y                               (32)

Writing Eqn (32) out graphically shows

    ŷ = [ u1 u2 ... ur ] [ u1ᵀ ] y  =  [ u1 u2 ... ur ] [ u1ᵀ y ]
                         [ u2ᵀ ]                        [ u2ᵀ y ]
                         [ ... ]                        [  ...  ]
                         [ urᵀ ]                        [ urᵀ y ]

      = [ u1 u2 ... ur ] [ a1 ]  =  a1 u1 + a2 u2 + ... + ar ur         (33)
                         [ a2 ]
                         [ ...]
                         [ ar ]

Eqn (33) shows that the projection of y onto the column space of A can
be written

    ŷ = a1 u1 + a2 u2 + ... + ar ur                                     (34)

with

    a_k = u_kᵀ y                                                        (35)

So far there is nothing surprising about Eqns (33-35). They say that we
can write the output of

    ŷ = A x_p                                                           (36)

as a linear combination of basis vectors on the column space of A.

  - We can write Eqns (33-35) whenever we have Uc, an ortho-normal basis
    for the column space of A.
  - Without the SVD, the analysis would stop here; there is no way to
    discover the contributions of x giving the basis coefficients a_k.

With the SVD there is a way to discover the contributions of x to each
term! We know each term in the output is given by:

    ŷ = A x_p = ( Σ_{k=1}^{r} u_k σ_k v_kᵀ ) x_p
      = a1 u1 + a2 u2 + ... + ar ur                                     (37)

with

    a_k = σ_k v_kᵀ x                                                    (38)

Any vector x_p in the row space of A can be written

    x_p = b1 v1 + b2 v2 + ... + br vr                                   (39)

When we replace x in Eqn (38) with the expanded x of (39), we get

    a_k = σ_k v_kᵀ x = σ_k v_kᵀ ( b1 v1 + b2 v2 + ... + br vr )         (40)

But the basis vectors v_k are ortho-normal, so

    a_k = σ_k v_kᵀ ( b1 v1 + ... + br vr ) = σ_k v_kᵀ v_k b_k = σ_k b_k  (41)

Putting the pieces together: using Eqns (35) and (41) gives b_k, the
basis coefficients of x_p

    b_k = (1/σ_k) a_k = (1/σ_k) u_kᵀ y                                  (42)

Plugging the b_k into (39) to find x_p gives

    x_p = Σ_{k=1}^{r} v_k b_k = Σ_{k=1}^{r} v_k (1/σ_k) u_kᵀ y
        = ( Σ_{k=1}^{r} v_k (1/σ_k) u_kᵀ ) y = A# y                     (43)

where the term in parentheses gives the generalized inverse:

    A# = Σ_{k=1}^{r} v_k (1/σ_k) u_kᵀ                                   (44)

Or in Matlab:

    >> Asharp = pinv(A)
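Eqn (44) builds the generalized inverse directly from the SVD factors. This NumPy sketch (not part of the notes) forms A# as a sum of rank-one terms and checks it against numpy's pinv:

```python
import numpy as np

A = np.array([[1.0, 4.0],
              [2.0, 5.0],
              [3.0, 6.0]])

U, s, Vt = np.linalg.svd(A)
# numerical rank: count singular values above a tolerance
r = int(np.sum(s > max(A.shape) * np.finfo(float).eps * s[0]))

# Eqn (44): A# = sum_k v_k (1/sigma_k) u_k^T, over the r non-zero sigmas
Asharp = sum(np.outer(Vt[k], U[:, k]) / s[k] for k in range(r))

print(np.allclose(Asharp, np.linalg.pinv(A)))   # True
```
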

4.1 Using generalized inverse of A

Given the relationship

    y = A x                                                             (45)

with data vector y, find

    A# = Σ_{k=1}^{r} v_k (1/σ_k) u_kᵀ                                   (46)

and the solution

    x_p = A# y                                                          (47)

lies in the row space of A, and gives

    ŷ = A x_p                                                           (48)

with ŷ the projection of y onto the column space of A.

  - In Eqn (46) there is no matrix inverse step.
  - Eqn (46) always works, for every shape or rank of matrix.
  - Eqn (46) gives a new degree of freedom: we can choose r!
    (More on this at the end of the notes on SVD.)

4.1.1 A numerical example using the generalized inverse

Consider the data

    A = [ 1  5   6        ybar = [ 6
          2  6   8                 5
          3  7  10                 5
          4  8  12 ]               3 ]

With y = A x, find x_p.

Before proceeding, let's find the correct ŷ, the projection of ybar onto
col(A):

    >> W = GramSchmidt(A)
    W = 0.1826    0.8165
        0.3651    0.4082
        0.5477    0.0000
        0.7303   -0.4082

    >> yhat = W*W'*ybar
    yhat = 6.1000
           5.2000
           4.3000
           3.4000

Trying the left pseudo-inverse:

    >> xp1 = inv(A'*A) * A' * ybar
    Warning: Matrix is close to singular or badly scaled.
             Results may be inaccurate. RCOND = 8.606380e-18.
    xp1 = -2.8750
           0.6875
          -1.0000

Double checking whether A*xp1 gives ŷ:

    >> A*xp1
    ans =  -5.4375
           -9.6250
          -13.8125
          -18.0000

A mile off from ŷ! (The warning is to be taken seriously!)

Now with the generalized inverse (also called the pseudo-inverse):

    >> [U,S,V] = svd(A)
    U = -0.3341    0.7671   -0.4001   -0.3741
        -0.4359    0.3316    0.2546    0.7970
        -0.5378   -0.1039    0.6910   -0.4717
        -0.6396   -0.5393   -0.5455    0.0488

    S = 23.3718        0         0
             0    1.3257         0
             0         0    0.0000
             0         0         0

    V = -0.2301   -0.7834   -0.5774
        -0.5634    0.5910   -0.5774
        -0.7935   -0.1924    0.5774

There are 2 non-zero σ's, so using Eqn (46):

    >> Asharp = V(:,1)*(1/S(1,1))*U(:,1)' + V(:,2)*(1/S(2,2))*U(:,2)'
    Asharp = -0.4500   -0.1917    0.0667    0.3250
              0.3500    0.1583   -0.0333   -0.2250
             -0.1000   -0.0333    0.0333    0.1000

Eqn (46) is implemented by the pinv() command in Matlab:

    >> Asharp = pinv(A)
    Asharp = -0.4500   -0.1917    0.0667    0.3250
              0.3500    0.1583   -0.0333   -0.2250
             -0.1000   -0.0333    0.0333    0.1000

    >> xp = Asharp * ybar           or    >> xp = pinv(A)*ybar
    xp = -2.3500                          xp = -2.3500
          2.0500                                2.0500
         -0.3000                               -0.3000

Double checking that A*xp gives ŷ:

    >> A*xp
    ans = 6.1000
          5.2000
          4.3000
          3.4000

Previously we saw three cases for the equation

    y = A x ,    A ∈ Rⁿˣᵐ                                               (49)

    Case                 Size of A   Tool                  Must Exist
    -----------------------------------------------------------------
    Exactly constrained  n = m       Matrix Inverse        A⁻¹
    Over constrained     n > m       Left Pseudo-Inverse   (Aᵀ A)⁻¹
    Under constrained    n < m       Right Pseudo-Inverse  (A Aᵀ)⁻¹

    Table 1: Three cases for the solution of y = A x

Each of these cases requires that a matrix be invertible, or other
methods are required. With the generalized inverse

    x_p = Σ_{k=1}^{r} v_k b_k = Σ_{k=1}^{r} v_k (1/σ_k) u_kᵀ y = A# y

    with    A# = Σ_{k=1}^{r} v_k (1/σ_k) u_kᵀ                           (50)

Equation (50) is valid in all cases where A ≠ 0.

  - There is no matrix inversion.
  - At most the basis vectors corresponding to σ_k > tol are retained
    (we have the freedom to choose r < rank(A)).
  - If A ≠ 0, at least one σ_k is guaranteed to be greater than zero.
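The rank-deficient example above can be reproduced in NumPy (a translation of the Matlab session, not the notes' code). The normal-equation ("left pseudo-inverse") route fails because AᵀA is singular, while the SVD-based pseudo-inverse recovers x_p:

```python
import numpy as np

A = np.array([[1.0, 5.0, 6.0],
              [2.0, 6.0, 8.0],
              [3.0, 7.0, 10.0],
              [4.0, 8.0, 12.0]])   # column 3 = column 1 + column 2: rank 2
ybar = np.array([6.0, 5.0, 5.0, 3.0])

# A^T A is a singular 3x3 matrix, so (A^T A)^{-1} A^T breaks down
print(np.linalg.matrix_rank(A.T @ A))      # 2

# The generalized inverse works regardless of rank
xp = np.linalg.pinv(A) @ ybar
print(np.round(xp, 4))                     # [-2.35  2.05 -0.3 ]

# A xp reproduces yhat, the projection of ybar onto col(A)
print(np.round(A @ xp, 4))                 # [6.1 5.2 4.3 3.4]
```
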

5 SVD Conclusion

The singular value decomposition gives us an expansion for any real
matrix A of the form

    A = U Σ Vᵀ

    Vᵀ: rotation+ from input coordinates to singular coordinates,
        where V = [ v1 ... vm ] ∈ Rᵐˣᵐ

    Σ:  scaling in singular coordinates, where Σ ∈ Rⁿˣᵐ (same as A)

    U:  rotation+ from singular coordinates to output coordinates,
        where U = [ u1 ... un ] ∈ Rⁿˣⁿ

Because the columns of U and V are orthonormal, we have a second
expansion for A:

    A = Σ_{k=1}^{r} u_k σ_k v_kᵀ

We have seen that if the v_i are the eigenvectors of Aᵀ A then the σ_k
and u_k are straight-forward to compute.

  - Since Q = Aᵀ A is a Hermitian matrix, the eigenvectors exist and are
    orthogonal.
  - Note: if n < m it may be numerically more convenient to compute the
    eigenvectors of A Aᵀ, which is a smaller matrix.
    (Student exercise: how would the SVD be determined from the
    eigenvectors of A Aᵀ?)

6 Determining the rank of A and the four fundamental spaces with the SVD

The rank of a matrix is the number of singular values which are greater
than a tolerance.

  - Recall, singular values are always positive or zero.
  - In any numerical calculation we have round-off error; σ_j may be
    greater than zero due to round-off error.
  - So we set a tolerance for the minimum output of y = A b which will
    be considered non-zero.
  - Matlab's default value is given by:

        tol = max(size(A)) * norm(A) * eps
        >> eps = 2.2204e-16   (for 64 bit floating point calculation)

The expansion of A is given as:

    U Σ Vᵀ = [ u1 ... ur ur+1 ... un ] [ σ1              ] [ v1ᵀ   ]
                                       [    ...          ] [ ...   ]
                                       [        σr       ] [ vrᵀ   ]
                                       [           0     ] [ vr+1ᵀ ]
                                       [            ...  ] [ ...   ]
                                       [               0 ] [ vmᵀ   ]

where σ1 ... σr are the singular values greater than the tolerance, and
the rest of the diagonal elements of Σ are effectively zero.

As we are about to see, bases for the four fundamental spaces of a
matrix are given directly from the singular value decomposition.

U Σ Vᵀ partitions according to the non-zero singular values:

                                      [ σ1          |     ] [ v1ᵀ   ]
    A = [ u1 ... ur | ur+1 ... un ]   [    ...      |  0  ] [ ...   ]
                                      [        σr   |     ] [ vrᵀ   ]
                                      [ ------------+---- ] [ vr+1ᵀ ]
                                      [      0      |  0  ] [ ...   ]
                                                            [ vmᵀ   ]

where the columns of U and V provide bases for the Column, Left-Null,
Row and Null spaces:

    Uc  = [ u1   ... ur ]      (column space)
    Uln = [ ur+1 ... un ]      (left-null space)
    Vr  = [ v1   ... vr ]      (row space)
    Vn  = [ vr+1 ... vm ]      (null space)

A program to give the four fundamental spaces:

    function [Row, Col, Null, Lnull] = spaces(M)
    %
    %   function [Row, Col, Null, Lnull] = spaces(M)
    %
    %   Return the four fundamental spaces
    %
        [n,m] = size(M);
        [u,s,v] = svd(M);
        r = rank(M);
        %% select spaces
        Row   = v(:,1:r);
        Null  = v(:,(r+1):m);
        Col   = u(:,1:r);
        Lnull = u(:,(r+1):n);
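A NumPy equivalent of the spaces() function above (a sketch, not the notes' code), using the same max(size)·norm(A)·eps tolerance rule to set the numerical rank:

```python
import numpy as np

def spaces(M):
    """Return orthonormal bases (Row, Col, Null, Lnull) of the four
    fundamental spaces of M, taken from the columns of U and V."""
    n, m = M.shape
    u, s, vt = np.linalg.svd(M)
    tol = max(n, m) * np.finfo(float).eps * (s[0] if s.size else 0.0)
    r = int(np.sum(s > tol))
    Row   = vt[:r].T          # first r right singular vectors
    Null  = vt[r:].T          # remaining right singular vectors
    Col   = u[:, :r]          # first r left singular vectors
    Lnull = u[:, r:]          # remaining left singular vectors
    return Row, Col, Null, Lnull

# Rank-2 example: column 3 = column 1 + column 2
A = np.array([[1.0, 0.0, 1.0],
              [2.0, 0.0, 2.0],
              [2.0, 1.0, 3.0],
              [1.0, 1.0, 2.0]])
Row, Col, Null, Lnull = spaces(A)
print(Row.shape, Col.shape, Null.shape, Lnull.shape)  # (3, 2) (4, 2) (3, 1) (4, 2)
print(np.allclose(A @ Null, 0))                       # True
```
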

6.1 Example:

Consider

    A = [ 1  0  1
          2  0  2
          2  1  3                                                       (51)
          1  1  2 ]

The four fundamental spaces of matrix A can be found through the SVD:

    >> [U, S, V] = svd(A)
    U = -0.2543    0.3423    0.8987   -0.1030
        -0.5086    0.6846   -0.3615    0.3769
        -0.6947   -0.2506   -0.1757   -0.6509
        -0.4405   -0.5928    0.1757    0.6509

    S =  5.3718        0         0
             0    1.0694         0
             0         0    0.0000
             0         0         0

    V = -0.5774    0.5774    0.5774
        -0.2113   -0.7887    0.5774
        -0.7887   -0.2113   -0.5774

We need to take a look at the third singular value to see if it is
approximately zero, or just quite small:

    >> ss = diag(S); ss(3)
    ans = 4.2999e-16

σ3 is effectively zero (zero to within the round-off error), so A is
rank 2. Now form the 4 fundamental spaces.

    >> [n, m] = size(A)
    n = 4 ,   m = 3

    >> r = sum(ss > norm(A)*max(n,m)*eps)
    r = 2

Since the rank is 2, a basis for the row space is given by the first 2
columns of V:

    >> Vr = V(:, 1:r)
    Vr = -0.5774    0.5774
         -0.2113   -0.7887
         -0.7887   -0.2113

The null space is spanned by the remaining columns of V:

    >> Vn = V(:, (r+1):m)
    Vn = 0.5774
         0.5774
        -0.5774

Since the rank is 2, a basis for the column space is given by the first
2 columns of U:

    >> Uc = U(:, 1:r)
    Uc = -0.2543    0.3423
         -0.5086    0.6846
         -0.6947   -0.2506
         -0.4405   -0.5928

The left null space is spanned by the remaining columns of U:

    >> Un = U(:, (r+1):n)
    Un = 0.8987   -0.1030
        -0.3615    0.3769
        -0.1757   -0.6509
         0.1757    0.6509

7 Exact, Homogeneous, Particular and General Solutions to a Linear Equation

7.1 The four types of the solution for y = A x

There are four elements in the solution for

    y = A x                                                             (52)

that we need to consider. The terminology is similar to that for
differential equations:

Exact Solution: An exact solution is one that exactly satisfies Eqn
(52). Recall that if x is over-constrained (more y's than x's), then we
can get a solution which is not an exact solution.

Homogeneous Solution: A homogeneous solution is a solution to:

    A x_h = 0

There is the trivial homogeneous solution, x = 0. To have more
interesting homogeneous solutions, matrix A must have a non-trivial null
space. Suppose the null space is given by a basis { n1, n2, ..., n_nn }:

    x_h = α1 n1 + α2 n2 + ... + α_nn n_nn ,    A x_h = 0                (53)

If there is any non-trivial homogeneous solution, there will be
infinitely many of them.

Particular Solution: A particular solution is one that solves Eqn (52), either
    exactly or in a least-squared-error sense. x_p is the particular solution,
    and ŷ = A x_p . The residual,

        ỹ = y − ŷ                                                        (54)

    may be zero (an exact solution) or may be minimal.

    Each of the terms relating to the particular solution comes from a
    fundamental space:
      - ŷ lies in the column space.
      - x_p lies in the row space.
      - ỹ lies in the left-null space.

    Recall the several different norms; minimizing the residual under each
    different norm can give a different meaning to x_p . One can assume that
    ỹ is minimized in the least-squares sense, unless a different norm is
    given. When working with the 2-norm (the usual case), x_p is given as

        x_p = A# y                                                       (55)

General Solution: The general solution is a parameterized solution; it is the
    set of all solutions which give ŷ:

        x_g = x_p + x_h ,    ŷ = A x_g                                   (56)

    If the null space and any particular solution are known, then all of the
    general solutions are given by Eqn (56).

7.2 Numerical example showing the generalized inverse

Continuing with the example in section 6.1,

>> Uc = U(:, 1:r)
Uc = [ -0.2543    0.3423
       -0.5086    0.6846
       -0.6947   -0.2506
       -0.4405   -0.5928 ]

>> Vr = V(:, 1:r)
Vr = [ -0.5774    0.5774
       -0.2113   -0.7887
       -0.7887   -0.2113 ]

>> SigmaR = S(1:r, 1:r)
SigmaR = [ 5.3718         0
                0    1.0694 ]

Keep in mind: A = Uc Σ_r Vr'. Even though A ∈ R^(4×3), Σ_r is a 2x2 matrix,
and Uc and Vr each have 2 columns; 2 is the rank of matrix A.

ŷ is given as the projection of y onto the column space. Choose an example y,
and project y onto the column space:

>> Y = [ 1 2 3 4 ]'

>> Yhat = Uc * Uc' * Y
Yhat = [ 0.8182
         1.6364
         3.9091
         3.0909 ]
Page 36

Now compute x_p:

>> Xp = Vr * inv(SigmaR) * Uc' * Y
Xp = [ -0.2121
        1.2424
        1.0303 ]

Double check that A x_p gives ŷ:

>> A * Xp
ans = [ 0.8182
        1.6364
        3.9091
        3.0909 ]

Let's take a look at the residual ỹ = y − ŷ:

ytilde =  0.1818
          0.3636
         -0.9091
          0.9091

ỹ should lie in the left-null space. The basis for the left null space is Un:

>> Un = U(:, (r+1):NCols)
Un = [ 0.8987   -0.1030
      -0.3615    0.3769
      -0.1757   -0.6509
       0.1757    0.6509 ]

Checking that ỹ lies in the span of Un:

>> ytilde2 = Un * Un' * ytilde
ytilde2 =  0.1818
           0.3636
          -0.9091
           0.9091

The example y = A x looks like an over-constrained problem (4 data, 3
unknowns), so try the matrix pseudo-inverse:

>> XX = inv(A'*A) * A' * Y
Warning: Matrix is close to singular or badly scaled.
         Results may be inaccurate. RCOND = 5.139921e-18.
XX = [ -6.2500
        1.5000
        4.2500 ]

Check the result (the warning from inv() is to be taken seriously!):

    A * XX     A * Xp        Yp         Y
  --------   --------   --------   -----
   -2.0000     0.8182     0.8182      1
   -4.0000     1.6364     1.6364      2
    1.7500     3.9091     3.9091      3
    3.7500     3.0909     3.0909      4
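The rank-truncated pseudo-inverse used above can be sketched in NumPy (illustrative; numpy's own `pinv` applies the same truncation internally):

```python
import numpy as np

# x_p = Vr SigmaR^{-1} Uc^T y for the rank-2 example; the residual
# y - A x_p should be orthogonal to the column space of A.
A = np.array([[1., 0., 1.],
              [2., 0., 2.],
              [2., 1., 3.],
              [1., 1., 2.]])
y = np.array([1., 2., 3., 4.])

U, s, Vt = np.linalg.svd(A)
r = 2
xp = Vt[:r, :].T @ np.diag(1.0 / s[:r]) @ U[:, :r].T @ y
print(np.round(xp, 4))               # ~ [-0.2121  1.2424  1.0303]

resid = y - A @ xp                   # lies in the left null space
```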

The general solution to y = A x is given as:

    x_g = x_p + x_h

In some cases the null space will be empty, so x_g = x_p . But in this case
dim null(A) = 1:

    x_h = a_1 v_m = a_1 [  .577
                           .577
                          -.577 ]

So x_g is given as:

    x_g = [ -0.2121 ]        [  .577
             1.2424   + a_1     .577                                     (57)
             1.0303 ]          -.577 ]

Coefficient a_1 can be used to satisfy one constraint, such as to make
x_g(2) = 1.0, which gives a_1 = −.2424/.577 = −0.42:

    x_g = [ -0.2121 ]         [  .577
             1.2424   − 0.42     .577
             1.0303 ]           -.577 ]

>> Xg = Xp - 0.42*Xh
Xg = [ -0.455
        1.000
        1.273 ]

>> A * Xg
ans =  0.8182      %% equals Yhat
       1.6364
       3.9091
       3.0909

8 Additional Properties of the singular values

Define σ_max as the largest singular value of A and σ_min as the smallest
singular value of A. Also, order the singular values so that σ1 = σ_max, ...,
σp = σ_min.

    ||A||_2 = σ_max

By writing y = Σ_(k=1..r) u_k σ_k v_k' x , it is clear that the largest
||y||_2 is given when x corresponds to the largest singular value of A.
Choosing x = v_1 gives y = σ1 u_1 and ||y||_2 = σ1.

The condition number of A is given as

    cond(A) = σ_max / σ_min

The condition number is a measure of how much round-off errors can be
expanded by multiplying a matrix by A.

Consider that when we form A# we use 1/σ_k ; if there is a σ_k which is very
small, there will be terms in A# which are very large. With

    x = A# y

if there are noise components in y, when they hit the large terms in A# they
will give large errors.

Example: A problem with cond(A) = 10 is said to be well conditioned; if
cond(A) = 10000 the problem is said to be poorly conditioned. If σ_min = 0,
the condition number is unbounded.

A method to handle poorly conditioned problems:

    use  A# = Σ_(k=1..q) v_k (1/σ_k) u_k'   with some q < r

That is, throw out the smallest singular values.
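The ratio definition of the condition number can be verified numerically; the matrix B below is chosen purely for illustration (it is not from the notes):

```python
import numpy as np

# cond(A) = sigma_max / sigma_min; numpy's cond() computes the same ratio
# (in the 2-norm) from the singular values.
B = np.array([[1., 0.],
              [0., 1e-3]])
s = np.linalg.svd(B, compute_uv=False)
cond = s[0] / s[-1]                  # -> 1000.0
print(cond, np.linalg.cond(B))
```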

EE/ME 701: Advanced Linear Systems

Section 8.0.0

The absolute value of the determinant of any matrix is the product of its
singular values
p

|det (A)| = k

(58)

k=1

Corollary: if any k = 0, then det (A) = 0.


Recall that when A Rnn
n

det (A) = k
k=1

So for square matrices,




n
p

k = k
k=1 k=1

EE/ME 701: Advanced Linear Systems

Section 10.0.0
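Both determinant identities are easy to check numerically on a small matrix (chosen here for illustration; not from the notes):

```python
import numpy as np

# |det(A)| equals the product of the singular values; det(A) itself is the
# product of the eigenvalues.  Here det = 2*3 - 1*1 = 5.
A = np.array([[2., 1.],
              [1., 3.]])
s = np.linalg.svd(A, compute_uv=False)
lam = np.linalg.eigvals(A)
print(np.prod(s), abs(np.linalg.det(A)), np.prod(lam).real)
```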

9 History of the Singular Value Decomposition

The singular value decomposition was originally developed by differential
geometers, who wished to determine whether a real bilinear form could be made
equal to another by independent orthogonal transformations of the two spaces
it acts on. Eugenio Beltrami and Camille Jordan discovered independently,
in 1873 and 1874 respectively, that the singular values of the bilinear forms,
represented as a matrix, form a complete set of invariants for bilinear forms
under orthogonal substitutions. James Joseph Sylvester also arrived at the
singular value decomposition for real square matrices in 1889, apparently
independently of both Beltrami and Jordan. Sylvester called the singular
values the canonical multipliers of the matrix A. The fourth mathematician to
discover the singular value decomposition independently was Autonne in 1915,
who arrived at it via the polar decomposition. The first proof of the singular
value decomposition for rectangular and complex matrices seems to be by Eckart
and Young in 1936; they saw it as a generalization of the principal axis
transformation for Hermitian matrices.

In 1907, Erhard Schmidt defined an analog of singular values for integral
operators (which are compact, under some weak technical assumptions); it seems
he was unaware of the parallel work on singular values of finite matrices.
This theory was further developed by Émile Picard in 1910, who was the first
to call the numbers σ_k singular values (or rather, valeurs singulières).

Practical methods for computing the SVD date back to Kogbetliantz in 1954,
1955 and Hestenes in 1958, resembling closely the Jacobi eigenvalue algorithm,
which uses plane rotations or Givens rotations. However, these were replaced
by the method of Gene Golub and William Kahan published in 1965 (Golub & Kahan
1965), which uses Householder transformations or reflections. In 1970, Golub
and Christian Reinsch published a variant of the Golub/Kahan algorithm that is
still the one most used today.

[Wikipedia]

10 Conclusions

The Singular Value Decomposition gives us a geometric picture of
matrix-vector multiplication, comprised of:
  - a rotation, plus
  - a scaling, plus
  - a rotation.

Using the SVD we can find basis vectors for the four fundamental spaces:
  - the basis sets are ortho-normal, and
  - the basis vectors of the row and column spaces are linked by the
    singular values.

Numerically, computation of the SVD is robust because computation of the
eigenvectors of a symmetric matrix is robust.

The SVD can be used to compute:
  - rank(A)
  - ||A||_2
  - cond(A)
  - ortho-normal basis vectors for the four fundamental spaces
  - the solution to y = A x for the general case, without matrix inversion.

EE/ME 701: Advanced Linear Systems

Eigenvalues, Eigenvectors and the Jordan Form

Contents

1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .  3
  1.1 Review of basic facts about eigenvectors and eigenvalues . . . . . .  3
      1.1.1 Looking at eigenvalues and eigenvectors in relation to the
            null space of (λ I − A) . . . . . . . . . . . . . . . . . . .  3
      1.1.2 Repeated eigenvalues . . . . . . . . . . . . . . . . . . . . .  4
  1.2 Analysis of the structure of the eigensystem of a matrix . . . . . .  6

2 Properties of the Eigensystem . . . . . . . . . . . . . . . . . . . . .  8
  2.1 Bay section 4.1, A-Invariant Subspaces . . . . . . . . . . . . . . .  8
  2.2 Finding eigenvalues and eigenvectors . . . . . . . . . . . . . . . .  9
  2.3 Interpreting complex eigenvalues / eigenvectors . . . . . . . . . . 10
      2.3.1 Example: 3D Rotation . . . . . . . . . . . . . . . . . . . . . 11
  2.4 The eigen-system of symmetric (Hermitian) matrices . . . . . . . . . 13

3 The Jordan-form . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
  3.1 How the Jordan form relates to the real world . . . . . . . . . . . 16
      3.1.1 An example calling for the Jordan form . . . . . . . . . . . . 17
  3.2 Constructing the Jordan Form . . . . . . . . . . . . . . . . . . . . 20
      3.2.1 Regular and Generalized Eigenvectors . . . . . . . . . . . . . 20
      3.2.2 First Jordan Form Example . . . . . . . . . . . . . . . . . . 22
      3.2.3 More on Jordan blocks . . . . . . . . . . . . . . . . . . . . 26
      3.2.4 The Jordan form, a second example . . . . . . . . . . . . . . 27
  3.3 One more twist, freedom to choose the regular eigenvector . . . . . 33
      3.3.1 Example where regular E-vecs do not lie in the column
            space of (A − λk I) . . . . . . . . . . . . . . . . . . . . . 35
  3.4 Summary of the Jordan Form . . . . . . . . . . . . . . . . . . . . . 41
      3.4.1 Why Matlab does not have a numeric Jordan command . . . . . . 42

4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

5 Review questions and skills . . . . . . . . . . . . . . . . . . . . . . 45

Part 4b: Eigenvalues and Eigenvectors        (Revised: Sep 10, 2012)     Page 1

1 Introduction

We have seen the basic case of eigenvalues and eigenvectors

    A v_k = λ_k v_k .                                                     (1)

In this chapter we will elaborate the relevant concepts to handle every case.

1.1 Review of basic facts about eigenvectors and eigenvalues

  - Only square matrices have eigensystems.
  - Every n × n matrix has n eigenvalues, λ1 ... λn.
  - The eigenvector satisfies the relationship A v_k = λ_k v_k , which leads
    to the eigenvector being the solution to

        (A − λ_k I) v_k = 0                                               (2)

    or, said another way, the eigenvector is a vector in the null space of
    the matrix (λ_k I − A).

Notes:

1. Any vector in the null space of (λ_k I − A) is an eigenvector. If the null
   space is 2-dimensional, then any vector in this 2D subspace is an E-vec.

2. Since the determinant of any matrix with a non-trivial null space is zero,
   we have:

        det(λ_k I − A) = 0 ,   k = 1..n                                   (3)

   which gives the characteristic equation of matrix A.

1.1.1 Looking at eigenvalues and eigenvectors in relation to the null space
      of (λ I − A)

Starting from

    (λ_k I − A) v_k = 0                                                   (4)

  - The eigenvalues are values of λ_k such that (λ_k I − A) has a non-trivial
    null space.
  - The eigenvectors are the basis vectors of the null space!

Since

    det(λ_k I − A) = 0                                                    (5)

we know the null space is at least 1-dimensional.

Theorem: for each distinct eigenvalue, there is at least one independent
eigenvector.

Proof: The proof follows directly from Eqns (4) and (5).

1.1.2 Repeated eigenvalues

Matrices can have repeated eigenvalues.

Example:

>> A = [ 2 1;
         0 2]
A =  2  1
     0  2

>> [V,U] = eig(A)
V =  1.0000   -1.0000
          0    0.0000
U =  2  0
     0  2
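The same behavior can be observed in NumPy (an illustrative translation of the Matlab call above, not part of the notes): the eigenvalue 2 is reported twice, but the two returned eigenvector columns are numerically parallel, so there is only one independent eigenvector.

```python
import numpy as np

# A 2x2 matrix with a repeated eigenvalue and a missing eigenvector.
A = np.array([[2., 1.],
              [0., 2.]])
lam, V = np.linalg.eig(A)
print(lam)                    # [2. 2.] -- algebraic multiplicity 2
print(np.linalg.det(V))       # ~ 0: the eigenvector columns are dependent
```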

When there are repeated eigenvalues:

1. We are assured to have at least 1 independent eigenvector.
2. There may be fewer independent eigenvectors than eigenvalues.

Definitions:
  - The algebraic multiplicity of an eigenvalue is the number of times the
    eigenvalue is repeated.
  - The geometric multiplicity is the number of independent eigenvectors
    corresponding to the eigenvalue, dim null(A − λ I).

Consider the example above:

>> A = [ 2 1;
         0 2]
A =  2  1
     0  2

>> [V,U] = eig(A)
V =  1.0000   -1.0000
          0    0.0000
U =  2.0000        0
          0   2.0000

Eigenvalue: 2.0
Algebraic multiplicity: 2
Geometric multiplicity: 1
Number of missing eigenvectors: 1

Recall the eigen-decomposition of a matrix:

    A = V U V^(-1)

The eigen-decomposition only exists if V is invertible, that is, if there is
a complete set of independent eigenvectors.

1.2 Analysis of the structure of the eigensystem of a matrix

Analysis of the eigensystem of a matrix proceeds by completing table 1.

1. Group the λ into k sets of repeated eigenvalues (one set for each
   unique λ).
     - The number of λ_k in the kth set is called the algebraic multiplicity,
       and is given by m_k.
     - Since (λ_k I − A) always has a non-trivial null space, every λ_k set
       of eigenvalues has at least one eigenvector v_k.

2. Determine the number of independent eigenvectors corresponding to λ_k by
   evaluating

       q(λ_k I − A) = dim null(λ_k I − A) .

   This is called the geometric multiplicity, and is given by g_k. If
   m_k ≥ 2, it is possible that there are fewer independent eigenvectors
   than eigenvalues:

       1 ≤ q(λ_k I − A) ≤ m_k                                             (6)

3. If m_k > g_k for any k, the Jordan form and generalized eigenvectors are
   required.

    k    λ_k    m_k    g_k    # Needed Generalized Evecs (m_k − g_k)
    1
    2
    ...

Table 1: Analysis of the structure of the eigensystem of A.

Example:

>> A = [ 2 3 4 ; 0 2 1 ; 0 0 2 ]
A =  2  3  4
     0  2  1
     0  0  2

Recall: for triangular and diagonal matrices, the eigenvalues are the
diagonal elements.

>> [V, U ] = eig(A)
V =  1.0000   -1.0000    1.0000
          0    0.0000   -0.0000
          0         0    0.0000
U =  2  0  0
     0  2  0
     0  0  2

>> RoundByRatCommand(V)
ans = 1  -1   1
      0   0   0
      0   0   0

We see that there is one eigenvalue that is triply repeated.
dim null(λ1 I − A) = 1: there is one eigenvector. The Jordan form will be
required.

The analysis of the structure of the eigensystem of matrix A is seen in
table 2.

    k    λ_k    m_k    g_k    # Needed Generalized Evecs (m_k − g_k)
    1     2      3      1      2

Table 2: Analysis of the structure of the eigensystem of A.

2 Properties of the Eigensystem

First we'll cover properties of the eigenvalues and regular eigenvectors.

2.1 Bay section 4.1, A-Invariant Subspaces

Let X1 be a subspace of linear vector space X. This subspace is A-invariant
if for every vector z ∈ X1, y = A z ∈ X1. When the operator A is understood
from context, X1 is sometimes said to be simply invariant.

The eigenvectors of A are basis vectors for A-invariant subspaces. The
eigenvectors satisfy

    A v = λ v                                                             (7)

Eqn (7) gives:

    λ v − A v = 0 ,   or   (λ I − A) v = 0                                (8)

Eqn (8) shows that the eigenvectors lie in the null space of (λ I − A).

Example (Bay Example 4.1): Electric Fields

In an isotropic dielectric medium, the electric field follows the relation

    D = ε E

where E is the electric field vector, D is the electric flux density (also
called the displacement vector) and ε is the dielectric constant.

Some materials, however, are anisotropic, governed by

    [ D1 ]   [ ε11  ε12  ε13 ] [ E1 ]
    [ D2 ] = [ ε21  ε22  ε23 ] [ E2 ]
    [ D3 ]   [ ε31  ε32  ε33 ] [ E3 ]

Find the directions, if any, in which the E-field and flux density are
collinear.

Solution: For the E-field and flux density to be collinear they must satisfy
D = λ E, which is to say, the anisotropic directions are the eigenvectors of
the dielectric tensor.

2.2 Finding eigenvalues and eigenvectors

For (λ I − A) v = 0 to have a solution, (λ I − A) must have a non-trivial
null space. This is equivalent to saying

    det(λ I − A) = 0                                                      (9)

We have seen that Eqn (9) gives an nth order polynomial in λ. This is more
important for understanding than as a solution method. Actual eigenvalue /
eigenvector algorithms are beyond the scope of EE/ME 701. Use:

>> [V, U] = eig(A)

2.3 Interpreting complex eigenvalues / eigenvectors

  - Real eigenvalues correspond to scaling the eigenvector.
  - Complex eigenvalues lead to complex eigenvectors, and correspond to
    rotations.
  - Complex eigenvalues come in complex conjugate pairs.

Example: Consider the basic rotation matrix (with C = cos θ, S = sin θ):

    R = [ C  -S
          S   C ]

Eigenvalues:

    (λ I − R) = [ λ−C    S
                   −S   λ−C ]

which gives the characteristic equation

    det(λ I − R) = λ² − 2C λ + C² + S² = λ² − 2C λ + 1 = 0

which solves to give

    λ = ( 2C ± sqrt(4C² − 4) ) / 2 = C ± (1/2) sqrt(−4 S²) = C ± j S

The eigenvalues of R are a complex conjugate pair for any value of
θ ≠ 0°, 180°! Note |λ_i| = 1.0.
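The result λ = C ± jS is easy to confirm numerically; the following NumPy sketch (illustrative, not from the notes) uses θ = 30°:

```python
import numpy as np

# Eigenvalues of a planar rotation: a complex-conjugate pair on the
# unit circle, cos(theta) +/- j sin(theta).
th = np.deg2rad(30.0)
R = np.array([[np.cos(th), -np.sin(th)],
              [np.sin(th),  np.cos(th)]])
lam = np.linalg.eigvals(R)
print(lam)
print(np.abs(lam))                   # both eigenvalues have magnitude 1
```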

2.3.1 Example: 3D Rotation

We've seen the rotation matrix

    R = [ CφCθ    CφSθSψ − SφCψ    CφSθCψ + SφSψ
          SφCθ    SφSθSψ + CφCψ    SφSθCψ − CφSψ
          −Sθ     CθSψ             CθCψ           ]                      (10)

For example, the rotation matrix

    R = [  2/3   -1/3    2/3
           2/3    2/3   -1/3
          -1/3    2/3    2/3 ]                                           (11)

corresponds to Euler angles of 135°, 135°, and 19.47°.

This matrix (like all 3D rotation matrices with general angles) has one real
eigenvalue and a complex conjugate pair:

>> [V, U] = eig(R)
V = -0.5774             -0.5774              0.5774
     0.2887 + 0.5000i    0.2887 - 0.5000i    0.5774
     0.2887 - 0.5000i    0.2887 + 0.5000i    0.5774
U =  0.5000 + 0.8660i         0                   0
          0              0.5000 - 0.8660i         0
          0                   0              1.0000

The real eigenvalue is 1, with eigenvector

    v3 = [ 0.577
           0.577
           0.577 ]

The matrix R of Eqn (11) also describes a rotation of 60° about the axis v3.

Figure 1: Illustration of the action of rotation matrix R on a vector w to
give vector R w. [figure omitted: axes x, y, z with vectors w and R w]

Every vector w not aligned with v3 will be rotated by R. So the only
R-invariant subspace lies along the axis of rotation. The mathematical
manifestation of the absence of any other R-invariant subspace is that the
other two eigenvalues are complex.

When we form solutions of the form

    x(t) = V e^(Ut) V^(-1) x(0)                                          (12)

the complex eigenvalue and eigenvector pair combines to give solutions that
can be written

    x(t) = a1 e^(σt) cos(ωt + φ) w1 + a2 e^(σt) sin(ωt + φ) w2           (13)

where the complex λ is written λ = σ ± jω and the complex eigenvectors form
the real eigenvectors:

    w1 = (1/2)(v1 + v2)          (real part)                             (14)

    w2 = j (1/2)(v1 − v2)        (imaginary part)                        (15)
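The axis-of-rotation claim can be checked directly: the following NumPy sketch (illustrative, not from the notes) confirms that R of Eqn (11) has the real eigenvalue 1 with eigenvector along [1 1 1]:

```python
import numpy as np

# The rotation matrix of Eqn (11); its only R-invariant subspace is the
# rotation axis, the eigenvector of the real eigenvalue 1.
R = np.array([[ 2/3, -1/3,  2/3],
              [ 2/3,  2/3, -1/3],
              [-1/3,  2/3,  2/3]])
lam, V = np.linalg.eig(R)
k = int(np.argmin(np.abs(lam.imag)))   # index of the real eigenvalue
axis = np.real(V[:, k])
print(np.round(lam[k].real, 4), np.round(axis / axis[0], 4))
```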

2.4 The eigen-system of symmetric (Hermitian) matrices

The eigensystem of a Hermitian matrix Q (a symmetric matrix, if real) has
special, and very helpful, properties.

Notation: use A* to be the complex-conjugate transpose (equivalent to A' in
Matlab).

Property 1: If A* = A, then for all complex vectors x, x* A x is real.

    Proof: Define y = x* A x. Applying the conjugate transpose to the
    product,

        y* = (x* A x)* = x* A* x = x* A x = y

    Since A* = A, y* = y. For a number to equal its complex conjugate, it
    must be real.

Property 2: The eigenvalues of a Hermitian matrix must be real.

    Proof: Suppose λ is an eigenvalue of Q, with v a corresponding
    eigenvector; then

        Q v = λ v

    Now multiply on the left by v*:

        v* Q v = v* λ v = λ v* v = λ ||v||²

    By property 1, v* Q v must be real, and ||v||² must be real; therefore

        λ = (v* Q v) / ||v||²

    must be real.

Property 3: The eigenvectors of a Hermitian matrix, if they correspond to
distinct eigenvalues, must be orthogonal.

    Proof: Starting with the given information, eigenvalues λ1 ≠ λ2, and
    corresponding eigenvectors v1 and v2:

        Q v1 = λ1 v1                                                     (16)

        Q v2 = λ2 v2                                                     (17)

    Forming the complex-conjugate transpose of Eqn (16),

        v1* Q = λ1* v1* = λ1 v1*                                         (18)

    where we can drop the complex conjugate, because we know λ1 is real.
    Now multiplying on the right by v2 gives

        v1* Q v2 = λ1 v1* v2                                             (19)

    But the multiplication also gives:

        v1* (Q v2) = v1* λ2 v2 = λ2 v1* v2                               (20)

    So we find

        λ1 v1* v2 = λ2 v1* v2                                            (21)

    If λ1 ≠ λ2, Eqn (21) is only possible if v1* v2 = 0, which is to say
    that v1 and v2 are orthogonal.
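Properties 2 and 3 can be seen numerically; the sketch below (illustrative, not from the notes) uses `eigh`, which exploits exactly this Hermitian structure:

```python
import numpy as np

# A real symmetric matrix: eigenvalues are real, eigenvectors orthonormal.
Q = np.array([[4., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])
lam, V = np.linalg.eigh(Q)           # lam is real, V has orthonormal columns
print(lam)
print(np.round(V.T @ V, 6))          # -> identity matrix
```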

Property 4: A Hermitian matrix has a complete set of orthogonal eigenvectors.

    Proof: Any square matrix Q has a Schur decomposition

        T = V^(-1) Q V                                                   (22)

    where V is an orthonormal (unitary) matrix and T is an upper-triangular
    matrix. Since T is upper-triangular, T* will be lower-triangular.
    However, V is ortho-normal and Q is Hermitian, so

        V* = V^(-1) ,   Q* = Q ,   and   (V^(-1))* = V

    So

        T* = (V^(-1) Q V)* = V* Q* (V^(-1))* = V^(-1) Q V = T

    Since T* = T, T must be both upper-triangular and lower-triangular.
    For T to be both upper-triangular and lower-triangular, it must be
    diagonal. Let U = T be a diagonal matrix.

    Multiplying Eqn (22) on the left by V and on the right by V^(-1) gives

        Q = V U V*                                                       (23)

    which is precisely the form of the eigen-decomposition of Q, where
      - diagonal matrix U holds the eigenvalues of Q, and
      - orthogonal matrix V holds the eigenvectors of Q.

Remarks:
  - Since T* = T, the diagonal elements (the eigenvalues) must be real
    (see property 2).
  - When Q is real (symmetric), V will be real.

3 The Jordan-form

3.1 How the Jordan form relates to the real world

The solution to the equation

    ẋ(t) = A x(t) + B u(t)                                              (24)

has terms of the form

    x(t) = e^(At) x(0)                                                   (25)

When there is a complete set of eigenvectors,

    e^(At) = V e^(Ut) V^(-1)                                             (26)

with

    U = [ λ1   0   ...              e^(Ut) = [ e^(λ1 t)      0      ...
           0  λ2   ...    and                      0      e^(λ2 t)  ...
          ...      ... ]                          ...               ... ] (27)

But what if the set of v_k is incomplete, so there is no V^(-1)!

  - Such matrices can arise with repeated eigenvalues and do not have a
    complete set of eigenvectors. Such matrices are called defective.
  - If we generated A matrices randomly, defective matrices would be quite
    rare. But repeated poles (eigenvalues) are relatively common in the
    analysis of dynamic systems, and sometimes even a design goal.

When A has repeated eigenvalues and missing eigenvectors (g_k < m_k),
analysis of e^(At) requires converting matrix A to the Jordan form. When we
have a complete set of independent eigenvectors, e^(At) is given by Eqn (26).

3.1.1 An example calling for the Jordan form

With scalar differential equations, we know that equations with repeated
roots give solutions of the form

    y(t) = c1 e^(λ1 t) + c2 t e^(λ1 t) .

For example,

    ÿ(t) + 6 ẏ(t) + 9 y(t) = 0                                          (28)

has the characteristic equation s² + 6 s + 9 = 0, which has the roots
s = {−3, −3}. The solution to Eqn (28) is:

    y(t) = c1 e^(−3t) + c2 t e^(−3t) .                                   (29)

But Eqn (27) for e^(At) has no terms of the form t e^(−3t). And yet Eqn (28)
is simply represented in state space with:

    x(t) = [ ẏ(t) ]           d  [ ẏ(t) ]   [ −6  −9 ] [ ẏ(t) ]
           [ y(t) ] ,  ẋ(t) = -- [ y(t) ] = [  1   0 ] [ y(t) ]          (30)
                               dt

And the solution to Eqn (30) is (as always for ẋ(t) = A x(t))

    x(t) = e^(At) x(0) .

So how can e^(At) have a term of the form t e^(−3t)?

Consider e^(Jt) for a matrix of the form

    J = [ λ  1
          0  λ ]                                                         (31)

The expression for e^(Jt) is

    e^(Jt) = I + J t + J² t²/2! + ... + J^k t^k/k! + ...                 (32)

           = [ 1 0 ]   t  [ λ 1 ]   t² [ λ²  2λ ]   t³ [ λ³  3λ² ]
             [ 0 1 ] + -- [ 0 λ ] + -- [ 0   λ² ] + -- [ 0   λ³  ] + ...
                       1!           2!              3!

                   t^k [ λ^k  k λ^(k−1) ]
             + ... --- [ 0    λ^k       ] + ...                          (33)
                   k!

The (1,1) and (2,2) elements give the series

    1 + Σ_(k=1..∞) (1/k!) λ^k t^k = e^(λt)

The (2,1) element of e^(Jt) is 0. Now consider the (1,2) element, which is
given as

    0 + t/1! + Σ_(k=2..∞) (1/k!) k λ^(k−1) t^k
        = t ( 1 + Σ_(k=2..∞) (1/(k−1)!) λ^(k−1) t^(k−1) )
        = t ( 1 + Σ_(k=1..∞) (1/k!) λ^k t^k ) = t e^(λt)                 (34)

So if J has the form of Eqn (31), then

    e^(Jt) = exp( [ λ 1 ]     )   [ e^(λt)   t e^(λt) ]
                  [ 0 λ ]  t    = [ 0        e^(λt)   ]                  (35)

By the argument of Eqns (31)-(35) above,

        ( [ λ 1 0 ]     )   [ e^(λt)   t e^(λt)   (1/2) t² e^(λt) ]
    exp ( [ 0 λ 1 ]  t  ) = [ 0        e^(λt)     t e^(λt)        ]      (36)
        ( [ 0 0 λ ]     )   [ 0        0          e^(λt)          ]

which gives terms of the form t² e^(λt), and

        ( [ λ 1 0 0 ]     )
        ( [ 0 λ 1 0 ]     )
    exp ( [ 0 0 λ 1 ]  t  )                                              (37)
        ( [ 0 0 0 λ ]     )

gives terms of the form t³ e^(λt), etc.

For our specific example, A decomposes according to

    A = M J M^(-1)                                                       (38)

with

    A = [ -6  -9 ] ,   M = [ -0.949   0.0316 ] ,   J = [ -3   1 ]
        [  1   0 ]         [  0.316   0.0949 ]         [  0  -3 ]

And

    e^(At) = M e^(Jt) M^(-1) = M [ e^(−3t)   t e^(−3t) ] M^(-1)
                                 [ 0         e^(−3t)   ]                 (39)

So the solution x(t) = e^(At) x(0) will have terms of the form e^(−3t) and
t e^(−3t), as needed!

3.2 Constructing the Jordan Form

Matrix A has been transformed to the Jordan form (sometimes called the
Jordan canonical form) when

    J = M^(-1) A M

  - The columns of M are regular eigenvectors and generalized eigenvectors.
  - J is a block-diagonal matrix; it is composed of blocks along the main
    diagonal. Eqns (31), (36) and (37) give examples of 2x2, 3x3 and 4x4
    blocks.
  - Each block on the diagonal of J is called a Jordan block.

3.2.1 Regular and Generalized Eigenvectors

The regular eigenvectors are what we have considered all along; they satisfy
the relationship

    A v = λ_k v    or    (A − λ_k I) v = 0                               (40)

From Eqn (40), a set of independent regular eigenvectors is given as the null
space of (A − λ_k I).
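The decomposition of Eqn (38) can be verified with an integer-scaled chain instead of the normalized vectors printed above (an illustrative NumPy sketch, not part of the notes): v1 = [-3, 1] is a regular eigenvector of A for λ = −3, and v2 = [1, 0] solves (A + 3I) v2 = v1.

```python
import numpy as np

# Similarity transform with [regular eigenvector, generalized eigenvector]
# recovers the 2x2 Jordan block of Eqn (31).
A = np.array([[-6., -9.],
              [ 1.,  0.]])
v1 = np.array([-3., 1.])             # (A + 3I) v1 = 0
v2 = np.array([ 1., 0.])             # (A + 3I) v2 = v1
M = np.column_stack([v1, v2])
J = np.linalg.inv(M) @ A @ M
print(np.round(J, 10))               # -> [[-3, 1], [0, -3]]
```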

The generalized eigenvectors form chains starting with a regular eigenvector.
The generalized eigenvectors satisfy the relationship

    A V^(l+1)_(k,j) = λ_k V^(l+1)_(k,j) + V^(l)_(k,j)                    (41)

Or, rearranging,

    (A − λ_k I) V^(l+1)_(k,j) = V^(l)_(k,j)                              (42)

where V^(l+1)_(k,j) is the next generalized eigenvector in a chain
(see Bay Eqn (4.14)).

In this notation,

    V^(1)_(k,j)                                                          (43)

is the first element of a chain; it is a regular eigenvector, and it is the
jth regular eigenvector of the kth distinct eigenvalue. The l = 1 designates
that V^(1)_(k,j) is the first eigenvector in a chain, so it must be a regular
eigenvector.

Eqn (42) is an example of a recursive relationship: it is an equation that is
applied repeatedly to get all elements of the chain.

The method presented here to determine the Jordan form is the bottom-up
method presented in Bay, section 4.4.3.

3.2.2 First Jordan Form Example

Consider e^(At), with A given as:

>> A = [ 3 3 3 ; -3 3 -3 ; 3 0 6 ]
A =  3  3  3
    -3  3 -3
     3  0  6

First look at the eigenvalues:

>> U = eig(A)
U = 3.0000
    6.0000
    3.0000

%% This command rounds-off values to nearby rational numbers
%% which may be integers
>> U = RoundByRatCommand(U)
U = 3
    6
    3

A has a repeated eigenvalue; we can make a table analyzing the structure of
the eigensystem of A:

    k    λ_k    m_k    g_k       # Needed Gen. Evecs (m_k − g_k)
    1     3      2     1 or 2     0 or 1
    2     6      1     1          0

Table 3: Analysis of the structure of the eigensystem of A.

Table 3 shows that A has two distinct eigenvalues, and we don't yet know if
λ1 has 1 or 2 independent eigenvectors.

Evaluate the geometric multiplicity of λ1:

>> lambda1=3; lambda2=6; I = eye(3);
>> v1 = null(A-lambda1*I); v1 = v1/v1(1)
v1 =  1       %% Eigenvector, scaled so the
      1       %% first element is an integer
     -1

The geometric multiplicity is the dimension of the null space in which the
eigenvectors lie. For λ1, g1 = 1. Putting this information into the table:

    k    λ_k    m_k    g_k    # Needed Generalized Evecs (m_k − g_k)
    1     3      2      1      1
    2     6      1      1      0

Table 4: Analysis of the structure of the eigensystem of A.

The total number of eigenvectors (regular + generalized) needed for λ_k is
m_k. The number of regular eigenvectors is g_k. The regular eigenvectors get
the notation:

    V^1_(k,1) , V^1_(k,2) , ..., V^1_(k,g_k)

where j = 1 ... g_k. The number of needed generalized eigenvectors,
corresponding to λ_k, is m_k − g_k.

In the example, we need 1 generalized eigenvector for the k = 1 eigenvalue.
In this case for λ1 we have only one regular eigenvector, so it must serve as
the first element, or anchor, of the chain of generalized eigenvectors:

    regular eigenvector:      V^1_(1,1)  solves  (A − λ1 I) V^1_(1,1) = 0

    generalized eigenvector:  V^2_(1,1)  solves  (A − λ1 I) V^2_(1,1) = V^1_(1,1)    (44)

In Matlab:

>> lambda1=3; lambda2=6; I = eye(3);
%% Find the first regular eigenvector
>> V111 = null(A-lambda1*I); V111=V111/V111(1)
V111 =  1.0000
        1.0000
       -1.0000
%% Find the generalized eigenvector by solving Eqn (44)
>> V112 = pinv(A-lambda1*I)*V111
V112 = -0.3333
        0.3333
        0.0000
%% Find the regular eigenvector for lambda2
>> V211 = null(A-lambda2*I); V211=V211/V211(2)
V211 =       0
        1.0000
       -1.0000
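The chain computation above can be sketched in NumPy (an illustrative translation, not part of the notes); `lstsq` returns the same minimum-norm solution of Eqn (44) as Matlab's `pinv`:

```python
import numpy as np

# Build the chain for A with eigenvalues {3, 3, 6} and assemble M;
# the similarity transform then yields the Jordan form.
A = np.array([[ 3., 3.,  3.],
              [-3., 3., -3.],
              [ 3., 0.,  6.]])
v111 = np.array([1., 1., -1.])                       # regular, lambda = 3
v112 = np.linalg.lstsq(A - 3*np.eye(3), v111, rcond=None)[0]  # Eqn (44)
v211 = np.array([0., 1., -1.])                       # regular, lambda = 6

M = np.column_stack([v111, v112, v211])
J = np.linalg.inv(M) @ A @ M
print(np.round(J, 8))                # -> [[3, 1, 0], [0, 3, 0], [0, 0, 6]]
```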

EE/ME 701: Advanced Linear Systems

Section 3.2.2

Put the eigenvectors (regular and generalized) together in the M matrix.


The regular and generalized eigenvectors of a chain must be put in order.
vectors Vk,l j

For each k and j, put the


going to the end of the chain.

into M, starting with l = 1, and

For the example,

>> M = [V111, V112, V211]


M =
1.00
-1.00
0
1.00
1.00
1.00
-1.00
0.00
-1.00

..
.
..
.
..
.
..
.

3.2.3 More on Jordan blocks

A matrix in Jordan canonical form has a block diagonal structure, with

  - Eigenvalues on the main diagonal
  - Ones on the upper diagonal within each block. A 2x2 block has 1 one, a
    3x3 block has 2 ones, etc.

One Jordan block corresponds to each regular eigenvector:

  - If the regular eigenvector has no generalized eigenvectors, then it
    creates a 1x1 block.
  - If the regular eigenvector anchors a chain with one generalized
    eigenvector, then it creates a 2x2 block, etc.

Each Jordan block corresponds to:

  - 1x1 block: a regular eigenvector
  - n x n block, n >= 2: a chain anchored by a regular eigenvector, with
    n - 1 generalized eigenvectors

Put in the chains corresponding to each regular eigenvector (j) for each
distinct eigenvalue (k). Chains may have a length of 1. For example, with

    M = [ V_{1,1}^1   V_{1,1}^2   V_{2,1}^1 ]

>> J = inv(M) * A * M
>> J = RoundByRatCommand(J)
J = 3   1   0
    0   3   0
    0   0   6

J has 2 Jordan blocks. The matrix exponential is

    e^{At} = M e^{Jt} M^{-1}                                          (45)

For a system governed by ẋ(t) = A x(t), and considering the J matrix, the
output of the system will have solutions of the form

    y(t) = c1 e^{3t} + c2 t e^{3t} + c3 e^{6t}                        (46)

where the first two terms correspond to the first Jordan block, and the
last term to the second Jordan block.

Using the V_{k,j}^l notation, if we look at the structure of the M matrix,
we can determine the layout of Jordan blocks. For example, with

    M = [ V_{1,1}^1   V_{1,2}^1   V_{1,2}^2   V_{2,1}^1   V_{2,1}^2 ]

the blocks of J are arranged: 1x1, 2x2, 2x2.
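Eqn (45) can be checked numerically. The sketch below (Python/NumPy) is an assumption-laden illustration: A is assembled as A = M J M⁻¹ from the M shown earlier and the J of this section, and then M e^{Jt} M⁻¹ is compared against a truncated Taylor series for e^{At}.

```python
import numpy as np

# Assumed example: A = M J M^{-1} with the M shown earlier and the J of
# this section (eigenvalues 3, 3, 6).
M = np.array([[ 1., -1.,  0.],
              [ 1.,  1.,  1.],
              [-1.,  0., -1.]])
J = np.array([[3., 1., 0.],
              [0., 3., 0.],
              [0., 0., 6.]])
A = M @ J @ np.linalg.inv(M)
t = 0.3

# e^{Jt} in closed form: block [[e^{3t}, t e^{3t}], [0, e^{3t}]] and [e^{6t}]
eJt = np.array([[np.exp(3*t), t*np.exp(3*t), 0.0],
                [0.0,         np.exp(3*t),   0.0],
                [0.0,         0.0,           np.exp(6*t)]])
eAt_jordan = M @ eJt @ np.linalg.inv(M)

# Compare against a truncated Taylor series for e^{At}
eAt_taylor = np.eye(3)
term = np.eye(3)
for k in range(1, 80):
    term = term @ (A*t) / k
    eAt_taylor = eAt_taylor + term
print(np.max(np.abs(eAt_jordan - eAt_taylor)))
```

The two computations agree to machine precision, which is the content of Eqn (45).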

3.2.4 The Jordan form, a second example

Consider the matrix

        [ 3  -1   1   1   0   0 ]
        [ 1   1  -1  -1   0   0 ]
    A = [ 0   0   2   0   1   1 ]
        [ 0   0   0   2  -1  -1 ]
        [ 0   0   0   0   1   1 ]
        [ 0   0   0   0   1   1 ]

>> U = eig(A)
U = 2.0000
    2.0000
    2.0000
    2.0000
    2.0000
    0

λ1 = 0, λ2 = 2 is repeated 5 times.

>> lambda1=0; lambda2=2; I = eye(6);
>> [Row, Col, Null, LNull] = spaces(A-lambda2*I);
>> g2 = rank(Null)
g2 = 2

(A − λ2 I) has a 2-dimensional null space, so there are 2 independent regular
eigenvectors.

Null =
         0    0.7071
         0    0.7071
   -0.7071    0.0000
    0.7071    0.0000
         0    0.0000
         0         0

For convenience, scale the eigenvectors to get integer values

>> V211 = Null(:,1)/Null(3,1)
V211 = 0
       0
       1
      -1
       0
       0

>> V221 = Null(:,2)/Null(1,2)
V221 = 1
       1
       0
       0
       0
       0

Check that the eigenvectors are actually eigenvectors for λ2 = 2

>> %% Check that the eigenvectors are indeed eigenvectors
>> %% These norms come out to zero; very small would be sufficient
>> NDiff1 = norm( A*V211 - lambda2*V211 )
NDiff1 = 0
>> NDiff2 = norm( A*V221 - lambda2*V221 )
NDiff2 = 0

Note: all vectors from null(A − λ2 I) are eigenvectors. For example,

>> x = 0.3*V211 + 0.4*V221
x = 0.4
    0.4
    0.3
   -0.3
    0.0
    0.0
>> NDiffx = norm( A*x - lambda2*x )
NDiffx = 0

Make a table of the structure of the eigensystem.

    k   λk   mk   gk   # Needed Gen. Evecs (mk − gk)
    1    0    1    1               0
    2    2    5    2               3

Table 5: Structure of the eigensystem of A.

We need 3 generalized eigenvectors to have a complete set. These three will
be in chains anchored by one or both of the regular eigenvectors of λ2.

Find V_{2,1}^2

The equation giving generalized eigenvectors is

    (A − λk I) V_{k,j}^{l+1} = V_{k,j}^l                              (47)

This is simply the relation

    B x = y                                                           (48)

with B = (A − λk I), y = V_{k,j}^l and x = V_{k,j}^{l+1}.

We know some things about the solutions of Eqn (47):

1. For an exact solution to exist, V_{k,j}^l must lie in the column space
   of (A − λk I).
2. If we find a solution V_{k,j}^{l+1}, it is not unique. We can add any
   component from the null space of (A − λk I), and it will still be a
   solution.

Considering again the example problem, check that V211 and V221 lie in the
column space of (A − λ2 I) by checking that the projection of each onto the
column space is equal to the original vector.

>> [Row, Col, Null, LNull] = spaces(A-lambda2*I);
>> NIsInColumnSpaceV211 = norm( Col*Col'*V211-V211 )
NIsInColumnSpaceV211 = 1.1102e-16
%% V211 is in the column space of (A-lambda2*I)
>> NIsInColumnSpaceV2 = norm( Col*Col'*V221-V221 )
NIsInColumnSpaceV2 = 1.1430e-15
%% V221 is in the column space of (A-lambda2*I)

Both vectors lie in the column space of (A − λ2 I), so each will have at
least one generalized eigenvector.

>> V212 = pinv(A-lambda2*I)*V211
V212 = 0
       0
       0.00
      -0.00
       0.50
       0.50
%% Check that V212 is a generalized eigenvector
>> NDiffV212 = norm( (A-lambda2*I)*V212 - V211 )
NDiffV212 = 2.7581e-16

Yes, V_{2,1}^2 is a generalized eigenvector.

Test to see if V_{2,1}^2 is in the column space of (A − λ2 I)

>> norm( Col*Col'*V212-V212 )
ans = 0.7071

No, so there is no V_{2,1}^3.

Now evaluate V_{2,2}^2

>> V222 = pinv(A-lambda2*I)*V221
V222 = 0.50
      -0.50
      -0.00
      -0.00
       0.00
       0.00
%% Check that V222 is a gen. eigenvector
>> NDiffV222 = norm( (A-lambda2*I)*V222 - V221 )
NDiffV222 = 6.2804e-16

Yes, V_{2,2}^2 is a generalized eigenvector.

Now check to see that V_{2,2}^2 is in the column space of (A − λ2 I)

>> NIsInColumnSpaceV222 = norm( Col*Col'*V222-V222 )
NIsInColumnSpaceV222 = 4.2999e-16

Yes, so there is a V_{2,2}^3. This will be the third generalized eigenvector.

>> V223 = pinv(A-lambda2*I)*V222
V223 = 0.00
      -0.00
       0.25
       0.25
      -0.00
      -0.00

First we need the regular eigenvector corresponding to λ1

>> V111 = null(A-lambda1*I);
>> V111 = V111/V111(5)
V111 = 0
       0
       0
       0
       1
      -1

Now build the M matrix. Put in the chains of E-vecs, starting each chain
with its regular E-vec.

>> M = [ [V111] [V211 V212] [V221 V222 V223] ]
>> M = RoundByRatCommand(M)
M =
        0       0       0    1.00    0.50       0
        0       0       0    1.00   -0.50       0
        0    1.00       0       0       0    0.25
        0   -1.00       0       0       0    0.25
     1.00       0    0.50       0       0       0
    -1.00       0    0.50       0       0       0

Now find J

>> J = inv(M)*A*M;
>> J = RoundByRatCommand(J)
J =
     0     0     0     0     0     0
     0     2     1     0     0     0
     0     0     2     0     0     0
     0     0     0     2     1     0
     0     0     0     0     2     1
     0     0     0     0     0     2

Interpreting the result:

J has 3 Jordan blocks,

    [ 0 |       |           ]
    [---+-------+           ]
    [   | 2   1 |           ]
    [   | 0   2 |           ]
    [   +-------+-----------]
    [           | 2   1   0 ]
    [           | 0   2   1 ]
    [           | 0   0   2 ]

and correspondingly, M has 3 chains of eigenvectors,

    [     0  |     0      0  |  1.00   0.50     0  ]
    [     0  |     0      0  |  1.00  -0.50     0  ]
    [     0  |  1.00      0  |     0      0  0.25  ]
    [     0  | -1.00      0  |     0      0  0.25  ]
    [  1.00  |     0   0.50  |     0      0     0  ]
    [ -1.00  |     0   0.50  |     0      0     0  ]

The first eigenvector in each chain is a regular eigenvector.
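The whole construction for this second example can be replayed numerically. Below is a Python/NumPy sketch: the 6x6 matrix and the scaled regular eigenvectors are taken from the example above (treat the entered numbers as assumptions of this sketch), and each generalized eigenvector comes from pinv(), as in the notes.

```python
import numpy as np

# The 6x6 example matrix of this section.
A = np.array([[3., -1.,  1.,  1.,  0.,  0.],
              [1.,  1., -1., -1.,  0.,  0.],
              [0.,  0.,  2.,  0.,  1.,  1.],
              [0.,  0.,  0.,  2., -1., -1.],
              [0.,  0.,  0.,  0.,  1.,  1.],
              [0.,  0.,  0.,  0.,  1.,  1.]])
I6 = np.eye(6)
lam2 = 2.0

# Regular eigenvectors (scaled as in the notes); generalized ones via pinv
V111 = np.array([0., 0., 0., 0., 1., -1.])      # lambda1 = 0
V211 = np.array([0., 0., 1., -1., 0., 0.])      # lambda2 = 2
V221 = np.array([1., 1., 0., 0., 0., 0.])       # lambda2 = 2
V212 = np.linalg.pinv(A - lam2*I6) @ V211       # chain of V211
V222 = np.linalg.pinv(A - lam2*I6) @ V221       # chain of V221
V223 = np.linalg.pinv(A - lam2*I6) @ V222       # chain of V221, continued

M = np.column_stack([V111, V211, V212, V221, V222, V223])
J = np.linalg.inv(M) @ A @ M
print(np.round(J, 8))    # blocks: [0], [2 1; 0 2], [2 1 0; 0 2 1; 0 0 2]
```

The printed J reproduces the three Jordan blocks found above.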


3.3 One more twist, freedom to choose the regular eigenvector

Fact: If a matrix A has repeated eigenvalues, with gk > 1 independent
eigenvectors, the gk eigenvectors form a vector subspace, any vector from
which is an eigenvector.

When mk >= 3, it is possible that gk >= 2 and we still need to find a
generalized eigenvector. In this case,

    dim null(A − λk I) = gk >= 2

and any vector from the 2-dimensional (or larger) null space of (A − λk I)
is an eigenvector.

Consider the generating equation for the generalized eigenvector

    (A − λk I) V_{k,j}^2 = V_{k,j}^1                                  (49)

The anchor V_{k,j}^1 must also lie in the column space of (A − λk I).

A regular eigenvector that anchors a chain of generalized eigenvectors must
lie in 2 spaces at once:

    The null space of (A − λk I) . . . . . . . . To be a regular e-vec of A.
    The column space of (A − λk I) . . To generate a generalized e-vec of A.

When gk >= 2, we have the freedom to choose the anchor for the chain of
generalized eigenvectors:

    Not just from a list, V_{k,1}^1, V_{k,2}^1, ...

    But as any vector from the null space of (A − λk I).

It may be that we have valid eigenvectors V_{k,1}^1, V_{k,2}^1, neither one
of which lies in the column space of (A − λk I)!

By forming the intersection of the null and column spaces of (A − λk I), we
can find the needed regular eigenvector:

    W = col(A − λk I) ∩ null(A − λk I) ,     V_{k,j}^1 ∈ W            (50)

3.3.1 Example where regular E-vecs do not lie in the column space of
      (A − λk I)

Consider the matrix

        [ 2   1   1   1 ]
    A = [ 0   4   1   3 ]
        [ 0  -1   2  -2 ]
        [ 0  -1   0   0 ]

Analyzing the structure of its eigensystem,

>> RoundByRatCommand( eig(A) )
ans = 2
      2
      2
      2

>> I = eye(4); lambda1=2;
>> [Row, Col, Null, LNull] = spaces(A - lambda1*I)

Col =
   -0.7118    0.3055
   -0.2468    0.7342
   -0.4650   -0.4287
   -0.4650   -0.4287

Null =
         0    1
    0.8165    0
   -0.4082    0
   -0.4082    0

So the structure of the eigensystem is given in table 6.

    k   λk   mk   gk   # Needed Gen. Evecs (mk − gk)
    1    2    4    2               2

Table 6: Analysis of the structure of the eigensystem of A.

A first choice for eigenvectors are the two basis vectors of the null space
of (A − λk I)

>> v1 = RoundByRatCommand( Null(:,1) / Null(3,1) )
>> v2 = RoundByRatCommand( Null(:,2) )
v1 = 0         v2 = 1
    -2              0
     1              0
     1              0

Determine if the eigenvectors lie in the column space of (A − λk I)

>> NIsInColumnSpaceV1 = norm( Col*Col'*v1-v1 )
NIsInColumnSpaceV1 = 0.6325
>> NIsInColumnSpaceV2 = norm( Col*Col'*v2-v2 )
NIsInColumnSpaceV2 = 0.6325

No, neither eigenvector lies in the column space of (A − λk I).

But what about the possibility that there exists another eigenvector

    x1 = a1 v1 + a2 v2

which lies in both the null space and the column space of (A − λk I)?

First, consideration of the possibilities. The universe is R^4, or 4D, with
dim col(A − λk I) = 2 and dim null(A − λk I) = 2. So there are 3
possibilities:

1. Two 2D spaces can fit in a 4D universe and not intersect, so it is
   possible that col(A − λk I) ∩ null(A − λk I) = 0.
2. It is possible that the intersection is 1-D.
3. It is possible that the intersection is 2-D.

Previously, we have seen how to form the intersection of two subspaces.

Given sets of basis vectors U = [u1, u2, ..., u_nu] and V = [v1, v2, ...,
v_nv], vectors in the intersection W = U ∩ V, that is,

    w_i = a1 u1 + a2 u2 + ... + a_nu u_nu
        = b1 v1 + b2 v2 + ... + b_nv v_nv                             (51)

for some non-zero values a1 ... a_nu, b1 ... b_nv, are solutions to

    [U, −V] [a1, ..., a_nu, b1, ..., b_nv]' = 0                       (52)

The coefficient vector must lie in the null space of [Col, −Null], where
[Col] and [Null] are sets of basis vectors on the column and null spaces.

>> CoeffVec = null([Col, -Null])
CoeffVec = 0.7033
          -0.0736
           0.6547
           0.2673

Since the null space of [Col, −Null] is one dimensional, the intersection
of the column and null spaces is 1D.

Now find w1, a vector in both the column and null spaces of (A − λk I)

>> w1 = Col*CoeffVec(1:2);
>> w1 = RoundByRatCommand( w1/w1(1) )
w1 = 1
     2
    -1
    -1

Verify that w1 is an eigenvector

>> NIsEigenvectorW1 = norm( A*w1 - lambda1*w1 )
NIsEigenvectorW1 = 0

(A − λk I) has only one regular eigenvector that can anchor a chain. So the
chain must have length 3 (2 generalized E-vecs).

Compute a candidate for the first generalized eigenvector, V_{1,1}^2, as
the solution to

    (A − λk I) V_{1,1}^2 = V_{1,1}^1                                  (53)

>> V111 = w1;
>> v3 = pinv(A - lambda1*I) * V111
v3 = 0
     0.3333
     0.3333
     0.3333

To find the remaining generalized eigenvector, V_{1,1}^3, V_{1,1}^2 must be
in the column space of (A − λk I)

>> NIsInColumnSpaceV112 = norm( Col*Col'*v3 - v3 )
NIsInColumnSpaceV112 = 0.4216

It is not! v3 is a candidate generalized eigenvector, but we can not use it
for V_{1,1}^2, because it does not lead to V_{1,1}^3.

Going back to the generating Eqn, V_{1,1}^2 must solve

    (A − λk I) V_{1,1}^2 = V_{1,1}^1                                  (54)

v3 is a particular solution to Eqn (54), but it is not the only solution.
Any vector

    V_{1,1}^2 = v3 + b1 n1 + b2 n2                                    (55)

is a solution to (54), where n1 and n2 are basis vectors for the null space
of (A − λk I).

To find a value for V_{1,1}^2 that is in the column space of (A − λk I), we
need a solution to

    V_{1,1}^2 = v3 + b1 n1 + b2 n2 = a1 c1 + a2 c2                    (56)

or

    [c1, c2, −n1, −n2] [a1, a2, b1, b2]' = v3                         (57)

1. Find the coefficient vector

>> CoeffVec2 = pinv( [Col, -Null] ) * v3
CoeffVec2 = -0.0880
            -0.8406
            -0.2333
             0.5714

2. Determine the candidate vector

>> v3b = v3 + Null * CoeffVec2(3:4)
v3b = 0.5714
      0.1429
      0.4286
      0.4286

3. Check to be sure the new candidate is in the column space of (A − λk I)

>> NIsInColumnSpaceV112b = norm( Col*Col'*v3b - v3b )
NIsInColumnSpaceV112b = 6.0044e-16

Yes!

Set V_{1,1}^2 and compute V_{1,1}^3

>> V112 = v3b
V112 = 0.5714
       0.1429
       0.4286
       0.4286

>> V113 = pinv(A - lambda1*I) * V112
V113 = 0
       0.1905
       0.6905
      -0.3095

Build the M matrix; V_{1,2}^1 is any independent regular eigenvector (here
V121 = v2). Compute J.

>> M = [ V111, V112, V113, V121 ]
M = 1.0000    0.5714         0    1.0000
    2.0000    0.1429    0.1905         0
   -1.0000    0.4286    0.6905         0
   -1.0000    0.4286   -0.3095         0

>> J = RoundByRatCommand( inv(M) * A * M )
J = 2    1    0    0
    0    2    1    0
    0    0    2    0
    0    0    0    2

J has a 3x3 block, and a 1x1 block.
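The intersection construction of this example can be sketched in Python/NumPy. Matlab's orth() and null() are replaced by SVD-based equivalents, and the numeric 4x4 matrix is entered from the example above (treat it as an assumption of this sketch); the chain is then completed exactly as in the notes.

```python
import numpy as np

# The 4x4 example matrix of this section (entered here; an assumption).
A = np.array([[2.,  1., 1.,  1.],
              [0.,  4., 1.,  3.],
              [0., -1., 2., -2.],
              [0., -1., 0.,  0.]])
B = A - 2.0*np.eye(4)          # (A - lambda*I), lambda = 2

def col_basis(X, tol=1e-10):
    """Orthonormal basis of the column space (like Matlab's orth())."""
    U, s, _ = np.linalg.svd(X)
    return U[:, :int(np.sum(s > tol))]

def null_basis(X, tol=1e-10):
    """Orthonormal basis of the null space (like Matlab's null())."""
    _, s, Vt = np.linalg.svd(X)
    return Vt[int(np.sum(s > tol)):].T

Col, Null = col_basis(B), null_basis(B)
# Intersection of column and null spaces: solve [Col, -Null] c = 0
C = null_basis(np.hstack([Col, -Null]))
w1 = Col @ C[:2, 0]
w1 = w1 / w1[0]                 # regular eigenvector, scaled: [1, 2, -1, -1]
# Particular solution of B v = w1, then shift it into col(B)
v_part = np.linalg.pinv(B) @ w1
b = np.linalg.pinv(np.hstack([Col, -Null])) @ v_part
V112 = v_part + Null @ b[2:]
V113 = np.linalg.pinv(B) @ V112
M = np.column_stack([w1, V112, V113, [1., 0., 0., 0.]])
Jm = np.linalg.inv(M) @ A @ M
print(np.round(Jm, 8))          # a 3x3 Jordan block and a 1x1 block
```

Note that V112 is only unique up to multiples of w1; any such choice still yields the same Jordan form.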

3.4 Summary of the Jordan Form

Square matrices always have n eigenvalues, λi.

The regular eigenvectors are given as the null space of (A − λi I).

For a repeated eigenvalue λk:

    The algebraic multiplicity, mk, is the number of times λk is repeated.

    The geometric multiplicity, gk, is the dimension of null(A − λk I).

When eigenvalues are repeated, we may not have enough independent regular
eigenvectors (gk < mk), in which case the Jordan form is required.

    The Jordan form corresponds to scalar differential equations with
    repeated roots and solutions of the form

        y(t) = a1 e^{λ1 t} + a2 t e^{λ1 t} + a3 t² e^{λ1 t} ...

For repeated eigenvalues, regular eigenvectors give rise to chains of
generalized eigenvectors. The generalized eigenvectors are solutions to

    (A − λk I) v_{k,j}^{l+1} = v_{k,j}^l                              (58)

where Eqn (58) is repeated as needed to obtain mk eigenvectors.

Examples have shown several characteristics of eigensystems with repeated
roots.

3.4.1 Why Matlab does not have a numeric Jordan command

Strikingly, Matlab has no numerical routine to find the generalized
eigenvectors or Jordan form (standard Matlab has no jordan() routine!).

This is because the Jordan form calculation is numerically very sensitive:
a small perturbation in A produces a large change in the chains of
eigenvectors.

This sensitivity is true of the differential equations themselves,

    ÿ(t) + 6 ẏ(t) + 9.00001 y(t) = 0

has two distinct roots!

Consider the stages where a decision must be made:

    When there are two eigenvalues with λa ≈ λb, are they repeated or
    distinct?

    What is the dimension of null(A − λ I)?

    Does v_{k,j}^l lie in the column space of (A − λ I), or does it not?

    Is v_{k,j}^{l+1} independent of the existing eigenvectors?

There is no known numerical routine to find the Jordan form that is
sufficiently numerically robust to be included in Matlab.

The Matlab symbolic algebra package does have a jordan() routine. It runs
symbolically on rational numbers to operate without round-off error, for
example on a matrix with entries such as

    A = [ 21/107   52/12   119/120
            1/1    11/12       3/2
            8/5    13/14        ... ]
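The sensitivity argument can be seen in two lines: a perturbation of size ε in a Jordan block moves the eigenvalues by √ε, which is enormously larger than ε itself. A minimal NumPy sketch (the block and perturbation values are made up for illustration):

```python
import numpy as np

# Perturb a 2x2 Jordan block by eps in the lower-left corner.
eps = 1e-10
Jp = np.array([[3.,  1.],
               [eps, 3.]])
lam = np.linalg.eigvals(Jp)
# The eigenvalues split to 3 +/- sqrt(eps) = 3 +/- 1e-5: a perturbation of
# size 1e-10 moves the eigenvalues by 1e-5, and the structural question
# (repeated or distinct?) flips outright.
print(lam)
```

This is exactly the "repeated or distinct?" decision that defeats a robust numeric Jordan routine.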

4 Conclusions

To solve

    ẋ(t) = A x(t) + B u(t)

we are interested to make a change of basis from physical (or other)
coordinates to modal coordinates.

This involves the eigenvalues and eigenvectors. The eigenvectors solve the
equation

    (A − λi I) vi = 0

So the eigenvectors lie in null(A − λi I).

Eigenvalues may be complex.

    Complex eigenvalues correspond to solutions with terms

        x(t) = a2 e^{σt} cos(ωt + φ) w1 + a2 e^{σt} sin(ωt + φ) w2

    The complex terms correspond to the action of a rotation in state
    space, in the subspace spanned by the complex eigenvectors.

    The eigenvectors corresponding to an eigenvalue (or a complex conjugate
    pair of eigenvalues) define an A-invariant subspace. Vectors in this
    subspace stay in this subspace.

The modal matrix is the transformation from modal to physical coordinates,
M = ᵖTₘ.

    With a complete set of eigenvectors, M = V.

    If we lack a complete set of regular eigenvectors, M includes
    generalized eigenvectors, and

        J = M⁻¹ A M

    gives the system matrix in Jordan form.

The solution in Jordan form comprises Jordan blocks.

    Each Jordan block corresponds to a chain of generalized eigenvectors.

    The generalized eigenvectors solve the equation

        (A − λk I) v_{k,j}^{l+1} = v_{k,j}^l

    For the l-th generalized eigenvector, of the j-th Jordan block, of the
    k-th distinct eigenvalue:

        v_{k,j}^1 is a regular eigenvector, corresponding to λk.

        For v_{k,j}^{l+1} to exist, v_{k,j}^l must lie in the column space
        of (A − λk I).

The Jordan form leads to solutions of the differential equation in the form
t e^{λt}, t² e^{λt}, etc. For example:

        [ λ1   1  |             ]
        [  0  λ1  |             ]
    J = [---------+-------------]
        [         | λ2   1   0  ]
        [         |  0  λ2   1  ]
        [         |  0   0  λ2  ]

             [ e^{λ1 t}  t e^{λ1 t} |                                       ]
             [        0   e^{λ1 t}  |                                       ]
    e^{Jt} = [----------------------+---------------------------------------]
             [                      | e^{λ2 t}  t e^{λ2 t}  ½ t² e^{λ2 t}   ]
             [                      |        0   e^{λ2 t}      t e^{λ2 t}   ]
             [                      |        0          0      e^{λ2 t}     ]   (59)
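The structure of e^{Jt} in Eqn (59), with t^k e^{λt} / k! on the k-th superdiagonal of each block, can be verified numerically. Below is a NumPy sketch for a single Jordan block, checked against a truncated Taylor series for the matrix exponential (the block size and eigenvalue are made-up values):

```python
import numpy as np
from math import factorial

def expm_jordan_block(lam, n, t):
    """e^{Jt} for an n x n Jordan block: t^k e^{lam t}/k! on superdiagonal k."""
    E = np.zeros((n, n))
    for k in range(n):
        E += (t**k * np.exp(lam*t) / factorial(k)) * np.diag(np.ones(n-k), k)
    return E

def expm_taylor(X, terms=60):
    """Truncated Taylor series for e^X (adequate for small ||X||)."""
    E, term = np.eye(X.shape[0]), np.eye(X.shape[0])
    for k in range(1, terms):
        term = term @ X / k
        E += term
    return E

lam, n, t = 2.0, 3, 0.7
Jb = lam*np.eye(n) + np.diag(np.ones(n-1), 1)   # 3x3 Jordan block
print(np.max(np.abs(expm_jordan_block(lam, n, t) - expm_taylor(Jb*t))))
```

The closed-form block exponential and the series agree to machine precision, confirming the e^{λt}, t e^{λt}, ½ t² e^{λt} pattern of Eqn (59).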

5 Review questions and skills

1. In what fundamental space do the regular eigenvectors lie?
2. Given the eigenvalues of a matrix, analyze the structure of the
   eigensystem.
   (a) Determine the number of required generalized eigenvectors.
3. Indicate the generating equation for the generalized eigenvectors.
4. Indicate in what fundamental space the vectors of the generating
   equations must lie.
5. When gk >= 2, and no regular eigenvector lies in the column space of
   (A − λk I), what steps can be taken?
6. When additional generalized eigenvectors are needed, and v, a candidate
   generalized eigenvector, does not lie in the column space of (A − λk I),
   what steps can be taken?

EE/ME 701: Advanced Linear Systems

Review-By-Example of some of the Basic Concepts and Methods of Control
System Analysis and Design

Contents

1 Differential Eqns, Transfer Functions & Modeling                          3
  1.1 Example 1, Golden Nugget Airlines . . . . . . . . . . . . . . . .     3
  1.2 Block diagram . . . . . . . . . . . . . . . . . . . . . . . . . .     4
  1.3 Laplace Transforms and Transfer Functions . . . . . . . . . . . .     5
      1.3.1 Laplace transform . . . . . . . . . . . . . . . . . . . . .     5
      1.3.2 Some basic Laplace transform Pairs . . . . . . . . . . . .      6
      1.3.3 Transfer Functions are Rational Polynomials . . . . . . . .     9
      1.3.4 A transfer function has poles and zeros . . . . . . . . . .    10
      1.3.5 Properties of transfer functions . . . . . . . . . . . . .     12
      1.3.6 The Impulse Response of a System . . . . . . . . . . . . .     15
2 Closing the loop, feedback control                                       16
  2.1 Analyzing a closed-loop system . . . . . . . . . . . . . . . . .     20
  2.2 Common controller structures: . . . . . . . . . . . . . . . . . .    21
  2.3 Analyzing other loops . . . . . . . . . . . . . . . . . . . . . .    22
3 Analysis                                                                 25
4 Working with the pole-zero constellation                                 26
  4.1 Basics of pole-zero maps, 1st order, p1 = −σ . . . . . . . . . .     26
      4.1.1 Real Pole location and response characteristics . . . . . .    27
  4.2 Basis of pole-zero maps, 2nd order, p1,2 = −σ ± jω . . . . . . .     28
      4.2.1 Notation for a complex pole pair . . . . . . . . . . . . .     29
      4.2.2 Complex pole location and response characteristics . . . .     30
      4.2.3 The damping factor: ζ . . . . . . . . . . . . . . . . . . .    33
  4.3 Determining approximate performance measures from a dominant
      second-order mode . . . . . . . . . . . . . . . . . . . . . . . .    35
      4.3.1 Rise time from pole locations, 10%-90% . . . . . . . . . .     36
      4.3.2 Peak time from pole locations . . . . . . . . . . . . . . .    36
      4.3.3 Settling time from pole locations . . . . . . . . . . . . .    36
5 Design                                                                   37
  5.1 Design methods . . . . . . . . . . . . . . . . . . . . . . . . .     38
  5.2 Root Locus Design . . . . . . . . . . . . . . . . . . . . . . . .    38
6 Summary                                                                  39
7 Glossary of Acronyms                                                     40

Part 1A: Controls Review-By-Example      (Revised: Sep 08, 2012)      Page 1

1 Differential Eqns, Transfer Functions & Modeling

1.1 Example 1, Golden Nugget Airlines

Dynamic systems are governed by differential equations (or difference
equations, if they are discrete-time).

Example (adapted from Franklin et al. 4th ed., problem 5.41, figure 5.79)

[Figure 1: Golden Nugget Airlines Aircraft. Me(t) is the elevator moment
(the control input), Mp(t) is the moment due to passenger movements (a
disturbance input), and θ(t) is the aircraft pitch angle, with pitch rate
ω(t) = dθ(t)/dt.]

For example, the aircraft dynamics give:

    d²ω(t)/dt² + 4 dω(t)/dt + 5 ω(t) = 1 dMt(t)/dt + 3 Mt(t)          (1)

    Mt(t) = Me(t) + Mp(t) ;    ω(t) = dθ(t)/dt                        (2)

    dMe(t)/dt + 10 Me(t) = 7 v(t)                                     (3)

where Mt(t) is the total applied moment.

    Eqn (1) has to do with the pitch-rate (angular velocity) response of
    the aircraft to Mt(t).

    Eqn (2) expresses that the pitch rate is the derivative of the pitch
    angle.

    And Eqn (3) describes the response of the elevator to an input command
    from the auto-pilot.

Main Fact: The Differential Equations come from the physics of the system.

1.2 Block diagram

A block diagram is a graphical representation of modeling equations and
their interconnection.

Eqns (1)-(3) can be laid out graphically, as in figure 2.

[Figure 2: Block diagram of the aircraft dynamics and signals:
 v(t) → [Elevator servo: 7/(s+10)] → Me(t); Mt(t) = Me(t) + Mp(t);
 Mt(t) → [Aircraft dynamics: (s+3)/(s²+4s+5)] → ω(t) → [Integrator: 1/s]
 → θ(t).]

    Signal Symbol   Signal Name / Description                       Units
    v(t)            Voltage applied to the elevator drive           [volts]
    Me(t)           Moment resulting from the elevator surface      [N m]
    Mp(t)           Moment resulting from movement of the
                    passengers                                      [N m]
    ω(t)            Pitch rate (angular velocity)                   [rad/sec]
    θ(t)            Aircraft pitch angle                            [radians]

Table 1: List of signals for the aircraft block diagram.

When analyzing a problem from basic principles, we would also have a list
of parameters.

    Parameter Symbol   Name / Description                  Value   Units
    b0                 Motion parameter of the elevator            [N-m/volt]
    ...                ...                                         ...

Table 2: List of parameters for the aircraft block diagram.
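The cascade in figure 2 can be multiplied out numerically. A small NumPy sketch (polynomial coefficient arrays, highest power first, as in Matlab):

```python
import numpy as np

# Cascade of the three blocks in figure 2, from v(t) to theta(t):
# elevator servo 7/(s+10), aircraft dynamics (s+3)/(s^2+4s+5), and the
# integrator 1/s.  Cascading TFs multiplies numerators and denominators.
num = np.polymul([7], [1, 3])
den = np.polymul(np.polymul([1, 10], [1, 4, 5]), [1, 0])
print(num)    # 7(s+3)            -> [7 21]
print(den)    # s(s+10)(s^2+4s+5) -> [1 14 45 50 0]
```

The result is the open-loop transfer function from v(t) to θ(t), 7(s+3) / [s (s+10)(s²+4s+5)].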

1.3 Laplace Transforms and Transfer Functions

To introduce the Transfer Function (TF), we need to review the Laplace
transform.

1.3.1 Laplace transform

The Laplace transform (LT) maps a signal (a function of time) to a function
of the Laplace variable s:

    F(s) = L{ f(t) }

The Inverse Laplace transform maps F(s) back to f(t):

    f(t) = L⁻¹{ F(s) }

[Figure: the LT maps the space of all functions of time ("Time Domain") to
the space of all functions of s ("Frequency Domain"); the inverse LT maps
back.]

Why pay attention to Laplace transforms?

1. Differential equations in the time domain correspond to algebraic
   equations in the s domain.
2. The LT makes possible the transfer function.
3. Time domain: find y(t) for one u(t).
4. Frequency domain: find G_uy(s) for all U(s).

[Figure 3: Solve the differential equation in the time domain (u(t) → y(t)),
or solve an algebraic equation in the s domain (U(s) → Y(s)).]

1.3.2 Some basic Laplace transform Pairs

    Time Domain Signal                                  Laplace Transform

    Unit Impulse:          f(t) = δ(t)                  F(s) = 1
    Scaled Impulse:        f(t) = b δ(t)                F(s) = b
    Unit step:             f(t) = 1+(t) = 1.0, t ≥ 0    F(s) = 1/s
    Unit ramp:             f(t) = t, t ≥ 0              F(s) = 1/s²
    Higher Powers of t:    f(t) = tⁿ, t ≥ 0             F(s) = n!/s^(n+1)
    Decaying Exponential:  f(t) = b e^(−σt)             F(s) = b/(s + σ)
    Sinusoid:              f(t) = sin(ωt)               F(s) = ω/(s² + ω²)
    Sinusoidal oscillation:
        f(t) = Bc cos(ωt) + Bs sin(ωt)
                                F(s) = (Bc s + Bs ω)/(s² + ω²)
    Oscillatory Exp. Decay:
        f(t) = e^(−σt) (Bc cos(ωt) + Bs sin(ωt))
                                F(s) = (Bc s + (Bc σ + Bs ω))/(s² + 2σs + ωn²)

    with ωn² = σ² + ω²

Table 3: Basic Laplace-transform pairs.

[Sketches: impulse δ(t) (width ε, height 1/ε, area = 1.0), unit step 1+(t),
unit ramp, decaying exponential e^(−σt), a sinusoid, and an oscillatory
exponential decay.]
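Any entry of Table 3 can be spot-checked against the defining integral F(s) = ∫₀^∞ f(t) e^(−st) dt. A NumPy sketch for the decaying-exponential pair, using trapezoidal quadrature on a truncated interval (the numeric values of b, σ, and s are made up for illustration):

```python
import numpy as np

# Check the pair f(t) = b e^{-sigma t}  <->  F(s) = b/(s + sigma)
# by quadrature of the defining integral F(s) = int_0^inf f(t) e^{-s t} dt.
b, sigma, s = 2.0, 3.0, 1.5
t = np.linspace(0.0, 40.0, 400001)      # integrand ~ e^{-4.5 t}: 40 s is plenty
g = b*np.exp(-sigma*t) * np.exp(-s*t)
F_num = float(np.sum(0.5*(g[1:] + g[:-1])) * (t[1] - t[0]))   # trapezoid rule
F_exact = b/(s + sigma)
print(F_num, F_exact)                    # both approximately 0.4444
```

The quadrature reproduces b/(s + σ) to well below 1e-5, confirming the table entry.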

Two theorems of the Laplace transform permit us to build transfer functions:

1. The Laplace transform obeys superposition and scaling:

   Given: z(t) = x(t) + y(t)    then    Z(s) = X(s) + Y(s)
   Given: z(t) = 2.0 x(t)       then    Z(s) = 2.0 X(s)

2. The Laplace transform of the derivative of a signal x(t) is s times the
   LT of x(t):

   Given: X(s) = L{ x(t) },     then    L{ dx(t)/dt } = s X(s)

Putting these rules together lets us find the transfer function of a system
from its governing differential equation.

   [u(t) → System Gp(s) → y(t)]

Consider a system that takes in the signal u(t) and gives y(t), governed by
the Diff Eq:

   a2 d²y(t)/dt² + a1 dy(t)/dt + a0 y(t) = b1 du(t)/dt + b0 u(t)      (4)

(Notice the standard form: output (unknown) on the left, input on the
right.)

Whatever signals y(t) and u(t) are, they have Laplace transforms. Eqn (4)
gives:

   L{ a2 d²y(t)/dt² + a1 dy(t)/dt + a0 y(t) }
       = L{ b1 du(t)/dt + b0 u(t) }                                   (5)

   a2 s² Y(s) + a1 s Y(s) + a0 Y(s) = b1 s U(s) + b0 U(s)             (6)

   (a2 s² + a1 s + a0) Y(s) = (b1 s + b0) U(s)                        (7)

Eqns (5)-(7) tell us something about the ratio of the LT of the output to
the LT of the input:

   Y(s)/U(s) = Output LT / Input LT
             = (b1 s + b0)/(a2 s² + a1 s + a0) = Gp(s)                (8)

A Transfer Function (TF) is the ratio of the output and input LTs. Given

   Gp(s) = (b1 s + b0)/(a2 s² + a1 s + a0) ,    Y(s) = Gp(s) U(s)     (9)

where Gp(s) is the TF of the plant.

The transfer function is like the gain of the system: it is the ratio of
the output LT to the input LT.

Important fact: the TF depends only on the parameters of the system
(coefficients a2..a0 and b0..b1 in the example), and not on the actual
values of U(s) or Y(s) (or u(t) and y(t)).

Basic and Intermediate Control System Theory present transfer-function-based
design. By engineering the characteristics of the TF, we engineer the system
to achieve performance goals.

[Figure 4: Block diagram of typical closed-loop control: r(t) → (+/−) → e(t)
 → Controller Kc Gc(s) → u(t) → Plant Gp(s) → y(t), with feedback ys(t)
 through Sensor Dynamics Hy(s).]
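The claim that the TF depends only on the system parameters, not the signals, can be sanity-checked on a first-order case: the DC gain Gp(0) = b0/a0 must be the settling value of the unit-step response. A minimal forward-Euler sketch (the coefficient values are made up for illustration):

```python
# First-order system a1*dy/dt + a0*y = b0*u with a unit-step input.
# The TF gain at s = 0 (DC gain) is b0/a0, so y(t) should settle there.
a1, a0, b0 = 1.0, 4.0, 8.0
dt, T = 1e-4, 5.0
y = 0.0
for _ in range(int(T/dt)):           # forward-Euler integration, u(t) = 1
    y += dt * (b0*1.0 - a0*y) / a1
print(y)                             # settles near b0/a0 = 2.0
```

After 5 seconds (20 time constants), the simulated output sits at the DC gain predicted by Eqn (8).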

1.3.3 Transfer Functions are Rational Polynomials

[Figure 5: Golden Nugget Airlines Aircraft. ω(t) [radians/second] is the
pitch-rate of the aircraft, and Mt(t) is the moment (torque) applied by the
elevator surface.]

Consider the differential equation of the aircraft pitch rate:

    d²ω(t)/dt² + 4 dω(t)/dt + 5 ω(t) = 1 dMt(t)/dt + 3 Mt(t)          (10)

From (10) we get the TF: take the LT of both sides, and rearrange:

    (s² + 4s + 5) ω(s) = (s + 3) Mt(s)

    ω(s)/Mt(s) = (s + 3)/(s² + 4s + 5)                                (11)

Note:

    We can write down the TF directly from the coefficients of the
    differential equation.

    We can write down the differential equation directly from the
    coefficients of the TF.

Transfer functions, such as Eqn (11), are ratios of two polynomials:

    ω(s)/Mt(s) = (s + 3)/(s² + 4s + 5) :  numerator polynomial /
                                          denominator polynomial      (12)

We call a TF such as (11) a rational polynomial.

1.3.4 A transfer function has poles and zeros

A TF has a numerator and denominator polynomial, for example

    Gp(s) = N(s)/D(s) = (2s² + 8s + 6)/(s³ + 2s² + 4s + 0)

The roots of the numerator are called the zeros of the TF, and the roots of
the denominator are called the poles of the TF. For example:

>> num = [2 8 6]
num = 2   8   6
>> den = [1 2 4 0]
den = 1   2   4   0

>> zeros = roots(num)
zeros = -3
        -1
>> poles = roots(den)
poles =  0
        -1.0000 + 1.7321i
        -1.0000 - 1.7321i

We can also use Matlab's system tool to find the poles and zeros

>> Gps = tf(num, den)     %% Build the system object
Transfer function:
   2 s^2 + 8 s + 6
  -----------------
  s^3 + 2 s^2 + 4 s

>> zero(Gps)
ans = -3
      -1
>> pole(Gps)
ans =  0
      -1.0000 + 1.7321i
      -1.0000 - 1.7321i
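The same pole/zero computation can be done in Python/NumPy; np.roots is the analogue of Matlab's roots():

```python
import numpy as np

# Zeros and poles of Gp(s) = (2s^2 + 8s + 6) / (s^3 + 2s^2 + 4s)
num = [2, 8, 6]
den = [1, 2, 4, 0]
zeros = np.roots(num)     # roots of the numerator:   -3, -1
poles = np.roots(den)     # roots of the denominator:  0, -1 +/- 1.7321j
print(zeros)
print(poles)
```

As in the Matlab session, the zeros are real and the complex poles appear as a conjugate pair.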

Interpreting the poles (p1, p2, ..., pn) and zeros (z1, ..., zm)

We can use the poles and zeros to write the TF in factored form:

    G(s) = (2s² + 8s + 6)/(s³ + 2s² + 4s)
         = b2 (s − z1)(s − z2) / [(s − p1)(s − p2)(s − p3)]
         = 2 (s + 3)(s + 1) / [(s − 0)(s + 1 − 1.732j)(s + 1 + 1.732j)]

With a complex pole pair we can do two things:

1. Use a shorthand

       G(s) = 2 (s + 3)(s + 1) / [ s (s + 1 ± 1.732j) ]

   (Because poles always come in complex conjugate pairs.)

2. Write the complex pole pair as a quadratic, rather than 1st-order terms

       G(s) = 2 (s + 3)(s + 1) / [ s (s² + 2s + 4) ]                  (14)

The zeros are values of s at which the transfer function goes to zero.
The poles are values of s at which the transfer function goes to infinity.

We can plot a pole-zero map:

>> Gps=tf([2 8 6], [1 2 4 0])
>> pzmap(Gps)

[Figure 6: Pole-zero constellation of the aircraft transfer function.]

1.3.5 Properties of transfer functions

Just as a differential equation can be scaled by multiplying the left and
right sides by a constant, a TF can be scaled by multiplying the numerator
and denominator by a constant.

Monic: A TF is said to be monic if an = 1. We can always scale a TF to be
monic. If G1(s) is scaled to be monic, then

    G1(s) = b̄0 / (s + ā1)                                            (13)

with b̄0 = b0/an and ā1 = a1/an.

Rational Polynomial Form: A TF is in rational polynomial form when the
numerator and denominator are each polynomials. For example

    Gp(s) = (2s² + 8s + 6)/(s³ + 2s² + 4s)

An example of a TF not in rational polynomial form is:

    G3(s) = [ 2 (s + 3)/(s) ] / [ (s² + 2s + 4)/(s + 1) ]             (15)

By clearing the fractions within the fraction, G3(s) can be expressed in
rational polynomial form:

    G3(s) = [ 2 (s + 3)/(s) ] / [ (s² + 2s + 4)/(s + 1) ]
            · [ (s)(s + 1) ] / [ (s)(s + 1) ]
          = 2 (s + 3)(s + 1) / [ (s² + 2s + 4)(s) ]
          = (2s² + 8s + 6) / (s³ + 2s² + 4s)

Note the middle form above, which can be called factored form.

EE/ME 701: Advanced Linear Systems                                    Section 1.3.5

1.3.5 Properties of Transfer function (continued)

General form for a rational polynomial transfer function:

The general form for a rational polynomial transfer function is

    G(s) = (b_m s^m + b_{m-1} s^{m-1} + ... + b_1 s + b_0) / (a_n s^n + a_{n-1} s^{n-1} + ... + a_1 s + a_0)      (16)

    m = number of zeros,   n = number of poles

Proper and Strictly Proper Transfer Functions:

A TF with m <= n is said to be proper.

When m < n the TF is said to be strictly proper.

Example of a TF that is not proper:

    G4(s) = (2 s^2 + 5 s + 4) / (s + 1)          note: m = 2, n = 1

Such a TF can always be factored by long division:

    G4(s) = (2 s^2 + 5 s + 4)/(s + 1) = 2 s (s + 1)/(s + 1) + (3 s + 4)/(s + 1)
          = 2 s + (3 s + 4)/(s + 1) = (2 s + 3) + 1/(s + 1)

A non-proper TF such as G4(s) has a problem: as s = jω → j∞, the gain goes to infinity!

Since physical systems never have infinite gain at infinite frequency, physical systems must have proper transfer functions.

System Type: A property of transfer functions that comes up often is the system type. The type of a system is the number of poles at the origin.

So, for example, the aircraft transfer function from elevator input to pitch rate gives a type 0 system:

    Ω(s)/Me(s) = (s + 3)/(s^2 + 4 s + 5)         poles: s = -2 ± 1 j,      type: 0

But the TF from the elevator to the pitch angle gives a type I system:

    Θ(s)/Me(s) = (s + 3)/(s (s^2 + 4 s + 5))     poles: s = 0, -2 ± 1 j,   type: I

If we put a PID controller in the loop, which adds a pole at the origin, the system will be type II.
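The long-division factoring of G4(s) can be checked numerically. A quick sketch (mine, Python standing in for MATLAB) using NumPy polynomial division:

```python
import numpy as np

# Divide 2s^2 + 5s + 4 by s + 1: expect quotient 2s + 3, remainder 1
quotient, remainder = np.polydiv([2, 5, 4], [1, 1])
print(quotient)   # [2. 3.]  ->  2s + 3
print(remainder)  # [1.]     ->  + 1/(s + 1)
```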

Part 1A: Controls Review-By-Example

(Revised: Sep 08, 2012)

Page 13

Part 1A: Controls Review-By-Example

(Revised: Sep 08, 2012)

Page 14

EE/ME 701: Advanced Linear Systems

Section 2.0.0

EE/ME 701: Advanced Linear Systems

Section 2.0.0

1.3.6 The Impulse Response of a System

When we have a system such as in figure 7, the Laplace transform of the output (a signal) is given by

    Y(s) = Gp(s) U(s)                                                                  (17)

A unit impulse input is a very short pulse with an area under the curve of the pulse of 1.0.

Since the Laplace transform of a unit impulse is 1, the Laplace transform of the impulse response is the transfer function:

    Yimp(s) = Gp(s) Uimp(s) = Gp(s) . 1 = Gp(s)

Figure 7: The transfer function is the Laplace transform of the impulse response. (u(t): impulse, U(s) = 1; system Gp(s); y(t): impulse response, Y(s) = Gp(s).)

The connection between the impulse response and TF can be used to determine the TF:

    Apply an impulse and measure the output, y(t). Take the LT and use Gp(s) = Y(s).

The connection between the impulse response and TF helps to understand the mathematical connection between an LT and a TF.

2 Closing the loop, feedback control

Figure 8: A plant with TF Gp(s) in open and closed loop. Closed loop requires a sensor.

Feedback is the basic magic of controls. A feedback controller can

    Make an unstable system stable . . . . . . . . Helicopter autopilot
    Make a stable system unstable . . . . . . . . Early fly-ball governors
    Make a slow system fast . . . . . . . . . . . Motor drive, industrial automation
    Make a fast system slow . . . . . . . . . . . F16 controls, approach / landing mode
    Make an inaccurate system accurate . . . . . machine tools

The magic comes because closing the loop changes the TF.

Open loop:

    Y(s)/U(s) = Gp(s)

Closed loop:

    Y(s) = Gp(s) (R(s) - Y(s)) = Gp(s) R(s) - Gp(s) Y(s)

    Y(s) (1 + Gp(s)) = Gp(s) R(s)

    Y(s)/R(s) = Gp(s) / (1 + Gp(s))                                                    (18)

    Y(s)/R(s) = Forward Path Gain / (1 + Loop Gain) = Try(s)                           (19)
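The unity-feedback algebra Y/R = Gp/(1 + Gp) maps directly onto polynomial coefficients: with Gp = Np/Dp, the closed loop is Np/(Dp + Np). A minimal sketch (the function name `closed_loop` is mine, not from the notes):

```python
import numpy as np

def closed_loop(num_p, den_p):
    """Unity-feedback closed loop of Gp = Np/Dp:  Try = Np / (Dp + Np)."""
    num = np.asarray(num_p, dtype=float)
    # np.polyadd aligns coefficient arrays of different length before adding
    den = np.polyadd(np.asarray(den_p, dtype=float), num)
    return num, den

# Example: Gp(s) = 1/(s + 1)  ->  Try(s) = 1/(s + 2)
num, den = closed_loop([1.0], [1.0, 1.0])
print(num, den)  # [1.] [1. 2.]
```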

Example, wrapping a simple feedback loop around the aircraft dynamics

Figure 9: Simple feedback of aircraft pitch angle. (r(t) → + → Mt(t) → (s + 3)/(s^2 + 4 s + 5) → Θ(t), with Θ(t) fed back.)

    Θ(s)/R(s) = Gp(s)/(1 + Gp(s)) = [ (s + 3)/(s (s^2 + 4 s + 5)) ] / [ 1 + (s + 3)/(s (s^2 + 4 s + 5)) ]        (20)

Eqn (20) is not in rational polynomial form, so multiply numerator and denominator by s (s^2 + 4 s + 5):

    Θ(s)/R(s) = (s + 3) / [ s (s^2 + 4 s + 5) + (s + 3) ]                              (21)

The closed-loop TF is still not quite in Rat Poly form; here is the final step:

    Θ(s)/R(s) = (s + 3) / (s^3 + 4 s^2 + 6 s + 3)                                      (22)

Figure 10: Block with r(t) as input and Θ(t) as output.

Analyzing the response:

    Gps = tf([1 3], [1 4 5 0])
    Try = tf([1 3], [1 4 6 3])
    figure(1), clf
    [Y_open, Top]   = step(Gps, 6);
    [Y_closed, Tcl] = step(Try, 6);
    plot(Top, Y_open, Tcl, Y_closed)
    xlabel('t [seconds]');
    ylabel('\Omega, pitch-rate')
    title('Open- and closed-loop')
    text(3, 1.6, 'Open-loop', 'rotation', 45)
    text(4, 0.8, 'Closed-loop')
    SetLabels(14)
    print('-deps2c', 'OpenClosedResponse1')

The open-loop system is type I. The closed-loop system is type 0. The response completely changes!

Figure 11: Open and Closed loop response of the aircraft; the two responses have very different characteristics. (The open-loop response ramps up; the closed-loop response settles.)
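Without the Control Toolbox, the closed-loop step response of Eqn (22) can be approximated by simple Euler integration; a rough sketch (mine) using the controllable canonical form of (s + 3)/(s^3 + 4 s^2 + 6 s + 3), showing the closed loop settling at the reference:

```python
# Controllable canonical form of (s+3)/(s^3+4s^2+6s+3):
#   x1' = x2,  x2' = x3,  x3' = -3*x1 - 6*x2 - 4*x3 + r,   y = 3*x1 + 1*x2
dt = 1e-3
x1 = x2 = x3 = 0.0
r = 1.0                      # unit step reference
for _ in range(12000):       # integrate 12 seconds
    dx1, dx2, dx3 = x2, x3, -3*x1 - 6*x2 - 4*x3 + r
    x1 += dt*dx1; x2 += dt*dx2; x3 += dt*dx3
y = 3*x1 + 1*x2
print(round(y, 2))           # ~1.0: type 0 loop with DC gain 3/3 = 1 tracks the step
```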

Introduce a proportional controller gain:

Figure 12: Feedback for aircraft pitch control, with P-type gain Kc. (r(t) → e(t) → Kc → Mt(t) → (s + 3)/(s^2 + 4 s + 5) → Θ(t).)

Look at Kc = 1.0, 3.0, 10.0:

    Kc = 1;  Try1 = tf(Kc*[1 3], [1 4 5 0]+Kc*[0 0 1 3])
    Kc = 3;  Try2 = tf(Kc*[1 3], [1 4 5 0]+Kc*[0 0 1 3])
    Kc = 10; Try3 = tf(Kc*[1 3], [1 4 5 0]+Kc*[0 0 1 3])
    figure(1), clf
    ...
    plot(Top, Y_open, Tcl, Y_closed1, Tcl, Y_closed2, Tcl, Y_closed3)
    ...

The system gets much faster as Kc increases.

The system gets less stable as Kc increases.

Figure 13: Open and Closed loop response of the aircraft, with Kc = 1.0, Kc = 3.0, and Kc = 10.0.

2.1 Analyzing a closed-loop system

A typical, basic loop (such as velocity PI control of a motor drive) has 3 components:

1. Plant (thing being controlled)

2. Controller or compensator (usually a computer, often a PLC for motor drives)

3. A sensor

Figure 14: Basic loop with a plant, compensator and sensor. (Controller: Kc Gc(s) = Kc Nc(s)/Dc(s); Plant: Gp(s) = Np(s)/Dp(s); Sensor dynamics: Hy(s) = Ny(s)/Dy(s).)

The TF is given as

    Try(s) = Y(s)/R(s) = Forward Path Gain / (1 + Loop Gain) = Kc Gc Gp / (1 + Kc Gc Gp Hy)

Often, for the controls engineer, the plant Gp(s) is set (e.g., the designer of a cruise-control does not get to change the engine size).

As a controls engineer, we get to pick Gc(s) and maybe influence Hy(s) (e.g., by convincing the project manager to spring $$$ for a better sensor).
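The loss of damping with increasing Kc can be checked numerically; a sketch (mine, function name my own) computing the closed-loop poles and the damping factor of the complex pole pair:

```python
import numpy as np

def min_damping(den):
    """Smallest damping factor zeta = -Re(p)/|p| over the complex poles."""
    poles = np.roots(den)
    zetas = [-p.real / abs(p) for p in poles if abs(p.imag) > 1e-9]
    return min(zetas)

for Kc in (1, 3, 10):
    # Closed-loop denominator: s(s^2 + 4s + 5) + Kc*(s + 3)
    den = np.polyadd([1, 4, 5, 0], np.multiply(Kc, [0, 0, 1, 3]))
    print(Kc, round(min_damping(den), 2))
# The damping of the complex pair drops as Kc grows: a less stable response
```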

2.2 Common controller structures:

    PD (Proportional-Derivative):   Gc(s) = kd s + kp
    PI (Prop.-Integral):            Gc(s) = (kp s + ki) / s
    PID (Prop.-Int.-Deriv.):        Gc(s) = (kd s^2 + kp s + ki) / s
    Lead-Lag:                       Gc(s) = Kc (s + z1)(s + z2) / [ (s + p1)(s + p2) ]

Common Applications

    PID: Many, many places; Astrom has estimated that 80% of controllers are PID (good speed, accuracy, stability).

    Lead-Lag: Used where a pole at the origin is unacceptable; can be as good as PID (notice, 5 parameters rather than 3).

    PI: Velocity control of motor drives, temperature control (good speed and accuracy, acceptable stability).

    PD: Position control where high accuracy is not required (good speed and stability, so-so accuracy).

2.3 Analyzing other loops

Figure 15: Basic loop with a disturbance input, d(t), and sensor noise, Vs(t), added. (Input shaping: Hr(s); disturbance filter: Gd(s) = Nd/Dd; controller: Kc Gc(s) = Kc Nc/Dc; plant: Gp(s) = Np/Dp; sensor dynamics: Hy(s) = Ny/Dy.)

In some cases we may want to consider additional inputs and outputs.

    Many systems have a disturbance signal that acts on the plant; think of wind gusts and a helicopter autopilot.

    All systems have sensor noise.

    Any signal in a system can be considered an output. For example, if we want to consider the controller effort, uc(t), arising due to the reference input:

    Tru(s) = Uc(s)/R(s) = Forward Path Gain / (1 + Loop Gain) = Hr(s) Kc Gc(s) / (1 + Kc Gc(s) Gp(s) Hy(s))      (23)

If we wanted to consider the error arising with a disturbance, we would have

    Tde(s) = E(s)/D(s) = Gd(s) Gp(s) Hy(s) (-1) / (1 + Kc Gc(s) Gp(s) Hy(s))                                     (24)
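The three PID terms combine over the common denominator s; a tiny sketch (mine, with arbitrary example gains) forming the numerator of Gc(s) = (kd s^2 + kp s + ki)/s from the separate P, I, and D paths:

```python
import numpy as np

kp, ki, kd = 2.0, 1.0, 0.5   # arbitrary example gains
# Gc(s) = kd*s + kp + ki/s = (kd s^2 + kp s + ki) / s
num = np.polyadd(np.polyadd(kd * np.array([1, 0, 0]),
                            kp * np.array([0, 1, 0])),
                 ki * np.array([0, 0, 1]))
den = np.array([1.0, 0.0])   # s: the pole at the origin that raises the system type
print(num, den)              # numerator 0.5 s^2 + 2 s + 1, denominator s
```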

2.3 Analyzing other loops (continued)

As a final example, let's consider the output arising with sensor noise:

    Tvy(s) = Y(s)/Vs(s) = Hy(s) (-1) Kc Gc(s) Gp(s) / (1 + Kc Gc(s) Gp(s) Hy(s))       (25)

The example transfer functions, Eqns (23), (24) and (25), show some interesting properties. The TFs are repeated here (omitting the many (s)'s):

    Try(s) = Hr Kc Gc Gp / (1 + Kc Gc Gp Hy)        Tru(s) = Hr Kc Gc / (1 + Kc Gc Gp Hy)

    Tde(s) = Gd Gp Hy (-1) / (1 + Kc Gc Gp Hy)      Tvy(s) = Hy (-1) Kc Gc Gp / (1 + Kc Gc Gp Hy)

The denominators are all the same:

    The poles are the same for any input/output signal pair.

    The stability and damping (both determined by the poles) are the same for any signal pair.

The numerators are different:

    The zeros are in general different for each input/output signal pair.

    Since the numerators help determine whether a signal is small or large, signals may have very different amplitudes and phase angles.

If we consider what happens as Kc → ∞, we can see what happens for very high gain. For this, assume that Hr(s) = 1.0 and Gd(s) = 1.0, since these two terms merely pre-filter inputs.

When Kc → ∞, 1 + Kc Gc Gp Hy → Kc Gc Gp Hy, so

    Try(s) → Kc Gc Gp / (Kc Gc Gp Hy) = 1/Hy        Tru(s) → Kc Gc / (Kc Gc Gp Hy) = 1/(Gp Hy)

    Tde(s) → Gd Gp Hy (-1) / (Kc Gc Gp Hy) = -Gd/(Kc Gc)        Tvy(s) → Hy (-1) Kc Gc Gp / (Kc Gc Gp Hy) = -1

Try(s) → 1/Hy(s) shows that the TF of the plant can be compensated; it disappears from the closed-loop TF as Kc → ∞.

Try(s) → 1/Hy(s) also shows that the TF of the sensor can not be compensated. If the characteristics of Hy(s) are bad (e.g., a cheap sensor), there is nothing feedback control can do about it!

Tde(s) → -Gd/(Kc Gc) shows that disturbances can be compensated; as Kc → ∞, errors due to disturbances go to zero ;)

Tru(s) → 1/(Gp Hy) shows that Uc(s) does not go up with Kc, and also, if the plant has a small gain (Gp(s) is small for some s = jω), then a large control effort will be required for a given input.

Tvy(s) → -1 shows that there is no compensation for sensor noise. If there is sensor noise, it is going to show up in the output!

Summary:

    Feedback control can solve problems arising with characteristics of the plant, Gp(s), and disturbances, d(t).

    Feedback control can not solve problems with the sensor, Hy(s), or sensor noise, vs(t).
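These high-gain limits are easy to see numerically; a sketch (mine, with arbitrary constant example gains Gp = 2, Gc = 1, Hy = 0.5, and Hr = Gd = 1) evaluating the four transfer functions as Kc grows:

```python
# Arbitrary frequency-independent example gains (not from the notes)
Gp, Gc, Hy = 2.0, 1.0, 0.5
for Kc in (1.0, 10.0, 1000.0):
    L = Kc * Gc * Gp * Hy               # loop gain
    Try = Kc * Gc * Gp / (1 + L)
    Tru = Kc * Gc / (1 + L)
    Tde = Gp * Hy * (-1) / (1 + L)
    Tvy = Hy * (-1) * Kc * Gc * Gp / (1 + L)
    print(Kc, round(Try, 3), round(Tru, 3), round(Tde, 3), round(Tvy, 3))
# As Kc grows: Try -> 1/Hy = 2, Tru -> 1/(Gp*Hy) = 1, Tde -> 0, Tvy -> -1
```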

3 Analysis

Analysis is determining system performance based on a model.

The performance criteria any controls engineer should be aware of are seen in table 4.

The criteria fall into 3 groups.

In EE/ME 574 we will be using all 18 of the criteria listed.

Many performance measures are introduced in first semester controls (we will review them here). Those with (*) are introduced in 2nd semester controls.

As seen in table 4, some performance measures are evaluated in the time domain, and others in the frequency domain.

                    Speed                  Degree of Stability    Accuracy
    Time            Rise Time              Stable / Unstable      Steady-State Error
    Domain          Settling Time          Overshoot              ISE (*)
                    Peak Time              Damping Factor         IAE (*)
    Frequency       Pole Locations         Pole Locations         Disturbance Rejection
    Domain          Bandwidth (*)          Phase Margin (*)       Noise Rejection (*)
    (Or S-plane)    Cross-over Freq. (*)   Gain Margin (*)        Tracking Error (*)

Table 4: A range of controller specifications used in industry.
Note: (*) marks items developed in 2nd semester controls.

Depending on the performance measure, one of four methods of evaluation is used:

1. Evaluated directly from the transfer function (e.g., steady state error)

2. Evaluated by looking at the system response to a test signal (e.g., trise)

3. Evaluated from the pole-zero constellation (e.g., stability, settling time)

4. Evaluated from the bode plot (e.g., phase margin)

4 Working with the pole-zero constellation

4.1 Basics of pole-zero maps, 1st order, p1 = -σ

You have seen that:

Each real pole is associated with a mode of the system response.

A real pole gives the terms of Eqn (26), as seen in figure 16.

    Y(s) = 3/(s + 2) = C1/(s + σ)                                                      (26)

    y(t) = C1 e^(-σ t)

Figure 16: First order pole and impulse response. (Pole-Zero Map: pole at s = -2 on the real axis of the S-plane, "Faster" toward the left; impulse response h(t) = e^(-σ t) = e^(-2 t) = e^(-t/τ), falling to C1/e at t = τ.)

A real pole has these characteristics:

    y(t) ∝ e^(-t/τ), where τ [sec] = 1/σ is the time constant.

    Further to the left indicates a faster response (smaller τ).

    The pole-zero constellation does not show either the KDC or the Krlg of the TF, or the amplitude of the impulse response.

4.1.1 Real Pole location and response characteristics

Example 1: Shifting poles to the left accelerates transient decay.

Two example first-order responses are shown:

    X Case:  σ = 4 [sec^-1],   τ = 1/4 [sec]  = 0.25 [sec]
    Case:    σ = 16 [sec^-1],  τ = 1/16 [sec] = 0.06 [sec]

Figure 17: A change in pole location changes the decay rate and damping. (PZ map showing the two systems; "Faster Decay" toward the left of the real axis.)

Figure 18: A change in σ changes the time constant. (System step responses with τ = 0.25 [sec] and τ = 0.06 [sec]; each decays as e^(-t/τ).)

4.2 Basics of pole-zero maps, 2nd order, p1,2 = -σ ± jω

And we have seen for second order:

Each complex pole pair gives a mode of the system response.

A complex pole pair gives the terms of Eqn (27), as seen in figure 19.

Using the Laplace transform pair

    F(s) = [ Bc s + (σ Bc + ω Bs) ] / [ s^2 + 2 σ s + ωn^2 ]   ⟷   f(t) = e^(-σ t) ( Bc cos(ω t) + Bs sin(ω t) )

one finds

    Y(s) = (b1 s + b0) / [ s^2 + 2 σ s + (σ^2 + ω^2) ] = (2 s + 14) / (s^2 + 3 s + 18.25)        (27)

    y(t) = A e^(-σ t) cos(ω t + φ) = 3.40 e^(-1.5 t) cos(4 t - 53.97°)                           (28)

Figure 19: Second order pole and impulse response. (PZ map: poles p1, p1* = -1.5 ± j 4; impulse response 3.40 e^(-1.5 t) cos(4 t - 54.0°).)

4.2.1 Notation for a complex pole pair

A complex pole pair can be expressed in polar or rectangular coordinates:

Figure 20: Complex pole pair with ωn and θ defined. (PZ map with sgrid, which indicates ζ and ωn; poles p1, p1* = -1.5 ± j 4.)

    Term       Description          Given by              Units
    σ          Decay Rate           p1 = -σ + jω          [sec^-1]
    ω or ωd    Damped Nat. Freq.    p1 = -σ + jω          [rad/sec]
    ωn         Natural Freq.        ωn^2 = σ^2 + ω^2      [rad/sec]
    θ          Pole Angle           θ = atan2(ω, σ)       [deg]
    ζ          Damping Factor       ζ = σ/ωn              Dimless, [-]

Table 5: Factors derived from the location of a complex pole.
(Note: Franklin et al. often use σ, ωd and θ.)

    Rectangular                               Polar
    p1 = -σ + j ω                             p1 = ωn ∠ (90° + θ)
    σ = ζ ωn                                  ωn = |p1|
    ω = √(1 - ζ^2) ωn                         ζ = sin(θ)
    H(s) = ω / [ (s + σ)^2 + ω^2 ]            H(s) = ωn √(1 - ζ^2) / [ s^2 + 2 ζ ωn s + ωn^2 ]

Table 6: The terms of table 5 relate to rectangular or polar form for the poles.

4.2.2 Complex pole location and response characteristics

Example 1: Shifting poles to the left accelerates transient decay.

    X Case:  σ = 1 [sec^-1],  ω = 4 [rad/sec],  ωn = 4.12 [rad/sec],  ζ = 0.25 [Dim less],  PO = 44%
    Case:    σ = 4 [sec^-1],  ω = 4 [rad/sec],  ωn = 5.66 [rad/sec],  ζ = 0.71 [Dim less],  PO = 4%

Figure 21: A change in σ changes the decay rate and damping. (PZ map showing the two systems, "Faster Decay" to the left; sgrid lines at ζ = 0.3, 0.5, 0.7, 0.9 indicate ζ and ωn.)

Figure 22: Step response: a change in σ; ω is unchanged. (X case: ζ = 0.24, 46% overshoot; second case: ζ = 0.71, 4% overshoot.)
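The quantities of table 5 are simple to compute from a pole location; a small sketch (the function name is mine) applying the table's relations to the X case above:

```python
import math

def pole_factors(p):
    """Decay rate, damped freq., natural freq., pole angle, and damping
    factor for one pole of a complex pair p = -sigma + j*omega (table 5)."""
    sigma, omega = -p.real, abs(p.imag)
    wn = math.hypot(sigma, omega)                   # wn^2 = sigma^2 + omega^2
    theta = math.degrees(math.atan2(omega, sigma))  # pole angle
    zeta = sigma / wn                               # damping factor
    return sigma, omega, wn, theta, zeta

# The X case above: p1 = -1 + 4j
print([round(v, 2) for v in pole_factors(complex(-1, 4))])
# -> [1.0, 4.0, 4.12, 75.96, 0.24]
```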

Example 2: Shifting poles out vertically increases oscillation frequency.

    X Case:  σ = 4 [sec^-1],  ω = 4 [rad/sec],   ωn = 5.66 [rad/sec],   ζ = 0.71 [Dim less],  PO = 4%
    Case:    σ = 4 [sec^-1],  ω = 16 [rad/sec],  ωn = 16.49 [rad/sec],  ζ = 0.24 [Dim less],  PO = 44%

Figure 23: A change in ω changes the oscillation frequency and damping. (PZ maps, "Faster Oscillation" upward; sgrid lines at ζ = 0.3, 0.5, 0.7, 0.9.)

Figure 24: Step response: a change in ω. (X case: ζ = 0.71, 4% overshoot; second case: ζ = 0.24, 46% overshoot.)

Example 3: Shifting poles out radially rescales time.

    X Case:  σ = 4 [sec^-1],   ω = 4 [rad/sec],   ωn = 5.66 [rad/sec],   ζ = 0.71 [Dim less],  PO = 4%,  τ = 0.25 [sec]
    Case:    σ = 16 [sec^-1],  ω = 16 [rad/sec],  ωn = 22.63 [rad/sec],  ζ = 0.71 [Dim less],  PO = 4%,  τ = 0.06 [sec]

Figure 25: A radial change in pole location changes the decay rate and oscillation frequency, but not the damping.

Figure 26: Maintaining θ, time is rescaled. (Both step responses show ζ = 0.71, 4% overshoot; the time scale compresses from τ = 0.25 [sec] to τ = 0.06 [sec].)

Note: The S-plane has units of [sec^-1].

4.2.3 The damping factor: ζ

Plugging σ = ζ ωn back into the 2nd order form gives:

    H(s) = b0 / (s^2 + s a1 + a0) = b0 / (s^2 + 2 ζ ωn s + ωn^2)                       (29)

giving:

    ζ = a1 / (2 ωn) = a1 / (2 √a0)                                                     (30)

ζ is defined by Eqn (30) for either real poles (ζ ≥ 1.0) or a complex pole pair (0.0 < ζ < 1.0).

A full list of what we can learn from ζ is seen in table 7.

    ζ value           System characteristic
    ζ < 0             Unstable
    ζ = 0             Marginally Stable (poles on imaginary axis)
    0 < ζ < 1         Stable Complex Pole Pair
    0.5 < ζ < 1.0     Typical value for a well designed controller
    ζ = 1             Critical Damping (repeated real poles)
    ζ > 1             Over Damped (two separate real poles)

Table 7: Ranges of damping factor.

Figure 28: Damping factor and stability. (S-plane: Stable Region (ζ > 0), Unstable Region (ζ < 0), Marginally Stable Region (ζ = 0).)

As illustrated in figure 24, above, on the range 0.0 < ζ < 1.0, the damping factor relates to percent overshoot. For a system with two poles and no zeros, the percent overshoot is given by Eqn (31) and plotted in figure 27:

    P.O. = 100 e^( -π ζ / √(1 - ζ^2) )                                                 (31)

Figure 27: Percent overshoot versus damping factor. Exact for a system with two complex poles and no zeros, and approximate for other systems. (Plot: percent overshoot falls from 100% toward 0% as ζ runs from 0 to 0.9.)
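Eqn (31) reproduces the overshoot values quoted in the pole-location examples above; a quick check in Python:

```python
import math

def percent_overshoot(zeta):
    """P.O. for a two-pole, no-zero system, Eqn (31)."""
    return 100.0 * math.exp(-math.pi * zeta / math.sqrt(1.0 - zeta**2))

print(round(percent_overshoot(0.24), 1))  # ~46: the low-damping cases
print(round(percent_overshoot(0.71), 1))  # ~4: the zeta = 0.71 cases
```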

4.3 Determining approximate performance measures from a dominant second-order mode

Returning to performance measures: in section 3 we've seen that these are defined from the step response.

    Rise time        Settling time        Peak time        Overshoot

Figure 29: Quantity definitions in a CL step response. (Marked on the step response: Tr, the 10%-90% rise time between y10 and y90; Mp, the overshoot above yss; tp, the peak time; Ts, the 98% settling time, after which y(t) stays within 2% of yss.)

While there are no equations that give these measures exactly for any system other than 2 poles and no zeros, for this special case

    T(s) = b0 / (s^2 + 2 ζ ωn s + ωn^2)

we can derive the relations below.

4.3.1 Rise time from pole locations, 10%-90%

    1st order:   tr = 2.2 τ = 2.2/σ

    2nd order:   tr ≈ 1.8/ωn                                                           (32)

4.3.2 Peak time from pole locations

    tp = π/ω   (ω = ωd, the damped natural frequency)

4.3.3 Settling time from pole locations

Settling time is sometimes defined as the time to approach within 4% or 2% or even 1% of the steady-state value. These give slightly different definitions of Ts. The one we will use (corresponding approximately to 2%, since e^-4 ≈ 0.018) is:

    ts = 4/σ                                                                           (33)
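For the example pole pair p = -1.5 ± j4 used earlier, the approximate measures work out as below (a sketch of mine applying the relations of sections 4.3.1-4.3.3):

```python
import math

sigma, omega = 1.5, 4.0            # pole pair -1.5 +/- j4
wn = math.hypot(sigma, omega)      # natural frequency

t_rise = 1.8 / wn                  # 10%-90% rise time, 2nd-order estimate
t_peak = math.pi / omega           # peak time
t_settle = 4.0 / sigma             # ~2% settling time
print(round(t_rise, 2), round(t_peak, 2), round(t_settle, 2))  # 0.42 0.79 2.67
```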

5 Design

In some sense, Design is the opposite of Analysis.

    Analysis:  Completed Controller Design  →  Performance Specifications
    Design:    Performance Specifications  →  Completed Controller Design

Figure 30: Design = Analysis^-1

In Analysis, we use mathematical methods to determine performance specifications from a completed controller design (all structure and parameters specified).

In Design, we use whatever method works!

    Mathematical
    Gut feeling
    Trial and error
    Calling a colleague with experience

to determine a controller structure and parameters to meet performance goals.

5.1 Design methods

The major design methods are:

Root locus
    Speed, Stability: determine ζ by determining pole locations
    Accuracy: increase the system type, check the SSE

Frequency response
    Speed: Bandwidth and Cross-over frequency directly from bode plot
    Stability: Phase margin, Gain margin directly from bode plot
    Accuracy: tracking accy., disturbance rejection directly from bode plot

State Space methods
    Design using state-space design methods; check speed, stability and accuracy from the step response

5.2 Root Locus Design

Devised by Walter R. Evans, 1948 (1920 - 1999).

    W.R. Evans, "Control system synthesis by root locus method," Trans. AIEE, vol. 69, pp. 66-69, 1950.

    (Amer. Institute of Elec. Engs became IEEE in 1963.)

Evans was teaching a course in controls, and a student (now anonymous) asked a question about an approximation.

6 Summary

Dynamic systems are governed by differential equations.

Every input or output signal of the system has a unique Laplace transform.

For linear systems, the ratio of Laplace transforms, however, does not depend on the signals. The ratio depends only on properties of the system. We call it the transfer function. For example:

    Guy(s) = (b1 s + b0) / (s^2 + a1 s + a0)                                           (34)

The transfer function gives us many results:

    The pole locations tell us the stability and damping ratio.

    We can get approximate values for rise time, settling time and other performance measures.

Control system analysis is the process of determining performance from a system model and controller design.

We have various tools for evaluating performance, including:

    * Steady-state error
    * Step response
    * Pole locations
    * Bode plot

Design = Analysis^-1: it is the process of determining a controller design given a system model and performance goals.

The root locus method is one method for controller design.

7 Glossary of Acronyms

LT: Laplace Transform
TF: Transfer Function
FPG: Forward path gain
LG: Loop Gain (also Gl(s))
RHP: Right-half of the S plane (unstable region)
LHP: Left-half of the S plane (stable region)
PO: Percent Overshoot
LF: Low frequency (as in LF gain). Frequencies below the cross-over frequency.
HF: High frequency (as in HF gain). Frequencies above the cross-over frequency.
CL: Closed loop
OL: Open loop
SSE: Steady-state error
P, PD, PI, PID: Proportional-Integral-Derivative control, basic and very common controller structures.

EE/ME 701: Advanced Linear Systems

Building Linear System Models

Contents

1 System modeling, classical vs. state-variable modeling . . . 4
  1.1 Modeling with a single, higher-order differential equation (sometimes called classical modeling) . . . 5
  1.2 State variable modeling: . . . 7
      1.2.1 Signals that are not states or inputs . . . 8
      1.2.2 Writing the n first-order differential equations in standard form . . . 9
      1.2.3 Standard form for the linear, time-invariant state-variable model: . . . 10
  1.3 Why consider linear state-variable models? (when algebraic form is just fine) . . . 11
  1.4 A quick history of state variable modeling . . . 12
2 Formal properties of systems . . . 14
  2.1 A system Maps inputs to outputs . . . 15
  2.2 Simple example system . . . 16
3 Classification of Systems and Signals . . . 18
  3.1 Admissible signal . . . 18
  3.2 Linear System (without internal variables) . . . 19
  3.3 Time-invariant system (without internal variables) . . . 20
  3.4 Causal (non-anticipative) system . . . 21
  3.5 Realizable system . . . 21
  3.6 Lumped Parameter system . . . 22
  3.7 Continuous-time versus sampled (discrete-time) signals and systems . . . 23
  3.8 Discrete-time (sampled) signals and systems . . . 24
  3.9 Continuous signals and systems, continuity in the mathematical sense . . . 25
  3.10 Quantized signal . . . 26
  3.11 Deterministic Signals and systems . . . 27
  3.12 A Note on units . . . 27
4 State, revisited . . . 28
  4.1 Modified definition of linearity, considering state . . . 30
      4.1.1 For direct application of Superposition and Scaling, state must be zero . . . 30
      4.1.2 Definition of a Linear System considering non-zero state (e.g., DeCarlo definition 1.8) . . . 31
      4.1.3 Time-invariant system, considering non-zero state . . . 32
5 Standard Notation for State Model . . . 33
6 Steps of building a state-variable model (for a linear system) . . . 35
7 Examples with state-variable models . . . 37
  7.1 Example 1, an electrical circuit . . . 37
      7.1.1 Building and exercising the circuit model . . . 43
  7.2 Example 2, a mechanical system . . . 45
      7.2.1 Building and exercising the suspension model . . . 53

Part 5: Models of Linear Systems    (Revised: Sep 10, 2012)    Page 1

      7.2.2 Conclusions, Quarter Suspension Example . . . 59
  7.3 Example 3, state-variable model from a differential equation . . . 60
      7.3.1 Steps of deriving the state-variable model . . . 61
      7.3.2 Building and exercising a differential equation model . . . 64
  7.4 State-variable model and simulation diagram . . . 66
8 Some basic operations with state-variable models . . . 73
  8.1 Deriving the transfer function from the state-variable model . . . 73
      8.1.1 Interpreting the transfer function . . . 74
      8.1.2 DC gain of a state-variable model . . . 77
      8.1.3 Interpreting D . . . 77
  8.2 Coordinate transformation of a State Variable Model . . . 78
      8.2.1 Example coordinate transformation . . . 81
9 State-variable Feedback control . . . 84
  9.1 Determination of a new model with state feedback . . . 85
  9.2 State-variable feedback example: Inverted pendulum . . . 87
      9.2.1 Designing a pole placement controller . . . 90
      9.2.2 LQR Design . . . 92
10 Conclusions . . . 95

1 System modeling, classical vs. state-variable modeling

To introduce state modeling, let's first look at an example. Consider the RLC circuit of figure 1. The parameters and signals of the circuit are:

    Parameters   Units                  Signals   Units      Signals   Units
    R            [volts/amp]            Vs(t)     [volts]    VL(t)     [volts]
    L            [volts/(amp/sec)]      iS(t)     [amps]     iL(t)     [amps]
    C            [amps/(volt/sec)]      VR(t)     [volts]    VC(t)     [volts]
                                        iR(t)     [amps]     iC(t)     [amps]

Table 1: List of parameters and signals.

Figure 1: RLC Circuit with voltages and currents marked. (Source Vs(t) drives is through R; at the top node the current iR splits into iL through L and iC through C, with VC(t) across C.)

(Figure 1, repeated: the RLC circuit with Vs(t), is, iR, iL, iC and VC(t) marked.)

The constituent relations are:

    VR(t) = R iR(t)
    VL(t) = L d iL(t)/dt                                                               (1)
    iC(t) = C d VC(t)/dt                                                               (2)

The continuity constraints are (Kirchhoff's laws for electrical systems):

    VR(t) + VL(t) - Vs(t) = 0                                                          (3)
    VC(t) - VL(t) = 0
    is(t) - iR(t) = 0
    iR(t) - iL(t) - iC(t) = 0                                                          (4)

Two approaches to modeling this type of dynamic system:

1. Classical modeling, with a single high-order differential equation

2. State-variable modeling

1.1 Modeling with a single, higher-order differential equation (sometimes called classical modeling)

Develop the nth order differential equation; write:

    iR(t) - iL(t) - iC(t) = 0

    (1/R) (Vs(t) - VC(t)) - iL(t) - C VC'(t) = 0

    VC'(t) + (1/RC) VC(t) + (1/C) iL(t) = (1/RC) Vs(t)                                 (5)

where known quantities are on the right and unknowns on the left.

In Equation (5) we have unknown signals VC(t) and iL(t) on the left; we must eliminate one of them to have one equation in one unknown.

Take the derivative of Eqn (5) to produce iL'(t), and use the inductor constituent relation and VC(t) = VL(t) to give:

    VC''(t) + (1/RC) VC'(t) + (1/C) iL'(t) = (1/RC) Vs'(t)

    VC''(t) + (1/RC) VC'(t) + (1/LC) VC(t) = (1/RC) Vs'(t)                             (6)

Limitations of classical modeling:

    Complexity goes up quickly in the number of variables: 3rd order takes 4-5X as much algebra as 2nd order, 4th order 10-20X more algebra.

    The significance of the initial conditions is unclear: VC(t0), VC'(t0), ...; how do we get VC'(t0)?

1.2 State variable modeling:

Develop n first order differential equations in terms of states and inputs.

Determine the state variables of the system; these are the variables that make up the initial condition for the system (formal definition and more examples coming later).

    For an nth order system, there are n states.

    For the RLC circuit, there are two states; we can choose them to be VC(t) and iL(t), giving the state vector:

        x(t) = [ VC(t) ; iL(t) ]

We are working toward the state equation:

    x'(t) = A x(t) + B u(t)                                                            (7)

Starting with differential equations that come directly from the model:

    VC'(t) = (1/C) iC(t)                                                               (8)

    iL'(t) = (1/L) VL(t)                                                               (9)

Note that the derivative is written on the left, and all signals that determine the derivative are on the right.

All signals on the right must be states, {VC(t), iL(t)}, or inputs, {Vs(t)}.

Notice that in Eqns (8) and (9), the right hand terms include iC(t) and VL(t), neither of which is a state or an input.

Use the basic modeling equations to re-write the differential equations with only states and inputs on the right hand side:

    VC'(t) = (1/C) iC(t) = (1/C) (iR(t) - iL(t)) = (1/C) [ (1/R)(Vs(t) - VC(t)) - iL(t) ]        (10)

    iL'(t) = (1/L) VL(t) = (1/L) VC(t)                                                 (11)

1.2.1 Signals that are not states or inputs

Systems generally have many signals. Table 2 shows 8 signals for this simple circuit. Only 2 are states, and 1 is an input, so 5 are just signals within the system.

A Deep Property of state variable modeling: all of the signals in the system have a unique expression in terms of the states and inputs.

    If no expression exists, the model does not have enough states, and

    If there are multiple possible expressions, the model has redundant states that should be removed from the state vector.

Examples of using Eqns (1) - (4) to write other signals in terms of the states, VC(t) and iL(t), and the input Vs(t):

    VL(t) = VC(t)

    VR(t) = Vs(t) - VL(t) = Vs(t) - VC(t)

    iR(t) = VR(t)/R = (1/R) (Vs(t) - VC(t))

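The "deep property" above can be exercised directly: every internal signal of the RLC circuit written as a unique function of the states {VC, iL} and the input Vs. The sketch below does this in Python; the resistance value is an illustrative assumption.

```python
# Each internal signal of the RLC circuit expressed uniquely in terms of
# the states {VC, iL} and the input Vs, using Eqns (1)-(4).
def circuit_signals(VC, iL, Vs, R=100.0):  # R is an assumed parameter value
    VL = VC          # continuity: VC(t) = VL(t)
    VR = Vs - VL     # KVL: VR + VL - Vs = 0
    iR = VR / R      # Ohm's law (constituent relation)
    iC = iR - iL     # KCL: iR - iL - iC = 0
    iS = iR          # continuity: is(t) = iR(t)
    return {"VL": VL, "VR": VR, "iR": iR, "iC": iC, "iS": iS}

sig = circuit_signals(VC=0.5, iL=0.001, Vs=1.0)
print(sig["VR"], sig["iR"], sig["iC"])
```

For VC = 0.5 V, iL = 1 mA, Vs = 1 V this gives VR = 0.5 V, iR = 5 mA, and iC = 4 mA; no additional information beyond the states and input is needed.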
1.2.2 Writing the n first-order differential equations in standard form

The model equations are written with:

  - Each state derivative on the left
  - The right-hand side written with states and inputs as the only signals

    dVC(t)/dt = -(1/(RC)) VC(t) - (1/C) iL(t) + (1/(RC)) Vs(t)  (12)
    diL(t)/dt = (1/L) VC(t)                                     (13)

Put the model equations in matrix / vector form:

    dx(t)/dt = A x(t) + B u(t)                                  (14)

    d  [ VC ]       [ -1/(RC)  -1/C ] [ VC ]       [ 1/(RC) ]
    -- [    ] (t) = [                ] [    ] (t) + [        ] Vs(t)   (15)
    dt [ iL ]       [   1/L      0  ] [ iL ]       [    0   ]

    y(t) = [ 1  0 ] [ VC ]       + [0] Vs(t)
                    [ iL ] (t)

1.2.3 Standard form for the linear, time-invariant state-variable model

[Figure 2: Block diagram showing the elements of a state-variable model,
 including input u(t), output y(t), state x(t), and model matrices A, B, C, D.]

Standard form for state-variable model equations:

    State Equation:      dx(t)/dt = A x(t) + B u(t)             (16)
    Output Equation:     y(t) = C x(t) + D u(t)                 (17)
    Initial Condition:   x(t0)

where the state vector is:

    x(t) = [ VC(t) ]    [volts]
           [ iL(t) ]    [amps ]

  Signals                 Name            Units
  x(t) = [VC(t); iL(t)]   State vector    [volts; amps]
  u(t) = Vs(t)            Input           [volts]
  y(t)                    Output vector   [volts]

  Parameters              Name                 Units
  A = [ -1/(RC)  -1/C ]   System matrix        [ 1/sec          volt/(amp-sec) ]
      [   1/L      0  ]                        [ amp/(volt-sec)     1/sec      ]
  B = [ 1/(RC) ]          Input matrix         [ 1/sec          ]
      [    0   ]                               [ amp/(volt-sec) ]
  C = [ 1  0 ]            Output matrix        [ [ ]  volts/amp ]
  D = [0]                 Feed-forward matrix  [ ]

Table 2: List of parameters and signals of the state-variable model. Each of
the seven elements has a name, and each has physical units, which depend on
the details of the system.

  - States are signals (i.e., functions of time)
  - States have units (like all physical signals)

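The standard-form matrices of Eqn (15) can be written down and checked numerically. The sketch below builds A, B, C, D in Python with illustrative (assumed) parameter values and verifies the shapes required by Eqns (16)-(17) for n = 2 states, m = 1 input, p = 1 output.

```python
import numpy as np

# State-variable matrices of Eqn (15); R, L, C_cap are assumed values.
R, L, C_cap = 100.0, 0.1, 10e-6
A = np.array([[-1.0 / (R * C_cap), -1.0 / C_cap],
              [1.0 / L, 0.0]])
B = np.array([[1.0 / (R * C_cap)],
              [0.0]])
C = np.array([[1.0, 0.0]])   # output y(t) = VC(t)
D = np.array([[0.0]])

# n = 2 states, m = 1 input, p = 1 output: the shapes of Eqns (16)-(17)
n, m, p = 2, 1, 1
assert A.shape == (n, n) and B.shape == (n, m)
assert C.shape == (p, n) and D.shape == (p, m)
print(np.linalg.eigvals(A))   # system poles come from A (more on this later)
```

This also previews a later theme: the dynamics (poles) are determined entirely by the A matrix.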
1.3 Why consider linear state-variable models ?
    (when algebraic form is just fine)

Many types of systems are modeled with sets of linear equations:

  - Plants and controllers for control
  - Multi-variable statistical analysis (e.g., data analysis)
  - Business and economic models
  - Signal processing
  - And more

In fact, the only models which can be worked out in a general way are linear.

  - Many truly nonlinear results are known, but they are applicable only in
    specific cases.
  - Nonlinear systems are often analyzed by linearizing them about an
    operating point, to apply the methods of linear analysis.

Any time there are two or more variables, vectors and matrices become powerful
tools for analysis. Earlier algebraic approaches do not extend well to
multi-variable systems (such as multi-input, multi-output (MIMO) systems).

Powerful mathematical results for linear systems are applicable to problems
from all domains. !!!! Geometric interpretation !!!! Vector spaces bring us
notions of size, orthogonality, independence, sufficiency, and degrees of
freedom (DOF), which apply equally well across all domains.

1.4 A quick history of state variable modeling

Prior to the development of state-variable approaches, analytic work focused
on algebraic equations for modeling: controls, design, economics, statistics,
etc.

Algebraic approaches tended to give specific results without general theory.
For example, it was not known until publication of Mason's rule in 1953 when
control system equations could be simplified.

Many properties of the system are made clear only by the state-variable
approach:

  - Controllability and Observability
  - Coordinate transformations and modal coordinates
  - Optimal control (and Kalman filtering)

Interest in state variable modeling for analysis and design of systems
accelerated with the early work of Kalman, Bellman and others:

  - Rudolf E. Kalman, "On the General Theory of Control Systems," Proc. 1st
    Inter. Conf. on Automatic Control, IFAC: Moscow, pp. 481-93, 1960.
  - Richard E. Bellman, Dynamic Programming, Princeton University Press, 1957.
  - Rudolf E. Kalman, "Mathematical Description of Linear Systems," SIAM J.
    Control, vol. 1, pp. 152-192, 1963.
  - Vasile M. Popov, "On a New Problem of Stability for Control Systems,"
    Automatic Remote Control, 24(1):1-23, 1963.

The computational tools to work with vectors and matrices were being
introduced at about the same time:

  - James Hardy Wilkinson, Rounding Errors in Algebraic Processes, Englewood
    Cliffs: Prentice-Hall, 1963.
  - George E. Forsythe and Cleve B. Moler, Computer Solution of Linear
    Algebraic Systems, Englewood Cliffs: Prentice-Hall, 1967.

Cleve Moler went on to co-found Mathworks (Matlab), and is Mathworks' chief
scientist today.

2 Formal properties of systems

Definition: A system is something which maps inputs to outputs.

[Figure 3: A system maps inputs u(t) to outputs y(t). Example: an aircraft.
 Inputs: aileron, rudder, elevator, throttle. Outputs: pitch, roll, yaw,
 velocity.]

  - Analysis: What can we say about how an aircraft responds ?
  - Design: How do we engineer the aircraft so that it responds well ?
  - Control: How do we pick u(t) ?

Starting point:

  - What is a system ?
  - What are the properties of a system ?

2.1 A system maps inputs to outputs

[Figure 4: The system N{} maps the space of all possible input signals u(t)
 to the space of all possible output signals y(t).]

    y(t) = N [u(t)]                                             (18)

Examples (describe the components and signals):

  System                        Components                 Signals
  Car Cruise Control            Car, Engine, Throttle      v(t), RPM(t), ...
  Rocket Guidance               Rocket, engine,            v(t), attitude
                                steering hydraulics        angles, ...
  Temperature in Oil Refining
  Economy
  Biosphere

2.2 Simple example system

[Figure 5: Simple example system with a single state. This system is an RC
 circuit, with input u(t) and output y(t) = Vc(t).]

Input signal, unit step function:

                     { 0    t < 0
    u(t) = 1+(t) =   {                                          (19)
                     { 1    t >= 0

Look at two cases:

    Case 1: VC(t = t0) = 0.0 [volts]
    Case 2: VC(t = t0) = 3.0 [volts]

Solving the differential equation gives:

    y1(t) = 1 - e^(-t/(RC))        t > 0
    y2(t) = 1 + 2 e^(-t/(RC))      t > 0

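The two responses above can be evaluated directly; both start from their respective initial conditions and converge to the same steady state of 1 volt, which is the point of the example. A small Python sketch, assuming an illustrative time constant RC = 1 second:

```python
import math

# Step responses of the RC circuit for the two initial conditions.
RC = 1.0  # assumed time constant [sec]

def y1(t):  # Case 1: VC(t0) = 0 V
    return 1.0 - math.exp(-t / RC)

def y2(t):  # Case 2: VC(t0) = 3 V
    return 1.0 + 2.0 * math.exp(-t / RC)

print(y1(0.0), y2(0.0))        # initial values: 0 and 3 volts
print(y1(10 * RC), y2(10 * RC))  # both approach 1 volt
```

Same input, different outputs: the difference is carried entirely by the initial internal condition, which motivates the definition of state that follows.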
[Figure 6: Responses y1(t) and y2(t) of the RC circuit.]

So the system response, y(t), depends on two things:

  - The input u(t)
  - Something inside the system at t = t0, in this case VC(t = t0)

Definition: The state of the system is information about conditions inside
the system at t = t0 that is sufficient and necessary to uniquely determine
the system output.

Write:

    y(t) = N [u(t), x(t0)]                                      (20)

Some facts about the system state:

  - It is equal to the number of independent storage elements in the system
    (such as independent capacitors, inductors, masses and springs).
  - The selection of state variables is not unique (more on this and
    state-variable transformations later).
  - The number of state variables is equal to the order of the differential
    equation required to describe the system.

3 Classification of Systems and Signals

Signals and systems are characterized by several properties. These have
formal mathematical definitions (see Bay, section 1.1). The characteristics,
each applying to signals, systems, or both, are:

  * Admissible
  * Linear
  * Time Invariant
  * Causal (Non-anticipative)
  * Realizable
  * Lumped
  * Continuous
  * Discrete
  * Quantized
  * Deterministic

Table 3: Characteristics of Signals and Systems

3.1 Admissible signal

A signal is admissible if it has mathematical properties such that it has a
Laplace transform (all physical signals are admissible). Signal u(t) is
admissible if:

  i)   it is piecewise continuous (discontinuous at most at a finite number
       of locations in any finite interval);

  ii)  ∃ t0 s.t. u(t) = 0 ∀ t <= t0
       (∃: there exists; ∀: for all);

  iii) u(t) is exponentially bounded (there exists an exponential signal
       which goes to infinity faster than u(t));

  iv)  u(t) is a vector of the correct dimension (e.g., for aircraft inputs
       [aileron, rudder, elevator, throttle], u(t) in R^4).

3.2 Linear System (without internal variables)

System: a system is linear if and only if (iff) it obeys the superposition
rule (which incorporates scaling and homogeneity).

Given:

    y1 = N [u1]
    y2 = N [u2]

and

    u3 = c1 u1 + c2 u2

Then:

    y3 = c1 y1 + c2 y2

Or:

    N [c1 u1 + c2 u2] = c1 N [u1] + c2 N [u2]

Related:

    Scaling:       N [c1 u1] = c1 y1
    Homogeneity:   N [0] = 0

3.3 Time-invariant system (without internal variables)

A system is time invariant if a time-shifted input gives the time-shifted
output. Given:

    y = N [u]

then

    N [u(t - T)] = y(t - T)

Examples:

  - Time invariant: a boiler.
    Fuel in -> heat out (independent of clock time t0).

  - Not time invariant (time varying):

    Example: the economy.
      Define: u(t) = loans for buying corn seed;
      u(t) = 1 in April is not the same as u(t) = 1 in August.

    Example: flight dynamics.

        dx(t)/dt = A(t) x(t) + B(t) u(t)
        y(t) = C(t) x(t) + D(t) u(t)

      A(t), B(t), C(t) and D(t) depend on altitude and Mach number.

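The superposition rule can be checked numerically on any concrete linear map. The sketch below uses a small FIR (convolution) system as the stand-in for N[] (this example system is an assumption, not one from the notes) and verifies N[c1 u1 + c2 u2] = c1 N[u1] + c2 N[u2] on sampled signals.

```python
# A concrete linear system N[]: a 3-tap FIR filter acting on a sampled signal.
def N(u, h=(1.0, 0.5, 0.25)):
    y = []
    for k in range(len(u)):
        # discrete convolution sum y(k) = sum_j h(j) u(k - j)
        y.append(sum(h[j] * u[k - j] for j in range(len(h)) if k - j >= 0))
    return y

u1 = [1.0, 0.0, 2.0, -1.0]
u2 = [0.5, 1.5, -0.5, 0.0]
c1, c2 = 2.0, -3.0

u3 = [c1 * a + c2 * b for a, b in zip(u1, u2)]
y3 = N(u3)                                          # response to combined input
y3_super = [c1 * a + c2 * b for a, b in zip(N(u1), N(u2))]  # superposed responses
print(all(abs(a - b) < 1e-9 for a, b in zip(y3, y3_super)))
```

For a nonlinear N[] (e.g., one that squares its input) the same check fails, which is exactly the content of the definition.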
3.4 Causal (non-anticipative) system

For a causal system, the output at time t0, y(t0), is completely determined
by the inputs for t <= t0. That is: u(t1), t1 > t0, does not influence y(t0).

Causal example:

    y(t) = 2 u(t - 1)

Non-causal (anticipative) example:

    y(t) = 2 u(t + 1)

3.5 Realizable system

A realizable system is a physically possible system, one which in principle
could be built. To be realizable, a system must be:

  1. Causal
  2. y(t) is real (not complex) if u(t) is real

3.6 Lumped Parameter system

A lumped parameter system is characterized by ordinary differential equations
(the coefficients of the differential equation are the lumped parameters).
Distributed systems are characterized by partial differential equations.

A good example of a lumped parameter system is an RLC circuit.

[Figure 7: An RLC circuit with source voltage Vs(t), source current is,
 currents iR, iL, iC, and capacitor voltage VC(t).]

When the frequency is not too high, the circuit of figure 7 is characterized
by:

    d2vc(t)/dt2 + (1/(RC)) dvc(t)/dt + (1/(LC)) vc(t) = (1/(RC)) dvs(t)/dt   (21)

  - When the frequency is high enough, wave phenomena become important.
  - At some high frequency, lumped parameter models break down:
      - Partial differential equations and wave propagation are required
      - A distributed system model is required.
  - All continuous-time systems have a cross-over frequency where distributed
    phenomena (often wave phenomena) become important.
  - We will deal exclusively with lumped parameter systems.

3.7 Continuous-time versus sampled (discrete-time) signals and systems

Continuous (time) system: governed by differential equations, such as:

    dx(t)/dt = -2 x(t) + 3 u(t)                                 (22)
    y(t) = x(t)

[Figure 8: A continuous signal u(t) and the corresponding sampled signal.]

Signals: A continuous signal is defined for all values of time, e.g.,

    u(t) = 0.5 + 0.2 cos(pi t/3 + pi/6) + 0.2 cos(2 pi t/3) + 0.2 cos(2 pi t + pi)

A sampled (discrete) signal is defined only at particular instants, e.g.,

    t(k) = 0.3, 0.6, 0.9, 1.2, 1.5, 1.8, 2.1, 2.4
    u(k) = 0.805, 0.297, 0.235, 0.400, 0.128, 0.0937, 0.5247, 0.528

3.8 Discrete-time (sampled) signals and systems

Signal: A discrete (sampled) signal u(k) is defined (sampled) only at
specific sample instants, t = tk (see figure 8).

System: A discrete (sampled) system is governed by a difference equation,
such as:

    x(k) = -2 x(k-1) - 1 x(k-2) + 3 u(k)                        (23)
    y(k) = x(k)

Usage: generally we say "sampled signal" and "discrete system," though all
combinations are sometimes seen.

  - Any computer-based data acquisition results in sampled signals; any
    computer-based signal processor is a discrete system.
  - Some mathematical results are more straight-forward or intuitive (or
    possible) for one time or the other.

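The continuous system of Eqn (22) and a discrete system are linked by sampling. As a sketch (the sample period T is an assumption), the standard zero-order-hold discretization of dx/dt = -2x + 3u gives a difference equation x(k+1) = Ad x(k) + Bd u(k) whose step response settles at the same DC gain, 3/2, as the continuous system:

```python
import math

# Exact zero-order-hold discretization of dx/dt = -2 x + 3 u (Eqn (22)):
#   Ad = e^(a T),  Bd = (Ad - 1) * b / a   with a = -2, b = 3.
T = 0.1                          # assumed sample period [sec]
Ad = math.exp(-2.0 * T)
Bd = (1.0 - Ad) * 3.0 / 2.0

x = 0.0
for k in range(200):             # unit-step input held over each sample period
    x = Ad * x + Bd * 1.0
print(round(x, 6))               # approaches the DC gain 3/2
```

After 20 seconds (two hundred samples) the transient e^(-2t) has decayed completely, so x is 1.5 to machine precision.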
3.9 Continuous signals and systems, continuity in the mathematical sense

Signal: A signal is continuous if the limit of a sequence of values is the
value of the limit:

    lim (t -> t1) u(t) = u(t1)                                  (24)

Said another way: if t -> t1, then u(t) -> u(t1).

System: A system is continuous if, when a sequence of input signals ui
converges to u*, then the corresponding sequence of outputs converges to the
output signal of the limiting input. That is, if

    lim (i -> inf) ui = u*                                      (25)

then

    N [u*] = lim (i -> inf) N [ui]                              (26)

3.10 Quantized signal

A signal can be quantized: the signal takes only certain possible values, for
example

    u(t) in {0, 1/4, 1/2, 3/4, 1, 5/4, ...}                     (27)

Discrete signals are often quantized, but continuous signals can also be.
Example signals with the possible combinations of the characteristics
continuous or discrete, and/or quantized, are seen in figure 9.

[Figure 9: Example signals: a) Continuous-time signal (with a discontinuity),
 b) Continuous-time, quantized signal, c) Discrete signal, d) Discrete,
 quantized signal.]

3.11 Deterministic signals and systems

Signal: Deterministic signals have no random component. Examples:

  - Deterministic signal: the ideal voltage on an RC circuit, with u(t) and
    VC(t0) given.
  - Non-deterministic signal: wind gusts acting on a helicopter.

System: A deterministic system does not introduce random components into the
output signals.

  - Deterministic: RLC circuit
  - Non-deterministic: economy, biological systems, quantum mechanical
    systems

3.12 A note on units

  - All physical signals have units.
  - Systems with physical inputs and outputs have units.
  - The units of the system are [Output] / [Input].

4 State, revisited

We saw above that internal variables partially determine the output of a
system. Look at two cases:

    Case 1: vc(t = t0) = 0 [volts]
    Case 2: vc(t = t0) = 3 [volts]

    y1(t) = 1 - e^(-t/(RC))        t > 0
    y2(t) = 1 + 2 e^(-t/(RC))      t > 0

[Figure 10: Simple example system: an RC circuit with input u(t) and output
 y(t) = Vc(t), and its responses y1(t) and y2(t).]

These internal variables are called states.

Definition: The state of a system at a time t0 is the minimum set of internal
variables which is sufficient to uniquely specify the system outputs given
the input signal over [t0, inf).

Examples of states:

  Elementary dynamics:   position and velocity of particles
  Circuits:              voltages across capacitors, currents through
                         inductors
  Fluid system:          rates of flow, levels in tanks
  Economy:               balances in accounts, levels of material in
                         inventories, position of material in transit

Some facts about states:

  - The number of states is equal to the order of the differential (or
    difference) equation of the model.
  - States are often associated with energy-storage elements.
  - More generally, states are associated with storage of something, where
    the stored amount changes with time, giving:

        d (Amount Stored) / dt = f(x, u, t)

  - The definition of the states of a system is not unique; consider
    analyzing a circuit for a voltage, or for a current.
  - Keep in mind that the state variables must be independent. Consider the
    circuit of figure 11.

[Figure 11: Circuit with inductors L1 and L2 in series. This system has only
 one state, iL1(t).]

4.1 Modified definition of linearity, considering state

4.1.1 For direct application of Superposition and Scaling, state must be zero

For superposition and scaling to apply to a system in the simple way, the
internal states must be zero. Consider:

    y(t) = N [u(t), x(t0)] = u(t) + x(t0)                       (28)

so with u3 = u1 + u2,

    y3(t) = u1(t) + u2(t) + x(t0)                               (29)

but simple application of superposition would require:

    y3(t) = (u1(t) + x(t0)) + (u2(t) + x(t0))
          = u1(t) + u2(t) + 2 x(t0)                             (30)

So x(t0) = 0 is required for Eqns (29) and (30) to be consistent.
(0 is the null or zero vector; it is a vector of zeros.)

Definition: zero state response:

    y(t) = N [u(t), 0] = u(t) + 0                               (31)

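The failure of simple superposition for non-zero state, Eqns (28)-(30), is easy to see numerically. A sketch of the scalar example system y = N[u, x0] = u + x0:

```python
# The example system of Eqn (28): y = N[u, x0] = u + x0.
def N(u, x0):
    return u + x0

u1, u2, x0 = 1.0, 2.0, 3.0

y3 = N(u1 + u2, x0)               # Eqn (29): u1 + u2 + x0        -> 6.0
y3_super = N(u1, x0) + N(u2, x0)  # Eqn (30): u1 + u2 + 2*x0      -> 9.0
print(y3, y3_super)               # disagree by x0 when x0 != 0

# With zero state the two forms agree, per Eqn (31):
print(N(u1 + u2, 0.0) == N(u1, 0.0) + N(u2, 0.0))
```

The discrepancy between the two computations is exactly x0, so the two forms are consistent only when x(t0) = 0, which is why the zero-state response is singled out.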
4.1.2 Definition of a Linear System considering non-zero state (e.g., DeCarlo
definition 1.8)

Let N [u, x(t0)] be the response of system N [] to the input signal u()
defined over [t0, inf), with initial state x(t0).

Definition: system N [] is linear if and only if, for any two admissible
input signals u1 and u2, and for any scalar k,

    k (N [u1, x(t0)] - N [u2, x(t0)]) = N [k (u1 - u2), 0(t0)]
                                        ∀ x(t0) in R^n          (32)

where 0() is the zero vector.

For linear systems, the response can be factored into the response due to the
initial state and the response due to the input. (Student exercise.)

4.1.3 Time-invariant system, considering non-zero state

The definition of a time-invariant system is a bit more complex when state is
considered, because we must account for the time-shifted state.

A system is time-invariant if, ∀ T >= t0 and ∀ x(t0) in R^n, ∃ x1 in R^n such
that

    NT [N [u, x(t0)]] = N [NT [u], x1(t0 + T)]                  (33)

(the time-shifted output = output from the time-shifted input), where NT []
is the time delay system:

    NT (u(t)) = u(t - T).                                       (34)

Interpretation: Eqn (33) says that there exists a possibly different initial
condition x1(t0 + T) such that the delayed output of the original system with
IC x(t0) is identical to the output of the system with a delayed input and
the new IC x1(t0 + T).

Study question: For a linear time-invariant system, what is the relationship
between x(t0) and x1(t0 + T) ?

5 Standard Notation for State Model

The notation for a state model depends on its properties: whether it is
linear or nonlinear, time invariant or time varying, continuous or discrete,
etc.

Most general continuous case, nonlinear and time varying:

    State equation:      dx(t)/dt = f(x(t), u(t), t)            (35)
    Output equation:     y(t) = g(x(t), u(t), t)                (36)

All models have x(t0) as the IC. Also, see DeCarlo example, Eqn (1.24).

For the nonlinear, time-invariant system, time is no longer an argument of
f() and g():

    State equation:      dx(t)/dt = f(x(t), u(t))               (37)
    Output equation:     y(t) = g(x(t), u(t))                   (38)

The linear, time-invariant, continuous-time system we have seen (and will be
the one we most commonly use):

    State equation:      dx(t)/dt = A x(t) + B u(t)             (39)
    Output equation:     y(t) = C x(t) + D u(t)                 (40)

For the linear, time-varying, continuous-time system we add (t) as an
argument to the model matrices:

    dx(t)/dt = A(t) x(t) + B(t) u(t)                            (41)
    y(t) = C(t) x(t) + D(t) u(t)                                (42)

The discrete-time system does not have derivatives; the system equation gives
x(k) at the next sample instant:

    x(k+1) = A x(k) + B u(k)                                    (43)
    y(k) = C x(k) + D u(k)                                      (44)

And if the discrete-time system is time-varying, A, B, C, D become functions
of the sample:

    x(k+1) = A(k) x(k) + B(k) u(k)                              (45)
    y(k) = C(k) x(k) + D(k) u(k)                                (46)

[Figure 12: Configuration of the signals (vectors) and parameters (matrices)
 of a state-variable model: u(t) is m x 1, x(t) and dx(t)/dt are n x 1, y(t)
 is p x 1; A is n x n, B is n x m, C is p x n, D is p x m. Here n is the
 number of states, m the number of inputs, and p the number of outputs.]

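The dimension bookkeeping of figure 12 is worth automating; a mismatched shape is one of the most common modeling errors. A small Python sketch (the checker function is mine, not from the notes):

```python
import numpy as np

# Check a state model (A, B, C, D) against the shapes of figure 12:
# A: n x n, B: n x m, C: p x n, D: p x m.
def check_dimensions(A, B, C, D):
    n, m = B.shape               # states and inputs from the input matrix
    p = C.shape[0]               # outputs from the output matrix
    ok = (A.shape == (n, n) and C.shape == (p, n) and D.shape == (p, m))
    return ok, (n, m, p)

# A 2-state, 1-input, 1-output model passes the check:
A = np.zeros((2, 2)); B = np.zeros((2, 1))
C = np.zeros((1, 2)); D = np.zeros((1, 1))
print(check_dimensions(A, B, C, D))
```

The same check applies unchanged to the discrete-time matrices of Eqns (43)-(46), since the shapes are identical.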
6 Steps of building a state-variable model (for a linear system)

1. Write the relevant relations for the system

   (a) Define symbols for the signals and parameters
   (b) Write the equations
       i.  Constituent relations (for elements)
       ii. Continuity constraints (for how elements are linked into a system)
   (c) Record the units, verify that units balance in the equations.
       The equations express laws of physics; the units must balance.

2. Identify the differential equations

   (a) Determine the system order
       - The system order will almost always be the sum of the orders of the
         contributing differential equations.
       - Rarely, differential equations may be inter-dependent in a way that
         reduces the order.
   (b) Select the state variables
       - n state variables for an nth-order system
       - State variables must be independent
       - The choice is not unique
       - Often the storage variables are a good choice (often called
         physical coordinates)

3. Write the differential equations in state-variable form

   (a) Higher order differential equations are written as a chain of
       first-order equations
   (b) Put the derivative on the left-hand side; these must be the state
       derivatives
   (c) All signals on the right-hand side must be expressed in terms of the
       states and inputs
   (d) Put the model in state-variable form

4. Write the equation of the output signal (or signals) using the states and
   inputs

5. Check units throughout, to verify correctness.

Essential things to keep in mind:

1. Always distinguish between signals and parameters

   - Signals are functions of time and change when the input (signals)
     change.
   - Parameters are generally constant (or slowly varying) and are
     properties of the system.
   - Both have physical units.

2. Pay special attention to the states

   - Systems have many signals; only n signals are states.
   - States correspond to the initial condition needed to determine the
     output of the system.
   - With experience, it is usually pretty straight-forward to determine the
     states.

7 Examples with state-variable models

7.1 Example 1, an electrical circuit

[Figure 13: Electrical circuit with current source Is(t), resistors R1 and
 R2, inductor L1, capacitor C1, and node voltages v1(t) and v2(t); element
 signals IR1(t), VR1(t), IL1(t), VL1(t), Ic1(t), Vc1(t), IR2(t), VR2(t), and
 VIs(t).]

1. Write the relevant relations for the system

   (a) Define symbols for the signals and parameters

       Signals                            Parameters
       Is(t)   Sup. current   [amps]      R1, R2  Resistance   [volts/amp]
       Vs(t)   Sup. voltage   [volts]     C1      Capacitance  [amp-sec/volt]
       VR1(t)  R1 voltage     [volts]     L1      Inductance   [volt-sec/amp]
       VR2(t)  R2 voltage     [volts]
       Vc1(t)  C1 voltage     [volts]
       VL1(t)  L1 voltage     [volts]
       IR1(t)  R1 current     [amps]
       IR2(t)  R2 current     [amps]
       Ic1(t)  C1 current     [amps]
       IL1(t)  L1 current     [amps]

       Table 4: Signals and parameters of the electrical circuit.

   (b) Write the equations

       i. Constituent relations

           VR1(t) = R1 IR1(t),      Ic1(t) = C1 dVc1(t)/dt
           VR2(t) = R2 IR2(t),      VL1(t) = L1 dIL1(t)/dt

       ii. Continuity constraints

          Kirchhoff's voltage law (sum of the voltages around a loop, + if
          you enter at the + terminal):

              -VIs(t) + VL1(t) + VC1(t) = 0
              Vc1(t) = VR2(t)
              VL1(t) = VR1(t)

          Kirchhoff's current law (sum of the currents entering a node):

              IR1(t) + IL1(t) - Ic1(t) - IR2(t) = 0
              Is(t) - IR1(t) - IL1(t) = 0
              Is(t) - Ic1(t) - IR2(t) = 0

   (c) Record the units, verify that units balance in the equations

       Units are recorded in table 4. Check:

           Ic1(t) = C1 dVc1(t)/dt :  [amps]  = [amp-sec/volt] [volt/second]
           VL1(t) = L1 dIL1(t)/dt :  [volts] = [volt-sec/amp] [amp/second]

2. Identify the differential equations

           Ic1(t) = C1 dVc1(t)/dt
           VL1(t) = L1 dIL1(t)/dt

   (a) Determine the system order

       1 capacitor + 1 inductor: 2nd order.

   (b) Select the state variables

       For this RLC circuit there is a clear choice:

           x(t) = [ Vc1(t) ]                                    (47)
                  [ IL1(t) ]

       Other possible choices:

           x2(t) = [ VR1(t) ]     x3(t) = [ VIs(t) ]
                   [ IR2(t) ]             [ VR2(t) ]

       Later, we will see how to convert the state model into a state model
       with any of these state vectors. If we wanted a model based on x3(t),
       it is probably easiest to derive the model based on the physical
       coordinates, Eqn (47), and then make a change of basis to transform
       the model to x3(t).

       Illegal choices: these selections for the states are not independent,

           x4(t) = [ VR1(t) ]     x5(t) = [ Vc1(t) ]
                   [ IR1(t) ]             [ VR2(t) ]

       and this one is not allowed because Is(t) is an input:

           x6(t) = [ Vc1(t) ]
                   [ Is(t)  ]

3. Write the differential equations in state-model form

   (a) Higher order differential eqns are written as a chain of first-order
       eqns.

   (b) Put the derivative on the left-hand side; these must be the state
       derivatives:

           dVc1(t)/dt = (1/C1) Ic1(t)                           (48)
           dIL1(t)/dt = (1/L1) VL1(t)                           (49)

       This step shows why Vc1(t) and IL1(t) are natural choices for the
       states.

   (c) All signals on the right-hand side must be expressed in terms of the
       states and inputs

       In Eqn (48) we need to express Ic1(t) in terms of
       {Vc1(t), IL1(t), Is(t)}. This involves using the constituent and
       continuity equations that describe the system.

       From Is(t) - Ic1(t) - IR2(t) = 0     =>  Ic1(t) = Is(t) - IR2(t)
       From Vc1(t) = VR2(t), IR2 = VR2/R2   =>  Ic1(t) = Is(t) - Vc1(t)/R2   (50)

       This is the needed form. For VL1(t):

           VL1(t) = VR1(t) = R1 IR1(t) = R1 (Is(t) - IL1(t))    (51)

       Using Eqns (50) and (51) in Eqns (48) and (49):

           dVc1(t)/dt = (1/C1) (Is(t) - Vc1(t)/R2)
                      = -(1/(C1 R2)) Vc1(t) + 0 IL1(t) + (1/C1) Is(t)

           dIL1(t)/dt = (1/L1) R1 (Is(t) - IL1(t))
                      = 0 Vc1(t) - (R1/L1) IL1(t) + (R1/L1) Is(t)

   (d) Put the model in state-variable form, dx(t)/dt = A x(t) + B u(t):

           d  [ Vc1 ]       [ -1/(C1 R2)     0   ] [ Vc1 ]       [ 1/C1  ]
           -- [     ] (t) = [                     ] [     ] (t) + [       ] Is(t)   (52)
           dt [ IL1 ]       [      0      -R1/L1 ] [ IL1 ]       [ R1/L1 ]

       This is the state equation, with

           x(t) = [ Vc1(t) ],   u(t) = [Is(t)],
                  [ IL1(t) ]

           A = [ -1/(C1 R2)     0   ],   B = [ 1/C1  ]
               [      0      -R1/L1 ]        [ R1/L1 ]

       Notice: the state and input are made up of signals, and the system
       and input matrices are made up of parameters.

4. Write the equation of the output signal (or signals) using the states and
   inputs

   Suppose the output were VR1(t). In Eqn (51) we have already derived that
   VR1(t) = R1 (Is(t) - IL1(t)), so

       y(t) = [ 0  -R1 ] [ Vc1(t) ] + [R1] Is(t)
                         [ IL1(t) ]

   This is the output equation, y(t) = C x(t) + D u(t), with

       y(t) = [VR1(t)],   C = [ 0  -R1 ],   D = [R1]

   Notice: the output is made up of a signal, and the output and feed-forward
   matrices are made up of parameters.

   Alternative output: suppose we wanted to determine Vs(t), for example to
   calculate the supply power Ps(t) = Vs(t) Is(t). The output will be

       y(t) = [Vs(t)]

   We need to find a way to express Vs(t) in terms of the states and inputs,
   {Vc1(t), IL1(t), Is(t)}. Going back to the original equations,

       Vs(t) = VR1(t) + Vc1(t) = R1 (Is(t) - IL1(t)) + Vc1(t)   (53)

   This gives Vs(t) in terms of the states and inputs !

5. Check units throughout, to verify correctness.

   Units in the state equation:

       [volts/sec]   [ 1/sec           volt/(amp-sec) ] [volts]   [ volt/(amp-sec) ]
       [amps/sec ] = [ amp/(volt-sec)  1/sec          ] [amps ] + [ 1/sec          ] [amps]

   Units check in the state equation.

   Units in the output equation:

       [volts] = [ [ ]  volts/amp ] [volts] + [volts/amp] [amps]
                                    [amps ]

   Units check in the output equation.

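The matrices of Eqn (52) can be cross-checked numerically. The sketch below builds A and B in Python using the same parameter values that section 7.1.1 uses in Matlab (R1 = 100, R2 = 200, L1 = 0.1, C1 = 10e-6) and confirms the two system poles:

```python
import numpy as np

# State matrices of Eqn (52) with the section 7.1.1 parameter values.
R1, R2, L1, C1 = 100.0, 200.0, 0.1, 10e-6
A = np.array([[-1.0 / (C1 * R2), 0.0],
              [0.0, -R1 / L1]])
B = np.array([[1.0 / C1],
              [R1 / L1]])

eigs = sorted(np.linalg.eigvals(A).real)
print(eigs)   # poles of the circuit: -1000 and -500 rad/sec
```

Because A is diagonal here, the eigenvalues are just the diagonal entries, -1/(C1 R2) = -500 and -R1/L1 = -1000, matching the Matlab `pole()` result on the next page.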
EE/ME 701: Advanced Linear Systems

For y (t) = [Vs (t)], the output equation is

   y (t) = C x (t) + D u (t)      (54)

with

   C = [ 1   -R1 ] ,   D = [ R1 ]      (55)

A deep property of state models: just as with Eqns (54) and (55), we can find a row for the C and D matrices to give any signal in the system.

7.1.1 Building and exercising the circuit model

To build the model, build the A, B, C and D matrices.

Set up the parameter values

>> R1 = 100     %% Ohms
>> R2 = 200     %% Ohms
>> L1 = 0.1     %% Henries
>> C1 = 10e-6   %% Farads

%% Build the state equation
>> A = [ -1/(C1*R2), 0; 0, -(R1/L1) ]
A =  -500      0
        0  -1000
>> B = [ 1/C1; R1/L1 ]
B = 1.0e+05 *
    1.0000
    0.0100

%% Build the output Eqn
>> C = [ 0, -R1 ]
C = 0  -100
>> D = [ R1 ]
D = 100

Build the Matlab state-variable model object

>> SSmodel = ss(A, B, C, D)
a =        x1      x2
   x1    -500       0
   x2       0   -1000
b =        u1
   x1   1e+05
   x2    1000
c =        x1      x2
   y1       0    -100
d =        u1
   y1     100
Continuous-time model.

Now we can look at poles and zeros

>> Poles = pole(SSmodel)
Poles = -1000
         -500
>> Zeros = zero(SSmodel)
Zeros = -500
           0

Looking at the step and frequency response

>> figure(1), clf; step(SSmodel); print('-deps2c', 'CircuitAStepResponse')
>> figure(2), clf; bode(SSmodel); print('-deps2c', 'CircuitAFreqResponse')

[Figure: step response and Bode plots omitted]
Figure 14: Circuit step and frequency response
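The pole computation above can be cross-checked outside Matlab. Below is a minimal NumPy sketch (NumPy is used here only as a stand-in for the Control System Toolbox); it rebuilds the A matrix from the same parameter values and confirms that its eigenvalues are the poles at -500 and -1000 rad/sec:

```python
import numpy as np

# Parameter values from the session above (Ohms, Henries, Farads)
R1, R2, L1, C1 = 100.0, 200.0, 0.1, 10e-6

# State equation derived in the text: x = [Vc1; IL1], u = [Is]
A = np.array([[-1.0 / (C1 * R2), 0.0],
              [0.0, -R1 / L1]])

# The poles of the state-variable model are the eigenvalues of A
poles = np.sort(np.linalg.eigvals(A).real)
print(poles)   # [-1000.  -500.]
```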

7.2 Example 2, a mechanical system

A Quarter Suspension is shown in figure 15. The vehicle is moving across a road, with the surface profile given by r (t). It is desired to model the system and determine the response zv (t) to the road profile.

[Figure: schematic omitted -- 1/4 vehicle mass m2 at height zv (t), suspension spring ks and damper bs, wheel mass m1 at height zw (t), tire stiffness kw, road surface r (t), heights measured from an inertial reference]
Figure 15: Quarter vehicle suspension.

1. Write the relevant relations for the system

(a) Define symbols for the signals and parameters.
    The parameters are listed in table 5. Signals are listed in table 6.

    m1   wheel mass             [kg]
    m2   1/4 vehicle mass       [kg]
    kw   tire stiffness         [Newtons/meter]
    ks   suspension stiffness   [Newtons/meter]
    bs   suspension damping     [Newtons/(meter/sec)]
    Table 5: Parameters of the mechanical quarter suspension.

    zw (t)   Height of tire center    [m]
    zv (t)   Height of vehicle        [m]
    r (t)    Height of road surface   [m]
    Table 6: Signals of the mechanical quarter suspension.

(b) Write the equations

    i. Constituent relations (for elements)
       The constituent relations for mechanical systems are:

       Newton's 2nd law:   F (t) = m x''(t)                      (56)
       Hooke's law:        F (t) = k (x1 (t) - x2 (t))           (57)
       Damper Eqn:         F (t) = b (x1'(t) - x2'(t))           (58)

    ii. Continuity constraints (for how elements are linked into a system)
        Sum of the forces acting on any free body is zero.
        Sum of the velocities around any loop is zero.

(c) Record the units, verify that units balance in the equations.
    See tables 5 and 6.

2. Identify the differential equations

The differential Eqns come from Eqns (56) and (58).

Summing the forces on each free body gives two 2nd order differential equations:

   Forces on wheel:
   m1 zw''(t) = kw (r (t) - zw (t)) - ks (zw (t) - zv (t)) - bs (zw'(t) - zv'(t))      (59)

   Forces on 1/4 vehicle:
   m2 zv''(t) = +ks (zw (t) - zv (t)) + bs (zw'(t) - zv'(t))                           (60)

3. Write the differential equations in state-variable form

(a) Determine the system order
    We have two 2nd order differential equations, so we need 4 state variables: the system will be 4th order.
    - The system order will almost always be the sum of the orders of the contributing differential equations -

(b) Select the state variables
    The natural physical coordinates are the position and velocity of each mass:

       x (t) = [ zw (t) ; zw'(t) ; zv (t) ; zv'(t) ]   with units   [ m ; m/sec ; m ; m/sec ]      (61)

    so that

       x'(t) = d x (t)/dt = [ zw'(t) ; zw''(t) ; zv'(t) ; zv''(t) ]

Let's look at one of the 2nd order terms first. Dividing Eqn (59) through by m1:

   zw''(t) = -((kw + ks)/m1) zw (t) - (bs/m1) zw'(t) + (ks/m1) zv (t) + (bs/m1) zv'(t) + (kw/m1) r (t)      (62)

Likewise, dividing Eqn (60) by m2:

   zv''(t) = +(ks/m2) zw (t) + (bs/m2) zw'(t) - (ks/m2) zv (t) - (bs/m2) zv'(t) + [0] r (t)      (63)

Examining Eqn (62), written as a row operating on the state vector:

   zw''(t) = [ -(kw+ks)/m1   -bs/m1   +ks/m1   +bs/m1 ] x (t) + [ kw/m1 ] r (t)      (64)

Likewise, from Eqn (63):

   zv''(t) = [ +ks/m2   +bs/m2   -ks/m2   -bs/m2 ] x (t) + [ 0 ] r (t)      (65)

Eqns (64) and (65) give two of the rows of the A matrix. Looking back at x'(t),

   x'(t) = d x (t)/dt = [ zw'(t) ; zw''(t) ; zv'(t) ; zv''(t) ]

For the first row we must write zw'(t) in terms of the states and input. But zw'(t) is an element of the state vector, we just hook it up!

   zw'(t) = [ 0  1  0  0 ] x (t) + [ 0 ] r (t)

Likewise

   zv'(t) = [ 0  0  0  1 ] x (t) + [ 0 ] r (t)

Putting the pieces together, x'(t) = A x (t) + B u (t):

   x'(t) = [      0            1         0         0
            -(kw+ks)/m1     -bs/m1    +ks/m1    +bs/m1
                 0            0         0         1
              +ks/m2        +bs/m2    -ks/m2    -bs/m2 ] x (t) + [ 0 ; kw/m1 ; 0 ; 0 ] r (t)      (66)

4. Write the equation of the output signal (or signals) using the states and inputs

If the output of interest is the vehicle response,

   y1 (t) = zv (t) = [ 0  0  1  0 ] x (t) + [ 0 ] r (t)      (67)

Suppose the desired output is the force in the suspension spring. Since

   Fs (t) = ks (zw (t) - zv (t))

the output equation is given as:

   y2 (t) = Fs (t) = [ ks  0  -ks  0 ] x (t) + [ 0 ] r (t)      (68)
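As a cross-check of Eqn (66), the sketch below evaluates A x + B r for a sample state and verifies that rows 2 and 4 reproduce the force balances (59) and (60). This is a NumPy sketch with purely illustrative, hypothetical parameter values (not the values used later in section 7.2.1):

```python
import numpy as np

# Hypothetical parameter values, for checking the algebra only
kw, ks, bs, m1, m2 = 1.0, 2.0, 3.0, 4.0, 5.0

A = np.array([[0.0, 1.0, 0.0, 0.0],
              [-(kw + ks)/m1, -bs/m1, ks/m1, bs/m1],
              [0.0, 0.0, 0.0, 1.0],
              [ks/m2, bs/m2, -ks/m2, -bs/m2]])
B = np.array([0.0, kw/m1, 0.0, 0.0])

# Sample state x = [zw, zw', zv, zv'] and road input r (arbitrary numbers)
zw, dzw, zv, dzv = 0.7, -0.2, 0.3, 0.5
r = 1.1
xdot = A @ np.array([zw, dzw, zv, dzv]) + B * r

# Row 2 must equal Eqn (59) divided by m1; row 4 must equal Eqn (60) divided by m2
zw_ddot = (kw*(r - zw) - ks*(zw - zv) - bs*(dzw - dzv)) / m1
zv_ddot = (ks*(zw - zv) + bs*(dzw - dzv)) / m2
print(np.allclose(xdot, [dzw, zw_ddot, dzv, zv_ddot]))   # True
```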

Suppose the desired output is the road force on the tire. Since

   Fw (t) = kw (r (t) - zw (t))

the output equation is given as:

   y3 (t) = Fw (t) = [ -kw  0  0  0 ] x (t) + [ kw ] r (t)      (69)

Now suppose the desired output is all three, y (t) = C x (t) + D u (t):

   y (t) = [ zv (t)          [  0    0    1    0
             Fs (t)    =       ks   0   -ks   0
             Fw (t) ]         -kw   0    0    0 ] x (t) + [ 0 ; 0 ; kw ] r (t)      (70)

5. Check units throughout, to verify correctness.

Noting that 1.0 [Newton] = 1.0 [kg-m/sec^2], terms such as kw/m1 or bs/m2 have units of

   kw/m1 :  [ (kg-m/sec^2) (1/m) ] [ 1/kg ]     = [ 1/sec^2 ]
   bs/m2 :  [ (kg-m/sec^2) (1/(m/sec)) ] [ 1/kg ] = [ 1/sec ]

Thus the state equation has units of:

   [ m/sec   ]   [   --       1       --       --    ] [ m     ]   [   --    ]
   [ m/sec^2 ] = [ 1/sec^2  1/sec   1/sec^2  1/sec   ] [ m/sec ] + [ 1/sec^2 ] [m]
   [ m/sec   ]   [   --       --      --       1     ] [ m     ]   [   --    ]
   [ m/sec^2 ]   [ 1/sec^2  1/sec   1/sec^2  1/sec   ] [ m/sec ]   [   --    ]

Units check!

7.2.1 Building and exercising the suspension model

Build the model

>> kw = 10000   %% N/m
>> ks = 2500    %% N/m
>> bs = 10000   %% N/m/s
>> m1 = 25      %% kg
>> m2 = 250     %% kg

>> A = [ 0            1       0       0
         -(kw+ks)/m1  -bs/m1  ks/m1   bs/m1
         0            0       0       1
         ks/m2        bs/m2   -ks/m2  -bs/m2 ]
>> B = [ 0; kw/m1; 0 ; 0 ]
>> C1 = [ 0, 0, 1, 0 ];   D1 = [0]
>> SSmodel2a = ss(A, B, C1, D1)
a =       x1    x2    x3    x4
   x1      0     1     0     0
   x2   -500  -400   100   400
   x3      0     0     0     1
   x4     10    40   -10   -40
b =       u1
   x1      0
   x2    400
   x3      0
   x4      0
c =       x1    x2    x3    x4
   y1      0     0     1     0
d =       u1
   y1      0
Continuous-time model.

Examine the poles and zeros and response

>> Poles = pole(SSmodel2a)
Poles = -438.92           %% Fast mode
        -0.41 + 6.00i     %% Lightly damped mode
        -0.41 - 6.00i
        -0.25             %% Slow mode
>> Zeros = zero(SSmodel2a)
Zeros = -0.2500

[Figure: step response and Bode plots omitted]
Figure 16: Suspension step and frequency response.

The shock absorber is not doing its job!

Examine the damping of the modes

>> [Wn, rho] = damp(SSmodel2a)
Wn =   0.2516      rho = 1.0000
       6.0186            0.0687
       6.0186            0.0687
     438.9211            1.0000

Mode with damping of 0.07 is too lightly damped.
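The damp( ) result can be reproduced directly from the eigenvalues of A: for an eigenvalue lambda, Wn = |lambda| and the damping ratio is -Re(lambda)/|lambda|. A NumPy sketch with the parameter values above (NumPy standing in for the Control System Toolbox):

```python
import numpy as np

kw, ks, bs, m1, m2 = 10000.0, 2500.0, 10000.0, 25.0, 250.0
A = np.array([[0.0, 1.0, 0.0, 0.0],
              [-(kw + ks)/m1, -bs/m1, ks/m1, bs/m1],
              [0.0, 0.0, 0.0, 1.0],
              [ks/m2, bs/m2, -ks/m2, -bs/m2]])

lam = np.linalg.eigvals(A)
Wn = np.abs(lam)                   # natural frequencies of the modes
zeta = -lam.real / np.abs(lam)     # damping ratios of the modes

# The oscillatory pair near -0.41 +/- 6.00i has damping of about 0.0687
pair = lam[np.abs(lam.imag) > 1e-6]
print(np.round(-pair.real / np.abs(pair), 4))
```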

Modify the parameters

%% 2nd parameter set
>> kw = 10000   %% N/m
>> ks = 1000    %% N/m      %% Softer spring
>> bs = 1000    %% N/m/s    %% Softer shock absorber
>> m1 = 25      %% kg
>> m2 = 250     %% kg

>> A = [ 0            1       0       0
         -(kw+ks)/m1  -bs/m1  ks/m1   bs/m1
         0            0       0       1
         ks/m2        bs/m2   -ks/m2  -bs/m2 ]
>> B = [ 0; kw/m1; 0 ; 0 ]
>> C1 = [ 0, 0, 1, 0 ];  D1 = [0]
>> SSmodel2b = ss(A, B, C1, D1)
a =       x1    x2    x3    x4
   x1      0     1     0     0
   x2   -440   -40    40    40
   x3      0     0     0     1
   x4      4     4    -4    -4
b =       u1
   x1      0
   x2    400
   x3      0
   x4      0
c =       x1    x2    x3    x4
   y1      0     0     1     0
d =       u1
   y1      0
Continuous-time model.

Examine the poles, zeros and damping

>> Poles = pole(SSmodel2b)
Poles = -31.4478              %% Fast Mode
        -5.4731 + 1.3136i     %% Oscillatory mode, now alpha>omega
        -5.4731 - 1.3136i
        -1.6059               %% Slow mode
>> Zeros = zero(SSmodel2b)
Zeros = -1.0000

The slow mode is faster, the fast mode is much slower, and the oscillatory mode is better damped.

Compute the damping factors

>> [Wn, Z] = damp(SSmodel2b)
Wn =  1.6059      Z = 1.0000
      5.6286          0.9724
      5.6286          0.9724
     31.4478          1.0000

Damping factor of 0.97 is much better.

Examine the step and frequency response

[Figure: step response and Bode plots omitted]
Figure 17: Suspension step and frequency response.
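The improved damping can be verified the same way as before, from the eigenvalues of the new A matrix (a NumPy sketch using the second parameter set):

```python
import numpy as np

kw, ks, bs, m1, m2 = 10000.0, 1000.0, 1000.0, 25.0, 250.0
A = np.array([[0.0, 1.0, 0.0, 0.0],
              [-(kw + ks)/m1, -bs/m1, ks/m1, bs/m1],
              [0.0, 0.0, 0.0, 1.0],
              [ks/m2, bs/m2, -ks/m2, -bs/m2]])

lam = np.linalg.eigvals(A)
zeta = -lam.real / np.abs(lam)     # damping ratio of each mode

# Real modes have zeta = 1; the oscillatory pair should come out near 0.9724
print(round(min(zeta), 4))
```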

Let's consider the Transfer Function

>> TFmodel2b = tf(SSmodel2b)
Transfer function:
             1600 s + 1600
    --------------------------------------
    s^4 + 44 s^3 + 444 s^2 + 1600 s + 1600

Building the state-variable model and computing the transfer function may be the easiest way to get the TF of a complex system.

Let's consider multiple outputs, from Eqn (70). The model now has 3 outputs.

[Figure: block diagram omitted -- input r (t) drives the system {A, B, C, D} with state x (t); the output vector y (t) carries zv (t), Fs (t) and Fw (t); with n states, m inputs and p outputs there is one TF per input/output pair]
Figure 18: Block diagram of quarter suspension system reflecting 3 outputs.

The C and D matrices change. Output Eqn: y (t) = C x (t) + D u (t)

>> C2c = [ 0    0  1    0
           ks   0  -ks  0
           -kw  0  0    0 ];
>> D2c = [ 0; 0; kw ]
>> SSmodel2c = ss(A, B, C2c, D2c)
a = (unchanged)
b = (unchanged)
c =        x1      x2      x3      x4
   y1       0       0       1       0
   y2    1000       0   -1000       0
   y3  -1e+04       0       0       0
d =        u1
   y1       0
   y2       0
   y3   1e+04
Continuous-time model.

Examine the transfer function and step response

>> TFmodel2c = tf(SSmodel2c)
Transfer function from input to output...
             1600 s + 1600
#1: --------------------------------------
    s^4 + 44 s^3 + 444 s^2 + 1600 s + 1600

     4e05 s^2 - 3.207e-10 s + 2.754e-09
#2: --------------------------------------
    s^4 + 44 s^3 + 444 s^2 + 1600 s + 1600

    10000 s^4 + 4.4e05 s^3 + 4.4e05 s^2 + 6.487e-10 s + 1.913e-10
#3: -------------------------------------------------------------
           s^4 + 44 s^3 + 444 s^2 + 1600 s + 1600

Each input/output pair (in this case 1x3) gives a TF.
- The zeros and gain are different between the TFs
- The poles are the same

Note: the coefficients of 10^-9 - 10^-10 in TFs 2 and 3 are due to round-off error. These TFs have a double zero at the origin.
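The transfer function printed by tf( ) can be spot-checked numerically from G(s) = C (sI - A)^-1 B + D (Eqn (96) of these notes) by evaluating both forms at a test frequency. A NumPy sketch for TF #1; the test point s0 is arbitrary:

```python
import numpy as np

kw, ks, bs, m1, m2 = 10000.0, 1000.0, 1000.0, 25.0, 250.0
A = np.array([[0.0, 1.0, 0.0, 0.0],
              [-(kw + ks)/m1, -bs/m1, ks/m1, bs/m1],
              [0.0, 0.0, 0.0, 1.0],
              [ks/m2, bs/m2, -ks/m2, -bs/m2]])
B = np.array([[0.0], [kw/m1], [0.0], [0.0]])
C = np.array([[0.0, 0.0, 1.0, 0.0]])   # output zv, D = 0

s0 = 2.0 + 1.0j                        # arbitrary test point in the s-plane
G_ss = (C @ np.linalg.solve(s0*np.eye(4) - A, B))[0, 0]
G_tf = (1600*s0 + 1600) / (s0**4 + 44*s0**3 + 444*s0**2 + 1600*s0 + 1600)
print(abs(G_ss - G_tf) < 1e-9)         # True
```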

>> figure(1), clf;
>> step(SSmodel2c);
>> print('-deps2c', 'Mechanism2StepResponsec')

[Figure: three-panel step response, one panel per output, omitted]
Figure 19: Step response shown with 3 outputs.

7.2.2 Conclusions, Quarter Suspension Example

With the state-variable model it is straightforward to:
- test various parameter configurations
- obtain a transfer function
- represent systems with multiple outputs (the state-variable model does this naturally)

7.3 Example 3, state-variable model from a differential equation

Given a differential equation, such as

   a3 y'''(t) + a2 y''(t) + a1 y'(t) + a0 y (t) = b2 u''(t) + b1 u'(t) + b0 u (t)      (71)

we can immediately write down the TF model

   T (s) = (b2 s^2 + b1 s + b0) / (a3 s^3 + a2 s^2 + a1 s + a0)      (72)

How do we construct a state-variable model?

A. In Matlab

>> num = [b2, b1, b0]; den = [a3, a2, a1, a0]
>> TFmodel = tf(num, den);
>> SSmodel = ss(TFmodel)

B. Put together the state-variable model in one of the canonical forms.

There are 4 canonical forms. The canonical forms have a strong relationship to properties called controllability and observability, and we will see all four (cf. Bay chapter 8). Here we consider Controllable Canonical Form (cf. Bay section 1.1.3).

Follow the steps to put together the model.

First, the Diff Eq must be monic. That means that the an coefficient is 1.0. Divide through Eqn (71) (or 72) by a3:

   1.0 y'''(t) + (a2/a3) y''(t) + (a1/a3) y'(t) + (a0/a3) y (t) = (b2/a3) u''(t) + (b1/a3) u'(t) + (b0/a3) u (t)      (73)

Below we assume this normalization has been done, and write a2, a1, a0, b2, b1, b0 for the normalized coefficients.

7.3.1 Steps of deriving the state-variable model

1. Write the relevant relations for the system

(a) Define symbols for the signals and parameters.

    Signals:   y (t)  Output  [y]       u (t)  Input  [u]
    Parameters:
       a2  Coefficient  [1/sec]         b2  Coefficient  [y/(u-sec)]
       a1  Coefficient  [1/sec^2]       b1  Coefficient  [y/(u-sec^2)]
       a0  Coefficient  [1/sec^3]       b0  Coefficient  [y/(u-sec^3)]
    Table 7: Signals and Parameters for the Differential Equation. Units result from dividing through by a3, which has units of sec^3.

(b) Write the equations

    To model a differential equation in Controllable Canonical form, break the TF into two components, with the denominator first:

       (a) Original TF:            u (t) --> [ B(s)/A(s) ] --> y (t)
       (b) TF in two components:   u (t) --> [ 1/A(s) ] --> z (t) --> [ B(s) ] --> y (t)

    Figure 20: Transfer function as two components.

    The breakdown of figure 20 gives two equations:

       1.0 z'''(t) + a2 z''(t) + a1 z'(t) + a0 z (t) = 1.0 u (t)      (74)
       y (t) = b2 z''(t) + b1 z'(t) + b0 z (t)                        (75)

(c) Record the units, verify that units balance in the equations.
    See table 7.

2. Identify the differential equations

(a) Determine the system order: 3rd order

(b) Select the state variables
    Considering Eqn (74), select the state variables to be the variable z (t) and its derivatives up to the n-1 derivative:

       x (t) = [ x1 (t) ; x2 (t) ; x3 (t) ] = [ z (t) ; z'(t) ; z''(t) ]      (76)

    The state vector is said to be in phase-variable form.

3. Write the differential equations in state-variable form

    The state-derivative vector includes the highest derivative of z (t):

       x'(t) = [ z'(t) ; z''(t) ; z'''(t) ]

    The derivative of each of the phase variables is simply its successor. The derivative of the last phase variable is given by Eqn (74), re-written as Eqn (77) (cf. Bay Eqn (1.25)):

       z'''(t) = -a2 z''(t) - a1 z'(t) - a0 z (t) + 1.0 u (t)      (77)

       x'(t) = [  0    1    0
                  0    0    1
                -a0  -a1  -a2 ] [ z (t) ; z'(t) ; z''(t) ] + [ 0 ; 0 ; 1 ] u (t)      (78)

4. Write the equation of the output signal (or signals) using the states and inputs

   From Eqn (75),

      y (t) = [ b0  b1  b2 ] [ z (t) ; z'(t) ; z''(t) ] + [ 0 ] u (t)      (79)

   (Note: Bay's development includes a b3 term. Bay Eqn (1.25) shows how the b3 term is incorporated.)

5. Check units throughout, to verify correctness.

   From Eqn (74), z (t) has units of [u], so the phase-variable state vector has units

      x (t) : [ u ; u/sec ; u/sec^2 ]   and   x'(t) : [ u/sec ; u/sec^2 ; u/sec^3 ]

   Looking at table 7, the model matrices have units of

      A : [   --        1        --
              --        --       1
            1/sec^3   1/sec^2  1/sec ]       B : [ -- ; -- ; 1/sec^3 ]

      C : [ y/u   y-sec/u   y-sec^2/u ]      D : [ y/u ]

   Putting the units together:

      x'(t) : [ u/sec ; u/sec^2 ; u/sec^3 ] = A [ u ; u/sec ; u/sec^2 ] + B [u]
      y (t) : [y] = [ y/u  y-sec/u  y-sec^2/u ] [ u ; u/sec ; u/sec^2 ] + [ y/u ] [u]

   Units check!

7.3.2 Building and exercising a differential equation model

Build the model

>> a2 = 5; a1 = 7; a0 = 8;
>> b2 = -2; b1 = 3; b0 = 4;
>> A = [  0    1    0 ;
          0    0    1 ;
        -a0, -a1, -a2 ];
>> B = [ 0; 0; 1 ];
>> C = [ b0, b1, b2 ];  D = 0;
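The pole( ) and zero( ) results of the next step can be predicted from the coefficients alone: the poles are the roots of s^3 + a2 s^2 + a1 s + a0 and the zeros are the roots of b2 s^2 + b1 s + b0. A NumPy sketch with the same coefficient values:

```python
import numpy as np

a2, a1, a0 = 5.0, 7.0, 8.0
b2, b1, b0 = -2.0, 3.0, 4.0

poles = np.roots([1.0, a2, a1, a0])        # roots of the denominator
zeros = np.sort(np.roots([b2, b1, b0]).real)  # roots of the numerator (both real here)
print(np.round(zeros, 4))   # [-0.8508  2.3508]
```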

>> SSmodel3a = ss(A, B, C, D)
a =      x1   x2   x3
   x1     0    1    0
   x2     0    0    1
   x3    -8   -7   -5
b =      u1
   x1     0
   x2     0
   x3     1
c =      x1   x2   x3
   y1     4    3   -2
d =      u1
   y1     0
Continuous-time model.

Look at the poles and zeros

>> Poles = pole(SSmodel3a)          >> Zeros = zero(SSmodel3a)
Poles = -0.6547 + 1.3187i           Zeros =  2.3508
        -0.6547 - 1.3187i                   -0.8508
        -3.6906

Looking at the pole-zero map (constellation)

>> figure(1), pzmap(SSmodel3a);

[Figure: pole-zero map omitted]
Figure 21: PZ map of the state-variable model.

Look at the step and frequency response

>> figure(2), clf; step(SSmodel3a);
>> figure(3), clf; bode(SSmodel3a);

[Figure: step response and Bode plots omitted]
Figure 22: Circuit step and frequency response.

7.4 State-variable model and simulation diagram

Analog computers include integrators, gain blocks and summing junctions.

[Figure: simulation diagram omitted -- a chain of three integrators with outputs x3 (t), x2 (t), x1 (t); feedback gains -a0, -a1, -a2 (with a3 = 1) summed with the input u (t); feed-forward gains b0 .. b3 summed to form y (t)]
Figure 23: Simulation diagram for system in controllable canonical form. (Controllable Canonical Form, Notation follows Bay)

Integrators, gain blocks and summing junctions are built with Op-Amps.

Integrators:

[Figure omitted -- input Vin (t) through R1 to the inverting node V- (t), feedback capacitor C1 from V- (t) to the output Vo (t), V+ (t) wired to ground]
Figure 24: An Op-Amp integrator.

Op Amp gain is very high (10^5 .. 10^7), and the circuit is configured with negative feedback, so

   V- (t) ~= V+ (t)

And in the circuit of figure 24, V+ (t) is wired to ground. So node V- becomes a virtual ground:

   V- (t) ~= 0

Op Amp input currents are very small (10^-9 .. 10^-12 amps), so

   iR1 (t) + iC1 (t) = 0      (80)

Eqn (80) gives

   (1/R1) (Vin (t) - V- (t)) + C1 (d/dt) (Vo (t) - V- (t)) = 0      (81)

With the properties of the virtual ground, Eqn (81) becomes

   (1/R1) Vin (t) + C1 (d/dt) Vo (t) = 0 ,   so   (d/dt) Vo (t) = -(1/(R1 C1)) Vin (t)

giving

   Vo (t) = -(1/(R1 C1)) Integral[t0..t] Vin (tau) dtau      (82)

Gain Blocks: An Op-Amp gain block is shown in figure 25.

[Figure omitted -- input Vin (t) through Ra to the inverting node, feedback resistor Rf to the output Vo (t)]
Figure 25: An Op-Amp gain block.

Using the principles of virtual ground and low input current,

   iRa (t) + iRf (t) = 0

so

   Vo (t) = -(Rf/Ra) Vin (t)      (83)

Summing junctions:

[Figure omitted -- inputs Va (t) through Ra and Vb (t) through Rb to the inverting node, feedback resistor Rf to the output Vo (t)]
Figure 26: An Op-Amp summing junction.

The Op-Amp virtual ground configuration sums the currents at the V- node, giving

   Vo (t) = -( (Rf/Ra) Va (t) + (Rf/Rb) Vb (t) + ... )      (84)

A note on the inversions. Recall

   Vo (t) = -(1/(R1 C1)) Integral[t0..t] Vin (tau) dtau      (82, repeated)

In each of Eqns (82), (83) and (84) the output voltage is inverted relative to the input voltage. This is an inherent property of the virtual ground configuration.

In analog computers, either:

1. Introduce - signs as needed, and invert signals as needed with gain blocks, g = -1, or
2. Include a second Op-Amp in each element (Integrator, Gain block, Summing junction), so that the block is non-inverting.

Returning to the simulation diagram, we can write down the state model directly from the simulation diagram (and vice-versa).

[Figure: simulation diagram repeated]
Figure 27: Simulation diagram for system in controllable canonical form. (Controllable Canonical Form, Notation follows Bay)

The output of each integrator is a state.

1. Write the relevant relations for the system

(a) Define symbols for the signals and parameters

    Signals:                              Parameters:
    u (t)                    Input        b0 .. b3  Numerator Coefficients
    y (t)                    Output       a0 .. a2  Denominator Coefficients
    x1 (t), x2 (t), x3 (t)   States

    Table 8: Signals and parameters of the simulation diagram.

[Figure: simulation diagram repeated]
Figure 28: Simulation diagram for system in controllable canonical form.

(b) Write the equations

    Examining the block diagram we may write:

       x1'(t) = x2 (t)      (85)
       x2'(t) = x3 (t)      (86)
       x3'(t) = -a2 x3 (t) - a1 x2 (t) - a0 x1 (t) + u (t)      (87)

       y (t) = b3 (-a2 x3 (t) - a1 x2 (t) - a0 x1 (t) + u (t)) + b2 x3 (t) + b1 x2 (t) + b0 x1 (t)      (88)

2. Identify the differential equations

(a) Determine the system order: 3rd order

(b) Select the state variables

    The physical coordinates of the system are the integrator outputs. These are voltages we can observe on an oscilloscope:

       x (t) = [ x1 (t) ; x2 (t) ; x3 (t) ]

3. Write the differential equations in state-variable form

    Directly transcribe from (or to!) the block diagram:

       x'(t) = [  0    1    0
                  0    0    1
                -a0  -a1  -a2 ] [ x1 (t) ; x2 (t) ; x3 (t) ] + [ 0 ; 0 ; 1 ] u (t)      (89)

4. Write the equation of the output signal (or signals) using the states and inputs (cf. Bay Eqn (1.25))

    The b3 path picks off the signal entering the last integrator:

       y3 (t) = b3 (u (t) - a0 x1 (t) - a1 x2 (t) - a2 x3 (t))      (90)

    so, collecting terms of Eqn (88),

       y (t) = [ (b0 - b3 a0)  (b1 - b3 a1)  (b2 - b3 a2) ] [ x1 (t) ; x2 (t) ; x3 (t) ] + [ b3 ] u (t)      (91)

5. Check units throughout, to verify correctness.

    See example 3.
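Eqns (89)-(91) can be checked numerically: with C = [(b0 - b3 a0) (b1 - b3 a1) (b2 - b3 a2)] and D = [b3], the model must realize the transfer function (b3 s^3 + b2 s^2 + b1 s + b0)/(s^3 + a2 s^2 + a1 s + a0). A NumPy sketch with illustrative, arbitrary coefficient values:

```python
import numpy as np

a0, a1, a2 = 8.0, 7.0, 5.0            # denominator coefficients (monic)
b0, b1, b2, b3 = 4.0, 3.0, -2.0, 1.0  # numerator coefficients, incl. feed-through b3

A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [-a0, -a1, -a2]])
B = np.array([[0.0], [0.0], [1.0]])
C = np.array([[b0 - b3*a0, b1 - b3*a1, b2 - b3*a2]])
D = b3

s0 = 1.0 + 2.0j                       # arbitrary test point in the s-plane
G_ss = (C @ np.linalg.solve(s0*np.eye(3) - A, B))[0, 0] + D
G_tf = (b3*s0**3 + b2*s0**2 + b1*s0 + b0) / (s0**3 + a2*s0**2 + a1*s0 + a0)
print(abs(G_ss - G_tf) < 1e-12)       # True
```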

8 Some basic operations with state-variable models

8.1 Deriving the transfer function from the state-variable model

It is straightforward to derive the transfer function from a state-variable model. Starting with the state equation

   x'(t) = A x (t) + B u (t)
   y (t) = C x (t) + D u (t)

Considering the Laplace transform of the state equation:

   s X (s) = A X (s) + B U (s)      (92)
   Y (s) = C X (s) + D U (s)        (93)

Eqn (92) leads to:

   (s I - A) X (s) = B U (s)   or   X (s) = (s I - A)^-1 B U (s)      (94)

With (94), Eqn (93) leads to:

   Y (s) = C (s I - A)^-1 B U (s) + D U (s)      (95)

If m = 1 (one input) and p = 1 (one output), then Eqn (95) gives the transfer function:

   Y (s) / U (s) = C (s I - A)^-1 B + D      (96)

8.1.1 Interpreting the transfer function

Eqn (96) can be solved symbolically by Cramer's rule, to give the symbolic transfer function. Recall from basic linear algebra that Cramer's rule gives the matrix inverse as:

   U^-1 = (1 / det U) adj U      (97)

where U is an n x n matrix, and adj U is the adjugate of matrix U. Defining the cofactors

   V_ij = (-1)^(i+j) M_ij      (98)

then

   adj U = V^T      (99)

M_ij is the i, j-th minor of matrix U, and is the determinant of the matrix formed by removing the i-th row and j-th column from U.

Examples:

   U2 = [ a  b         adj U2 = [  d  -b
          c  d ] ,                -c   a ]

   U3 = [ a  b  c      adj U3 = [ +det[e f; h i]   -det[b c; h i]   ...
          d  e  f                       ...              ...        ...
          g  h  i ] ,                   ...              ...        ... ]

Using Cramer's rule, we can symbolically solve Eqn (96). Since

   (s I - A)^-1 = adj (s I - A) / det (s I - A)      (100)

it follows that

   Y (s) / U (s) = (1 / det (s I - A)) C adj (s I - A) B + D      (101)

Today, we wouldn't want to apply Cramer's rule by hand for any case larger than 3x3. Under the hood, this is how Matlab finds the TF from a state-variable model.

Example

   A = [ -2  3          B = [ 2          C = [ 5  6 ] ,   D = [0]
          0 -5 ] ,            3 ] ,

Then

   det (s I - A) = det [ (s+2)  -3          = (s+2)(s+5) - 0 = s^2 + 7 s + 10
                           0   (s+5) ]

   adj (s I - A) = [ (s+5)   3
                       0   (s+2) ]

So the TF is given by:

   Y (s)/U (s) = (1/(s^2 + 7 s + 10)) [ 5  6 ] [ (s+5)   3     ] [ 2 ] + [0]      (102)
                                               [   0   (s+2)   ] [ 3 ]

Multiplying out

   Y (s)/U (s) = (1/(s^2 + 7 s + 10)) [ 5  6 ] [ 2 s + 19 ]  =  (28 s + 131)/(s^2 + 7 s + 10)      (103)
                                               [ 3 s + 6  ]

Checking in Matlab:

>> A = [ -2 3 ; 0 -5 ];  B = [ 2 ; 3 ];  C = [ 5  6 ];  D = 0;
>> SSmodel = ss(A, B, C, D)
>> tf(SSmodel)
Transfer function:
  28 s + 131
--------------
s^2 + 7 s + 10

Note:
- The denominator is given by det (s I - A), showing that the poles are the eigenvalues of the A matrix.
- The B, C, and D matrices play no role in determining the poles, and thus the stability.
- If we had additional inputs (additional columns in B) or outputs (additional rows in C) the term C adj (s I - A) B would give an array of numerator polynomials, one for each input/output pair.
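The hand computation in Eqns (102)-(103) can also be verified numerically, by evaluating both forms of the TF at a test point (a NumPy sketch of the same example):

```python
import numpy as np

A = np.array([[-2.0, 3.0], [0.0, -5.0]])
B = np.array([[2.0], [3.0]])
C = np.array([[5.0, 6.0]])   # D = 0

s0 = 1.0                     # arbitrary test point
G_ss = (C @ np.linalg.solve(s0*np.eye(2) - A, B))[0, 0]
G_tf = (28*s0 + 131) / (s0**2 + 7*s0 + 10)
print(G_ss, G_tf)            # both about 8.8333
```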

8.1.2 DC gain of a state-variable model

From Eqn (96) we can determine the DC Gain of a state-variable model,

   KDC = lim (s -> 0) Y (s)/U (s) = lim (s -> 0) C (s I - A)^-1 B + D = C (-A)^-1 B + D      (104)
       = -C A^-1 B + D      (105)

8.1.3 Interpreting D

A direct transmission term corresponds to a transfer function that is not strictly proper:
- If D != 0, then the number of zeros equals the number of poles.
- If D = 0, we call the system strictly proper.
- If D != 0, we call the system proper, but not strictly proper.

8.2 Coordinate transformation of a State Variable Model

Given a state-variable model with state vector x (t)

   x'(t) = A x (t) + B u (t)      (106)
   y (t) = C x (t) + D u (t)      (107)

and an invertible transformation matrix T giving a new state vector z (t)

   z (t) = T x (t)      (108)

we can derive a new state model based on state vector z (t). We can say that we have transformed the system from the coordinate system of x (t) to the coordinate system of z (t).

Derivation of the transformation is straightforward. From Eqn (108) we can solve for x (t):

   x (t) = T^-1 z (t)      (109)
   x'(t) = T^-1 z'(t)      (110)

Plugging (108) and (109) into Eqns (106) and (107) gives

   T^-1 z'(t) = A T^-1 z (t) + B u (t)      (111)

From (111) we can write

   z'(t) = T A T^-1 z (t) + T B u (t)      (112)
   y (t) = C T^-1 z (t) + D u (t)

Eqn (112) gives the transformed state model

   z'(t) = Ahat z (t) + Bhat u (t)      (113)
   y (t) = Chat z (t) + D u (t)         (114)

with

   Ahat = T A T^-1      (115)
   Bhat = T B            (116)
   Chat = C T^-1         (117)
   D : unchanged

The transformation is illustrated by figure 30. The input and output are unchanged; only the internal representation of the system is changed. Note that the D matrix, which directly couples u (t) to y (t), is unchanged.

[Figure: two block diagrams omitted -- u (t) --> {A, B, C, D} with state x (t) --> y (t), and u (t) --> {Ahat, Bhat, Chat, D} with state z (t) --> y (t)]
Figure 30: Block diagrams of original linear state-variable system and transformed system.

Equation (115) is a similarity transform. A similarity transform preserves eigenvalues,

   eig (A) = eig (Ahat)      (118)

so the poles of the transformed system are the same as the poles of the original system.

Coordinate transformation is very powerful: we can convert a given model with x (t) to an equivalent model with z (t) by choice of any invertible matrix T. This directly gives the set of all possible equivalent models!
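Eqn (118) is easy to demonstrate numerically (a NumPy sketch; the matrices are arbitrary illustrative choices, any invertible T would do):

```python
import numpy as np

A = np.array([[-2.0, 3.0], [0.0, -5.0]])
T = np.array([[1.0, 1.0], [0.0, 1.0]])   # an invertible transformation

# Similarity transform of Eqn (115)
Ahat = T @ A @ np.linalg.inv(T)

eig_A = np.sort(np.linalg.eigvals(A).real)
eig_Ahat = np.sort(np.linalg.eigvals(Ahat).real)
print(eig_A, eig_Ahat)                   # both [-5. -2.]
```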

8.2.1 Example coordinate transformation

Consider the example of section 7.2.

[Figure: quarter-vehicle schematic repeated]
Figure 31: Quarter vehicle suspension.

Original State Variable Model (section 7.2.1). We derived the model with

   x (t) = [ zw (t) ; zw'(t) ; zv (t) ; zv'(t) ]

   x'(t) = [      0            1         0         0
            -(kw+ks)/m1     -bs/m1    +ks/m1    +bs/m1
                 0            0         0         1
              +ks/m2        +bs/m2    -ks/m2    -bs/m2 ] x (t) + [ 0 ; kw/m1 ; 0 ; 0 ] r (t)      (119)

   y1 (t) = zv (t) = [ 0  0  1  0 ] x (t) + [ 0 ] r (t)

With the second parameter set, this gave

a =       x1    x2    x3    x4        b =       u1
   x1      0     1     0     0           x1      0
   x2   -440   -40    40    40           x2    400
   x3      0     0     0     1           x3      0
   x4      4     4    -4    -4           x4      0
c =       x1    x2    x3    x4        d =       u1
   y1      0     0     1     0           y1      0

Suppose we were interested in the suspension deflection

   zs (t) = zw (t) - zv (t)

If our interest was such that we wanted a state model with zs (t) and zs'(t) directly as states, we could introduce the transformation

   z (t) = [ zs (t) ; zs'(t) ; zv (t) ; zv'(t) ] = T x (t)      (120)

So transformation T is given as:

   T = [ 1  0  -1   0
         0  1   0  -1
         0  0   1   0
         0  0   0   1 ]
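The transformed matrices computed in the Matlab session that follows can be reproduced directly from Ahat = T A T^-1, Bhat = T B, Chat = C T^-1 (a NumPy sketch using the second parameter set of section 7.2.1):

```python
import numpy as np

# Suspension model with the second parameter set
A = np.array([[0.0, 1.0, 0.0, 0.0],
              [-440.0, -40.0, 40.0, 40.0],
              [0.0, 0.0, 0.0, 1.0],
              [4.0, 4.0, -4.0, -4.0]])
B = np.array([[0.0], [400.0], [0.0], [0.0]])
C1 = np.array([[0.0, 0.0, 1.0, 0.0]])

# Transformation of Eqn (120): z = [zs; zs'; zv; zv']
T = np.array([[1.0, 0.0, -1.0, 0.0],
              [0.0, 1.0, 0.0, -1.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])

Ahat = T @ A @ np.linalg.inv(T)
Bhat = T @ B
Chat = C1 @ np.linalg.inv(T)
print(np.round(Ahat))   # rows [0 1 0 0], [-444 -44 -400 0], [0 0 0 1], [4 4 0 0]
```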

Introduce the transformation

>> T = [ 1 0 -1 0 ; 0 1 0 -1 ; 0 0 1 0 ; 0 0 0 1 ]
T =  1   0  -1   0
     0   1   0  -1
     0   0   1   0
     0   0   0   1
>> Ahat = T * A * inv(T);   Bhat = T * B
>> Chat = C1 * inv(T);      Dhat = D1
>> SSmodelHat = ss(Ahat, Bhat, Chat, Dhat)
a =       x1    x2    x3    x4
   x1      0     1     0     0
   x2   -444   -44  -400     0
   x3      0     0     0     1
   x4      4     4     0     0
b =       u1
   x1      0
   x2    400
   x3      0
   x4      0
c =       x1    x2    x3    x4
   y1      0     0     1     0
d =       u1
   y1      0
Continuous-time model.

[Figure: step response and Bode plots omitted]
Figure 32: Suspension step and frequency response of transformed model.

Of course, the step and frequency response are unchanged. The model transformation changes only the internal representation of the system.

9 State-variable Feedback control

Put in State Feedback control; the control signal u (t) is given as:

   u (t) = -K x (t) + Nf r (t)      (121)

The control signal depends on the state vector and a reference input. Putting in feedback control is illustrated in figures 33, 34 and 35.

[Figure: block diagram omitted -- input u (t) drives the system {Ap, Bp, Cp, Dp} with state x (t) and output y (t)]
Figure 33: State-variable model of the open-loop system. This is the plant before feedback control is applied, u (t) as input, and y (t) as output.

[Figure: block diagram omitted -- reference r (t) scaled by Nf, summed with -K x (t) to form u (t), driving the plant {Ap, Bp, Cp, Dp}]
Figure 34: State-variable model of the closed-loop system with feed-forward gain in input. The closed-loop system has r (t) as input and y (t) as output.

[Figure: block diagram omitted -- r (t) drives the new system {Acl, Bcl, Ccl, Dcl} with state x (t) and output y (t)]
Figure 35: State-space model of the closed-loop system, with r (t) as input and y (t) as output.

Feedback control fundamentally transforms the system, changing the state-variable model from figure 33 to 35.

9.1 Determination of a new model with state feedback

To determine the state-variable model of the system with feedback, start
with the open-loop model (figure 33):

    x'(t) = Ap x(t) + Bp u(t)                          (16, repeated)
    y(t)  = Cp x(t) + Dp u(t)                          (17, repeated)

Plugging the control law

    u(t) = -K x(t) + Nf r(t)                           (121, repeated)

into the state equation, we find

    x'(t) = Ap x(t) + Bp u(t) = Ap x(t) - Bp K x(t) + Bp Nf r(t)
          = (Ap - Bp K) x(t) + Bp Nf r(t)

So we can write

    x'(t) = Acl x(t) + Bcl r(t)                        (122)

with

    Acl = Ap - Bp K                                    (123)
    Bcl = Bp Nf                                        (124)

Plugging the control law into the output equation, we find

    y(t) = Cp x(t) + Dp u(t) = Cp x(t) - Dp K x(t) + Dp Nf r(t)
         = (Cp - Dp K) x(t) + Dp Nf r(t)

So we can write

    y(t) = Ccl x(t) + Dcl r(t)                         (125)

with

    Ccl = Cp - Dp K                                    (126)
    Dcl = Dp Nf                                        (127)

Eqns (122)-(127) describe how we determine the state-variable model of the
system with feedback control.

State feedback control is fundamentally different from single-loop,
compensated feedback:

- Feedback is based on the state vector
- There is no compensator transfer function Gc(s)
- The control gains form a matrix, K in R^(m x n), where m is the number of
  inputs (for a single-input system, K is a row vector).
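The bookkeeping of Eqns (122)-(127) is easy to check numerically. Below is a
minimal sketch in Python/NumPy (the notes use MATLAB); the plant matrices and
gains here are hypothetical illustrative values, not one of the models in the
notes.

```python
import numpy as np

# Illustrative 2-state, 1-input, 1-output plant (hypothetical values)
Ap = np.array([[0.0, 1.0],
               [-2.0, -3.0]])
Bp = np.array([[0.0], [1.0]])
Cp = np.array([[1.0, 0.0]])
Dp = np.array([[0.0]])

K = np.array([[4.0, 2.0]])   # state-feedback gain, u = -K x + Nf r
Nf = 1.0                     # feed-forward gain on the reference

# Eqns (123)-(124) and (126)-(127): the closed-loop model
Acl = Ap - Bp @ K
Bcl = Bp * Nf
Ccl = Cp - Dp @ K
Dcl = Dp * Nf

print(np.linalg.eigvals(Ap))   # open-loop poles: -1, -2
print(np.linalg.eigvals(Acl))  # closed-loop poles, moved by the feedback: -2, -3
```

The feedback leaves the model structure intact; only the matrices change, as
Eqns (122)-(127) state.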

9.2 State-variable feedback example: Inverted pendulum

An inverted pendulum is a mechanism comprising a cart and a pendulum that
holds the pendulum upright by feedback control, as illustrated in figure 36.
The system is open-loop unstable. The system models an aspect of the
challenge of rocket launch.

Figure 36: Mechanical schematic of an inverted pendulum: cart at position
z(t), driven by applied force F(t) against friction force Ff = b z'(t), with
the pendulum at angle θ(t).

Parameters:

    M        Mass of cart           [kg]
    m        Mass of pendulum       [kg]
    l        Length of pendulum     [m]
    b        Friction coef.         [N/(m/s)]
    I        Inertia of pendulum    [kg m^2]
    g = 9.8  Accel. of gravity      [m/s^2]

Signals:

    z(t)     Cart position          [m]
    θ(t)     Pendulum angle         [deg]
    F(t)     Applied force          [N]

From Lagrange's equations, the equations of motion for the inverted
pendulum:

    (M + m) z''(t) + b z'(t) + m l θ''(t) cos(θ(t)) - m l θ'(t)² sin(θ(t)) = F(t)   (128)

    (I + m l²) θ''(t) - m g l sin(θ(t)) + m l z''(t) cos(θ(t)) = 0                  (129)

Linearizing the equations about the upright operating point

    θ₀ = 0 ,  θ'₀ = 0

and, for simplification, defining

    p = I (M + m) + M m l²

Choosing x(t) and forming the state-space model:

    x(t) = [ z(t); z'(t); θ(t); θ'(t) ]

    Ap = [ 0   1                 0                 0
           0   -(I + m l²) b/p   (m² g l²)/p       0
           0   0                 0                 1
           0   -(m l b)/p        m g l (M + m)/p   0 ]

    Bp = [ 0;  (I + m l²)/p;  0;  m l/p ]

    Cp = [ 1 0 0 0
           0 0 1 0 ]

    Dp = [ 0; 0 ]
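Using the example data of the next section (M = 0.5, m = 0.2, b = 0.1,
l = 0.3, I = 0.006, g = 9.8), the linearized model can be assembled and the
open-loop instability confirmed. A sketch in Python/NumPy (the notes use
MATLAB):

```python
import numpy as np

M, m, b, l, I, g = 0.5, 0.2, 0.1, 0.3, 0.006, 9.8
p = I*(M + m) + M*m*l**2          # common denominator, p = 0.0132

# Linearized model, x = [z, zdot, theta, thetadot]^T
Ap = np.array([[0, 1,                 0,               0],
               [0, -(I + m*l**2)*b/p, (m**2*g*l**2)/p, 0],
               [0, 0,                 0,               1],
               [0, -(m*l*b)/p,        m*g*l*(M + m)/p, 0]])
Bp = np.array([[0], [(I + m*l**2)/p], [0], [m*l/p]])

print(np.round(Ap, 4))
# One eigenvalue has positive real part: the pendulum is open-loop unstable.
print(np.linalg.eigvals(Ap))
```

The printed entries reproduce the MATLAB session below (-0.1818, 2.6727,
-0.4545, 31.1818), and one pole sits well into the right half-plane.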

Example Data:

    M = 0.5;    m = 0.2;    b = 0.1;
    i = 0.006;  g = 9.8;    l = 0.3;

>> p = i*(M+m) + M*m*l^2
p =
    0.0132

>> Ap = [ 0    1                 0               0
          0    -(i+m*l^2)*b/p    (m^2*g*l^2)/p   0
          0    0                 0               1
          0    -(m*l*b)/p        m*g*l*(M+m)/p   0 ]
Ap =
         0    1.0000         0         0
         0   -0.1818    2.6727         0
         0         0         0    1.0000
         0   -0.4545   31.1818         0

>> Bp = [0;  (i+m*l^2)/p;  0;  m*l/p]
Bp =
         0
    1.8182
         0
    4.5455

>> Cp = [ 1 0 0 0 ;
          0 0 1 0 ]

>> Dp = [ 0 ; 0 ]

9.2.1 Designing a pole placement controller

>> Sdesired = [ -3 -4 -5 -6 ]

>> K = place(Ap, Bp, Sdesired)
K =
   -8.0816   -7.7776   36.2727    7.0310

>> Acl = Ap - Bp * K
Acl =
         0    1.0000         0         0
   14.6939   13.9592  -63.2776  -12.7837
         0         0         0    1.0000
   36.7347   34.8980 -133.6939  -31.9592

>> Nf = 1/(Cp*inv(-Acl)*Bp)
Nf =
   -8.0816         0

>> SScl = ss(Acl, Nf(1)*Bp, Cp, Dp)

a =
            x1       x2       x3       x4
   x1        0        1        0        0
   x2    14.69    13.96   -63.28   -12.78
   x3        0        0        0        1
   x4    36.73     34.9   -133.7   -31.96

b =
           u1
   x1       0
   x2  -14.69
   x3       0
   x4  -36.73

Continuous-time model.
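The same pole-placement design can be reproduced with
scipy.signal.place_poles in place of MATLAB's place; a sketch, using the
rounded model values from the session above:

```python
import numpy as np
from scipy.signal import place_poles

Ap = np.array([[0, 1,       0,       0],
               [0, -0.1818, 2.6727,  0],
               [0, 0,       0,       1],
               [0, -0.4545, 31.1818, 0]])
Bp = np.array([[0], [1.8182], [0], [4.5455]])

# Desired closed-loop poles, as in Sdesired = [-3 -4 -5 -6]
fsf = place_poles(Ap, Bp, [-3, -4, -5, -6])
K = fsf.gain_matrix                  # u = -K x

Acl = Ap - Bp @ K
print(np.sort(np.linalg.eigvals(Acl).real))   # -> approximately [-6 -5 -4 -3]
```

For a single-input system the gain is unique, so K matches the place()
result above up to the rounding of Ap and Bp.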

>> step(SScl)

Figure 37: Step response of Inverted Pendulum with control (outputs: cart
position and pendulum angle vs time).

9.2.2 LQR Design

With linear time-invariant systems there is a very nice optimal control
result. Consider the cost function

    J = ∫₀^∞ [ x(t)ᵀ Q x(t) + u(t)ᵀ R u(t) ] dt         (130)

This cost function is minimized by a suitable choice of controller K.

First LQR controller example:

>> Q = diag( [1 1 1 1] )
Q =
     1     0     0     0
     0     1     0     0
     0     0     1     0
     0     0     0     1

>> R = 1
R = 1

>> K = lqr(Ap, Bp, Q, R)
K =
   -1.0000   -2.0408   20.3672    3.9302

>> Acl = Ap - Bp * K
Acl =
         0    1.0000         0         0
    1.8182    3.5287  -34.3585   -7.1458
         0         0         0    1.0000
    4.5455    8.8217  -61.3961  -17.8646

>> Nf = 1/(Cp*inv(-Acl)*Bp)
Nf =
   -1.0000         0

>> Poles1 = eig(Acl)
Poles1 =
   -8.3843
   -3.7476
   -1.1020 + 0.4509i
   -1.1020 - 0.4509i

Figure 38: Step response of first LQR controller (outputs: cart position
and pendulum angle vs time).

Second LQR controller example:

Bryson's rules: choose elements of Q and R to be 1/x̄ᵢ², where x̄ᵢ is the
allowed excursion of the ith state.

Example use of Bryson's rules: to accelerate the response, place greater
cost on position error

>> Q = diag( [100 1 100 1] )
Q =
   100     0     0     0
     0     1     0     0
     0     0   100     0
     0     0     0     1

>> K = lqr(Ap, Bp, Q, R)
K =
  -10.0000   -8.2172   38.6503    7.2975

>> Acl = Ap - Bp * K

>> Nf = 1/(Cp*inv(-Acl)*Bp)
Nf =
  -10.0000         0

>> Poles2 = eig(Acl)
Poles2 =
   -6.9654 + 2.7222i
   -6.9654 - 2.7222i
   -2.2406 + 1.7159i
   -2.2406 - 1.7159i

Figure 39: Step response of second LQR controller.
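MATLAB's lqr can be mirrored in Python by solving the continuous algebraic
Riccati equation with SciPy and forming K = R⁻¹ Bᵀ P. A sketch using the
first example's weights (Q = I, R = 1) and the rounded model values:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

Ap = np.array([[0, 1,       0,       0],
               [0, -0.1818, 2.6727,  0],
               [0, 0,       0,       1],
               [0, -0.4545, 31.1818, 0]])
Bp = np.array([[0], [1.8182], [0], [4.5455]])

Q = np.eye(4)          # state weighting, Q = diag([1 1 1 1])
R = np.array([[1.0]])  # control weighting

# Solve the CARE: Ap'P + P Ap - P Bp R^-1 Bp' P + Q = 0, then K = R^-1 Bp' P
P = solve_continuous_are(Ap, Bp, Q, R)
K = np.linalg.solve(R, Bp.T @ P)

Acl = Ap - Bp @ K
print(np.round(K, 4))           # compare with the lqr() result above
print(np.linalg.eigvals(Acl))   # closed-loop poles, all in the LHP
```

The stabilizing CARE solution is unique, so this K agrees with the MATLAB
session up to the rounding of Ap and Bp.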

10 Conclusions

We've seen:

- Some of the basic properties of system models, and classification of
  signals.

- How to build a state-variable model in 5 steps:
  1. Write the relevant relations for the system
  2. Identify the differential equations
  3. Write the differential equations in state-variable form
  4. Write the equation of the output signal (or signals) using the states
     and inputs
  5. Check units throughout, to verify correctness.

- Advantages of state-variable modeling. (Student exercise: list at least
  three reasons why state-variable models are advantageous relative to
  differential equation modeling.)

- Construction of several example state-variable models

- Basic operations on a state-variable model, including:
  - Determining the transfer function
  - Coordinate transformation
  - State feedback control.

EE/ME 701: Advanced Linear Systems

Solutions to the State Equation and
Modes and Modal Coordinates

Contents

1 Modal coordinates
  1.1 Derivation of modal coordinates
      1.1.1 Choose the basis vectors to be the columns of the modal matrix
      1.1.2 Example transformation into modal coordinates
      1.1.3 Interpretation of the transformations . . . . . . . . . . . 10
  1.2 Example, a double pendulum . . . . . . . . . . . . . . . . . . . 12
  1.3 Transformation of the state model to and from modal coordinates
      (Similarity transform from Ap to Am or back) . . . . . . . . . . 16
      1.3.1 Case of a full set of real, distinct eigenvalues
            (diagonalizing Ap) . . . . . . . . . . . . . . . . . . . . 17
      1.3.2 Example, second-order system, 2 first-order modes . . . . . 18

2 Complex eigenvalue pairs                                              19
  2.1 Example, double pendulum revisited . . . . . . . . . . . . . . . 19
  2.2 General form for combining complex conjugate parts of a 2nd
      order mode . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
  2.3 Deriving real basis vectors for a complex mode . . . . . . . . . 25
  2.4 Example with 2nd order modes, converting basis vectors to real . 27

3 x'(t) = Ap x(t) defines Ap-invariant spaces                           30

4 Conclusions                                                           31

1 Modal coordinates

1.1 Derivation of modal coordinates

Considering the response of a linear system:

    x'(t) = Ap x(t) + Bp u(t)                           (1)
    y(t)  = Cp x(t) + Dp u(t)                           (2)

where Ap, Bp, Cp, Dp form the model in physical coordinates.

Given a state vector x(t) ∈ Rⁿ and a set of basis vectors {eᵢ} for Rⁿ, we
know that we can represent x(t) on basis {eᵢ} by

    x(t) = Σᵢ₌₁ⁿ ξᵢ(t) eᵢ                               (3)

where the ξᵢ(t) are the basis coefficients and

    ξ(t) = [ ξ₁(t); ... ; ξₙ(t) ]

is the representation of x(t) on basis vectors { e₁, e₂, ..., eₙ }.

Writing M = [ e₁, e₂, ..., eₙ ],

    x(t) = M ξ(t)    so    ξ(t) = M⁻¹ x(t)              (4)

Part 6: Response in Modal Coordinates    (Revised: Sep 10, 2012)    Page 2

Likewise, we can represent the input signal Bp u(t) on the same basis:

    Σᵢ₌₁ⁿ βᵢ(t) eᵢ = Bp u(t)                            (5)

where the βᵢ(t) are the basis coefficients representing Bp u(t) on {eᵢ}.

Writing

    Σᵢ₌₁ⁿ βᵢ(t) eᵢ = M β(t)                             (6)

then

    Bp u(t) = M β(t)    so    β(t) = M⁻¹ Bp u(t)

If we expand x(t) in the state equation using the representation on {eᵢ},
we find, from x'(t) = Ap x(t) + Bp u(t):

    Σᵢ₌₁ⁿ ξ'ᵢ(t) eᵢ = Σᵢ₌₁ⁿ ξᵢ(t) Ap eᵢ + Σᵢ₌₁ⁿ βᵢ(t) eᵢ      (7)

Rearranging the state equation gives (see Bay, Eqn (6.24)), from
x'(t) - Ap x(t) - Bp u(t) = 0:

    Σᵢ₌₁ⁿ ξ'ᵢ(t) eᵢ - Σᵢ₌₁ⁿ ξᵢ(t) Ap eᵢ - Σᵢ₌₁ⁿ βᵢ(t) eᵢ = 0   (8)

Even though the ξᵢ(t) and βᵢ(t) terms in Eqn (8) are scalar, because of the
middle term, with matrix Ap, Eqn (8) leads in general to the vector equation

    Σᵢ₌₁ⁿ ( ξ'ᵢ(t) I - ξᵢ(t) Ap - βᵢ(t) I ) eᵢ = 0      (9)

which is not especially helpful.

1.1.1 Choose the basis vectors to be the columns of the modal matrix

However, when the eᵢ are the columns of the modal matrix of Ap (assuming
for now a complete set of independent eigenvectors), then the terms

    ξᵢ(t) Ap eᵢ    become    ξᵢ(t) λᵢ eᵢ

and Eqn (8) becomes

    Σᵢ₌₁ⁿ ( ξ'ᵢ(t) eᵢ - ξᵢ(t) λᵢ eᵢ - βᵢ(t) eᵢ ) = Σᵢ₌₁ⁿ ( ξ'ᵢ(t) - λᵢ ξᵢ(t) - βᵢ(t) ) eᵢ = 0   (10)

Since the eᵢ are independent, Eqn (10) is verified only if each term in
parentheses is zero, which gives a set of simultaneous scalar equations (see
Bay Eqn (6.25)):

    ξ'ᵢ(t) - λᵢ ξᵢ(t) - βᵢ(t) = 0    or    ξ'ᵢ(t) = λᵢ ξᵢ(t) + βᵢ(t)   (11)

Representing the state and input on the modal matrix of Ap, an nth order
coupled differential equation becomes a set of n first order uncoupled
differential equations!

Eqn (11) we know how to solve:

    ξᵢ(t) = ξᵢ(t₀) e^(λᵢ (t-t₀)) + ∫_{t₀}^t e^(λᵢ (t-τ)) βᵢ(τ) dτ   (12)

And the full solution is given from Eqn (3):

    x(t) = M ξ(t)

The basis vectors eᵢ that decouple the system are the columns of the modal
matrix

    [ e₁ e₂ ... eₙ ] = M ,    recall Ap = M J M⁻¹       (13)

Eqn (13) is the general form. When a complete set of independent
eigenvectors exists, then

    M = V ,    J = U    and    Ap = V U V⁻¹             (14)

where V is the matrix of eigenvectors and U has the eigenvalues on the main
diagonal.

1.1.2 Example transformation into modal coordinates

Consider the system governed by

    x'(t) = [ -0.164  -0.059 ] x(t) + [ 1.085 ] u(t)    (15)
            [ -0.059  -0.164 ]        [ 0.031 ]

with initial condition

    x(t₀) = [ 1; 0 ]

The eigensystem is:

>> [V, U] = eig(Ap)
V =
   -0.7071    0.7071
   -0.7071   -0.7071
U =
   -0.2231         0
         0   -0.1054

There is a complete set of independent eigenvectors. The modal matrix is
given by:

    M = (1/0.707) V ,    M = [ 1   1 ]                  (16)
                             [ 1  -1 ]

Eqn (16) illustrates that the basis vectors of the modal matrix can be
scaled by a parameter (actually each vector can be scaled independently).
By scaling the entries of M to 1.0, some of the coefficients below get
simpler.

The βᵢ are given from Eqn (5), which can be rewritten as

    M [ β₁(t); ... ; βₙ(t) ] = Bp u(t)                  (17)

so

    β(t) = M⁻¹ Bp u(t) = [ 0.558 ] u(t)                 (18)
                         [ 0.527 ]

The initial conditions are given from

    ξ(t₀) = M⁻¹ x(t₀) = [ 0.5 ]                         (19)
                        [ 0.5 ]

And the two uncoupled solutions are (with t₀ = 0)

    ξ₁(t) = 0.5 e^(-0.223 t) + 0.558 ∫₀ᵗ e^(-0.223 (t-τ)) u(τ) dτ
    ξ₂(t) = 0.5 e^(-0.105 t) + 0.527 ∫₀ᵗ e^(-0.105 (t-τ)) u(τ) dτ

For a step input, which gives constant βᵢ(t), the form for the solution is:

    ξᵢ(t) = ξᵢ(0) e^(λᵢ t) + βᵢ (1/λᵢ) ( e^(λᵢ t) - 1 )     (20)

which gives

    ξ₁(t) = 0.5 e^(-0.223 t) + 0.558 (1/0.223) ( 1 - e^(-0.223 t) )
    ξ₂(t) = 0.5 e^(-0.105 t) + 0.527 (1/0.105) ( 1 - e^(-0.105 t) )   (21)

The transformation back to physical coordinates is given by:

    x(t) = M ξ(t) = e₁ ξ₁(t) + e₂ ξ₂(t) = [ 1 ] ξ₁(t) + [  1 ] ξ₂(t)   (22)
                                          [ 1 ]         [ -1 ]

Eqn (22) can be used in two ways:

1. Eqn (22) shows that the output x(t) is the superposition of n
   contributions. Each contributing vector is a basis vector. They are the
   columns of the modal matrix. And each has a basis coefficient ξᵢ(t).

2. We can add up the contributions for an individual xᵢ(t), such as:

    x₁(t) = 0.5 e^(-0.223 t) + 2.5 ( 1 - e^(-0.223 t) )
          + 0.5 e^(-0.105 t) + 5 ( 1 - e^(-0.105 t) )   (23)
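The coefficients in Eqns (18), (19) and (23) can be checked numerically. A
sketch in Python/NumPy (the notes use MATLAB):

```python
import numpy as np

Ap = np.array([[-0.164, -0.059],
               [-0.059, -0.164]])
Bp = np.array([1.085, 0.031])

# Modal matrix with entries scaled to 1 (Eqn (16));
# columns: e1 = [1,1] (lambda = -0.223), e2 = [1,-1] (lambda = -0.105)
M = np.array([[1.0, 1.0],
              [1.0, -1.0]])

lam = np.linalg.eigvals(Ap)                       # -0.2231 and -0.1054
beta = np.linalg.solve(M, Bp)                     # Eqn (18): [0.558, 0.527]
xi0 = np.linalg.solve(M, np.array([1.0, 0.0]))    # Eqn (19): [0.5, 0.5]

# Step-input steady states of the uncoupled modes, beta_i/|lambda_i|,
# are the coefficients 2.5 and 5 appearing in Eqn (23)
print(beta / np.abs(np.sort(lam)))
```

The exact eigenvalues are -(0.164 + 0.059) = -0.223 and
-(0.164 - 0.059) = -0.105, so the rounded MATLAB values are recovered.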

1.1.3 Interpretation of the transformations

1. By transformation into modal coordinates, the states are uncoupled.

2. The uncoupled equations can be solved.

3. The solution in physical coordinates is found by transformation from
   modal coordinates back into physical coordinates.

From

    ξ(t) = M⁻¹ x(t) = [ r₁ᵀ; ... ; rₙᵀ ] x(t)           (24)

the jth row of M⁻¹ gives the coupling of the physical states into the jth
mode. We can write that

    ξ₁(t) = r₁ᵀ x(t) = ⟨r₁, x(t)⟩
      ...
    ξₙ(t) = rₙᵀ x(t) = ⟨rₙ, x(t)⟩                       (25)

From

    x(t) = M ξ(t)                                       (26)

each mode contributes to the physical states according to a column of M,
and each physical state is determined by a combination of modes according
to a row of M.

The order of an equation in the uncoupled equations can be:

- First order, for a real pole.

- Second order, for a complex pole pair: Eqn (11) works directly for
  complex pole pairs, but it may be more convenient to organize each pair
  of complex poles into a 2nd order real-valued mode (see section 2).

- nth order for an n × n Jordan block: when there is not a complete set of
  eigenvectors, the Jordan form is used, and gives l coupled states for
  each chain of l eigenvectors (one regular and l - 1 generalized).

The response of a linear system can be thought of as a collection of first-
and second-order mode responses with forcing functions. Through the M
matrix, the response of the physical states of the system (voltages and
velocities, say) generally involves a superposition of all of the modes of
the system.

From (assuming Bp is constant)

    β(t) = M⁻¹ Bp u(t)                                  (27)

the input is coupled to each mode according to the elements of M⁻¹ Bp.
Following the example of Eqn (25),

    βᵢ(t) = ⟨rᵢ, Bp⟩ u(t)                               (28)

Notice in particular that if there is an element of M⁻¹ Bp that is zero,
there is no forcing function for that mode in Eqn (12):

    ξᵢ(t) = ξᵢ(t₀) e^(λᵢ (t-t₀)) + ∫_{t₀}^t e^(λᵢ (t-τ)) βᵢ(τ) dτ   (Eqn (12), repeated)

Since the modes are uncoupled, if a βᵢ is zero, the input u(t) is not
connected to that mode.

We call the system expressed in terms of ξ(t) and using Am, Bm, Cm, Dp the
system in modal coordinates, because each state variable is uniquely
associated with a mode.

1.2 Example, a double pendulum

Consider 2 masses connected by a spring; this system will have two
oscillatory modes:

- a lower frequency mode in which the masses swing in phase, and
- a higher frequency mode in which the masses swing in opposite phase.

Figure 1: Two mass system at rest: pendulums of mass m₁ and m₂, coupled by
a spring of stiffness k, with input u applied to the first pendulum.

Figure 2: Modes of the two mass system: in phase, and opposite phase.

The linearized equations of motion are:

    m₁ l² θ₁'' + d θ₁' + (k + m g l) θ₁ = k θ₂ + u      (29)
    m₂ l² θ₂'' + d θ₂' + (k + m g l) θ₂ = k θ₁          (30)

With m = 2, l = 1, d = 3, k = 20, and g = 9.8, we find:

    x(t) = [ θ₁'(t); θ₁(t); θ₂'(t); θ₂(t) ]

    Ap = [ -1.50  -19.80    0.00   10.00
            1.00    0.00    0.00    0.00
            0.00   10.00   -1.50  -19.80
            0.00    0.00    1.00    0.00 ]              (31)

The system has two 2nd order modes:

- A faster, out-of-phase mode, excited from the initial condition
  (figure 3)

      xᵀ(0) = [ 0  1  0  -1 ]

- A slower, in-phase mode, excited from the initial condition (figure 4)

      xᵀ(0) = [ 0  1  0  1 ]

We can excite the modes individually, by setting up initial conditions that
lie in the column space formed by the basis vectors of the mode. General
initial conditions give motion that is a superposition of all modes.

Figure 3: Two masses swinging out of phase (mode 1, mode 2 not excited).
[Plots of θ₁ and θ₂ in degrees vs t in seconds.]

Figure 4: Two masses swinging in phase (mode 2, mode 1 not excited).
[Plots of θ₁ and θ₂ in degrees vs t in seconds.]

Figure 5: Two masses swinging, both modes excited. (Figure 5 is the
superposition of figures 4 and 3.)

Notes:

- The unforced response (to initial conditions) is considered in figures
  3-5.

- The forced response (to u(t)) can also be understood in terms of modes.
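The modal frequencies of the two-mass system follow directly from Ap in
Eqn (31); a quick check in Python/NumPy (the notes use MATLAB):

```python
import numpy as np

# Ap from Eqn (31), state x = [theta1dot, theta1, theta2dot, theta2]
Ap = np.array([[-1.5, -19.8,  0.0,  10.0],
               [ 1.0,   0.0,  0.0,   0.0],
               [ 0.0,  10.0, -1.5, -19.8],
               [ 0.0,   0.0,  1.0,   0.0]])

lam = np.linalg.eigvals(Ap)
print(np.round(lam, 2))   # two complex pairs: -0.75 +/- 5.41j and -0.75 +/- 3.04j
```

The in-phase and out-of-phase coordinates satisfy s² + 1.5 s + 9.8 = 0 and
s² + 1.5 s + 29.8 = 0 respectively, which gives exactly these two pole
pairs.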

For an n dimensional system there may be n₁ first order modes and n₂ second
order modes, where n = n₁ + 2 n₂.

In modal coordinates, we can write the system dynamics as:

    ξ'(t) = Am ξ(t) + β(t)                              (32)

where Am is a block diagonal matrix:

- If the system has only first order modes, Am is diagonal (all blocks are
  1x1).

- If the system has second order modes, Am will have 2x2 blocks, one
  corresponding to each 2nd order mode.

- If the system requires the Jordan form, Am will have a block
  corresponding to each Jordan block.

The important physical property of modes is that they are uncoupled. In
figure 4 mode 1 evolves without mode 2, and likewise in figure 3. In figure
5 both modes evolve, independently.

Modes give distinct contributions to the total output, such as

    first order:    y(t) = a₁ e^(λ₁ t)
    second order:   y(t) = a₂ e^(σ₂ t) cos(ω₂ t + φ₂)

Modal coordinates are best (and perhaps only) understood in state space.

1.3 Transformation of the state model to and from modal coordinates
    (Similarity transform from Ap to Am or back)

We've seen that we can transform a state model to a new set of basis
vectors with a transformation matrix. Choosing the modal matrix M as a
special transformation matrix, and introducing ξ(t), the state vector in
modal coordinates,

    x(t) = M ξ(t)    and    ξ(t) = M⁻¹ x(t)

Then:

    ξ'(t) = (M⁻¹ Ap M) ξ(t) + M⁻¹ Bp u(t)
    y(t)  = Cp M ξ(t) + Dp u(t)                         (35)

Or:

    ξ'(t) = Am ξ(t) + Bm u(t)
    y(t)  = Cm ξ(t) + Dp u(t)

Where:

    Am = M⁻¹ Ap M                                       (33)
    Bm = M⁻¹ Bp
    Cm = Cp M

and

    Ap = M Am M⁻¹                                       (34)

In the general case, Am is block diagonal:

- 1x1 block for every real eigenvalue
- 2x2 block for every pair of complex eigenvalues
- l × l block for every l × l block in the Jordan form (if needed)

1.3.1 Case of a full set of real, distinct eigenvalues (diagonalizing Ap)

When we have real and distinct eigenvalues we will have a complete set of
independent eigenvectors, and

    U = diag( λ₁, ..., λₙ ) ,    V = [ v₁ ... vₙ ]

Then

    M = V    and    Am = M⁻¹ Ap M                       (36)

- The modal matrix is the eigenvector matrix.

- The system matrix in modal coordinates, Am, is the diagonal matrix of
  eigenvalues.

1.3.2 Example, second-order system, 2 first-order modes

Given the system of section 1.1.2, recall that the solution for x(t) is of
the form

    x(t) = M ξ(t) = e₁ ξ₁(t) + e₂ ξ₂(t) = [ 1 ] ξ₁(t) + [  1 ] ξ₂(t)
                                          [ 1 ]         [ -1 ]

A phase portrait is a plot of the state as a function of time:

- It can be in any coordinate frame.
- It will be an n-dimensional plot, for an nth order system.

Figure 6 is a plot of the phase portrait, showing the state trajectory from
initial condition

    x(0) = [ 1; 0 ]

Figure 6: Response of a system with two first-order modes, plotted in the
(x₁, x₂) plane with basis vectors e₁ and e₂. The ×'s show the state at 1.0
second intervals.

2 Complex eigenvalue pairs

Complex pole pairs correspond to 2nd order, oscillatory modes.

2.1 Example, double pendulum revisited

Considering the double pendulum example of section 1.2, the Ap matrix gives
the eigenvectors and eigenvalues:

    V = [ -0.70 + j 0.00   -0.70 - j 0.00   +0.68 + j 0.00   +0.68 - j 0.00
          +0.02 + j 0.13   +0.02 - j 0.13   -0.05 - j 0.21   -0.05 + j 0.21
          +0.70 + j 0.00   +0.70 - j 0.00   +0.68 + j 0.00   +0.68 - j 0.00
          -0.02 - j 0.13   -0.02 + j 0.13   -0.05 - j 0.21   -0.05 + j 0.21 ]

    Am = diag( -0.75 + j 5.41 ,  -0.75 - j 5.41 ,  -0.75 + j 3.04 ,  -0.75 - j 3.04 )

corresponding to 2 complex modes (figures 3 and 4).

When each pole is distinct, the state-response of a system is the
superposition of the individual modal responses. Putting together Eqns (25)
and (24) above, the response due to the initial condition and each mode is
given by:

    x(t) = e₁ ξ₁(t) + e₂ ξ₂(t) + e₃ ξ₃(t) + e₄ ξ₄(t)    (37)

with

    ξᵢ(t) eᵢ = e^(λᵢ t) eᵢ rᵢᵀ x(0)                     (38)

When the poles are complex, Eqns (37), (38) are nonetheless valid. For the
example above,

    ξ₁(t) e₁ = e^((-0.75 + j 5.41) t) [ -0.70 + j 0.00; +0.02 + j 0.13; +0.70 + j 0.00; -0.02 - j 0.13 ]
               × [ -0.36 - j 0.05 ,  +0.00 - j 1.98 ,  +0.36 + j 0.05 ,  -0.00 + j 1.98 ] x(0)

    ξ₂(t) e₂ = e^((-0.75 - j 5.41) t) [ -0.70 - j 0.00; +0.02 - j 0.13; +0.70 - j 0.00; -0.02 + j 0.13 ]
               × [ -0.36 + j 0.05 ,  +0.00 + j 1.98 ,  +0.36 - j 0.05 ,  -0.00 - j 1.98 ] x(0)

Making the example very specific, with

    x(0) = [ 1; 2; -1; -2 ]    and    t = 0.6

Matlab code to evaluate Eqn (37):

>> [V, U] = eig(Ap);
>> Vinv = inv(V);
>> v1 = V(:,1)
v1 =
  -0.6955 + 0.0000i
   0.0175 + 0.1262i
   0.6955
  -0.0175 - 0.1262i
>> r1 = Vinv(1,:)
r1 =
  -0.36 - 0.05i   0.00 - 1.98i   0.36 + 0.05i  -0.00 + 1.98i
>> v2 = V(:,2)     %% v1*
>> r2 = Vinv(2,:)  %% r1*
>> x0 = [ 1; 2; -1; -2]
>> t = 0.6
>> xi1 = exp(U(1,1)*t) * v1 * r1 * x0
xi1 =
   0.0477 - 3.5723i
  -0.6494 + 0.0813i
  -0.0477 + 3.5723i
   0.6494 - 0.0813i
>> xi2 = exp(U(2,2)*t) * v2 * r2 * x0
xi2 =
   0.0477 + 3.5723i
  -0.6494 - 0.0813i
  -0.0477 - 3.5723i
   0.6494 + 0.0813i
>> RoundByRatCommand(xi1+xi2)  %% First and second terms combine
ans =                          %% to a real-valued 2nd order mode
   0.0954
  -1.2988
  -0.0954
   1.2988

The example above illustrates:

- If λᵢ is complex, eᵢ and rᵢ will in general also be.

- In this case Eqn (37) can make a complex contribution ξᵢ(t) eᵢ, but
  complex terms always come in complex conjugate pairs, and two terms like
  those of Eqn (38) make up the mode.

- We can combine the two 1st order complex terms into a single 2nd order
  real term.
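The same computation can be repeated in Python/NumPy as a check. Since the
spectral projector vᵢ rᵢᵀ is independent of eigenvector scaling, the real
sum ξ₁(t)e₁ + ξ₂(t)e₂ is well defined even though eig() may order or scale
the eigenvectors differently than MATLAB:

```python
import numpy as np

Ap = np.array([[-1.5, -19.8,  0.0,  10.0],
               [ 1.0,   0.0,  0.0,   0.0],
               [ 0.0,  10.0, -1.5, -19.8],
               [ 0.0,   0.0,  1.0,   0.0]])
x0 = np.array([1.0, 2.0, -1.0, -2.0])
t = 0.6

lam, V = np.linalg.eig(Ap)
Vinv = np.linalg.inv(V)

# Pick the eigenvalue -0.75 + 5.41j (the faster mode), wherever eig() put it
i = np.argmax(lam.imag)
proj = np.outer(V[:, i], Vinv[i, :])       # spectral projector v_i r_i^T

# Sum of the conjugate pair = twice the real part of one term
x_mode = 2.0 * np.real(np.exp(lam[i] * t) * (proj @ x0))
print(np.round(x_mode, 4))
```

This reproduces the xi1 + xi2 result of the MATLAB session: approximately
[0.0954, -1.2988, -0.0954, 1.2988].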

2.2 General form for combining complex conjugate parts of a 2nd order mode

Considering two terms from Eqn (37), a 2nd order mode makes a contribution
to x(t) according to:

    x(t) = [ (er + j ei)  (er - j ei) ] [ e^((σ + j ω) t)        0         ] [ (rr + j ri)ᵀ ] x(0)   (39)
                                        [       0         e^((σ - j ω) t) ] [ (rr - j ri)ᵀ ]

Define a + j b as

    a + j b = e^((σ + j ω) t) = e^(σ t) ( cos(ω t) + j sin(ω t) )   (40)

Or, equivalently,

    a = e^(σ t) cos(ω t) ,    b = e^(σ t) sin(ω t)      (41)

Multiplying out Eqn (39) gives:

    x(t) = [   a er rrᵀ + j a er riᵀ + j b er rrᵀ - b er riᵀ
             + j a ei rrᵀ - a ei riᵀ - b ei rrᵀ - j b ei riᵀ
             + a er rrᵀ - j a er riᵀ - j b er rrᵀ - b er riᵀ
             - j a ei rrᵀ - a ei riᵀ - b ei rrᵀ + j b ei riᵀ ] x(0)

All the imaginary terms cancel (as they must!) and the result reduces to:

    x(t) = 2 ( a er rrᵀ - b er riᵀ - a ei riᵀ - b ei rrᵀ ) x(0)

Which is given by:

    x(t) = 2 [ er  ei ] [  a  -b ] [ rrᵀ ] x(0)         (42)
                        [ -b  -a ] [ riᵀ ]

Considering the numerical example above:

>> alpha = real( U(1,1) )
alpha = -0.7500
>> omega = imag( U(1,1) )
omega = 5.4072
>> a = exp(alpha*t) * cos(omega*t)
a = -0.6343
>> b = exp(alpha*t) * sin(omega*t)
b = -0.0654
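The scalar pieces of Eqns (40)-(41), and the connection to the real 2x2
modal block derived in the next section, can be verified numerically for
the first pendulum mode (σ = -0.75, ω = 5.4072, t = 0.6). A sketch:

```python
import numpy as np
from scipy.linalg import expm

sigma, omega, t = -0.75, 5.4072, 0.6

# Eqn (41): a and b
a = np.exp(sigma*t) * np.cos(omega*t)     # -> -0.6343
b = np.exp(sigma*t) * np.sin(omega*t)     # -> -0.0654

# The real 2x2 modal block [[sigma, omega], [-omega, sigma]] has the
# matrix exponential e^(sigma t) [[cos wt, sin wt], [-sin wt, cos wt]],
# i.e. exactly [[a, b], [-b, a]]
Am_block = np.array([[sigma, omega],
                     [-omega, sigma]])
Phi = expm(Am_block * t)
print(np.round([a, b], 4))
print(np.round(Phi, 4))
```

So the decaying rotation that a complex pole pair produces is captured
entirely by the two real numbers a and b.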

2.3 Deriving real basis vectors for a complex mode

Going back to the definition of modal coordinates, starting with

    x'(t) = Ap x(t) + Bp u(t)

convert to modal coordinates

    ξ'(t) = Am ξ(t) + Bm u(t)

with

    Am = M⁻¹ Ap M ,    Bm = M⁻¹ Bp                      (43)

When M has a complex conjugate pair of eigenvectors

    M = [ ... v  v* ... ]    with    v = er + j ei

we can introduce a modification to the modal matrix by

    M̄ = M N ,    M̄⁻¹ = N⁻¹ M⁻¹                          (44)

Choose N to be block diagonal, with a 2 × 2 block for the complex pair. The
2 × 2 element in N that selects the real and imaginary parts of v is:

    N₂ = (1/2) [ 1  -j ]         N₂⁻¹ = [ 1   1 ]
               [ 1   j ] ,              [ j  -j ]

Then

    M̄ = [ ... er  ei ... ]                              (45)

Now considering Am: the complex pair of eigenvectors will have complex
conjugate eigenvalues λ = σ + j ω and λ* = σ - j ω. Then

    Ām = N⁻¹ M⁻¹ Ap M N = N⁻¹ Am N                      (46)

and the corresponding 2 × 2 block of Ām is

    N₂⁻¹ [ λ   0  ] N₂ = (1/2) [   λ + λ*     -j (λ - λ*) ]
         [ 0   λ* ]            [ j (λ - λ*)     λ + λ*    ]    (47)

                   = [  σ  ω ]
                     [ -ω  σ ]                          (48)

where λ = σ + j ω.

2.4 Example with 2nd order modes, converting basis vectors to real

The double pendulum has two second order modes. Using the example from
section 2.1, choose

    N = (1/2) [ 1  -j  0   0
                1   j  0   0
                0   0  1  -j
                0   0  1   j ]

then

    M̄ = M N = [ -0.70   0.00   0.68   0.00
                +0.02  +0.13  -0.05  -0.21
                +0.70   0.00   0.68   0.00
                -0.02  -0.13  -0.05  -0.21 ]

Note, if it is convenient, we can scale the columns of M̄ as desired. For
example, scaling the largest element of each column to 1.0 gives

    M̄' = [ -1.00   0   1.00   0
             0.03   1  -0.08   1
             1.00   0   1.00   0
            -0.03  -1  -0.08   1 ]

Finding the system matrix in modal coordinates,

    Ām = N⁻¹ M⁻¹ Ap M N = N⁻¹ Am N = [ -0.75   5.41     0      0
                                        -5.41  -0.75     0      0
                                           0      0   -0.75   3.04
                                           0      0   -3.04  -0.75 ]

The first mode oscillates in a vector subspace of state space, given by

    S₁ = { x : x = ξ₁ e₁ + ξ₂ e₂ }

where e₁ and e₂ are the first two columns of M̄.

The second mode oscillates in a vector subspace of state space, given by

    S₂ = { x : x = ξ₃ e₃ + ξ₄ e₄ }

where e₃ and e₄ are the third and fourth columns of M̄.
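The real modal form Ām can be computed directly by stacking the real and
imaginary parts of one eigenvector from each conjugate pair as columns of
M̄. A sketch in Python/NumPy (eigenvector ordering from eig() is not
guaranteed, so the pairs are selected by positive imaginary part, fast mode
first):

```python
import numpy as np

Ap = np.array([[-1.5, -19.8,  0.0,  10.0],
               [ 1.0,   0.0,  0.0,   0.0],
               [ 0.0,  10.0, -1.5, -19.8],
               [ 0.0,   0.0,  1.0,   0.0]])

lam, V = np.linalg.eig(Ap)
# one eigenvector per conjugate pair, ordered fast mode first
idx = sorted([i for i in range(4) if lam[i].imag > 0], key=lambda i: -lam[i].imag)

# Mbar = [Re v1, Im v1, Re v3, Im v3]
Mbar = np.column_stack([f(V[:, i]) for i in idx for f in (np.real, np.imag)])

Am_bar = np.linalg.inv(Mbar) @ Ap @ Mbar
print(np.round(Am_bar, 2))   # block diagonal, blocks [[sigma, omega], [-omega, sigma]]
```

The block form [[σ, ω], [-ω, σ]] is independent of how eig() happens to
scale each complex eigenvector, which is why this construction is robust.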

For example, the initial condition of figure 3 in physical and modal
coordinates is:

    x(0) = [ 0; 1; 0; -1 ] ,    ξ(0) = M̄⁻¹ x(0)         (49)

and the initial condition of Eqn (49) excites only the first (faster) mode.

The initial condition of figure 4 is

    x(0) = [ 0; 1; 0; 1 ] ,    ξ(0) = M̄⁻¹ x(0)          (50)

and the initial condition of Eqn (50) excites only the second (slower)
mode.

3 x'(t) = Ap x(t) defines Ap-invariant spaces

Considering the unforced response of

    x'(t) = Ap x(t) + Bp u(t)

(that is, u(t) = 0), a real λᵢ and eᵢ define a 1-D Ap-invariant subspace.
That is, if

    x(t = 0) ∈ { x : x = aᵢ eᵢ }    then    x(t) ∈ { x : x = aᵢ eᵢ }   for all t > 0   (51)

In the same way, complex eigenvalues define a 2-D Ap-invariant subspace,
with

    er = real(vᵢ)    and    ei = imag(vᵢ)               (52)

If

    x(0) ∈ { x : x = a₁ er + a₂ ei }

then

    x(t) ∈ { x : x = a₁ er + a₂ ei }   for all t > 0    (53)

Eqn (53) implies that rr, ri ∈ { x : x = a₁ er + a₂ ei }, which is verified
numerically for the example above.
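The invariance claim can be illustrated numerically: starting on an
eigenvector of the 2-state example from section 1.1.2, the trajectory
x(t) = e^(Ap t) x(0) never leaves the eigenvector's span. A sketch:

```python
import numpy as np
from scipy.linalg import expm

Ap = np.array([[-0.164, -0.059],
               [-0.059, -0.164]])
e1 = np.array([1.0, 1.0])        # eigenvector for lambda = -0.223

for t in [0.0, 1.0, 5.0, 20.0]:
    x = expm(Ap * t) @ e1
    # x stays a scalar multiple of e1: its two components remain equal
    print(t, x, x[0] - x[1])
```

At every time the state is e^(-0.223 t) e1, shrinking along the eigenvector
but never rotating off of it.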

4 Conclusions

- The response of linear dynamic systems can be decomposed into the
  responses of individual modes.

- The modes are decoupled: if the input or initial condition excites only
  one mode, only that mode responds.

- Modes are either first order, second order, or correspond to a Jordan
  block.

- The basis vectors of the modes (vectors eᵢ in Eqn (3) et seq.) are the
  bases of Ap-invariant subspaces in state-space, defined by the modes.

- The modal matrix M is used to transform between modal and physical
  coordinates.

- The forcing function, u(t), can also be decomposed into forcing functions
  for each of the modes.

EE/ME 701: Advanced Linear Systems

EE/ME 701: Advanced Linear Systems

Phase Portraits and Lyapunov Stability

Section 1.0.0

Contents

1 Introduction to Stability Theory
2 The phase plane (or phase portrait)
   2.1 Phase space in higher dimensions
   2.2 Local stability, definitions
   2.3 Local stability, additional terminology . . . . . 13
   2.4 Determining local stability . . . . . 14
3 Limit Cycles . . . . . 15
4 Lyapunov stability . . . . . 17
   4.1 Generalization of Lyapunov's energy function . . . . . 20
   4.2 Quadratic Forms . . . . . 21
   4.3 Lyapunov stability of a linear system . . . . . 22
5 Summary . . . . . 23

1 Introduction to Stability Theory

We will consider the stability of:

- Linear time-invariant systems: easy, given by the eigenvalues
- Linear time-varying systems: harder, some unexpected results
- Nonlinear systems: in general quite hard

Stability of linear time-invariant systems is relatively simple:

- Examine the poles (eigenvalues of the system matrix)
- LTI systems can show exponential decay, marginal stability, or exponential growth

Linear time-varying systems:

- We can't look at the succession of instantaneous systems
- Even if the eigenvalues are always in the left half-plane, the system can be unstable
- In discrete time, a sequence of stable systems can be unstable
- More powerful tools are needed

For nonlinear systems the picture is even more complex:

- Local stability: properties for small inputs may not extend to large inputs
- Stability may be input dependent
- Stability may be initial-condition dependent

Part 7: Phase Portraits and Stability

(Revised: Sep 10, 2012)

Page 1



For time-varying and nonlinear systems, there are various flavors of stability:

- Stable (or delta-epsilon stable): if the system starts within δ of the equilibrium point, it never goes outside of distance ε from the equilibrium point.
- Uniform stability: stability is not dependent on the time (for time-varying systems).
- Asymptotic stability: as t → ∞, the state draws ever closer to the equilibrium point.
- Exponential stability: as t → ∞, the state draws ever closer to the equilibrium point at least as fast as some exponential curve.

2 The phase plane (or phase portrait)

A tool is needed to understand stability properties in general cases. For general systems given by

    ẋ(t) = f(x, u, t)
    y(t) = g(x, u, t)        (1)

we need a tool for looking at the behavior of the system. One general and powerful tool is the phase plane.

Example: consider the undamped system

    d²y/dt² + ω² y = 0

With state vector x(t) = [y, ẏ]ᵀ, the state equations are:

    ẋ(t) = [ 0    1 ;  −ω²   0 ] x(t)

A time-plot shows the oscillating state signals:

[Figure: Undamped oscillator response; time plot of the state signals, 0 to 25 sec.]

An nth order system will have n curves in the time plot.
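The time response above can be reproduced numerically. A minimal sketch (assuming ω = 1 rad/s and initial condition y(0) = 1, ẏ(0) = 0; the step size and horizon are arbitrary choices) integrates ẋ = [x2, −ω²x1] with a fixed-step RK4 and checks that the trajectory returns to its starting point after one period, so the phase-plane curve is closed:

```python
import numpy as np

OMEGA = 1.0  # natural frequency [rad/s] (assumed)

def f(x):
    # undamped oscillator: x = [y, ydot], ydot' = -omega^2 * y
    return np.array([x[1], -OMEGA**2 * x[0]])

def rk4_step(x, h):
    k1 = f(x)
    k2 = f(x + h/2 * k1)
    k3 = f(x + h/2 * k2)
    k4 = f(x + h * k3)
    return x + h/6 * (k1 + 2*k2 + 2*k3 + k4)

h = 2*np.pi / 10000              # 10000 steps per period
x = np.array([1.0, 0.0])
E0 = 0.5*x[1]**2 + 0.5*OMEGA**2*x[0]**2
for _ in range(10000):           # integrate one full period
    x = rk4_step(x, h)
E1 = 0.5*x[1]**2 + 0.5*OMEGA**2*x[0]**2
```

The conserved energy E is what makes the trajectory a closed curve (a center).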



Phase plane (continued)

- To draw the phase portrait, take away time, and make the axes of the plot the states (a 2-D plot for a 2nd order system, a 3-D plot for a 3rd order system, etc.)
- A path is called a phase trajectory, or simply a trajectory.
- We can consider any number of trajectories, to create a phase portrait.

[Figure 1: Trajectory on the phase plane (undamped oscillator response; velocity [m/s] vs. position [m]).]

[Figure 3: Phase portrait, several trajectories.]

Consider the damped oscillator.

[Figure 2: Damped oscillator; time plot of the state signals and the corresponding phase-plane trajectory.]

- Each point on a phase plane corresponds to one specific value of the state, x.
- An important property of phase portraits comes from the very definition of state: the state is all the information needed to determine the future trajectory of the system (given the inputs).
- Corollary: the trajectory departing from a point xa is a function only of xa, with no dependence on how the trajectory arrived at state xa.



Phase plane (continued)

- At each point in phase-space (at each possible state) there is a direction of departure.
- We can plot the phase arrows along trajectories, or at any point, such as grid points.

[Figure 4: Phase portrait, plotting ẋ vectors at x points: (a) along trajectories; (b) at grid points.]

- To plot 4(b), we need only be able to compute ẋ(x); it is not necessary to solve the differential equation. A handy feature for nonlinear systems.
- In 2-D we have a powerful theorem: phase trajectories can not cross.

2.1 Phase space in higher dimensions

The concept of a trajectory through phase space extends naturally to higher-order systems. There are two challenges:

1. We can't easily plot 3rd and higher order phase portraits.
2. The theorem that phase trajectories can not cross is less useful.

2.2 Local stability, definitions

An equilibrium point is a steady-state operating point of a system; it is a point xe where:

    ẋ = f(xe, u, t) = 0        (2)

- For homogeneous linear systems the origin is always an equilibrium point:

    ẋ = A 0 = 0

- If A has a null space, there will be infinitely many equilibrium points (a null space in A corresponds to a pole at the origin; consider the dynamics of a car: there are infinitely many places your car can stop).
- If there is an input, the equilibrium will be shifted:

    ẋ(t) = A x(t) + B u ,    xe : A xe + B u = 0
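The grid-point plot in Figure 4(b) needs only evaluations of ẋ(x), not solutions of the differential equation. A minimal sketch (assuming the undamped oscillator ẋ1 = x2, ẋ2 = −x1, i.e. ω = 1) tabulates the vector field on a grid, which is exactly the data a quiver-style plot consumes; for this center, every arrow is tangent to a circle about the origin:

```python
import numpy as np

def xdot(x1, x2):
    # undamped oscillator with omega = 1: no ODE solution required
    return x2, -x1

grid = np.linspace(-1.0, 1.0, 5)
X1, X2 = np.meshgrid(grid, grid)
U, V = xdot(X1, X2)            # arrow components at each grid point

# arrows are perpendicular to the radius vector: x . xdot = 0 everywhere
radial = X1*U + X2*V
```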


Equilibrium points come in four flavors:

1. Stable node
2. Unstable node
3. Saddle point
4. Center

- Stable node: the phase trajectories travel into xe.
- Unstable node: the phase trajectories travel away from xe.

[Figure 5: Stable and unstable equilibrium points, 1st order modes.]

[Figure 6: Stable and unstable equilibrium points, 2nd order modes.]


- Saddle point: trajectories move in on some paths and out on others.

[Figure 7: Saddle points; the system has both stable and unstable modes.]

- Center: marginal stability (the first example).

[Figure 8: Phase portrait for marginal stability.]


2.3 Local stability, additional terminology

As you would expect, a variety of terminology is applied to the characteristics of differential equations. In the controls literature, one will sometimes see an equilibrium point referred to as:

- A fixed point
- An attractor (if the system is stable)

If the fixed point is called an attractor, we can call the region (of state space) from which trajectories converge to the attractor the basin of attraction.

Figure 7.2 of Bay illustrates multiple attractors and basins of attraction of a pendulum. Notice the stable and unstable equilibria.

[Figure 9: Multiple attractors and basins of attraction (Bay figure 7.2).]

2.4 Determining local stability

Consider again the most general model,

    ẋ = f(x, u, t)        (3)

With x = xe + δx, ẋ(x) is given by the Taylor series expansion:

    ẋ(x) = ẋ(xe) + [∂f(x, u, t)/∂x]|x=xe δx + O(δx²) = 0 + J δx        (4)

The derivative of a vector is a matrix of the individual scalar derivatives. With f(x, u, t) = [f1(), f2(), ..., fn()]ᵀ, then:

    J = ∂f(x, u, t)/∂x |x=xe =

        [ ∂f1/∂x1   ∂f1/∂x2   ...   ∂f1/∂xn ]
        [ ∂f2/∂x1   ∂f2/∂x2   ...   ∂f2/∂xn ]
        [    ...       ...     ...     ...   ]
        [ ∂fn/∂x1   ∂fn/∂x2   ...   ∂fn/∂xn ]    evaluated at x = xe

Near to xe, the first-order term will dominate, and the stability properties (stable vs. unstable vs. saddle point vs. center) are determined by the eigenvalues of ∂f(x, u, t)/∂x, which is sometimes called the Jacobian matrix.

Local stability is usually computable, because we can usually form the Jacobian matrix, even for non-linear systems.

Local stability is a limited result. For example, even for a stable fixed point, it may be difficult to determine the size of the basin of attraction.
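The Jacobian test of Eqn (4) is easy to carry out numerically. A sketch, using a damped pendulum ẋ1 = x2, ẋ2 = −sin x1 − 0.2 x2 as a hypothetical example (not one of the systems in these notes), forms J by central differences at the two equilibria and examines the eigenvalues:

```python
import numpy as np

def f(x):
    # damped pendulum (hypothetical example): x1 = angle, x2 = rate
    return np.array([x[1], -np.sin(x[0]) - 0.2*x[1]])

def jacobian(fun, xe, h=1e-6):
    # central-difference Jacobian J[i, j] = d f_i / d x_j at xe
    n = len(xe)
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (fun(xe + e) - fun(xe - e)) / (2*h)
    return J

J_down = jacobian(f, np.array([0.0, 0.0]))     # hanging equilibrium
J_up   = jacobian(f, np.array([np.pi, 0.0]))   # inverted equilibrium
lam_down = np.linalg.eigvals(J_down)   # both in the left half-plane: stable
lam_up   = np.linalg.eigvals(J_up)     # one positive real eigenvalue: saddle
```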

3 Limit Cycles

Nonlinear systems can have trajectories which are closed curves: limit cycles.

Contrasting marginal stability (figure 8): a phase portrait for marginal stability also has closed curves, but these are not isolated, and are not the limit of a trajectory.

Establishing the existence of a limit cycle has no general solution. There is a limited result for the 2-D case:

- Poincaré's Index: for the region enclosed by a closed curve, write n = N − S, where N is the number of centers, foci and nodes, and S is the number of enclosed saddle points.
- For the closed curve to be a limit cycle, it is necessary but not sufficient that n = 1.

Summary on Stability

- The local stability of a (sufficiently smooth) nonlinear system can be determined by examining the stability of the equilibrium points.
- Nonlinear systems can have a new kind of behavior: a limit cycle.
- Needed: a global stability result.


4 Lyapunov stability

Goal: address stability without solving the differential equation.

Lyapunov method:

- Introduce an energy function (or, in general, a Lyapunov function)
- Show that the energy decays everywhere, except at the equilibrium point, where it is zero.

History:

- First published in Russian in 1892
- Translated into French in 1907
- First use in controls in 1944

Consider the second order system with nonlinear stiffness k(y) and nonlinear damping b(ẏ). Write the state space model:

    x = [x1; x2] = [y; ẏ]

    ẋ = [ x2 ; −k(x1) − b(x2) ]

Assuming unit mass, the sum of potential and kinetic energy is given by:

    E(x1, x2) = x2²/2 + ∫ k(x1) dx1

Looking at the rate of change of energy in the system:

    dE/dt = x2 ẋ2 + k(x1) ẋ1 = x2 [ẋ2 + k(x1)]

But from the dynamics, ẋ2 = −k(x1) − b(x2), and so dE/dt becomes:

    dE/dt = x2 [−k(x1) − b(x2) + k(x1)] = −x2 b(x2)
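The identity dE/dt = −x2 b(x2) can be checked in simulation. A sketch with hypothetical choices k(y) = y + y³ (stiffening spring) and b(ẏ) = 0.5 ẏ (linear damping), neither taken from the notes, integrates the system and confirms the energy never increases:

```python
import numpy as np

def k(y):  return y + y**3       # hypothetical nonlinear stiffness
def b(v):  return 0.5 * v        # hypothetical linear damping

def f(x):
    # unit mass: x1' = x2, x2' = -k(x1) - b(x2)
    return np.array([x[1], -k(x[0]) - b(x[1])])

def rk4_step(x, h):
    k1 = f(x); k2 = f(x + h/2*k1); k3 = f(x + h/2*k2); k4 = f(x + h*k3)
    return x + h/6*(k1 + 2*k2 + 2*k3 + k4)

def E(x):
    # kinetic + potential; the integral of k(y) is y^2/2 + y^4/4
    return 0.5*x[1]**2 + 0.5*x[0]**2 + 0.25*x[0]**4

x = np.array([1.0, 0.0])
energies = [E(x)]
for _ in range(5000):            # t = 0 .. 50 s, h = 0.01
    x = rk4_step(x, 0.01)
    energies.append(E(x))
```

Here dE/dt = −x2 b(x2) = −0.5 x2² ≤ 0, so the recorded energies form a non-increasing sequence decaying toward zero.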

Conclusions from the example:

- If b(·) = 0 and the system has only a spring, dE/dt = 0 and the total energy in the system remains constant.
- If x2 b(x2) > 0 for all x2 ≠ 0, the system is dissipative, and as long as E > 0, dE/dt < 0.

Observations:

- For any initial condition giving energy E0, the system never goes to a state with E(x) > E0.
- When the damping is dissipative, energy steadily decays, and the system converges to E = 0.

4.1 Generalization of Lyapunov's energy function

Introduce a scalar function of the system state, V(x), which is positive-definite:

    V(x) > 0 ,    x ≠ 0        (5)

Evaluate the Lyapunov derivative:

    dV/dt = (∂V/∂x)(dx/dt)        (6)

and show:

    dV/dt < 0 ,    x ≠ 0        (7)

which establishes stability in the sense of Lyapunov:

    V(t) ≤ V(t0)    for all    t > t0        (8)

Challenge: how do we find V(x)?

- Energy function
- Clever choice

Typical choice:

    V(x) = xᵀ P x ,    P : positive definite matrix        (9)

Of course, failure to find a suitable V(x) does not prove instability.



4.2 Quadratic Forms

Physical energy is often quadratic in the state variables. Consider Vc = ½ C V².

Because we want a positive definite function V, a quadratic form is often chosen for a Lyapunov function. General quadratic form:

    V(x) = xᵀ P x        (10)

where P is a matrix.

- It can be shown (see chapter) that P may be restricted to symmetric matrices without loss of generality. (Student exercise: what is the advantage of a symmetric matrix, in terms of the eigensystem of P?)
- P is positive-definite if

    xᵀ P x > 0    for all    x ≠ 0

- Requirement for a PD matrix: all eigenvalues > 0.
- P may be written P = Rᵀ R. (Student exercise: prove that any P written P = Rᵀ R is symmetric and positive (semi)definite, for arbitrary choice of R.)
- P is positive-semidefinite if all eigenvalues ≥ 0, and some eigenvalues may be 0.

4.3 Lyapunov stability of a linear system

Given

    ẋ(t) = A x(t)        (11)

introduce the Lyapunov function

    V(x) = xᵀ P x        (12)

where P is a symmetric positive definite matrix. Then V̇(x) is given by

    V̇(x) = ẋᵀ P x + xᵀ P ẋ = xᵀ Aᵀ P x + xᵀ P A x = −xᵀ Q x        (13)

where

    Aᵀ P + P A = −Q        (14)

- Q positive definite proves the stability of system (11).
- Eqn (14) is called the matrix Lyapunov equation.
- For a stable linear system, Eqn (14) has a positive definite solution P for every positive definite Q.

Equations (11)-(14) show the concept applied to a linear system, and provide a starting point for studying nonlinear systems.
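Eqn (14) can be solved and verified numerically. A minimal sketch (numpy-only; scipy.linalg.solve_continuous_lyapunov offers the same computation as a library call) vectorizes AᵀP + PA = −Q with Kronecker products, solves for P, and confirms P is symmetric positive definite for a stable A:

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])     # stable: eigenvalues -1 and -2
Q = np.eye(2)                    # any symmetric positive definite Q

# vec(A^T P + P A) = (I kron A^T + A^T kron I) vec(P), column-major vec
n = A.shape[0]
M = np.kron(np.eye(n), A.T) + np.kron(A.T, np.eye(n))
vecP = np.linalg.solve(M, -Q.flatten(order='F'))
P = vecP.reshape((n, n), order='F')

residual = A.T @ P + P @ A + Q   # should be ~0
eigs = np.linalg.eigvalsh(P)     # all > 0: P is positive definite
```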

5 Summary

We've seen that stability is more complex for nonlinear systems:

- It is a property of the system, and also of the initial conditions and inputs (think of the stability of a plane in flight).
- A system at equilibrium and then perturbed can:
  - Return to equilibrium
  - Diverge
  - Go to a different equilibrium point
  - Go into a limit cycle
- The Lyapunov stability method gives one approach for non-linear systems.


EE/ME 701: Advanced Linear Systems

Controllability and Observability of Linear Systems

Contents

1 Introduction
   1.1 Definitions
   1.2 Main facts about uncontrollable or unobservable modes
2 Tests for Controllability and Observability
   2.1 Controllability, discrete time system
   2.2 Controllability, continuous time system
      2.2.1 Polynomial expressions on matrices, and the Cayley-Hamilton Thrm
      2.2.2 Controllability for continuous-time systems . . . . . 10
   2.3 Controllability example . . . . . 12
   2.4 Observability, discrete time system . . . . . 15
   2.5 Observability, continuous time system . . . . . 18
   2.6 Observability example . . . . . 18
   2.7 Popov-Belevitch-Hautus Tests . . . . . 19
      2.7.1 PBH Test of Controllability . . . . . 19
      2.7.2 PBH Test of Observability . . . . . 20
   2.8 Controllability and Observability in modal coordinates . . . . . 21
3 Controllability and observability example . . . . . 26
   3.1 Testing Controllability and Observability . . . . . 30
   3.2 A badly placed sensor . . . . . 31
      3.2.1 A badly placed actuator . . . . . 34
   3.3 Summary Controllability and Observability . . . . . 37
      3.3.1 Uncontrollable and unobservable modes and rocketry . . . . . 38
4 Additional topics in Bay . . . . . 39
   4.1 Alternative definitions of Controllability and Observability . . . . . 39

Part 8: Controllability and Observability

(Revised: Sep 10, 2012)

Page 1


1 Introduction

No discussion of State-Variable models would be complete without a discussion of observability and controllability (R. E. Kalman [1, 2]).

- Modes correspond to energy storage elements.
- When a pole and a zero cancel, the mode nonetheless remains in the system.

Example: a pole-zero cancellation (fig 1(A)).

- The mode disappears from the transfer function.
- The mode, and its energy storage element, remains in the system. Either:
  - the signal of the mode does not appear in the output (unobservable, fig 1(B)); or
  - the input signal can't reach the mode (uncontrollable, fig 1(C)).

[Figure 1: A pole-zero cancellation can physically occur in either the input or output path. (A) System with pole-zero cancellation: u(t) → (s+3)/((s+3)(s+4)) → y(t). (B) Unobservable system (state hidden from output). (C) Uncontrollable system (state untouched by input).]

1.1 Definitions

Controllable System: The input signal reaches each of the modes, so that the system can be driven from any state, x0, to the origin by suitable choice of input signal.

Observable System: The response of each mode reaches the output. An initial condition, x0, can be determined by observing the output and input of the system.

Figure 1(B) shows an unobservable system; figure 1(C) shows an uncontrollable system.


1.2 Main facts about uncontrollable or unobservable modes

- Even if a pole and zero cancel, the system still has that mode in its response.
- Challenge: if a pole and zero cancel, it means that feedback control can do nothing about the mode:
  - If the mode is unstable, feedback control cannot stabilize the system.
  - If the mode is slow or lightly damped, it will remain slow or lightly damped.
  - If the mode is fast and well damped, being uncontrollable or unobservable is often not a problem. (Exception: in a rocket, the stress introduced by oscillation of an uncontrollable or unobservable mode can over-stress the structure.)
- How do uncontrollable or unobservable modes get excited?
  - Disturbances (they don't go through zeros of the control input path),
  - Initial conditions,
  - Non-linearities, etc.

2 Tests for Controllability and Observability

2.1 Controllability, discrete time system

Consider example 3.10 in [Bay]. For the LTI discrete-time system:

    x(k+1) = A x(k) + B u(k)
    y(k) = C x(k) + D u(k)        (1)

Controllability, example 3.10: Given an arbitrary initial condition x(0), under what conditions will it be possible to find an input sequence, u(0), u(1), ..., u(l), which will drive the state vector to zero, x(l) = 0?

Solution: consider the state for a few samples of the input.

At time k = 0:

    x(1) = A x(0) + B u(0)        (2)

At time k = 1:

    x(2) = A x(1) + B u(1)
         = A [A x(0) + B u(0)] + B u(1)
         = A² x(0) + A B u(0) + B u(1)        (3)


At time k = 2:

    x(3) = A x(2) + B u(2)
         = A [A² x(0) + A B u(0) + B u(1)] + B u(2)
         = A³ x(0) + A² B u(0) + A B u(1) + B u(2)        (4)

The ascending powers of A arise because x(k+1) depends on x(k) (recursion). For 0 = x(l), rearranging Eqn (4) gives:

    x(l) − A^l x(0) = −A^l x(0) = B u(l−1) + A B u(l−2) + ... + A^(l−1) B u(0)        (5)

The right hand side of (5) can be put in matrix-vector form, to give:

    −A^l x(0) = [B | A B | ... | A^(l−2) B | A^(l−1) B] [u(l−1); u(l−2); ... ; u(1); u(0)]        (6)

Define the controllability matrix to be:

    P ≜ [B | A B | ... | A^(l−2) B | A^(l−1) B]

and also

    u = [u(l−1); u(l−2); ... ; u(1); u(0)]        (7)

then

    −A^l x(0) = P u        (8)

From Eqn (8), it is clear that if −A^l x(0) lies in the column space of P, then a solution u exists, which is the control sequence that drives the initial state to the origin. Thus:

- If the rank of P is n, then a control sequence u is guaranteed to exist, and the system is controllable.
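The rank condition on P is immediate to check numerically. A minimal sketch builds P = [B | AB | ... | A^(n−1)B] for two 2-state discrete-time examples (hypothetical matrices, chosen for illustration): one controllable, one not:

```python
import numpy as np

def ctrb(A, B):
    # controllability matrix with n block columns (by Cayley-Hamilton,
    # higher powers of A add no new column space)
    n = A.shape[0]
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

A1 = np.array([[0.0, 1.0],
               [-0.5, 1.0]])
B1 = np.array([[0.0],
               [1.0]])
rank1 = np.linalg.matrix_rank(ctrb(A1, B1))   # 2: controllable

A2 = np.diag([0.5, 0.2])     # decoupled modes
B2 = np.array([[1.0],
               [0.0]])       # input never reaches the second mode
rank2 = np.linalg.matrix_rank(ctrb(A2, B2))   # 1: uncontrollable
```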


2.2 Controllability, continuous time system

Controllability for a continuous time system follows from a similar analysis, except that integrals and the Cayley-Hamilton theorem are required.

2.2.1 Polynomial expressions on matrices, and the Cayley-Hamilton Thrm

A polynomial expression p(λ) has the form (following Bay, chap 5):

    p(λ) = λⁿ + a(n−1) λ^(n−1) + ... + a1 λ + a0        (9)

A matrix polynomial expression has a similar form:

    p(A) = Aⁿ + a(n−1) A^(n−1) + ... + a1 A + a0 I        (10)

Cayley-Hamilton Theorem: a matrix satisfies its own characteristic equation.

THEOREM (Cayley-Hamilton theorem): If the characteristic polynomial of an arbitrary n×n matrix A is denoted φ(λ), computed as φ(λ) = det(A − λI), then matrix A satisfies its own characteristic equation, denoted by φ(A) = 0.

    φ(A) = Aⁿ + a(n−1) A^(n−1) + ... + a1 A + a0 I = 0

so

    Aⁿ = −a(n−1) A^(n−1) − ... − a1 A − a0 I        (11)

where the ai are the coefficients of the characteristic equation.

In any polynomial expression in A, we can always replace Aⁿ and higher powers with the right hand side of Eqn (11). As a result, all matrix polynomial expressions are equivalent to an expression of order n−1 or less.

2.2.2 Controllability for continuous-time systems

THEOREM (Controllability): An n-dimensional continuous-time LTI system is controllable if and only if the matrix

    P ≜ [B | A B | ... | A^(n−2) B | A^(n−1) B]        (12)

has rank n.

PROOF: Recall that the solution of the LTI equations is:

    x(t) = e^(A(t−t0)) x(t0) + ∫[t0,t] e^(A(t−τ)) B u(τ) dτ        (13)

We want to establish whether a control signal u(t) exists such that x(t1) = 0, so consider:

    ∫[t0,t1] e^(A(t1−τ)) B u(τ) dτ = −e^(A(t1−t0)) x(t0) ≜ β1 ∈ Rⁿ        (14)

(We don't need to solve for β1 for the proof, only to know it exists.)

The term e^(A(t1−τ)) has a polynomial expansion, so by the Cayley-Hamilton theorem it can be expressed in terms of a low-order polynomial:

    e^(A(t1−τ)) = Σ[i=1..n] γi(τ) A^(n−i)        (15)

(We don't need to solve for the γi(τ) for the proof, only to know they exist.)
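The Cayley-Hamilton theorem is easy to verify numerically: evaluate φ(A) by Horner's rule using the characteristic-polynomial coefficients from np.poly (a seeded random 3×3 matrix serves as the arbitrary A):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))   # arbitrary square matrix

coeffs = np.poly(A)               # [1, a2, a1, a0] of det(sI - A)

# Horner evaluation of phi(A) = A^3 + a2 A^2 + a1 A + a0 I
phi = np.zeros_like(A)
for c in coeffs:
    phi = phi @ A + c * np.eye(3)
```

phi is zero to machine precision: A satisfies its own characteristic equation.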

From Eqns (14) and (15):

    β1 = ∫[t0,t1] [Σ[i=1..n] γi(τ) A^(n−i)] B u(τ) dτ
       = ∫[t0,t1] [Σ[i=1..n] A^(n−i) B γi(τ) u(τ)] dτ
       = ∫[t0,t1] [A^(n−1) B γ1(τ) u(τ) + ... + A B γ(n−1)(τ) u(τ) + B γn(τ) u(τ)] dτ
       = A^(n−1) B ∫[t0,t1] γ1(τ) u(τ) dτ + ... + A B ∫[t0,t1] γ(n−1)(τ) u(τ) dτ + B ∫[t0,t1] γn(τ) u(τ) dτ        (16)

Define:

    ξi ≜ ∫[t0,t1] γi(τ) u(τ) dτ        (17)

As before, we don't need to be able to compute the ξi, only to know they exist. Eqn (17) with Eqn (16) gives:

    β1 = [B | A B | ... | A^(n−2) B | A^(n−1) B] [ξn; ... ; ξ1] = P ξ        (18)

From Eqn (14), β1 is a vector depending on x(t0); from Eqns (17) and (18), if P is full rank, then there exists a control input u(t), t ∈ [t0, t1], to give some ξ such that x(t1) = 0. Thus, the system is guaranteed to be controllable.

2.3 Controllability example

Consider the electrical circuit given in figure 2:

[Figure 2: RLC circuit example; source vs(t) through resistor R, inductor current iL(t), capacitor voltage vc(t), output vx(t).]

The state equations are given as:

    d/dt [vC(t); iL(t)] = [−2/(RC)   1/C ;  −1/L   0] [vC(t); iL(t)] + [1/(RC); 1/L] vs(t)

    vx(t) = [−1   0] [vC(t); iL(t)] + [1] vs(t)        (19)

The controllability matrix is given as:

    P = [B | A B] = [1/(RC)   −2/(R²C²) + 1/(LC) ;  1/L   −1/(RLC)]        (20)

When P becomes rank-deficient, the determinant will be zero. So we can compute the determinant to test for values of R which make the system uncontrollable.

Note: The controllability matrix doesn't have to be square (we'll see this example with an extra input below). A determinant calculation can only be used when P is square.

    |P| = (1/(RC))(−1/(RLC)) − (1/L)(−2/(R²C²) + 1/(LC)) = 1/(R²LC²) − 1/(L²C)        (21)

So rank(P) < 2 if

    R = sqrt(L/C)        (22)

What it means for the system to be uncontrollable: for arbitrary t0, t1 and x(t0), we can not find a control input vs(t), t ∈ [t0, t1], to give x(t1) = 0; in other words, to drive the state to 0.

What does this strange situation mean physically? It means that the transfer function has a pole-zero cancellation, and that the zero is effectively in the input path:

    Vx(s)/Vs(s) = s (s + 1/(RC)) / (s² + (2/(RC)) s + 1/(LC))        (23)

If R = sqrt(L/C), then L = R²C and the denominator factors into:

    Vx(s)/Vs(s) = s (s + 1/(RC)) / ((s + 1/(RC)) (s + 1/(RC))) = s / (s + 1/(RC))

The second mode is not gone from the system; it is just unreachable from the input.

Consider adding another control input (one way to deal with an uncontrollable system):

[Figure 3: RLC circuit example with a second input, a current source Ib(t).]

The state model becomes:

    d/dt [vC(t); iL(t)] = [−2/(RC)   1/C ;  −1/L   0] [vC(t); iL(t)] + [1/(RC)   1/C ;  1/L   0] [vs(t); Ib(t)]        (24)

    vx(t) = [−1   0] [vC(t); iL(t)] + [1] vs(t)

The controllability matrix is:

    P = [B | A B] = [1/(RC)   1/C  |  −2/(R²C²) + 1/(LC)   −2/(RC²) ;  1/L   0  |  −1/(RLC)   −1/(LC)]        (25)

- Notice that P changes shape. It still has n rows, but now has 2n columns.
- A 2×4 matrix has to be quite special to have rank less than 2.
- In the present case, P is rank 2 as long as the two rows are independent. The zero in the (2,2) element assures the rows will be independent.
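The rank-deficiency condition (22) can be confirmed numerically for the circuit of Eqns (19)-(20); with L = 1 [H] and C = 1 [F], the critical resistance is sqrt(L/C) = 1 [Ω]:

```python
import numpy as np

def ctrb_P(R, L=1.0, C=1.0):
    # P = [B | A B] for the single-input RLC circuit of Eqn (19)
    A = np.array([[-2.0/(R*C), 1.0/C],
                  [-1.0/L,     0.0]])
    B = np.array([[1.0/(R*C)],
                  [1.0/L]])
    return np.hstack([B, A @ B])

rank_generic  = np.linalg.matrix_rank(ctrb_P(2.0))  # R != sqrt(L/C): rank 2
rank_critical = np.linalg.matrix_rank(ctrb_P(1.0))  # R == sqrt(L/C): rank 1
```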

2.4 Observability, discrete time system

Consider example 3.11 in [Bay]. For the LTI discrete-time system:

    x(k+1) = A x(k) + B u(k)
    y(k) = C x(k) + D u(k)        (26)

Observability, example 3.11: Given an arbitrary initial condition x(0), when l samples of the inputs and outputs, u(k) and y(k), k ∈ {0..l}, are known, under what conditions will it be possible to determine (observe) the initial condition x(0)?

Duality: Dual situations are those where, with a systematic set of changes or exchanges, one system turns out to have properties equivalent to another. For example, circuit duality:

    Voltage sources ↔ Current sources
    Capacitors ↔ Inductors
    Resistances ↔ Conductances

and the new dual circuit will behave identically to the original. In controls there are several dualities; an important one is the Observability / Controllability duality. As we will see, with an appropriate transpose and C substituted for B, the properties carry over.

Solution: consider the output for a few samples.

At time k = 0:

    y(0) = C x(0) + D u(0)        (27)

At time k = 1:

    y(1) = C x(1) + D u(1)
         = C [A x(0) + B u(0)] + D u(1)
         = C A x(0) + C B u(0) + D u(1)        (28)

At time k = 2:

    y(2) = C x(2) + D u(2)
         = C [A² x(0) + A B u(0) + B u(1)] + D u(2)
         = C A² x(0) + C A B u(0) + C B u(1) + D u(2)        (29)

Generalizing, Eqn (29) becomes:

    [y(0); y(1); y(2); ... ; y(l−1)] = [C; C A; C A²; ... ; C A^(l−1)] x(0) + (big term, based on u(k))        (30)

Observability (continued)

It doesn't matter what the big term is, as long as it is known. Write

    [y(0); y(1); y(2); ... ; y(l−1)] − (big term, based on u(k)) = Q x(0)        (31)

where the observability matrix is given as:

    Q ≜ [C; C A; C A²; ... ; C A^(l−1)]        (32)

From Eqn (31), it is clear that if x(0) lies in the row space of Q, then by knowing the left hand side it is possible to determine x(0). It is guaranteed that x(0) lies in the row space of Q if the rank of Q is n.

- If the rank of Q is n, then x(0) can be determined from known samples of the input and output, and the system is observable.

2.5 Observability, continuous time system

The proof that Q must be full rank for a continuous time system to be observable is omitted. It can be constructed by considering y(t) and its derivatives.

2.6 Observability example

Using the circuit example,

    Q = [C; C A] = [−1   0 ;  2/(RC)   −1/C]        (33)

which is independent of L and full rank for any values of R and C.
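The observability matrix of Eqn (33) can be formed the same way; a sketch with R = 2 [Ω] and C = 1 [F] (L drops out of Q, as noted above):

```python
import numpy as np

def obsv(A, C):
    # observability matrix Q = [C; C A; ...; C A^(n-1)]
    n = A.shape[0]
    rows = [C]
    for _ in range(n - 1):
        rows.append(rows[-1] @ A)
    return np.vstack(rows)

R, L, Cap = 2.0, 1.0, 1.0
A = np.array([[-2.0/(R*Cap), 1.0/Cap],
              [-1.0/L,       0.0]])
C = np.array([[-1.0, 0.0]])
Q = obsv(A, C)     # [[-1, 0], [2/(R Cap), -1/Cap]], per Eqn (33)
```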

2.7 Popov-Belevitch-Hautus Tests

These tests for controllability and observability do not involve testing any matrix for rank, but rather examine the left eigenvectors (eigenvectors) for orthogonality with B (C).

Left Eigenvectors: Left eigenvectors are row vectors w which have the property that

    w A = λ w        (34)

For example,

    [1   −1] [−3   1 ;  0   −2] = −3 [1   −1]        (35)

so w = [1  −1] is a left eigenvector. Left eigenvectors are row vectors, and they are the (right) eigenvectors of Aᵀ.

2.7.1 PBH Test of Controllability

LEMMA (PBH Test of Controllability): An LTI system is not controllable if and only if there exists a left eigenvector w of A such that

    w B = 0ᵀ

PROOF: The text establishes the PBH test by direct proof. We will see that it is a direct consequence of examining controllability in modal coordinates.

2.7.2 PBH Test of Observability

LEMMA (PBH Test of Observability): An LTI system is not observable if and only if there exists an eigenvector v of A such that

    C v = 0

PROOF: We will see that it is a direct consequence of examining observability in modal coordinates.
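The PBH test is direct to apply with the eigenvectors of Aᵀ. A sketch (a hypothetical example built from the matrix in Eqn (35), with B chosen deliberately orthogonal to the left eigenvector [1, −1]) flags the uncontrollable mode and cross-checks against the rank of the controllability matrix:

```python
import numpy as np

A = np.array([[-3.0, 1.0],
              [0.0, -2.0]])
B = np.array([[1.0],
              [1.0]])      # orthogonal to the left eigenvector [1, -1]

# left eigenvectors of A are the (right) eigenvectors of A^T
evals, W = np.linalg.eig(A.T)
wB = W.T @ B               # row i is (left eigenvector i) @ B

# exactly one entry of wB is zero: the lambda = -3 mode is uncontrollable
n_blocked = int(np.isclose(np.abs(wB).ravel(), 0.0, atol=1e-9).sum())

P = np.hstack([B, A @ B])  # rank 1 confirms the PBH verdict
```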

2.8 Controllability and Observability in modal coordinates

Recall that we can use the eigenvectors of A to transform the state variable
model onto a new basis (modal coordinates) in which the modes are
decoupled. The rotation to modal coordinates is given by:

    V = Evecs(A) ;   T = V^-1
    Ahat = T A T^-1
    Bhat = T B
    Chat = C T^-1

Controllability: For any decoupled mode (all modes in modal coordinates)
we must have an input into the mode, in order for the mode to be
controllable.

    The system is controllable if and only if all elements of the
    input matrix Bhat are non-zero in modal coordinates.

Observability: For any decoupled mode (all modes in modal coordinates)
we must have an output from the mode, in order for the mode to be
observable.

    The system is observable if and only if all elements of the
    output matrix Chat are non-zero in modal coordinates.

Controllability and Observability in modal coordinates Example:

Consider the original circuit (figure 3) with L = 1 [H], C = 1 [F] and
R = 2 [Ohm]. R != sqrt(L/C), so the system is controllable. The state model
is given by:

    A = [ -1  1       B = [ 0.5       C = [ -1  0 ]      D = [ 1 ]
          -1  0 ]           1.0 ]

>> A = [-2/(R*C), (1/C) ; -(1/L), 0]
>> B = [1/(R*C) ; (1/L)]
>> C = [-1 0];  D = 1;

The eigenvalues and eigenvectors of A are given as:

>> [V, U] = eig(A)
V =
   0.3536 - 0.6124i   0.3536 + 0.6124i
   0.7071 + 0.0000i   0.7071 + 0.0000i

Notice that the eigenvalues and eigenvectors are complex. This is fine. The
complex terms will come in complex-conjugate pairs, with the imaginary
components canceling in the output.
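The rotation to modal coordinates for this example can be reproduced in a few lines. This NumPy sketch mirrors the MATLAB steps (eigenvector scaling may differ from MATLAB's, which does not affect the zero/non-zero pattern of Bhat and Chat):

```python
import numpy as np

# State model of the circuit example (R = 2, L = 1, C = 1)
A = np.array([[-1.0, 1.0], [-1.0, 0.0]])
B = np.array([[0.5], [1.0]])
C = np.array([[-1.0, 0.0]])

# Rotate to modal coordinates: T = inv(V), Ahat = T A T^-1, etc.
evals, V = np.linalg.eig(A)
T = np.linalg.inv(V)
Ahat = T @ A @ V          # diagonal, eigenvalues -0.5 +/- j0.866
Bhat = T @ B
Chat = C @ V

# Every element of Bhat and Chat is nonzero: controllable and observable
print(np.abs(Bhat).min() > 1e-6 and np.abs(Chat).min() > 1e-6)  # True
```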

Controllability and Observability in modal coordinates Example (continued)

Following the steps in section 4.4 to construct the T matrix (called the modal
matrix M by Bay), the transformation is given by:

>> T = inv(V)
T =
   0.0000 + 0.8165i   0.7071 - 0.4082i
  -0.0000 - 0.8165i   0.7071 + 0.4082i

>> Ahat = T * A * inv(T)
Ahat =
  -0.5000 + 0.8660i   0.0000 + 0.0000i
  -0.0000 - 0.0000i  -0.5000 - 0.8660i

>> Bhat = T * B
Bhat =
   0.7071 - 0.0000i
   0.7071 + 0.0000i

>> Chat = C * inv(T)
Chat =
  -0.3536 + 0.6124i  -0.3536 - 0.6124i

The system is controllable and observable: Bhat has no zero elements, and
neither does Chat.

When R = 1 [Ohm]:

This case is more difficult because of the repeated eigenvalue (the repeated
pole at s = -1/(RC)). It is a coincidence that the pole-zero cancellation
occurs at the same value of R as the double pole, and that this A matrix has
only 1 eigenvector. It is necessary to use the Jordan form. With eigenvector
V(:,1) and generalized eigenvector V2:

>> VV = [V(:,1), V2]
VV =
   0.7071  -0.3536
   0.7071   0.3536

>> T = inv(VV)
T =
   0.7071   0.7071
  -1.4142   1.4142

>> Ahat = T * A * inv(T)
Ahat =
  -1.0000   1.0000
        0  -1.0000

>> Bhat = T * B
Bhat =
   1.4142
   0.0000

>> Chat = C * inv(T)
Chat =
  -0.7071   0.3536
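A sketch of the Jordan-form computation for the R = 1 circuit (Python/NumPy, using the rounded eigenvector and generalized eigenvector from the notes):

```python
import numpy as np

# Circuit with R = 1, L = 1, C = 1: repeated eigenvalue at -1
A = np.array([[-2.0, 1.0], [-1.0, 0.0]])
B = np.array([[1.0], [1.0]])

# Eigenvector v1 and generalized eigenvector v2: (A + I) v2 = v1
v1 = np.array([0.7071, 0.7071])
v2 = np.array([-0.3536, 0.3536])
VV = np.column_stack([v1, v2])
T = np.linalg.inv(VV)

Ahat = T @ A @ VV   # Jordan block [[-1, 1], [0, -1]] (up to rounding)
Bhat = T @ B        # ~ [1.4142, 0]: no input into the second mode
```

The zero second entry of Bhat is the modal-coordinate signature of the uncontrollable mode.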

With

    xdot(t) = Ahat x(t) + Bhat u(t)                                   (36)
    y(t)    = Chat x(t) + D u(t)

Notice:

1. The repeated poles at -1 in Ahat.

2. Bhat(2) = 0 (there is no input into x(2) of the system: uncontrollable).

3. Chat has all non-zero elements (the system is observable).

A typical remedy for uncontrollable or unobservable modes is to re-engineer
the system to change an actuator or sensor.

3 Controllability and observability example

A double pendulum will be used to illustrate how uncontrollable or
unobservable modes can arise in a physical system. The example shows that
such modes arise with choices of where or how actuators or sensors are
placed.

The double pendulum, seen in figures 4 and 5:

- Has 4 energy storage elements (kinetic energy of each mass, potential
  energy of each mass);
- Is a 4th order system;
- Has 2 oscillatory modes.

Figure 4: Double pendulum (masses M1 and M2 on links of lengths l1 and l2):
example system for unobservable and uncontrollable modes.

Controllability and observability example (continued)

Mode 1: Motion together, θ2 = θ1
(Potential energy, no spring energy)

Mode 2: Motion opposed, θ2 = -θ1
(Potential energy and spring energy, faster oscillation)

Figure 5: Double pendulum: two modes of the double pendulum.

The modeling equations are:

    M1 l1^2 θ1''(t) = τ1(t) = -M1 g l1 θ1 - k (θ1 - θ2) - b θ1'
    M2 l2^2 θ2''(t) = τ2(t) = -M2 g l2 θ2 - k (θ2 - θ1) - b θ2'

With state vector x(t) = [θ1' θ1 θ2' θ2]^T we get the state variable model:

>> %% Double Pendulum, controllability/observability example
>> %% Setup parameters
>> M1 = 2;  M2 = M1;      % Ball mass
>> l1 = 1;  l2 = l1;      % Link length
>> b = 0.1;               % Damping factor
>> k = 20;                % Spring stiffness
>> M1l2 = M1*l1^2;        % Link M1 inertia
>> M2l2 = M2*l2^2;        % Link M2 inertia
>> g = 9.8;               % Gravity constant

>> Ap = [ -b/M1l2, -k/M1l2-g/l1,  0,       +k/M1l2 ;
           1,       0,            0,        0 ;
           0,      +k/M2l2,      -b/M2l2,  -k/M2l2-g/l2 ;
           0,       0,            1,        0 ]
Ap =
   -0.0500  -19.8000         0   10.0000
    1.0000         0         0         0
         0   10.0000   -0.0500  -19.8000
         0         0    1.0000         0

And if we apply an input force on M1 and measure θ2 as illustrated in
figure 6, we get the input and output matrices:

>> Bp = [ l1/M1l2; 0; 0; 0]    %% Input is force on M1
>> Cp = [ 0 0 0 1 ]            %% Output is theta2
>> Dp = [ 0 ]

>> OLPoles = eig(Ap)
OLPoles =
   0.0000 + 5.4589i    %% Fast mode
   0.0000 - 5.4589i
  -0.0000 + 3.1305i    %% Slow mode
  -0.0000 - 3.1305i
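The state model above can be rebuilt and checked numerically; this Python/NumPy sketch (an illustration alongside the MATLAB session) confirms the two oscillatory modes near 3.13 and 5.46 rad/sec:

```python
import numpy as np

# Double-pendulum parameters from the notes
M1 = 2.0; M2 = M1        # ball masses
l1 = 1.0; l2 = l1        # link lengths
b = 0.1                  # damping factor
k = 20.0                 # spring stiffness
g = 9.8                  # gravity constant
M1l2 = M1 * l1**2        # link inertias
M2l2 = M2 * l2**2

# State x = [th1', th1, th2', th2]
Ap = np.array([
    [-b/M1l2, -k/M1l2 - g/l1,  0.0,      k/M1l2        ],
    [ 1.0,     0.0,            0.0,      0.0           ],
    [ 0.0,     k/M2l2,        -b/M2l2,  -k/M2l2 - g/l2 ],
    [ 0.0,     0.0,            1.0,      0.0           ]])

poles = np.linalg.eigvals(Ap)
print(np.sort(np.abs(poles.imag)))   # ~ [3.13, 3.13, 5.46, 5.46]
```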

Figure 6: Double pendulum with sensor and actuator (the sensor measures
θ2(t); the input u(t) is a force applied to M1).

This model gives the step response of figure 7, where both modes are
present.

Figure 7: Response of the first double pendulum model (step response,
amplitude vs. time over 0-250 sec; both oscillatory modes are visible).

3.1 Testing Controllability and Observability

1. Controllability: construct the controllability matrix

>> %% Controllability matrix
>> Ccontrol = [Bp, Ap*Bp, Ap*Ap*Bp, Ap*Ap*Ap*Bp]
Ccontrol =
    0.5000         0   -9.9000         0
         0    0.5000         0   -9.9000
         0         0    5.0000         0
         0         0         0    5.0000

>> rank(Ccontrol)
ans = 4

The system is controllable.

2. Observability: construct the observability matrix

>> %% Observability matrix
>> Oobserve = [Cp; Cp*Ap; Cp*Ap*Ap; Cp*Ap*Ap*Ap]
Oobserve =
         0         0         0    1.0000
         0         0    1.0000         0
         0   10.0000         0  -19.8000
   10.0000         0  -19.8000         0

>> rank(Oobserve)
ans = 4

The system is observable.
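A Python/NumPy sketch of the same rank tests (illustrative; the notes use MATLAB):

```python
import numpy as np

# Double-pendulum state model (values as computed in the notes)
Ap = np.array([[-0.05, -19.8,  0.0,  10.0],
               [ 1.0,    0.0,  0.0,   0.0],
               [ 0.0,   10.0, -0.05, -19.8],
               [ 0.0,    0.0,  1.0,   0.0]])
Bp = np.array([[0.5], [0.0], [0.0], [0.0]])   # force on M1
Cp = np.array([[0.0, 0.0, 0.0, 1.0]])         # measure theta2

# Controllability matrix [B, AB, A^2 B, A^3 B] and observability matrix
n = Ap.shape[0]
Ccontrol = np.hstack([np.linalg.matrix_power(Ap, i) @ Bp for i in range(n)])
Oobserve = np.vstack([Cp @ np.linalg.matrix_power(Ap, i) for i in range(n)])

print(np.linalg.matrix_rank(Ccontrol), np.linalg.matrix_rank(Oobserve))  # 4 4
```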

3.2 A badly placed sensor

Now let's model a system with a position sensor placed in the middle of the
spring (rather than sensing position θ2):

Figure 8: Double pendulum with sensor moved to a new location (the sensor
now reads the midpoint of the spring; the applied force u(t) on M1 is
unchanged).

The output signal is given by:

    y(t) = ( l1 θ1 + l2 θ2 ) / 2 ,   or   Cp1 = [ 0  l1/2  0  l2/2 ]

>> Cp1 = [ 0 l1/2 0 l2/2 ]
Cp1 = [ 0  0.5000  0  0.5000 ]

The controllability matrix:

    P = [ Bp  Ap*Bp  Ap^2*Bp  Ap^3*Bp ]    (unchanged)                 (37)

Moving the sensor does not change the controllability of a system.

The observability matrix:

>> Oobserve1 = [Cp1; Cp1*Ap; Cp1*Ap*Ap; Cp1*Ap*Ap*Ap]
Oobserve1 =
         0    0.5000         0    0.5000
    0.5000         0    0.5000         0
         0   -4.9000         0   -4.9000
   -4.9000         0   -4.9000         0

>> rank(Oobserve1)
ans = 2

Rank(Q) < n: the system is unobservable.
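The rank deficiency can be reproduced numerically (Python/NumPy sketch):

```python
import numpy as np

# Same double pendulum; sensor now reads the spring midpoint,
# y = (l1*th1 + l2*th2)/2, i.e. Cp1 = [0, 0.5, 0, 0.5]
Ap = np.array([[-0.05, -19.8,  0.0,  10.0],
               [ 1.0,    0.0,  0.0,   0.0],
               [ 0.0,   10.0, -0.05, -19.8],
               [ 0.0,    0.0,  1.0,   0.0]])
Cp1 = np.array([[0.0, 0.5, 0.0, 0.5]])

Oobserve1 = np.vstack([Cp1 @ np.linalg.matrix_power(Ap, i) for i in range(4)])
print(np.linalg.matrix_rank(Oobserve1))   # 2: the fast mode is unobservable
```

The symmetric sensor sees only the in-phase (slow) mode, so half the state space is invisible at the output.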

A badly placed sensor (continued)

Let's take a look at the poles and zeros.

Original system:

>> Sys0 = ss(Ap, Bp, Cp, Dp);
>> Poles0 = pole(Sys0)
Poles0 =
   0.0000 + 5.4589i    %% Fast mode
   0.0000 - 5.4589i
  -0.0000 + 3.1305i    %% Slow mode
  -0.0000 - 3.1305i
>> Zeros0 = zero(Sys0)
Zeros0 = Empty matrix: 0-by-1    %% No zeros

System with the sensor in the middle of the spring:

>> Sys1 = ss(Ap, Bp, Cp1, Dp);
>> Poles1 = pole(Sys1);    %% Poles1: Unchanged.
>> Zeros1 = zero(Sys1)
Zeros1 =
   0 + 5.4589i    %% Zeros overlap Fast mode
   0 - 5.4589i

The new sensor position introduces zeros which collide with the fast-mode
poles. This system is controllable but unobservable.

3.2.1 A badly placed actuator

Let's go back to the original system, and rather than applying a force to M1,
let's put in a linear motor that acts between M1 and M2.

Figure 9: Double pendulum with actuator moved to a new location: a linear
motor, f(t) = ks va(t), acting between M1 and M2 (the θ2 sensor is
unchanged).

The input signal is given by:

    Bp2 = [ l1/(M1 l1^2) ;  0 ;  -l2/(M2 l2^2) ;  0 ]

The observability matrix:

    Q = [ Cp ; Cp*Ap ; Cp*Ap^2 ; Cp*Ap^3 ]    (unchanged)              (38)

Moving the actuator does not change the observability of a system.
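The zero locations reported for the spring-midpoint sensor can be checked numerically. This Python/SciPy sketch converts the state model to a transfer function and roots the numerator (the leading-coefficient trimming guards against round-off; the exact zeros also pick up a small negative real part from damping):

```python
import numpy as np
from scipy import signal

# Double pendulum with the spring-midpoint sensor: the transfer
# function zeros land on the fast-mode poles (~ +/- j5.46)
Ap = np.array([[-0.05, -19.8,  0.0,  10.0],
               [ 1.0,    0.0,  0.0,   0.0],
               [ 0.0,   10.0, -0.05, -19.8],
               [ 0.0,    0.0,  1.0,   0.0]])
Bp = np.array([[0.5], [0.0], [0.0], [0.0]])
Cp1 = np.array([[0.0, 0.5, 0.0, 0.5]])

num, den = signal.ss2tf(Ap, Bp, Cp1, np.zeros((1, 1)))
coeffs = num[0]
# drop near-zero leading coefficients introduced by round-off
lead = np.argmax(np.abs(coeffs) > 1e-8 * np.abs(coeffs).max())
zeros = np.roots(coeffs[lead:])
print(np.sort(np.abs(zeros.imag)))   # ~ [5.4589, 5.4589]
```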

A badly placed actuator (continued)

The controllability matrix:

>> Bp2 = [l1/M1l2; 0; -l2/M2l2; 0 ]
Bp2 =
    0.5000
         0
   -0.5000
         0

>> Ccontrol2 = [Bp2, Ap*Bp2, Ap*Ap*Bp2, Ap*Ap*Ap*Bp2]
Ccontrol2 =
    0.5000         0  -14.9000         0
         0    0.5000         0  -14.9000
   -0.5000         0   14.9000         0
         0   -0.5000         0   14.9000

>> rank(Ccontrol2)
ans = 2

Rank(P) < n: the system is uncontrollable.

This system is observable but uncontrollable. Take a look at the system in
modal coordinates:

%% Rotate to modal coordinates
[V, P] = eig(Ap);
mpT = V;
mpT1 = inv(mpT);
Am = mpT1 * Ap * mpT
   = diag([0+j5.5, 0-j5.5, 0+j3.1, 0-j3.1])

Input matrix of the system with the badly placed actuator:

Bm = mpT1 * Bp2
   = [3.6; 3.6; 0; 0]            %% slow mode uncontrollable

Output matrix of the system with the badly placed sensor:

Cm = Cp1 * mpT
   = [0, 0, 0+j0.22, 0-j0.22]    %% fast mode unobservable

Let's take a look at the poles and zeros of the system with the actuator
between the links:

>> Sys2 = ss(Ap, Bp2, Cp, Dp);
>> Zeros2 = zero(Sys2)
Zeros2 =
   0 + 3.1305i    %% Zeros overlap Slow mode
   0 - 3.1305i

The new actuator design introduces zeros which collide with the slow-mode
poles.
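The modal-coordinate picture can be verified numerically; this Python/NumPy sketch checks that the between-the-masses actuator has (to machine precision) zero projection onto the slow, in-phase mode:

```python
import numpy as np

# Actuator acting between M1 and M2: Bp2 pushes the masses apart,
# so it cannot excite the in-phase (slow) mode
Ap = np.array([[-0.05, -19.8,  0.0,  10.0],
               [ 1.0,    0.0,  0.0,   0.0],
               [ 0.0,   10.0, -0.05, -19.8],
               [ 0.0,    0.0,  1.0,   0.0]])
Bp2 = np.array([[0.5], [0.0], [-0.5], [0.0]])

evals, V = np.linalg.eig(Ap)
Bm = np.linalg.solve(V, Bp2)          # input matrix in modal coordinates

slow = np.abs(evals.imag) < 4.0       # slow-mode entries (~ +/- j3.13)
print(np.abs(Bm[slow]).max() < 1e-8)  # True: slow mode uncontrollable
```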

3.3 Summary: Controllability and Observability

- The order of the system is based on the number of storage elements (or
  1st order differential equations). If a pole and zero cancel, it does not
  remove an energy storage element from the system: the mode is still in
  there.

- Controllability and observability are properties of the system, not the
  controller. There is no way to fix a controllability or observability
  problem by choosing a different controller.

- Most often controllability or observability problems are fixed by changing
  (or adding) an actuator, or changing (or adding) a sensor.

- Unobservable or uncontrollable modes may not be a problem if they are
  sufficiently fast and well damped.

- Transfer functions provide little insight into controllability and
  observability. State variable models were needed to see what was going
  on (R.E. Kalman [1]).

3.3.1 Uncontrollable and unobservable modes and rocketry

Space launch vehicles are one type of system where controllability and
observability problems arise:

- Flexible structures
- Restricted freedom to place sensors or actuators
- The poles and zeros move around vigorously during launch, as the
  aerodynamic condition changes and mass changes
- Open-loop unstable mode (without control, the rocket will fall over)
- Lightly damped structural modes:
  Can be excited by vibrations during launch;
  Moderate oscillations can (and did!) exceed structural limits.

Uncontrollable or unobservable modes must be very carefully considered
for rocketry. This is partly why it takes a Ph.D. to design control for an
air or space craft.

4 Additional topics in Bay

Bay also addresses time-varying systems and alternative definitions of
Controllability and Observability.

4.1 Alternative definitions of Controllability and Observability

We have several alternative definitions for the concepts of Controllability
and Observability. In most cases (except as noted), these are equivalent to
basic Controllability and Observability for Linear Time-Invariant systems.
The differences become more important (and subtle) for nonlinear or
time-varying systems.

Definitions related to Controllability

Controllability:
    A system is controllable if there exists a u(t) to drive from an
    arbitrary state x0 to the origin.

Reachability:
    Can find u(t) to drive from an arbitrary state x0 to any second
    arbitrary state x1.

LTI systems: Controllability and reachability are equivalent for continuous
systems. For discrete systems, there are (defective) discrete systems which
are controllable in a trivial way; consider:

    x(k+1) = [ 0  0 ] x(k) + [ 0 ] u(k)
             [ 0  0 ]        [ 0 ]

Every state is driven to the origin in one step, so this system is
"controllable", yet no nonzero state can be reached. Reachability is
actually the more interesting property, requiring invertibility (full rank)
of the controllability matrix.

Stabilizability:
    A system is stabilizable if its uncontrollable modes, if any, are
    open-loop stable. Its controllable modes may be stable or unstable.
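Stabilizability can be checked with a PBH-style rank test restricted to the unstable eigenvalues. A Python/NumPy sketch (the example matrices here are illustrative, not from the notes):

```python
import numpy as np

def is_stabilizable(A, B, tol=1e-9):
    """PBH-style check: every eigenvalue of A with Re >= 0 must be
    controllable, i.e. rank([lam*I - A, B]) = n at that eigenvalue."""
    n = A.shape[0]
    for lam in np.linalg.eigvals(A):
        if lam.real >= -tol:
            M = np.hstack([lam * np.eye(n) - A, B])
            if np.linalg.matrix_rank(M, tol) < n:
                return False
    return True

# Uncontrollable but stabilizable: the uncontrollable mode at -2 is stable,
# and the unstable mode at +1 is reachable through B
A = np.array([[-2.0, 0.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
print(is_stabilizable(A, B))   # True
```

Detectability can be checked the same way on (A^T, C^T), by duality.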

Definitions related to Observability

Observability:
    Can the initial state be determined?

Reconstructability:
    Can the final state be determined?

Detectability:
    A system is detectable if its unobservable modes, if any, are open-loop
    stable. Its observable modes may be stable or unstable.

Stabilizability and Detectability:
    Stabilizability and Detectability relate to whether a closed-loop system
    can be stabilized. For simple stability (as opposed to good
    performance), we don't need to observe or control open-loop stable
    modes. If any unobservable or uncontrollable modes are fast enough and
    well damped, and not excessively excited by disturbances, we may also
    be able to achieve good control.

References

[1] R.E. Kalman. On the general theory of control systems. In Proc. 1st
    Inter. Congress on Automatic Control, pages 481-493. Moscow: IFAC, 1960.

[2] R.E. Kalman, Y.C. Ho, and K.S. Narendra. Controllability of Linear
    Dynamical Systems, Vol. 1. New York: John Wiley and Sons, 1962.