
# Name: _________________________________

Lecture Notes: Linear Algebra

## EE/ME 701: Advanced Linear Systems

Prof. Brian Armstrong

September 11, 2012

Contents

1 Linear Algebra
  1.1 Scalars, Matrices and Vectors
  1.2 Transpose
  1.3 Basic arithmetic: +, - and * are well defined
      1.3.1 Multiplication
  1.4 Commutative, Associative, Distributive and Identity Properties
      1.4.1 Commutative property
      1.4.2 Associative property
      1.4.3 Distributive property
      1.4.4 Identity Matrix
      1.4.5 Doing algebra with vectors and matrices
  1.5 Linear Independence of Vectors
  1.6 Determinant
      1.6.1 Definition: Laplace's expansion
      1.6.2 Properties of the determinant
      1.6.3 The determinant of triangular and diagonal matrices
  1.7 Rank
  1.8 Vector norms
      1.8.1 Example norms
  1.9 Singular values
      1.9.1 SVD, condition number and rank
  1.10 The condition number of a matrix
      1.10.1 Condition number and error
      1.10.2 Example of a badly conditioned matrix
      1.10.3 How condition number is determined
  1.11 Eigenvalues and eigenvectors
      1.11.1 Some properties of Eigenvectors and Eigenvalues
      1.11.2 Additional notes on Eigenvectors and Eigenvalues
      1.11.3 One final fact about the eigensystem: V diagonalizes A
2 Two equations of interest
  2.1 Algebraic Equations, y = A b
      2.1.1 Case 1: Where we have n = p independent equations, the exactly constrained case
      2.1.2 Case 2: Where n > p, we have more equations than unknowns, the over constrained case (Left pseudo-inverse case)
      2.1.3 Case 3: Where n < p, we have fewer equations than unknowns, the underconstrained case (Right pseudo-inverse case)
  2.2 Differential Equations
3 Summary

## 1.1 Scalars, Matrices and Vectors

Matrices are rectangular arrays of numbers or functions. Examples:

$$x(t) = \begin{bmatrix} v(t) \\ w(t) \\ z(t) \end{bmatrix} \in \mathbb{R}^{3\times 1}, \qquad
A = \begin{bmatrix} 1 & 0 & 7 \\ 4 & 5 & 0 \\ 2 & 3 & 6 \end{bmatrix} \in \mathbb{R}^{3\times 3}$$

Matrices have zero or more rows and columns. ("Array" is a synonym for "Matrix".)

Vectors are special cases of matrices, with only one row or column:

$$x = \begin{bmatrix} 2 \\ 3 \end{bmatrix} \text{ is a column vector}, \qquad
w = \begin{bmatrix} 3 & 4 \end{bmatrix} \in \mathbb{R}^{1\times 2} \text{ is a row vector}$$

Scalar values (numbers or functions with one output variable) can also be treated as matrices or vectors: $3 \in \mathbb{R}^{1\times 1}$.

## 1.2 Transpose

Transposing an array rearranges each column to a row:

$$C = \begin{bmatrix} 3 & 1 \\ 4 & 1 \\ 5 & 2 \end{bmatrix}, \qquad
C^T = \begin{bmatrix} 3 & 4 & 5 \\ 1 & 1 & 2 \end{bmatrix}, \qquad
C \in \mathbb{R}^{3\times 2},\ C^T \in \mathbb{R}^{2\times 3}$$

## 1.3 Basic arithmetic: +, - and * are well defined

Operations +, - and * are well defined; the dimensions of the operands must be compatible. Define:

$$A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad
B = \begin{bmatrix} -1 & 1 \\ 2 & 3 \end{bmatrix}, \quad
x = \begin{bmatrix} 2 \\ 3 \end{bmatrix}, \quad
w = \begin{bmatrix} 3 & 4 \end{bmatrix}, \quad
C = \begin{bmatrix} 3 & 1 \\ 4 & 1 \\ 5 & 2 \end{bmatrix}$$

For addition and subtraction, the operation is element-wise, and the operands must be the same size:

$$A + B = \begin{bmatrix} 0 & 3 \\ 5 & 7 \end{bmatrix}, \qquad
A - B = \begin{bmatrix} 2 & 1 \\ 1 & 1 \end{bmatrix}$$

If the operands are not the same size, there is no defined result (the operation is impossible):

$$A + C = \text{undefined}$$
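The element-wise rules above can be cross-checked in plain Python (the notes themselves use MATLAB; `mat_add`, `mat_sub` and `transpose` are names chosen here, not from the notes):

```python
# Element-wise add/subtract and transpose for list-of-lists matrices.
def mat_add(A, B):
    # Defined only when the operands have identical dimensions.
    assert len(A) == len(B) and len(A[0]) == len(B[0]), "undefined: sizes differ"
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_sub(A, B):
    assert len(A) == len(B) and len(A[0]) == len(B[0]), "undefined: sizes differ"
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def transpose(C):
    # Each column of C becomes a row of C^T.
    return [list(col) for col in zip(*C)]

A = [[1, 2], [3, 4]]
B = [[-1, 1], [2, 3]]
C = [[3, 1], [4, 1], [5, 2]]

print(mat_add(A, B))   # [[0, 3], [5, 7]]
print(mat_sub(A, B))   # [[2, 1], [1, 1]]
print(transpose(C))    # [[3, 4, 5], [1, 1, 2]]
```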

## 1.3.1 Multiplication

For multiplication, the operation is by row on the left and column on the right. To produce one element of the result, go across each row on the left and multiply with the elements of the column on the right:

$$A\,B = \begin{bmatrix} (1\cdot(-1) + 2\cdot 2) & (1\cdot 1 + 2\cdot 3) \\ (3\cdot(-1) + 4\cdot 2) & (3\cdot 1 + 4\cdot 3) \end{bmatrix} = \begin{bmatrix} 3 & 7 \\ 5 & 15 \end{bmatrix}$$

$$A\,x = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix} 2 \\ 3 \end{bmatrix} = \begin{bmatrix} (1\cdot 2 + 2\cdot 3) \\ (3\cdot 2 + 4\cdot 3) \end{bmatrix} = \begin{bmatrix} 8 \\ 18 \end{bmatrix}$$

The size of the result matrix is determined by the number of rows in A and the number of columns in B, and the inner dimensions must agree: with $A \in \mathbb{R}^{n\times m}$ and $B \in \mathbb{R}^{j\times k}$, the product $A\,B$ is defined only when $m = j$, and the result is $n \times k$. Examples:

$$A\,C = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} 3 & 1 \\ 4 & 1 \\ 5 & 2 \end{bmatrix} = \text{undefined}$$

$$C\,A = \begin{bmatrix} 3 & 1 \\ 4 & 1 \\ 5 & 2 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 6 & 10 \\ 7 & 12 \\ 11 & 18 \end{bmatrix} \qquad (n = 3,\ m = 2,\ j = 2,\ k = 2)$$

## 1.4 Commutative, Associative, Distributive and Identity Properties

## 1.4.1 Commutative property

Matrix multiplication does not generally commute: $A\,B \ne B\,A$. Example:

$$A\,B = \begin{bmatrix} 3 & 7 \\ 5 & 15 \end{bmatrix}, \qquad B\,A = \begin{bmatrix} 2 & 2 \\ 11 & 16 \end{bmatrix}$$

Generally: there are many properties of linear algebra which may or may not be true for some special cases. "Does not generally commute" means that perhaps special matrices can be found that do commute, but that not all matrices commute.

## 1.4.2 Associative property

Like scalar algebra, +, - and * have the associative property:

$$(A + B) + C = A + (B + C), \qquad (A\,B)\,C = A\,(B\,C)$$

(Revised: Sep 06, 2012)
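The row-by-column rule, the size rule, and the failure of commutativity can be checked the same way (pure Python; `mat_mul` is a name chosen here, not from the notes):

```python
# Row-by-column multiplication: entry (i, c) = sum over t of A[i][t] * B[t][c].
def mat_mul(A, B):
    n, m = len(A), len(A[0])
    j, k = len(B), len(B[0])
    assert m == j, "undefined: inner dimensions differ"
    return [[sum(A[r][t] * B[t][c] for t in range(m)) for c in range(k)]
            for r in range(n)]

A = [[1, 2], [3, 4]]
B = [[-1, 1], [2, 3]]
x = [[2], [3]]

print(mat_mul(A, B))  # [[3, 7], [5, 15]]
print(mat_mul(A, x))  # [[8], [18]]
print(mat_mul(B, A))  # [[2, 2], [11, 16]]  -- so AB != BA
```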

## 1.4.3 Distributive property

Multiplication distributes over addition and subtraction:

$$(A + B)\,C = A\,C + B\,C, \qquad C\,(A - B) = C\,A - C\,B$$

## 1.4.4 Identity Matrix

Like scalar algebra, linear algebra has a multiplicative identity:

$$I_L\,C = C\,I_R = C$$

Examples:

$$\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix}, \qquad
\begin{bmatrix} 3 & 1 \\ 4 & 1 \\ 5 & 2 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 3 & 1 \\ 4 & 1 \\ 5 & 2 \end{bmatrix}$$

$$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$$

## 1.4.5 Doing algebra with vectors and matrices

Starting with an equation, we can add, subtract or multiply on the left or right by any allowed term and get a new equation. Examples, given

$$A + B = C$$

then

$$(A + B) + D = C + D$$

$$E\,(A + B) = E\,A + E\,B = E\,C$$

$$(A + B)\,F = C\,F$$

where A, B, C, D, E and F are compatible sizes. Matrices that are the appropriate size for an operation are called commensurate.
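A quick numerical check of the distributive and identity properties, using small 2x2 examples (the matrix `F` is a hypothetical third operand chosen here, not from the notes):

```python
def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2], [3, 4]]
B = [[-1, 1], [2, 3]]
F = [[2, 0], [1, 1]]      # hypothetical compatible matrix
I2 = [[1, 0], [0, 1]]

# Distributive: (A + B) F == A F + B F
assert mul(add(A, B), F) == add(mul(A, F), mul(B, F))
# Identity: I A == A I == A
assert mul(I2, A) == mul(A, I2) == A
print("properties verified")
```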

## 1.5 Linear Independence of Vectors

Linear Dependence: a set of p n-dimensional vectors,

$$\{v_1, v_2, \dots, v_p\}, \quad v_i \in \mathbb{R}^n$$

is linearly dependent if there exists a set of scalars $\{a_i\},\ i = 1 \dots p$, not all of which are zero, such that:

$$a_1 v_1 + a_2 v_2 + \cdots + a_p v_p = \sum_{i=1}^{p} a_i v_i = 0 = \begin{bmatrix} 0 \\ \vdots \\ 0 \end{bmatrix} \qquad (1)$$

Linear Independence: the set of vectors is said to be linearly independent if there is no set of values $\{a_i\}$ which satisfies the condition for Linear Dependence. In other words,

$$\sum_{i=1}^{p} a_i v_i = 0 \qquad (2)$$

implies that the $\{a_i\}$ are all zero. Written another way, vectors $\{v_i\}$ are linearly independent if and only if (iff)

$$\sum_{i=1}^{p} a_i v_i = 0 \;\Rightarrow\; a_i = 0,\ \forall i \qquad (3)$$

In matrix form,

$$\begin{bmatrix} | & | & & | \\ v_1 & v_2 & \cdots & v_p \\ | & | & & | \end{bmatrix}\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix} = 0
\quad \text{iff} \quad
\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix} = 0 \qquad (4)$$

## 1.6 Determinant

The determinant is a scalar measure of the size of a square matrix:

$$\det(A) = |A| \in \mathbb{R}^1 \qquad (5)$$

The determinant is not defined for a non-square matrix. The determinant of a matrix will be non-zero if and only if the rows (or, equivalently, columns) of the matrix are linearly independent. Examples:

$$\text{(a)}\ \begin{vmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 10 & 14 & 18 \end{vmatrix} = 0, \qquad
\text{(b)}\ \begin{vmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & -9 \end{vmatrix} = 54 \qquad (6)$$

In case (a), the third column is given by 2·Col2 - 1·Col1; in case (b), the three columns are independent. Notice also that in case (a), the third row is given by 2·Row1 + 2·Row2: always for a square matrix, if the columns are dependent, the rows will be dependent.
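Eqn (1) can be checked numerically for the case (a) matrix of Eqn (6): since Col3 = 2·Col2 - Col1, the coefficients a = (1, -2, 1) give a nontrivial combination of the columns equal to the zero vector (a pure-Python sketch):

```python
# Columns of the case (a) matrix; c3 = 2*c2 - c1, so a = (1, -2, 1)
# is a nontrivial set of scalars satisfying Eqn (1).
c1, c2, c3 = [1, 4, 10], [2, 5, 14], [3, 6, 18]
a = (1, -2, 1)
combo = [a[0] * u + a[1] * v + a[2] * w for u, v, w in zip(c1, c2, c3)]
print(combo)  # [0, 0, 0] -> the columns are linearly dependent
```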

## 1.6.1 Definition: Laplace's expansion

The determinant of a square matrix (the only kind!) is defined by Laplace's expansion (following Franklin et al.):

$$\det A = \sum_{j=1}^{n} a_{ij}\,\gamma_{ij} \quad \text{for any } i = 1, 2, \dots, n \qquad (7)$$

where $a_{ij}$ is the element from the ith row and jth column of A, and $\gamma_{ij}$ is called the cofactor, given by:

$$\gamma_{ij} = (-1)^{i+j} \det M_{ij} \qquad (8)$$

where $M_{ij}$ is called a minor. $M_{ij}$ is the same matrix as A, with the ith row and jth column removed. For example, with i = 1:

$$\det \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}
= a \det \begin{bmatrix} e & f \\ h & i \end{bmatrix}
- b \det \begin{bmatrix} d & f \\ g & i \end{bmatrix}
+ c \det \begin{bmatrix} d & e \\ g & h \end{bmatrix}$$

Closed-form expressions for 1x1, 2x2 and 3x3 matrices are sometimes handy. They are:

$$\det\,[a] = a, \qquad
\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = a\,d - c\,b$$

$$\det \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix} = a\,e\,i - a\,h\,f - b\,d\,i + b\,g\,f + c\,d\,h - c\,g\,e$$

## 1.6.2 Properties of the determinant

1. $\det(A\,B) = \det(A)\,\det(B)$
2. Invertibility of a matrix: given $M \in \mathbb{R}^{n\times n}$, M is invertible iff $\det(M) \ne 0$
3. Given M, an invertible matrix:
   (a) $\det\left(M^{-1}\right) = 1/\det(M)$
   (b) Similarity relation: $\det\left(M\,A\,M^{-1}\right) = \det(M)\,\det(A)\,\dfrac{1}{\det(M)} = \det(A)$

## 1.6.3 The determinant of triangular and diagonal matrices

1. If a matrix has the upper-triangular form $A_u$ or the lower-triangular form $A_l$, with diagonal elements $d_1, \dots, d_n$, then $\det(A_u) = \prod_{k=1}^{n} d_k$ and $\det(A_l) = \prod_{k=1}^{n} d_k$. Example:

    >> A = [ 1 2 3 ; 0 4 5 ; 0 0 6]
    A = 1  2  3
        0  4  5
        0  0  6
    >> det(A)
    ans = 24

2. A diagonal matrix is a special case of 1).
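Laplace's expansion (7)-(8) translates directly into a short recursive routine. This pure-Python sketch expands along the first row and reproduces the determinants used in this section:

```python
# Determinant by Laplace expansion along the first row (Eqn (7)):
# det A = sum_j A[0][j] * (-1)**j * det(minor(A, 0, j)).
def det(A):
    n = len(A)
    if n == 1:
        return A[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in A[1:]]  # drop row 0, column j
        total += (-1) ** j * A[0][j] * det(minor)
    return total

print(det([[1, 2, 3], [0, 4, 5], [0, 0, 6]]))     # 24 (product of diagonals)
print(det([[1, 2, 3], [4, 5, 6], [7, 8, -9]]))    # 54 (case (b))
print(det([[1, 2, 3], [4, 5, 6], [10, 14, 18]]))  # 0  (dependent columns)
```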

## 1.7 Rank

The rank of a matrix is the number of independent rows (or columns):

$$r = \mathrm{rank}(A), \quad A \in \mathbb{R}^{n\times m} \qquad (9)$$

The number of independent rows is always equal to the number of independent columns. Example:

$$A = \begin{bmatrix} r_1^T \\ r_2^T \\ r_3^T \end{bmatrix} = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 6 & 8 & 10 & 12 \end{bmatrix}$$

$r_3 = 1.0\,r_1 + 1.0\,r_2$ (the third row is the sum of the first two), so the set of 3 vectors is linearly dependent. Because there are 2 independent vectors,

$$\mathrm{rank}(A) = 2 \qquad (10)$$

Notice that the 3rd column is 2·Col2 - Col1, and the 4th column is 3·Col2 - 2·Col1.

- The rank of a matrix can not be greater than the number of rows or columns: rank(A) ≤ min(n, m).
- A matrix is said to be full rank if rank(A) takes its maximum possible value, that is rank(A) = min(n, m). Otherwise the matrix is said to be rank deficient.
- A square matrix is invertible if it is full rank.
- The determinant of a square matrix is zero if the matrix is rank deficient.

## 1.8 Vector norms

The norm of a vector, written ||x||, is a measure of the size of the vector. A norm is any function $\|\cdot\| : \mathbb{R}^n \mapsto \mathbb{R}$ with these properties, for any vectors x, v and scalar a:

1. Positivity: the norm of any vector x is a non-negative real number, $\|x\| \ge 0$.
2. Triangle inequality: $\|x + v\| \le \|x\| + \|v\|$.
3. Positive homogeneity, or positive scalability: $\|a\,x\| = |a|\,\|x\|$, where |a| is the absolute value of a.

Actually, property 1 follows from properties 2 and 3, so 2 and 3 are sufficient for the definition. Additionally, the norms we will use have the property:

4. Positive definiteness: $\|x\| = 0$ if and only if $x = 0$.

That is, the norm is zero only if the vector is a vector of all zeros (called the null vector). Technically, a function with properties 1-3 but without property 4 is called a seminorm, but this distinction will not be important for us.
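The rank of the example matrix can be confirmed by forward elimination, counting the nonzero pivot rows (a minimal pure-Python sketch; the notes themselves use the SVD for this in Section 1.9.1):

```python
# Rank by Gaussian elimination: count the pivot rows after forward elimination.
def rank(A, tol=1e-9):
    M = [[float(x) for x in row] for row in A]
    rows, cols = len(M), len(M[0])
    r = 0
    for c in range(cols):
        # find a pivot at or below row r in column c
        piv = next((i for i in range(r, rows) if abs(M[i][c]) > tol), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(r + 1, rows):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

A = [[1, 2, 3, 4], [5, 6, 7, 8], [6, 8, 10, 12]]
print(rank(A))  # 2 -- the third row is the sum of the first two
```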

## 1.8.1 Example norms

The 2-norm (a.k.a. the Euclidean norm):

$$\|x\|_2 = \sqrt{\sum_{i=1}^{n} x_i^2} \qquad (11)$$

Example:

$$x^T = \begin{bmatrix} 2 & -3 & -4 \end{bmatrix}, \qquad \|x\|_2 = \sqrt{2^2 + (-3)^2 + (-4)^2} = 5.39 \qquad (12)$$

Other common norms are the 1-norm and the ∞-norm; these are all special cases of the p-norm:

$$\|x\|_p = \left( \sum_i |x_i|^p \right)^{1/p} \qquad (13)$$

We can write the 2-norm as:

$$\|x\|_2 = \left( \sum_i |x_i|^2 \right)^{1/2} \qquad (14)$$

The 1-norm is the sum of the absolute values (also called the Manhattan metric):

$$\|x\|_1 = \left( \sum_i |x_i|^1 \right)^{1/1} = \sum_i |x_i| \qquad (15)$$

The ∞-norm is the largest absolute value:

$$\|x\|_\infty = \lim_{p\to\infty} \left( \sum_i |x_i|^p \right)^{1/p} = \max_i |x_i| \qquad (16)$$

Examples:

$$\left\| \begin{bmatrix} 2 & -3 & -4 \end{bmatrix} \right\|_1 = 9, \qquad \left\| \begin{bmatrix} 2 & -3 & -4 \end{bmatrix} \right\|_\infty = 4$$

## 1.9 Singular values

A matrix $A \in \mathbb{R}^{n\times m}$ has p singular values, where p = min(n, m). For example:

    >> A = [ 1 2 3 ; 4 5 6 ; 7 6 5 ; 2 1 1 ]
    >> S = svd(A)
    S = 14.1542
         2.5515
         0.3857

The $\sigma_i$, the singular values, are a measure of how a matrix scales a vector. For example, for matrix A there is a vector v1 so that A v1 is scaled by 14.1542, and another vector v3 so that A v3 is scaled by 0.3857.

With example matrix A, choosing v1 and v3 to illustrate the largest and smallest singular values (more later on choosing v1 and v3) gives:

    v1 = -0.5763        v3 =  0.3724
         -0.5735             -0.8184
         -0.5822              0.4375

    >> y1 = A*v1        >> y3 = A*v3
    y1 =  -3.4700       y3 =  0.0482
          -8.6660             0.0227
         -10.3862            -0.1159
          -2.3083             0.3640

Since $\sigma_1 = 14.1542$ is the largest singular value, there is no unit vector v that gives a y = A v larger than y1. Since $\sigma_3 = 0.3857$ is the smallest singular value, there is no unit vector v that gives a y = A v smaller than y3.
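The example norms can be checked directly from Eqns (11)-(16) (pure Python; `norm_p` and `norm_inf` are names chosen here, not from the notes):

```python
import math

# p-norm of Eqn (13); p = 2 gives the Euclidean norm, p = 1 the Manhattan metric.
def norm_p(x, p):
    return sum(abs(xi) ** p for xi in x) ** (1.0 / p)

# Infinity-norm of Eqn (16): the largest absolute value.
def norm_inf(x):
    return max(abs(xi) for xi in x)

x = [2, -3, -4]
print(round(norm_p(x, 2), 2))  # 5.39  (Eqn (12))
print(norm_p(x, 1))            # 9.0
print(norm_inf(x))             # 4
```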

## 1.9.1 SVD, condition number and rank

The SVD is used to determine the rank and condition number of a matrix:

- rank(A) = # singular values > 0 (read: # singular values significantly different from zero). For the example above, rank(A) = 3.
- cond(A) = ratio of largest to smallest singular value. For the example above, cond(A) = 36.70.
- For square matrices, $|\det(A)| = \prod_{i=1}^{n} \sigma_i$. (The example A is not square, so det(A) = undefined.)

We will be looking at the SVD in detail in several weeks (cf. Bay, Chap. 4).

## 1.10 The condition number of a matrix

The condition number of a matrix indicates how errors may be scaled when the matrix inverse is used. For example, if we have these two equations in two unknowns,

$$\begin{aligned} 7 &= 3\,x_1 + 4\,x_2 \\ -9 &= -x_1 + 2\,x_2 \end{aligned} \qquad (17)$$

then y = A x becomes

$$\begin{bmatrix} 7 \\ -9 \end{bmatrix} = \begin{bmatrix} 3 & 4 \\ -1 & 2 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \qquad (18)$$

and solving for x:

$$x = A^{-1}\,y = \begin{bmatrix} 0.2 & -0.4 \\ 0.1 & 0.3 \end{bmatrix} \begin{bmatrix} 7 \\ -9 \end{bmatrix} = \begin{bmatrix} 5 \\ -2 \end{bmatrix}$$

## 1.10.1 Condition number and error

But suppose the ideal values $y^*$ are unknown, and instead we have measurements $\bar{y}$ given by:

$$\bar{y} = y^* + \tilde{y} \qquad (20)$$

where $\bar{y}$ is a measurement of $y^*$, and $\tilde{y}$ represents measurement noise. For example, $\tilde{y} = N(0, \sigma_y)$: samples with a normal (Gaussian) random distribution, with zero mean and $\sigma_y$ standard deviation. Then what we can compute is

$$\hat{x} = A^{-1}\,\bar{y} \qquad (21)$$

with error $\tilde{x} = \hat{x} - x^*$. An important question is: how much error does the measurement noise $\tilde{y}$ introduce into the calculation of $\hat{x}$? In the worst case, the relative error is amplified by the condition number:

$$\max \frac{\|\tilde{x}\|}{\|x^*\|} = \mathrm{cond}\left(A^{-1}\right) \frac{\|\tilde{y}\|}{\|y^*\|} \qquad (19)$$

or

$$\|\tilde{x}\| \le \|x^*\|\,\mathrm{cond}\left(A^{-1}\right) \frac{\|\tilde{y}\|}{\|y^*\|} \qquad (22)$$
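For a 2x2 matrix, the singular values can be computed by hand as the square roots of the eigenvalues of AᵀA, so the condition number of A1 in the example below can be verified without MATLAB (a pure-Python sketch, assuming a real 2x2 input):

```python
import math

def cond_2x2(A):
    # Singular values of A are the square roots of the eigenvalues of A^T A.
    (a, b), (c, d) = A
    # Entries of the symmetric 2x2 matrix A^T A = [[p, q], [q, r]]:
    p, q, r = a * a + c * c, a * b + c * d, b * b + d * d
    # Its eigenvalues, from the quadratic formula:
    mean = (p + r) / 2.0
    disc = math.sqrt(((p - r) / 2.0) ** 2 + q * q)
    smax, smin = math.sqrt(mean + disc), math.sqrt(mean - disc)
    return smax / smin

A1 = [[3, 4], [-1, 2]]
print(round(cond_2x2(A1), 4))  # 2.618
```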

Example:

    >> A1 = [3 4 ; -1 2]
    A1 =  3  4
         -1  2
    >> ystar1 = [7; -9]
    ystar1 =  7
             -9
    >> xhat0 = A1 \ ystar1      %% Notice left-division operator
    xhat0 =  5                  %% Given by calculation with no noise
            -2
    >> cond(A1)
    ans = 2.6180                %% This is the condition number

If $\|\tilde{y}\| = 0.01$, then (using $\|y^*\| = 11.4$ and $\|x^*\| = 5.39$)

$$\max\left( \|\tilde{x}\| / \|x^*\| \right) = 2.6180 \cdot (0.01/11.4) = 0.0023, \quad \text{or} \quad
\|\tilde{x}\| \le 2.6180 \cdot (0.01/11.4) \cdot 5.39 = 0.012 \qquad (23)$$

    >> ytilde1 = 0.01 * randn(2,1)        %% An example sample of noise
    ytilde1 =  0.0087
              -0.0093
    >> xhat1 = A1 \ (ystar1 + ytilde1)
    xhat1 =  5.0055             %% Errors of about 0.0023 * ||x*||
            -2.0019             %%   = 0.0023 * 5.39 = 0.012

## 1.10.2 Example of a badly conditioned matrix

Rather than Eqn (17), suppose we have the data

$$\begin{aligned} 2 &= 2\,x_1 + 4\,x_2 \\ 198 &= 200\,x_1 + 401\,x_2 \end{aligned}$$

    >> A2 = [2 4 ; 200 401], ystar2 = [2; 198]

The condition number of A2 is large:

    >> cond(A2)
    ans = 1.0041e+05            %% = 100410.

    >> xstar2 = A2 \ ystar2
    xstar2 =  5                 %% Calculating with ideal data still gives
             -2                 %% ideal results

Then $\hat{x}_2$ is estimated by $\hat{x}_2 = A_2^{-1}\,\bar{y}_2$. Estimating $\hat{x}_2$ using A2:

    >> ytilde2 = 0.01 * randn(2,1)
    ytilde2 =  0.0121
              -0.0140
    >> ybar2 = ystar2 + ytilde2           %% An example measurement of y
    ybar2 =   2.0121
            197.9860
    >> xhat2 = A2 \ ybar2
    xhat2 =  7.4592
            -3.2266

With noise of about 1 part in 10,000 in the data, the estimate $\hat{x}_2$ is far from $x^* = [5; -2]$.
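The badly conditioned example can be reproduced with Cramer's rule, reusing the noise sample printed in the notes (`solve2` is a name chosen here):

```python
# Solve a 2x2 system by Cramer's rule; compare ideal vs. noisy right-hand sides.
def solve2(A, y):
    (a, b), (c, d) = A
    det = a * d - b * c
    return [(d * y[0] - b * y[1]) / det, (a * y[1] - c * y[0]) / det]

A2 = [[2, 4], [200, 401]]
ystar = [2, 198]
ytilde = [0.0121, -0.0140]            # the noise sample from the notes
ybar = [ys + yt for ys, yt in zip(ystar, ytilde)]

xstar = solve2(A2, ystar)             # ideal data -> ideal answer [5.0, -2.0]
xhat = solve2(A2, ybar)               # tiny noise -> large error
print(xstar)
print(xhat)   # roughly [7.45, -3.22]: error far larger than the noise
```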

## 1.10.3 How condition number is determined

The condition number is given as the ratio of the largest to smallest singular value,

$$\mathrm{cond}(A) = \frac{\bar{\sigma}_A}{\underline{\sigma}_A} \qquad (24)$$

where $\bar{\sigma}_A$ is the largest singular value of A, and $\underline{\sigma}_A$ is the smallest singular value of A.

    >> SingularValues = svd(A1)
    SingularValues = 5.117
                     1.954      %% cond(A1) = 5.117/1.954 = 2.618

    >> SingularValues = svd(A2)
    SingularValues = 448.1306
                       0.0045   %% cond(A2) = 448.1306/0.0045 = 1.0041e+05

Which matrices have condition numbers: singular values are defined for any matrix, and so the condition number can be computed for any matrix. A rank-deficient matrix B has $\underline{\sigma}_B = 0$, and so cond(B) = ∞. (The error multiplier goes to infinity!)

## 1.11 Eigenvalues and eigenvectors

A square matrix has eigenvectors and eigenvalues, making up the eigensystem of the matrix. The main properties of eigenvectors and eigenvalues are introduced here.

Consider: a vector has both a direction and a magnitude. For example:

$$v_1 = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \quad v_2 = \begin{bmatrix} 1.5 \\ 3.0 \end{bmatrix}, \quad v_3 = \begin{bmatrix} 1.58 \\ 1.58 \end{bmatrix}$$

v1 has the same direction as v2, but has a different magnitude; v3 has the same magnitude as v1, but has a different direction. (Figure: v1, v2 and v3 drawn in the (x1, x2) plane.)

In general, multiplying a vector v by a matrix A introduces both a change of magnitude and a change of direction. For example:

$$y_4 = A\,v_4 = \begin{bmatrix} 2.0 & 0.5 \\ -1.0 & 0.5 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2.5 \\ -0.5 \end{bmatrix}$$

(Figure: v4 and y4 = A v4 drawn in the (x1, x2) plane.)
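For a 2x2 matrix the characteristic equation is a quadratic, so the eigensystem of the example matrix can be computed directly (a pure-Python sketch assuming real eigenvalues; `eig_2x2` is a name chosen here):

```python
import math

def eig_2x2(A):
    # Characteristic polynomial: lam**2 - trace*lam + det = 0.
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = math.sqrt(tr * tr / 4.0 - det)   # assumes real eigenvalues
    return tr / 2.0 + disc, tr / 2.0 - disc

A = [[2.0, 0.5], [-1.0, 0.5]]
lam1, lam2 = eig_2x2(A)
print(lam1, lam2)  # 1.5 1.0

# Check the eigenvector equation A v = lam v for v1 = [1, -1]:
v1 = [1.0, -1.0]
Av1 = [A[0][0] * v1[0] + A[0][1] * v1[1], A[1][0] * v1[0] + A[1][1] * v1[1]]
print(Av1)  # [1.5, -1.5] == 1.5 * v1
```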

For any square matrix A, certain vectors have the property that the matrix changes magnitude only, not direction. That is, writing

$$y = A\,v$$

if v is one of these special vectors, then y and v have the same direction, but possibly different magnitude. If the directions are the same,

$$y = \lambda\,v, \quad \text{or} \quad \lambda\,v = A\,v$$

Examples, with $A = \begin{bmatrix} 2.0 & 0.5 \\ -1.0 & 0.5 \end{bmatrix}$:

$$\begin{bmatrix} 2.0 & 0.5 \\ -1.0 & 0.5 \end{bmatrix}\begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 1.5 \\ -1.5 \end{bmatrix} = 1.5 \begin{bmatrix} 1 \\ -1 \end{bmatrix},
\quad \text{so } v_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix},\ \lambda_1 = 1.5$$

$$\begin{bmatrix} 2.0 & 0.5 \\ -1.0 & 0.5 \end{bmatrix}\begin{bmatrix} 1 \\ -2 \end{bmatrix} = \begin{bmatrix} 1 \\ -2 \end{bmatrix} = 1.0 \begin{bmatrix} 1 \\ -2 \end{bmatrix},
\quad \text{so } v_2 = \begin{bmatrix} 1 \\ -2 \end{bmatrix},\ \lambda_2 = 1.0$$

In general,

$$\lambda_i\,v_i = A\,v_i \qquad (25)$$

where the $\lambda_i \in \mathbb{C}^1$ are the eigenvalues and the $v_i$ are the eigenvectors.

Notice that in general $a\,x \ne A\,x$, because multiplication by matrix A will rotate a general vector x. (Choosing vectors x at random, what is the probability of selecting an eigenvector?)

## 1.11.1 Some properties of Eigenvectors and Eigenvalues

1. Rearranging Eqn (25),

$$A\,v_i - \lambda_i\,v_i = 0, \quad \text{so} \quad (A - \lambda_i\,I)\,v_i = 0$$

2. For a non-zero eigenvector $v_i$ to exist, $A - \lambda_i\,I$ must be rank deficient, that is

$$\det(A - \lambda_i\,I) = 0 \qquad (26)$$

QED

Student exercise: using

$$\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = a\,d - c\,b$$

and plugging in

$$\det\left( \begin{bmatrix} 2.0 & 0.5 \\ -1.0 & 0.5 \end{bmatrix} - \lambda \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \right) = \det \begin{bmatrix} 2.0 - \lambda & 0.5 \\ -1.0 & 0.5 - \lambda \end{bmatrix}$$

verify that the roots are $\lambda_1 = 1.5$ and $\lambda_2 = 1.0$.

3. The polynomial equation given by Eqn (26) is called the characteristic equation of the matrix A.
   (a) A 3x3 matrix gives an equation in $\lambda^3$, a 4x4 gives an equation in $\lambda^4$, etc.
   (b) Abel's theorem states: there is no closed-form solution for the roots of a polynomial of 5th order and above, therefore there is no closed-form solution for the eigenvalues of a 5x5 matrix or larger.

4. Special case: when A is upper-triangular, lower-triangular or diagonal,

$$\det(A - \lambda\,I) = \prod_{k=1}^{n} (d_k - \lambda) \qquad (27)$$

where $d_k$ is the kth diagonal element. Eqn (27) shows that for upper-triangular, lower-triangular and diagonal matrices, the eigenvalues are the diagonal elements.

5. A complete set of eigenvectors exists when an n×n matrix has n linearly independent eigenvectors (so V, the matrix of eigenvectors, is invertible). For the example above,

$$V = \begin{bmatrix} v_1 & v_2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ -1 & -2 \end{bmatrix}, \qquad
U = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} = \begin{bmatrix} 1.5 & 0 \\ 0 & 1.0 \end{bmatrix}$$

6. A matrix that lacks a complete set of eigenvectors is said to be defective, but this name is meaningful only from a mathematical perspective. For many control systems, some combinations of parameter values will give a defective system matrix.

   A defective matrix can only arise when there are repeated eigenvalues. This case corresponds to the case of repeated roots in the study of ordinary differential equations, where solutions of the form $y(t) = t\,e^{\lambda t}$ arise. In this case, a special tool called the Jordan Form is required to solve the equation $\dot{x}(t) = A\,x(t) + B\,u(t)$.

## 1.11.2 Additional notes on Eigenvectors and Eigenvalues

1. As mentioned, some matrices do not have a complete set of eigenvectors. (Student Exercise: what is the name for this?)

2. Symmetric matrices are a special case. For a symmetric real matrix (or Hermitian complex matrix), it is guaranteed that all eigenvalues are real. Furthermore, it is guaranteed that a complete set of eigenvectors exists! System matrices (A in $\dot{x} = A\,x$) are rarely symmetric, but certain other important matrices are.

3. The determinant of a matrix is equal to the product of the eigenvalues of the matrix:

$$\det(A) = \prod_{i=1}^{n} \lambda_i$$

Example:

    >> aa = [ 1 2 3 ; 4 -1 -3; 5 2 1]
    aa = 1  2  3
         4 -1 -3
         5  2  1
    >> [Evecs, Evals] = eig(aa)
    Evecs = 0.6128   0.4667   0.2146
            0.0136  -0.8751  -0.8542
            0.7901  -0.1276   0.4734
    Evals = 4.9126   0.0000   0.0000
            0.0000  -3.5706   0.0000
            0.0000   0.0000  -0.3421
    >> prod(diag(Evals))        %% Product of the eigenvalues
    ans = 6.0000
    >> det(aa)                  %% Equals the determinant
    ans = 6.0000

Notice also that eigenvalues can be real or complex. Example:

    >> A = [ 1 2 3 ; -2 4 1 ; 5 6 3 ]
    >> [V,U] = eig(A)
    V = 0.5276    0.5267 - 0.0250i   0.5267 + 0.0250i
        0.2873   -0.1294 + 0.1030i  -0.1294 - 0.1030i
       -0.7994    0.8335             0.8335
    U = -2.4563   0                  0
         0        5.2282 + 0.5919i   0
         0        0                  5.2282 - 0.5919i

V is the matrix of eigenvectors; U is a matrix with the corresponding eigenvalues on the main diagonal. The characteristic polynomial is:

    >> Poly = poly(diag(U))
    Poly = 1.0000  -8.0000   2.0000  68.0000

$$\lambda^3 - 8\,\lambda^2 + 2\,\lambda + 68 = 0 \qquad (28)$$

As with many things, wikipedia has a very nice article on eigenvectors and eigenvalues: everything you wanted to know about the eigensystem of A, in 3 pages.
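The coefficients of Eqn (28) can be cross-checked without forming det(A - λI) symbolically: for a 3x3 matrix, the characteristic polynomial is λ³ - tr(A)·λ² + s₂·λ - det(A), where s₂ is the sum of the principal 2x2 minors (a pure-Python sketch; `char_poly_3x3` is a name chosen here):

```python
# Characteristic polynomial coefficients of a 3x3 matrix:
# lam^3 - trace*lam^2 + (sum of principal 2x2 minors)*lam - det(A).
def char_poly_3x3(A):
    tr = A[0][0] + A[1][1] + A[2][2]
    s2 = (A[1][1] * A[2][2] - A[1][2] * A[2][1]
        + A[0][0] * A[2][2] - A[0][2] * A[2][0]
        + A[0][0] * A[1][1] - A[0][1] * A[1][0])
    det = (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
         - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
         + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))
    return [1, -tr, s2, -det]

A = [[1, 2, 3], [-2, 4, 1], [5, 6, 3]]
print(char_poly_3x3(A))  # [1, -8, 2, 68], matching Eqn (28)
```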

4. The determinant equation

$$\det(A - \lambda\,I) = 0 \qquad (29)$$

gives a polynomial in λ, e.g. $\lambda^3 - 8\,\lambda^2 + 2\,\lambda + 68 = 0$, and in general

$$\lambda^n + a_{n-1}\,\lambda^{n-1} + \cdots + a_1\,\lambda + a_0 = 0 \qquad (30)$$

Eqn (30) is the characteristic equation of matrix A. The eigenvalues of A are the roots of Eqn (30).

Eqn (29) is useful for theoretical results, but generally not a practical method for calculating the eigenvalues, for n > 2:

- Once you get Eqn (30), how are you going to find the roots of the polynomial? (Answer below.)
- Working with the determinant does not lead to a numerically stable algorithm.
- The determinant does not give the eigenvectors.
- Going from Eqn (29) to Eqn (30) by solving the determinant involves symbolic manipulation of matrix A. It is a lot of work.

5. Matlab does not create the characteristic equation using det(A - λI) = 0 and then solve for the polynomial roots. Matlab actually goes the other way, and solves for the roots of a polynomial by forming a matrix and finding the eigenvalues of that! The polynomial of Eqn (30) is represented in Matlab as a vector, and the command roots() finds the roots:

    >> Poly = [ 1, -8, 2, 68]
    >> roots(Poly)
    ans = [ 5.2282 + 0.5919i
            5.2282 - 0.5919i
           -2.4563 ]

Under the hood, Matlab forms a matrix (called the companion matrix) and applies the eigenvalue routine to that:

    >> compan(Poly)
    ans = 8.0000  -2.0000  -68.0000
          1.0000   0         0
          0        1.0000    0
    >> eig(compan(Poly))
    ans = [ 5.2282 + 0.5919i
            5.2282 - 0.5919i
           -2.4563 ]

In general, for $\lambda^3 + a_2\,\lambda^2 + a_1\,\lambda + a_0 = 0$ the companion matrix is

$$C = \begin{bmatrix} -a_2 & -a_1 & -a_0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$$
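The companion-matrix construction can be mirrored in a few lines. The layout below follows the MATLAB `compan` output shown above, and the cross-check evaluates det(xI - C) at x = 2, where p(2) = 8 - 32 + 4 + 68 = 48:

```python
# Companion matrix of a monic polynomial, in the MATLAB compan-style layout.
def companion(poly):
    # poly = [1, a_{n-1}, ..., a_0]; first row is -a_{n-1}, ..., -a_0.
    n = len(poly) - 1
    C = [[-poly[j + 1] for j in range(n)]]
    for i in range(1, n):
        C.append([1 if j == i - 1 else 0 for j in range(n)])
    return C

Poly = [1, -8, 2, 68]
C = companion(Poly)
print(C)  # [[8, -2, -68], [1, 0, 0], [0, 1, 0]]

# Cross-check: det(x*I - C) equals the polynomial p(x); test at x = 2.
x = 2
M = [[x - C[i][j] if i == j else -C[i][j] for j in range(3)] for i in range(3)]
detM = (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
      - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
      + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))
print(detM)  # 48 == p(2)
```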

Section 1.11.2

Relationship

Section 1.11.3

## Roots of Characteristic Equation Eigenvalues of the Companion Matrix

For a state-variable model, the poles of the system are the eigenvalues of the
A matrix

## When a complete set of eigenvectors exists, the matrix of the eigenvectors

can be used to transform A into a diagonal matrix, by similarity transform.
The general form for the similarity transform is:

If the A matrix has poles in the right half plane, the system is unstable.
Note: I havent told you an algorithm for eig(A).

b = T A T 1
A

T must be invertible

(31)

## Matlabs eig() command uses one from a library of algorithms,

depending the details of the matrix. The study of efficient algorithms
to find eigenvectors and eigenvalues has been an active area of research
for at least 200 years.

b
eig (A) = eig(A)
Proof: starting with Eqn (26),
det (A I) = 0
Left multiplying by T and right multiplying by T 1


det T (A I ) T 1 = det (T ) det (A I)

1
=0
det (T )

(32)

Where the right hand equality arises because det (A I) = 0 and det (T ) is finite.
From (32) it follows that






b I = 0
det T (A I ) T 1 = det T A T 1 T I T 1 = det A

## (Revised: Sep 06, 2012)

Page 31

b
Therefore any value that is an eigenvalue of A is an eigenvalue of A.
Lecture: Basics of Linear Algebra

QED
Page 32
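Since a 2x2 characteristic polynomial is determined by the trace and the determinant, the invariance claimed in Eqns (31)-(32) can be spot-checked numerically (T below is an arbitrary invertible matrix chosen here, not from the notes):

```python
def mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(T):
    (a, b), (c, d) = T
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[2.0, 0.5], [-1.0, 0.5]]
T = [[1.0, 1.0], [0.0, 1.0]]          # any invertible T
Ahat = mul(mul(T, A), inv2(T))

# Same characteristic polynomial => same trace and determinant:
tr  = lambda M: M[0][0] + M[1][1]
det = lambda M: M[0][0] * M[1][1] - M[0][1] * M[1][0]
print(tr(A), tr(Ahat))    # 2.5 2.5
print(det(A), det(Ahat))  # 1.5 1.5
```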

THEOREM: When a square matrix A has a complete set of independent eigenvectors

$$V = \begin{bmatrix} v_1, v_2, \cdots, v_n \end{bmatrix}$$

then

$$U = V^{-1}\,A\,V \qquad (33)$$

where U is a diagonal matrix with the eigenvalues on the main diagonal.

Proof: since the vectors $v_i$ are eigenvectors, $A\,v_i = \lambda_i\,v_i$ (and, of course, $\lambda_i\,v_i = A\,v_i$), so

$$A\,V = A \begin{bmatrix} v_1, v_2, \cdots, v_n \end{bmatrix}
= \begin{bmatrix} \lambda_1 v_1, \lambda_2 v_2, \cdots, \lambda_n v_n \end{bmatrix}
= \begin{bmatrix} v_1, v_2, \cdots, v_n \end{bmatrix}
\begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix}
= V\,U$$

So A V = V U. Left multiplying both sides by $V^{-1}$ gives

$$V^{-1}\,A\,V = V^{-1}\,V\,U = U$$

QED

Equation (33) is a similarity transform with $T = V^{-1}$ and $\widehat{A} = U$.

V diagonalizes A (continued). A numerical example:

$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$$

With [V, U] = eig(A) we have

$$V = \begin{bmatrix} v_1 & v_2 & v_3 \end{bmatrix} = \begin{bmatrix} -0.2320 & -0.7858 & 0.4082 \\ -0.5253 & -0.0868 & -0.8165 \\ -0.8187 & 0.6123 & 0.4082 \end{bmatrix}, \qquad
U = \begin{bmatrix} 16.1168 & 0 & 0 \\ 0 & -1.1168 & 0 \\ 0 & 0 & 0.0000 \end{bmatrix}$$

And so

$$V^{-1}\,A\,V = \begin{bmatrix} 16.1168 & 0 & 0 \\ 0 & -1.1168 & 0 \\ 0 & 0 & 0.0000 \end{bmatrix} = U, \qquad
V\,U\,V^{-1} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} = A$$
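Eqn (33) can be verified for the running 2x2 example, whose eigensystem was found in Section 1.11: V = [v1 v2] does diagonalize A (a pure-Python sketch):

```python
def mul(A, B):
    n, m, k = len(A), len(B), len(B[0])
    return [[sum(A[i][t] * B[t][j] for t in range(m)) for j in range(k)]
            for i in range(n)]

A = [[2.0, 0.5], [-1.0, 0.5]]
V = [[1.0, 1.0], [-1.0, -2.0]]   # columns are v1 = [1, -1], v2 = [1, -2]

# V^{-1} by the 2x2 closed form:
det = V[0][0] * V[1][1] - V[0][1] * V[1][0]     # = -1
Vinv = [[V[1][1] / det, -V[0][1] / det], [-V[1][0] / det, V[0][0] / det]]

U = mul(mul(Vinv, A), V)
print(U)  # [[1.5, 0.0], [0.0, 1.0]] -- Eqn (33): U = V^{-1} A V
```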

## 2 Two equations of interest

Two equations are of interest:

1. Solving an algebraic equation:

$$y = A\,b, \qquad y, A: \text{known}; \quad \text{solve for } b \qquad (34)$$

2. Solving a differential equation:

$$\dot{x}(t) = A\,x(t) + B\,u(t), \qquad A, B, x(t=0): \text{known}; \quad \text{solve for } x(t)$$

See Strang for a nice treatment of the two problems of linear algebra.

## 2.1 Algebraic Equations, y = A b

Think of n equations in p unknowns, for example an experiment with a process:

$$y(k) = b_1\,v(k) + b_2\,v^2(k) \qquad (35)$$

where v(k) is an independent variable and y(k) is the measurement. The objective is to determine the elements of the b vector from Eqn (35), with v(k) known and y(k) measured. Writing Eqn (35) as a vector product,

$$y(k) = \phi^T(k)\,b = \begin{bmatrix} v(k) & v^2(k) \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} \qquad (36)$$

where (in this example)

$$\phi(k) = \begin{bmatrix} v(k) \\ v^2(k) \end{bmatrix} \qquad (37)$$

For example, if $b = \begin{bmatrix} 2 & 3 \end{bmatrix}^T$ and v(1) = 2, v(2) = 3, ..., v(n) = -1, then

$$y(1) = \phi^T(1)\,b = \begin{bmatrix} 2 & 4 \end{bmatrix} \begin{bmatrix} 2 \\ 3 \end{bmatrix} = 2\cdot 2 + 4\cdot 3 = 16$$

$$y(2) = \phi^T(2)\,b = \begin{bmatrix} 3 & 9 \end{bmatrix} \begin{bmatrix} 2 \\ 3 \end{bmatrix} = 3\cdot 2 + 9\cdot 3 = 33$$

$$\vdots$$

$$y(n) = \phi^T(n)\,b = \begin{bmatrix} -1 & 1 \end{bmatrix} \begin{bmatrix} 2 \\ 3 \end{bmatrix} = -1\cdot 2 + 1\cdot 3 = 1$$

With several measurements (equations) we get a matrix-vector equation (note, to emphasize, that $\phi^T(k)$ is a row vector):

$$y = \begin{bmatrix} y(1) \\ \vdots \\ y(n) \end{bmatrix} = \begin{bmatrix} \phi^T(1) \\ \vdots \\ \phi^T(n) \end{bmatrix} b = A\,b \qquad (38)$$

Vocabulary:

- $b = \begin{bmatrix} b_1 & b_2 \end{bmatrix}^T$ is the vector of unknown parameters.
- $v(k) \in \mathbb{R}^1$ is an independent variable that determines the regressor vector $\phi(k) = \begin{bmatrix} v(k) & v^2(k) \end{bmatrix}^T$.
- $\phi(k) \in \mathbb{R}^p$ is the regressor vector.
- $A \in \mathbb{R}^{n\times p}$ is the regressor matrix.
- The columns of A, which correspond to the elements of $\phi(\cdot)$, are called the basis vectors or basis functions of model A.

Notation:

- $y^*,\ b^*$ : true values
- $\bar{y}$ : measured values
- $\hat{y},\ \hat{b}$ : estimated values
- $\tilde{y},\ \tilde{b}$ : errors

Generally, the true values are unknown.

## 2.1.1 Case 1: Where we have n = p independent equations, the exactly constrained case

In this case $A \in \mathbb{R}^{n\times n}$ is a square matrix, and with independent equations A is invertible:

$$A^{-1}\,A = A\,A^{-1} = I$$

The solution to Eqn (34) is given by left multiplying by $A^{-1}$:

$$y = A\,b \quad \text{(original equation)}$$

$$A^{-1}\,y = A^{-1}\,A\,b = I\,b = b \qquad (39)$$

and so, working from measured data,

$$\hat{b} = A^{-1}\,\bar{y} \qquad (40)$$

Example, with

$$\phi^T(k) = \begin{bmatrix} v(k) & v^2(k) \end{bmatrix}$$

suppose we have data for k = 1, 2, with v(1) = 1.5, v(2) = 3 and $\bar{y}(1) = 12.75$, $\bar{y}(2) = 43.5$. Then

$$A = \begin{bmatrix} \phi^T(1) \\ \phi^T(2) \end{bmatrix} = \begin{bmatrix} 1.5 & 2.25 \\ 3.0 & 9.0 \end{bmatrix}, \qquad
\hat{b} = A^{-1} \begin{bmatrix} 12.75 \\ 43.5 \end{bmatrix} = \begin{bmatrix} 2.5 \\ 4.0 \end{bmatrix}$$
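The exactly constrained example can be reproduced with a 2x2 inverse (Cramer's rule); this pure-Python sketch recovers the parameter estimate [2.5, 4.0]:

```python
# Case 1 (n = p): invert the square regressor matrix directly (Cramer's rule).
A = [[1.5, 2.25], [3.0, 9.0]]   # rows are phi^T(k) = [v(k), v(k)**2]
y = [12.75, 43.5]

det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
b = [(A[1][1] * y[0] - A[0][1] * y[1]) / det,
     (A[0][0] * y[1] - A[1][0] * y[0]) / det]
print(b)  # [2.5, 4.0]

# The recovered parameters reproduce the data exactly:
assert all(abs(A[k][0] * b[0] + A[k][1] * b[1] - y[k]) < 1e-12 for k in range(2))
```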

## 2.1.2 Case 2: Where n > p, we have more equations than unknowns, the over constrained case (Left pseudo-inverse case):

This is a common case. For example, a model has 3 parameters, and we run an experiment and collect 20 data. We have 20 equations in 3 unknowns. In this case there is generally no value of $\hat{b}$ which exactly satisfies $\bar{y} = A\,\hat{b}$.

Define the residual $\tilde{y} = \bar{y} - A\,\hat{b}$; the least-squares solution is the $\hat{b}$ minimizing $\|\tilde{y}\|_2$. It is found as follows:

1. Left multiply each side of $\bar{y} = A\,\hat{b}$ by $A^T$:

$$A^T\,\bar{y} = A^T\,A\,\hat{b}$$

Note that $A^T A$ is a p×p matrix, for example a 3×3 matrix when there are 3 unknowns.

2. When $A^T A \in \mathbb{R}^{p\times p}$ is full rank, left multiplying each side by $\left(A^T A\right)^{-1}$ gives:

$$\left( A^T A \right)^{-1} A^T\,\bar{y} = \left( A^T A \right)^{-1} \left( A^T A \right) \hat{b} = I\,\hat{b} \qquad (41)$$

3. Which gives

$$\hat{b} = \left( A^T A \right)^{-1} A^T\,\bar{y} = A^{\#}\,\bar{y} \qquad (42)$$

where $A^{\#} = \left( A^T A \right)^{-1} A^T$ is called the left pseudo-inverse of A.
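The left pseudo-inverse can be spot-checked on the regressor model of Eqn (35), reusing the noise-free data v = 2, 3, -1 generated from b = [2, 3] in Section 2.1; with exact data, the normal equations recover b exactly (a pure-Python sketch):

```python
# Left pseudo-inverse for the model y(k) = b1*v(k) + b2*v(k)**2,
# with three measurements (n = 3 > p = 2) and no noise.
vs = [2.0, 3.0, -1.0]
A = [[v, v * v] for v in vs]          # 3x2 regressor matrix
y = [16.0, 33.0, 1.0]                 # generated by b = [2, 3]

# Normal equations: (A^T A) b = A^T y, a 2x2 system.
AtA = [[sum(A[k][i] * A[k][j] for k in range(3)) for j in range(2)]
       for i in range(2)]
Aty = [sum(A[k][i] * y[k] for k in range(3)) for i in range(2)]
det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]
b = [(AtA[1][1] * Aty[0] - AtA[0][1] * Aty[1]) / det,
     (AtA[0][0] * Aty[1] - AtA[1][0] * Aty[0]) / det]
print(b)  # [2.0, 3.0]
```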

## 2.1.3 Case 3: Where n < p we have fewer equations than unknowns, the underconstrained case (Right pseudo-inverse case):

In the under-constrained case, A is a wide (row-shaped) matrix, and there are generally many values of $\hat{b}$ which exactly satisfy $y = A\,\hat{b}$. The right pseudo-inverse gives the solution with minimum $\|\hat{b}\|_2$.

The right pseudo-inverse is given by:

$$A_R^{+} = A^T \left( A\,A^T \right)^{-1} \qquad (43)$$

When $A\,A^T \in \mathbb{R}^{n\times n}$ is full rank, then

$$A\,A_R^{+} = A\,A^T \left( A\,A^T \right)^{-1} = I \qquad (44)$$

and $\hat{b}$ is given by:

$$\hat{b} = A_R^{+}\,y = A^T \left( A\,A^T \right)^{-1} y$$

Plugging Eqn (43) into $y = A\,\hat{b}$ gives:

$$A\,\hat{b} = A\,A^T \left( A\,A^T \right)^{-1} y = I\,y = y$$

demonstrating that $\hat{b} = A_R^{+}\,y$ is a solution.

Student exercise: what are the dimensions of $A_R^{+}$?

Remarks:

1. For each of cases 1, 2 and 3 we had the requirement that A, $A^T A$ or $A\,A^T$ must be full rank. When these matrices are not full rank, a more general method based on the Singular Value Decomposition (SVD) is needed.
2. With the SVD and the four fundamental spaces of matrix A, we will be able to find the set of all vectors $\hat{b}$ which solve $y = A\,\hat{b}$. We will see the SVD and four fundamental spaces of a matrix in Bay, Chapter 4.
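A minimal under-constrained example (one equation, two unknowns; the numbers are hypothetical, not from the notes) shows Eqns (43)-(44) at work:

```python
# y = A b with A = [1 2] (1x2): A_R+ = A^T (A A^T)^{-1} picks the
# minimum-norm solution among the infinitely many exact solutions.
A = [1.0, 2.0]                    # the single row of A
y = 5.0

AAt = A[0] * A[0] + A[1] * A[1]   # A A^T is the scalar 5
b_min = [A[0] * y / AAt, A[1] * y / AAt]
print(b_min)  # [1.0, 2.0]

# It solves the equation, and has smaller norm than another exact solution [5, 0]:
assert abs(A[0] * b_min[0] + A[1] * b_min[1] - y) < 1e-12
assert (b_min[0] ** 2 + b_min[1] ** 2) < (5.0 ** 2 + 0.0 ** 2)
```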

## 2.2 Differential Equations

The linear state-variable model is a very general form for modeling dynamic systems in control theory, economics and operations research, biology and other fields. The state-variable model is a differential equation with the form:

$$\dot{x}(t) = A\,x(t) + B\,u(t) \qquad (45)$$

where

- $x(t) \in \mathbb{R}^n$ is the state vector,
- $A \in \mathbb{R}^{n\times n}$ is the system matrix,
- $B \in \mathbb{R}^{n\times m}$ is the input matrix, and
- $u(t) \in \mathbb{R}^m$ is the input vector.

The state-variable model is the topic of the second half of the course. Here we only note that solving the differential equation $\dot{x}(t) = A\,x(t) + B\,u(t)$ is fundamentally different from solving the algebraic equation $y = A\,b$, and involves the eigensystem of the system matrix A, using a similarity transform and transformation to modal coordinates to determine the solution.

Just as solving an algebraic equation, $3\,x = 7$, is very different from solving a scalar differential equation, $3\,\dot{x}(t) = 2\,x(t) + 7\,u(t)$, solving a matrix differential equation requires tools very different from those for solving a matrix algebraic equation. The differences between the solution of algebraic and differential equations are summarized in Table 1.

| Equation Type | Algebraic | Differential |
|---|---|---|
| Nature of solution | A value | A function of time |
| Main tools | Gaussian elimination, matrix inverse, SVD | Eigenvalues, eigenvectors |
| Singular matrix | Problems, solution requires SVD | O.K. |
| Complete set of eigenvectors | Not important | Important |
| Rectangular matrix | OK | Impossible |

Table 1: Differences in the approach to algebraic and differential equations. A singular matrix is rank deficient.

## 3 Summary

Basics of linear algebra have been reviewed, and the two problems of linear algebra have been discussed. Algebraic equations,

$$y = A\,b \qquad (46)$$

take us into vector subspaces, projection and the Singular Value Decomposition (SVD). Differential equations,

$$\dot{x}(t) = A\,x(t) + B\,u(t) \qquad (47)$$

take us into the eigensystem of the system matrix A.

# Part 2: Vectors and Vector Spaces

(Revised: Sep 10, 2012)

Contents

1 Introduction
  1.1 Definitions: three ways to multiply vectors
  1.2 Definitions: Properties of Vectors
      1.2.1 Parallel
      1.2.3 Norm (magnitude)
  1.3 Direction Cosine
2 Vector Spaces
4.2 A Span of a Vector Space
4.3 Change of basis
    4.3.3 Change of basis example
4.4 Change of basis viewed geometrically (this section is connected to Bay, section 2.2.4)
    4.4.1 Example, change of basis viewed geometrically
    4.4.2 Numerical example based on representing "from" basis vectors on the "to" basis vectors
    4.4.3 Summary
6 Projection Theorem
  6.1 Projection Theorem
  6.4 Projection Matrices
      6.4.3 Projection with normalized basis vectors
7 Gram-Schmidt ortho-normalization
  7.2 Projection matrix with GS Ortho-normalization
  7.3 Projection Coefficients
8.1.3 Bases for the Four Fundamental Spaces, Numerical Example
8.1.5 Questions that can be answered with the four fundamental spaces
9.3 Study tip

## 1 Introduction

A vector: an n-tuple of numbers or functions. Examples: a 3-vector x; an n-vector z = [z1; z2; ... ; zn]; a 4-vector of functions of sample k, x(k) = [sin(2πk/4); cos(2πk/4); sin(4πk/4); cos(4πk/4)]; a 2-vector of functions of time, u(t).

Euclidean vector: a Euclidean n-space corresponds to our intuitive notion of 2-D or 3-D space:

- Axes are orthogonal
- Each vector corresponds to a point
- Intuitive because we live in Euclidean 3-space, R³
- Length (the L2-norm) corresponds to our every-day notion of length

## 1.1 Definitions: three ways to multiply vectors

Three types of vector products are:

Inner (scalar) product (also dot product): a scalar measure of the interaction of two vectors of the same dimension,

⟨v, w⟩ = vᵀ w = wᵀ v = Σᵢ₌₁ⁿ vᵢ wᵢ ,  where v, w ∈ Rⁿ        (1)

If v is complex, then vᵀ is the complex-conjugate transpose (note: in Matlab the ' transpose operation forms the complex conjugate). The inner product of two vectors gives a scalar:

⟨v, w⟩ = vᵀ w = (a scalar)

Outer (matrix) product:

v >< w = v wᵀ = (a matrix)

Example:

v = [1; 2; 3] ,  w = [2; 3; 1] ,  ⟨v, w⟩ = 11 ,  v >< w = [2 3 1; 4 6 2; 6 9 3]

Example of using the outer product: each contribution to the covariance of the parameter estimates is given by the outer product of the estimation error with itself, êb êbᵀ.
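The inner and outer products of the example above can be checked directly. A NumPy sketch (the notes use Matlab; the first elements of v and w are reconstructed from the surrounding arithmetic):

```python
import numpy as np

# v and w as in the example above (first elements reconstructed)
v = np.array([1, 2, 3])
w = np.array([2, 3, 1])

inner = v @ w            # <v, w> = v^T w, a scalar
outer = np.outer(v, w)   # v >< w = v w^T, an n x n matrix

print(inner)   # 11
print(outer)   # [[2 3 1], [4 6 2], [6 9 3]]
```

Note the difference in shape: the inner product collapses two n-vectors to a scalar, while the outer product expands them to an n × n matrix.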

Cross product:

v × w = z        (2)

The magnitude of the resulting vector is given by:

||z|| = ||v|| ||w|| sin(θ)        (3)

where θ is the angle between v and w. The cross-product is only defined for vectors in 3-space. It is used, for example, in electro-magnetics, 3D geometry and image metrology.

The cross product and outer product are not much used in EE/ME 701; the inner (or dot) product will be used all the time.

1.2.1 Parallel

w = a v        (4)

where a is a scalar. Example:

v = [1; 2; 3] ,  w = [3; 6; 9] ,  w = 3 v ,  so  v ∥ w

"Co-linear" is another term for parallel vectors.

1.2.2 Perpendicular (or Orthogonal)

If

⟨v, w⟩ = 0        (5)

then v and w are said to be perpendicular (or, equivalently, orthogonal). Example:

v = [1; 2; 3] ,  w = [1; −5; 3] ,  vᵀ w = 1 − 10 + 9 = 0 ,  so  v ⊥ w        (6)
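The parallel and perpendicular tests above are one-liners in code. A NumPy sketch using the vectors of the two examples (as reconstructed here):

```python
import numpy as np

v = np.array([1, 2, 3])
w_par = 3 * v                   # Eqn (4): w = a v with a = 3, so v || w
w_perp = np.array([1, -5, 3])   # from the perpendicularity example

# Parallel: the cross product of parallel 3-vectors is the zero vector,
# consistent with Eqn (3), since sin(0) = 0.
print(np.cross(v, w_par))       # [0 0 0]

# Perpendicular: the inner product is zero, Eqn (5).
print(v @ w_perp)               # 1 - 10 + 9 = 0
```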

## 1.2.3 Norm (magnitude)

The norm of a vector x is a non-negative real number, ||x|| ≥ 0.

The L_p norms are the most familiar. (L: Henri Lebesgue (1875-1941), who set calculus (integration and differentiation) on a rigorous foundation based on set theory.)

The L2 norm, ||v||_L2 (or ||v||_2, or simply ||v||), is the familiar notion of distance:

||v||_2 = ( Σᵢ vᵢ² )^(1/2)        (7)

The general L_p norm, ||v||_Lp (or simply ||v||_p):

||v||_p = ( Σᵢ |vᵢ|ᵖ )^(1/p)        (8)

The L2 norm is the default; throughout the literature, ||v|| refers to ||v||_2 unless expressly defined to be a different norm (or any norm).

The L2 norm is related to the dot product of a vector with itself:

||v||_2 = ⟨v, v⟩^(1/2) = ( vᵀ v )^(1/2) = ( Σᵢ vᵢ² )^(1/2)        (9)

In a general framework, vectors may be made with elements which are not numbers, but so long as the dot product is defined, the 2-norm is still given by Eqn (9). The L2-norm is said to be induced by the dot product; it is called the induced norm.

## 1.3 Direction Cosine

The inner product indicates how closely two vectors are related. Define θ by:

⟨v, w⟩ / ( ||v|| ||w|| ) = cos(θ)        (10)

or

θ = cos⁻¹( ⟨v, w⟩ / ( ||v|| ||w|| ) )        (11)

where θ is the angle between v and w. Special cases:

Parallel vectors: |⟨v, w⟩| = ||v|| ||w|| , or ⟨v, w⟩ / ( ||v|| ||w|| ) = ±1.0

Perpendicular vectors: ⟨v, w⟩ = 0 , or ⟨v, w⟩ / ( ||v|| ||w|| ) = 0
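Eqns (7)-(11) can be exercised in a few lines. A NumPy sketch with illustrative vectors (not the notes' own):

```python
import numpy as np

v = np.array([3.0, 4.0])
w = np.array([4.0, -3.0])

l2 = np.linalg.norm(v)         # Eqn (7): sqrt(9 + 16) = 5
l1 = np.linalg.norm(v, 1)      # Eqn (8) with p = 1: |3| + |4| = 7
induced = np.sqrt(v @ v)       # Eqn (9): the norm induced by the dot product

# Direction cosine, Eqns (10)-(11)
cos_theta = (v @ w) / (np.linalg.norm(v) * np.linalg.norm(w))
theta = np.arccos(cos_theta)

print(l2, l1, induced)   # 5.0 7.0 5.0
print(theta)             # pi/2: v and w are perpendicular
```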

## 1.4 Parallel, Perpendicular and the Zero Vector

Looking at Eqn (4),

w = a v        (4, repeated)

with a = 0,

w = 0 v = [0; 0; ... ; 0]

so, formally, the zero vector is parallel to every vector. And likewise, if w = 0, then clearly

⟨v, w⟩ = 0        (5, repeated)

So in a formal sense, we can say that every vector is both parallel and perpendicular to the zero vector.

## 2 Vector Spaces

When we talk about vector spaces we talk about sets of vectors. Closure is the first important property of sets.

Rⁿ is the most basic vector space. It is the set of all vectors with n elements.

Closure: A set is said to be closed under an operation if the operation is guaranteed to return an element of the set. Examples:

The integers are closed under addition, subtraction and multiplication; the integers are not closed under division.

The positive integers are closed under addition and multiplication; the positive integers are not closed under subtraction or division.

Vector spaces are collections of vectors which are closed under scaled addition. For a vector space S:

S : {a set of vectors}
if x1, x2 ∈ S, then x3 = a1 x1 + a2 x2 ∈ S        (12)

The central topic of Bay, chapter 2, is describing the ai, xi and S in the most general possible way. Let's start with a specific example, S = R³, the space in which we live.

## 2.1 Properties the scalars in Eqn (12) must have

Eqn (12) comprises scalars and vectors. Defining a vector x:

x = [x1; x2; x3] ,  x1, x2, x3 ∈ R        (13)

Then S = {x} = R³: the vector space S is the set of all vectors x comprised of 3 real numbers.

Verifying that a1 x1 + a2 x2 ∈ S with an example (choosing the ai to be real numbers):

2.0 [1; 2; 1] + 1.0 [0; 1; 3] = [2; 5; 5]        (14)

The scalars must be elements of a field. For a set with operations to form a field, it must have:

1. An additive identity 0:  a + 0 = a        (15)

2. A multiplicative identity 1:  a · 1 = a        (16)

3. An additive inverse:  if a ∈ F, then −a ∈ F, with  a + (−a) = 0        (17)

4. Operations of addition, multiplication and division must be defined, and the set must be closed under these operations.

Examples:
The integers do not form a field (not closed under division).
The rational numbers do form a field (+, −, ×, ÷ all work).
The field we will use most often is the real numbers, a ∈ R.

A vector space is a set of vectors. It must have at least these properties:

1. S must have at least 2 elements.

2. The linear vector space S must contain a zero element, 0, such that:

   x + 0 = x ,  ∀ x ∈ S        (18)

   (∀: "for all")

3. Vector space S must contain the additive inverse of each element:

   if x ∈ S, then ∃ y ∈ S s.t. x + y = 0        (19)

   (∃: "there exists")

The vector space (a set of vectors) and its operations must also have the properties:

1. The vector space S must be closed under addition:

   if x + y = v, then v ∈ S        (20)

2. Commutative property for addition:

   x + y = y + x        (21)

3. Associative property for addition:

   (x + y) + z = x + (y + z)        (22)

4. Closure under scalar multiplication: for all x ∈ S and for all a ∈ F, the vector y = a x must be an element of S:

   ∀ x ∈ S, ∀ a ∈ F ,  y = a x ∈ S        (23)

5. Associativity of scalar multiplication:

   ∀ a, b ∈ F and x ∈ S ,  a (b x) = b (a x)        (24)

6. Distributive properties:

   (a + b) x = a x + b x        (25)

   a (x + y) = a x + a y        (26)

These are the fundamental properties of vector spaces and operations on vectors. Other important properties, quantities and calculations are built out of these basics. The fundamental properties of fields are such standard components of standard algebra that we forget to think about them.

Any vector space (that is, any set of vectors which satisfies the fundamental properties above) will have the derived properties of vector spaces which we will develop, such as orthogonality, projection, length, etc.

For 99% of what we will do in EE/ME 701, we will work with vectors in Rⁿ over the field of the real numbers, which is to say the familiar vectors and scalars, such as in Eqn (14).

## 3 Properties of ensembles of vectors

In many cases it is interesting to consider a finite set of vectors, such as:

{ v1, v2, ..., vp } ⊂ S        (27)

Example:

V = { [1; 2; 0] , [0; 0; 1] , [1; 1; 1] } ⊂ R³

We shall shortly see that a finite set of vectors can define a vector space.

Notation for a finite set of vectors: let V connote a set with p vectors:

V = { v1, v2, ..., vp } ,  or equivalently ,  V = [ v1 v2 ... vp ]        (28)

New vectors can be formed as linear combinations of vectors v1 ... vp:

w = a1 v1 + a2 v2 + ... + ap vp

Recall that a set of vectors is linearly dependent if one or more of the vectors can be expressed as a linear combination of the others, such as:

vp = b1 v1 + b2 v2 + ... + b(p−1) v(p−1)

The expression for linear dependence can also be written:

vp = [ v1 v2 ... v(p−1) ] [ b1; b2; ... ; b(p−1) ] = V b        (29)

And recall that a set of vectors is linearly independent if there is no set of values {bi} which satisfies the condition for linear dependence. In other words,

Σᵢ₌₁ᵖ bᵢ vᵢ = 0  implies that the {bi} are all zero.
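Linear dependence is easy to test numerically: a set {v1, ..., vp} is independent iff the matrix [v1 ... vp] has rank p. A NumPy sketch with hypothetical vectors (the set in the notes' example is only partly legible), including recovery of the dependence coefficients b of Eqn (29):

```python
import numpy as np

v1 = np.array([1.0, 2.0, 0.0])
v2 = np.array([0.0, 0.0, 1.0])
v3 = v1 + 2 * v2                 # deliberately dependent: v3 = 1*v1 + 2*v2

V_indep = np.column_stack([v1, v2])
V_dep = np.column_stack([v1, v2, v3])

# Independent iff rank([v1 ... vp]) = p
print(np.linalg.matrix_rank(V_indep))   # 2 -> the pair {v1, v2} is independent
print(np.linalg.matrix_rank(V_dep))     # 2 < 3 -> the triple is dependent

# Recover b in v3 = V b, as in Eqn (29)
b, *_ = np.linalg.lstsq(V_indep, v3, rcond=None)
print(b)                                 # [1. 2.]
```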

3.1.1 A vector space defined by a span

In many cases, vector spaces contain an infinite number of vectors, so it is not possible to define a vector space by writing down all its elements. However, we can define a vector space in terms of a finite set of vectors. Most often, we define a vector space as the set of all vectors which can be created by linear combinations of specified vectors. For example:

S1 = { x : x = a1 [2; −1; 1] + a2 [1; 0; 2] ,  a1, a2 ∈ R }        (30)

Vector space S1 is the set of all vectors x such that x = a1 [2; −1; 1] + a2 [1; 0; 2].

Given an arbitrary set of p n-dimensional vectors V = {v1, v2, ..., vp}, vi ∈ Rⁿ, the set of vectors defines a vector space

S = { x : x = a1 v1 + a2 v2 + ... + ap vp }        (31)

where S is a vector space; S is the set of all vectors formed as linear combinations of the spanning vectors {v1, v2, ..., vp}.

We can say that S is the space spanned by v1, v2, ..., vp, and that vectors v1, v2, ..., vp span vector space S. Any set of vectors, at least one of which is non-zero, spans a vector space.

3.1.2 A vector space defined by a set of basis vectors

A span is a very general concept, perhaps a bit too general. The concept of a basis for a vector space is more restrictive. It works like a span, in that the set of basis vectors {vi} defines a vector space according to Eqn (31), but for a set {vi} to be a set of basis vectors they must additionally be linearly independent. When the vectors vi are independent and we write

x = a1 v1 + a2 v2 + ... + ar vr        (32)

there is a unique solution {a1, a2, ..., ar} for every vector x ∈ S. Because the basis vectors vi are linearly independent, there is exactly one solution for Eqn (32).

A set of basis vectors is the minimum set of vectors that spans a space.

For a given vector space S, the choice of basis vectors is not unique.

(Student exercise: is any span of S also a basis of S?)
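The span-versus-basis distinction shows up numerically as unique versus non-unique coefficients. A NumPy sketch with hypothetical vectors: on an independent basis, Eqn (32) has exactly one solution, while a redundant spanning set admits infinitely many.

```python
import numpy as np

v1 = np.array([2.0, -1.0, 1.0])
v2 = np.array([1.0, 0.0, 2.0])
B = np.column_stack([v1, v2])            # a basis: 2 independent columns
S = np.column_stack([v1, v2, v1 + v2])   # a span with a redundant vector

x = 2 * v1 + 3 * v2

# On the basis, the coefficients are unique (full column rank):
a, res, rank, _ = np.linalg.lstsq(B, x, rcond=None)
print(a, rank)        # [2. 3.]  2

# On the redundant span there are many coefficient sets;
# e.g. [2, 3, 0] and [1, 2, 1] both reproduce x:
print(np.allclose(S @ np.array([2.0, 3.0, 0.0]), x),
      np.allclose(S @ np.array([1.0, 2.0, 1.0]), x))   # True True
```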

3.2 Dimension

Dimension: The dimension of a linear vector space is the largest number of linearly independent vectors that can be taken from that space.

Examples:
The dimension of vector space Rⁿ is n.
The dimension of the vector space given by Eqn (33) is 2.

Definition of vector space S1:

S1 = { x : x = a1 [2; −1; 1] + a2 [1; 0; 2] ,  a1, a2 ∈ R ,  x ∈ R³ }        (33)

Set S1 is a vector space. Thus, if v and w are elements of S1, then

z = b1 v + b2 w ,  z ∈ S1 ,  b1, b2 ∈ R

Consider the properties of a vector space one by one; you will see that they all hold. (Student exercise.)

A vector subspace is itself a vector space (if we call S1 a subspace, we are emphasizing that it is embedded in a larger vector space).

The dimension of S1 is the number of linearly independent vectors that can be selected from S1. Note that even though v, w, z ∈ R³, it is still true that dim(S1) = 2.

Example: the universe is R³; S1 is 2-D.

Figure 2: S1, a 2-dimensional vector subspace embedded in R³.

Working with a particular vector space V of vectors v, x, y ∈ Rⁿ, if we select a set of vectors BV = {v1, v2, ..., vr} from V such that

1. We have r vectors, where r is the dimension of vector space V, and

2. The vectors are linearly independent

then BV is a basis for V.

Basis: A set of linearly independent vectors, BV, in vector space V is a basis for V iff every vector in V can be written as a unique linear combination of vectors from this set.

Or written mathematically, BV = {v1, v2, ..., vr} is a basis for V iff

∀ x ∈ V, ∃ exactly one a = {a1, ..., ar} s.t. x = a1 v1 + ... + ar vr        (34)

Example basis:

BV = { [2; −1; 1] , [1; 0; 2] }

Basis BV is illustrated by the two vectors in figure 2. A basis for V is not unique; any set of r independent vectors in V is a basis for V.

VECTOR REPRESENTATION THEOREM: Given a set of vectors BV = {vi} = {v1, v2, ..., vr} that are basis vectors for the r-dimensional vector space V embedded in Rⁿ, then any vector sx ∈ V can be represented as a unique linear combination of basis vectors

x = a1 v1 + a2 v2 + ... + ar vr        (35)

with x, v1, ..., vr ∈ Rⁿ.

Proof:

Existence: Since V is r-dimensional and the set of vectors {x, v1, v2, ..., vr} contains r + 1 vectors, the set must be linearly dependent. Since the set is linearly dependent there exists a set of scalars ai (called the basis coefficients) such that −x + a1 v1 + a2 v2 + ... + ar vr = 0, which gives

x = a1 v1 + a2 v2 + ... + ar vr        (36)

Uniqueness: Suppose the ai giving vector x were not unique. This means there is a second set of basis coefficients {bi}, distinct from the values {ai}, such that

x = b1 v1 + b2 v2 + ... + br vr        (37)

Subtracting Eqn (37) from Eqn (36) gives

0 = (a1 − b1) v1 + (a2 − b2) v2 + ... + (ar − br) vr        (38)

But since the vi are linearly independent, Eqn (38) is possible only if each of the (ai − bi) = 0, which is to say that {ai} = {bi}, and so the basis coefficients representing vector x on {vi} are unique.

Eqn (36) gives us a way to represent a vector x on the basis vectors.

Example:

V = {v1, v2} = { [2; −1; 1] , [1; 0; 2] } ,  x = [7; −2; 8]

Inspection shows that

x = 2 v1 + 3 v2 = [ v1 v2 ] [2; 3]

Definition: The vector a = [2; 3] is the representation of the vector x on basis V.

Notice that x, v1, v2 ∈ R³, while a ∈ R².

Vectors x, v1, v2 have the dimension of the universe in which vector subspace V is embedded. The representation of a vector on a subspace always has the dimension of the subspace (which may be less than the dimension of the universe).

Sometimes we are interested in multiple bases. In some cases a vector x will have a representation on each basis. It helps to have some notation to keep track of things.

Given a vector space S1 of vectors sx ∈ Rⁿ with a basis V = [v1, v2] and also with a different basis W = [w1, w2], then a vector sx ∈ S1 can be represented 3 ways:

sx ,  Vx ,  Wx        (39)

where the left superscript indicates the basis on which the vector is represented. The vector sx is represented on the standard or Euclidean basis. This is the ordinary representation of the vector (more on this later).

Example: given the bases V and W for vector subspace S1, and given the vector sx below, find the representation of sx on each of bases V and W.

sx = [7; −2; 8] ,  V = [v1, v2] = [2 1; −1 0; 1 2] ,  W = [w1, w2]

Solution: since the columns of V are one basis for S1, we know there exist basis coefficients a1 and a2 to represent sx on V. The relationships can be written:

sx = a1 v1 + a2 v2 = [v1, v2] [a1; a2] = V Vx ,  so  Vx = [a1; a2]

When r < n, we can solve for the basis coefficients with the left pseudo-inverse. Recalling that it is given that sx ∈ S1, then since sx = V Vx,

Vx = (Vᵀ V)⁻¹ Vᵀ sx        (40)

The calculation of Eqn (40) always gives a well defined result. The only calculation that might not give a valid result is (Vᵀ V)⁻¹; we are assured the solution exists by the fact that the columns of V (the basis vectors) are linearly independent.

Example:

>> V
V =
     2     1
    -1     0
     1     2
>> x
x =
     7
    -2
     8
>> Vx = inv(V'*V) * V' * x
Vx =
     2
     3
>> V * Vx        %% should give sx
ans =
     7
    -2
     8

Notice that x ∈ R³ while Vx ∈ R². The vector has the dimension of the universe, while the representation of the vector has the dimension of the subspace.

Similarly, for the W basis,

sx = b1 w1 + b2 w2 = [w1, w2] [b1; b2] = W Wx ,  Wx = (Wᵀ W)⁻¹ Wᵀ sx

Solving, we find Wx = [2.4; 0.2].

4.1.2 Sn: the standard basis

A basis that we have been using all along is the standard basis in Rⁿ. For R³ the basis vectors are:

S3 = [1 0 0; 0 1 0; 0 0 1] = [ s1 s2 s3 ]        (41)

So we might write x = S3 sx. In words: every vector x is represented on the standard basis by the representation sx, with sx = x, and Sn = I (the n × n identity matrix) is the set of standard basis vectors.
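The left pseudo-inverse of Eqn (40) translates directly to NumPy. This sketch uses the basis and vector of the example above (values as reconstructed here; the notes' own transcript is in Matlab):

```python
import numpy as np

V = np.array([[2.0, 1.0],
              [-1.0, 0.0],
              [1.0, 2.0]])
sx = np.array([7.0, -2.0, 8.0])

# Eqn (40): Vx = (V^T V)^-1 V^T sx, the left pseudo-inverse of V
Vx = np.linalg.inv(V.T @ V) @ V.T @ sx
print(Vx)        # [2. 3.]
print(V @ Vx)    # recovers sx, since sx lies in the subspace spanned by V
```

In practice `np.linalg.lstsq(V, sx, rcond=None)` computes the same coefficients with better numerical behavior than forming (VᵀV)⁻¹ explicitly.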

Thus we have found three ways to represent vector x:

1. On the standard basis:  x = S3 sx

2. On the V basis:  x = V Vx = a1 v1 + a2 v2

3. And on the W basis:  x = W Wx = b1 w1 + b2 w2

Basis Facts:

A representation is always on a specific basis: Vx is the representation of vector x on basis V. You can't talk about the representation of a vector without specifying the basis.

Notice that sx, vi, wi ∈ R³, whereas Vx, Wx ∈ R². This is because, for this example, the universe has dimension n = 3, while the vector subspace has dimension r = 2. Each vector in a universe has the dimension of the universe, but a representation of a vector has the dimension of the basis on which it is represented.

The representation of a given vector on a given basis is unique.

We are chiefly concerned with vectors of numbers, but there are other types of vectors and vector spaces, and our notions of basis and representation carry over.

Consider, for example, the set of polynomials of s of degree up to 2, such as p(s) = 3 + 7 s + s². It turns out that this set is a vector space. We can define the vector space P by

P = { p(s) : p(s) = a1 + a2 s + a3 s² }

If we choose the set of basis functions

P = { p1(s) = 1 , p2(s) = s , p3(s) = s² }

then each polynomial has a representation

p(s) = P px ,  with  px = [a1; a2; a3]

Even when the vectors of the vector space are not vectors of numbers, a representation of a vector will be (because the representation is always an r-dimensional vector of basis coefficients).

There is a one-to-one correspondence between a vector p ∈ P and its representation px ∈ Rʳ on basis P. The representation is the vector of basis coefficients. Equivalently, the vector and its representation are isomorphic. Useful tools, such as dot product and norm, can be applied to px.
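The polynomial example above can be made concrete: p(s) = 3 + 7s + s² is fully captured by its representation on the basis {1, s, s²}, and vector tools apply to that representation. A short sketch:

```python
import numpy as np

# Representation of p(s) = 3 + 7 s + s^2 on the basis {1, s, s^2}
px = np.array([3.0, 7.0, 1.0])

def p(s):
    # Evaluate p(s) from its representation: px . [1, s, s^2]
    return px @ np.array([1.0, s, s ** 2])

print(p(2.0))                 # 3 + 14 + 4 = 21

# Vector tools, such as the norm, apply to the representation:
print(np.linalg.norm(px))     # sqrt(9 + 49 + 1)
```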

## 4.2 A Span of a Vector Space

Just as a set of basis vectors defines a vector space, a set of spanning vectors {v1, v2, ..., vp} defines a vector space:

Vector Space V = { x : x = a1 v1 + a2 v2 + ... + ap vp }        (42)

The vector space spanned by vectors v1, v2, ..., vp is the set of all vectors given as linear combinations of the spanning vectors. A vector space V is spanned by a set of vectors {vi} if every x ∈ V can be written as a linear combination of the vi's.

We can say that the vi's span Vector Space V, and write V = span{vi}.

All bases are spans, but not all spans are bases. The difference: a span is any set of vectors, for example:

V = span{ [1; 2] , [0; 5] , [3; 1] , [1; 1] }        (43)

defines vector space V. Compare Eqn (42) with Eqn (43); both define a vector space V. The example shows that a span may have redundant vectors. Equivalently, the vectors of the spanning set may be linearly dependent. Shortly, we will learn how to form a basis from any span.

## 4.3 Change of basis

It may be interesting to transform a vector from its representation on one basis to its representation on another basis. A change of basis is important, for example, for exploring some properties of state variable models.

Example:

sx = V Vx = [2 1; −1 0; 1 2] [4; 5] = [13; −4; 14]

Find Wx, the representation of sx on basis W, with

W = [1 7; 1 2; 3 4]

We actually know how to solve this problem, since we know how to find Wx from sx using W (see Eqn (40) in section 4.1.1). However, it will be handy to introduce a more general notation that expresses from what basis and to what basis a vector is being transformed.

Since

Wx = (Wᵀ W)⁻¹ Wᵀ sx = (Wᵀ W)⁻¹ Wᵀ V Vx        (44)

the transformation from the V representation to the W representation is

(Wᵀ W)⁻¹ Wᵀ V = [0.60 0.40; 0.20 −0.20]

and

Wx = [0.60 0.40; 0.20 −0.20] [4; 5] = [4.4; −0.2]
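The change-of-basis matrix of Eqn (44) can be sketched in NumPy with hypothetical bases. Here W is built from V by an invertible 2 × 2 mixing matrix M, so both matrices are bases for the same 2-D subspace of R³; in that construction the transformation (WᵀW)⁻¹WᵀV works out to M⁻¹.

```python
import numpy as np

V = np.array([[2.0, 1.0],
              [-1.0, 0.0],
              [1.0, 2.0]])
M = np.array([[1.0, 1.0],
              [0.0, 1.0]])
W = V @ M           # w1 = v1, w2 = v1 + v2: the same span as V

# Transformation from the V representation to the W representation
WvT = np.linalg.inv(W.T @ W) @ W.T @ V

Vx = np.array([2.0, 3.0])
sx = V @ Vx
Wx = WvT @ Vx
print(np.allclose(W @ Wx, sx))   # True: both representations give the same vector
print(WvT)                        # equals inv(M) here, since W = V M
```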

4.3.1 Notation for a change of basis

To change from one representation to another requires a transformation matrix:

Wx = W_V T Vx        (45)

The symbol W_V T (T with left subscript V and left superscript W) indicates that transformation matrix T converts a vector from its representation on basis V (left subscript) to basis W (left superscript). Using the left sub- and super-scripts leaves the right positions open; for example, W_V T1 and W_V T2 might be the transformations from V to W at t1 and t2.

For the transformation from s to V, we've seen that the basis coefficients are given by

Vx = (Vᵀ V)⁻¹ Vᵀ sx ,  so  V_s T = (Vᵀ V)⁻¹ Vᵀ        (47)

Note: when V is a square matrix,

V_s T = V⁻¹        (46)

Since (Eqn (39))

sx = F Fx ,  Fx = F_s T sx ,  and  Gx = G_F T Fx

it follows that

G_s T = G_F T F_s T        (48)

4.3.3 Change of basis example

Find the transformation matrix to convert vector sx to its representation on basis F, and find the transformation matrix to convert this to basis G:

sx = [10; 4; 11] ,  F = [2 2; −1 2; 1 3] ,  G = [2 4; −4 1; −1 4]

Solution: we first find F_s T and G_F T, and then evaluate Fx = F_s T sx and Gx = G_F T Fx.

%% Find the two transformation matrices
>> FsT = inv(F'*F)*F'
FsT =
    0.3117   -0.3506    0.0260
    0.0260    0.2208    0.1688
>> GFT = inv(G'*G)*G'*F
GFT =
    0.3333   -0.3333
    0.3333    0.6667

%% Now, evaluating Fx and Gx
>> sx = [10; 4; 11]
sx =
    10
     4
    11
>> Fx = FsT * sx
Fx =
     2
     3
>> Gx = GFT * Fx
Gx =
   -0.3333
    2.6667

%% Both representations give back sx
>> F*Fx
ans =
    10
     4
    11
>> G*Gx
ans =
   10.0000
    4.0000
   11.0000

## 4.4 Change of basis viewed geometrically (this section is connected to Bay, section 2.2.4)

In section 4.3 we saw vectors represented on the standard basis and two particular bases,

sx ,  Fx , and  Gx

with bases

F = [2 2; −1 2; 1 3] ,  G = [2 4; −4 1; −1 4]        (49)

and

G_F T = (Gᵀ G)⁻¹ Gᵀ F        (50)

Bay presents a second way to derive the coordinate transformation in section 2.2.4, based on representing the basis vectors of the first basis in the second basis. Since F and G are sets of basis vectors for the same vector space, we can represent each basis vector in the other basis.
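The Matlab transcript above can be reproduced in NumPy (same F, G and sx; only the language differs):

```python
import numpy as np

F = np.array([[2.0, 2.0], [-1.0, 2.0], [1.0, 3.0]])
G = np.array([[2.0, 4.0], [-4.0, 1.0], [-1.0, 4.0]])
sx = np.array([10.0, 4.0, 11.0])

FsT = np.linalg.inv(F.T @ F) @ F.T          # s -> F
GFT = np.linalg.inv(G.T @ G) @ G.T @ F      # F -> G, Eqn (50)

Fx = FsT @ sx
Gx = GFT @ Fx
print(Fx)              # [2. 3.]
print(Gx)              # [-0.3333  2.6667]
print(F @ Fx, G @ Gx)  # both recover sx
```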

Writing the basis vectors, explicitly showing them to be represented on the standard basis (these are just the basis vectors shown in Eqn (49); for example, sf1 = [2; −1; 1]):

F = [ sf1  sf2 ] ,  G = [ sg1  sg2 ]        (51)

We can form G_F T by representing the F basis vectors on the G basis, F = [ G Gf1  G Gf2 ]. Now

x = F Fx = [ G Gf1  G Gf2 ] Fx = G [ Gf1 Gf2 ] Fx = G Gx        (52)

So

Gx = [ Gf1 Gf2 ] Fx = G_F T Fx        (53)

G_F T = [ Gf1 Gf2 ]        (54)

The transformation from one coordinate frame to another is given as the representation of the "from" basis vectors on the "to" basis vectors.

>> F = [ 2 2
        -1 2
         1 3 ]
>> G = [ 2 4
        -4 1
        -1 4 ]
>> sf1 = F(:,1)
sf1 =
     2
    -1
     1
>> sf2 = F(:,2)
sf2 =
     2
     2
     3
%% Find the representation of the F basis vectors on G
>> Gf1 = inv(G'*G) * G' * sf1
Gf1 =
    0.3333
    0.3333
>> Gf2 = inv(G'*G) * G' * sf2
Gf2 =
   -0.3333
    0.6667

4.4.1 Example, change of basis viewed geometrically

In robotics, rotations are used to change the expression of a vector from one coordinate frame to another (that is, a change of basis). The rotation matrix from B to A is given by the axes of B expressed in A:

A_B R = [ A X_B  A Y_B  A Z_B ]        (55)

where the columns are the X, Y and Z axes of frame B represented in frame A.
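A planar analogue of Eqn (55) makes the point compactly: the columns of a rotation matrix are the axes of the "from" frame expressed in the "to" frame, and because those columns are orthonormal the inverse change of basis is just the transpose. An illustrative sketch (not from the notes):

```python
import numpy as np

# 2-D rotation: columns of A_R_B are the axes of frame B represented in frame A
theta = np.pi / 6
A_R_B = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])

Bp = np.array([1.0, 0.0])   # a point expressed in frame B
Ap = A_R_B @ Bp             # the same point expressed in frame A
print(Ap)                   # [cos(30 deg), sin(30 deg)]

# Rotation matrices are orthonormal, so the reverse transformation is R^T:
print(np.allclose(A_R_B.T @ Ap, Bp))   # True
```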

4.4.2 Numerical example based on representing "from" basis vectors on the "to" basis vectors

%% Transformation from F to G
>> GFT = [Gf1 Gf2]
GFT =
    0.3333   -0.3333
    0.3333    0.6667

%% Representation of x on F
>> Fx = [2; 3]
Fx =
     2
     3

%% Representation of x on G
>> Gx = GFT * Fx
Gx =
   -0.3333
    2.6667

%% Both representations recover sx
>> sx = F * Fx
sx =
    10
     4
    11
>> sx = G * Gx
sx =
   10.0000
    4.0000
   11.0000

4.4.3 Summary

Looking at the way G_F T is given,

G_F T = [ Gf1 Gf2 ]

A coordinate transformation is given by representing the "from" basis vectors on the "to" basis.

Note that the equations are, of course, the same. In section 4.3 we wrote

G_F T = (Gᵀ G)⁻¹ Gᵀ F

and here

G_F T = [ Gf1 Gf2 ] = [ (Gᵀ G)⁻¹ Gᵀ sf1  (Gᵀ G)⁻¹ Gᵀ sf2 ] = (Gᵀ G)⁻¹ Gᵀ [ sf1 sf2 ] = (Gᵀ G)⁻¹ Gᵀ F

which is the transformation from F to G.

## 5 Vector Subspace (following Bay 2.4)

Vector subspace: A vector space of dimension r, comprising vectors of n-dimensional elements, where r ≤ n. A vector subspace is a type of vector space; it must demonstrate all of the properties of a vector space.

Proper vector subspace: A vector space of dimension r, comprising vectors of n elements, where r < n.

A vector subspace is defined by a basis or a span, for example

S1 = { x : x = a1 [2; −1; 1] + a2 [1; 0; 2] ,  a1, a2 ∈ R }        (56)

Then B = { [2; −1; 1] , [1; 0; 2] } is a basis for S1.

(Student exercise: make up another span and basis for S1.)

Recall the terminology from set theory: subset and proper subset. From the set of colors (the universe) {blue, green, red}:

{blue, green, red} ⊆ Colors ,  a subset

{blue, red} ⊂ Colors ,  both a subset and a proper subset

A vector space is a subset of the universe in which it is embedded. For example

S2 = { x : x = a1 [1; 0; 0] + a2 [0; 1; 0] + a3 [0; 0; 1] ,  a1, a2, a3 ∈ R } ,  S2 ⊆ R³

A proper vector subspace is a proper subset of the universe in which it is embedded. For example

S1 = { x : x = a1 [2; −1; 1] + a2 [1; 0; 2] ,  a1, a2 ∈ R } ,  S1 ⊂ R³

Terminology

An example 2-D subspace embedded in R³:

S1 = { x : x = a1 [2; −1; 1] + a2 [1; 0; 2] ,  a1, a2 ∈ R }        (57)

Figure 3: 2D vector subspace embedded in R³.

Recall that vector space S1:
- must include the zero element.
- must be closed under scalar multiplication.
- must be closed under vector addition.

Given S and B are vector spaces, if B ⊂ S (if B is a proper subset of S), then we can say that vector space B is embedded in vector space S. Such as: "Vector space S1 is embedded in R³."

"Proper vector subspace" is used to emphasize that r = dim(B) < dim(S).

Often we omit the words "proper" and "vector", such as saying: "Subspace B is embedded in S, which is isomorphic to R³." "Sub" can also be omitted.

If a vector space is 2 dimensional, then dim(B) = 2 and we say that B is 2-D, such as: "B is a 2-D subspace in R⁷." If dim(B) = r: "B is an r-D subspace in Rⁿ."

5.1.1 Observations on subspaces of Euclidean spaces

B with dim(B) = 0 contains only one vector: the zero vector. (Student exercise: does the zero vector, 0 ∈ Rⁿ, by itself satisfy the conditions to be a vector space?)

B with dim(B) = 1: the vectors in B lie on a line passing through the origin.

B with dim(B) = 2: the vectors in B lie in a plane which includes the origin.

Proper subspaces of R³ have dimension 1 or 2. Proper subspaces of Rⁿ have dimension 1 ... (n − 1).

A 2-D vector subspace can be referred to as a plane; higher-dimensional subspaces can be referred to as a hyperplane or hypersurface, as in "The set of states of the system forms a 4-D hyperplane in R¹²." A vector space of dim(S) > 3 can be referred to as a hyperspace, as in "The points lie on the surface of a 6-dimensional hyper-cube lying in an 8-dimensional hyper-space."

When B ⊂ A, and thus r = dim(B) < dim(A), there must be vectors in A not lying in B. Actually, when dim(B) < dim(A), almost all vectors in A are outside B.

Consider a class room with the origin in the center of the top of the desk in front. Almost all vectors in R³ (the class room) do not lie in the surface of the desktop (a 2-D subspace). The surface of the desktop has zero volume, and so 0% of the volume of the room.

5.2.1 The set-theoretic meaning of "almost all"

"Almost all y ∈ A have property w" has a precise meaning in set theory: the elements of set A which do not have property w have measure 0.

Measure theory is a whole subject unto itself (remember Lebesgue). Think of it this way: if you are choosing y randomly from A, the chance of getting one with property w is 100% (almost all y have property w). There are elements of A which do not have property w, but the chance of choosing one is infinitesimal. In set theory we say "The set of y in A without property w has zero measure," or, equivalently, "Almost all y in A have property w."

5.2.2 A vector y orthogonal to a proper subspace B

When proper subspace B is embedded in A, a vector y ∈ A but outside subspace B can have two relationships to B:

1. Vector y can have a component in B:

∃ x ∈ B  s.t.  ⟨x, y⟩ ≠ 0        (58)

2. Vector y can be orthogonal to subspace B, which is to say that y does not overlap B at all. Formally:

if  ⟨x, y⟩ = 0  ∀ x ∈ B  then  y ⊥ B        (59)

Or, equivalently,

y ⊥ B  ⟺  ∀ x ∈ B ,  ⟨x, y⟩ = 0        (60)

If y is orthogonal to each basis vector bi of B,

y ⊥ bi ,  i = 1 ... r        (61)

then

y ⊥ B        (62)

(Student exercise: verify.)

## 6 Projection Theorem

Before introducing the projection theorem, we need to introduce a generalized notion of orthogonality.

Generalized orthogonality: three flavors of orthogonality. For this discussion (as elsewhere) b, u, v, w, x, y and z are vectors in Rⁿ; bold capitals, such as B, U, S and S1, refer to subspaces.

Case 1: A vector is orthogonal to a vector. This is our familiar case, based on the inner product:

v ⊥ w  ⟺  ⟨v, w⟩ = 0        (63)

Case 2: A vector is orthogonal to a subspace:

v ⊥ B  ⟺  ⟨v, b⟩ = 0  ∀ b ∈ B        (64)

Case 3: A subspace is orthogonal to a subspace. This implies that each vector in the first subspace is orthogonal to each vector in the second subspace:

B ⊥ U  ⟺  ⟨b, u⟩ = 0  ∀ b ∈ B , ∀ u ∈ U        (65)
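Case 2 reduces to a finite check: v is orthogonal to a subspace iff it is orthogonal to every basis vector of that subspace, i.e. Bᵀ v = 0. A NumPy sketch with a hypothetical basis:

```python
import numpy as np

# A basis for a 2-D subspace of R^3 (hypothetical values)
b1 = np.array([1.0, 0.0, 1.0])
b2 = np.array([0.0, 1.0, 0.0])
B = np.column_stack([b1, b2])

v = np.array([1.0, 0.0, -1.0])

# v _|_ span{b1, b2} iff B^T v = 0
print(B.T @ v)      # [0. 0.] -> v is orthogonal to the subspace

y = np.array([1.0, 1.0, 0.0])
print(B.T @ y)      # nonzero -> y has a component in the subspace, Eqn (58)
```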

Now let's define a projection:

Projection: A projection of a vector onto a subspace is a vector. It is the component of the original vector lying in the subspace. The remainder of the original vector must be orthogonal to the subspace.

Looking at figure 4, u is the component of x lying in subspace U. This means that the other part of x (that part not in subspace U, given by w = x − u) is orthogonal to U, or w ⊥ U.

Observation: the subspace can have dimension 1; that is, we can project a vector onto a vector.

6.1 Projection Theorem

PROJECTION THEOREM: Given vector spaces U and S with U a proper subspace of S, then for all vectors x ∈ S there exists a unique vector u ∈ U such that w = x − u and w ⊥ U.

Alternatively, in formal notation, given U ⊂ S,

∀ x ∈ S  ∃ u ∈ U  s.t.  w = x − u , and  w ⊥ U        (66)

The projection theorem is illustrated in figure 4. The projection theorem tells us that given any vector and subspace (not necessarily proper!) the vector can be broken down into 2 parts:

1. A part lying in the subspace, and

2. A part that is orthogonal to the subspace.

Notation: Introduce the notation P_U x for the projection of x onto U. When u is the projection of x onto U, we may write: u = P_U x.

Consider the part which does not lie in subspace U. Define w = x − u; we may say that vector w is orthogonal to subspace U. The set of all possible w's forms a vector subspace!

Orthogonal complement: if U is a subspace of S, the set {w : w ∈ S, w ⊥ U} is the orthogonal complement of U in S, written U⊥.

## 6.2.1 First projection example, projection onto a 1-D subspace

In section 6.1 the projection theorem is laid out. The question arises:
given S, U and x, how can u and w be determined?

This is the problem of projecting a vector onto a subspace.

Consider vectors in R⁴, with a 1-D subspace U = span{u1}:

       f = [4, 0, 2, -1]ᵀ ,    u1 = [1, 2, 2, 1]ᵀ        (67)

Suppose we want to find the projection of f onto U. Since P_U f lies in
U, it can be written:

       P_U f = α1 u1        (68)

Whatever is left over, w = f - P_U f, must be orthogonal to U, so:

       ⟨w, u1⟩ = 0        (69)

Because the inner product operation is linear, Eqn (69) can be broken
into two parts. Keeping in mind that w = (f - P_U f), and that for the
1-D case P_U f = α1 u1:

       ⟨w, u1⟩ = ⟨(f - P_U f), u1⟩ = ⟨f, u1⟩ - ⟨P_U f, u1⟩
               = ⟨f, u1⟩ - α1 ⟨u1, u1⟩ = 0        (70)

and we can solve for α1:

       α1 = ⟨f, u1⟩ / ⟨u1, u1⟩ = ⟨f, u1⟩ / ||u1||²        (71)

The inner product gives us the projection of one vector onto another.
Projection onto a vector is illustrated in figure 5.

[Figure 5: Projection of f onto the vector u1, showing P_U f along u1
and the orthogonal remainder w.]

Evaluating Eqn (71) for the example:

       P_U f = (⟨f, u1⟩ / ||u1||²) u1 = (7/10) [1, 2, 2, 1]ᵀ
             = [0.7, 1.4, 1.4, 0.7]ᵀ        (72)

The projection consists of two parts:

1. The direction, given by u1. The projection of f must lie in the
   subspace spanned by u1.

2. The magnitude, given by the scaled inner product of f and u1. Call
   this magnitude the projection coefficient.

## 6.3 Normalization of the basis vectors

If the length of u1 is one, then Eqn (72) reduces to:

       P_U f = ⟨f, û1⟩ û1        (73)

and the projection coefficient is simply the inner product. This
property is sufficiently handy that we need to call it something:

Normal basis vector: a basis vector with length 1 is said to be normal,
and is sometimes written with a hat: û1.

Normalization: the process of making a vector normal.

Matlab example:

>> u1 = [ 1 2 2 1 ];
>> u1hat = u1 / norm(u1);

Of course, the projection of f onto û1 is the same as onto u1.

Consider example 1 one more time, with û1 normalized:

       û1 = u1 / ||u1|| = [0.316, 0.632, 0.632, 0.316]ᵀ        (74)

Now ||û1|| = 1, and so

       P_U f = (⟨f, û1⟩ / ||û1||²) û1 = ⟨f, û1⟩ û1        (75)

Notice that the magnitude of the projected vector (P_U f) is
independent of the length of the basis vector (u1).

## 6.4 The projection matrix

One more interesting fact about projecting f onto U: since ⟨f, û1⟩ is a
scalar, Eqn (73) can be written:

       P_U f = ⟨f, û1⟩ û1 = û1 ⟨û1, f⟩ = û1 (û1ᵀ f) = (û1 û1ᵀ) f    (76)

M = û1 û1ᵀ is thus a term that multiplies a vector to give the
projection of the vector onto a subspace. For the example:

       P_U f = ⟨f, û1⟩ û1 = (2.214) [0.316, 0.632, 0.632, 0.316]ᵀ
             = [0.70, 1.40, 1.40, 0.70]ᵀ

Also, defining

       M = û1 û1ᵀ = [ 0.1  0.2  0.2  0.1
                      0.2  0.4  0.4  0.2
                      0.2  0.4  0.4  0.2
                      0.1  0.2  0.2  0.1 ]

we get

       M f = [0.7, 1.4, 1.4, 0.7]ᵀ = P_U f

M is pretty handy also, and so let's call it a projection matrix or
projection operator.

Writing W = U⊥ for the orthogonal complement of U, notice that

       w = f - P_U f = [3.3, -1.4, 0.6, -1.7]ᵀ ∈ W

Student exercise: what is the dimension of W?
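The 1-D example above can be reproduced numerically. A Python/NumPy
sketch of Eqns (71)-(76), using the example's f and u1 (the course
tools are Matlab; NumPy is used here for a self-contained check):

```python
import numpy as np

# 1-D projection example: project f onto U = span{u1}.
f  = np.array([4.0, 0.0, 2.0, -1.0])
u1 = np.array([1.0, 2.0, 2.0, 1.0])

alpha1 = np.dot(f, u1) / np.dot(u1, u1)   # <f,u1>/||u1||^2 = 7/10
PUf = alpha1 * u1                          # [0.7, 1.4, 1.4, 0.7]
w = f - PUf                                # remainder, orthogonal to U

# The projection matrix M = u1hat u1hat^T gives the same result.
u1hat = u1 / np.linalg.norm(u1)
M = np.outer(u1hat, u1hat)

print(np.allclose(M @ f, PUf))            # True
print(np.isclose(np.dot(w, u1), 0.0))     # True: w is orthogonal to u1
```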

## 6.4.1 Bay example 2.10, projecting f onto a 2-D subspace

Consider again these vectors in R⁴, and project f onto the subspace
U = span{u1, u2}:

       f = [4, 0, 2, -1]ᵀ ,  u1 = [1, 2, 2, 1]ᵀ ,  u2 = [0, 0, 1, 1]ᵀ   (77)

Solution:

Since P_U f ∈ U, P_U f can be written:

       P_U f = α1 u1 + α2 u2 ,    α1, α2 ∈ R        (78)

Writing f = α1 u1 + α2 u2 + w, and using Eqn (78) and the linearity of
the inner product:

       ⟨f, u1⟩ = ⟨α1 u1 + α2 u2 + w, u1⟩ = α1 ⟨u1, u1⟩ + α2 ⟨u2, u1⟩ + 0
       ⟨f, u2⟩ = ⟨α1 u1 + α2 u2 + w, u2⟩ = α1 ⟨u1, u2⟩ + α2 ⟨u2, u2⟩ + 0   (79)

Eqn (79) may be written:

       [ ⟨f, u1⟩ ]   [ ⟨u1, u1⟩  ⟨u2, u1⟩ ] [ α1 ]
       [ ⟨f, u2⟩ ] = [ ⟨u1, u2⟩  ⟨u2, u2⟩ ] [ α2 ]        (80)

which solves to give:

       [ α1 ]   [ ⟨u1, u1⟩  ⟨u2, u1⟩ ]⁻¹ [ ⟨f, u1⟩ ]
       [ α2 ] = [ ⟨u1, u2⟩  ⟨u2, u2⟩ ]   [ ⟨f, u2⟩ ]        (81)

Notice that if we write U = [u1 u2] (the matrix whose columns are u1
and u2), then:

       [ ⟨u1, u1⟩  ⟨u2, u1⟩ ]
       [ ⟨u1, u2⟩  ⟨u2, u2⟩ ] = Uᵀ U        (82)

and also:

       [ ⟨u1, f⟩ ]
       [ ⟨u2, f⟩ ] = Uᵀ f        (83)

so Eqn (80) is:

       Uᵀ U [ α1 ; α2 ] = Uᵀ f        (84)

which solves to give:

       [ α1 ; α2 ] = (Uᵀ U)⁻¹ Uᵀ f        (85)

The projection coefficients are given by Eqn (85). Equation (85) for
projecting f onto basis vectors U is related to what equation that we
have seen before? (Student exercise.)

Finally, the projection is given by:

       P_U f = α1 u1 + α2 u2 = U [ α1 ; α2 ] = U (Uᵀ U)⁻¹ Uᵀ f        (86)

or

       P_U f = M f    with    M = U (Uᵀ U)⁻¹ Uᵀ        (87)

Working the example numbers, with

       U = [ 1  0
             2  0
             2  1
             1  1 ]

the terms of Eqn (85) are:

       Uᵀ U = [ 1 2 2 1 ] U = [ 10  3 ]
              [ 0 0 1 1 ]     [  3  2 ]

       [ ⟨f, u1⟩ ]          [ 7 ]
       [ ⟨f, u2⟩ ] = Uᵀ f = [ 1 ]

       [ α1 ]   [ 10  3 ]⁻¹ [ 7 ]   [  1 ]
       [ α2 ] = [  3  2 ]   [ 1 ] = [ -1 ]

and so:

       P_U f = 1.0 u1 - 1.0 u2 = [1, 2, 1, 0]ᵀ

Since

       M = U (UᵀU)⁻¹ Uᵀ = [ 0.182  0.364  0.091 -0.091
                            0.364  0.727  0.182 -0.182
                            0.091  0.182  0.545  0.455
                           -0.091 -0.182  0.455  0.545 ]

we also get

       P_U f = M f = [1.0, 2.0, 1.0, 0.0]ᵀ

## 6.4.2 Projecting onto the orthogonal complement

Using the projection P_U f we can find the component in the orthogonal
complement U⊥:

       w = f - P_U f = (I - M) f

where M is the projection matrix. We can say that

       w = M⊥ f        (88)

where M⊥ = I - M is the projection matrix for projecting a vector onto
the orthogonal complement, and w ∈ U⊥.

Using the data of the example,

       M⊥ = I - M = [ 0.818 -0.364 -0.091  0.091
                     -0.364  0.273 -0.182  0.182
                     -0.091 -0.182  0.455 -0.455
                      0.091  0.182 -0.455  0.455 ]

and

       w = M⊥ f = [3, -2, 1, -1]ᵀ
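A Python/NumPy sketch of the normal-equation projection, Eqns (85)-(88),
with the example's f, u1 and u2 (an illustrative check, not the course's
Matlab code):

```python
import numpy as np

# 2-D projection example: U = [u1 u2].
f = np.array([4.0, 0.0, 2.0, -1.0])
U = np.array([[1.0, 0.0],
              [2.0, 0.0],
              [2.0, 1.0],
              [1.0, 1.0]])

# Projection coefficients: (U^T U)^-1 U^T f, solved without forming the inverse
alpha = np.linalg.solve(U.T @ U, U.T @ f)

# Projection matrix and orthogonal-complement projector
M = U @ np.linalg.inv(U.T @ U) @ U.T
Mperp = np.eye(4) - M

print(np.allclose(alpha, [1.0, -1.0]))          # True
print(np.allclose(M @ f, [1, 2, 1, 0]))         # True
print(np.allclose(Mperp @ f, [3, -2, 1, -1]))   # True
print(np.allclose(M @ M, M))                    # True: projectors are idempotent
```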

## 6.4.3 Projection with normalized basis vectors

It would be nice to get rid of the matrix inversion step in projecting
a vector onto a subspace, for many reasons, not least because the
matrix inverse can be badly conditioned.

Starting with the basis vectors normalized to unit length:

       Û = [ û1  û2 ] = [ 0.316  0
                          0.632  0
                          0.632  0.707
                          0.316  0.707 ]        (89)

we find

       Ûᵀ Û = [ 1      0.671
                0.671  1     ]        (90)

       [ ⟨f, û1⟩ ]          [ 0.316  0.632  0.632  0.316 ]     [ 2.214 ]
       [ ⟨f, û2⟩ ] = Ûᵀ f = [ 0      0      0.707  0.707 ] f = [ 0.707 ]

       [ β1 ]   [ 1      0.671 ]⁻¹ [ 2.214 ]   [  3.162 ]
       [ β2 ] = [ 0.671  1     ]   [ 0.707 ] = [ -1.414 ]

       P_U f = 3.162 û1 - 1.414 û2 = [1.0, 2.0, 1.0, 0.0]ᵀ

and the projection matrix M = Û (ÛᵀÛ)⁻¹ Ûᵀ is the same M as before.
Normalizing the basis vectors alone does not remove the matrix inverse,
because û1 and û2 are not orthogonal.

Computing a projection without a matrix inverse requires more. Since

       Uᵀ U = [ ⟨u1, u1⟩  ⟨u2, u1⟩  ⋯
                ⟨u1, u2⟩  ⟨u2, u2⟩  ⋯
                ⋮          ⋮         ⋱ ]        (91)

the requirement is that

       ⟨u1, u1⟩ = ⟨u2, u2⟩ = ⋯ = ⟨ur, ur⟩ = 1
       ⟨ui, uj⟩ = 0 ,  i ≠ j        (92)

so that Uᵀ U = I and

       M = U (UᵀU)⁻¹ Uᵀ = U I⁻¹ Uᵀ = U Uᵀ

In other words, the basis vectors must be normal and orthogonal.

Orthonormal: Basis vectors which are both orthogonal and normal are
called orthonormal.

7 Gram-Schmidt ortho-normalization

Gram-Schmidt Ortho-normalization is a process that starts with any span
of a vector subspace, and produces an orthonormal basis, V.

In broad terms, the process to make orthogonal basis vectors works this
way:

1. Start with an empty basis set, V = { }.

2. Choose any vector u in the subspace, choosing, for example, a vector
   from any spanning set.

3. Subtract from u the projection of u onto the existing basis vectors.

4. If a sufficient vector remains after the subtraction, normalize the
   remaining vector and add it to the set of basis vectors.

5. Repeat steps 2-4 until the basis set is complete. The basis set is
   complete when r = n, or when all of the spanning vectors have been
   used.

## 7.1.1 Example Gram-Schmidt Ortho-normalization

Starting with the spanning set, let subspace S be spanned by the
vectors A = [u1, u2, u3]:

       S = span{ [1, 2, 2, 1]ᵀ , [0, 0, 1, 1]ᵀ , [1, 2, 3, 2]ᵀ }     (93)

2. Take any vector from the subspace:

       u1 = [1, 2, 2, 1]ᵀ

3. Subtract off the projection onto the existing basis vectors (there
   are none yet, so u1 is unchanged).

4. A sufficient vector remains (||u1|| > tol), so normalize it and add
   it to the set of basis vectors:

       v1 = u1 / ||u1|| = [0.316, 0.632, 0.632, 0.316]ᵀ ,   V = {v1}

5. Repeat.

2. Take the next vector from the spanning set:

       u2 = [0, 0, 1, 1]ᵀ

3. Subtract off the projection onto the existing basis vectors:

       u2′ = u2 - V Vᵀ u2 = [-0.3, -0.6, 0.4, 0.7]ᵀ

   In general, vector u2′ is not parallel to u2; and if ||u2′|| > tol,
   u2′ can not be perpendicular to u2 either.

4. A sufficient vector remains (||u2′|| > tol), so normalize it and add
   it to the set of basis vectors:

       v2 = u2′ / ||u2′|| = [-0.286, -0.572, 0.381, 0.667]ᵀ ,
       V = {v1, v2}

5. Repeat.

2. Take the last vector from the spanning set:

       u3 = [1, 2, 3, 2]ᵀ

3. Subtract off the projection onto the existing basis vectors:

       u3′ = u3 - V Vᵀ u3 ≈ 10⁻¹⁵ · [0.444, 0.888, 0.444, 0.444]ᵀ

   which is pure round-off error.

4. Since ||u3′|| < tol, u3 lies in span(v1, v2) and is not added to the
   basis.

5. All out of vectors in the original spanning set. Done:

       V = {v1, v2} = { [0.316, 0.632, 0.632, 0.316]ᵀ ,
                        [-0.286, -0.572, 0.381, 0.667]ᵀ }
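Steps 1-5 can be sketched directly in code. A Python/NumPy version
(a hypothetical helper, not the course's GramSchmidt.m; a second
subtraction pass is added for numerical robustness), applied to the
spanning set of the example:

```python
import numpy as np

def gram_schmidt(A, tol=None):
    """Ortho-normal basis for the span of the columns of A (steps 1-5)."""
    n, p = A.shape
    if tol is None:
        # tolerance scaling with problem size and machine precision
        tol = max(n, p) * np.max(np.linalg.norm(A, axis=0)) * np.finfo(float).eps
    V = []                                   # step 1: empty basis set
    for j in range(p):                       # step 2: take each vector
        u = A[:, j].copy()
        for v in V:                          # step 3: subtract projections
            u -= np.dot(v, u) * v
        for v in V:                          # repeat once, for robustness
            u -= np.dot(v, u) * v
        if np.linalg.norm(u) > tol:          # step 4: accept if enough remains
            V.append(u / np.linalg.norm(u))
    return np.column_stack(V) if V else np.zeros((n, 0))

A = np.array([[1.0, 0.0, 1.0],
              [2.0, 0.0, 2.0],
              [2.0, 1.0, 3.0],
              [1.0, 1.0, 2.0]])             # u3 = u1 + u2 is dependent

V = gram_schmidt(A)
print(V.shape[1])                            # 2 -> only two independent columns
print(np.allclose(V.T @ V, np.eye(2)))       # True: V is orthonormal
```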

In step 4 of the Gram-Schmidt algorithm, a vector ui′ is accepted as a
basis vector if sufficient magnitude remains after projection onto the
orthogonal complement of V:

       if ||ui′|| > tol ,  include  vi = ui′ / ||ui′||

What value should tol have? One possible answer is a value that scales
with the number and size of the vectors and the machine precision:

       tol = max(n, p) · ( maxᵢ ||ui|| ) · ε        (94)

where maxᵢ ||ui|| is the maximum of the norms of the columns ui ∈ Rⁿ of
A = [u1 u2 ⋯ up], and ε is the machine precision. tol given by
Eqn (94) reflects the fact that round-off errors in step 4 depend on
the dimensions of A and the magnitude of the vectors that make up the
calculation.

Using the basis

       V = [ 0.316  -0.286
             0.632  -0.572
             0.632   0.381
             0.316   0.667 ]

the projection matrix onto the subspace S = span(A) is given by:

       M = V Vᵀ = [ 0.182  0.364  0.091 -0.091
                    0.364  0.727  0.182 -0.182
                    0.091  0.182  0.545  0.455
                   -0.091 -0.182  0.455  0.545 ]        (95)

and the projection of the example vector f = [4, 0, 2, -1]ᵀ is

       P_U f = M f = [1.0, 2.0, 1.0, 0.0]ᵀ        (96)

Importantly, nowhere in the whole process is a matrix inversion
required!
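A Python/NumPy sketch of projection with an orthonormal basis and no
matrix inverse; numpy's QR factorization stands in for the
Gram-Schmidt basis V (any orthonormal basis of span(U) gives the same
projector):

```python
import numpy as np

# Projection via an orthonormal basis: M = V V^T, no matrix inverse.
f = np.array([4.0, 0.0, 2.0, -1.0])
U = np.array([[1.0, 0.0],
              [2.0, 0.0],
              [2.0, 1.0],
              [1.0, 1.0]])

# Reduced QR gives an orthonormal basis for span(U), up to signs.
V, _ = np.linalg.qr(U)

PUf = V @ (V.T @ f)                          # projection, no inverse needed
print(np.allclose(PUf, [1, 2, 1, 0]))        # True: same as U (U^T U)^-1 U^T f
```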

Note that when the set of basis vectors V is orthonormal, then:

       P_U f = α1 v1 + α2 v2 = V [ α1 ; α2 ] = V (Vᵀ f)        (97)

and so the projection coefficients to project f onto V are given simply
by:

       [ α1 ; α2 ] = Vᵀ f        (98)

Compare with Eqn (81) (repeated here):

       [ α1 ]   [ ⟨u1, u1⟩  ⟨u2, u1⟩ ]⁻¹ [ ⟨f, u1⟩ ]
       [ α2 ] = [ ⟨u1, u2⟩  ⟨u2, u2⟩ ]   [ ⟨f, u2⟩ ] = (UᵀU)⁻¹ Uᵀ f

Eqn (98) requires no matrix inverse operation. We are accustomed to
thinking inv(X'*X) can be computed in Eqn (81), and this is true for
reasonably conditioned matrices that aren't too big. But for singular,
poorly conditioned or even just large matrices (50×50 or larger),
inv(X'*X) may not exist, or the computation may lead to large errors.

Gram-Schmidt is one of the most scalable and robust algorithms in
linear algebra.

At each iteration of the GS ortho-normalization, in step 3, we are
subtracting the projection onto the existing basis vectors from the
candidate vector. This is the same as taking the projection onto the
orthogonal complement of the existing basis vectors. For example:

       x2′ = x2 - V Vᵀ x2 = (I - V Vᵀ) x2

is the projection of x2 onto V⊥.

Given a basis set V, the projection matrix to project onto the
orthogonal complement is given by:

       M⊥ = I - V Vᵀ ,    x2′ = M⊥ x2        (99)

Eqn (99) is not too surprising; it is saying: the portion of x2 not
lying in V is the total, minus the bit that is lying in V. Eqn (99)
can be quite handy.

Consider fitting measured data y with a linear model

       ŷ(k) = φᵀ(k) b̂    or, stacking the n data points,    ŷ = A b̂

with ŷ ∈ Rⁿ and b̂ ∈ Rᵖ, where p is the number of parameters and n is
the number of data points; y is measured data and ŷ is estimated from
the model. Then

       b̂ = (Aᵀ A)⁻¹ Aᵀ y ,    ŷ = A b̂ ,    ε = y - ŷ        (100)

The model set (also called the reachable set) is the set of outputs ŷ
given by the model for any possible tuning of the parameters b̂. For a
linear model, the model set forms a linear vector space:

       ŷ = A b̂        (101)

- The columns of A are the spanning set for the r-dimensional model set
  in the n-dimensional output space.

- The parameters b̂ are the projection coefficients of the data y onto
  the basis vectors of the model set (namely the columns of A).

- The G-S algorithm applied to the columns of A gives a basis for the
  model set.

The projection theorem tells us that with b̂ given by Eqn (100), the
residual ε is orthogonal to the model set. Said another way, there is
no signal power remaining in ε which can possibly be modeled by
ŷ(k) = φᵀ(k) b̂.

## 8.1 The Four Fundamental Spaces of a Matrix

Two vector spaces associated with multiplying a vector by matrix A are:

1. Input space: b ∈ Rᵖ

2. Output space: y ∈ Rⁿ

It turns out that each of these spaces is further divided in two, so
there are a total of four fundamental spaces of a matrix. For an n × p
matrix A:

Input Space:

1. Null Space, null(A): the set of vectors b ∈ Rᵖ such that A b = 0.

2. Row Space, row(A): the orthogonal complement of the Null Space. The
   row space is spanned by the columns of Aᵀ; a basis for the row space
   is given by Vrow = GS(Aᵀ).

Output Space:

3. Column Space, col(A): the set of all y such that y = A b, b ∈ Rᵖ. A
   basis for the column space is given by Vcol = GS(A).

4. Left Null Space, lnull(A): the orthogonal complement of the column
   space; the set of all y such that y ⊥ A b ∀ b ∈ Rᵖ.
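Returning to the least-squares fit of Eqn (100), a Python/NumPy sketch
on hypothetical data (the data values are invented for illustration),
checking the projection-theorem property that the residual is
orthogonal to the model set:

```python
import numpy as np

# Least squares as projection: b_hat = (A^T A)^-1 A^T y.
# Hypothetical data: fit y(k) = b0 + b1*k to noisy measurements.
k = np.arange(6, dtype=float)
y = np.array([0.1, 1.9, 4.2, 5.8, 8.1, 9.9])     # roughly y = 2k

A = np.column_stack([np.ones_like(k), k])        # columns span the model set
b_hat = np.linalg.solve(A.T @ A, A.T @ y)        # normal equations
eps = y - A @ b_hat                              # residual

# Projection theorem: the residual is orthogonal to col(A)
print(np.allclose(A.T @ eps, 0.0))               # True
```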

[Figure 6: Pictorial representation of the four fundamental spaces of a
matrix. The Row space and Null space lie in the input space Rᵖ, while
the Column space and Left-null space lie in the output space Rⁿ; a
vector b = b_r + b_n maps as A b_r = ŷ and A b_n = 0. (Adapted from
Strang.)]

The equation illustrated in figure 6 is:

       y = A b = A b_r + A b_n = ŷ + 0

A generic input vector b has a part from the Row space and a part from
the Null space. Outputs lie in the Column space. Any component from the
null space adds to the length ||b||₂, but contributes nothing to the
output.

## 8.1.1 Examples for the four fundamental spaces

For the examples, take

       A = [ 1  0  1
             2  0  2
             2  1  3
             1  1  2 ]

Null Space: Vectors in the null space give zero output:

       A [1, 1, -1]ᵀ = 0 ,  so  [1, 1, -1]ᵀ ∈ null(A)        (102)

Row Space: Vectors with a component from the row space give a non-zero
output:

       A [1, 1, 2]ᵀ = [3, 6, 9, 6]ᵀ ,  so  [1, 1, 2]ᵀ contains a
       component from row(A)        (103)

All vectors which give a non-zero output contain a component from the
row space.

Column Space example: The columns of A span the column space,

       col(A) = { y : y = A b } ,    dim col(A) = rank(A)

For example:

       [3, 6, 9, 6]ᵀ = A [2, 2, 1]ᵀ = A [1, 1, 2]ᵀ ,
       so  [3, 6, 9, 6]ᵀ ∈ col(A)        (104)

Vector b1 = [2, 2, 1]ᵀ has a component from the row space and a
component from the null space:

       b1 = [2, 2, 1]ᵀ = b_r + b_n = [1, 1, 2]ᵀ + [1, 1, -1]ᵀ

The contribution b1r = [1, 1, 2]ᵀ lies in the row space; the
contribution b1n = [1, 1, -1]ᵀ lies in the null space and contributes
nothing to the output. If A b = y ≠ 0 and b ⊥ null(A), then
b ∈ row(A).

Left-Null Space example: The Left-Null Space is the set of vectors in
the output space that can not be reached by y = A b; it is the
orthogonal complement of the column space. For example,

       y_ln = [1, -1, 1, -1]ᵀ ∈ lnull(A)        (105)

Since [1, -1, 1, -1]ᵀ lies in the Left Null space of A, the choice for
b which minimizes

       ε = ỹ - A b ,    ỹ = [1, -1, 1, -1]ᵀ        (106)

is b = [0, 0, 0]ᵀ. What Eqns (105) and (106) are saying is that there
is no choice for b which gets any closer to ỹ than b = 0.

## 8.1.2 Bases for the Four Fundamental Spaces

Note: Recall that the GS algorithm takes any set of vectors, and
returns an ortho-normal basis on the space spanned by the vectors.

Row Space: The Row Space is spanned by the rows of A, therefore an
ortho-normal basis on the row space is given by:

       R = GS(Aᵀ)        (107)

where R is a set of basis vectors spanning the Row Space of A, and
GS(Aᵀ) indicates applying the Gram-Schmidt algorithm to Aᵀ. Since the
Row Space is spanned by the rows of A, every vector b in the row space
is given by:

       b = Aᵀ ȳ    for some ȳ ∈ Rⁿ

Column Space: The Column Space is spanned by the columns of A:

       C = GS(A)        (108)

where C is a set of basis vectors spanning the Column Space of A.
Since the Column Space is spanned by the columns of A, every vector y
in the column space is given by y = A b with a suitable choice of b.

Null Space: Note that

       M_r = R Rᵀ        (109)

is a projection matrix, projecting any vector b onto the Row Space.
Since the null space is the orthogonal complement of the row space,
the projection matrix onto the null space is given by:

       M_n = I - M_r = I - R Rᵀ        (110)

Since the columns of any projection matrix span the space onto which
the matrix projects, a basis set for the null space is given by:

       null(A) = GS(M_n) = GS(I - R Rᵀ)        (111)

Left-Null Space: The Left-Null Space is the orthogonal complement of
the Column Space. The projection matrix onto the column space is given
as

       M_c = C Cᵀ        (112)

and so the projection matrix onto the left-null space is given as:

       M_ln = I - M_c        (113)

And finally, a set of basis vectors for the left-null space is given
by:

       lnull(A) = GS(M_ln) = GS(I - C Cᵀ)        (114)

## 8.1.3 Bases for the Four Fundamental Spaces, Numerical Example

Using the GS algorithm, we can determine bases for the four
fundamental spaces. Here n = 4, p = 3 and r = 2:

>> A = [ 1 0 1; 2 0 2 ; 2 1 3; 1 1 2]
A =
     1     0     1
     2     0     2
     2     1     3
     1     1     2

>> Col = GramSchmidt(A)                       % Eqn (108)
Col =
    0.3162   -0.2860
    0.6325   -0.5721
    0.6325    0.3814
    0.3162    0.6674

>> Row = GramSchmidt(A')                      % Eqn (107)
Row =
    0.7071   -0.4082
         0    0.8165
    0.7071    0.4082

>> Null = GramSchmidt(eye(3) - Row*Row')      % Eqn (111)
Null =
    0.5774
    0.5774
   -0.5774

>> LNull = GramSchmidt(eye(4) - Col*Col')     % Eqn (114)
LNull =
    0.9045   -0.0000
   -0.4020    0.3333
   -0.1005   -0.6667
    0.1005    0.6667

dim Col = dim Row = rank A = 2
dim Null = p - dim Row = 1
dim LNull = n - dim Col = 2
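A Python/NumPy sketch computing bases for the four fundamental spaces
of the example A, with an SVD-based orthonormalization standing in for
the GramSchmidt calls of Eqns (107)-(114):

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [2.0, 0.0, 2.0],
              [2.0, 1.0, 3.0],
              [1.0, 1.0, 2.0]])
n, p = A.shape

def orth(M, tol=1e-10):
    """Orthonormal basis for the span of M's columns (SVD-based)."""
    U, s, _ = np.linalg.svd(M)
    r = int(np.sum(s > tol))
    return U[:, :r]

Col   = orth(A)                         # column space
Row   = orth(A.T)                       # row space
Null  = orth(np.eye(p) - Row @ Row.T)   # orthogonal complement of row space
LNull = orth(np.eye(n) - Col @ Col.T)   # orthogonal complement of column space

# Dimensions: 2 + 1 = p and 2 + 2 = n
print(Col.shape[1], Row.shape[1], Null.shape[1], LNull.shape[1])  # 2 2 1 2
```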

## 8.1.4 The Four Fundamental Spaces of a Matrix, revisited

[Figure 7: Pictorial representation of the four fundamental spaces of a
matrix (repeated).]

Looking back to figure 6:

- The row space and null space are orthogonal complements. The input
  space, Rᵖ for an n × p matrix, is the union of the row and null
  spaces:

       p = dim row(A) + dim null(A)        (115)

- The column space and left-null space are orthogonal complements. The
  output space, Rⁿ for an n × p matrix, is the union of the column and
  left-null spaces:

       n = dim col(A) + dim lnull(A)        (116)

Additionally, the dimensions of the row and column spaces must be
equal, and are equal to the rank of A:

       dim col(A) = dim row(A) = rank(A)        (117)

## 8.1.5 Questions that can be answered with the four fundamental spaces

Given y = A b:

- What is the set of all possible b̂ which give a specific ŷ? What is
  the dimension and a basis for this set?

- Is there any non-zero value b̃ such that A b̃ = 0? What is the
  dimension and a basis for this set?

- If there is no exact solution, so ε = y - A b̂ ≠ 0, what is the set
  of all possible ε?

- Given y, what is the smallest possible ||y - A b̂||?

## 8.1.6 Two ways to determine the four fundamental spaces

1. With Gram-Schmidt ortho-normalization:
   Ortho-normalize the columns of A to get the column space; the
   left-null space is the orthogonal complement.
   Ortho-normalize the rows of A to get the row space; the null space
   is the orthogonal complement.

2. With the singular value decomposition (SVD).

Range: The range of any function is the set of all possible outputs of
that function. The range space of matrix A is another name for the
column space of A. It is the range of the function y = A b: the vector
space spanned by the columns of A.

## 8.2 Rank and degeneracy

Rank: In matrix theory the rank of a matrix is defined as the size of
the largest sub-array that gives a non-zero determinant.

But the determinant is an unsatisfactory numerical calculation, because
it is numerically very sensitive, and can't handle non-square matrices.
Using the Gram-Schmidt algorithm, we can find a set of basis vectors
for the column space (or row space) of A, to determine the rank of A.

In Matlab, rank is obtained by the singular value decomposition, and
counting the number of singular values larger than a tolerance value.
The help message of rank() is instructive:

>> help rank
 RANK   Matrix rank.
    RANK(A) provides an estimate of the number of linearly
    independent rows or columns of a matrix A.
    RANK(A,tol) is the number of singular values of A
    that are larger than tol.
    RANK(A) uses the default tol = max(size(A)) * norm(A) * eps.
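The SVD-based rank computation described in the help message can be
sketched in Python/NumPy (numpy's matrix_rank uses the same
count-singular-values-above-tol rule):

```python
import numpy as np

# Rank as Matlab's rank() computes it: count singular values above tol.
A = np.array([[1.0, 0.0, 1.0],
              [2.0, 0.0, 2.0],
              [2.0, 1.0, 3.0],
              [1.0, 1.0, 2.0]])

s = np.linalg.svd(A, compute_uv=False)
tol = max(A.shape) * s.max() * np.finfo(float).eps   # default tol rule
print(int(np.sum(s > tol)))                          # 2
print(np.linalg.matrix_rank(A))                      # 2 (same rule built in)
```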

The nullity of a matrix is the dimension of its null space, denoted
q(A). With A ∈ Rⁿˣᵖ,

       q(A) = p - r(A)        (118)

where r(A) is the rank of A.

Degeneracy: If rank(A) = min(n, p) we say the matrix is full rank (it
has the greatest possible rank). Otherwise the matrix has lost rank
(something happened that made it rank deficient, such as a robot
reaching a singular pose).

THEOREM: The rank of a matrix product. Given A ∈ Rⁿˣᵐ and B ∈ Rᵐˣᵖ,
and forming C = A B ∈ Rⁿˣᵖ, the following properties hold:

       rank(C) + q(C) = p        (119)

       rank(C) ≤ min( rank(A), rank(B) )        (120)

       q(C) ≤ q(A) + q(B)        (121)

The rank of C and the dimension of the null space of C are determined
by how the column space of B falls on the row space of A (student
thought problem):

       rank(C) = dim intersection( col(B), row(A) )

## 9 Summary and Review

Part 2 offers many definitions and concepts. However, as is often the
case with mathematical domains, there are only a few essential ideas:

- A vector space is a set of vectors, and in general will not include
  all vectors of the universe.

- Simple operations, such as y = A b, lead naturally to vector spaces,
  and our understanding of the solution can be in terms of vector
  spaces.

- The inner product is a measure of the degree of overlap between two
  vectors.

- The norm is a measure of the length of a vector.

- Vectors and a vector space can be parallel, orthogonal, or somewhere
  in between (include a component of each).

- The projection operation determines the components of a vector lying
  in or orthogonal to a vector space.

- Gram-Schmidt ortho-normalization produces a basis that is handy for
  determining projections.

Naturally, there are a variety of details fleshing out each of these
essential ideas.

An inner product must have the following properties:

1. Commutativity: ⟨x, y⟩ = ⟨y, x⟩ (for real vector spaces)

2. Distributivity (linearity): ⟨x, y1 + y2⟩ = ⟨x, y1⟩ + ⟨x, y2⟩

3. Can be an induced norm: ⟨x, x⟩ ≥ 0 ∀ x, and ⟨x, x⟩ = 0 iff x = 0.

It follows that:

1. Scalar multiplication, right term: ⟨x, α y⟩ = α ⟨x, y⟩

A norm must have the following properties:

1. Positive definiteness: ||x|| ≥ 0, and ||x|| = 0 if and only if
   x = 0.

2. Scaling: ||α x|| = |α| ||x||, with ||x|| ∈ R.

3. Triangle Inequality: ||x + y|| ≤ ||x|| + ||y||
   The length of the sum of two vectors can not be greater than the
   sum of the individual lengths of the vectors.

4. Cauchy-Schwarz Inequality: |⟨x, y⟩| ≤ ||x||₂ ||y||₂
   Two vectors can not be more parallel than fully parallel.

Technically, a vector space can be a vector space without having any
norm defined. To be a vector space requires only a set of elements and
the 8 properties described in section 2.2. But for the familiar vector
spaces of Rⁿ we have seen several norms: ||x||₁, ||x||₂, ||x||∞, etc.
A vector space with a norm is specifically called a normed vector
space.

Learn the definitions! Terms appear in italic bold where they are
introduced and defined. Working together and talking about the subject
matter will help toward this goal. Flash cards and drill may also be
useful.

Vocabulary:

- Vector, Euclidean vector
- Outer product (matrix product)
- Cross product (vector product)
- Orthogonality, Orthogonality of subspaces
- Span, Spanning set
- Basis vectors, Standard basis
- Dimension of a vector space
- Vector universe
- Embedded vector subspace, Proper vector subspace
- Hyperplane, Hypersurface
- "Almost all elements of A have property p"
- Representation of a vector
- Transformation from one representation to another
- Projection
- Projection operator, Projection matrix, Projection coefficients
- Non-orthogonal projection
- Ortho-normal vectors
- Gram-Schmidt orthogonalization
- Rank, degeneracy

Part 2: Vectors and Vector Spaces        (Revised: Sep 10, 2012)

Part 3: Linear Operators

Contents

1 Linear Operator

2 Rotation and Reflection Matrices
    2.4.1 Rotation matrix in terms of the from-frame axes expressed in
          to-frame coordinates

3 Linear Operators in Different Bases, or
  A Change of Basis (Bay section 3.1.3)

4.3 Additional examples using change of basis
    4.3.2 Operator from data, example

5 Operators as Spaces (Bay section 3.2)
    5.1 Operator Norms
    5.3 Boundedness of an operator
    5.4 Operator Norms, conclusions
    5.5 Adjoint Operators

7 Forming the intersection of two vector spaces
    7.1 Example

8 Conclusion

1 Linear Operator

An operator is a generalization of the notion of a function. Operators
are functions of numerical arguments, and also functions of functions.
See: http://mathworld.wolfram.com/Operator.html

We'll be focusing on functions of numerical arguments, so an operator
is essentially a synonym for a function.

Linear Operator: An operator A from vector space X to vector space Y,
denoted A : X → Y, is linear if it verifies superposition:

       A(α1 x1 + α2 x2) = α1 A x1 + α2 A x2        (1)

Example: The projection operator is a linear operator (mapping a
vector from Rⁿ to a subspace of Rⁿ).

2 Rotation and Reflection Matrices

Rotation and Reflection matrices are good examples of linear
operators. They are used extensively and will play an important role
in the Singular Value Decomposition.

Rotation Matrix: for vectors x ∈ Rⁿ:

1. Lengths are preserved:

       ||R x|| = ||x||        (2)

2. Angles between vectors are preserved:

       ⟨x1, x2⟩ / (||x1|| ||x2||) = ⟨R x1, R x2⟩ / (||R x1|| ||R x2||)
       ∀ x1, x2 ∈ Rⁿ        (3)

3. Handedness: rotation matrices preserve handedness; equivalently,
   for rotation matrix R,

       det(R) = +1        (4)

Reflection Matrix:

1. Reflection matrices preserve length.

2. Reflection matrices preserve angles.

3. Handedness: reflection matrices reverse handedness; for reflection
   matrix Q,

       det(Q) = -1

## 2.1 Example Rotation and Reflection Matrices

2-D Rotation Matrix (determinant will equal +1):

       R = [ cos(θ)  -sin(θ)
             sin(θ)   cos(θ) ]        (5)

2-D Rotation Matrix with Reflection (determinant will equal -1):

       Q = [ -1  0 ] [ cos(θ)  -sin(θ) ]   [ -cos(θ)  sin(θ) ]
           [  0  1 ] [ sin(θ)   cos(θ) ] = [  sin(θ)  cos(θ) ]        (6)

Setup rotation matrix R1 with θ = -40°, det(R1) = +1:

>> %% Setup a -40 deg rotation matrix
>> theta = -40;
>> Ct = cosd(theta); St = sind(theta);
>> Rot1 = [ Ct, -St; St, Ct ]
Rot1 =
    0.7660    0.6428
   -0.6428    0.7660
>> det(Rot1)
ans = 1.0000

Setup rotation matrix R2 with θ = +40°, det(R2) = +1:

>> %% Setup a +40 deg rotation matrix
>> theta2 = 40;
>> Ct2 = cosd(theta2); St2 = sind(theta2);
>> Rot2 = [ Ct2, -St2; St2, Ct2 ]
Rot2 =
    0.7660   -0.6428
    0.6428    0.7660
>> det(Rot2)
ans = 1.0000

Add a reflection to R2, det(Q) = -1:

>> %% A reflection operator can be an identity matrix
>> %% with an odd number of -1 elements.
>> Rot2PlusReflection = [ -1 0; 0 1 ] * Rot2
Rot2PlusReflection =
   -0.7660    0.6428
    0.6428    0.7660
>> det(Rot2PlusReflection)
ans = -1.0000

Process points with the rotation and reflection matrices. The rotation
and reflection matrices are linear operators: they map points in R²
onto points in R².

>> P1 = [ 0.5000    0.5000    0.7000
          0.7500    0.2500    0.2500 ];
>> P2 = Rot1 * P1
P2 =
    0.8651    0.5437    0.6969
    0.2531   -0.1299   -0.2584
>> P3 = Rot2 * P1
P3 =
   -0.0991    0.2223    0.3755
    0.8959    0.5129    0.6415
>> P3b = Rot2PlusReflection * P1
P3b =
    0.0991   -0.2223   -0.3755
    0.8959    0.5129    0.6415

[Figure 1: Example of rotation and reflection: the original points, the
points rotated by 40 deg, and the points rotated by 40 deg and
reflected over the Y axis (X values inverted).]

THEOREM 3.1: A rotation matrix must have the property that Rᵀ = R⁻¹.

Proof: Because lengths must be preserved, the angles condition can be
rewritten:

       ⟨x1, x2⟩ = x1ᵀ x2 = x1ᵀ Rᵀ R x2 = ⟨R x1, R x2⟩        (7)

For Eqn (7) to be true ∀ x1, x2 ∈ Rⁿ, Rᵀ R must be the identity
matrix, ergo Rᵀ = R⁻¹.    QED

Example: Rotation matrices are linear operators. For a rotation of θ
[degrees] in R², the rotation matrix is:

       R_θ = [ C_θ  -S_θ
               S_θ   C_θ ]        (8)

As an example, consider θ = 20°; then

       R = [ 0.94  -0.34
             0.34   0.94 ]

Student exercise: verify that R_θ given by Eqn (8) is an ortho-normal
matrix for any value of θ.
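A Python/NumPy check of the rotation-matrix properties, Eqns (2)-(4)
and Theorem 3.1 (the 40° angle and the test point echo the Matlab
example above):

```python
import numpy as np

# Properties of a rotation matrix: R^T = R^-1, det = +1, lengths preserved.
theta = np.deg2rad(40.0)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

x = np.array([0.5, 0.75])

print(np.allclose(R.T @ R, np.eye(2)))                       # True: R^T = R^-1
print(np.isclose(np.linalg.det(R), 1.0))                     # True: det(R) = +1
print(np.isclose(np.linalg.norm(R @ x), np.linalg.norm(x)))  # True: length preserved

# Adding a reflection flips the determinant
Q = np.diag([-1.0, 1.0]) @ R
print(np.isclose(np.linalg.det(Q), -1.0))                    # True
```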

THEOREM 3.2: If A = [v1 v2 v3] is a square matrix whose columns vi are
ortho-normal, then:

i)   Aᵀ A = I, so that Aᵀ = A⁻¹;
ii)  lengths are preserved: ||A x|| = ||x||;
iii) angles are preserved;
iv)  det(A) = +1 or -1.

Proof:

i) The entries of Aᵀ A are the inner products of the columns:

       Aᵀ A = [ ⟨v1, v1⟩  ⟨v1, v2⟩  ⟨v1, v3⟩
                ⟨v2, v1⟩  ⟨v2, v2⟩  ⟨v2, v3⟩
                ⟨v3, v1⟩  ⟨v3, v2⟩  ⟨v3, v3⟩ ]        (9)

If the vi are orthogonal, then the off-diagonal terms are zero:

       Aᵀ A = [ ⟨v1, v1⟩  0          0
                0          ⟨v2, v2⟩  0
                0          0          ⟨v3, v3⟩ ]        (10)

If the vi are also normal, then

       Aᵀ A = [ 1  0  0
                0  1  0
                0  0  1 ]        (11)

and Aᵀ = A⁻¹. The argument of this example extends directly to
A ∈ Rⁿˣⁿ.

ii) Let x2 = A x1; then

       ||x2|| = √⟨x2, x2⟩ = √(x1ᵀ Aᵀ A x1) = √(x1ᵀ x1) = ||x1||

where the Aᵀ A is eliminated in the 3rd step because Aᵀ A = I.

iii) ⟨A x1, A x2⟩ = x1ᵀ Aᵀ A x2 = x1ᵀ x2 = ⟨x1, x2⟩, and since lengths
are preserved by (ii),

       ⟨x1, x2⟩ / (||x1|| ||x2||) = ⟨A x1, A x2⟩ / (||A x1|| ||A x2||)

iv) Two properties of the determinant are:

    1. For square matrices B, Q and R with B = Q R,
       det(B) = det(Q) det(R);
    2. For any square matrix B, det(B) = det(Bᵀ).

Since Aᵀ A = I, it follows that det(Aᵀ) det(A) = det(I) = 1. But since
det(Aᵀ) = det(A), it follows that det(A)² = 1; thus, given that A is a
square ortho-normal matrix, det(A) = +1 or det(A) = -1.    QED

Section 2.2.0
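A quick numeric illustration of the theorem, using the Q factor of a QR decomposition as a generic square ortho-normal matrix (a NumPy sketch, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(0)
# A generic square ortho-normal matrix: the Q factor of a QR decomposition
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))

orth_err = float(np.abs(Q.T @ Q - np.eye(3)).max())                # i)  A^T A = I
x1, x2 = rng.standard_normal(3), rng.standard_normal(3)
len_err = abs(float(np.linalg.norm(Q @ x1) - np.linalg.norm(x1)))  # ii) lengths
ang_err = abs(float((Q @ x1) @ (Q @ x2) - x1 @ x2))                # iii) angles
det_mag = abs(float(np.linalg.det(Q)))                             # iv) |det| = 1
```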

THEOREM 3.3: Any square ortho-normal matrix A is a rotation matrix if det(A) = +1 and a reflection matrix if det(A) = -1, and any rotation or reflection matrix is a square ortho-normal matrix.

Proof:

i) Any square ortho-normal matrix is either a rotation or reflection matrix.

Theorems 3.1 and 3.2 establish that a square ortho-normal matrix preserves lengths and angles, therefore it is either a rotation or reflection matrix.

ii) Any rotation or reflection matrix R is a square ortho-normal matrix, proof by contradiction.

Assume there is a rotation or reflection matrix R which is not a square ortho-normal matrix; that would imply that either the columns of R are not orthogonal or that the columns of R are not normalized. Show that each leads to a contradiction.

ii.a) If the columns are not orthogonal, show that R can not preserve angles.

First, looking back to Eqn (9), if the columns are not orthogonal, there must be at least one pair $\left( v_i, v_j \right)$, $i \neq j$, such that $\langle v_i, v_j \rangle = a_{ij} \neq 0$. Therefore $R^T R \neq I$.

Next we need to show that since $R^T R \neq I$, $\exists \, x_1, x_2$ s.t.

$$\langle x_1, x_2 \rangle = x_1^T x_2 \neq x_1^T R^T R \, x_2 = \langle R x_1, R x_2 \rangle \qquad (12)$$

Note: In a proof, it is not enough to simply assert Eqn (12). Even though $R^T R \neq I$, how do we know there are allowed choices for $x_1$ and $x_2$ such that $x_1^T x_2 \neq x_1^T R^T R \, x_2$? The last step gives a prescription to construct such an $x_1$ and $x_2$.

Because $R^T R \neq I$, there exists at least one $x_2$ such that $R^T R \, x_2 = x_3 \neq x_2$. Because $x_3 \neq x_2$, there must be at least one element of $x_3$ which does not equal the corresponding element of $x_2$; call this the kth element, and choose $x_1$ to be the kth unit vector, e.g.

$$x_1 = \begin{bmatrix} 0 & 1 & 0 \end{bmatrix}^T \qquad (13)$$

ii.b) If one or more columns are not normalized, show that R can not preserve angles.

In this case there is no $\langle v_i, v_j \rangle = a_{ij} \neq 0$, $i \neq j$; but there is $\langle v_i, v_i \rangle = a_{ii} \neq 1$. Select $x_2 = v_i$; then $R^T R \, x_2 = a_{ii} \, x_2$, and

$$\langle x_1, x_2 \rangle = x_1^T x_2 \neq x_1^T a_{ii} x_2 = x_1^T R^T R \, x_2 = \langle R x_1, R x_2 \rangle$$

This contradicts the hypothesis that R is a rotation or reflection matrix.

Thus, given that R is a rotation or reflection matrix, the assumption that R is not a square ortho-normal matrix leads to a contradiction. Thus all rotation or reflection matrices are ortho-normal matrices.
QED

THEOREM 3.4: Any two ortho-normal coordinate frames (sets of basis vectors) A and B in $\mathbb{R}^n$ are related by a rotation matrix R and at most one reflection.

Proof: We can transform vectors represented in either coordinate frame to the standard frame by

$$ {}^s x = A \, {}^a x, \qquad {}^s x = B \, {}^b x $$

and so

$$ {}^a_b T = A^{-1} B $$

Using $\det\left(A^{-1}\right) = 1/\det(A)$ and $\det(Q R) = \det(Q) \det(R)$,

$$ \det\left( {}^a_b T \right) = \frac{\det(B)}{\det(A)} $$

Since A and B are ortho-normal bases, $\det(A) = \pm 1$ and $\det(B) = \pm 1$, which shows that $\det\left( {}^a_b T \right) = \pm 1$, and so ${}^a_b T$ incorporates at most one reflection. Even in $\mathbb{R}^n$!
QED

COROLLARY 3.4: The action of any two reflections in $\mathbb{R}^n$ is to restore the original handedness.

## 2.3 Summary of mathematical properties of rotation matrices

- Rotations and reflections are linear operators; they map from $\mathbb{R}^n$ to $\mathbb{R}^n$ by a matrix multiplication.
- Matrices that preserve lengths and angles are either rotation or reflection matrices.
- $R^T R = I$, thus $R^T = R^{-1}$.
- If R is a rotation matrix, $R^T$ is a rotation matrix.
- If $R_1$ and $R_2$ are rotation matrices, $R_3 = R_1 R_2$ is also a rotation matrix.
- All square ortho-normal matrices are either rotation or reflection matrices:
  - $\det(R) = +1$: the matrix is a rotation matrix,
  - $\det(R) = -1$: the matrix is a reflection matrix.
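The summary properties can be spot-checked numerically; a small NumPy sketch:

```python
import numpy as np

def rot2(t):
    """2-D rotation by angle t [radians]."""
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

R3 = rot2(0.3) @ rot2(1.1)                      # product of two rotations...
comp_err = float(np.abs(R3 - rot2(1.4)).max())  # ...is the rotation by the summed angle
det_R3 = float(np.linalg.det(R3))               # +1: still a rotation

Refl = np.array([[-1.0, 0.0], [0.0, 1.0]])      # reflection over the Y axis
det_refl = float(np.linalg.det(Refl))           # -1: a reflection
det_two = float(np.linalg.det(Refl @ Refl))     # two reflections: handedness restored
```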

## 2.4 Multi-axis rotations comprise rotations about each axis

In Robotics and elsewhere, the rotation operator is called simply a rotation matrix.

Pitch, roll and yaw rotations are seen in figure 2. These are the rotations about the three axes, and can also be referred to as:

- Rotation about the X-axis, $R_x$ (pitch)
- Rotation about the Z-axis, $R_z$ (roll)
- Rotation about the Y-axis, $R_y$ (yaw)

The terms pitch, roll and yaw are assigned to different axes by different authors.

An example multi-axis rotation is seen in figure 3.

Multi-axis rotations preserve length and angles (so a Euclidean basis set, such as the X, Y, Z axes, remains orthogonal).

[Figure 2 plots the original coordinates ($X_a$, $Y_a$, $Z_a$) and the rotated frame ($X_b$, $Y_b$, $Z_b$) for each individual rotation.]

## Figure 2: 3D rotations, illustrating individual rotations Rx (pitch), Rz (roll), and Ry (yaw). Note right-hand rule for rotation direction.

2.4.1 Rotation matrix in terms of the from-frame axes expressed in to-frame coordinates

A rotation matrix provides a transformation from one coordinate frame to another; we can call these the from frame and the to frame.

For the 2-D example of figure 4, the rotation from frame B to frame A is

$$ {}^A_B R = \begin{bmatrix} C_\theta & -S_\theta \\ S_\theta & C_\theta \end{bmatrix} = \begin{bmatrix} 0.866 & -0.500 \\ 0.500 & 0.866 \end{bmatrix} $$

Point $P_a$ is given in the B frame as

$$ {}^B P_a = \begin{bmatrix} 0.7 \\ 0.5 \end{bmatrix}, \qquad {}^A P_a = {}^A_B R \; {}^B P_a = \begin{bmatrix} 0.356 \\ 0.783 \end{bmatrix} $$

The rotation matrix from B to A is given by the axes of B expressed in A coordinates. Look at figure 4:

$$ {}^A_B R = \begin{bmatrix} | & | \\ {}^A X_B & {}^A Y_B \\ | & | \end{bmatrix} $$

Look at the B axes in figure 4, expressed in the A coordinate frame.

[Figure 4 plots the A frame (X axis, Y axis), the rotated B frame axes $X_B$ and $Y_B$, and the point $P_a$.]

3-D example

In general, the rotation from a B coordinate frame to an A coordinate frame is given by:

$$ {}^A_B R = \begin{bmatrix} {}^A X_B & {}^A Y_B & {}^A Z_B \end{bmatrix} $$

The rotation from the B frame to the A frame in figure 3 is given by:

$$ {}^A_B R = \begin{bmatrix} 0.85 & 0.49 & 0.17 \\ 0.31 & 0.74 & 0.60 \\ 0.42 & \cdot & \cdot \end{bmatrix} $$

The X-axis of the B frame, expressed in A coordinates, is the first column:

$$ {}^A X_B = \begin{bmatrix} 0.85 \\ 0.31 \\ 0.42 \end{bmatrix} $$

[Figure 3 plots the multi-axis rotation in above, side and top views; Pitch: 30.00, Roll: 20.00, Yaw: 25.00.]

## 2.4.2 Example: Photogrammetry, measurement from images

A typical application comes from photogrammetry, where it is often necessary to shift vectors from target to camera coordinates. To shift coordinates from camera to target coordinates:

$$ {}^t P_a = {}^t_c R \; {}^c P_a + {}^t P_c \qquad (14) $$

By convention, the rotation from the camera frame to the target frame is given by rotations about the three axes, corresponding to three angles pitch, roll and yaw: $R_x(\text{pitch}, \theta)$, $R_z(\text{roll}, \phi)$, $R_y(\text{yaw}, \psi)$:

$$ {}^t_c R = R_z \, R_y \, R_x = \begin{bmatrix} C_\phi & -S_\phi & 0 \\ S_\phi & C_\phi & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} C_\psi & 0 & S_\psi \\ 0 & 1 & 0 \\ -S_\psi & 0 & C_\psi \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & C_\theta & -S_\theta \\ 0 & S_\theta & C_\theta \end{bmatrix} $$

$$ = \begin{bmatrix} C_\phi C_\psi & C_\phi S_\psi S_\theta - S_\phi C_\theta & C_\phi S_\psi C_\theta + S_\phi S_\theta \\ S_\phi C_\psi & S_\phi S_\psi S_\theta + C_\phi C_\theta & S_\phi S_\psi C_\theta - C_\phi S_\theta \\ -S_\psi & C_\psi S_\theta & C_\psi C_\theta \end{bmatrix} \qquad (15) $$

Compare Eqn (15) with Bay, Eqn (3.16). Bay uses a different ordering for the rotations. Because matrix multiplication does not commute, Bay's 3-axis rotation matrix, while similar, is not exactly the same as ${}^t_c R$.

There are at least 48 ways to put together a 3-axis rotation matrix, and they are all found somewhere in the literature.
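The closed form in Eqn (15) can be checked against the matrix product directly; a NumPy sketch using the angle values of figure 3 (pitch 30, roll 20, yaw 25 degrees):

```python
import numpy as np

def Rx(t):  # pitch
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def Ry(t):  # yaw
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def Rz(t):  # roll
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

th, ph, ps = np.radians([30.0, 20.0, 25.0])   # pitch theta, roll phi, yaw psi
R = Rz(ph) @ Ry(ps) @ Rx(th)                  # the product in Eqn (15)

# Closed-form right-hand side of Eqn (15)
Ct, St = np.cos(th), np.sin(th)
Cp, Sp = np.cos(ph), np.sin(ph)
Cs, Ss = np.cos(ps), np.sin(ps)
R15 = np.array([
    [Cp * Cs, Cp * Ss * St - Sp * Ct, Cp * Ss * Ct + Sp * St],
    [Sp * Cs, Sp * Ss * St + Cp * Ct, Sp * Ss * Ct - Cp * St],
    [-Ss,     Cs * St,                Cs * Ct]])

form_err = float(np.abs(R - R15).max())
ortho_err = float(np.abs(R.T @ R - np.eye(3)).max())   # still ortho-normal
```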

When we make a change of basis, we change the axes on which a vector is represented. For example, given two bases on $\mathbb{R}^2$,

$$ U = \begin{bmatrix} u_1 & u_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 0.5 \end{bmatrix}, \qquad V = \begin{bmatrix} v_1 & v_2 \end{bmatrix} = \begin{bmatrix} 0.8 & 1 \\ 0.8 & 0 \end{bmatrix} $$

The vector $f = \begin{bmatrix} 0.5 & 1.0 \end{bmatrix}^T$ can be represented

$$ f = \begin{bmatrix} 0.5 \\ 1.0 \end{bmatrix} = 0.5 \begin{bmatrix} 1 \\ 0 \end{bmatrix} + 2.0 \begin{bmatrix} 0 \\ 0.5 \end{bmatrix}, \quad \text{which is equivalent to writing} \quad f = U \, {}^U\!f = \begin{bmatrix} u_1 & u_2 \end{bmatrix} \begin{bmatrix} 0.5 \\ 2.0 \end{bmatrix} $$

or

$$ f = 1.25 \begin{bmatrix} 0.8 \\ 0.8 \end{bmatrix} - 0.5 \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad \text{or} \quad f = V \, {}^V\!f = \begin{bmatrix} v_1 & v_2 \end{bmatrix} \begin{bmatrix} 1.25 \\ -0.5 \end{bmatrix} $$

With a change of basis from U to V, the representation of a vector changes, but the vector itself (vector f) remains the same.

We started with the equation

$$ y_1 = A \, x_1 \qquad (16) $$

If we change the representation of the vectors, we have to make a suitable change in the linear operator. Change the representation of the vectors:

$$ x_2 = B_x \, x_1, \qquad y_2 = B_y \, y_1 \qquad (17) $$

where $B_x$ and $B_y$ are basis transformations in the input and output spaces, respectively, and must be square and invertible. Rewriting Eqn (16) with the change of basis,

$$ y_1 = A \, x_1, \qquad y_2 = \bar{A} \, x_2 \qquad (18) $$

Relating Eqn (16) to Eqn (18) gives

$$ B_y \, y_1 = \bar{A} \, B_x \, x_1 \quad \text{or} \quad y_1 = B_y^{-1} \, \bar{A} \, B_x \, x_1 \qquad (19) $$

Using the uniqueness of Eqns (16) and (18), Eqn (19) implies that

$$ A = B_y^{-1} \, \bar{A} \, B_x \qquad \text{or, equivalently} \qquad \bar{A} = B_y \, A \, B_x^{-1} $$

So we started with the equation $y_1 = A \, x_1$ and ended up with the equation $y_2 = \bar{A} \, x_2$, where the input and output bases of linear operator A have changed.

Which brings up this point: implicit in any linear operator are the bases in which the input and output are expressed. We normally assume these to be the standard Euclidean bases for $\mathbb{R}^m$ and $\mathbb{R}^n$.

For $y_2 = B_y \, y_1$ and $x_2 = B_x \, x_1$, the transformation matrices $B_y$ and $B_x$ must be square matrices, and full rank to be invertible: $B_y \in \mathbb{R}^{n \times n}$, $B_x \in \mathbb{R}^{m \times m}$.

A special case of basis transformation arises when A is a square matrix, or $A : \mathbb{R}^n \rightarrow \mathbb{R}^n$. In this case the input and output transformations can be the same,

$$ y_2 = B \, y_1, \qquad x_2 = B \, x_1 $$

Combining with Eqns (16)-(19) above, we can write

$$ y_1 = A \, x_1 = B^{-1} \, \bar{A} \, B \, x_1 $$

which gives:

$$ A = B^{-1} \, \bar{A} \, B \qquad (20) $$

$$ B \, A \, B^{-1} = \bar{A} \qquad (21) $$

When A and B are square and B is invertible, Eqn (20) has a special name: it is called a similarity transformation.

Similarity transformations preserve the eigenvalues; in other words, $\text{eig}(A) = \text{eig}\left(\bar{A}\right)$, where eig(A) is the vector of eigenvalues of A.

Similarity transformations are going to give us the freedom to re-write system models from one basis to another, to explore model properties such as modal response, controllability and observability.

Part 3: Linear Operators
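The eigenvalue-preservation claim can be checked numerically; a NumPy sketch with a generic random A and invertible B:

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))          # generic random matrix: square, invertible

Abar = B @ A @ np.linalg.inv(B)          # similarity transformation, Eqn (21)
eigA = np.sort_complex(np.linalg.eigvals(A))
eigAbar = np.sort_complex(np.linalg.eigvals(Abar))
eig_err = float(np.abs(eigA - eigAbar).max())   # spectra agree
```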

This section shows how to change the basis of vectors, and finally how to adapt an operator when the vectors it operates on change basis. These are powerful tools for several types of problems.

The basic architecture of an application of change of basis is this. Given an application:

1. The data are known in some original basis. Call this basis s, for the standard basis.

2. The application is difficult in the original basis.

   Example: Bay homework problems 3.4, 3.5, 3.6, 3.7, 3.9, 3.10 and 3.11 are all addressed by a change of basis.

3. There is an alternative basis on which the application is easy.

   Example: Fourier transform and wavelet transform methods.

   Example: If the data vector (a voice signal) is represented on a set of wavelet vectors (discrete wavelet transform), the application is easy: just throw out the coefficients for basis vectors that make little difference for human perception.

   And applications: reconstructing MR images, jpeg image compression, speech compression for cell-phone transmission.

   And for control systems, a change of basis is necessary to solve $\dot{x}(t) = A \, x(t) + B \, u(t)$ at all.

4. We solve the problem or achieve the application in 3 steps:

   Step 1. Transform the data from the s basis into a basis where the application is easy; call this the F basis (it can have a different name for each application).

   Step 2. Solve the problem on the F basis.

   Step 3. Transform the results back to the s basis, for utilization:

$$ {}^s\!A = {}^s_F T \; {}^F\!\!A \; {}^F_s T \qquad (22) $$

[Figure 6 diagrams the problem solving steps: the problem is expressed in the s basis (where it is generally unsolvable), transformed to an alternative F basis (where it is solvable), solved, and the solution transformed back to the s basis. Worked illustration: cannons blowing up; the heat distribution is known in the s basis (just x, y, z), and there is no solution to the heat-conduction equation for a general h0(x,y,z). On the alternative F basis of sine and cosine functions, heat conduction can be solved for a sin(x) initial distribution; by superposition, the solution for many sine functions is the sum of the solutions for each individual sine function. The solution is expressed back in the s basis: hot points determined, cannon redesigned.]

## Figure 6: Problem solving steps using a change of basis.

Bay problem 3.9 raises an interesting challenge from computer graphics. Let P be the plane in $\mathbb{R}^3$ defined by $1x - 1y + 2z = 0$, and let l be the vector

$$ l = \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} $$

Denote by $A : \mathbb{R}^3 \rightarrow \mathbb{R}^3$ the projection operator that projects vectors in $\mathbb{R}^3$ onto the plane P, not orthogonally, but along vector l. Projected points can be pictured as shadows of the original vectors onto P, where the light source is at an infinite distance in the direction l.

Non-orthogonal projection is a standard operation in computer graphics. Rendering systems do shading and compute reflections by representing complex surfaces as many flat triangular tiles, and computing the intersection point of many rays with these tiles.

1. Find the matrix of operator ${}^F\!A$ which projects a vector represented on basis $F = \left\{ f_1, f_2, f_3 \right\}$ onto plane P, where $f_1$ and $f_2$ are any two independent vectors lying in P.

2. Find the matrix of operator ${}^s\!A$ which projects a vector represented in the standard $\mathbb{R}^3$ basis onto plane P.

(Revised: Sep 10, 2012)

The origin: For the plane to be a 2-D vector subspace, the origin must lie in the plane. In practical systems with many tiles (planes), the origin is offset to a point in each plane as it is processed, so that a projection operator can be used.

Surface normal: The plane is specified by its surface normal, a vector that is orthogonal to the plane. This is a standard way to specify an $n-1$ dimensional hyper-surface in $\mathbb{R}^n$. Here, from $1x - 1y + 2z = 0$,

$$ \vec{n} = \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} $$

It is necessary to find basis vectors for plane P. We can use the fact that all vectors in plane P are orthogonal to $\vec{n}$.

The relation

$$ y = A \, x $$

is a projection of a point (x) onto the surface (to point y). Since x and y are both 3-vectors, A is a 3x3 matrix.

Note: up to now we have considered only orthogonal projections; that is, if g = P f is the projection of f onto a subspace, and w = f - g, then w is orthogonal to g. But in this case, as with many projections in computer graphics, the projection is not orthogonal. That is: the line l is not orthogonal to the plane or, equivalently, l is not parallel to n.

Where does the ray strike the plane defined by the surface normal $\vec{n}$? Being able to cast the relation as a linear operator, of course, greatly simplifies and accelerates the calculation. A Play Station 3 can perform this calculation several billion times per second.

Bases: This problem is approached in problem 3.9 (a) on a basis F. In this note, points in the standard space (x, y, z) are labeled ${}^s x$ and ${}^s y$. Vectors expressed on basis F are labeled ${}^F x$ and ${}^F y$.

[Figure 7 plots the plane, the surface normal $\vec{n}$, and point pairs $(x_1, y_1)$, $(x_2, y_2)$ related by projection along the ray.]

## Figure 7: Illustration of projection along ray l. In computer graphics, complex surfaces can be represented as a mesh of triangular tiles.

Suggested Approach: With many engineering challenges, it is good to ask the question: when I have an answer, how can I verify that it is correct? Given a point ${}^s x_i$ in $\mathbb{R}^3$ and its shadow in plane P (call this ${}^s y_i$), how can I independently verify that ${}^s y_i$ is correct?

For this problem, the reverse problem, verification, may be easier than the analysis to determine ${}^s y$, and thinking through how to verify a correct answer may help find how to determine ${}^s y$.

Considering verification of the correctness of a solution:

1. Given a point ${}^s x_i$ and a point ${}^s y_i$ in the plane, derive the calculation to verify:

   (a) that ${}^s y_i$ lies in plane P,

   (b) that ${}^s y_i$ is the shadow of ${}^s x_i$.

(a): To verify that point ${}^s y_1$ lies in plane P we need basis vectors for P. These will be any two independent vectors lying in P; that is, any two independent vectors, each orthogonal to n. Considering that

$$ n = \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} \qquad (23) $$

one choice is

>> P = [ 2 1 ; 0 1 ; -1 0 ]
P =
     2     1
     0     1
    -1     0

Verifying that ${}^s y_i$ lies in plane P: One way to check that ${}^s y_1$ lies in P is to form the orthogonal projection of ${}^s y_1$ onto P, and verify that it equals ${}^s y_1$:

$$ \hat{y} = P \left( P^T P \right)^{-1} P^T \, {}^s y_i $$

If $\hat{y} = {}^s y_i$, then ${}^s y_i$ lies in P.

(b): To verify that ${}^s y_i$ is the shadow of ${}^s x_i$, consider what it means to be the shadow: the difference $\left( {}^s y_i - {}^s x_i \right)$ must be parallel to l.

Example Data:

$$ {}^s x_1 = \begin{bmatrix} 2 \\ 3 \\ 4 \end{bmatrix} \quad \text{which projects onto} \quad {}^s y_1 = \begin{bmatrix} -2.67 \\ 0.67 \\ 1.67 \end{bmatrix} \qquad (24) $$

>> %% Verify sy1 lies in P
>> sy1hat = P * inv(P'*P) * P' * sy1
sy1hat =
   -2.67
    0.67
    1.67

>> sy1hat - sy1
ans =
   1.0e-15 *
   -0.8882
         0
   -0.1665

Verifying that ${}^s y_i$ is the shadow of ${}^s x_i$: To show that ${}^s y_1$ is the shadow of ${}^s x_1$, show that $\left( {}^s y_1 - {}^s x_1 \right)$ is parallel to l. For the example data we find:

%% Projection ray l
l =
     2
     1
     1

%% Difference vector
>> ll = sy1 - sx1
ll =
   -4.6667
   -2.3333
   -2.3333

%% Term-by-term ratio
>> ll ./ l
ans =
   -2.3333   -2.3333   -2.3333

so sy1 - sx1 = -2.3333 l.
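The verification above can be scripted; a NumPy sketch, where the closed-form shadow along l (solving $n \cdot (x + t\,l) = 0$ for t) is this sketch's own derivation, not taken from Bay:

```python
import numpy as np

n = np.array([1.0, -1.0, 2.0])    # surface normal of plane x - y + 2z = 0
l = np.array([2.0, 1.0, 1.0])     # projection ray

def shadow(x):
    """Project x onto the plane along l: solve n.(x + t*l) = 0 for t."""
    t = -(n @ x) / (n @ l)
    return x + t * l

sx1 = np.array([2.0, 3.0, 4.0])
sy1 = shadow(sx1)

in_plane_err = abs(float(n @ sy1))        # (a) sy1 lies in P
ratio = (sy1 - sx1) / l                   # (b) difference is parallel to l
parallel_spread = float(np.ptp(ratio))    # 0 when all ratios equal (-2.3333 here)
sy1_rounded = [round(float(v), 2) for v in sy1]
```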

Now considering text problem 3.9.

(Problem 3.9 part a) Find ${}^F y_i$ on the F basis.

1. Construct the set of basis vectors

$$ F = \begin{bmatrix} f_1 & f_2 & l \end{bmatrix} $$

where $f_1$ and $f_2$ are basis vectors in the plane, and l is the projection ray. Given P above, one set of basis vectors for F is:

$$ F = \begin{bmatrix} f_1 & f_2 & l \end{bmatrix} = \begin{bmatrix} 2 & 1 & 2 \\ 0 & 1 & 1 \\ -1 & 0 & 1 \end{bmatrix} \qquad (25) $$

Many F matrices are possible, with $f_1$ and $f_2$ as basis vectors for P.

2. Given a vector ${}^F x_1$ represented on basis F, find the operator ${}^F\!A$ that determines ${}^F y_1$, that is, determines the shadow point represented on basis F.

Discussion: Using F as a set of basis vectors, any point ${}^s x_i$ is represented as

$$ {}^s x_i = a_1 \begin{bmatrix} 2 \\ 0 \\ -1 \end{bmatrix} + a_2 \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + a_3 \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} = F \, {}^F x_i, \quad \text{with} \quad {}^F x_i = \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} $$

and likewise

$$ {}^s y_i = b_1 \begin{bmatrix} 2 \\ 0 \\ -1 \end{bmatrix} + b_2 \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + b_3 \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} = F \, {}^F y_i, \quad \text{with} \quad {}^F y_i = \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} $$

Properties (a) ($y_i$ lies in P) and (b) ($y_i$ is the projection along l of $x_i$) determine the projection in basis F. For $y_i$ to be the projection of $x_i$ along l, on the F basis vectors we can only modify the 3rd coefficient, giving $b_1 = a_1$, $b_2 = a_2$, $b_3 = 0$:

$$ {}^F y_i = \begin{bmatrix} b_1 \\ b_2 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ a_3 \end{bmatrix} \qquad (26) $$

or ${}^F y_i = {}^F\!A \; {}^F x_i$ with

$$ {}^F\!A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} \qquad (27) $$

which is the projection operator on the F basis.

Starting with ${}^F\!A$, find the operator ${}^s\!A$ that determines ${}^s y_1$ corresponding to a point ${}^s x$ represented on the standard basis:

$$ {}^s y_i = F \, {}^F y_i, \quad \text{so} \quad {}^s_F T = F \qquad (28) $$

$$ {}^F x_i = F^{-1} \, {}^s x_i, \quad \text{note} \quad {}^F_s T = F^{-1} \qquad (29) $$

$$ {}^s y = {}^s_F T \; {}^F\!A \; {}^F_s T \; {}^s x_i = {}^s\!A \; {}^s x \qquad (30) $$

>> sA = F*[1 0 0; 0 1 0; 0 0 0] * inv(F)
sA =
    0.3333    0.6667   -1.3333
   -0.3333    1.3333   -0.6667
   -0.3333    0.3333    0.3333

Note: there is only one operator ${}^s\!A$ that projects along line l to plane P in standard coordinates. The numerical values come out to those given, whatever basis is chosen for P.
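Assembling ${}^s\!A$ per Eqn (30) in a NumPy sketch reproduces both the operator and the example shadow point:

```python
import numpy as np

F = np.array([[2.0, 1.0, 2.0],
              [0.0, 1.0, 1.0],
              [-1.0, 0.0, 1.0]])          # [f1 f2 l], Eqn (25)
FA = np.diag([1.0, 1.0, 0.0])             # projection on the F basis, Eqn (27)
sA = F @ FA @ np.linalg.inv(F)            # Eqn (30)

sx1 = np.array([2.0, 3.0, 4.0])
sy1 = [round(float(v), 2) for v in sA @ sx1]   # the example shadow point
sA_11 = round(float(sA[0, 0]), 4)              # 0.3333
```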

The Fourier transform is written as an integral. Given f(t):

Fourier Transform:

$$ F(\omega) = \int_{-\infty}^{\infty} f(t) \, e^{-2 \pi j \omega t} \, dt \qquad (31) $$

Inverse Fourier Transform:

$$ f(t) = \int_{-\infty}^{\infty} F(\omega) \, e^{+2 \pi j \omega t} \, d\omega \qquad (32) $$

For the Discrete Fourier Transform (DFT), the DFT and Inverse DFT are given by summations in the place of integrals:

DFT:

$$ F(k) = \sum_{j=1}^{N} f(j) \, e^{-\frac{2 \pi i}{N} (j-1)(k-1)} \qquad (33) $$

Inverse DFT:

$$ f(j) = \frac{1}{N} \sum_{k=1}^{N} F(k) \, e^{+\frac{2 \pi i}{N} (j-1)(k-1)} \qquad (34) $$

where f(j) is the time-domain signal, F(k) is the frequency-domain signal, and DFT is the Discrete Fourier Transform.

The DFT gives a signal in the frequency domain, $\omega_k = (k-1)/N$ [cycles/sample]. The example of figure 8 was generated with:

>> w1 = 1/16; w2 = 1/32;
>> jjs = 1:128;
>> for j1 = jjs,
     f(j1) = cos(2*pi*w1*j1) ...
           + cos(2*pi*w2*j1);
   end
>> F = fft(f);
>>
>> kk = (0:127)/2;
>> figure(1),
>>   subplot(3,1,1), plot(jjs, f)
>>   subplot(3,1,2), plot(kk, real(F))
>>   subplot(3,1,3), plot(kk, imag(F))

[Figure 8 plots f(j) vs j [sample number], and real F(k) and imag F(k) vs k [wave number].]

## Figure 8: Signal and Discrete Fourier Transform.

Defining basis vectors: consider the inverse DFT

$$ f(j) = \frac{1}{N} \sum_{k=1}^{N} F(k) \, e^{+\frac{2 \pi i}{N} (j-1)(k-1)} \qquad \text{(Eqn 34, repeated)} $$

and define

$$ v_k = \begin{bmatrix} e^{\frac{2 \pi i}{N} (1-1)(k-1)} \\ e^{\frac{2 \pi i}{N} (2-1)(k-1)} \\ \vdots \\ e^{\frac{2 \pi i}{N} (N-1)(k-1)} \end{bmatrix} = \begin{bmatrix} e^{i \omega_k (1-1)} \\ e^{i \omega_k (2-1)} \\ \vdots \\ e^{i \omega_k (N-1)} \end{bmatrix} \qquad (35) $$

where

$$ \omega_k = \frac{2 \pi (k-1)}{N} \qquad (36) $$

is the frequency corresponding to component F(k).

Using the $v_k$, the inverse DFT of Eqn (34) takes the form:

$$ f = \frac{1}{N} \left( F(1) \, v_1 + F(2) \, v_2 + \cdots + F(N) \, v_N \right) \qquad (37) $$

where $f = \begin{bmatrix} f(1) & f(2) & \cdots & f(N) \end{bmatrix}^T$ is the time-domain signal, and $F = \begin{bmatrix} F(1) & F(2) & \cdots & F(N) \end{bmatrix}^T$ is the frequency-domain signal. Eqn (37) has the form of expanding f on a set of basis vectors. The 1/N term is a normalization.

Collecting the basis vectors into a matrix

$$ V = \begin{bmatrix} v_1 & v_2 & \cdots & v_N \end{bmatrix} \qquad (38) $$

we find

$$ f = \frac{1}{N} \, V \, F \qquad (39) $$

(where F is the Fourier transform of f).

An important property of the Fourier basis functions is that they are orthogonal. That is, for Eqns (35) and (38):

$$ \bar{V}^T V = N \, I, \qquad \text{or} \quad \bar{V}^T = N \, V^{-1} \qquad (40) $$

(The Fourier basis functions require the term N to be normalized.)

Looking at Eqn (33), and considering that $\bar{V}^T$ is the complex conjugate transpose of V, we find that the DFT is given by:

$$ F = \bar{V}^T f \qquad (41) $$

Putting together Eqns (39) and (41) gives

$$ f = \frac{1}{N} \, V \, \bar{V}^T f \qquad (42) $$

In Matlab, the basis matrix V is built with:

>> N = 6;
>> w0 = 2*pi/N;                  %% Fundamental frequency
>> for kk = 1:N,                 %% Build the N basis vectors
>>   for jj = 1:N,               %% The N elements of each vector
>>     V(jj,kk) = exp(j*w0*(jj-1)*(kk-1));
>>   end
>> end

V =
  1.00 + 0.00i   1.00 + 0.00i   1.00 + 0.00i   1.00 + 0.00i   1.00 + 0.00i   1.00 + 0.00i
  1.00 + 0.00i   0.50 + 0.87i  -0.50 + 0.87i  -1.00 + 0.00i  -0.50 - 0.87i   0.50 - 0.87i
  1.00 + 0.00i  -0.50 + 0.87i  -0.50 - 0.87i   1.00 + 0.00i  -0.50 + 0.87i  -0.50 - 0.87i
  1.00 + 0.00i  -1.00 + 0.00i   1.00 + 0.00i  -1.00 + 0.00i   1.00 + 0.00i  -1.00 + 0.00i
  1.00 + 0.00i  -0.50 - 0.87i  -0.50 + 0.87i   1.00 + 0.00i  -0.50 - 0.87i  -0.50 + 0.87i
  1.00 + 0.00i   0.50 - 0.87i  -0.50 - 0.87i  -1.00 + 0.00i  -0.50 + 0.87i   0.50 + 0.87i

Summary:

- Eqn (35) defines an orthogonal basis of N-element vectors (or functions, for the continuous-time FT).
- Because the Fourier basis functions are orthogonal, no matrix inverse is required to compute the change of basis (Fourier would have had a difficult time doing a large matrix inverse in 1805!)
- The DFT, Eqn (41), is a change of basis, from the standard basis of f(j) to the Fourier basis F(k).
- The IDFT is the change of basis back.

Part 3: Linear Operators

Now consider the FFT:

>> f = [ 1 2 3 4 5 6]'
f =
     1
     2
     3
     4
     5
     6

%% Using the standard FFT function
>> F1 = fft(f)
F1 =
  21.0000
  -3.0000 + 5.1962i
  -3.0000 + 1.7321i
  -3.0000
  -3.0000 - 1.7321i
  -3.0000 - 5.1962i

%% Using a matrix multiply, for a change of basis
>> F2 = V'*f
F2 =
  21.0000
  -3.0000 + 5.1962i
  -3.0000 + 1.7321i
  -3.0000 - 0.0000i
  -3.0000 - 1.7321i
  -3.0000 - 5.1962i

%% The inverse Fourier transform
>> f3 = (1/N)* V * F2
f3 =
     1
     2
     3
     4
     5
     6

Checking the orthogonality of the basis vectors (Eqn 40, without the normalization):

>> V'*V
ans =
     6     0     0     0     0     0
     0     6     0     0     0     0
     0     0     6     0     0     0
     0     0     0     6     0     0
     0     0     0     0     6     0
     0     0     0     0     0     6

## 4.2.1 The Fourier transform as a change of basis

If the Fourier transform is just a matrix multiplication for a change of basis, why are textbook derivations based on summations and integrals?

1. The matrix multiply form only works for sampled signals (discrete signals, or the discrete Fourier transform). Continuous signals require integrals (and can be defined in terms of inner products).

2. Matrix multiplication is conceptually simple, but computationally inefficient. For a 1024 element FFT, a 1024x1024 matrix V would be required.

Fourier actually focused on the columns of V. Fourier's insight contained 2 parts:

1. He could solve the problem of heat conduction for an initial heat distribution given by a basis vector $v_k$.

2. The basis vectors of the Fourier transform are orthogonal!

Ordinarily, given basis vectors V and data f, to find the basis coefficients we need to solve

$$ F = V^{-1} f, \qquad \text{or} \qquad F = \left( \bar{V}^T V \right)^{-1} \bar{V}^T f $$

But the Fourier basis vectors are orthogonal, with a very simple normalizing factor,

$$ F = \bar{V}^T f $$
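The change-of-basis view of the DFT can be confirmed against a library FFT; a NumPy sketch:

```python
import numpy as np

N = 6
jj, kk = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
V = np.exp(2j * np.pi * jj * kk / N)      # V(j,k) = e^{2 pi i (j-1)(k-1)/N}, Eqn (35)

f = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
F = V.conj().T @ f                        # DFT as a change of basis, Eqn (41)
fft_err = float(np.abs(F - np.fft.fft(f)).max())

f_back = (V @ F / N).real                 # inverse DFT, Eqn (39)
round_trip_err = float(np.abs(f_back - f).max())
orth_err = float(np.abs(V.conj().T @ V - N * np.eye(N)).max())   # Eqn (40)
```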

## 4.2.2 Using the Fourier transform

Suppose we had an operator ${}^F\!A$ that operates on frequency data, not on time-domain data, and we are given time-domain data f. Then

$$ y = \frac{1}{N} \, V \; {}^F\!A \; \bar{V}^T \, f \qquad (43) $$

where the logic is:

- convert f to the frequency domain (action of $\bar{V}^T$),
- apply the operator (action of ${}^F\!A$),
- convert back to the time domain (action of $V$).

And so

$$ {}^s\!A = \frac{1}{N} \, V \; {}^F\!A \; \bar{V}^T \qquad (44) $$

$$ y = {}^s\!A \, f \qquad (45) $$

Another example: given the data

$$ \left\{ x_1, x_2, x_3 \right\} = \left\{ \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix}, \begin{bmatrix} -1 \\ 2 \\ 0 \end{bmatrix} \right\}, \qquad \left\{ y_1, y_2, y_3 \right\} = \left\{ \begin{bmatrix} 0 \\ 2 \\ -2 \end{bmatrix}, \begin{bmatrix} 4 \\ 1 \\ 6 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \right\} $$

find the operator A to solve

$$ y = A \, x $$

The key to finding a good basis for solving the problem is to realize that a basis F can be found so that the x data are

$$ \left\{ {}^F x_1, {}^F x_2, {}^F x_3 \right\} = \left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\} $$

Then, with $\left\{ {}^F y_1, {}^F y_2, {}^F y_3 \right\}$ the y data on the F basis, the operator would simply be:

$$ {}^F\!A = \begin{bmatrix} {}^F y_1 & {}^F y_2 & {}^F y_3 \end{bmatrix} $$

so that

$$ {}^F y_1 = {}^F\!A \; {}^F x_1 = {}^F\!A \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \qquad {}^F y_2 = {}^F\!A \; {}^F x_2 = {}^F\!A \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad \text{etc.} $$

Part 3: Linear Operators

The solution is to choose as basis vectors the $x_i$ vectors themselves. Then

$$ F = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} \qquad (46) $$

which gives

$$ {}^F x_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad {}^F x_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \quad \text{etc.} $$

With Eqn (46),

$$ x_1 = F \, {}^F x_1 = \begin{bmatrix} x_1 & x_2 & x_3 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \quad \text{etc.} $$

and

$$ {}^s_F T = F, \qquad {}^F_s T = F^{-1} \qquad (47) $$

On the F basis, ${}^F y = {}^F\!A \; {}^F x$ with

$$ {}^F\!A = \begin{bmatrix} {}^F y_1 & {}^F y_2 & {}^F y_3 \end{bmatrix} = {}^F_s T \begin{bmatrix} {}^s y_1 & {}^s y_2 & {}^s y_3 \end{bmatrix}, \qquad {}^s\!A = {}^s_F T \; {}^F\!A \; {}^F_s T \qquad (48) $$

In Matlab:

>> X = [1 4 -1
        2 5  2
        3 6  0]

>> Y = [ 0 4 -1
         2 1  0
        -2 6  1]

>> sFT = X;
>> FsT = inv(sFT)
FsT =
   -0.8000   -0.4000    0.8667
    0.4000    0.2000   -0.2667
   -0.2000    0.4000   -0.2000

>> FA = FsT * Y
FA =
   -2.5333    1.6000    1.6667
    0.9333    0.2000   -0.6667
    1.2000   -1.6000         0

Now convert the operator to the standard basis:

>> sA = sFT * FA * FsT
sA =
    1.8000    0.4000   -0.8667
   -1.2000   -0.6000    1.4667
    3.8000    2.4000   -3.5333

Double checking:

>> sA * X(:,1)
ans =
    0.0000
    2.0000
   -2.0000

>> sA * X(:,2)
ans =
    4.0000
    1.0000
    6.0000

>> sA * X(:,3)
ans =
   -1.0000
   -0.0000
    1.0000

and so $y = {}^s\!A \; {}^s x$ reproduces the given data.
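The same construction in a NumPy sketch; note that ${}^s\!A$ collapses to $Y X^{-1}$:

```python
import numpy as np

X = np.array([[1.0, 4.0, -1.0],
              [2.0, 5.0,  2.0],
              [3.0, 6.0,  0.0]])          # columns x1, x2, x3
Y = np.array([[0.0, 4.0, -1.0],
              [2.0, 1.0,  0.0],
              [-2.0, 6.0, 1.0]])          # columns y1, y2, y3

F = X                                     # Eqn (46): the x_i as basis vectors
FA = np.linalg.inv(F) @ Y                 # operator on the F basis
sA = F @ FA @ np.linalg.inv(F)            # operator on the standard basis, Eqn (48)

fit_err = float(np.abs(sA @ X - Y).max())                    # sA x_i = y_i
direct_err = float(np.abs(sA - Y @ np.linalg.inv(X)).max())  # same as Y X^{-1}
sA_11 = round(float(sA[0, 0]), 4)                            # 1.8
```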

## 4.4 Conclusions: change of basis as a tool for analysis

- Some problems are much more easily solved in a special coordinate frame that is different from the natural (standard) coordinates of the problem.
- In these cases the best way, and possibly the only way, to solve the problem is to transform the data into the special coordinates, solve the problem, and transform the answer back out again.
- With linear transformations and operators, we can then build the operator directly in standard coordinates with Eqn (22).
- Since the time of Fourier, coordinate transformation is a standard tool, but we might call it by other names, or never explicitly compute the matrices F and $F^{-1}$.
- Coordinate transformation is the only general way to solve the equation $\dot{x}(t) = A \, x(t) + B \, u(t)$.

The set of linear operators from one vector space into another (or into itself) also forms a vector space. Working with the space of all operators $A : \mathbb{R}^m \rightarrow \mathbb{R}^n$, and with

$$ y_1 = A_1 \, x, \qquad y_2 = A_2 \, x $$

then

$$ A_1 \, x + A_2 \, x = \left( A_1 + A_2 \right) x $$

and all the required properties of a vector space are satisfied, among them:

1. The 0 operator is included in the set
2. For every operator there is the additive inverse operator
6. Closure under scalar multiplication
7. Associativity of scalar multiplication
8. Distributivity of scalar multiplication over operator addition

Operators also have norms. Intuitively, the size of an operator relates to how much it changes the size of a vector. Given

$$ y = A \, x $$

with suitable norms ||y|| and ||x||, the operator norm $\|A\|_{op}$ is defined by:

$$ \|A\|_{op} = \sup_{x \neq 0} \frac{\|y\|}{\|x\|} = \sup_{\|x\| = 1} \|y\| \qquad (49) $$

where sup, supremum, indicates the least upper bound.

The operator norm is induced by the vector norms ||y|| and ||x||. An operator matrix does not have to be square or full rank; it can be any matrix, $A \in \mathbb{R}^{n \times m}$.

5.1.1 Operator norm properties

Operator norms have these properties:

1. $\|A \, x\| \leq \|A\|_{op} \, \|x\|$  (from the definition of the operator norm)
2. $\|A_1 + A_2\|_{op} \leq \|A_1\|_{op} + \|A_2\|_{op}$  (triangle inequality)
3. $\|A_1 \, A_2\|_{op} \leq \|A_1\|_{op} \, \|A_2\|_{op}$  (Cauchy-Schwarz inequality)
4. $\|\alpha \, A\|_{op} = |\alpha| \, \|A\|_{op}$  (scalar multiplication)

Just as there are several vector norms, each induces an operator norm.

5.2.1 The L1 norm of an operator

The L1 norm of an operator answers this question: Given a vector x with $\|x\|_1$, and $y = A \, x$, what is the largest possible value of $\|y\|_1$?

Example:

$$ y = A \, x = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} x, \qquad \text{what is the max. possible } \|y\|_1 / \|x\|_1 \; ? $$

For the moment let's consider x with $\|x\|_1 = 1$. (Through linearity, $\|y\|_1$ just scales with $\|x\|_1$.)

It turns out that the choice for x that gives the largest possible 1-norm for the output is to put all of $\|x\|_1$ on the element corresponding to the largest column vector of A (largest in the L1 sense). Choose:

$$ x = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad \text{then} \quad \|x\|_1 = 1, \; \|y\|_1 = 15, \quad \text{so} \quad \|A\|_{1op} = 15 $$

Part 3: Linear Operators

Example:

$$ A = \begin{bmatrix} a_1 & a_2 \end{bmatrix} = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}, \qquad \|a_1\|_1 = 6, \quad \|a_2\|_1 = 15 $$

With y = A x:

$$ \text{for } x = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad y = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \quad \|y\|_1 = 6 $$

$$ \text{for } x = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix}, \quad y = \begin{bmatrix} 2.5 \\ 3.5 \\ 4.5 \end{bmatrix}, \quad \|y\|_1 = 10.5 $$

$$ \text{for } x = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad y = \begin{bmatrix} 4 \\ 5 \\ 6 \end{bmatrix}, \quad \|y\|_1 = 15 $$

Given $A = \left[ a_1, a_2, \ldots, a_m \right]$, then

$$ \|A\|_1 = \max_j \|a_j\|_1 \qquad (50) $$

The L1 norm of matrix A is the largest L1 norm of any column vector of A.

5.2.2 The L2 norm of an operator

The L2-norm of an operator gives the maximum L2 length of an output vector for a unit input vector. Bay defines the L2 norm of a matrix operator as:

$$ \|A\|_2 = \sup_{x \neq 0} \frac{\|y\|_2}{\|x\|_2} = \sup_{x \neq 0} \frac{\|A \, x\|_2}{\|x\|_2} = \sup_{\|x\|_2 = 1} \|A \, x\|_2 \qquad (51) $$

$$ \|A\|_2 = \max_{\|x\|_2 = 1} \left\{ x^T A^T A \, x \right\}^{1/2} \qquad (52) $$

Eqn (52) is just a re-statement of the definition of an operator norm, with the $\|y\|_2$ expanded inside the brackets.

The L2 norm of a matrix is the largest singular value of the matrix:

$$ \|A\|_2 = \bar{\sigma}(A) \qquad (53) $$

It is determined using the singular value decomposition (a topic of Bay chapter 4). For example:

>> [U,S,V] = svd(A)
U =
   -0.429    0.806    0.408
   -0.566    0.112   -0.816
   -0.704   -0.581    0.408
S =
    9.508         0
         0    0.773
         0         0
V =
   -0.386   -0.922
   -0.922    0.386

>> x2 = [.386; .922];
>> y2 = A*x2
y2 =
    4.0740
    5.3820
    6.6900
>> norm(y2)/norm(x2)
ans =
    9.5080

Part 3: Linear Operators
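A NumPy sketch confirming that $\|A\|_2$ equals the largest singular value, and that the maximizing input is the corresponding right singular vector:

```python
import numpy as np

A = np.array([[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]])
U, S, Vt = np.linalg.svd(A)

two_norm = float(S[0])                  # ||A||_2 = largest singular value
x2 = Vt[0]                              # corresponding right singular vector
gain = float(np.linalg.norm(A @ x2) / np.linalg.norm(x2))
sigma_rounded = round(two_norm, 3)      # 9.508
```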

5.2.3 The L-infinity norm of an operator

The $L_\infty$ norm of an operator gives the maximum $\|y\|_\infty$ for a unit input $\|x\|_\infty = 1$. Bay defines the $L_\infty$ norm of a matrix operator as:

$$ \|A\|_\infty = \sup_{x \neq 0} \frac{\|y\|_\infty}{\|x\|_\infty} = \sup_{x \neq 0} \frac{\|A \, x\|_\infty}{\|x\|_\infty} = \sup_{\|x\|_\infty = 1} \|A \, x\|_\infty \qquad (54) $$

$$ \|A\|_\infty = \max_i \sum_{j=1}^{m} \left| a_{ij} \right| \qquad (55) $$

The $L_\infty$-norm is given by the row vector of A with the greatest $L_1$-norm. For the example matrix:

$$ A = \begin{bmatrix} r_1 \\ r_2 \\ r_3 \end{bmatrix} = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}, \qquad \begin{array}{lll} r_1 = \begin{bmatrix} 1 & 4 \end{bmatrix}, & \|r_1\|_1 = 5 \\ r_2 = \begin{bmatrix} 2 & 5 \end{bmatrix}, & \|r_2\|_1 = 7 \\ r_3 = \begin{bmatrix} 3 & 6 \end{bmatrix}, & \|r_3\|_1 = 9 \end{array} $$

So $\|A\|_\infty = 9$.

To see why $\|A\|_\infty$ is given by the 1-norms of the rows (though it is induced by the $\infty$-norms of the vectors), consider that the candidate inputs with $\|x\|_\infty = 1$ include

$$ x = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \; \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \; \begin{bmatrix} -1 \\ 1 \end{bmatrix}, \; \begin{bmatrix} -1 \\ -1 \end{bmatrix} $$

One of the above vectors, multiplying A, gives the maximum $\infty$-norm output. And that max is the greatest 1-norm of a row.

## 5.2.4 The Frobenius norm

As an alternative to the operator norms induced by the vector norms (the 1-, 2- and $\infty$-norms, above), one can define a norm directly on the elements of the matrix. The entry-wise norms are given by:

$$ \|A\|_p = \left( \sum_{i=1}^{n} \sum_{j=1}^{m} \left| a_{ij} \right|^p \right)^{1/p} \qquad (56) $$

Note: care must be taken with notation, because the normal notion of the operator norm is induced, as in section 5.2.2, not given by Eqn (56).

A special case of these so-called entry-wise norms is the Frobenius norm

$$ \|A\|_F = \left( \sum_{i=1}^{n} \sum_{j=1}^{m} a_{ij}^2 \right)^{1/2} \qquad (57) $$

The Frobenius norm has the properties of a norm (described in section 5.1). It can also be computed as

$$ \|A\|_F = \left[ \mathrm{tr}\left( A^T A \right) \right]^{1/2} \qquad (58) $$

$$ \|A\|_F = \sqrt{ \sum_{i=1}^{\min(n,m)} \sigma_i^2(A) } \qquad (59) $$

where tr() denotes the trace of a matrix, which is the sum of the elements on the diagonal; and the $\sigma_i$ in Eqn (59) are the singular values (computed with the singular value decomposition).
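Eqns (57)-(59) give three routes to the same number; a NumPy sketch on the example matrix:

```python
import numpy as np

A = np.array([[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]])

fro_entrywise = float(np.sqrt((A**2).sum()))       # Eqn (57)
fro_trace = float(np.sqrt(np.trace(A.T @ A)))      # Eqn (58)
svals = np.linalg.svd(A, compute_uv=False)
fro_svals = float(np.sqrt((svals**2).sum()))       # Eqn (59)
fro_rounded = round(fro_entrywise, 3)              # sqrt(91) = 9.539
```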

Section 5.4.0

## 5.4 Operator Norms, conclusions

All 4 matrix norm definitions satisfy properties 1-4 of operator norms (section 5.1.1).

See, for example: J-C Lo and M-L Lin, "Robust H-infinity Control for Fuzzy Systems with Frobenius Norm-Bounded Uncertainties," IEEE Transactions on Fuzzy Systems 14(1):1-15.

## 5.5 Adjoint Operator

The adjoint of a linear operator A is denoted $A^*$ and must satisfy the relationship

$$\langle A x, y \rangle = \langle x, A^* y \rangle \quad \text{for all } x \text{ and } y$$

The adjoint operator is a general concept that applies to all types of vector spaces (such as vectors of polynomials).

The adjoint of a real-valued operator is just the matrix transpose, $A^* = A^T$; the adjoint of a complex-valued operator is just the complex-conjugate transpose, $A^* = \bar{A}^T$.

Example:

$$A = \begin{bmatrix} 1 & 2 & 3+j \\ 4-j & 5 & 6 \end{bmatrix}, \qquad A^* = \begin{bmatrix} 1 & 4+j \\ 2 & 5 \\ 3-j & 6 \end{bmatrix}$$

Matlab: A' is $A^*$ (the complex-conjugate transpose).

Part 3: Linear Operators
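The defining relation $\langle A x, y \rangle = \langle x, A^* y \rangle$ can be checked numerically; a NumPy sketch using the complex example matrix above (the random x and y are assumptions for illustration):

```python
import numpy as np

A = np.array([[1, 2, 3 + 1j],
              [4 - 1j, 5, 6]])
A_star = A.conj().T   # adjoint of a complex matrix: conjugate transpose

rng = np.random.default_rng(0)
x = rng.standard_normal(3) + 1j * rng.standard_normal(3)
y = rng.standard_normal(2) + 1j * rng.standard_normal(2)

# complex inner product <a, b> = a^H b (np.vdot conjugates its first argument)
lhs = np.vdot(A @ x, y)
rhs = np.vdot(x, A_star @ y)
print(np.allclose(lhs, rhs))   # True
```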

## EE/ME 701: Advanced Linear Systems (Section 6.0.0)

When $A = A^*$, e.g.,

$$A = \begin{bmatrix} 2 & 3+j \\ 3-j & 7 \end{bmatrix}$$

A is a symmetric matrix for the real case; A is a Hermitian matrix for the complex case (Hermitian = complex conjugate transpose, A' in Matlab). We can also say that A is a self-adjoint operator.

Hermitian matrices have two important properties that we will use to develop the singular value decomposition.

## 6 Bay Section 3.3

A system of simultaneous linear equations can be written in the form

$$y = A b$$

Bay addresses the various possibilities of the equation y = A b. We will come back to this topic after covering the singular value decomposition.

## 7 Forming the intersection of two vector spaces

Given vector space U, with basis vectors $\{u_1, u_2, ..., u_{n_u}\}$, and vector space V, with basis vectors $\{v_1, v_2, ..., v_{n_v}\}$, where $n_u$ is the dimension of U and $n_v$ is the dimension of V.

Sometimes it is interesting to find the intersection of the two vector spaces, which is a vector space W given by

$$W = U \cap V \qquad (60)$$

When $U = [u_1, u_2, ..., u_{n_u}]$ and $V = [v_1, v_2, ..., v_{n_v}]$, for vectors lying in W it must be the case that there are representations on both U and V, that is

$$\forall \, w \in W, \quad w = U a = V b \qquad (61)$$

or equivalently

$$[U, \; -V] \begin{bmatrix} a_1 \\ \vdots \\ a_{n_u} \\ b_1 \\ \vdots \\ b_{n_v} \end{bmatrix} = 0 \qquad (62)$$

Which is to say, vectors made of the U and V basis coefficients of vectors in the intersection must lie in the null space of the matrix $[U, -V]$.

The dimension of the intersection is the number of linearly independent solutions to Eqn (62), which is the dimension of the null space of $[U, -V]$. Vectors

$$w_i = U \begin{bmatrix} a_1 \\ \vdots \\ a_{n_u} \end{bmatrix} \quad \text{or equivalently} \quad w_i = V \begin{bmatrix} b_1 \\ \vdots \\ b_{n_v} \end{bmatrix} \qquad (63)$$

form a basis for the intersection.

## 7.1 Example

>> U = [ 1 4
         2 5
         3 6 ]
>> V = [ 2 3
         3 4
         5 4 ]
>> Null = null([U, -V])
Null = 0.5000
       0.5000
       0.5000
       0.5000
>> w1 = U*Null(1:2)
w1 = 2.5000
     3.5000
     4.5000

Verify that w1 lies in U and V by checking the projection onto each. (Notice the projection calculation for a non-orthogonal basis set.)

>> U*inv(U'*U)*U'*w1
ans = 2.5000
      3.5000
      4.5000
>> V*inv(V'*V)*V'*w1
ans = 2.5000
      3.5000
      4.5000

## 8 Conclusion

For finite vectors, linear operators are matrices.

Linear operators have a number of characteristics:
- They change when we make a change of basis
- They form a vector space
- Rotation and reflection matrices are ortho-normal matrices
- For any operator, there is an adjoint operator

Part 3: Linear Operators (Revised: Sep 10, 2012)
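The intersection computation of section 7 can be reproduced in NumPy (standing in for the Matlab above); the null space of $[U, -V]$ is taken from the rows of $V^T$ in the SVD beyond the rank:

```python
import numpy as np

U = np.array([[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]])
V = np.array([[2.0, 3.0], [3.0, 4.0], [5.0, 4.0]])

# Eqn (62): stack [U, -V] and find its null space via the SVD
M = np.hstack([U, -V])
_, s, Vt = np.linalg.svd(M)
rank = int(np.sum(s > 1e-10))
null_basis = Vt[rank:].T        # columns span null([U, -V])

a = null_basis[:2, 0]           # U-coefficients (first n_u entries)
w1 = U @ a                      # a vector in the intersection W = U cap V

# w1 projects onto itself in both spaces (non-orthogonal basis projection)
P_U = U @ np.linalg.inv(U.T @ U) @ U.T
P_V = V @ np.linalg.inv(V.T @ V) @ V.T
print(np.allclose(P_U @ w1, w1), np.allclose(P_V @ w1, w1))   # True True
```

Up to normalization and sign, `null_basis` matches the Matlab result [0.5, 0.5, 0.5, 0.5].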

# Singular Value Decomposition

## Contents

1 Introduction
  1.1 Rotation+ Matrices
  1.2 Scaling Matrices
2 The Singular Value Decomposition
  2.1 Numerical Example
  2.2 ...
      2.2.1 Demonstrating that A = sum_{k=1}^r u_k sigma_k v_k^T
3 Proof of the SVD theorem (following Will)
  3.1 Solving for the sigma_j and u_j, given the v_j
  3.2 ...
  3.3 ...
  3.4 Choosing the v_j correctly
  3.5 Proof by construction of the SVD
4 ...
  4.1 ...
      4.1.1 A numerical example using the generalized inverse
5 SVD Conclusion
6 Determining the rank of A and the four fundamental spaces with the SVD
  6.1 Example
7 Exact, Homogeneous, Particular and General Solutions to a Linear Equation
  7.1 The four types of the solution for y = A x
  7.2 Numerical example showing the generalized inverse
8 ...
9 History of the Singular Value Decomposition
10 Conclusions

Section 1.0.0

## 1 Introduction

Gilbert Strang calls the SVD:

"Absolutely a high point of linear algebra."

For a nice explanation of the SVD by Todd Will, see:
http://www.uwlax.edu/faculty/will/svd

The relation y = A x is the general linear operator from an Input Vector Space to an Output Vector Space:

$$y = A x, \quad y \in R^n, \quad x \in R^m, \quad A \in R^{n \times m} \qquad (1)$$

[Figure 1: Action of a matrix: A maps x, in the Input Space $R^m$, onto y, in the Output Space $R^n$.]

As we shall see, in all cases the linear transformation from Input Space to Output Space is made up of 3 parts:
1. An ortho-normal transformation from input coordinates to singular coordinates
2. Scaling (in singular coordinates)
3. An ortho-normal transformation from singular to output coordinates

## 1.1 Rotation+ Matrices

As we have seen, rotation+ matrices preserve lengths and angles.

As we saw in the previous section, an orthonormal matrix gives either a rotation (det R = +1) or a reflection (det R = -1). Some authors use the term rotation matrix for actions 1 and 3, because it captures the intuition of what is going on, but rotation alone is not sufficiently complete. We will use rotation+ to refer to the orthonormal matrices.

If we can understand y = A x in terms of rotations+ and scaling, many interesting results and practical methods will follow directly.

Part 4a: Singular Value Decomposition

Section 1.2.0

## 1.2 Scaling Matrices

Scaling matrices scale the Euclidean axes of a space. Example scaling matrices are:

$$\Sigma = \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \qquad (2)$$

$$\Sigma = \begin{bmatrix} 0.5 & 0 \\ 0 & 3 \end{bmatrix} \qquad (3)$$

[Figure 3: Action of a scaling matrix: the X axis scaled by 0.5, the Y axis scaled by 3.0.]

[Figure 4: Action of a scaling matrix with a singular value of zero for the Y axis: the X axis scaled by 0.5, the Y axis scaled by 0.0.]

Figure 4 was generated with:

FF = [ A collection of 2-vectors (specifying points) ]
Scale = [ 0.5, 0; 0 3.0];
FF1 = Scale * FF;

Scaling Matrices:
1. Are diagonal
2. Have zero or positive scaling values

A negative value flips the configuration; we'll do this with rotations+.

A scaling matrix does not have to be square. The matrices

$$\Sigma = \begin{bmatrix} 0.5 & 0 & 0 \\ 0 & 3 & 0 \end{bmatrix} \qquad (4)$$

$$\Sigma = \begin{bmatrix} 1 & 0 \\ 0 & 2 \\ 0 & 0 \end{bmatrix} \qquad (5)$$

are also scaling matrices, in operators which map from $R^3 \to R^2$ or $R^2 \to R^3$, respectively.

Section 2.0.0

## 2 The Singular Value Decomposition

The SVD THEOREM: Every matrix A can be decomposed into:
1. A rotation+ from input coordinates onto singular coordinates (the coordinates in which the scaling occurs)
2. Scaling in singular coordinates
3. A rotation+ from singular coordinates onto output coordinates

Said another way:

$$A \in R^{n \times m}, \quad A = U \Sigma V^T \qquad (6)$$

and so

$$y = A x = U \Sigma V^T x \qquad (7)$$

where:
1. $V^T$ is a rotation+ from input coordinates onto singular coordinates.
2. $\Sigma$ is a scaling matrix.
3. $U$ is a rotation+ from singular coordinates onto output coordinates.

Remarks:
1. The columns of U form an ortho-normal basis for $R^n$ (the output space). U is a rotation+ matrix.
2. The columns of V form an ortho-normal basis for $R^m$ (the input space). V is a rotation+ matrix.
3. The scaling matrix $\Sigma$ is zero everywhere except the main diagonal.

Uniqueness:
- The singular values are unique.
- The columns of U and V corresponding to distinct singular values are unique (up to scaling by -1).
- The remaining columns of U and V must lie in specific vector subspaces, as described below.

## 2.1 Numerical Example

>> A = [ 2  3 ;
         3  4 ;
         4 -2 ]
A =  2  3
     3  4
     4 -2
>> [U,S,V] = svd(A)
U = -0.5661  -0.1622  -0.8082
    -0.7926  -0.1622   0.5878
    -0.2265   0.9733  -0.0367
S =  6.2450   0
     0        4.3589
     0        0
V = -0.7071   0.7071
    -0.7071  -0.7071
>> A1 = U*S*V'
A1 = 2.0000   3.0000
     3.0000   4.0000
     4.0000  -2.0000

Part 4a: Singular Value Decomposition
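The numerical example can be repeated with NumPy's svd (which returns $V^T$ rather than V); reconstructing A verifies Eqn (6):

```python
import numpy as np

A = np.array([[2.0, 3.0], [3.0, 4.0], [4.0, -2.0]])
U, s, Vt = np.linalg.svd(A)      # full U is 3x3, Vt is 2x2

Sigma = np.zeros_like(A)         # 3x2 scaling matrix
Sigma[:2, :2] = np.diag(s)

A1 = U @ Sigma @ Vt              # Eqn (6): A = U Sigma V^T
print(np.round(s, 4))            # [6.245  4.3589]
print(np.allclose(A1, A))        # True
```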

Section 2.2.1

## 2.2.1 Demonstrating that $A = \sum_{k=1}^r u_k \sigma_k v_k^T$

The singular value decomposition can be written

$$A = U \Sigma V^T \qquad (8)$$

or

$$A = \sum_{k=1}^{r} u_k \sigma_k v_k^T \qquad (9)$$

where $u_k$ are the columns of U, $v_k$ are the columns of V, and r is the rank of A. Eqn (8) is a decomposition into matrices, while Eqn (9) is a decomposition into individual component vectors.

The development of the proof of the SVD and generalized inverse rests on the decomposition

$$A = \sum_{k=1}^{r} u_k \sigma_k v_k^T \qquad (10)$$

which is the relationship that tells us the connection between a specific vector $v_k$ in the input space and a specific vector $u_k$ in the output space. A general proof of the singular value decomposition theorem follows in section 3. Here we show that, given that a decomposition exists with the form

$$A = U \Sigma V^T \qquad (11)$$

where U and V are ortho-normal matrices and $\Sigma$ is a scaling matrix, then a decomposition according to Eqn (9) follows from Eqn (8).

Writing out the decomposition,

$$A = U \Sigma V^T = \begin{bmatrix} | & | & & | \\ u_1 & u_2 & \cdots & u_n \\ | & | & & | \end{bmatrix} \begin{bmatrix} \sigma_1 & 0 & 0 \\ 0 & \sigma_2 & 0 \\ 0 & 0 & \ddots \end{bmatrix} \begin{bmatrix} v_1^T \\ v_2^T \\ \vdots \\ v_m^T \end{bmatrix} \qquad (12)$$

Recall p = min(n, m) and r = rank(A). Because of the zeros in the scaling matrix, Eqn (12) reduces to one of Eqn (13) or (14), below.

When $m \le n$, A is given by Eqn (13), where the columns of U for j = m+1..n multiply zero rows in $\Sigma$, and are omitted:

$$A = \begin{bmatrix} | & | & & | \\ u_1 & u_2 & \cdots & u_m \\ | & | & & | \end{bmatrix} \begin{bmatrix} \sigma_1 v_1^T \\ \sigma_2 v_2^T \\ \vdots \\ \sigma_m v_m^T \end{bmatrix} \quad \begin{array}{l} \text{when } p = m < n, \\ \text{keep only } p \text{ columns of } U \end{array} \qquad (13)$$

And when n < m, A is given by Eqn (14), where the rows of $V^T$ for i = n+1..m multiply zero columns in $\Sigma$, and are omitted:

$$A = \begin{bmatrix} | & | & & | \\ \sigma_1 u_1 & \sigma_2 u_2 & \cdots & \sigma_n u_n \\ | & | & & | \end{bmatrix} \begin{bmatrix} v_1^T \\ v_2^T \\ \vdots \\ v_n^T \end{bmatrix} \quad \begin{array}{l} \text{when } p = n < m, \\ \text{keep only } p \text{ rows of } V^T \end{array} \qquad (14)$$

Part 4a: Singular Value Decomposition (Revised: Sep 10, 2012)
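Eqn (9) can be verified numerically; a NumPy sketch summing the rank-one terms $u_k \sigma_k v_k^T$:

```python
import numpy as np

A = np.array([[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]])
U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-12))       # rank of A

# Eqn (9): A as a sum of r rank-one outer products u_k sigma_k v_k^T
A_sum = sum(s[k] * np.outer(U[:, k], Vt[k, :]) for k in range(r))
print(np.allclose(A_sum, A))     # True
```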

Section 2.2.1 / Section 3.1.0

Either way, the product takes the form

$$A = \begin{bmatrix} | & | & & | \\ \sigma_1 u_1 & \sigma_2 u_2 & \cdots & \sigma_p u_p \\ | & | & & | \end{bmatrix} \begin{bmatrix} v_1^T \\ v_2^T \\ \vdots \\ v_p^T \end{bmatrix} \qquad (15)$$

Going across the rows on the right, and down the columns on the left, the elements of $\sigma_1 u_1$ form a vector outer product with the elements of $v_1^T$, $\sigma_2 u_2$ with $v_2^T$, etc. The mechanics of multiplying the terms of Eqn (15) give the result

$$A = \sum_{k=1}^{p} \sigma_k u_k v_k^T \qquad (16)$$

If r < p then the $\sigma_k$, k = {r+1, ..., p}, are zero, and those terms can be dropped from Eqn (16), giving the sought-after form:

$$A = \sum_{k=1}^{r} \sigma_k u_k v_k^T \qquad (17)$$

## 3 Proof of the SVD theorem (following Will)

Following the demonstration given by Will (see: http://www.uwlax.edu/faculty/will/svd), we first show that given the $v_j$, the decomposition

$$A = \sum_{k=1}^{r} u_k \sigma_k v_k^T = U \Sigma V^T \qquad (18)$$

can be derived. Then we show how to determine the $v_j$ satisfying a necessary condition.

## 3.1 Solving for the $\sigma_j$ and $u_j$, given the $v_j$

Given Eqn (18) we find that

$$A v_j = \left( \sum_{k=1}^{r} u_k \sigma_k v_k^T \right) v_j = u_j \sigma_j v_j^T v_j = u_j \sigma_j \qquad (19)$$

where all the terms $k \neq j$ drop out because $v_k^T v_j = 0$ for $k \neq j$, and 1 for $k = j$. Thus, the $u_j$ and $v_j$ vectors and $\sigma_j$ must satisfy

$$A v_j = u_j \sigma_j \qquad (20)$$

Eqn (20) immediately solves for all of the $\sigma_j$ and vectors $u_j$ corresponding to $\sigma_j > 0$:

$$\sigma_j = ||A v_j|| \qquad (21)$$

$$u_j = \frac{1}{\sigma_j} A v_j \qquad (22)$$

Part 4a: Singular Value Decomposition (Revised: Sep 10, 2012)

Section 3.2.0

## 3.2 Example

Consider

$$A = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}$$

Choose:

$$V = \begin{bmatrix} 0.39 & -0.92 \\ 0.92 & 0.39 \end{bmatrix}$$

Applying Eqns (21) and (22) to $v_1$:

$$A v_1 = \begin{bmatrix} 4.08 \\ 5.38 \\ 6.69 \end{bmatrix}, \quad \sigma_1 = \left\| \begin{bmatrix} 4.08 \\ 5.38 \\ 6.69 \end{bmatrix} \right\| = 9.51, \quad u_1 = \frac{1}{9.51} \begin{bmatrix} 4.08 \\ 5.38 \\ 6.69 \end{bmatrix} = \begin{bmatrix} 0.43 \\ 0.57 \\ 0.70 \end{bmatrix}$$

Repeating for $v_2$ gives $\sigma_2 = 0.77$ and the second column of U, so

$$\Sigma = \begin{bmatrix} 9.51 & 0 \\ 0 & 0.77 \\ 0 & 0 \end{bmatrix}, \qquad U = \begin{bmatrix} 0.43 & 0.81 & | \\ 0.57 & 0.11 & ? \\ 0.70 & -0.58 & | \end{bmatrix}$$

Then $A = U \Sigma V^T$ gives:

$$\begin{bmatrix} 1.00 & 4.00 \\ 2.00 & 5.00 \\ 3.00 & 6.00 \end{bmatrix} = \begin{bmatrix} 0.43 & 0.81 & | \\ 0.57 & 0.11 & ? \\ 0.70 & -0.58 & | \end{bmatrix} \begin{bmatrix} 9.51 & 0 \\ 0 & 0.77 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0.39 & 0.92 \\ -0.92 & 0.39 \end{bmatrix} \qquad (23)$$

Remarks:
- Our algorithm does not tell us how to choose $u_j$ if $\sigma_j$ is zero or non-existent. But either way, it doesn't matter for Eqn (23).
- Since U is an ortho-normal matrix, the missing columns must be an orthonormal basis for the orthogonal complement of the $u_j$, j = 1..r.

For U to be an ortho-normal matrix, $u_1$ and $u_2$ must be orthogonal and normal. They are automatically normalized by the calculation of Eqn (22). Check that they are orthogonal:

>> u1 = U(:,1)
u1 = -0.4287
     -0.5663
     -0.7039
>> u2 = U(:,2)
u2 =  0.8060
      0.1124
     -0.5812
>> u1'*u2
ans = 1.1102e-16

(Revised: Sep 10, 2012)
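The steps of section 3.2, Eqns (21) and (22) applied to a correctly chosen V, can be reproduced in NumPy; here V is taken from the orthonormal eigenvectors of $A^T A$ (the connection proved in Lemma 1 of section 3.4):

```python
import numpy as np

A = np.array([[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]])

# the correct V: orthonormal eigenvectors of A^T A
lam, V = np.linalg.eigh(A.T @ A)
order = np.argsort(lam)[::-1]    # sort eigenvalues descending, as in the notes
V = V[:, order]

sigma = [np.linalg.norm(A @ V[:, j]) for j in range(2)]           # Eqn (21)
U2 = np.column_stack([A @ V[:, j] / sigma[j] for j in range(2)])  # Eqn (22)

print(np.round(sigma, 2))                 # approx [9.51, 0.77]
print(abs(U2[:, 0] @ U2[:, 1]) < 1e-12)   # True: columns are orthogonal
```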

Section 3.3.0

## 3.3 What happens if we don't choose the correct ortho-normal matrix V?

For U to be an ortho-normal matrix, its columns must be orthonormal. The columns are normal automatically, by step (22), but are they orthogonal?

Try

$$V = \begin{bmatrix} 0.94 & -0.34 \\ 0.34 & 0.94 \end{bmatrix}$$

$$A v_1 = \begin{bmatrix} 2.31 \\ 3.59 \\ 4.87 \end{bmatrix}, \quad \sigma_1 = \left\| \begin{bmatrix} 2.31 \\ 3.59 \\ 4.87 \end{bmatrix} \right\| = 6.48, \quad u_1 = \frac{1}{6.48} \begin{bmatrix} 2.31 \\ 3.59 \\ 4.87 \end{bmatrix} = \begin{bmatrix} 0.36 \\ 0.55 \\ 0.75 \end{bmatrix}$$

Repeating steps (21) and (22) for $v_2$ gives:

$$\Sigma = \begin{bmatrix} 6.48 & 0 \\ 0 & 7.00 \\ 0 & 0 \end{bmatrix}, \qquad U = \begin{bmatrix} 0.36 & 0.49 & | \\ 0.55 & 0.57 & ? \\ 0.75 & 0.66 & | \end{bmatrix}$$

But now the columns of U are not orthogonal:

$$\left\langle \begin{bmatrix} 0.36 \\ 0.55 \\ 0.75 \end{bmatrix}, \begin{bmatrix} 0.49 \\ 0.57 \\ 0.66 \end{bmatrix} \right\rangle = 0.99 \neq 0$$

No! If V is not correctly chosen, U does not turn out to be an ortho-normal matrix!

## 3.4 Choosing the $v_j$ correctly

For finding the $\sigma_j$ and $u_j$ according to Eqns (21) and (22), the challenge is to find $v_j$ which are orthonormal, such that the $u_j$ are orthogonal (yet to be proven). This requires, for $j \neq i$, $u_j^T u_i = 0$, and so

$$u_j^T u_i = \left( \frac{1}{\sigma_j} v_j^T A^T \right) \left( \frac{1}{\sigma_i} A v_i \right) = 0 \qquad (24)$$

When $\sigma_j$ is zero or non-existent we are free to select $u_j$ in the orthogonal complement of $\{u_i\}$ (the left-null space of A), so we are interested in the case $\sigma_j, \sigma_i > 0$. In this case $(1/\sigma_j)(1/\sigma_i) \neq 0$, so Eqn (24) is verified if and only if

$$v_j^T A^T (A v_i) = v_j^T \left( A^T A \right) v_i = 0 \qquad (25)$$

Part 4a: Singular Value Decomposition (Revised: Sep 10, 2012)

Section 3.4.0

Lemma 1: When the vectors $v_i$ are ortho-normal and span $R^m$, then

$$v_j^T \left( A^T A \right) v_i = 0 \;\; \forall \, j \neq i \quad \text{if and only if} \quad A^T A \, v_i = \lambda_i v_i \qquad (26)$$

in which case $v_j^T (A^T A) v_i = \lambda_i v_j^T v_i = 0$.

Proof:
- Necessary: Because the $v_i$ are ortho-normal, the vectors $v_j$, $j \neq i$, span the entire (m-1)-dimensional space orthogonal to $v_i$; therefore $A^T A \, v_i$ must lie in the 1-D space spanned by $v_i$.
- Sufficient: Since $A^T A \, v_i = \lambda_i v_i$, then $v_j^T A^T A \, v_i = v_j^T \lambda_i v_i = 0$, because $v_j \perp v_i$. QED

Square matrices have special vectors which give a scaled version of themselves back under multiplication. For example:

$$\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 2 \end{bmatrix} = 2 \begin{bmatrix} 1 \\ 1 \end{bmatrix} \qquad (27)$$

These special vectors are called eigenvectors and the scale factors are called the eigenvalues.
- For an m x m matrix, there can be at most m linearly independent eigenvectors.
- Eigenvectors can be scaled (consider multiplying left and right of Eqn (26) by a scale factor). We usually normalize eigenvectors.

In the general case of an m x m matrix, eigenvalues and eigenvectors can be a bit messy:
- They can be complex.
- The eigenvectors will not, in general, be orthogonal, and
- There may not even be a complete set of eigenvectors (in which case we say the matrix is defective).

However when Q is a Hermitian matrix (as $Q = A^T A$ must be) we are in luck.

Lemma 2: When Q is Hermitian (that is, when $Q = \bar{Q}^T$, where $\bar{Q}$ is the complex conjugate), then
1. There is a complete set of eigenvectors, and furthermore, they are orthogonal (ortho-normal, when we normalize them), and
2. The eigenvalues are real (even if Q is complex).

Remarks:
1. A symmetric matrix, such as $Q = A^T A$ with A real, is a real Hermitian matrix.
2. Further discussion is deferred until our in-depth discussion of the eigen-system of a matrix.

Part 4a: Singular Value Decomposition
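Both lemmas can be checked numerically; a NumPy sketch confirming that $Q = A^T A$ has orthonormal eigenvectors and that $v_j^T Q v_i$ vanishes off the diagonal:

```python
import numpy as np

A = np.array([[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]])
Q = A.T @ A                      # Hermitian (symmetric, since A is real)

lam, V = np.linalg.eigh(Q)       # eigh: eigensolver for Hermitian matrices
print(np.allclose(V.T @ V, np.eye(2)))     # True: eigenvectors orthonormal

# Lemma 1 / Eqn (26): v_j^T (A^T A) v_i = lambda_i on the diagonal, 0 off it
G = V.T @ Q @ V
print(np.allclose(G, np.diag(lam)))        # True
```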

Section 3.5.0

## 3.5 Proof by construction of the SVD

With the above lemmas taken together, for any matrix A:
- $A^T A$ is Hermitian (symmetric if A is real).
- Lemma 2 shows that $A^T A$ has a complete set of real, orthogonal eigenvectors.
- Lemma 1 shows that these eigenvectors satisfy the condition to be the input basis vectors of the SVD, the $v_j$ in

$$A = \sum_{k=1}^{r} u_k \sigma_k v_k^T = U \Sigma V^T \qquad (28)$$

according to

$$\sigma_j = ||A v_j|| \qquad (29)$$

$$u_j = \frac{1}{\sigma_j} A v_j \qquad (30)$$

The remaining $u_j$ are determined as basis vectors for the orthogonal complement of the $u_j$ corresponding to $\sigma_j \neq 0$. Following these steps, the singular value decomposition of A is constructed. QED

Section 4.0.0

Consider

$$y = A x$$

with A and y given. We are looking for a particular solution that solves

$$\hat{y} = A x_p$$

where $\hat{y}$ is the projection of y onto the column space of A. Collect the first r columns of U:

$$U_c = \begin{bmatrix} u_1 & u_2 & \cdots & u_r \end{bmatrix} \qquad (31)$$

Then

$$\hat{y} = U_c U_c^T y = \sum_{k=1}^{r} u_k u_k^T y \qquad (32)$$

Writing Eqn (32) out graphically shows

$$\hat{y} = \begin{bmatrix} | & & | \\ u_1 & \cdots & u_r \\ | & & | \end{bmatrix} \begin{bmatrix} u_1^T \\ \vdots \\ u_r^T \end{bmatrix} y = \begin{bmatrix} | & & | \\ u_1 & \cdots & u_r \\ | & & | \end{bmatrix} \begin{bmatrix} u_1^T y \\ \vdots \\ u_r^T y \end{bmatrix} = \begin{bmatrix} | & & | \\ u_1 & \cdots & u_r \\ | & & | \end{bmatrix} \begin{bmatrix} a_1 \\ \vdots \\ a_r \end{bmatrix} = a_1 u_1 + a_2 u_2 + \cdots + a_r u_r \qquad (33)$$

Part 4a: Singular Value Decomposition

## EE/ME 701: Advanced Linear Systems (Section 4.0.0)

Eqn (33) shows that the projection of y onto the column space of A can be written

$$\hat{y} = a_1 u_1 + a_2 u_2 + \cdots + a_r u_r \qquad (34)$$

with

$$a_k = u_k^T y \qquad (35)$$

So far there is nothing surprising about Eqns (33)-(35). They say that we can write the output of

$$\hat{y} = A x_p \qquad (36)$$

as a linear combination of basis vectors on the column space of A. We can write Eqns (33)-(35) whenever we have $U_c$, an ortho-normal basis for the column space of A.

Without the SVD, the analysis would stop here; there is no way to discover the contributions of x that give the basis coefficients $a_k$. With the SVD there is a way to discover the contributions of x to each term! We know each term in the output is given by:

$$y = A x = \left( \sum_{k=1}^{r} u_k \sigma_k v_k^T \right) x = a_1 u_1 + a_2 u_2 + \cdots + a_r u_r \qquad (37)$$

with

$$a_k = \sigma_k v_k^T x \qquad (38)$$

Any vector x in the row space of A can be written

$$x = b_1 v_1 + b_2 v_2 + \cdots + b_r v_r \qquad (39)$$

When we replace x in Eqn (38) with the expanded x of (39), we get

$$a_k = \sigma_k v_k^T x = \sigma_k v_k^T (b_1 v_1 + b_2 v_2 + \cdots + b_r v_r) \qquad (40)$$

But the basis vectors $v_k$ are ortho-normal, so

$$a_k = \sigma_k v_k^T (b_1 v_1 + \cdots + b_r v_r) = \sigma_k v_k^T v_k b_k = \sigma_k b_k \qquad (41)$$

Using Eqns (35) and (41) gives $b_k$, the basis coefficients of $x_p$:

$$b_k = (1/\sigma_k) \, a_k = (1/\sigma_k) \, u_k^T \hat{y} \qquad (42)$$

And so

$$x_p = \sum_{k=1}^{r} v_k b_k = \sum_{k=1}^{r} v_k (1/\sigma_k) u_k^T \hat{y} = A^\# \hat{y} \qquad (43)$$

with

$$A^\# = \sum_{k=1}^{r} v_k (1/\sigma_k) u_k^T \qquad (44)$$

Or in Matlab:

>> Asharp = pinv(A)
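Eqn (44) can be implemented directly and compared against the library pseudo-inverse; a NumPy sketch (np.linalg.pinv standing in for Matlab's pinv, and the test vector y is an assumption for illustration):

```python
import numpy as np

A = np.array([[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]])
U, s, Vt = np.linalg.svd(A)
r = int(np.sum(s > 1e-12))

# Eqn (44): A# = sum_k v_k (1/sigma_k) u_k^T
Asharp = sum(np.outer(Vt[k, :], U[:, k]) / s[k] for k in range(r))
print(np.allclose(Asharp, np.linalg.pinv(A)))   # True

# Eqns (43) and (36): x_p = A# y reproduces the projection of y onto col(A)
y = np.array([1.0, 1.0, 1.0])
xp = Asharp @ y
P = U[:, :r] @ U[:, :r].T        # projector onto the column space
print(np.allclose(A @ xp, P @ y))   # True
```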

Section 4.1.1

## 4.1.1 A numerical example using the generalized inverse

The problem: given

$$y = A x \qquad (45)$$

with data vector y, find

$$A^\# = \sum_{k=1}^{r} v_k (1/\sigma_k) u_k^T \qquad (46)$$

and the particular solution

$$x_p = A^\# \hat{y} \qquad (47)$$

which lies in the row space of A, and gives

$$\hat{y} = A x_p \qquad (48)$$

with $\hat{y}$ the projection of y onto the column space of A.

Eqn (46) always works, for every shape or rank of matrix. Eqn (46) also gives a new degree of freedom: we can choose r! (More on this at the end of the notes on SVD.)

Consider the data

A = [ 1  5   6
      2  6   8
      3  7  10
      4  8  12 ]
ybar = [ 6
         5
         5
         3 ]

Before proceeding, let's find the correct $\hat{y}$, the projection of y onto col(A):

>> W = GramSchmidt(A)
W = 0.1826   0.8165
    0.3651   0.4082
    0.5477   0.0000
    0.7303  -0.4082
>> yhat = W*W'*ybar
yhat = 6.1000
       5.2000
       4.3000
       3.4000

Trying the left pseudo-inverse:

>> xp1 = inv(A'*A) * A' * ybar
Warning: Matrix is close to singular or badly scaled.
         Results may be inaccurate. RCOND = 8.606380e-18.
xp1 = -2.8750
       0.6875
      -1.0000

Double checking whether A*xp1 gives $\hat{y}$:

>> A*xp1
ans =  -5.4375
       -9.6250
      -13.8125
      -18.0000

A mile off from $\hat{y}$! (The warning is to be taken seriously!)

Part 4a: Singular Value Decomposition
## EE/ME 701: Advanced Linear Systems (Section 4.1.1)

Now with the generalized inverse (also called the pseudo-inverse):

>> [U,S,V] = svd(A)
U = -0.3341   0.7671  -0.4001  -0.3741
    -0.4359   0.3316   0.2546   0.7970
    -0.5378  -0.1039   0.6910  -0.4717
    -0.6396  -0.5393  -0.5455   0.0488
S = 23.3718   0        0
     0        1.3257   0
     0        0        0.0000
     0        0        0
V = -0.2301  -0.7834  -0.5774
    -0.5634   0.5910  -0.5774
    -0.7935  -0.1924   0.5774

There are 2 non-zero σ's, so using Eqn (46):

>> Asharp = V(:,1)*(1/S(1,1))*U(:,1)' + V(:,2)*(1/S(2,2))*U(:,2)'
Asharp = -0.4500  -0.1917   0.0667   0.3250
          0.3500   0.1583  -0.0333  -0.2250
         -0.1000  -0.0333   0.0333   0.1000

Eqn (46) is implemented by the pinv() command in Matlab:

>> Asharp = pinv(A)
Asharp = -0.4500  -0.1917   0.0667   0.3250
          0.3500   0.1583  -0.0333  -0.2250
         -0.1000  -0.0333   0.0333   0.1000

>> xp = Asharp * ybar
xp = -2.3500
      2.0500
     -0.3000

or

>> xp = pinv(A)*ybar
xp = -2.3500
      2.0500
     -0.3000

Double checking confirms that $A x_p$ gives $\hat{y}$.

Previously we saw three cases for the equation

$$y = A x, \quad A \in R^{n \times m} \qquad (49)$$

Case                  Size of A   Tool                  Must Exist
Exactly constrained   n = m       Matrix Inverse        A^{-1}
Over constrained      n > m       Left Pseudo-Inverse   (A^T A)^{-1}
Under constrained     n < m       Right Pseudo-Inverse  (A A^T)^{-1}

Each of these cases requires that a matrix be invertible, or other methods are required. With the generalized inverse

$$x_p = A^\# \hat{y}, \quad \text{with} \quad A^\# = \sum_{k=1}^{r} v_k (1/\sigma_k) u_k^T \qquad (50)$$

- There is no matrix inversion.
- At most the basis vectors corresponding to $\sigma_k > tol$ are retained (we have the freedom to choose r < rank(A)).
- If $A \neq 0$, at least one $\sigma_k$ is guaranteed to be greater than zero.

Part 4a: Singular Value Decomposition

Section 5.0.0

## 5 SVD Conclusion

The singular value decomposition gives us an expansion for any real matrix A of the form

$$A = U \Sigma V^T$$

- $V^T$: rotation+ from input coordinates to singular coordinates, where $V = \left[ v_1 \cdots v_m \right] \in R^{m \times m}$
- $\Sigma$: scaling in singular coordinates
- $U$: rotation+ from singular coordinates to output coordinates, where $U = \left[ u_1 \cdots u_n \right] \in R^{n \times n}$
- expansion for A: $\; A = \sum_{k=1}^{r} u_k \sigma_k v_k^T$

We have seen that if the $v_i$ are the eigenvectors of $A^T A$ then the $\sigma_k$ and $u_k$ are straight-forward to compute. Since $Q = A^T A$ is a Hermitian matrix, the eigenvectors exist and are orthogonal.

Note: if n < m it may be numerically more convenient to compute the eigenvectors of $Q' = A A^T$, which is a smaller matrix. Student exercise: how would the SVD be determined from the eigenvectors of $Q'$?

Section 6.0.0

## 6 Determining the rank of A and the four fundamental spaces with the SVD

The rank of a matrix is the number of singular values which are greater than a tolerance. Recall, singular values are always positive or zero.

In any numerical calculation we have round-off error, so a $\sigma_j$ may be greater than zero due to round-off error alone. So we set a tolerance for the minimum singular value which will be considered non-zero. Matlab's default value is given by:

tol = max(size(A)) * norm(A) * eps

where eps = 2.2204e-16 (for 64-bit floating point calculation).

The expansion of A is given as:

$$U \Sigma V^T = \begin{bmatrix} | & & | & | & & | \\ u_1 & \cdots & u_r & u_{r+1} & \cdots & u_n \\ | & & | & | & & | \end{bmatrix} \begin{bmatrix} \sigma_1 & & & & \\ & \ddots & & & \\ & & \sigma_r & & \\ & & & 0 & \\ & & & & \ddots \end{bmatrix} \begin{bmatrix} v_1^T \\ \vdots \\ v_r^T \\ v_{r+1}^T \\ \vdots \\ v_m^T \end{bmatrix}$$

where $\sigma_1 \ldots \sigma_r$ are the singular values greater than the tolerance, and the rest of the diagonal elements of $\Sigma$ are effectively zero.

As we are about to see, bases for the four fundamental spaces of a matrix are given directly from the singular value decomposition.

Section 6.0.0

Partitioning U at column r and $V^T$ at row r, the columns of U and V provide bases for the Column, Left-Null, Row and Null spaces:

$$U_c = \begin{bmatrix} | & & | \\ u_1 & \cdots & u_r \\ | & & | \end{bmatrix}, \quad U_{ln} = \begin{bmatrix} | & & | \\ u_{r+1} & \cdots & u_n \\ | & & | \end{bmatrix}, \quad V_r = \begin{bmatrix} | & & | \\ v_1 & \cdots & v_r \\ | & & | \end{bmatrix}, \quad V_n = \begin{bmatrix} | & & | \\ v_{r+1} & \cdots & v_m \\ | & & | \end{bmatrix}$$

These are returned by the function spaces():

function [Row, Col, Null, Lnull] = spaces(M)
%
% function [Row, Col, Null, Lnull] = spaces(M)
%
%   Return the four fundamental spaces
%
[n,m] = size(M);
[u,s,v] = svd(M);
r = rank(M);
%% select spaces
Row   = v(:,1:r);
Null  = v(:,(r+1):m);
Col   = u(:,1:r);
Lnull = u(:,(r+1):n);

Part 4a: Singular Value Decomposition (Revised: Sep 10, 2012)
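A NumPy analogue of the spaces() function above (a sketch; the rank is taken from the singular values using Matlab's default tolerance, and the test matrix anticipates the example of section 6.1):

```python
import numpy as np

def spaces(M):
    """Return ortho-normal bases (Row, Col, Null, Lnull) of the
    four fundamental spaces of M, via the SVD."""
    n, m = M.shape
    u, s, vt = np.linalg.svd(M)
    # Matlab default: tol = max(size(M)) * norm(M) * eps, with norm(M) = s[0]
    tol = max(n, m) * np.finfo(float).eps * (s[0] if s.size else 0.0)
    r = int(np.sum(s > tol))
    v = vt.T
    return v[:, :r], u[:, :r], v[:, r:], u[:, r:]

A = np.array([[1.0, 0, 1], [2, 0, 2], [2, 1, 3], [1, 1, 2]])
Row, Col, Null, Lnull = spaces(A)
print(Row.shape[1])               # 2: the numerical rank
print(np.allclose(A @ Null, 0))   # True: Null spans null(A)
```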

Section 6.1.0

## 6.1 Example

Consider

$$A = \begin{bmatrix} 1 & 0 & 1 \\ 2 & 0 & 2 \\ 2 & 1 & 3 \\ 1 & 1 & 2 \end{bmatrix} \qquad (51)$$

The four fundamental spaces of matrix A can be found through the SVD:

>> [U,S,V] = svd(A)
U = -0.2543   0.3423   0.8987  -0.1030
    -0.5086   0.6846  -0.3615   0.3769
    -0.6947  -0.2506  -0.1757  -0.6509
    -0.4405  -0.5928   0.1757   0.6509
S =  5.3718   0        0
     0        1.0694   0
     0        0        0.0000
     0        0        0
V = -0.5774   0.5774   0.5774
    -0.2113  -0.7887   0.5774
    -0.7887  -0.2113  -0.5774

We need to take a look at the third singular value to see if it is approximately zero, or just quite small.

>> ss = diag(S); ss(3)
ans = 4.2999e-16

$\sigma_3$ is effectively zero (zero to within the round-off error), so A is rank 2. Now form the 4 fundamental spaces.

>> [n, m] = size(A)
n = 4 ,  m = 3
>> r = sum(ss > norm(A)*max(n,m)*eps)
r = 2

Since the rank is 2, a basis for the row space is given by the first 2 columns of V:

>> Vr = V(:, 1:r)
Vr = -0.5774   0.5774
     -0.2113  -0.7887
     -0.7887  -0.2113

A basis for the null space is given by the remaining columns of V:

>> Vn = V(:, (r+1):m)
Vn =  0.5774
      0.5774
     -0.5774

Since the rank is 2, a basis for the column space is given by the first 2 columns of U:

>> Uc = U(:, 1:r)
Uc = -0.2543   0.3423
     -0.5086   0.6846
     -0.6947  -0.2506
     -0.4405  -0.5928

And a basis for the left-null space is given by the remaining columns of U:

>> Un = U(:, (r+1):n)
Un =  0.8987  -0.1030
     -0.3615   0.3769
     -0.1757  -0.6509
      0.1757   0.6509

Section 7.1.0

## 7 Exact, Homogeneous, Particular and General Solutions to a Linear Equation

## 7.1 The four types of the solution for y = A x

There are four elements in the solution for the system of equations

$$y = A x \qquad (52)$$

Exact Solution: An exact solution is one that exactly satisfies Eqn (52). Recall that if x is over-constrained (more y's than x's), then we can get a solution which is not an exact solution.

Homogeneous Solution: A homogeneous solution is a solution to

$$A x_h = 0$$

There is always the trivial homogeneous solution, $x_h = 0$. To have more interesting homogeneous solutions, matrix A must have a non-trivial null space. Suppose the null space is given by a basis $\{n_1, n_2, \cdots, n_{\nu}\}$; then

$$x_h = \alpha_1 n_1 + \alpha_2 n_2 + \cdots + \alpha_{\nu} n_{\nu}, \qquad A x_h = 0 \qquad (53)$$

for any coefficients $\alpha_i$: if there is a non-trivial homogeneous solution, there are infinitely many of them.

## EE/ME 701: Advanced Linear Systems (Section 7.1.0)

Particular Solution: A particular solution is one that solves Eqn (52), either exactly or in a least-squared-error sense. $x_p$ is the particular solution, and $\hat{y} = A x_p$. The residual,

$$\tilde{y} = y - \hat{y} \qquad (54)$$

may be zero (an exact solution) or may be minimal.

Recall the several different norms; minimizing the residual for each different norm can give a different meaning to $x_p$. One can assume that $\tilde{y}$ is minimized in the least-squares sense, unless a different norm is given.

When working with the 2-norm (the usual case), each of the terms relating to the particular solution comes from a fundamental space:
- $\hat{y}$ lies in the column space; it is given as the projection of y onto the column space.
- $x_p$ lies in the row space; it is given as $x_p = A^\# \hat{y}$  (55)
- $\tilde{y}$ lies in the left-null space.

General Solution: The general solution is a parametrized solution; it is the set of all solutions which give $\hat{y}$:

$$x_g = x_p + x_h, \qquad \hat{y} = A x_g \qquad (56)$$

If the null space and any particular solution are known, then all of the general solutions are given by Eqn (56).

## 7.2 Numerical example showing the generalized inverse

Continuing with the example in section 6.1,

>> Uc = U(:, 1:r)
Uc = [ -0.2543   0.3423
       -0.5086   0.6846
       -0.6947  -0.2506
       -0.4405  -0.5928 ]
>> Vr = V(:, 1:r)
Vr = [ -0.5774   0.5774
       -0.2113  -0.7887
       -0.7887  -0.2113 ]
>> SigmaR = [ 5.3718   0
              0        1.0694 ]

so that $A = U_c \, \Sigma_r \, V_r^T$. Keep in mind: even though $A \in R^{4 \times 3}$, $\Sigma_r$ is a 2x2 matrix, and $U_c$ and $V_r$ each have 2 columns.

Choose an example y, and project y onto the column space:

>> Y = [ 1 2 3 4 ]'
>> Yhat = Uc * Uc' * Y
Yhat = [ 0.8182
         1.6364
         3.9091
         3.0909 ]

Section 7.2.0

Now compute $x_p$:

>> Xp = Vr * inv(SigmaR) * Uc' * Y
Xp = [ -0.2121
        1.2424
        1.0303 ]

Double check that $A x_p$ gives $\hat{y}$:

>> A * Xp
ans = [ 0.8182
        1.6364
        3.9091
        3.0909 ]

The example y = A x looks like an over-constrained problem (4 data, 3 unknowns), so try the matrix pseudo-inverse:

>> XX = inv(A'*A) * A' * Y
Warning: Matrix is close to singular or badly scaled.
         Results may be inaccurate. RCOND = 5.139921e-18.
XX = [ -6.2500
        1.5000
        4.2500 ]

Comparing the results:

A * XX     A * Xp    Yhat     Y
-------    ------    ------   ---
-2.0000    0.8182    0.8182   1
-4.0000    1.6364    1.6364   2
 1.7500    3.9091    3.9091   3
 3.7500    3.0909    3.0909   4

Let's take a look at the residual $\tilde{y} = y - \hat{y}$:

>> ytilde = Y - Yhat
ytilde =  0.1818
          0.3636
         -0.9091
          0.9091

$\tilde{y}$ should lie in the left-null space. The basis for the left-null space is Un:

>> Un = U(:, (r+1):n)
Un = [ 0.8987  -0.1030
      -0.3615   0.3769
      -0.1757  -0.6509
       0.1757   0.6509 ]

Checking that $\tilde{y}$ lies in span(Un): projecting $\tilde{y}$ onto Un returns $\tilde{y}$ itself.

>> ytilde2 = Un * Un' * ytilde
ytilde2 =  0.1818
           0.3636
          -0.9091
           0.9091
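The particular-solution pipeline of this example runs the same way in NumPy (a sketch of the steps above; the SVD sign conventions may differ from Matlab's, but the projections and $x_p$ are unaffected):

```python
import numpy as np

A = np.array([[1.0, 0, 1], [2, 0, 2], [2, 1, 3], [1, 1, 2]])
Y = np.array([1.0, 2.0, 3.0, 4.0])

U, s, Vt = np.linalg.svd(A)
r = 2
Uc, Vr, Sr = U[:, :r], Vt[:r, :].T, np.diag(s[:r])

Yhat = Uc @ Uc.T @ Y                       # projection of Y onto col(A)
Xp = Vr @ np.linalg.inv(Sr) @ Uc.T @ Y     # particular solution
ytilde = Y - Yhat                          # residual, in the left-null space

print(np.round(Xp, 4))                # approx [-0.2121, 1.2424, 1.0303]
print(np.allclose(A @ Xp, Yhat))      # True
print(np.allclose(Uc.T @ ytilde, 0))  # True: residual orthogonal to col(A)
```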

Section 7.2.0

Finally, form the general solution

$$x_g = x_p + x_h$$

In some cases the null space will be empty, so $x_g = x_p$. But in this case dim null(A) = 1, and

$$x_h = a_1 v_m = a_1 \begin{bmatrix} .577 \\ .577 \\ -.577 \end{bmatrix}$$

So $x_g$ is given as:

$$x_g = \begin{bmatrix} -0.2121 \\ 1.2424 \\ 1.0303 \end{bmatrix} + a_1 \begin{bmatrix} .577 \\ .577 \\ -.577 \end{bmatrix}$$

Coefficient $a_1$ can be used to satisfy one constraint, such as making $x_g(2) = 1.0$, which gives $a_1 = -0.2424/0.577 = -0.42$:

>> Xg = Xp - 0.42*Xh
Xg = [ -0.455
        1.000
        1.273 ]
>> A * Xg
ans = 0.8182
      1.6364
      3.9091
      3.0909

which again equals $\hat{y}$, since $A x_h = 0$.

Section 8.0.0

Define $\bar{\sigma}$ as the largest singular value of A and $\underline{\sigma}$ as the smallest singular value of A. Also, order the singular values so that $\sigma_1 = \bar{\sigma}$, ..., $\sigma_p = \underline{\sigma}$. Then

$$||A||_2 = \bar{\sigma}$$

By writing $y = \sum_{k=1}^{r} u_k \sigma_k v_k^T x$, it is clear that the largest $||y||_2$ is given when x corresponds to the largest singular value of A. Choosing $x = v_1$ gives $y = \sigma_1 u_1$ and $||y||_2 = \sigma_1$.

The condition number of A is given as

$$\mathrm{cond}(A) = \frac{\bar{\sigma}}{\underline{\sigma}} \qquad (57)$$

The condition number is a measure of how much round-off errors can be expanded by multiplying a vector by A. Example: a problem with cond(A) = 10 is said to be well conditioned; if cond(A) = 10000 the problem is said to be poorly conditioned. If $\underline{\sigma} = 0$, the condition number is unbounded.

Consider that when we form $A^\#$ we use $1/\sigma_k$: if there is a $\sigma_k$ which is very small, there will be terms in $A^\#$ which are very large. With

$$x = A^\# y$$

if there are noise components in y, when they hit the large terms in $A^\#$ they will give large errors.

A method to handle poorly conditioned problems: use

$$A^\# = \sum_{k=1}^{q} v_k (1/\sigma_k) u_k^T \quad \text{with some } q < r$$

That is, throw out the smallest singular values.
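The effect of a small singular value, and the truncated-$A^\#$ remedy, can be seen in a short NumPy experiment (the matrix and noise level are assumptions for illustration; np.linalg.pinv's rcond argument drops singular values below rcond times the largest):

```python
import numpy as np

# An ill-conditioned symmetric 2x2: sigma = (1, 1e-6), so cond(A) = 1e6
U, _ = np.linalg.qr(np.array([[1.0, 1.0], [1.0, -1.0]]))
A = U @ np.diag([1.0, 1e-6]) @ U.T

x_true = np.array([1.0, 2.0])
y = A @ x_true + 1e-4 * np.array([1.0, -1.0])   # small noise in y

x_full = np.linalg.pinv(A) @ y                  # uses 1/sigma_2 = 1e6
x_trunc = np.linalg.pinv(A, rcond=1e-3) @ y     # drops sigma < 1e-3 * sigma_1

err_full = np.linalg.norm(x_full - x_true)
err_trunc = np.linalg.norm(x_trunc - x_true)
print(err_full > err_trunc)   # True: truncation suppresses the noise blow-up
```

Here the noise component along $u_2$ is multiplied by $1/\sigma_2 = 10^6$ in the full pseudo-inverse, while the truncated version gives up a small part of the signal in exchange for dropping that amplification.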

## EE/ME 701: Advanced Linear Systems (Section 8.0.0)

The absolute value of the determinant of any square matrix is the product of its singular values:

$$|\det(A)| = \prod_{k=1}^{p} \sigma_k \qquad (58)$$

Corollary: if any $\sigma_k = 0$, then det(A) = 0.

Recall that when $A \in R^{n \times n}$,

$$\det(A) = \prod_{k=1}^{n} \lambda_k, \quad \text{and so} \quad \left| \prod_{k=1}^{n} \lambda_k \right| = \prod_{k=1}^{p} \sigma_k$$
Section 10.0.0

## 9 History of the Singular Value Decomposition

The singular value decomposition was originally developed by differential
geometers, who wished to determine whether a real bilinear form could be made
equal to another by independent orthogonal transformations of the two spaces
it acts on. Eugenio Beltrami and Camille Jordan discovered independently,
in 1873 and 1874 respectively, that the singular values of the bilinear forms,
represented as a matrix, form a complete set of invariants for bilinear forms
under orthogonal substitutions. James Joseph Sylvester also arrived at the singular
value decomposition for real square matrices in 1889, apparently independent
of both Beltrami and Jordan. Sylvester called the singular values the canonical
multipliers of the matrix A. The fourth mathematician to discover the singular
value decomposition independently is Autonne in 1915, who arrived at it via
the polar decomposition. The first proof of the singular value decomposition
for rectangular and complex matrices seems to be by Eckart and Young in 1936;
they saw it as a generalization of the principal axis transformation for Hermitian
matrices.
In 1907, Erhard Schmidt defined an analog of singular values for integral operators (which are compact, under some weak technical assumptions); it seems he was unaware of the parallel work on singular values of finite matrices. This theory was further developed by Émile Picard in 1910, who was the first to call the numbers singular values (or rather, valeurs singulières).
Practical methods for computing the SVD date back to Kogbetliantz in 1954, 1955
and Hestenes in 1958 resembling closely the Jacobi eigenvalue algorithm, which
uses plane rotations or Givens rotations. However, these were replaced by the
method of Gene Golub and William Kahan published in 1965 (Golub & Kahan
1965), which uses Householder transformations or reflections. In 1970, Golub
and Christian Reinsch published a variant of the Golub/Kahan algorithm that is
still the one most-used today.
[Wikipedia]

Page 41

Page 42

## EE/ME 701: Advanced Linear Systems

Section 10.0.0

10 Conclusions

The Singular Value Decomposition gives us a geometric picture of matrix-vector multiplication, comprised of:
- a rotation,
- a scaling,
- a rotation.

Using the SVD we can find basis vectors for the four fundamental spaces:
- the basis sets are ortho-normal, and
- the basis vectors of the row and column spaces are linked by the singular values.

Numerically, computation of the SVD is robust because computation of the eigenvectors of a symmetric matrix is robust.

The SVD can be used to compute:
- rank (A)
- ||A||_2
- cond (A)
- ortho-normal basis vectors for the four fundamental spaces
- the solution to y = A x for the general case, without matrix inversion.
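Each of these uses can be sketched in a few lines. The notes use Matlab; the numpy sketch below is only an illustrative analog, and the example matrix is arbitrary (not from the notes):

```python
import numpy as np

# Arbitrary rank-2 example: the third row is the sum of the first two.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [5.0, 7.0, 9.0]])

U, s, Vt = np.linalg.svd(A)

tol = s[0] * max(A.shape) * np.finfo(float).eps
rank = int(np.sum(s > tol))     # rank(A): number of nonzero singular values
norm2 = s[0]                    # ||A||_2: the largest singular value
cond = s[0] / s[rank - 1]       # conditioning over the range of A

# Minimum-norm least-squares solution of y = A x, without forming an inverse:
y = np.array([1.0, 0.0, 1.0])
x = Vt[:rank].T @ ((U[:, :rank].T @ y) / s[:rank])
```

Here `rank` comes out 2, and `x` agrees with applying the pseudo-inverse `np.linalg.pinv(A) @ y`.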


Part 4b: Eigenvalues and Eigenvectors

(Revised: Sep 10, 2012)

Contents

1 Introduction
  1.1 Review of basic facts about eigenvectors and eigenvalues
    1.1.1 Looking at eigenvalues and eigenvectors in relation to the null space of (λ I − A)
    1.1.2 Repeated eigenvalues
2 Properties of the Eigensystem
  2.1 Bay section 4.1, A-Invariant Subspaces
  2.2 Finding eigenvalues and eigenvectors
  2.3 Interpreting complex eigenvalues / eigenvectors
    2.3.1 Example: 3D Rotation
  2.4 The eigen-system of symmetric (Hermitian) matrices
3 The Jordan-form
    3.2.1 Regular and Generalized Eigenvectors
    3.2.2 First Jordan Form Example
    3.2.3 More on Jordan blocks
  3.3 One more twist, freedom to choose the regular eigenvector
    3.3.1 Example where regular E-vecs do not lie in the column space of (A − λk I)
4 Conclusions
5 Review questions and skills

1 Introduction

We have seen the basic case of eigenvalues and eigenvectors,

    A vk = λk vk .    (1)

In this chapter we will elaborate the relevant concepts to handle every case.

1.1 Review of basic facts about eigenvectors and eigenvalues

Only square matrices have eigensystems.

The eigenvector satisfies the relationship A vk = λk vk , which leads to the eigenvector being the solution to

    (A − λk I) vk = 0    (2)

or, said another way, the eigenvector is a vector in the null space of the matrix (λk I − A).

Notes:

1. Any vector in the null space of (λk I − A) is an eigenvector. If the null space is 2-dimensional, then any vector in this 2D subspace is an eigenvector.

2. Since the determinant of any matrix with a non-empty null space is zero, we have

    det (λk I − A) = 0 ,   k = 1..n    (3)

which gives the characteristic equation of matrix A.

1.1.1 Looking at eigenvalues and eigenvectors in relation to the null space of (λ I − A)

Starting from

    (λk I − A) vk = 0    (4)

the eigenvalues are values of λk such that (λk I − A) has a non-trivial null space, since

    det (λk I − A) = 0    (5)

The eigenvectors are the basis vectors of the null space!

Theorem: for each distinct eigenvalue, there is at least one independent eigenvector.

Proof: The proof follows directly from Eqns (4) and (5).

Example:

>> A = [ 2 1;
         0 2]
A =
     2     1
     0     2
>> [V,U] = eig(A)
V =
    1.0000   -1.0000
         0    0.0000
U =
     2     0
     0     2

1.1.2 Repeated eigenvalues

When there are repeated eigenvalues:

1. We are assured to have at least 1 independent eigenvector.
2. There may be fewer independent eigenvectors than eigenvalues.

Definitions:

The algebraic multiplicity of an eigenvalue is the number of times the eigenvalue is repeated.

The geometric multiplicity is the number of independent eigenvectors corresponding to the eigenvalue, dim null (A − λ I).

Procedure:

1. Group the λ into k sets of repeated eigenvalues (one set for each unique λ). The number of λk in the kth set is called the algebraic multiplicity, and is given by mk . Since (λk I − A) always has a non-trivial null space, every λk set of eigenvalues has at least one eigenvector vk .

2. Determine the number of independent eigenvectors corresponding to λk by evaluating

    q (λk I − A) = dim null (λk I − A) .

This is called the geometric multiplicity, and is given by gk . If mk ≥ 2, it is possible that there are fewer independent eigenvectors than eigenvalues:

    1 ≤ q (λk I − A) ≤ mk    (6)

3. If mk > gk for any k, the Jordan form and generalized eigenvectors are required.

Example:

>> A = [ 2 1;
         0 2]
>> [V,U] = eig(A)
V =
    1.0000   -1.0000
         0    0.0000
U =
    2.0000         0
         0    2.0000

Eigenvalue: 2.0
Algebraic multiplicity: 2
Geometric multiplicity: 1
Number of missing eigenvectors: 1

k   λk   mk   gk   mk − gk
1   2    2    1    1

Recall the eigen-decomposition of a matrix:

    A = V U V^(-1)

The eigen-decomposition only exists if V is invertible, that is, if there is a complete set of independent eigenvectors.
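The multiplicity bookkeeping above can be replayed numerically. A small numpy sketch standing in for the Matlab session, with the null-space dimension counted from the singular values:

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [0.0, 2.0]])
lam = 2.0

# Algebraic multiplicity m: how many eigenvalues equal lam (to a tolerance).
m = int(np.sum(np.abs(np.linalg.eigvals(A) - lam) < 1e-8))

# Geometric multiplicity g: dim null(A - lam I), counted as zero singular values.
s = np.linalg.svd(A - lam * np.eye(2), compute_uv=False)
g = int(np.sum(s < 1e-8))

print(m, g, m - g)   # prints 2 1 1 -> one missing eigenvector
```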

Example:

>> A = [ 2 3 4 ; 0 2 1 ; 0 0 2 ]
A =
     2     3     4
     0     2     1
     0     0     2

Recall: for triangular and diagonal matrices, the eigenvalues are the diagonal elements.

>> [V, U ] = eig(A)
V =
    1.0000   -1.0000    1.0000
         0    0.0000   -0.0000
         0         0    0.0000
U =
     2     0     0
     0     2     0
     0     0     2
>> RoundByRatCommand(V)
ans =
     1    -1     1
     0     0     0
     0     0     0

k   λk   mk   gk   mk − gk
1   2    3    1    2

With mk − gk = 2 missing eigenvectors, the Jordan form will be required.

2 Properties of the Eigensystem

2.1 Bay section 4.1, A-Invariant Subspaces

Let X1 be a subspace of linear vector space X. This subspace is A-invariant if for every vector z ∈ X1 , y = A z ∈ X1 . When the operator A is understood from context, X1 is sometimes said to be simply invariant.

The eigenvectors give A-invariant subspaces, since

    A v = λ v    (7)

    (λ I − A) v = 0    (8)

Eqn (8) shows that the eigenvectors lie in the null space of (λ I − A).

Example: consider an anisotropic dielectric, where

    D = ε E

where E is the electric field vector, D is the electric flux density (also called the displacement vector) and ε is the dielectric constant.

For an anisotropic dielectric the dielectric constant is a tensor:

    [ D1 ]   [ ε11  ε12  ε13 ] [ E1 ]
    [ D2 ] = [ ε21  ε22  ε23 ] [ E2 ]
    [ D3 ]   [ ε31  ε32  ε33 ] [ E3 ]

Find the directions, if any, in which the E-field and flux density are collinear.

Solution: For the E-field and flux density to be collinear they must satisfy D = λ E. Which is to say, the anisotropic directions are the eigenvectors of the dielectric tensor.

2.2 Finding eigenvalues and eigenvectors

For (λ I − A) v = 0 to have a solution, (λ I − A) must have a non-trivial null space. This is equivalent to saying

    det (λ I − A) = 0    (9)

which gives the characteristic equation. We have seen that Eqn (9) gives an nth order polynomial in λ. This is more important for understanding than as a solution method in EE/ME 701. Use:

>> [V, U] = eig(A)

2.3 Interpreting complex eigenvalues / eigenvectors

Complex eigenvalues lead to complex eigenvectors, and correspond to rotations. Complex eigenvalues come in complex conjugate pairs.

Example: Consider the basic rotation matrix:

    R = [ Cθ  −Sθ
          Sθ   Cθ ]

Eigenvalues:

    (λ I − R) = [ λ − Cθ    Sθ
                  −Sθ     λ − Cθ ]

which gives the characteristic equation

    det (λ I − R) = λ² − 2 Cθ λ + Cθ² + Sθ² = λ² − 2 Cθ λ + 1 = 0

which solves to give

    λ = ( 2 Cθ ± sqrt( 4 Cθ² − 4 ) ) / 2 = Cθ ± sqrt( −Sθ² ) = Cθ ± j Sθ

The eigenvalues are complex for θ ≠ 0°, 180°, and |λi| = 1.0.
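A quick numerical check of this result, with numpy as an analog of the notes' Matlab:

```python
import numpy as np

theta = 0.7   # an arbitrary angle, not 0 or pi
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

lam = np.linalg.eigvals(R)

# The eigenvalues are the conjugate pair cos(theta) +/- j sin(theta),
# which lie on the unit circle.
assert np.allclose(np.abs(lam), 1.0)
assert np.allclose(np.sort_complex(lam),
                   np.sort_complex(np.array([np.exp(1j * theta),
                                             np.exp(-1j * theta)])))
```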

2.3.1 Example: 3D Rotation

We've seen the general rotation matrix, whose elements are products of the sines and cosines of the three rotation angles (Eqn (10)). A numeric example, Eqn (11), corresponds to angles of 135°, 135°, and 19.47°.

This matrix (like all 3D rotation matrices with general angles) has one real eigenvalue and a complex conjugate pair:

>> [V, U] = eig(R)
V =
   -0.5774             -0.5774              0.5774
    0.2887 + 0.5000i    0.2887 - 0.5000i    0.5774
    0.2887 - 0.5000i    0.2887 + 0.5000i    0.5774
U =
    0.5000 + 0.8660i         0                   0
         0              0.5000 - 0.8660i         0
         0                   0              1.0000

The real eigenvector,

    v3 = [ 0.577
           0.577
           0.577 ]

gives the axis of rotation. The matrix R of Eqn (11) also describes a rotation of 60° about the axis v3 (note that 0.5000 ± 0.8660 j = cos 60° ± j sin 60°).

[Figure: a vector w and the rotated vector R w, about the axis v3.]

Every vector w not parallel to v3 will be rotated by R. So the only R-invariant subspace lies along the axis of rotation. The mathematical manifestation of the absence of any other R-invariant subspace is that the other two eigenvalues are complex.

When we form solutions of the form

    x (t) = V e^(U t) V^(-1) x (0)    (12)

the complex eigenvalue and eigenvector pair combines to give solutions that can be written

    x (t) = a1 e^(σ t) cos (ω t + φ) w1 + a2 e^(σ t) sin (ω t + φ) w2    (13)

where the complex λ is written λ = σ ± j ω and the complex eigenvectors form the real eigenvectors:

    w1 = (1/2) (v1 + v2)         (real part)         (14)

    w2 = j (1/2) (v1 − v2)       (imaginary part)    (15)
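The same structure can be checked for any 3D rotation. A numpy sketch building a rotation of 60° about the axis (1,1,1)/√3 from Rodrigues' formula; this reproduces the eigenstructure discussed above, though the exact angle conventions of Eqn (10) are not reproduced here:

```python
import numpy as np

theta = np.pi / 3                                 # 60 degrees
k = np.array([1.0, 1.0, 1.0]) / np.sqrt(3.0)      # rotation axis
K = np.array([[0.0, -k[2], k[1]],
              [k[2], 0.0, -k[0]],
              [-k[1], k[0], 0.0]])                # cross-product matrix
R = np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

lam, V = np.linalg.eig(R)

# One real eigenvalue 1.0; its eigenvector is the rotation axis.
i = int(np.argmin(np.abs(lam - 1.0)))
axis = np.real(V[:, i])
axis = axis / axis[0]
assert np.allclose(axis, [1.0, 1.0, 1.0])

# The other two eigenvalues are cos(60) +/- j sin(60) = 0.5 +/- 0.866 j.
others = np.delete(lam, i)
assert np.allclose(np.sort(np.real(others)), [0.5, 0.5])
assert np.allclose(np.sort(np.imag(others)), [-np.sin(theta), np.sin(theta)])
```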

2.4 The eigen-system of symmetric (Hermitian) matrices

The eigensystem of a Hermitian matrix Q (symmetric matrix, if real) has special properties.

Notation: use A* to be the complex-conjugate transpose (equivalent to A' in Matlab).

Property 1: If A* = A, then for all complex vectors x, x* A x is real.

Proof: Define y = (x* A x). Applying the conjugate transpose to the product,

    y* = (x* A x)* = x* A* x = x* A x = y

Since y* = y, y must be real.

Property 2: The eigenvalues of a Hermitian matrix must be real.

Proof: Suppose λ is an eigenvalue of Q , with v a corresponding eigenvector; then

    Q v = λ v

Now multiply on the left by v*:

    v* Q v = v* λ v = λ v* v = λ ||v||²

So we find

    λ = ( v* Q v ) / ||v||²

By property 1, v* Q v is real, so λ must be real.

Property 3: The eigenvectors of a Hermitian matrix, if they correspond to distinct eigenvalues, must be orthogonal.

Proof: Starting with the given information, eigenvalues λ1 ≠ λ2, and corresponding eigenvectors v1 and v2:

    Q v1 = λ1 v1    (16)

    Q v2 = λ2 v2    (17)

Forming the complex-conjugate transpose of Eqn (16),

    v1* Q = λ1* v1* = λ1 v1*    (18)

where we can drop the complex conjugate, because we know λ1 is real. Now multiplying on the right by v2 gives

    v1* Q v2 = λ1 v1* v2    (19)

But the multiplication also gives:

    v1* (Q v2) = v1* λ2 v2 = λ2 v1* v2    (20)

So we find

    λ1 v1* v2 = λ2 v1* v2    (21)

If λ1 ≠ λ2, Eqn (21) is only possible if v1* v2 = 0, which is to say that v1 and v2 are orthogonal.
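These three properties are easy to observe numerically. A numpy sketch with a random real symmetric matrix (numpy again standing in for Matlab):

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((4, 4))
Q = B + B.T                      # real symmetric, hence Hermitian

lam, V = np.linalg.eig(Q)        # general eig; no symmetry is assumed

# Property 2: the eigenvalues are real.
assert np.allclose(np.imag(lam), 0.0)

# Property 1: x* Q x is real for any complex x.
x = rng.standard_normal(4) + 1j * rng.standard_normal(4)
assert abs(np.imag(np.conj(x) @ Q @ x)) < 1e-10

# Property 3: eigenvectors of distinct eigenvalues are orthogonal.
G = np.real(V).T @ np.real(V)
assert np.allclose(G, np.eye(4), atol=1e-8)
```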

A remark completing the story for Hermitian matrices: every square matrix has a Schur decomposition,

    T = V^(-1) Q V    (22)

where V is an ortho-normal matrix and T is an upper-triangular matrix, so that

    Q = V T V^(-1)    (24)

Since T is upper-triangular, T* will be lower-triangular. However, V is ortho-normal and Q is Hermitian, so

    V* = V^(-1) ,   Q* = Q ,   and   (V^(-1))* = V

So

    T* = (V^(-1) Q V)* = V* Q* (V^(-1))* = V^(-1) Q V = T

For T to be both upper-triangular and lower-triangular, it must be diagonal. Let U = T be a diagonal matrix. Multiplying Eqn (22) on the left by V and on the right by V^(-1) gives

    Q = V U V^(-1)    (23)

which is precisely the form of the eigen-decomposition of Q, where diagonal matrix U holds the eigenvalues of Q, and the columns of V hold the eigenvectors.

Remarks:
- Since T* = T, the diagonal elements (the eigenvalues) must be real (see property 2).
- When Q is real (symmetric), V will be real.

3 The Jordan-form

When there is a complete set of eigenvectors,

    e^(A t) = V e^(U t) V^(-1)    (25)

with

    U = [ λ1           ]
        [     λ2       ]    (26)
        [         ...  ]

and

    e^(U t) = [ e^(λ1 t)                    ]
              [           e^(λ2 t)          ]    (27)
              [                       ...   ]

But what if the set of vk is incomplete, so there is no V^(-1)!

- Such matrices can arise with repeated eigenvalues and do not have a complete set of eigenvectors. Such matrices are called defective.
- If we generated A matrices randomly, defective matrices would be quite rare. But repeated poles (eigenvalues) are relatively common in the analysis of dynamic systems, and sometimes even a design goal.

When A has repeated eigenvalues and missing eigenvectors (gk < mk), analysis of e^(A t) requires converting matrix A to the Jordan form. When we have a complete set of independent eigenvectors, e^(A t) is given by Eqn (25).

With scalar differential equations, we know that equations with repeated roots give solutions of the form

    y (t) = c1 e^(λ1 t) + c2 t e^(λ1 t) .

For example,

    ÿ (t) + 6 ẏ (t) + 9 y (t) = 0    (28)

has the characteristic equation s² + 6 s + 9 = 0, which has the roots s = {−3, −3}. The solution to Eqn (28) is:

    y (t) = c1 e^(−3 t) + c2 t e^(−3 t) .    (29)

But Eqn (27) for e^(A t) has no terms of the form t e^(−3 t). And yet Eqn (28) is simply represented in state space with:

    x (t) = [ ẏ (t) ]
            [ y (t) ]

    ẋ (t) = d/dt [ ẏ (t) ] = [ −6  −9 ] [ ẏ (t) ]    (30)
                 [ y (t) ]   [  1   0 ] [ y (t) ]

And the solution to Eqn (30) is (as always for ẋ (t) = A x (t)) x (t) = e^(A t) x (0).

The t e^(λ t) terms arise from the Jordan block

    J = [ λ  1 ]    (31)
        [ 0  λ ]

The expression for e^(J t) is

    e^(J t) = I + J t + J² t²/2! + ... + J^k t^k/k! + ...    (32)

            = [ 1 0 ] + t/1! [ λ 1 ] + t²/2! [ λ²  2λ ] + t³/3! [ λ³  3λ² ] + ... + t^k/k! [ λ^k  k λ^(k−1) ] + ...    (33)
              [ 0 1 ]        [ 0 λ ]        [ 0   λ² ]         [ 0    λ³ ]               [ 0    λ^k      ]

The (1,1) and (2,2) elements give the series

    1 + Σ_{k=1}^∞ (1/k!) λ^k t^k = e^(λ t)

The (1,2) element gives

    0 + t/1! + Σ_{k=2}^∞ (1/k!) k λ^(k−1) t^k = t ( 1 + Σ_{k=2}^∞ (1/(k−1)!) λ^(k−1) t^(k−1) ) = t ( 1 + Σ_{k=1}^∞ (1/k!) λ^k t^k ) = t e^(λ t)    (34)

So

    e^(J t) = exp ( [ λ 1 ] t ) = [ e^(λ t)   t e^(λ t) ]    (35)
                    [ 0 λ ]       [ 0         e^(λ t)   ]

For the 3x3 Jordan block,

    exp ( [ λ 1 0 ]     )   [ e^(λ t)   t e^(λ t)   (1/2) t² e^(λ t) ]
        ( [ 0 λ 1 ]  t )  = [ 0         e^(λ t)     t e^(λ t)        ]    (36)
        ( [ 0 0 λ ]     )   [ 0         0           e^(λ t)          ]

and the 4x4 Jordan block (37) follows the same pattern, adding a (1,4) element (1/3!) t³ e^(λ t).

Matrix A has been transformed to the Jordan form (sometimes called the Jordan canonical form) when

    J = M^(-1) A M    (38)

with

    A = M J M^(-1)    (39)

The columns of M are regular eigenvectors and generalized eigenvectors. J is a block-diagonal matrix, composed of blocks along the main diagonal; Eqns (31), (36) and (37) give examples of 2x2, 3x3 and 4x4 blocks. Each block on the diagonal of J is called a Jordan block.

For the example of Eqn (30),

    A = [ −6  −9 ] ,   M = [  0.949  −0.0316 ] ,   J = [ −3   1 ]
        [  1   0 ]         [ −0.316  −0.0949 ]         [  0  −3 ]

and

    e^(A t) = M e^(J t) M^(-1) = M [ e^(−3 t)   t e^(−3 t) ] M^(-1)
                                   [ 0          e^(−3 t)   ]

So the solution x (t) = e^(A t) x (0) will have terms of the form e^(−3 t) and t e^(−3 t), as needed!

3.2.1 Regular and Generalized Eigenvectors

The regular eigenvectors are what we have considered all along; they satisfy the relationship

    A v = λk v    or    (A − λk I) v = 0    (40)

From Eqn (40), a set of independent regular eigenvectors is given as the null space of (A − λk I).
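The M and J above can be reproduced in a few lines. A numpy sketch of the same construction (the null space taken from the SVD, the generalized eigenvector from the pseudo-inverse), as a stand-in for the Matlab steps:

```python
import numpy as np

A = np.array([[-6.0, -9.0],
              [ 1.0,  0.0]])
lam = -3.0
B = A - lam * np.eye(2)

v = np.linalg.svd(B)[2][-1]      # regular eigenvector: spans null(B)
w = np.linalg.pinv(B) @ v        # generalized eigenvector: solves B w = v

M = np.column_stack([v, w])
J = np.linalg.inv(M) @ A @ M     # similarity transform to the Jordan form
assert np.allclose(J, [[-3.0, 1.0], [0.0, -3.0]], atol=1e-8)
```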

The generalized eigenvectors form chains starting with a regular eigenvector. The generalized eigenvectors satisfy the relationship

    A V^(l+1)_{k,j} = λk V^(l+1)_{k,j} + V^(l)_{k,j}    (41)

Or, rearranging,

    (A − λk I) V^(l+1)_{k,j} = V^(l)_{k,j}    (42)

(see Bay Eqn (4.14)). Here V^(l+1)_{k,j} is the next generalized eigenvector in a chain.

In this notation,

    V^(1)_{k,j}    (43)

is the first element of a chain; it is a regular eigenvector, and it is the jth regular eigenvector of the kth distinct eigenvalue. The l = 1 designates that V^(1)_{k,j} is a regular eigenvector.

Eqn (42) is an example of a recursive relationship; it is applied repeatedly to get all elements of the chain. The method presented here to determine the Jordan form is the bottom-up method presented in Bay, section 4.4.3.

3.2.2 First Jordan Form Example

Consider e^(A t), with A given as:

>> A = [ 3 3 3 ; -3 3 -3 ; 3 0 6 ]
A =
     3     3     3
    -3     3    -3
     3     0     6

First look at the eigenvalues:

>> U = eig(A)
U = 3.0000
    6.0000
    3.0000
%% This command rounds-off values to nearby rational numbers
%% which may be integers
>> U = RoundByRatCommand(U)
U = 3
    6
    3

A has a repeated eigenvalue; we can make a table analyzing the structure of the eigensystem of A:

k   λk   mk   gk        mk − gk
1   3    2    1 or 2    0 or 1
2   6    1    1         0

Table 3: Analysis of the structure of the eigensystem of A.

Table 3 shows that A has two distinct eigenvalues, and we don't yet know if λ1 has 1 or 2 independent eigenvectors.

The null space of (A − λ1 I) gives the regular eigenvector for λ1:

>> lambda1=3; lambda2=6; I = eye(3);
>> v1 = null(A-lambda1*I); v1 = v1/v1(1)
v1 =  1      %% Eigenvector, scaled so the
      1      %% first element is an integer
     -1

The geometric multiplicity is the dimension of the null space in which the eigenvectors lie. For λ1, g1 = 1. Putting this information into the table:

k   λk   mk   gk   mk − gk
1   3    2    1    1
2   6    1    1    0

Table 4: Analysis of the structure of the eigensystem of A.

The total number of eigenvectors (regular + generalized) needed for λk is mk . The number of regular eigenvectors is gk . The regular eigenvectors get the notation

    V^(1)_{k,1} , V^(1)_{k,2} , ..., V^(1)_{k,gk}

where j = 1 ... gk . The number of needed generalized eigenvectors, corresponding to λk , is mk − gk .

In this case for λ1 we have only one regular eigenvector, so it must serve as the first element, or anchor, of the chain of generalized eigenvectors:

    regular eigenvector:      V^(1)_{1,1} solves (A − λ1 I) V^(1)_{1,1} = 0
    generalized eigenvector:  V^(2)_{1,1} solves (A − λ1 I) V^(2)_{1,1} = V^(1)_{1,1}    (44)

In Matlab:

%% Find the first regular eigenvector,
>> V111 = null(A-lambda1*I); V111=V111/V111(1)
V111 =  1.0000
        1.0000
       -1.0000
%% Find the generalized eigenvector by solving Eqn (44)
>> V112 = pinv(A-lambda1*I)*V111
V112 = -0.3333
        0.3333
             0
%% Find the regular eigenvector for lambda2
>> V211 = null(A-lambda2*I); V211=V211/V211(2)
V211 =       0
        1.0000
       -1.0000

Put the eigenvectors (regular and generalized) together in the M matrix. The regular and generalized eigenvectors of a chain must be put in order: for each k and j, put in the vectors V^(l)_{k,j} with l = 1, 2, ..., going to the end of the chain. Put in the chain corresponding to each regular eigenvector (j) for each distinct eigenvalue (k). Chains may have a length of 1.

    M = [ V^(1)_{1,1}   V^(2)_{1,1}   V^(1)_{2,1} ]

>> M = [V111 V112 V211]
M =
    1.0000   -0.3333         0
    1.0000    0.3333    1.0000
   -1.0000         0   -1.0000
>> J = inv(M) * A * M
>> J = RoundByRatCommand(J)
J =
     3     1     0
     0     3     0
     0     0     6

With M and J,

    e^(A t) = M e^(J t) M^(-1)    (45)

For a system governed by ẋ (t) = A x (t), and considering the J matrix, the output of the system will have solutions of the form

    y (t) = c1 e^(3 t) + c2 t e^(3 t) + c3 e^(6 t)    (46)

where the first two terms correspond to the first Jordan block, and the last term to the second Jordan block.

3.2.3 More on Jordan blocks

A matrix in Jordan canonical form has a block diagonal structure, with

- eigenvalues on the main diagonal, and
- ones on the super-diagonal within each block: a 2x2 block has 1 one, a 3x3 block has 2 ones, etc.

One Jordan block corresponds to each regular eigenvector:

- If the regular eigenvector has no generalized eigenvectors, then it creates a 1x1 block.
- If the regular eigenvector anchors a chain with one generalized eigenvector, then it creates a 2x2 block, etc.

Each Jordan block corresponds to:

    1x1 block:            a regular eigenvector
    n x n block, n ≥ 2:   a chain anchored by a regular eigenvector, with n − 1 generalized eigenvectors

Using the V^(l)_{k,j} notation, if we look at the structure of the M matrix, we can determine the layout of Jordan blocks. For example, with

    M = [ V^(1)_{1,1}   V^(1)_{1,2}   V^(2)_{1,2}   V^(1)_{2,1}   V^(2)_{2,1} ]

the blocks of J are arranged: 1x1, 2x2, 2x2.
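The whole example can be verified numerically. A numpy sketch of the same steps, standing in for the Matlab session, with null spaces taken from the SVD:

```python
import numpy as np

A = np.array([[ 3.0, 3.0,  3.0],
              [-3.0, 3.0, -3.0],
              [ 3.0, 0.0,  6.0]])

def null_vec(B):
    # Unit vector spanning a 1-dimensional null space (last right singular vector).
    return np.linalg.svd(B)[2][-1]

B3 = A - 3.0 * np.eye(3)
V111 = null_vec(B3)
V111 = V111 / V111[0]                  # scale so the first element is 1: [1, 1, -1]
V112 = np.linalg.pinv(B3) @ V111       # generalized eigenvector for lambda = 3
V211 = null_vec(A - 6.0 * np.eye(3))   # regular eigenvector for lambda = 6

M = np.column_stack([V111, V112, V211])
J = np.linalg.inv(M) @ A @ M
assert np.allclose(J, [[3, 1, 0], [0, 3, 0], [0, 0, 6]], atol=1e-8)
```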

3.2.4 Second Jordan form example

Consider the matrix

>> A = [ 3 -1  1  1  0  0 ;
         1  1 -1 -1  0  0 ;
         0  0  2  0  1  1 ;
         0  0  0  2 -1 -1 ;
         0  0  0  0  1  1 ;
         0  0  0  0  1  1 ]

>> U = eig(A)
U = 2.0000
    2.0000
    2.0000
    2.0000
    2.0000
         0

λ1 = 0, and λ2 = 2 is repeated 5 times.

>> lambda1=0; lambda2=2; I = eye(6);
>> [Row, Col, Null, LNull] = spaces(A-lambda2*I);
>> g2 = rank(Null)
g2 = 2

(A − λ2 I) has a 2-dimensional null space, so there are 2 independent regular eigenvectors.

Null =
         0    0.7071
         0    0.7071
    0.7071    0.0000
   -0.7071    0.0000
    0.0000         0
    0.0000         0

For convenience, scale the eigenvectors to get integer values:

>> V211 = Null(:,1)/Null(3,1)
V211 =  0
        0
        1
       -1
        0
        0
>> V221 = Null(:,2)/Null(1,2)
V221 =  1
        1
        0
        0
        0
        0

>> %% Check that the eigenvectors are indeed eigenvectors
>> %% These norms come out to zero, very small would be sufficient
>> NDiff1 = norm( A*V211 - lambda2*V211 )
NDiff1 = 0
>> NDiff2 = norm( A*V221 - lambda2*V221 )
NDiff2 = 0

Note: All vectors from null (A − 2 I) are eigenvectors. For example,

>> x = 0.3*V211 + 0.4*V221
x =  0.4
     0.4
     0.3
    -0.3
     0.0
     0.0
>> NDiffx = norm( A*x - lambda2*x )
NDiffx = 0

k   λk   mk   gk   mk − gk
1   0    1    1    0
2   2    5    2    3

Table 5: Structure of the eigensystem of A.

We need 3 generalized eigenvectors to have a complete set. These three will be in chains anchored by one or both of the regular eigenvectors of λ2 .

Find V^(2)_{2,1} . The generating equation

    (A − λk I) V^(l+1)_{k,j} = V^(l)_{k,j}    (47)

has the form

    B x = y    (48)

with B = (A − λk I), y = V^(l)_{k,j} and x = V^(l+1)_{k,j}. We know some things about the solutions of Eqn (47):

1. For an exact solution to exist, V^(l)_{k,j} must lie in the column space of (A − λk I).
2. If we find a solution V^(l+1)_{k,j}, it is not unique. We can add any component from the null space of (A − λk I), and it will still be a solution.

Considering again the example problem, check that V211 and V221 lie in the column space of (A − 2 I) by checking that the projection of each onto the column space is equal to the original vector:

>> [Row, Col, Null, LNull] = spaces(A-lambda2*I);
>> NIsInColumnSpaceV211 = norm( Col*Col'*V211 - V211 )
NIsInColumnSpaceV211 = 1.1102e-16
%% V211 is in the column space of (A-lambda2*I)
>> NIsInColumnSpaceV2 = norm( Col*Col'*V221 - V221 )
NIsInColumnSpaceV2 = 1.1430e-15
%% V221 is in the column space of (A-lambda2*I)

Both vectors lie in the column space of (A − λ2 I), so each will have at least one generalized eigenvector.

>> V212 = pinv(A-lambda2*I)*V211
V212 =     0
           0
        0.00
       -0.00
        0.50
        0.50
%% Check that V212 is a generalized eigenvector
>> NDiffV212 = norm( (A-lambda2*I)*V212 - V211 )
NDiffV212 = 2.7581e-16

Yes, V^(2)_{2,1} is a generalized eigenvector.

Test to see if V^(2)_{2,1} is in the column space of (A − λ2 I):

>> norm( Col*Col'*V212 - V212 )
ans = 0.7071

No, so there is no V^(3)_{2,1} .

Now evaluate V^(2)_{2,2}:

>> V222 = pinv(A-lambda2*I)*V221
V222 =  0.50
       -0.50
       -0.00
       -0.00
        0.00
        0.00
>> %% Check that V222 is a gen. eigenvector
>> NDiffV222 = norm( (A-lambda2*I)*V222 - V221 )
NDiffV222 = 6.2804e-16

Yes, V^(2)_{2,2} is a generalized eigenvector.

Now check to see that V^(2)_{2,2} is in the column space of (A − λ2 I):

>> NIsInColumnSpaceV222 = norm( Col*Col'*V222 - V222 )
NIsInColumnSpaceV222 = 4.2999e-16

Yes, so there is a V^(3)_{2,2} . This will be the third generalized eigenvector:

>> V223 = pinv(A-lambda2*I)*V222
V223 =  0.00
       -0.00
        0.25
        0.25
       -0.00
       -0.00

First we need the regular eigenvector corresponding to λ1:

>> V111 = null(A-lambda1*I);
>> V111 = V111/V111(5)
V111 =  0
        0
        0
        0
        1
       -1

Put in the chains of E-vecs, starting each chain with its regular E-vec:

>> M = [ [V111] [V211 V212] [V221 V222 V223] ]
>> M = RoundByRatCommand(M)
M =
        0        0        0     1.00     0.50        0
        0        0        0     1.00    -0.50        0
        0     1.00        0        0        0     0.25
        0    -1.00        0        0        0     0.25
     1.00        0     0.50        0        0        0
    -1.00        0     0.50        0        0        0

Now find J:

>> J = inv(M)*A*M;
>> J = RoundByRatCommand(J)
J =
     0     0     0     0     0     0
     0     2     1     0     0     0
     0     0     2     0     0     0
     0     0     0     2     1     0
     0     0     0     0     2     1
     0     0     0     0     0     2

Interpreting the result: J has a 1x1 block for λ1 = 0, then a 2x2 block and a 3x3 block for λ2 = 2:

    | 0 |
    -----------------
        | 2   1 |
        | 0   2 |
    -----------------------------
                | 2   1   0 |
                | 0   2   1 |
                | 0   0   2 |

Correspondingly, M has 3 chains of eigenvectors:

    M = [ V111 | V211 V212 | V221 V222 V223 ]
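This second example can also be replayed end-to-end. A numpy sketch using the chain relations directly, with the matrix as reconstructed above and the regular eigenvectors entered in their integer-scaled form:

```python
import numpy as np

A = np.array([[3, -1,  1,  1,  0,  0],
              [1,  1, -1, -1,  0,  0],
              [0,  0,  2,  0,  1,  1],
              [0,  0,  0,  2, -1, -1],
              [0,  0,  0,  0,  1,  1],
              [0,  0,  0,  0,  1,  1]], dtype=float)

B = A - 2.0 * np.eye(6)
pB = np.linalg.pinv(B)

V111 = np.array([0, 0, 0, 0, 1, -1], dtype=float)   # regular e-vec, lambda = 0
V211 = np.array([0, 0, 1, -1, 0, 0], dtype=float)   # regular e-vecs, lambda = 2
V221 = np.array([1, 1, 0, 0, 0, 0], dtype=float)

V212 = pB @ V211        # chain of length 2: V211 -> V212
V222 = pB @ V221        # chain of length 3: V221 -> V222 -> V223
V223 = pB @ V222

M = np.column_stack([V111, V211, V212, V221, V222, V223])
J = np.linalg.inv(M) @ A @ M

J_expected = np.zeros((6, 6))
J_expected[1:3, 1:3] = [[2, 1], [0, 2]]
J_expected[3:6, 3:6] = [[2, 1, 0], [0, 2, 1], [0, 0, 2]]
assert np.allclose(J, J_expected, atol=1e-8)
```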

3.3 One more twist, freedom to choose the regular eigenvector

Fact: If a matrix A has repeated eigenvalues, with gk > 1 independent eigenvectors, the gk eigenvectors form a vector subspace, any vector from which is an eigenvector.

When mk ≥ 3, it is possible that gk ≥ 2, and we still need to find a generalized eigenvector. In this case,

    dim null (A − λk I) = gk ≥ 2

and any vector from the 2-dimensional (or larger) null space of (A − λk I) is an eigenvector.

Consider the generating equation for the generalized eigenvector,

    (A − λk I) V^(2)_{k,j} = V^(1)_{k,j}    (49)

The anchor V^(1)_{k,j} must also lie in the column space of (A − λk I). A regular eigenvector that anchors a chain of generalized eigenvectors must lie in 2 spaces at once:

- the null space of (A − λk I) . . . . . . to be a regular e-vec of A;
- the column space of (A − λk I) . . . to generate a generalized e-vec of A.

When gk ≥ 2, we have the freedom to choose the anchor for the chain of generalized eigenvectors: not just from a list V^(1)_{k,1}, V^(1)_{k,2}, ..., but as any vector from the null space of (A − λk I). It may be that we have valid eigenvectors V^(1)_{k,1}, V^(1)_{k,2}, neither one of which lies in the column space of (A − λk I)! By forming the intersection of the null and column spaces of (A − λk I), we can find the needed regular eigenvector:

    W = col (A − λk I) ∩ null (A − λk I) ,   V^(1)_{k,j} ∈ W    (50)
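The intersection construction can be sketched generically. The matrix below is a made-up defective example (not the one in the next subsection), built so that col(A − 2I) ∩ null(A − 2I) is one-dimensional; numpy stands in for the notes' Matlab `spaces()` and `null()`:

```python
import numpy as np

# One 2x2 Jordan block and two 1x1 blocks for lambda = 2, conjugated by P.
Jform = np.array([[2.0, 1.0, 0.0, 0.0],
                  [0.0, 2.0, 0.0, 0.0],
                  [0.0, 0.0, 2.0, 0.0],
                  [0.0, 0.0, 0.0, 2.0]])
P = np.array([[1.0, 0.0, 1.0, 0.0],
              [2.0, 1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0, 0.0],
              [1.0, 0.0, 0.0, 1.0]])
A = P @ Jform @ np.linalg.inv(P)

B = A - 2.0 * np.eye(4)
U, s, Vt = np.linalg.svd(B)
r = int(np.sum(s > 1e-10))        # rank(B)
Col = U[:, :r]                     # orthonormal basis of col(B)
Null = Vt[r:].T                    # orthonormal basis of null(B)

# Coefficient vectors in null([Col, -Null]) give w = Col a = Null b,
# i.e. vectors lying in both subspaces.
S = np.hstack([Col, -Null])
coeff = np.linalg.svd(S)[2][-1]    # the intersection is 1-dimensional here
w = Col @ coeff[:r]

# w is a regular eigenvector that also lies in col(B): a valid chain anchor.
assert np.linalg.norm(A @ w - 2.0 * w) < 1e-8
assert np.linalg.norm(Col @ (Col.T @ w) - w) < 1e-8
```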

3.3.1 Example where regular E-vecs do not lie in the column space of (A − λk I)

Consider a 4x4 matrix A whose only eigenvalue is λ = 2, repeated four times:

>> RoundByRatCommand( eig(A) )
ans = 2
      2
      2
      2

>> lambda1=2; I = eye(4);
>> [Row, Col, Null, Lnull] = spaces(A - lambda1*I)
Col =
    0.3055   -0.7118
    0.7342   -0.2468
   -0.4287   -0.4650
   -0.4287   -0.4650
Null =
         0    1
    0.8165    0
   -0.4082    0
   -0.4082    0

k   λk   mk   gk   mk − gk
1   2    4    2    2

A first choice for eigenvectors are the two basis vectors of the null space of (A − λk I):

>> v1 = RoundByRatCommand( Null(:,1) / Null(3,1) )
>> v2 = RoundByRatCommand( Null(:,2) )
v1 =  0      v2 =  1
     -2            0
      1            0
      1            0

>> NIsInColumnSpaceV1 = norm( Col*Col'*v1 - v1 )
NIsInColumnSpaceV1 = 0.6325
>> NIsInColumnSpaceV2 = norm( Col*Col'*v2 - v2 )
NIsInColumnSpaceV2 = 0.6325

Neither v1 nor v2 lies in the column space of (A − λk I)! But what about the possibility that there exists another eigenvector,

    x1 = a1 v1 + a2 v2

which lies in both the null space and column space of (A − λk I)?

First, consideration of the possibilities. The universe is R^4, or 4D, with dim col (A − λk I) = 2 and dim null (A − λk I) = 2. So there are 3 possibilities:

1. Two 2D spaces can fit in a 4D universe and not intersect, so it is possible that col (A − λk I) ∩ null (A − λk I) = 0.
2. It is possible that the intersection is 1D.
3. It is possible that the intersection is 2D.

Previously, we have seen how to form the intersection of two subspaces. Given sets of basis vectors U = [u1, u2, ..., u_nu] and V = [v1, v2, ..., v_nv], vectors in the intersection

    W = U ∩ V    (51)

are solutions to

    a1 u1 + ... + a_nu u_nu = b1 v1 + ... + b_nv v_nv    (52)

That is, a1 ... a_nu, b1 ... b_nv must solve

    [U, −V] [ a1 ... a_nu  b1 ... b_nv ]' = 0    (53)

The coefficient vector must lie in the null space of [Col, −Null], where [Col] and [Null] are sets of basis vectors on the column and null spaces:

>> CoeffVec = null([Col, -Null])
CoeffVec =  0.7033
           -0.0736
            0.6547
            0.2673

Since the null space of [Col, −Null] is one dimensional, the intersection of the column and null spaces is 1D.

Now find w1, a vector in both the column and null spaces of (A − λk I):

>> w1 = Col*CoeffVec(1:2)
>> w1 = RoundByRatCommand( w1/w1(1) )
w1 =  1
      2
     -1
     -1

>> NIsEigenvectorW1 = norm( A*w1 - lambda1*w1 )
NIsEigenvectorW1 = 0

(A − λk I) has only one regular eigenvector that can anchor a chain. So the chain must have length 3 (2 generalized E-vecs). Compute a candidate for the first generalized eigenvector, V^(2)_{1,1}, as the solution to

    (A − λk I) V^(2)_{1,1} = V^(1)_{1,1}    (54)

>> V111 = w1;
>> v3 = pinv(A - lambda1*I) * V111
v3 =       0
      0.3333
      0.3333
      0.3333

To find the remaining generalized eigenvector, V^(3)_{1,1}, V^(2)_{1,1} must be in the column space of (A − λk I):

>> NIsInColumnSpaceV112 = norm( Col*Col'*v3 - v3 )
NIsInColumnSpaceV112 = 0.4216

It is not! v3 is a candidate generalized eigenvector, but we can not use it for V^(2)_{1,1}, because it does not lead to V^(3)_{1,1}.

Going back to the generating equation, V^(2)_{1,1} must solve Eqn (54). v3 is a particular solution to Eqn (54), but it is not the only solution. Any vector

    V^(2)_{1,1} = v3 + b1 n1 + b2 n2    (55)

is a solution to (54), where n1 and n2 are basis vectors for the null space of (A − λk I). To find a value for V^(2)_{1,1} that is in the column space of (A − λk I), we need a solution to

    V^(2)_{1,1} = v3 + b1 n1 + b2 n2 = a1 c1 + a2 c2    (56)

or

    [ c1  c2  −n1  −n2 ] [ a1  a2  b1  b2 ]' = v3    (57)

1. Find the coefficients:

>> CoeffVec2 = pinv( [Col, -Null] ) * v3
CoeffVec2 = -0.0880
            -0.8406
            -0.2333
             0.5714

2. Determine the candidate vector:

>> v3b = v3 + Null * CoeffVec2(3:4)
v3b = 0.5714
      0.1429
      0.4286
      0.4286

3. Check to be sure the new candidate is in the column space of (A − λk I):

>> NIsInColumnSpaceV112b = norm( Col*Col'*v3b - v3b )
NIsInColumnSpaceV112b = 6.0044e-16

Yes!

Set V^(2)_{1,1} = v3b and compute V^(3)_{1,1}:

V112 = 0.5714
       0.1429
       0.4286
       0.4286
V113 =      0
       0.1905
       0.6905
      -0.3095

Build the M matrix; V^(1)_{1,2} is any independent regular eigenvector. Compute J:

M =
    1.0000    0.5714         0    1.0000
    2.0000    0.1429    0.1905         0
   -1.0000    0.4286    0.6905         0
   -1.0000    0.4286   -0.3095         0

>> J = RoundByRatCommand( inv(M) * A * M )
J =
     2     1     0     0
     0     2     1     0
     0     0     2     0
     0     0     0     2

J has a 3x3 block, and a 1x1 block.

3.4 Remarks on computing the Jordan form

The regular eigenvectors are given as the null space of (A − λi I). For a repeated eigenvalue λk:

- The algebraic multiplicity, mk , is the number of times λk is repeated.
- The geometric multiplicity, gk , is the dimension of null (A − λk I).

When eigenvalues are repeated, we may not have enough independent regular eigenvectors (gk < mk), in which case the Jordan form is required. The Jordan form corresponds to scalar differential equations with repeated roots and solutions of the form

    y (t) = a1 e^(λ1 t) + a2 t e^(λ1 t) + a3 t² e^(λ1 t) + ...

For repeated eigenvalues, regular eigenvectors give rise to chains of generalized eigenvectors. The generalized eigenvectors are solutions to

    (A − λi I) v^(l+1)_{k,j} = v^(l)_{k,j}    (58)

3.4.1 Numerical sensitivity of the Jordan form

Strikingly, Matlab has no numerical routine to find the generalized eigenvectors or Jordan form (standard Matlab has no jordan() routine!). This is because the Jordan form calculation is numerically very sensitive: a small perturbation in A produces a large change in the chains of eigenvectors. This sensitivity is true of the differential equations themselves;

    ÿ (t) + 6 ẏ (t) + 9.00001 y (t) = 0

has two distinct roots!

Consider the stages where a decision must be made:

- When there are two eigenvalues with λa ≈ λb, are they repeated or distinct?
- What is the dimension of null (A − λ I)?
- Does v^(l)_{k,j} lie in the column space of (A − λ I), or does it not?
- Is v^(l+1)_{k,j} independent of the existing eigenvectors?

There is no known numerical routine to find the Jordan form that is sufficiently numerically robust to be included in Matlab.

The Matlab symbolic algebra package does have a jordan() routine. It runs symbolically on rational numbers to operate without round-off error, for example on a matrix with rational entries such as

    A = [ 21/107   52/12    119/120
          1/1      11/12    3/2
          8/5      13/14    ...     ]

4 Conclusions

To solve

    ẋ (t) = A x (t) + B u (t)

we are interested to make a change of basis from physical (or other) coordinates to modal coordinates. This involves the eigenvalues and eigenvectors. The eigenvectors solve the equation

    (A − λi I) vi = 0

So the eigenvectors lie in null (A − λi I).

Eigenvalues may be complex. Complex eigenvalues correspond to solutions with terms

    x (t) = a1 e^(σ t) cos (ω t + φ) w1 + a2 e^(σ t) sin (ω t + φ) w2

The complex terms correspond to the action of a rotation in state space, in the subspace spanned by the complex eigenvectors.

The eigenvectors corresponding to an eigenvalue (or a complex conjugate pair of eigenvalues) define an A-invariant subspace. Vectors in this subspace stay in this subspace.

The modal matrix is the transformation from modal to physical coordinates, M = pTm.

If we lack a complete set of regular eigenvectors, M includes generalized eigenvectors, and

    J = M^(-1) A M

gives the system matrix in Jordan form. Each Jordan block corresponds to a chain of generalized eigenvectors,

    (A − λi I) v^(l+1)_{k,j} = v^(l)_{k,j}

for the lth generalized eigenvector, of the jth Jordan block, of the kth distinct eigenvalue:

- v^(1)_{k,j} is a regular eigenvector, corresponding to λk .
- For v^(l+1)_{k,j} to exist, v^(l)_{k,j} must lie in the column space of (A − λk I).

The Jordan form leads to solutions of the differential equation in the form t e^(λ t), t² e^(λ t), etc. For example:

    J = [ λ1   1    0    0    0
          0    λ1   0    0    0
          0    0    λ2   1    0
          0    0    0    λ2   1
          0    0    0    0    λ2 ]    (59)

    e^(J t) =
    [ e^(λ1 t)   t e^(λ1 t)   0           0            0
      0          e^(λ1 t)     0           0            0
      0          0            e^(λ2 t)    t e^(λ2 t)   (1/2) t² e^(λ2 t)
      0          0            0           e^(λ2 t)     t e^(λ2 t)
      0          0            0           0            e^(λ2 t)          ]

Section 5.0.0

## 5 Review questions and skills

1. In what fundamental space do the regular eigenvectors lie?
2. Given the eigenvalues of a matrix, analyze the structure of the eigensystem.
   (a) Determine the number of required generalized eigenvectors.
3. Indicate the generating equation for the generalized eigenvectors.
4. Indicate in what fundamental space the vectors of the generating equations must lie.
5. When gk ≥ 2, and no regular eigenvector lies in the column space of (A − λk I), what steps can be taken?
6. When additional generalized eigenvectors are needed, and v, a candidate generalized eigenvector, does not lie in the column space of (A − λk I), what steps can be taken?


## Review-By-Example of some of the Basic Concepts and Methods of Control System Analysis and Design

Contents

1 Differential Eqns, Transfer Functions & Modeling
  1.1 Example 1, Golden Nugget Airlines
  1.2 Block diagram
  1.3 Laplace Transforms and Transfer Functions
      1.3.1 Laplace transform
      1.3.3 Transfer Functions are Rational Polynomials
  2.3 Analyzing other loops
3 Analysis
4 Working with the pole-zero constellation
  4.1 Basics of pole-zero maps, 1st order
  4.3 Determining approximate performance measures from a dominant second-order mode
5 Design
  5.1 Design methods
  5.2 Root Locus Design
6 Summary
7 Glossary of Acronyms


## 1 Differential Eqns, Transfer Functions & Modeling

1.1 Example 1, Golden Nugget Airlines


## 1.2 Block diagram

A block diagram is a graphical representation of modeling equations and
their interconnection.

## Dynamic systems are governed by differential equations

(or difference equations, if they are discrete-time)

## Eqns (1)-(3) can be laid out graphically, as in figure 2.

Example (adapted from Franklin et al., 4th ed., problem 5.41, figure 5.79)

Figure 1: Golden Nugget Airlines Aircraft block diagram: autopilot voltage v(t) → Elevator servo 7/(s + 10) → Me(t); Mt(t) = Me(t) + Mp(t) → Aircraft Dynamics (s + 3)/(s² + 4 s + 5) → Ω(t) → Integrator 1/s → θ(t). Me(t) is the elevator-moment (the control input), Mp(t) is the moment due to passenger movements (a disturbance input), and θ(t) is the aircraft pitch angle, Ω(t) = dθ(t)/dt.
For example, the aircraft dynamics give:

1 d²Ω(t)/dt² + 4 dΩ(t)/dt + 5 Ω(t) = 1 dMt(t)/dt + 3 Mt(t) ;   Mt(t) = Me(t) + Mp(t)      (1)

Ω(t) = dθ(t)/dt                                                                           (2)

dMe(t)/dt + 10 Me(t) = 7 v(t)                                                             (3)

Where Mt(t) is the total applied moment.

Eqn (1) has to do with the velocity of the aircraft response to Mt (t).
Eqn (2) expresses that the pitch-rate is the derivative of the pitch angle.
And Eqn (3) describes the response of the elevator to an input command
from the auto-pilot.
Main Fact: The Differential Equations come from the physics of the system.
Part 1A: Controls Review-By-Example


Signal Symbol   Units
v(t)            [volts]
Me(t)           [N m]
Mp(t)           [N m]
Ω(t)
θ(t)

Table 1: List of signals for the aircraft block diagram.

When analyzing a problem from basic principles, we would also have a list of parameters.

Parameter Symbol   Value   Units
b0                         [N-m/volt]
...                ...     ...

Table 2: List of parameters for the aircraft block diagram.



## 1.3.1 Laplace transform

To introduce the Transfer Function (TF), we need to review the Laplace transform.

The Laplace transform (LT) maps a signal (a function of time) to a function of the Laplace variable s: F(s) = L{f(t)}. The Inverse Laplace transform maps F(s) back to f(t): f(t) = L⁻¹{F(s)}.

Some standard transform pairs (time signals defined for t ≥ 0):

Unit impulse:           f(t) = δ(t)           F(s) = 1
Scaled impulse:         f(t) = b δ(t)         F(s) = b
Unit step:              f(t) = 1              F(s) = 1/s
Unit ramp:              f(t) = t              F(s) = 1/s²
Higher powers of t:     f(t) = tⁿ             F(s) = n!/sⁿ⁺¹
Decaying exponential:   f(t) = b e^(−σt)      F(s) = b/(s + σ)
Sinusoid:               f(t) = Bc cos(ωt) + Bs sin(ωt)
                        F(s) = (Bc s + Bs ω)/(s² + ω²)
Oscillatory exp. decay: f(t) = e^(−σt) (Bc cos(ωt) + Bs sin(ωt))
                        F(s) = (Bc s + (Bc σ + Bs ω))/(s² + 2 σ s + ωn²),  with ωn² = σ² + ω²

[Figure: sketches of the test signals, each starting at t = 0: the impulse function δ(t) (a pulse of width Δ and height 1/Δ, area 1.0), the unit step, the unit ramp, the decaying exponential, and a sinusoid.]

Why do we use Laplace transforms?

1. Differential equations in the time domain correspond to algebraic equations in the s domain.

2. The LT makes possible the transfer function.

4. Frequency domain: find Guy(s) for all U(s).

Figure 3: Solve the differential equation in the time domain, or solve an algebraic equation in the s domain: transform u(t) → U(s), solve the algebraic equation for Y(s), then y(t) = L⁻¹{Y(s)}.

(Revised: Sep 08, 2012)
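Any row of the transform table can be spot-checked numerically. The sketch below (illustrative values b = 2, σ = 3, s = 1, not from the notes) approximates the Laplace integral of a decaying exponential with a Riemann sum and compares it to the table entry b/(s + σ):

```python
import math

# Check L{b e^(-sigma t)} = b / (s + sigma) by numerical integration.
b, sigma, s = 2.0, 3.0, 1.0
dt, T = 1e-4, 20.0   # step size and (finite) upper limit of the integral
F_num = sum(b * math.exp(-sigma * t) * math.exp(-s * t) * dt
            for t in (i * dt for i in range(int(T / dt))))
F_exact = b / (s + sigma)
assert abs(F_num - F_exact) < 1e-3
```

The same loop, with the integrand swapped, checks any of the other pairs.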


## Two theorems of the Laplace transform permit us to build transfer functions:

1. The Laplace transform obeys superposition and scaling:

   Given: z(t) = x(t) + y(t), then Z(s) = X(s) + Y(s)

2. The LT of the derivative of x(t):

   Given: X(s) = L{x(t)}, then L{d x(t)/dt} = s X(s)

Putting these rules together lets us find the transfer function of a system from its governing differential equation:

   u(t) → [ System, Gp(s) ] → y(t)

Consider a system that takes in the signal u(t) and gives y(t), governed by the Diff Eq:

a2 d²y(t)/dt² + a1 dy(t)/dt + a0 y(t) = b1 du(t)/dt + b0 u(t)                  (4)

(Notice the standard form: output (unknown) on the left, input on the right.)

Whatever signals y(t) and u(t) are, they have Laplace transforms. Eqn (4) gives:

L{a2 d²y(t)/dt² + a1 dy(t)/dt + a0 y(t)} = L{b1 du(t)/dt + b0 u(t)}            (5)

a2 s² Y(s) + a1 s Y(s) + a0 Y(s) = b1 s U(s) + b0 U(s)                         (6)

(a2 s² + a1 s + a0) Y(s) = (b1 s + b0) U(s)                                    (7)

Eqns (5)-(7) tell us something about the ratio of the LT of the output to the LT of the input:

Y(s)/U(s) = Output LT / Input LT = (b1 s + b0)/(a2 s² + a1 s + a0) = Gp(s)     (8)

A Transfer Function (TF) is a ratio of the output and input LTs:

Gp(s) = (b1 s + b0)/(a2 s² + a1 s + a0)                                        (9)

Where Gp(s) is the TF of the plant.

The transfer function is like the gain of the system: it is the ratio of the output LT to the input LT.

Important fact: the TF depends only on the parameters of the system (coefficients a2..a0 and b1..b0 in the example), and not on the actual values of U(s) or Y(s) (or u(t) and y(t)).

Basic and Intermediate Control System Theory present transfer-function-based design. By engineering the characteristics of the TF, we engineer the system to achieve performance goals.
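Since a TF is just a rational function of s, it can be evaluated directly at any complex s. A small sketch (coefficient values are illustrative, not from the notes) for the generic Gp(s) = (b1 s + b0)/(a2 s² + a1 s + a0):

```python
# Evaluate the transfer function at complex s; s = 0 gives the DC gain,
# s = j*omega gives the frequency response.
a2, a1, a0 = 1.0, 4.0, 5.0   # illustrative denominator coefficients
b1, b0 = 1.0, 3.0            # illustrative numerator coefficients

def Gp(s):
    return (b1 * s + b0) / (a2 * s * s + a1 * s + a0)

# DC gain: s = 0 gives b0 / a0
assert abs(Gp(0.0) - b0 / a0) < 1e-12

# Gain at omega = 2 rad/s
gain = abs(Gp(2j))
```

This is exactly the "ratio of output LT to input LT" interpretation: Gp(jω) is the complex gain the system applies to a sinusoid at frequency ω.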

## Figure 4: Block diagram of typical closed-loop control.



## 1.3.4 A transfer function has poles and zeros

A TF has a numerator and denominator polynomial, for example

Gp(s) = N(s)/D(s) = (2 s² + 8 s + 6)/(s³ + 2 s² + 4 s + 0)

Figure 5: Golden Nugget Airlines Aircraft. Ω(t) [radians/second] is the pitch-rate of the aircraft, and Mt(t) is the moment (torque) applied by the elevator surface.

Consider the differential equation of the aircraft pitch rate:

1 d²Ω(t)/dt² + 4 dΩ(t)/dt + 5 Ω(t) = 1 dMt(t)/dt + 3 Mt(t)

The roots of the numerator are called the zeros of the TF, and the roots of the denominator are called the poles of the TF. For example:

>> num = [2 8 6]
num =
     2     8     6
>> den = [1 2 4 0]
den =
     1     2     4     0
>> zeros = roots(num)
zeros =
    -3
    -1
>> poles = roots(den)
poles =
         0
   -1.0000 + 1.7321i
   -1.0000 - 1.7321i
(10)

From (10) we get the TF: take the LT of both sides, and rearrange:

(s² + 4 s + 5) Ω(s) = (s + 3) Mt(s)

Ω(s)/Mt(s) = (s + 3)/(s² + 4 s + 5)                                      (11)

We can also use Matlab's system tool to find the poles and zeros:

%% Build the system object
>> Gps = tf(num, den)

Transfer function:
    2 s^2 + 8 s + 6
  -------------------
  s^3 + 2 s^2 + 4 s

>> zero(Gps)
ans = -3
      -1
>> pole(Gps)
ans =  0
      -1.0000 + 1.7321i
      -1.0000 - 1.7321i

Note:

We can write down the TF directly from the coefficients of the differential equation, and we can write down the differential equation directly from the coefficients of the TF.

Transfer functions, such as Eqn (11), are ratios of two polynomials:

Ω(s)/Mt(s) = (s + 3)/(s² + 4 s + 5)   ← numerator polynomial over denominator polynomial   (12)
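The MATLAB roots() calls above can be mirrored in plain Python for this example, since the denominator s³ + 2 s² + 4 s factors by hand as s (s² + 2 s + 4) and the rest is the quadratic formula (a sketch, not a general polynomial root-finder):

```python
import cmath

def quad_roots(a, b, c):
    """Both roots of a s^2 + b s + c = 0, as complex numbers."""
    d = cmath.sqrt(b * b - 4 * a * c)
    return (-b + d) / (2 * a), (-b - d) / (2 * a)

zeros = quad_roots(2, 8, 6)            # numerator 2 s^2 + 8 s + 6
poles = (0.0,) + quad_roots(1, 2, 4)   # denominator s (s^2 + 2 s + 4)
```

The results match the transcript: zeros at −1 and −3, poles at 0 and −1 ± 1.7321j.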


## EE/ME 701: Advanced Linear Systems

Section 1.3.4

Interpreting the poles (p1, p2, ..., pn) and zeros (z1, ..., zm)

We can use the poles and zeros to write the TF in factored form:

G(s) = (2 s² + 8 s + 6)/(s³ + 2 s² + 4 s)
     = b2 (s − z1)(s − z2) / [(s − p1)(s − p2)(s − p3)]
     = 2 (s + 3)(s + 1) / [(s − 0)(s + 1 − 1.732 j)(s + 1 + 1.732 j)]

With a complex pole pair we can do two things:

1. Use a shorthand:

   G(s) = 2 (s + 3)(s + 1) / [s (s + 1 ± 1.732 j)]

   (Because poles always come in complex conjugate pairs.)

2. Write the complex pole pair as a quadratic, rather than 1st-order terms, as in Eqn (14) below.

The zeros are values of s at which the transfer function goes to zero.
The poles are values of s at which the transfer function goes to infinity.

We can plot a pole-zero map:

>> Gps = tf([2 8 6], [1 2 4 0])
>> pzmap(Gps)

Figure 6: Pole-Zero constellation of aircraft transfer function.

## 1.3.5 Properties of transfer functions

Just as a differential equation can be scaled by multiplying the left and right sides by a constant, a TF can be scaled by multiplying the numerator and denominator by a constant.

Monic: A TF is said to be monic if the leading denominator coefficient an = 1. We can always scale a TF to be monic. If G1(s) is scaled to be monic, then

G1(s) = b0 / (s + a1)                                                    (13)

Rational Polynomial Form: A TF is in rational polynomial form when the numerator and denominator are each polynomials. For example

G(s) = 2 (s + 3)(s + 1) / [s (s² + 2 s + 4)]                             (14)

An example of a TF not in rational polynomial form is:

G3(s) = [2 (s + 3)/s] / [(s² + 2 s + 4)/(s + 1)]                         (15)

By clearing the fractions within the fraction, G3(s) can be expressed in rational polynomial form:

G3(s) = [2 (s + 3)/s] · (s)(s + 1) / {[(s² + 2 s + 4)/(s + 1)] · (s)(s + 1)}
      = 2 (s + 3)(s + 1) / [(s² + 2 s + 4)(s)]
      = (2 s² + 8 s + 6) / (s³ + 2 s² + 4 s)

Note the middle form above, which can be called factored form.



## 1.3.5 Properties of transfer functions (continued)

General Form: The general form for a rational polynomial transfer function is

G(s) = (bm s^m + b(m−1) s^(m−1) + ⋯ + b1 s + b0) / (an s^n + a(n−1) s^(n−1) + ⋯ + a1 s + a0)      (16)
m = number of zeros, n = number of poles.

A TF with m ≤ n is said to be proper.

When m < n the TF is said to be strictly proper.

Example of a TF that is not proper:

G4(s) = (2 s² + 5 s + 4)/(s + 1)     note: m = 2, n = 1

Such a TF can always be factored by long division:

G4(s) = (2 s² + 5 s + 4)/(s + 1) = 2 s + (3 s + 4)/(s + 1) = (2 s + 3) + 1/(s + 1)

A non-proper TF such as G4(s) has a problem: as s = j ω, ω → ∞, the gain goes to infinity!

Since physical systems never have infinite gain at infinite frequency, physical systems must have proper transfer functions.

System Type: A property of transfer functions that comes up often is the system type. The type of a system is the number of poles at the origin.

So, for example, the aircraft transfer function from elevator input to pitch rate gives a type 0 system:

Ω(s)/Me(s) = (s + 3)/(s² + 4 s + 5)     poles: s = −2 ± 1 j,   type: 0

But the TF from the elevator to the pitch angle gives a type I system:

θ(s)/Me(s) = (s + 3)/[s (s² + 4 s + 5)]     poles: s = 0, −2 ± 1 j,   type: I

If we put a PID controller in the loop, which adds a pole at the origin, the system will be type II.


## 1.3.6 The Impulse Response of a System

When we have a system such as in figure 7, the Laplace transform of the
output (a signal) is given by


Y(s) = Gp(s) U(s)                                                        (17)

A unit impulse input is a very short pulse with an area under the curve of
the pulse of 1.0.
Since the Laplace transform of a unit-impulse is 1, then the Laplace
transform of the impulse response is the transfer function

Open Loop

Closed Loop

Figure 8: A plant with TF G p (s) in open and closed loop. Closed loop requires a
sensor.
Feedback is the basic magic of controls. A feedback controller can
Make an unstable system stable . . . . . . . . . . . . . . . . . . Helicopter autopilot
Make a stable system unstable . . . . . . . . . . . . . . . . Early fly-ball governors

## Make a slow system fast . . . . . . . . . . . Motor drive, industrial automation

Make a fast system slow . . . . . . . . F16 controls, approach / landing mode

Figure 7: The transfer function is the Laplace transform of the impulse response: u(t) an impulse, U(s) = 1, gives y(t), the impulse response, with Y(s) = Gp(s).

The connection between the impulse response and TF can be used to determine the TF: apply an impulse and measure the output, y(t). Take the LT and use Gp(s) = Y(s).

The connection between the impulse response and TF helps to understand the mathematical connection between an LT and a TF.

Open loop:   Y(s)/U(s) = Gp(s)

Closed loop:


Y(s) = Gp(s) (R(s) − Y(s)) = Gp(s) R(s) − Gp(s) Y(s)

Y(s) (1 + Gp(s)) = Gp(s) R(s)

Y(s)/R(s) = Gp(s) / (1 + Gp(s))                                          (18)

Y(s)/R(s) = Forward Path Gain / (1 + Loop Gain) = Try(s)                 (19)


Figure 9: Simple feedback of aircraft pitch angle.

θ(s)/R(s) = Gp(s)/(1 + Gp(s)) = [(s + 3)/(s (s² + 4 s + 5))] / [1 + (s + 3)/(s (s² + 4 s + 5))]      (20)

Multiplying numerator and denominator by s (s² + 4 s + 5):

θ(s)/R(s) = (s + 3) / [s (s² + 4 s + 5) + (s + 3)]                       (21)

The closed-loop TF is still not quite in Rat Poly form; here is the final step:

θ(s)/R(s) = (s + 3)/(s³ + 4 s² + 6 s + 3)                                (22)

Analyzing the response:

Gps = tf([1 3], [1 4 5 0]);
Try = tf([1 3], [1 4 6 3]);
figure(1), clf
[Y_open, Top]   = step(Gps, 6);
[Y_closed, Tcl] = step(Try, 6);
plot(Top, Y_open, Tcl, Y_closed)
xlabel('t [seconds]');
ylabel('\Omega, pitch-rate')
title('Open- and closed-loop')
text(3, 1.6, 'Open-loop', 'rotation', 45)
text(4, 0.8, 'Closed-loop')
SetLabels(14)
print('-deps2c', 'OpenClosedResponse1')
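The closed-loop denominator used in the tf(...) call above follows from T(s) = Gp/(1 + Gp) = Np/(Dp + Np): for unity feedback, add the (degree-padded) open-loop numerator to the open-loop denominator. A sketch of that coefficient arithmetic:

```python
# Open-loop aircraft TF: Gp(s) = (s + 3) / (s^3 + 4 s^2 + 5 s)
num = [1, 3]            # s + 3
den = [1, 4, 5, 0]      # s^3 + 4 s^2 + 5 s

# Pad the numerator to the denominator's length, then add term-by-term.
pad = [0] * (len(den) - len(num)) + num
den_cl = [d + n for d, n in zip(den, pad)]
assert den_cl == [1, 4, 6, 3]   # s^3 + 4 s^2 + 6 s + 3, as in Eqn (22)
```

This is why the MATLAB snippet writes the closed-loop system as tf([1 3], [1 4 6 3]).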

The open-loop system is type I; the closed-loop system is type 0. The response completely changes!

Figure 11: Open and Closed loop response of the aircraft; the two responses have very different characteristics.

Figure 12: Feedback for aircraft pitch control, with P-type gain Kc.

Look at Kc = 1.0, 3.0, 10.0:

Kc = 1;  Try1 = tf(Kc*[1 3], [1 4 5 0]+Kc*[0 0 1 3]);
Kc = 3;  Try2 = tf(Kc*[1 3], [1 4 5 0]+Kc*[0 0 1 3]);
Kc = 10; Try3 = tf(Kc*[1 3], [1 4 5 0]+Kc*[0 0 1 3]);
figure(1), clf
...
plot(Top, Y_open, Tcl, Y_closed1, ...)
...

A basic loop has three components:

1. Plant (the thing being controlled)
2. Controller (drives the plant)
3. A sensor

Figure 14: Basic loop with a plant, compensator and sensor: r(t) → e(t) → Controller KcGc(s) = Kc Nc(s)/Dc(s) → u(t) → Plant Gp(s) = Np(s)/Dp(s) → y(t), with Sensor Dynamics Hy(s) = Ny(s)/Dy(s) feeding ys(t) back.

The TF is given as

Try(s) = Y(s)/R(s) = Forward Path Gain / (1 + Loop Gain) = Kc Gc Gp / (1 + Kc Gc Gp Hy)

Often, for the controls engineer the plant, Gp(s), is set (e.g., the designer of a cruise-control does not get to change the engine size).

Figure 13: Open and Closed loop response of the aircraft, with Kc = 1.0, Kc = 3.0, and Kc = 10.0. The system gets much faster as Kc increases.

As a controls engineer, we get to pick Gc(s) and maybe influence Hy(s) (e.g., by convincing the project manager to spring $$$ for a better sensor).


## 2.2 Common controller structures

PD (Proportional-Derivative):   Gc(s) = kd s + kp
PI (Proportional-Integral):     Gc(s) = (kp s + ki)/s
PID (Prop.-Int.-Derivative):    Gc(s) = (kd s² + kp s + ki)/s
Lead-Lag:                       Gc(s) = Kc (s + z1)(s + z2) / [(s + p1)(s + p2)]

Common Applications

PID: Many, many places; Astrom has estimated that 80% of controllers are PID (good speed, accuracy, stability).

Lead-Lag: Used where a pole at the origin is unacceptable; can be as good as PID (notice, 5 parameters rather than 3).

PI: Velocity control of motor drives, temperature control (good speed and accuracy, acceptable stability).

PD: Position control where high accuracy is not required (good speed and stability, so-so accuracy).

## 2.3 Analyzing other loops

Figure 15: Basic loop with a disturbance input, d(t), and sensor noise, Vs(t). The disturbance enters through a disturbance filter Gd(s) = Nd/Dd and the reference through an input-shaping filter Hr(s); the loop has controller KcGc(s) = Kc Nc/Dc, plant Gp(s) = Np/Dp, and sensor dynamics Hy(s) = Ny/Dy.
In some cases we may want to consider additional inputs and outputs.
Many systems have a disturbance signal that acts on the plant, think of
wind gusts and a helicopter autopilot.
All systems have sensor noise.
Any signal in a system can be considered an output. For example, if we
want to consider the controller effort, uc (t), arising due to the reference
input
Tru(s) = Uc(s)/R(s) = Forward Path Gain / (1 + Loop Gain) = Hr(s) Kc Gc(s) / (1 + Kc Gc(s) Gp(s) Hy(s))      (23)

Tde(s) = E(s)/D(s) = Gd(s) Gp(s) Hy(s) (−1) / (1 + Kc Gc(s) Gp(s) Hy(s))      (24)

## 2.3 Analyzing other loops (continued)

As a final example, let's consider the output arising with sensor noise:

Tvy(s) = Y(s)/Vs(s) = Hy(s) (−1) Kc Gc(s) Gp(s) / (1 + Kc Gc(s) Gp(s) Hy(s))      (25)

The example transfer functions, Eqns (23), (24) and (25), show some interesting properties. The TFs are repeated here (omitting the many "(s)"s):

Try(s) = Hr Kc Gc Gp / (1 + Kc Gc Gp Hy)          Tru(s) = Hr Kc Gc / (1 + Kc Gc Gp Hy)

Tde(s) = Gd Gp Hy (−1) / (1 + Kc Gc Gp Hy)        Tvy(s) = Hy (−1) Kc Gc Gp / (1 + Kc Gc Gp Hy)

If we consider what happens as Kc → ∞, we can see what happens for very high gain. For this, assume that Hr(s) = 1.0 and Gd(s) = 1.0, since these two terms merely pre-filter inputs.

When Kc → ∞, 1 + Kc Gc Gp Hy → Kc Gc Gp Hy, so

Try(s) → Kc Gc Gp / (Kc Gc Gp Hy) = 1/Hy

Tru(s) → Kc Gc / (Kc Gc Gp Hy) = 1/(Gp Hy)

Tde(s) → Gd Gp Hy (−1) / (Kc Gc Gp Hy) = −Gd / (Kc Gc)

Tvy(s) → Hy (−1) Kc Gc Gp / (Kc Gc Gp Hy) = −1
Kc Gc G p Hy

Try(s) → 1/Hy(s) shows that the TF of the plant can be compensated: it disappears from the closed-loop TF as Kc → ∞.

Try(s) → 1/Hy(s) also shows that the TF of the sensor can not be compensated. If the characteristics of Hy(s) are bad (e.g., a cheap sensor), there is nothing feedback control can do about it!

Tde(s) → −Gd/(Kc Gc) shows that disturbances can be compensated: as Kc → ∞, errors due to disturbances go to zero.

The denominators are all the same:
- The poles are the same for any input/output signal pair.
- The stability and damping (both determined by the poles) are the same for any signal pair.

The numerators are different:
- The zeros are in general different for each input/output signal pair.
- Since the numerators help determine if the signal is small or large, signals may have very different amplitudes and phase angles.

Tru(s) → 1/(Gp Hy) shows that Uc(s) does not go up with Kc, and also, if the plant has a small gain (Gp(s) is small for some s = jω) then a large control effort will be required for a given input.

Tvy(s) → −1 shows that there is no compensation for sensor noise. If there is sensor noise, it is going to show up in the output!

Summary:
Feedback control can solve problems arising with characteristics of the
plant, G p (s), and disturbances, d (t).
Feedback control can not solve problems with the sensor, Hy (s), or
sensor noise, vs (t) .
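The Kc → ∞ limits above can be watched numerically. The sketch below uses illustrative first-order blocks (Gc = 1, Gp = 1/(s+1), Hy = 2/(s+5) are assumptions, not from the notes) and evaluates Try at one frequency for growing Kc:

```python
# Numerical check that Try = Kc Gc Gp / (1 + Kc Gc Gp Hy) -> 1/Hy as Kc grows.
s = 1j                      # evaluate at omega = 1 rad/s
Gc = 1.0
Gp = 1.0 / (s + 1.0)
Hy = 2.0 / (s + 5.0)

def Try(Kc):
    return Kc * Gc * Gp / (1.0 + Kc * Gc * Gp * Hy)

err_small = abs(Try(10.0) - 1.0 / Hy)   # moderate gain: noticeable error
err_big = abs(Try(1e6) - 1.0 / Hy)      # huge gain: Try is essentially 1/Hy
assert err_big < err_small and err_big < 1e-4
```

The same loop with Tvy(Kc) would show the sensor-noise gain approaching −1.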


3 Analysis

The performance criteria any controls engineer should be aware of are seen in table 4. Many performance measures are introduced in first semester controls (we will review them here); those with (*) are introduced in 2nd semester controls.

As seen in table 4, some performance measures are evaluated in the time domain, and others in the frequency domain. The criteria fall into 3 groups:

                Speed                   Degree of Stability     Accuracy
Time            Rise Time               Stable / Unstable       ISE (*)
Domain          Settling Time           Overshoot               IAE (*)
                Peak Time               Damping Factor
Frequency       Pole Locations          Pole Locations          Disturbance Rejection
Domain          Bandwidth (*)           Phase Margin (*)        Noise Rejection (*)
(Or S-plane)    Cross-over Freq. (*)    Gain Margin (*)         Tracking Error (*)

Table 4: A range of controller specifications used in industry.
Note: (*) marks items developed in 2nd semester controls.

Depending on the performance measure, one of four methods of evaluation is used:

1. Evaluated directly from the transfer function (e.g., steady state error)
2. Evaluated by looking at the system response to a test signal (e.g., trise)
3. Evaluated from the pole-zero constellation (e.g., stability, settling time)
4. Evaluated from the bode plot (e.g., phase margin)

## 4 Working with the pole-zero constellation

4.1 Basics of pole-zero maps, 1st order, p1 = −σ

A real pole gives the terms of Eqn (26), as seen in figure 16:

Y(s) = C1/(s + σ) = 3/(s + 2)                                            (26)

y(t) = C1 e^(−σ t)

Figure 16: First order pole and impulse response. The pole at s = −2 in the S-plane (further left is faster) gives the impulse response h(t) = e^(−σ t) = e^(−t/τ) = e^(−2 t).

A real pole has these characteristics:

y(t) ~ e^(−t/τ), where τ [sec] = 1/σ is the time constant.

Further to the left indicates a faster response (smaller τ).

The pole-zero constellation does not show either the K_DC or the K_rlg of the TF, or the amplitude of the impulse response.
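The time-constant relation is worth one numeric check: after one time constant the impulse response has decayed to e⁻¹ ≈ 37%, and after four time constants it is within about 2% of zero (a sketch; the pole at s = −2 matches figure 16):

```python
import math

sigma = 2.0          # real pole at s = -2
tau = 1.0 / sigma    # time constant, tau = 1/sigma

def h(t):
    """Impulse response of C1/(s + sigma) with C1 = 1."""
    return math.exp(-sigma * t)

assert abs(h(tau) - math.exp(-1)) < 1e-12   # ~36.8% after one tau
assert h(4 * tau) < 0.02                    # ~2% after four taus
```

This is the "2% settling" intuition used later for second-order settling time.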


X Case: σ = 4 [sec⁻¹], τ = 1/4 [sec].    Case: σ = 16 [sec⁻¹], τ = 1/16 [sec].

Figure 17: A change in pole location changes the decay rate and damping (faster decay for poles further to the left).

Figure 18: A change in σ changes the time constant: τ = 0.25 [sec] versus τ = 0.06 [sec].

## 4.2 Complex pole pairs

Each complex pole pair gives a mode of the system response.

A complex pole pair gives the terms of Eqn (27), as seen in figure 19.

Using the Laplace transform pair

F(s) = (Bc s + (Bc σ + Bs ω)) / (s² + 2 σ s + ωn²)   ↔   f(t) = e^(−σ t) (Bc cos(ω t) + Bs sin(ω t))

one finds

Y(s) = (b1 s + b0) / (s² + 2 σ s + (σ² + ω²)) = (2 s + 14) / (s² + 3 s + 18.25)      (27)

y(t) = e^(−σ t) (Bc cos(ω t) + Bs sin(ω t)),  with σ = 1.5 and ω = 4.

Figure 19: Pole-zero map and impulse response for the complex pole pair p1, p1* = −1.5 ± j 4.

X Case: σ1 = 1 [sec⁻¹], PO = 44% [dim'less].    Case: σ2 = 4 [sec⁻¹], PO = 4% [dim'less].

Term         Description        Given by             Units
σ            Decay Rate         p1 = −σ + j ω        [sec⁻¹]
ω (or ωd)    Damped Nat. Freq.  p1 = −σ + j ω        [sec⁻¹]
ωn           Natural Freq.      ωn² = σ² + ω²        [sec⁻¹]
θ            Pole Angle         θ = atan2(σ, ω)      [deg]
ζ            Damping Factor     ζ = σ/ωn             Dim'less, [−]

Table 5: Factors derived from the location of a complex pole.
(Note: Franklin et al. often use σ, ωd and ωn.)

Rectangular:  p1 = −σ + j ω,   σ = ζ ωn,   ω = √(1 − ζ²) ωn

Polar:        p1 = ωn ∠ (90° + θ),   ζ = sin(θ)

H(s) = ωn √(1 − ζ²) / (s² + 2 ζ ωn s + ωn²) = ω / ((s + σ)² + ω²)

Table 6: The terms of table 5 relate to rectangular or polar form for the poles.

[Figures: PZ map showing the two systems, with lines of constant ζ (0.9, 0.7, 0.5, 0.3), and the corresponding step responses.]

Figure 22: Step response: a change in σ, with ω unchanged. ζ = 0.24 gives 46% overshoot; ζ = 0.71 gives 4% overshoot.

X Case: σ = 4 [sec⁻¹], PO = 4%.    Case: σ = 16 [sec⁻¹], PO = 4%.

Figure 23: A change in ω changes the oscillation frequency and damping (faster oscillation for poles further from the real axis); ζ = 0.24 gives 46% overshoot, ζ = 0.71 gives 4% overshoot.

Figure 25: A radial change in pole location changes the decay rate and oscillation frequency, but not the damping: ζ = 0.71 in both cases, 4% overshoot in both step responses, while the time constant changes (τ = 0.25 [sec] versus τ = 0.06 [sec]).

Note: The S-plane has units of [sec⁻¹].

## 4.2.3 The damping factor ζ

H(s) = b0 / (s² + s a1 + a0) = b0 / (s² + 2 ζ ωn s + ωn²)                (29)

giving:

ζ = a1 / (2 ωn) = a1 / (2 √a0)                                           (30)

ζ is defined by Eqn (30) for either real poles (ζ ≥ 1.0) or a complex pole pair (0.0 < ζ < 1.0).

ζ value      System characteristic
ζ < 0        Unstable
ζ = 0        Marginally Stable (poles on imaginary axis)
0 < ζ < 1    Under Damped (complex pole pair)
ζ = 1        Critical Damping (repeated real poles)
ζ > 1        Over Damped (two separate real poles)

Table 7: Ranges of damping factor.

Figure 28: Damping factor and stability: in the S-plane, the stable region (ζ > 0) is the left half-plane and the unstable region (ζ < 0) is the right half-plane.

As illustrated in figure 24, above, on the range 0.0 < ζ < 1.0 the damping factor relates to percent overshoot. For a system with two poles and no zeros, the percent overshoot is given by Eqn (31) and plotted in figure 27:

P.O. = 100 e^(−π ζ / √(1 − ζ²))                                          (31)

Figure 27: Percent overshoot versus damping factor. Exact for a system with two complex poles and no zeros, and approximate for other systems.
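Eqn (31) can be evaluated directly; a short sketch reproduces the two overshoot figures quoted above (ζ = 0.24 → about 46%, ζ = 0.71 → about 4%):

```python
import math

def percent_overshoot(zeta):
    """P.O. of a two-pole, no-zero system, Eqn (31); valid for 0 < zeta < 1."""
    return 100.0 * math.exp(-math.pi * zeta / math.sqrt(1.0 - zeta ** 2))

po_low_damping = percent_overshoot(0.24)    # ~46 % overshoot
po_high_damping = percent_overshoot(0.71)   # ~4 % overshoot
assert abs(po_low_damping - 46.0) < 1.0
assert abs(po_high_damping - 4.0) < 1.0
```

Inverting this relation (solving for ζ given a specified P.O.) is the usual first step when translating an overshoot spec into a pole-placement region.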

## 4.3 Determining approximate performance measures from a dominant second-order mode

Returning to performance measures: in section 3 we've seen that these are defined from the step response:

- Rise time
- Settling time
- Peak time
- Overshoot

Figure 29: Quantity definitions in a CL step response (step response of a complex pole pair): Tr, the 10-90% rise time; Ts, the 98% settling time; tp, the peak time; Mp, the overshoot; and yss, the steady-state value.

There are no equations that give these measures exactly for any system other than 2 poles and no zeros. But for this special case

T(s) = b0 / (s² + 2 ζ ωn s + ωn²)

the rise time is approximately

tr = 1.8 / ωn                                                            (32)

4.3.3 Settling time from pole locations

Settling time is sometimes defined as the time to approach within 4% or 2% or even 1% of the steady-state value. These give slightly different definitions of Ts. The one we will use (corresponding approximately to 2%, since e^(−4) ≈ 0.018) is:

ts = 4 / σ                                                               (33)
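Putting the estimates together, a sketch that maps a complex pole pair −σ ± jω to the approximate second-order measures (the peak-time relation tp = π/ω is the standard companion formula and is an addition here; the pole values are illustrative):

```python
import math

def estimates(sigma, omega):
    """Approximate t_r, t_s (2% definition), t_p from the pole -sigma + j omega."""
    omega_n = math.hypot(sigma, omega)   # omega_n^2 = sigma^2 + omega^2
    t_r = 1.8 / omega_n                  # Eqn (32)
    t_s = 4.0 / sigma                    # Eqn (33), ~2% settling
    t_p = math.pi / omega                # peak at half a period of the damped frequency
    return t_r, t_s, t_p

t_r, t_s, t_p = estimates(sigma=2.0, omega=1.0)   # illustrative poles -2 +/- 1j
```

These are rules of thumb: they are exact only for the two-pole, no-zero T(s) above, and approximate whenever that mode dominates.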


## 5 Design

5.1 Design methods

In Analysis, we use mathematical methods to determine performance specifications from a completed controller design (all structure and parameters specified):

   Completed Controller Design  →  (Analysis)  →  Performance Specifications

Design is the reverse: determine a controller structure and parameters to meet performance goals.

The major design methods are:

Root locus
  - Speed, Stability: determine by determining pole locations
  - Accuracy: increase the system type, check the SSE

Frequency response (*)
  - Speed: Bandwidth and Cross-over frequency directly from bode plot
  - Stability: Phase margin, Gain margin directly from bode plot
  - Accuracy: tracking accuracy, disturbance rejection directly from bode plot

State space (*)
  - Design using state-space design methods; check speed, stability and accuracy from the step response

Other approaches: mathematical methods, gut feeling, trial and error, calling a colleague with experience.

## 5.2 Root Locus Design

Devised by Walter R. Evans, 1948 (1920-1999).

W.R. Evans, "Control system synthesis by root locus method," Trans. AIEE, vol. 69, pp. 66-69, 1950.

Evans was teaching a course in controls, and a student (now anonymous) asked a

## 6 Summary

Every input or output signal of the system has a unique Laplace transform. The ratio of Laplace transforms, however, does not depend on the signals: the ratio depends only on properties of the system. We call it the transfer function. For example:

Guy(s) = (b1 s + b0)/(s² + a1 s + a0)                                    (34)

The transfer function gives us many results:
- The pole locations tell us the stability and damping ratio.
- We can get approximate values for rise time, settling time and other performance measures.

Control system analysis is the process of determining performance from a system model and controller design (step response, pole locations, Bode plot).

Design = Analysis⁻¹: it is the process of determining a controller design given a system model and performance goals. The root locus method is one method for controller design.

## 7 Glossary of Acronyms

LT: Laplace Transform
FPG: Forward path gain
LG: Loop Gain (also Gl(s))
RHP: Right-half of the S plane (unstable region)
LHP: Left-half of the S plane (stable region)
OL: Open loop
P, PD, PI, PID: Proportional-Integral-Derivative control; basic and very common controller structures.

## Building Linear System Models

(Part 5: Models of Linear Systems)

Contents

1 System modeling, classical vs. state-variable modeling
  1.2.1 Modeling with a single, higher-order differential equation (sometimes called classical modeling)
  1.2.2 Writing the n first-order differential equations in standard form
  1.2.3 Standard form for the linear, time-invariant state-variable model
  Why consider linear state-variable models? (when algebraic form is just fine)
2.1 A system maps inputs to outputs
2.2 Simple example system
3.1 Admissible signal
  Realizable system
  Continuous signals and systems, continuity in the mathematical sense
3.12 A Note on units
4 State, revisited
  4.1.2 For direct application of Superposition and Scaling, state must be zero
  4.1.3 Definition of a Linear System considering non-zero state (e.g., DeCarlo definition 1.8)
7.1 Example 1, an electrical circuit
8.1.1 Interpreting the transfer function
8.1.2 DC gain of a state-variable model
8.1.3 Interpreting D
9.2.1 Designing a pole placement controller
9.2.2 LQR Design
10 Conclusions
77

## 1 System modeling, classical vs. state-variable modeling

To introduce state modeling, let's first look at an example. Consider the RLC circuit of figure 1. The parameters and signals of the circuit are:

Parameters          Units                    Signals     Units
R  (resistance)     [volts/amp]              Vs(t)       [volts]     VL(t)   [volts]
L  (inductance)     [volts/(amp/sec)]        is(t)       [amps]      iL(t)   [amps]
C  (capacitance)    [amps/(volt/sec)]        VR(t)       [volts]     VC(t)   [volts]
                                             iR(t)       [amps]      iC(t)   [amps]

Table 1: Parameters and signals of the RLC circuit.

Figure 1: The RLC circuit: voltage source Vs(t) (current is), resistor R (current iR), inductor L (current iL) and capacitor C (current iC, voltage VC(t)), with L and C in parallel.
The constituent relations are:

VR(t) = R iR(t)                                            (1)
VL(t) = L d iL(t)/dt,    iC(t) = C d VC(t)/dt              (2)

The continuity constraints are (Kirchhoff's laws for electrical systems):

VR(t) + VL(t) - Vs(t) = 0                                  (3)
VC(t) - VL(t) = 0,   is(t) - iR(t) = 0,   iR(t) - iL(t) - iC(t) = 0    (4)

Combining the node equation in (4) with the constituent relations:

(1/R) (Vs(t) - VC(t)) - iL(t) - C d VC(t)/dt = 0

which rearranges to

d VC(t)/dt + (1/(RC)) VC(t) + (1/C) iL(t) = (1/(RC)) Vs(t)     (5)

where known quantities are on the right and unknowns on the left.

In Equation (5) we have the unknown signals VC(t) and iL(t) on the left; we must eliminate one of them to have one equation in one unknown. Take the derivative of Eqn (5) to produce d iL(t)/dt, and use the inductor constituent relation and VC(t) = VL(t), to give:

d^2 VC/dt^2 + (1/(RC)) d VC/dt + (1/C) d iL/dt = (1/(RC)) d Vs/dt

d^2 VC/dt^2 + (1/(RC)) d VC/dt + (1/(LC)) VC(t) = (1/(RC)) d Vs/dt    (6)

## Limitations of classical modeling

* Complexity goes up quickly in the number of variables: a 3rd-order model requires 4-5X as much algebra as a 2nd-order model, and a 4th-order model 10-20X more.
* The significance of the initial conditions is unclear: we need VC(t0), dVC(t0)/dt, ...; how do we get dVC(t0)/dt?

Two approaches to modeling this type of dynamic system:

1. Classical modeling, with a single high-order differential equation
2. State-variable modeling
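The classical model of Eqn (6) can be exercised numerically. A minimal Python sketch (the Matlab examples come later in these notes), treating Eqn (6) as the transfer function VC(s)/Vs(s) = (s/RC)/(s^2 + s/RC + 1/LC); the component values here are illustrative assumptions, not values from the notes:

```python
import numpy as np
from scipy import signal

R, L, C = 100.0, 0.1, 10e-6           # assumed component values [ohm, H, F]
num = [1/(R*C), 0]                    # numerator:   (1/RC) s
den = [1, 1/(R*C), 1/(L*C)]           # denominator: s^2 + (1/RC) s + (1/LC)
sys = signal.TransferFunction(num, den)

t, y = signal.step(sys)               # V_C(t) for a unit step in V_s(t)
print(y[0], y[-1])                    # starts near 0 and decays back toward 0
```

The zero at s = 0 in the numerator means the capacitor voltage responds to the *derivative* of the step, then decays back to zero.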

Determine the state variables of the system; these are the variables that make up the initial condition for the system (formal definition and more examples coming later).

* For an nth-order system, there are n states.
* For the RLC circuit, there are two states; we can choose them to be VC(t) and iL(t), giving the state vector:

x(t) = [ VC(t) ; iL(t) ]

Systems generally have many signals. Table 1 shows 8 signals for this simple circuit. Only 2 are states, and 1 is an input, so 5 are just signals within the system.

A deep property of state-variable modeling: all of the signals in the system have a unique expression in terms of the states and inputs.

* If no expression exists, the model does not have enough states, and
* if there are multiple possible expressions, the model has redundant states that should be removed from the state vector.

Examples of using Eqns (1)-(4) to write other signals in terms of the states, VC(t) and iL(t), and the input Vs(t):

VL(t) = VC(t)
VR(t) = Vs(t) - VL(t) = Vs(t) - VC(t)
iR(t) = VR(t)/R = (1/R) (Vs(t) - VC(t))

We are working toward the state equation:

d x(t)/dt = A x(t) + B u(t)                                (7)

Starting with differential equations that come directly from the model:

d VC(t)/dt = (1/C) iC(t)                                   (8)
d iL(t)/dt = (1/L) VL(t)                                   (9)

Note that the derivative is written on the left, and all signals that determine the derivative are on the right. All signals on the right must be states, {VC(t), iL(t)}, or inputs, {Vs(t)}. Notice that in Eqns (8) and (9) the right-hand terms include iC(t) and VL(t), neither of which is a state or an input.

Use the basic modeling equations to re-write the differential equations with only states and inputs on the right-hand side:

d VC(t)/dt = (1/C) iC(t) = (1/C) (iR(t) - iL(t)) = (1/C) ( (1/R)(Vs(t) - VC(t)) - iL(t) )    (10)
d iL(t)/dt = (1/L) VL(t) = (1/L) VC(t)                     (11)

The right-hand side written with states and inputs as the only signals:

d VC(t)/dt = -(1/(RC)) VC(t) - (1/C) iL(t) + (1/(RC)) Vs(t)    (12)
d iL(t)/dt = (1/L) VC(t)                                        (13)

Figure 2: Block diagram showing the elements of a state-variable model, including the input u(t), output y(t) and state x(t), and model matrices A, B, C, D.

Put the model equations in matrix/vector form:

d x(t)/dt = A x(t) + B u(t)                                     (14)
y(t) = C x(t) + D u(t)

Putting Eqns (12) and (13) in the form of (14):

d x(t)/dt = d/dt [ VC(t) ; iL(t) ] = [ -1/(RC), -1/C ; 1/L, 0 ] [ VC(t) ; iL(t) ] + [ 1/(RC) ; 0 ] Vs(t)    (15)

y(t) = [ 1, 0 ] [ VC(t) ; iL(t) ] + [ 0 ] Vs(t)                 (16)

where the state vector and input are:

x(t) = [ VC(t) ; iL(t) ],    u(t) = Vs(t)

Name                  Symbol                   Units
State vector          x(t) = [VC(t); iL(t)]    [volts ; amps]
Input                 u(t) = Vs(t)             [volts]
Output vector         y(t)                     [volts]
Initial condition     x(t0)                    [volts ; amps]
System matrix         A                        [sec^-1, volt/(amp-sec) ; amp/(volt-sec), sec^-1]
Input matrix          B                        [sec^-1 ; amp/(volt-sec)]
Output matrix         C                        [-, volt/amp]
Feed-forward matrix   D                        [-]

Table 2: List of parameters and signals of the State-Variable Model. Each of the seven elements has a name, and each has physical units, which depend on the details of the system.

## States are signals (i.e., functions of time)

States have units (like all physical signals)
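The state-variable model of Eqns (15)-(16) can be checked against the classical model of Eqn (6): the eigenvalues of A must satisfy the same characteristic polynomial. A Python/NumPy sketch, with illustrative (assumed) component values:

```python
import numpy as np

R, L, C = 100.0, 0.1, 10e-6            # assumed component values [ohm, H, F]
A = np.array([[-1/(R*C), -1/C],
              [ 1/L,      0.0]])       # system matrix of Eqn (15)
B = np.array([[1/(R*C)], [0.0]])       # input matrix of Eqn (15)
Cmat = np.array([[1.0, 0.0]])          # output matrix: y = V_C(t), Eqn (16)
D = np.array([[0.0]])

# The eigenvalues of A are the poles; their characteristic polynomial should
# match the classical model (6): s^2 + (1/RC) s + (1/LC) = s^2 + 1000 s + 1e6.
charpoly = np.real(np.poly(np.linalg.eigvals(A)))
print(charpoly)
```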

Prior to the development of state-variable approaches, analytic work focused on algebraic equations for modeling.

* Many properties of the system are made clear only by the state-variable approach: coordinate transformations and modal coordinates, for example.
* Example: until the publication of Mason's rule in 1953, it was not known when control-system equations could be simplified.

Interest in state-variable modeling for analysis and design of systems accelerated with the early work of Kalman, Bellman and others:

* Rudolf E. Kalman, "On the General Theory of Control Systems," Proc. 1st Inter. Conf. on Automatic Control, IFAC: Moscow, pp. 481-493, 1960.
* Richard E. Bellman, Dynamic Programming, Princeton University Press, 1957.
* Rudolf E. Kalman, "Mathematical Description of Linear Systems," SIAM J. Control, vol. 1, pp. 152-192, 1963.
* Vasile M. Popov, "On a New Problem of Stability for Control Systems," Automatic Remote Control, 24(1):1-23, 1963.

Why consider linear models? Many types of systems are modeled with sets of linear equations:

* Plants and controllers for control theory
* Signal processing
* And more

In fact, the only models which can be worked out in a general way are linear:

* Many truly nonlinear results are known, but they are applicable only in specific cases.
* Nonlinear systems are often analyzed by linearizing them about an operating point.

To apply the methods of linear analysis: any time there are two or more variables, vectors and matrices become powerful tools for analysis.

* Earlier algebraic approaches do not extend well to multi-variable systems (such as multi-input, multi-output (MIMO) systems).
* Powerful mathematical results for linear systems are applicable to problems from all domains (geometric interpretation!).
* Vector spaces bring us notions of size, orthogonality, independence, sufficiency, and degrees of freedom (DOF), which apply equally well across all domains.

The computational tools to work with vectors and matrices were being introduced at about the same time:

* James Hardy Wilkinson, Rounding Errors in Algebraic Processes, Englewood Cliffs: Prentice-Hall, 1963.
* George E. Forsythe and Cleve B. Moler, Computer Solution of Linear Algebraic Systems, Englewood Cliffs: Prentice-Hall, 1967.

Cleve Moler went on to co-found MathWorks (Matlab), and is MathWorks' chief scientist today.

## 2 Formal properties of systems

Definition: A system is something which maps inputs to outputs.

Figure 3: An aircraft as a system, mapping inputs u(t) (aileron, rudder, elevator, throttle) to outputs y(t) (pitch, roll, yaw, velocity).

* Analysis: What can we say about how an aircraft responds?
* Design: How do we engineer the aircraft so that it responds well?
* Control: How do we pick u(t)?

Starting point:

* What is a system?
* What are the properties of a system?

## 2.1 A system maps inputs to outputs

y(t) = N[u(t)]                                             (18)

Figure 4: A system N{} maps u(t), from the space of all possible input signals, to y(t), in the space of all possible output signals.

Examples (describe the components and signals):

System               Components                              Signals
Car Cruise Control                                           v(t), RPM(t)
Rocket Guidance      rocket, engine, steering hydraulics
Economy
Biosphere

## 2.2 Simple example system

Figure 5: Simple example system with a single state: an RC circuit, with input u(t), output y(t), and capacitor voltage Vc(t).

Input signal, unit step function:

u(t) = 1+(t) = { 0, t < 0 ;  1, t >= 0 }                   (19)

Look at two cases:

* Case 1: VC(t = t0) = 0.0 [volts]
* Case 2: VC(t = t0) = 3.0 [volts]

Solving the differential equation gives:

y1(t) = 1 - e^(-t/(RC)),      t > 0
y2(t) = 1 + 2 e^(-t/(RC)),    t > 0
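The two responses above can be tabulated numerically; a short Python sketch (the time constant RC = 1 sec is an illustrative assumption):

```python
import numpy as np

RC = 1.0                                  # assumed time constant [sec]
t = np.linspace(0.0, 10.0, 1001)
y1 = 1 - np.exp(-t / RC)                  # Case 1: V_C(t0) = 0 [volts]
y2 = 1 + 2 * np.exp(-t / RC)              # Case 2: V_C(t0) = 3 [volts]

print(y1[0], y2[0])                       # 0.0 3.0 -- the responses differ at t = 0
print(y1[-1], y2[-1])                     # both approach 1.0 as the initial state decays
```

Same input, different outputs: the difference is entirely due to the condition inside the system at t = t0.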

Figure 6: Responses y1(t) and y2(t) of the RC circuit.

So the system response, y(t), depends on two things:

* the input u(t), and
* something inside the system at t = t0, in this case VC(t = t0).

Definition: The state of the system is information about conditions inside the system at t = t0 that is sufficient and necessary to uniquely determine the system output.

Write:

y(t) = N[u(t), x(t0)]                                      (20)

## 3 Characteristics of signals and systems

Each of the characteristics below can apply to signals and to systems; these have formal mathematical definitions (see Bay, section 1.1):

* Linear
* Time Invariant
* Causal (Non-anticipative)
* Realizable
* Lumped
* Continuous
* Discrete
* Quantized
* Deterministic

Table 3: Characteristics of Signals and Systems.

* The number of state variables is equal to the order of the differential equation required to describe the system.
* It is equal to the number of independent storage elements in the system (such as independent capacitors, inductors, masses and springs).
* The selection of state variables is not unique (more on this, and state-variable transformations, later).

## 3.1 Admissible signal

A signal is admissible if it has mathematical properties such that it has a Laplace transform (all physical signals are admissible). Signal u(t) is admissible if:

i) it is piecewise continuous (discontinuous at most at a finite number of locations in any finite interval);

ii) ∃ t0 s.t. u(t) ≡ 0 ∀ t ≤ t0   (∃: there exists; ∀: for all);

iii) u(t) is exponentially bounded (there exists an exponential signal which goes to infinity faster than u(t));

iv) u(t) is a vector of the correct dimension (e.g., for aircraft inputs [aileron, rudder, elevator, throttle], u(t) ∈ R^4).

## 3.2 Linear system (without internal variables)

System: a system is linear if and only if (iff) it obeys the superposition rule (which incorporates scaling and homogeneity).

Given:

y1 = N[u1],   y2 = N[u2],   and   u3 = c1 u1 + c2 u2

Then:

y3 = N[u3] = c1 y1 + c2 y2

Related:

Scaling:      N[c1 u1] = c1 y1
Homogeneity:  N[0] = 0

## 3.3 Time-invariant system (without internal variables)

A system is time invariant if a time-shifted input gives the time-shifted output. Given y = N[u], then:

N[u(t - T)] = y(t - T)

Examples:

* Time Invariant: a boiler — fuel in gives heat out (independent of clock time t0).
* Not Time Invariant (time varying) — Example: the economy. Define u(t) = loans for buying corn seed; the effect of the loan depends on the season in which it is made, so shifting the input in time changes the response.
* Another time-varying example: aircraft dynamics,

  d x(t)/dt = A(t) x(t) + B(t) u(t)
  y(t) = C(t) x(t) + D(t) u(t)

  where A(t), B(t), C(t) and D(t) depend on altitude and Mach number.
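The superposition rule of Section 3.2 can be checked numerically. A sketch for a simple example system N[u] = 2u (a static gain, chosen only for illustration):

```python
import numpy as np

def N(u):
    return 2.0 * u                       # an example linear system: a static gain

rng = np.random.default_rng(0)
u1 = rng.standard_normal(5)              # two arbitrary input signals
u2 = rng.standard_normal(5)
c1, c2 = 3.0, -1.5

y3 = N(c1 * u1 + c2 * u2)                # response to the combined input
check = c1 * N(u1) + c2 * N(u2)          # scaled sum of individual responses
print(np.allclose(y3, check))            # True: superposition holds
```

Replacing N with, say, `u**2` makes the check fail, which is the point of the definition.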

## 3.4 Causal (non-anticipative) system

For a causal system, the output at time t0, y(t0), is completely determined by the inputs for t ≤ t0. That is: u(t1), t1 > t0, does not influence y(t0).

Causal example:                    y(t) = 2 u(t - 1)
Non-causal (anticipative) example: y(t) = 2 u(t + 1)

## 3.5 Realizable system

A realizable system is a physically possible system, one which in principle could be built. To be realizable, a system must be:

1. Causal

## 3.6 Lumped parameter system

A lumped parameter system is characterized by ordinary differential equations (the coefficients of the differential equation are the lumped parameters). Distributed systems are characterized by partial differential equations.

Figure 7: An RLC circuit.

When the frequency is not too high, the circuit of figure 7 is characterized by:

d^2 vc/dt^2 + (1/(RC)) d vc/dt + (1/(LC)) vc(t) = (1/(RC)) d vs/dt    (21)

When the frequency is high enough, wave phenomena become important. At some high frequency, lumped parameter models break down:

* Partial differential equations and wave propagation are required.
* A distributed system model is required.

All continuous-time systems have a cross-over frequency where distributed phenomena (often wave phenomena) become important. We will deal exclusively with lumped parameter systems.

## 3.7 Continuous-time signals and systems

Signal: A continuous signal is defined for all values of time, e.g.,

u(t) = 0.5 + 0.2 cos(t/3 + π/6) + 0.2 cos(2t/3) + 0.2 cos(2t + π)

System: A continuous (time) system is governed by differential equations, such as:

d x(t)/dt = -2 x(t) + 3 u(t)                               (22)
y(t) = x(t)

## 3.8 Discrete-time (sampled) signals and systems

Signal: A discrete (sampled) signal u(k) is defined (sampled) only at specific sample instants, t = tk (see figure 8), e.g.,

t(k) = [ 0.3,   0.6,   0.9,   1.2,   1.5,    1.8,    2.1,    2.4   ]
u(k) = [ 0.805, 0.297, 0.235, 0.400, 0.128, 0.0937, 0.5247, 0.528 ]

Figure 8: A continuous and sampled signal.

System: A discrete (sampled) system is governed by a difference equation, such as:

x_k = 2 x_{k-1} - 1 x_{k-2} + 3 u_k                        (23)
y_k = x_k

Usage: generally we say "sampled signal" and "discrete system", though all combinations are sometimes seen.

* Any computer-based data acquisition results in sampled signals; any computer-based signal processor is a discrete system.
* Some mathematical results are more straight-forward or intuitive (or possible) in one time domain or the other.

(Revised: Sep 10, 2012)
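A difference equation like Eqn (23) is simulated by direct iteration. A sketch for a unit-step input, with assumed zero initial conditions:

```python
# Iterate x_k = 2 x_{k-1} - 1 x_{k-2} + 3 u_k, y_k = x_k (Eqn 23),
# assuming zero initial conditions x_{-2} = x_{-1} = 0.
xs = [0.0, 0.0]                     # x_{-2}, x_{-1}
for k in range(5):
    u_k = 1.0                       # unit-step input
    x_k = 2*xs[-1] - 1*xs[-2] + 3*u_k
    xs.append(x_k)

print(xs[2:])                       # [3.0, 9.0, 18.0, 30.0, 45.0]
```

The growing output reflects the marginally unstable pole pattern of this particular difference equation; the mechanics of iteration are the point here.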

## 3.9 Continuous signals and systems, continuity in the mathematical sense

Signal: A signal is continuous if the limit of a sequence of values is the value of the limit:

lim_{t → t1} u(t) = u(t1)                                  (24)

i.e., if t → t1, then u(t) → u(t1).

System: A system is continuous if, when a sequence of input signals ui converges to u, then the corresponding sequence of outputs converges to the output signal of the limiting input. That is, if

lim_{i → ∞} ui = u                                         (25)

then

N[u] = lim_{i → ∞} N[ui]                                   (26)

## 3.10 Quantized signal

A signal can be quantized: the signal takes only certain possible values, for example

u(t) ∈ {0, 1/4, 1/2, 3/4, 1, 5/4, ...}                     (27)

Discrete signals are often quantized, but continuous signals can also be. Example signals with the possible combinations of the characteristics continuous/discrete and quantized are seen in figure 9.

Figure 9: Example signals: a) Continuous-time signal (with a discontinuity), b) Continuous-time, quantized signal, c) Discrete signal, d) Discrete, quantized signal.

## 3.11 Deterministic signals and systems

Signal: Deterministic signals have no random component. Examples:

* Deterministic signal: the ideal voltage on an RC circuit, with u(t) and VC(t0) given.
* Non-deterministic signal: wind gusts acting on a helicopter.

System: A deterministic system does not introduce random components into the output signals.

* Deterministic: an RLC circuit.
* Non-deterministic: the economy, biological systems.

## 3.12 A Note on units

* All physical signals have units.
* Systems with physical inputs and outputs have units. The units of the system are [Output] / [Input].

## 4 State, revisited

We saw above that internal variables partially determine the output of a system. Look at two cases for the RC circuit driven by u(t) = 1+(t):

Case 1: vc(t = t0) = 0 [volts]:   y1(t) = 1 - e^(-t/(RC)),    t > 0
Case 2: vc(t = t0) = 3 [volts]:   y2(t) = 1 + 2 e^(-t/(RC)),  t > 0

Figure 10: Simple example system (the RC circuit) and its responses y1(t) and y2(t).

These internal variables are called states.

Definition: The state of a system at a time t0 is the minimum set of internal variables which is sufficient to uniquely specify the system outputs given the input signal over [t0, ∞).

Examples of states:

* Elementary dynamics
* Circuits: voltages across capacitors, currents through inductors
* Fluid systems
* Economy: balances in accounts, levels of material in inventories, position of material in transit

Notes:

* The number of states is equal to the order of the differential (or difference) equation of the model.
* States are often associated with energy-storage elements. More generally, states are associated with storage of something, where the stored amount changes with time, giving:

  d (Amount Stored)/dt = f(x, u, t)

* The definition of the states of a system is not unique; consider analyzing a circuit for a voltage, or for a current.
* Keep in mind that the state variables must be independent. Consider the circuit of figure 11.

Figure 11: Circuit with inductors L1 and L2 in series, driven by Va(t) through R1. This system has only one state, iL1(t).

## 4.1 Modified definition of linearity, considering state

4.1.1 For direct application of Superposition and Scaling, state must be zero

For superposition and scaling to apply to a system in the simple way, the internal states must be zero. Consider:

y(t) = N[u(t), x(t0)] = u(t) + x(t0)                       (28)

so with u3 = u1 + u2,

y3 = N[u3, x(t0)] = u1(t) + u2(t) + x(t0)                  (29)

but

y1 + y2 = u1(t) + u2(t) + 2 x(t0)                          (30)

So x(t0) ≡ 0 is required for Eqns (29) and (30) to be consistent. (0 is the Null or zero vector; it is a vector of zeros.)

Definition: zero state response:

y(t) = N[u(t), 0] = u(t) + 0                               (31)
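The failure of superposition for non-zero state, Eqns (28)-(30), is easy to see numerically. A sketch using the same example system N[u, x0] = u + x0:

```python
def N(u, x0):
    return u + x0                       # the example system of Eqn (28)

u1, u2, x0 = 2.0, 5.0, 3.0
y3 = N(u1 + u2, x0)                     # response to the combined input, Eqn (29)
y1_plus_y2 = N(u1, x0) + N(u2, x0)      # sum of individual responses, Eqn (30)

print(y3, y1_plus_y2)                   # 10.0 13.0 -- they differ by x0
assert N(u1 + u2, 0.0) == N(u1, 0.0) + N(u2, 0.0)   # superposition holds in the zero state
```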

4.1.2 Definition of a Linear System considering non-zero state (e.g., DeCarlo definition 1.8)

Let N[u, x(t0)] be the response of system N[·] to the input signal u(·) defined over [t0, ∞), with initial state x(t0).

Definition: system N[·] is linear if and only if, for any two admissible input signals u1 and u2, and for any scalar k,

k ( N[u1, x(t0)] - N[u2, x(t0)] ) = N[ k (u1 - u2), 0(t0) ]     ∀ x(t0) ∈ R^n    (32)

where 0(·) is the zero vector.

For linear systems, the response can be factored into the response due to the initial state and the response due to the input. (Student exercise.)

4.1.3 Time-invariant system, considering non-zero state

The definition of a time-invariant system is a bit more complex when state is considered, because we must account for the time-shifted state.

A system is time-invariant if, ∀ t ≥ t0, ∃ x1 ∈ R^n such that

NT[ N[u, x(t0)] ] = N[ NT[u], x1(t0 + T) ]                 (33)

where NT[·] is the time-delay system:

NT(u(t)) = u(t - T).                                       (34)

Interpretation: Eqn (33) says that there exists a possibly different initial condition x1(t0 + T) such that the delayed output of the original system with IC x(t0) is identical to the output of the system with a delayed input and the new IC x1(t0 + T).

Study question: For a linear time-invariant system, what is the relationship between x(t0) and x1(t0 + T)?

## 5 Notation for state-variable models

The notation for a state model depends on its properties: whether it is linear or nonlinear, time-invariant or time-varying, continuous or discrete, etc.

Most general continuous case, nonlinear and time-varying:

State equation:   d x(t)/dt = f(x(t), u(t), t)             (35)
Output equation:  y(t) = g(x(t), u(t), t)                  (36)

For the nonlinear, time-invariant system, time is no longer an argument of f() and g():

State equation:   d x(t)/dt = f(x(t), u(t))                (37)
Output equation:  y(t) = g(x(t), u(t))                     (38)

The linear, time-invariant, continuous-time system we have seen (and will be the one we most commonly use):

State equation:   d x(t)/dt = A x(t) + B u(t)              (39)
Output equation:  y(t) = C x(t) + D u(t)                   (40)

For the linear, time-varying, continuous-time system we add (t) as an argument to the model matrices:

d x(t)/dt = A(t) x(t) + B(t) u(t)                          (41)
y(t) = C(t) x(t) + D(t) u(t)                               (42)

The discrete-time system does not have derivatives; the system equation gives the state at the next sample instant:

x(k+1) = A x(k) + B u(k)                                   (43)
y(k) = C x(k) + D u(k)                                     (44)

And if the discrete-time system is time-varying, A, B, C, D become functions of the sample index:

x(k+1) = A(k) x(k) + B(k) u(k)                             (45)
y(k) = C(k) x(k) + D(k) u(k)                               (46)

All models have x(t0) as the IC. Also, see DeCarlo example, Eqn (1.24).

Figure 12: Configuration of the signals (vectors) and parameters (matrices) of a state-variable model: dx(t)/dt [n x 1] = A [n x n] x(t) [n x 1] + B [n x m] u(t) [m x 1]; y(t) [p x 1] = C [p x n] x(t) + D [p x m] u(t). Here n is the number of states, m the number of inputs, and p the number of outputs.
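The discrete-time model of Eqns (43)-(44) is exercised by iterating the state update. A Python sketch with illustrative (assumed) matrices and a unit-step input:

```python
import numpy as np

A = np.array([[0.9, 0.1],
              [0.0, 0.8]])           # assumed stable matrix (|eigenvalues| < 1)
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

x = np.zeros((2, 1))                 # zero initial state x(0)
ys = []
for k in range(100):
    u = np.array([[1.0]])            # unit-step input u(k) = 1
    ys.append((C @ x + D @ u).item())    # y(k) = C x(k) + D u(k), Eqn (44)
    x = A @ x + B @ u                # x(k+1) = A x(k) + B u(k), Eqn (43)

# Steady state: x_ss = (I - A)^{-1} B u, so y converges to C x_ss = 5.0 here
print(ys[-1])
```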

## 6 Procedure for building a state-variable model (of a linear system)

1. Write the relevant relations for the system
   (a) Define symbols for the signals and parameters
   (b) Write the equations:
       i. Constituent relations (for elements)
       ii. Continuity constraints (for how elements are linked into a system)
   (c) Record the units; verify that units balance in the equations.
       The equations express laws of physics — the units must balance.

2. Pay special attention to the states of the system
   (a) Determine the system order
       * The system order will almost always be the sum of the orders of the contributing differential equations.
       * Rarely, differential equations may be inter-dependent in a way that reduces the order.
   (b) Select the state variables
       * The choice is not unique.
       * Often the storage variables are a good choice (often called physical coordinates).
       * With experience, it is usually pretty straight-forward to determine the states.

3. Write the differential equations in state-variable form
   (a) Higher-order differential equations are written as a chain of first-order equations.
   (b) Put the derivatives on the left-hand side; these must be the state derivatives.
   (c) All signals on the right-hand side must be expressed in terms of the states and inputs.

4. Write the equation of the output signal (or signals) using the states and inputs.

5. Check units throughout, to verify correctness.

Essential things to keep in mind:

* Signals are functions of time and change when the input (signals) change.
* Parameters are generally constant (or slowly varying) and are properties of the system.
* Both have physical units.

## 7.1 Example 1, an electrical circuit

Figure 13: Electrical circuit: current source Is(t) (voltage VIs(t)), resistor R1 (IR1(t), VR1(t)), inductor L1 (IL1(t), VL1(t)), capacitor C1 (Ic1(t), Vc1(t)) and resistor R2 (IR2(t), VR2(t)); node voltages v1(t) and v2(t).

1. Write the relevant relations for the system

(b) Write the equations

i. Constituent relations:

Ic1(t) = C1 d Vc1(t)/dt
VL1(t) = L1 d IL1(t)/dt

ii. Continuity constraints:

* Sum of the voltages around a loop (+ if you enter at the + terminal) is zero.
* Sum of the currents entering a node is zero.

(c) Record the units; verify that units balance in the equations.

Signals                      Units       Parameters                 Units
Is(t)    supply current      [amps]      R1, R2   resistance        [volts/amp]
Vs(t)    supply voltage      [volts]     C1       capacitance       [amp-sec/volt]
IR1(t)   R1 current          [amps]      L1       inductance        [volt-sec/amp]
VR1(t)   R1 voltage          [volts]
IR2(t)   R2 current          [amps]
VR2(t)   R2 voltage          [volts]
Ic1(t)   C1 current          [amps]
Vc1(t)   C1 voltage          [volts]
IL1(t)   L1 current          [amps]
VL1(t)   L1 voltage          [volts]

Table 4: Signals and parameters of the electrical circuit.

Units check in the constituent relations:

Ic1(t) = C1 d Vc1(t)/dt :   [amps]  = [amp-sec/volt] [volts/sec]
VL1(t) = L1 d IL1(t)/dt :   [volts] = [volt-sec/amp] [amps/sec]

2. Identify the differential equations:

Ic1(t) = C1 d Vc1(t)/dt
VL1(t) = L1 d IL1(t)/dt

3. Write the differential equations in state-model form

(a) Higher-order differential eqns are written as a chain of first-order eqns. Here, with 1 capacitor + 1 inductor, the system is 2nd order.

(b) Select the state variables. For this RLC circuit there is a clear choice, the physical coordinates:

x(t) = [ Vc1(t) ; IL1(t) ]                                 (47)

Other possible choices:

x2(t) = [ VR1(t) ; IR2(t) ],    x3(t) = [ VIs(t) ; VR2(t) ]

* Later, we will see how to convert the state model into a state model with any of these state vectors.
* If we wanted a model based on x3(t), it is probably easiest to derive the model based on the physical coordinates, Eqn (47), and then make a change of basis to transform the model to x3(t).

Illegal choices:

x4(t) = [ VR1(t) ; IR1(t) ],    x5(t) = [ Vc1(t) ; VR2(t) ],    x6(t) = [ Vc1(t) ; Is(t) ]

(The elements of x4 and x5 are not independent, and x6 contains the input.)

Put the derivatives on the left-hand side; these must be the state derivatives:

d Vc1(t)/dt = (1/C1) Ic1(t)                                (48)
d IL1(t)/dt = (1/L1) VL1(t)                                (49)

This step shows why Vc1(t) and IL1(t) are natural choices for the states.

(c) All signals on the right-hand side must be expressed in terms of the states and inputs.

In Eqn (48) we need to express Ic1(t) in terms of {Vc1(t), IL1(t), Is(t)}. This involves using the constituent and continuity equations that describe the system. From these:

Ic1(t) = Is(t) - Vc1(t)/R2                                 (50)

This is the needed form. For VL1(t):

VL1(t) = VR1(t) = R1 IR1(t) = R1 (Is(t) - IL1(t))          (51)

Using Eqns (50) and (51) in Eqns (48) and (49):

d Vc1(t)/dt = (1/C1) (Is(t) - Vc1(t)/R2) = -(1/(C1 R2)) Vc1(t) + 0 IL1(t) + (1/C1) Is(t)
d IL1(t)/dt = (1/L1) R1 (Is(t) - IL1(t)) = 0 Vc1(t) - (R1/L1) IL1(t) + (R1/L1) Is(t)

(d) Put the model in state-variable form, d x(t)/dt = A x(t) + B u(t):

d/dt [ Vc1(t) ; IL1(t) ] = [ -1/(C1 R2), 0 ; 0, -R1/L1 ] [ Vc1(t) ; IL1(t) ] + [ 1/C1 ; R1/L1 ] Is(t)    (52)

with

x(t) = [ Vc1(t) ; IL1(t) ],   u(t) = [ Is(t) ],   A = [ -1/(C1 R2), 0 ; 0, -R1/L1 ],   B = [ 1/C1 ; R1/L1 ]

Units:

d x(t)/dt : [volts/sec ; amps/sec],   x(t) : [volts ; amps],
A : [ sec^-1, - ; -, sec^-1 ],        B : [ volt/(amp-sec) ; sec^-1 ]

so the state equation balances:

[volts/sec] = [sec^-1][volts] + [volt/(amp-sec)][amps]
[amps/sec]  = [sec^-1][amps]  + [sec^-1][amps]

Notice, the state and input are made up of signals, and the system and input matrices are made up of parameters.

4. Write the equation of the output signal (or signals) using the states and inputs.

Suppose the output were VR1(t). In Eqn (51) we have already derived that VR1(t) = R1 (Is(t) - IL1(t)), so

y(t) = [ 0, -R1 ] [ Vc1(t) ; IL1(t) ] + [ R1 ] Is(t)

This is the output equation, y(t) = C x(t) + D u(t), with

y(t) = [ VR1(t) ],   C = [ 0, -R1 ],   D = [ R1 ]

Units in the output equation:

[volts] = [ -, volts/amp ] [ volts ; amps ] + [ volts/amp ] [amps]

Units check in the output equation. Notice, the output is made up of a signal, and the output and feed-forward matrices are made up of parameters.

Alternative output: suppose we wanted to determine the supply voltage Vs(t), for example to calculate the supply power Ps(t) = Vs(t) Is(t). The output will be y(t) = [ Vs(t) ]. We need to find a way to express Vs(t) in terms of the states and inputs, {Vc1(t), IL1(t), Is(t)}. Going back to the original equations,

Vs(t) = VR1(t) + Vc1(t) = R1 (Is(t) - IL1(t)) + Vc1(t)     (53)

This gives Vs(t) in terms of the states and inputs!

For the alternative output, the output equation is

y(t) = C x(t) + D u(t)                                     (54)

with, from Eqn (53),

C = [ 1, -R1 ],   D = [ R1 ]                               (55)

A deep property of state models: just as with Eqns (54) and (55), we can find a row for the C and D matrices to give any signal in the system.

## 7.1.1 Building and exercising the circuit model

To build the model (here with the output y(t) = VR1(t)), build the A, B, C and D matrices:

>> %% Parameters
>> R1 = 100      %% Ohms
>> R2 = 200      %% Ohms
>> L1 = 0.1      %% Henries
>> C1 = 10e-6    %% Farads

>> %% Build the state equation
>> A = [ -1/(C1*R2), 0; 0, -(R1/L1) ]
A =   -500      0
         0  -1000
>> B = [ 1/C1; R1/L1 ]
B =  1.0e+05 *
     1.0000
     0.0100

>> %% Build the output Eqn
>> C = [ 0, -R1 ]
C =    0  -100
>> D = [ R1 ]
D =  100

Build the Matlab state-variable model object:

>> SSmodel = ss(A, B, C, D)
a =        x1     x2
    x1   -500      0
    x2      0  -1000
b =        u1
    x1  1e+05
    x2   1000
c =        x1     x2
    y1      0   -100
d =        u1
    y1    100
Continuous-time model.

>> Poles = pole(SSmodel)
Poles = -1000
         -500
>> Zeros = zero(SSmodel)
Zeros = -500
           0

>> figure(1), clf; step(SSmodel); print('-deps2c', 'CircuitAStepResponse')
>> figure(2), clf; bode(SSmodel); print('-deps2c', 'CircuitAFreqResponse')

Figure 14: Step response and Bode diagram of the circuit model.
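The pole/zero computation of the Matlab session above can be cross-checked in Python/SciPy; a sketch using the same A, B, C, D values:

```python
import numpy as np
from scipy import signal

R1, R2, L1, C1 = 100.0, 200.0, 0.1, 10e-6
A = np.array([[-1/(C1*R2), 0.0], [0.0, -R1/L1]])   # [-500, 0; 0, -1000]
B = np.array([[1/C1], [R1/L1]])                    # [1e5; 1000]
C = np.array([[0.0, -R1]])
D = np.array([[R1]])

num, den = signal.ss2tf(A, B, C, D)    # transfer function coefficients
poles = np.sort(np.roots(den))         # expect -1000, -500
zeros = np.sort(np.roots(num[0]))      # expect -500 and (numerically) ~0
print(poles, zeros)
```

The zero at the origin says the DC gain of this output (VR1) is zero, consistent with the inductor shorting R1 at steady state.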

## 7.2 Example 2, a quarter-vehicle suspension

Figure 15: Quarter vehicle suspension: quarter-vehicle mass m2 at height zv(t), connected through the suspension stiffness ks and damping bs to the wheel mass m1 at height zw(t); the tire stiffness kw rides on the road profile r(t).

The vehicle is moving across a road, with the surface profile given by r(t). It is desired to model the system and determine the response zv(t).

Parameters                        Units
m1   wheel mass                   [kg]
m2   quarter-vehicle mass         [kg]
kw   tire stiffness               [Newtons/meter]
ks   suspension stiffness         [Newtons/meter]
bs   suspension damping           [Newtons/(meter/sec)]

Table 5: Parameters of the mechanical quarter suspension.

Signals
zv(t)   height of vehicle         [m]
zw(t)   height of tire center     [m]

Table 6: Signals of the mechanical quarter suspension.

1. Write the relevant relations for the system

(a) Define symbols for the signals and parameters: see tables 5 and 6.

(b) Write the equations

i. Constituent relations (with respect to an inertial reference):

Newton's 2nd law:  F(t) = m d^2 x(t)/dt^2                  (56)
Hooke's law:       F(t) = k (x1(t) - x2(t))                (57)
Damper eqn:        F(t) = b (d x1(t)/dt - d x2(t)/dt)      (58)

ii. Continuity constraints:

* The sum of the forces acting on any free body is zero.
* The sum of the velocities around any loop is zero.

(c) Record the units; verify that units balance in the equations: see tables 5 and 6.

2. Identify the differential equations. The differential eqns come from Eqns (56) and (58).

(a) Determine the system order. We get two 2nd-order differential equations:

m1 d^2 zw/dt^2 = kw (r(t) - zw(t)) - ks (zw(t) - zv(t)) - bs (d zw/dt - d zv/dt)    (59)   [forces on the wheel]
m2 d^2 zv/dt^2 = +ks (zw(t) - zv(t)) + bs (d zw/dt - d zv/dt)                       (60)   [forces on the 1/4 vehicle]

The system order will almost always be the sum of the orders of the contributing differential equations: here, 4th order.

(b) Select the state variables:

x(t) = [ zw(t) ; d zw/dt ; zv(t) ; d zv/dt ]               (61)

with units [ m ; m/sec ; m ; m/sec ]                       (62)

3. Write the differential equations in state-variable form.

d x(t)/dt = d/dt [ zw ; d zw/dt ; zv ; d zv/dt ]           (63)

From Eqn (59):

d^2 zw/dt^2 = [ -(kw+ks)/m1, -bs/m1, +ks/m1, +bs/m1 ] x(t) + [ kw/m1 ] r(t)    (64)

From Eqn (60):

d^2 zv/dt^2 = [ +ks/m2, +bs/m2, -ks/m2, -bs/m2 ] x(t) + [ 0 ] r(t)             (65)

Eqns (64) and (65) give two of the rows of the A matrix. For the remaining rows: d zw/dt is itself an element of the state vector, so we just hook it up,

d zw/dt = [ 0, 1, 0, 0 ] x(t) + [ 0 ] r(t)

and likewise

d zv/dt = [ 0, 0, 0, 1 ] x(t) + [ 0 ] r(t)

Section 7.2.0
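The row assembly above is easy to check numerically. A short Python/NumPy sketch (not part of the original notes; parameter values are the first set used later in section 7.2.1):

```python
import numpy as np

# Quarter-suspension parameters (first parameter set of section 7.2.1)
kw, ks, bs = 10000.0, 2500.0, 10000.0   # N/m, N/m, N/(m/s)
m1, m2 = 25.0, 250.0                    # kg

# State vector x = [zw, zw_dot, zv, zv_dot]; rows follow Eqns (64)-(65)
A = np.array([[0.0,            1.0,     0.0,    0.0],
              [-(kw + ks)/m1, -bs/m1,   ks/m1,  bs/m1],
              [0.0,            0.0,     0.0,    1.0],
              [ks/m2,          bs/m2,  -ks/m2, -bs/m2]])
B = np.array([0.0, kw/m1, 0.0, 0.0])    # input column for r(t)

print(A[1])   # the z̈w row: [-500, -400, 100, 400]
```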

## Putting the pieces together

ẋ (t) = A x (t) + B u (t)

         [      0             1          0        0    ]           [   0   ]
ẋ (t) =  [ -(kw+ks)/m1     -bs/m1     +ks/m1   +bs/m1  ]  x (t) +  [ kw/m1 ] r (t)      (66)
         [      0             0          0        1    ]           [   0   ]
         [   +ks/m2        +bs/m2     -ks/m2   -bs/m2  ]           [   0   ]

4. Write the equation of the output signal (or signals) using the states and inputs

If the output of interest is the vehicle response:

y1 (t) = zv (t) = [ 0 0 1 0 ] x (t) + [0] r (t)      (67)

Suppose the desired output is the force in the suspension spring. Since

Fs (t) = ks (zw (t) - zv (t))

the output equation is given as:

y2 (t) = Fs (t) = [ ks  0  -ks  0 ] x (t) + [0] r (t)      (68)

Suppose the desired output is the road force on the tire. Since

Fw (t) = kw (r (t) - zw (t))

the output equation is given as:

y3 (t) = Fw (t) = [ -kw  0  0  0 ] x (t) + [kw] r (t)      (69)

Now suppose the desired output is all three:

         [ zv (t) ]   [  0   0   1   0 ]           [  0  ]
y (t) =  [ Fs (t) ] = [  ks  0  -ks  0 ]  x (t) +  [  0  ] r (t)      (70)
         [ Fw (t) ]   [ -kw  0   0   0 ]           [ kw  ]

5. Check units throughout, to verify correctness.

Noting that 1.0 [Newton] = 1.0 [kg m/sec²], entries such as kw/m1 carry units [kg m/sec² · (1/m) · (1/kg)] = [sec⁻²], and bs/m1 carries [kg m/sec² · (1/(m/s)) · (1/kg)] = [sec⁻¹]. Multiplying each row of A into x (t), with units [m], [m/sec], [m], [m/sec], and adding the B-column contribution (kw/m1) r (t) with r (t) in [m], every row of ẋ (t) comes out in [m/sec] or [m/sec²], as required.

Units check!
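The three-output equation (70) can also be exercised with a quick computation; a hedged Python/NumPy sketch (the state and road values below are arbitrary illustrative numbers, not from the notes):

```python
import numpy as np

kw, ks = 10000.0, 2500.0            # N/m (first parameter set)

# Eqn (70): y = [zv, Fs, Fw]^T = C x + D r, with x = [zw, zw_dot, zv, zv_dot]
C = np.array([[0.0, 0.0,  1.0, 0.0],
              [ks,  0.0, -ks,  0.0],
              [-kw, 0.0,  0.0, 0.0]])
D = np.array([0.0, 0.0, kw])

x = np.array([0.10, 0.0, 0.05, 0.0])   # illustrative state [m, m/s, m, m/s]
r = 0.20                               # illustrative road height [m]
y = C @ x + D * r
print(y)   # [zv, Fs, Fw] = [0.05, 125.0, 1000.0]
```

Fs = ks (zw - zv) = 2500 · 0.05 = 125 N and Fw = kw (r - zw) = 10000 · 0.1 = 1000 N, matching the hand calculation from Eqns (68)-(69).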

## 7.2.1 Building and exercising the suspension model

Build the model

>> kw = 10000;   %% N/m
>> ks = 2500;    %% N/m
>> bs = 10000;   %% N/m/s
>> m1 = 25;      %% kg
>> m2 = 250;     %% kg

>> A = [ 0            1       0       0     ;
        -(kw+ks)/m1  -bs/m1   ks/m1   bs/m1 ;
         0            0       0       1     ;
         ks/m2        bs/m2  -ks/m2  -bs/m2 ]
>> B = [ 0; kw/m1; 0 ; 0]
>> C1 = [ 0, 0, 1, 0];  D1 = 0;
>> SSmodel2a = ss(A, B, C1, D1)

a =
         x1    x2    x3    x4
   x1     0     1     0     0
   x2  -500  -400   100   400
   x3     0     0     0     1
   x4    10    40   -10   -40

b =
         u1
   x1     0
   x2   400
   x3     0
   x4     0

c =
        x1  x2  x3  x4
   y1    0   0   1   0

d =
        u1
   y1    0

Continuous-time model.

Examine the poles and zeros and response

>> Poles = pole(SSmodel2a)
Poles =
  -438.92
    -0.41 + 6.00i     %% Lightly damped mode
    -0.41 - 6.00i
    -0.25
>> Zeros = zero(SSmodel2a)
Zeros = -0.2500

The real poles at -438.92 and -0.25 are the fast mode and the slow mode, respectively.

[Figure: step response and Bode diagram of SSmodel2a; figure data omitted.]

## The shock absorber is not doing its job!

Examine the damping of the modes

>> [Wn, rho] = damp(SSmodel2a)
Wn =    0.2516
        6.0186
        6.0186
      438.9211
rho =   1.0000
        0.0687
        0.0687
        1.0000

## Mode with damping of 0.07 is too lightly damped.
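The pole and damping computation above can be reproduced without the Control Toolbox; a Python/NumPy sketch of what the Matlab damp() call computes (Wn = |λ|, ζ = -Re λ / |λ|):

```python
import numpy as np

kw, ks, bs, m1, m2 = 10000.0, 2500.0, 10000.0, 25.0, 250.0
A = np.array([[0.0,            1.0,     0.0,    0.0],
              [-(kw + ks)/m1, -bs/m1,   ks/m1,  bs/m1],
              [0.0,            0.0,     0.0,    1.0],
              [ks/m2,          bs/m2,  -ks/m2, -bs/m2]])

poles = np.linalg.eigvals(A)          # the poles are the eigenvalues of A
Wn = np.abs(poles)                    # natural frequencies
zeta = -poles.real / np.abs(poles)    # damping ratios
print(sorted(Wn))    # ~ [0.2516, 6.0186, 6.0186, 438.9211]
print(min(zeta))     # ~ 0.0687 -- the lightly damped mode
```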

2nd parameter set

>> kw = 10000;   %% N/m
>> ks = 1000;    %% N/m     %% Softer spring
>> bs = 1000;    %% N/m/s   %% Softer shock absorber
>> m1 = 25;      %% kg
>> m2 = 250;     %% kg

>> A = [ 0            1       0       0     ;
        -(kw+ks)/m1  -bs/m1   ks/m1   bs/m1 ;
         0            0       0       1     ;
         ks/m2        bs/m2  -ks/m2  -bs/m2 ]
>> B = [ 0; kw/m1; 0 ; 0]
>> C1 = [ 0, 0, 1, 0]; D1 = 0;
>> SSmodel2b = ss(A, B, C1, D1)

a =
         x1    x2    x3    x4
   x1     0     1     0     0
   x2  -440   -40    40    40
   x3     0     0     0     1
   x4     4     4    -4    -4

b =
        u1
   x1    0
   x2  400
   x3    0
   x4    0

c =
        x1  x2  x3  x4
   y1    0   0   1   0

d =
        u1
   y1    0

Continuous-time model.

>> Poles = pole(SSmodel2b)
Poles = -31.4478                  %% Fast mode
         -5.4731 + 1.3136i        %% Oscillatory mode, now alpha > omega
         -5.4731 - 1.3136i
         -1.6059                  %% Slow mode
>> Zeros = zero(SSmodel2b)
Zeros = -1.0000

The slow mode is faster, the fast mode is much slower, and the oscillatory mode is better damped.

## Compute the damping factors

>> [Wn, Z] = damp(SSmodel2b)
Wn =   1.6059
       5.6286
       5.6286
      31.4478
Z =    1.0000
       0.9724
       0.9724
       1.0000

## Examine the step and frequency response

[Figure 17: Suspension step and frequency response; figure data omitted.]

The model now has 3 outputs

>> TFmodel2b = tf(SSmodel2b)

Transfer function:
             1600 s + 1600
--------------------------------------
s^4 + 44 s^3 + 444 s^2 + 1600 s + 1600

[Figure 18: Block diagram of quarter suspension system reflecting 3 outputs: input r(t) enters the state model {A, B, C, D} with state x(t) (n states, m inputs, p outputs, p x m TFs); the outputs zv(t), Fs(t) and Fw(t) are collected in y(t).]

Building the state-variable model and computing the transfer function may be the easiest way to get the TF of a complex system.

Let's consider multiple outputs, from Eqn (70). The C and D matrices change.

Output Eqn:  y (t) = C x (t) + D u (t)

>> C2c = [  0   0   1   0 ;
           ks   0  -ks  0 ;
          -kw   0   0   0 ];
>> D2c = [ 0; 0; kw]
>> SSmodel2c = ss(A, B, C2c, D2c)

a =
         x1    x2    x3    x4
   x1     0     1     0     0
   x2  -440   -40    40    40
   x3     0     0     0     1
   x4     4     4    -4    -4

b =
        u1
   x1    0
   x2  400
   x3    0
   x4    0

c =
           x1     x2     x3   x4
   y1       0      0      1    0
   y2    1000      0  -1000    0
   y3  -1e+04      0      0    0

d =
           u1
   y1       0
   y2       0
   y3   1e+04

Continuous-time model.

## Examine the transfer function and step response

>> TFmodel2c = tf(SSmodel2c)
Transfer function from input to output...
                 1600 s + 1600
#1: --------------------------------------
    s^4 + 44 s^3 + 444 s^2 + 1600 s + 1600

       4e05 s^2 - 3.207e-10 s + 2.754e-09
#2: --------------------------------------
    s^4 + 44 s^3 + 444 s^2 + 1600 s + 1600

    10000 s^4 + 4.4e05 s^3 + 4.4e05 s^2 + 6.487e-10 s + 1.913e-10
#3: -------------------------------------------------------------
              s^4 + 44 s^3 + 444 s^2 + 1600 s + 1600

## Each input/output pair (in this case 3 outputs x 1 input) gives a TF.

- The zeros and gain are different between the TFs
- The poles are the same

Note: the coefficients of 10⁻⁹ .. 10⁻¹⁰ in TFs 2 and 3 are due to round-off error. These TFs have a double zero at the origin.
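The tf(SSmodel2c) computation has a SciPy counterpart, scipy.signal.ss2tf; a sketch using the second parameter set (the vanishing leading coefficients correspond to the round-off terms noted above):

```python
import numpy as np
from scipy import signal

kw, ks, bs, m1, m2 = 10000.0, 1000.0, 1000.0, 25.0, 250.0
A = np.array([[0.0,            1.0,     0.0,    0.0],
              [-(kw + ks)/m1, -bs/m1,   ks/m1,  bs/m1],
              [0.0,            0.0,     0.0,    1.0],
              [ks/m2,          bs/m2,  -ks/m2, -bs/m2]])
B = np.array([[0.0], [kw/m1], [0.0], [0.0]])
C = np.array([[0.0, 0.0,  1.0, 0.0],
              [ks,  0.0, -ks,  0.0],
              [-kw, 0.0,  0.0, 0.0]])
D = np.array([[0.0], [0.0], [kw]])

num, den = signal.ss2tf(A, B, C, D)   # one numerator row per output
print(den)     # ~ [1, 44, 444, 1600, 1600] -- shared by all three TFs
print(num[0])  # ~ [0, 0, 0, 1600, 1600], i.e. (1600 s + 1600)/den
```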

>> figure(1), clf;
>> step(SSmodel2c);
>> print('-deps2c', 'Mechanism2StepResponsec')

[Figure: step response of SSmodel2c, one panel per output (Out(1): zv, Out(2): Fs, Out(3): Fw); figure data omitted.]

## 7.2.2 Conclusions, Quarter Suspension Example

It is straightforward to:

- test various parameter configurations
- obtain a transfer function

A state-variable model naturally represents systems with multiple outputs.

## 7.3 Constructing a state-variable model from a differential equation

Given a differential equation, such as

a3 d³y (t)/dt³ + a2 ÿ (t) + a1 ẏ (t) + a0 y (t) = b2 ü (t) + b1 u̇ (t) + b0 u (t)      (71)

with transfer function

T (s) = (b2 s² + b1 s + b0) / (a3 s³ + a2 s² + a1 s + a0)      (72)

## How do we construct a state-variable model?

A. In Matlab

>> num = [b2, b1, b0]; den = [a3, a2, a1, a0]
>> TFmodel = tf(num, den);
>> SSmodel = ss(TFmodel)

## There are 4 canonical forms. The canonical forms have a strong

relationship to properties called controllability and observability, and we will see all four (cf. Bay chapter 8). Here we consider Controllable Canonical Form (cf. Bay section 1.1.3).

First, the Diff Eq must be monic. That means that the an coefficient is 1.0.

## Divide through Eqn (71) (or 72) by a3

## With ā2 = a2/a3 , ā1 = a1/a3 , ... , b̄2 = b2/a3 , ... , b̄0 = b0/a3

## 7.3.1 Modeling the differential equation in Controllable Canonical Form

1. Write the relevant relations for the system

(a) Define symbols for the signals and parameters.

Signals:     u (t)  Input   [u]        y (t)  Output  [y]
Parameters:  ā2  Coefficient [1/sec]      b̄2  Coefficient [y/(u sec)]
             ā1  Coefficient [1/sec²]     b̄1  Coefficient [y/(u sec²)]
             ā0  Coefficient [1/sec³]     b̄0  Coefficient [y/(u sec³)]

Table 7: Signals and Parameters for the Differential Equation. Units result from dividing through by a3, which has units of sec³.

(b) Write the equations
To model a differential equation in Controllable Canonical form, break the TF into two components, with the denominator first:

(a) Original TF:              u(t) --> [ B(s)/A(s) ] --> y(t)
(b) Denominator first:        u(t) --> [ 1/A(s) ] --> z(t) --> [ B(s) ] --> y(t)

giving the pair of equations

1.0 d³z (t)/dt³ = -ā2 z̈ (t) - ā1 ż (t) - ā0 z (t) + 1.0 u (t)      (74)

y (t) = b̄2 z̈ (t) + b̄1 ż (t) + b̄0 z (t)      (75)

(c) Record the units, verify that units balance in the equations
See table 7.

2. Identify the differential equations

(a) Determine the system order:   3rd order

(b) Select the state variables
Considering Eqn (74), select the state variables to be the output variable of the 1/A(s) block and its derivatives up to the n - 1 derivative:

x (t) = [ x1 (t) ; x2 (t) ; x3 (t) ] = [ z (t) ; ż (t) ; z̈ (t) ]      (76)

The state vector is said to be in phase-variable form.

3. Write the differential equations in state-variable form
The derivative of the last phase variable is given by Eqn (74), re-written as Eqn (77) (cf. Bay Eqn (1.25)):

d³z (t)/dt³ = -ā2 z̈ (t) - ā1 ż (t) - ā0 z (t) + u (t)      (77)

          [  0    1    0  ]            [ 0 ]
ẋ (t)  =  [  0    0    1  ]  x (t)  +  [ 0 ]  u (t)      (78)
          [ -ā0  -ā1  -ā2 ]            [ 1 ]

4. Write the equation of the output signal (or signals) using the states and inputs

y (t) = b̄2 z̈ (t) + b̄1 ż (t) + b̄0 z (t)      (75, repeated)

y (t) = [ b̄0  b̄1  b̄2 ] x (t)      (79)

(Note: Bay's development includes a b3 term. Eqn (1.25) shows how the b3 term is incorporated.)

5. Check units throughout, to verify correctness.

From Eqn (74), d³z (t)/dt³ has units of [u], so the phase-variable state vector has units

x (t) = [ z (t) ; ż (t) ; z̈ (t) ] :   [u sec³], [u sec²], [u sec]

The A matrix carries units [ [] 1 [] ; [] [] 1 ; sec⁻³ sec⁻² sec⁻¹ ], B is dimensionless, C = [b̄0 b̄1 b̄2] carries [y/(u sec³), y/(u sec²), y/(u sec)] from table 7, and D carries [y/u]. Multiplying through, ẋ (t) comes out in [u sec²], [u sec], [u], and y (t) in [y], as required. Units check!

Example in Matlab:

>> a2 = 5; a1 = 7; a0 = 8;
>> b2 = -2; b1 = 3; b0 = 4;
>> A = [  0    1    0 ;
          0    0    1 ;
        -a0, -a1, -a2];
>> B = [ 0; 0; 1];
>> C = [ b0, b1, b2];
>> D = 0;
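The recipe of Eqns (76)-(79) is mechanical, so it is easy to automate; a Python/NumPy sketch of a controllable-canonical-form builder (a sketch assuming a 3rd-order denominator and no b3 term, matching the example above; `ccf` is a hypothetical helper name):

```python
import numpy as np

def ccf(num, den):
    """Controllable canonical form for T(s) = num(s)/den(s).
    num = [b2, b1, b0], den = [a3, a2, a1, a0]; returns (A, B, C, D)."""
    a = np.array(den, float) / den[0]      # make the denominator monic
    b = np.array(num, float) / den[0]
    n = len(a) - 1
    A = np.zeros((n, n))
    A[:-1, 1:] = np.eye(n - 1)             # phase-variable integrator chain
    A[-1, :] = -a[::-1][:n]                # last row: [-a0, -a1, ..., -a_{n-1}]
    B = np.zeros((n, 1)); B[-1, 0] = 1.0
    C = b[::-1].reshape(1, n)              # [b0, b1, ..., b_{n-1}]
    D = np.zeros((1, 1))
    return A, B, C, D

# The example above: a2 = 5, a1 = 7, a0 = 8; b2 = -2, b1 = 3, b0 = 4
A, B, C, D = ccf([-2, 3, 4], [1, 5, 7, 8])
print(A)   # last row [-8, -7, -5]
print(C)   # [[4, 3, -2]]
```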

>> SSmodel3a = ss(A, B, C, D)

a =
        x1  x2  x3
   x1    0   1   0
   x2    0   0   1
   x3   -8  -7  -5

b =
        u1
   x1    0
   x2    0
   x3    1

c =
        x1  x2  x3
   y1    4   3  -2

d =
        u1
   y1    0

Continuous-time model.

Look at the step and frequency response

>> figure(2), clf; step(SSmodel3a);
>> figure(3), clf; bode(SSmodel3a);

[Figure: step response and Bode diagram of SSmodel3a; figure data omitted.]

>> Poles = pole(SSmodel3a)
Poles = -0.6547 + 1.3187i
        -0.6547 - 1.3187i
        -3.6906
>> Zeros = zero(SSmodel3a)
Zeros =  2.3508
        -0.8508

Looking at the pole-zero map (constellation)

>> figure(1), pzmap(SSmodel3a);

[Figure: Pole-Zero Map of SSmodel3a; figure data omitted.]

## 7.4 Simulation diagrams and analog computers

Analog computers include integrators, gain blocks and summing junctions.

[Figure 23: Simulation diagram for system in controllable canonical form: u(t) enters a summing junction feeding a chain of integrators ẋ3 -> x3 -> x2 -> x1; feedback gains -a2, -a1, -a0 (a3 = 1) close the loop, and feed-forward gains b3, b2, b1, b0 form the output y(t).]

Integrators, gain blocks and summing junctions are built with Op-Amps.

Integrators:

[Figure 24: An Op-Amp integrator: Vin(t) through R1 into the inverting node V-(t), with feedback capacitor C1 to the output Vo(t); V+(t) is wired to ground.]

Op Amp gain is very high, and the circuit is configured with negative feedback, so

V- (t) ≈ V+ (t)

And in the circuit of figure 24, V+ (t) is wired to ground. So node V- becomes a virtual ground:

V- (t) ≈ 0

Op Amp input currents are very small (10⁻⁹ .. 10⁻¹² amps), so

iR1 (t) + iC1 (t) = 0      (80)

Eqn (80) gives

(1/R1) (Vin (t) - V- (t)) + C1 (d/dt) (Vo (t) - V- (t)) = 0      (81)

With the properties of the virtual ground, Eqn (81) becomes

(1/R1) Vin (t) + C1 (d/dt) Vo (t) = 0 ,   so   (d/dt) Vo (t) = -(1/(R1 C1)) Vin (t)

Giving

Vo (t) = - ∫ from t0 to t  (1/(R1 C1)) Vin (τ) dτ      (82)

Gain Blocks:

[Figure 25: An Op-Amp gain block: Vin(t) through Ra into the inverting node, with feedback resistor Rf to the output.]

Vo (t) = -(Rf / Ra) Vin (t)      (83)

Summing junctions:

[Figure 26: An Op-Amp summing junction: Va(t) through Ra and Vb(t) through Rb into the inverting node, with feedback resistor Rf.]

The Op-Amp virtual ground configuration sums the currents at the V- node, giving

Vo (t) = - ( (Rf / Ra) Va (t) + (Rf / Rb) Vb (t) + ... )      (84)
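The integrator relation of Eqn (82) can be checked with a simple forward-Euler simulation; a Python sketch (component values are illustrative, not from the notes):

```python
# Forward-Euler simulation of the inverting Op-Amp integrator, Eqn (82):
#   dVo/dt = -Vin/(R1*C1)
R1, C1 = 1.0e3, 1.0e-3      # illustrative: 1 kOhm, 1000 uF -> R1*C1 = 1 s
dt = 1.0e-3                 # time step [s]
Vin = 1.0                   # constant input voltage [V]
Vo = 0.0
for _ in range(1000):       # integrate for 1 s
    Vo += -Vin / (R1 * C1) * dt
print(Vo)   # after 1 s with Vin = 1 V: Vo = -1.0 V (the integrator inverts)
```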

In each of Eqns (82), (83) and (84) the output voltage is inverted relative to the input voltage. This is an inherent property of the virtual ground configuration. In building an analog computer, either:

1. Introduce - signs as needed, and invert signals as needed with gain blocks, g = -1, or
2. Include a second Op-Amp in each element (Integrator, Gain block, Summing junction), so that the block is non-inverting.

Returning to the simulation diagram, we can write down the state model directly from the simulation diagram (and vice-versa).

[Simulation diagram of figure 23 repeated: integrator chain ẋ3 -> x3 -> x2 -> x1, feedback gains -a2, -a1, -a0 (a3 = 1), feed-forward gains b3 .. b0. Controllable Canonical Form; notation follows Bay.]

The output of each integrator is a state.

1. Write the relevant relations for the system

(a) Define symbols for the signals and parameters

Signals:     u (t) Input;  y (t) Output;  x1 (t), x2 (t), x3 (t) States
Parameters:  b0 .. b3 Numerator Coefficients;  a0 .. a2 Denominator Coefficients

[Figure 29: Simulation diagram for system in controllable canonical form (notation follows Bay).]

2. Identify the differential equations

(a) Determine the system order:   3rd order

(b) Select the state variables
The physical coordinates of the system are the integrator outputs. These are voltages we can observe on an oscilloscope.

x (t) = [ x1 (t) ; x2 (t) ; x3 (t) ]      (90)

3. Write the differential equations in state-variable form

ẋ1 (t) = x2 (t)      (85)

ẋ2 (t) = x3 (t)      (86)

ẋ3 (t) = -a2 x3 (t) - a1 x2 (t) - a0 x1 (t) + u (t)      (87)

          [  0    1    0  ]            [ 0 ]
ẋ (t)  =  [  0    0    1  ]  x (t)  +  [ 0 ]  u (t)      (89)
          [ -a0  -a1  -a2 ]            [ 1 ]

4. Write the equation of the output signal (or signals) using the states and inputs (cf. Bay Eqn (1.25))

y (t) = b3 ( -a2 x3 (t) - a1 x2 (t) - a0 x1 (t) + u (t) ) + b2 x3 (t) + b1 x2 (t) + b0 x1 (t)      (88)

y (t) = [ b0 - b3 a0 ,  b1 - b3 a1 ,  b2 - b3 a2 ] x (t) + [b3] u (t)      (91)

See example 3.
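Folding the b3 feed-through into the output row, as in Eqns (88)-(91), is a small computation worth sanity-checking; a Python/NumPy sketch (coefficient values are illustrative) that also confirms the resulting model realizes the numerator b3 s³ + b2 s² + b1 s + b0:

```python
import numpy as np
from scipy import signal

# Simulation-diagram coefficients (illustrative values)
a0, a1, a2 = 8.0, 7.0, 5.0
b0, b1, b2, b3 = 4.0, 3.0, -2.0, 0.5

A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [-a0, -a1, -a2]])
B = np.array([[0.0], [0.0], [1.0]])

# Eqn (91): substitute x3_dot from Eqn (87) into Eqn (88)
C = np.array([[b0 - b3*a0, b1 - b3*a1, b2 - b3*a2]])
D = np.array([[b3]])

num, den = signal.ss2tf(A, B, C, D)
print(num[0])   # ~ [0.5, -2, 3, 4], i.e. b3 s^3 + b2 s^2 + b1 s + b0
print(den)      # ~ [1, 5, 7, 8]
```

With b3 = 0 the output row reduces to [b0, b1, b2] and D = [0], recovering Eqns (78)-(79).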

## 8 Some basic operations with state-variable models

## 8.1 Deriving the transfer function from the state-variable model

It is straightforward to derive the transfer function from a state-variable model. Starting with the state equation

ẋ (t) = A x (t) + B u (t)      (92)

y (t) = C X (t) + D u (t) ,   so in the Laplace domain   Y (s) = C X (s) + D U (s)      (93)

Transforming (92) with zero initial conditions, s X (s) = A X (s) + B U (s), so

(s I - A) X (s) = B U (s)   or   X (s) = (s I - A)⁻¹ B U (s)      (94)

With (94), Eqn (93) leads to:

Y (s) = C (s I - A)⁻¹ B U (s) + D U (s)      (95)

If m = 1 (one input) and p = 1 (one output), then Eqn (95) gives the transfer function:

Y (s) / U (s) = C (s I - A)⁻¹ B + D      (96)

8.1.1 Interpreting the transfer function

Eqn (96) can be solved symbolically by Cramer's rule, to give the symbolic transfer function. Recall from basic linear algebra that Cramer's rule gives the matrix inverse as:

U⁻¹ = (1 / det U) adj U      (97)

where U is an n x n matrix, and adj U is the adjugate of matrix U. Defining the cofactors

Ci,j = (-1)^(i+j) Mi,j      (98)

the adjugate is the transpose of the matrix of cofactors,

adj U = [ Ci,j ]ᵀ      (99)

Mi,j is the i, jth minor of matrix U and is the determinant of the matrix formed by removing the ith row and jth column from U.

Examples:

U2 = [ a b ; c d ] ,   det U2 = a d - b c

U3 = [ a b c ; d e f ; g h i ] ,   det U3 = a det [ e f ; h i ] - d det [ b c ; h i ] + ...

It follows that

(s I - A)⁻¹ = adj (s I - A) / det (s I - A)      (100)

Y (s) / U (s) = (1 / det (s I - A)) C adj (s I - A) B + D      (101)

Example

A = [ -2   3 ]      B = [ 2 ]      C = [ 5  6 ]      D = [0]
    [  0  -5 ]          [ 3 ]

Then

det (s I - A) = det [ (s + 2)   -3      ]  = (s + 2)(s + 5) - 0 = s² + 7 s + 10
                    [    0     (s + 5)  ]

So the TF is given by:

Y (s) / U (s) = (1 / (s² + 7 s + 10)) [ 5  6 ] [ (s + 5)     3     ] [ 2 ]  +  [0]
                                               [    0     (s + 2)  ] [ 3 ]

             = (1 / (s² + 7 s + 10)) [ 5  6 ] [ 2 s + 19 ]      (102)
                                              [ 3 s + 6  ]

Multiplying out

Y (s) / U (s) = (28 s + 131) / (s² + 7 s + 10)      (103)

Today, we wouldn't want to apply Cramer's rule by hand for any case larger than 3x3. Under the hood, this is how Matlab finds the TF from a state-variable model:

>> A = [ -2 3 ; 0 -5 ],  B = [ 2 ; 3 ],  C = [ 5 6 ],  D = 0
>> SSmodel = ss(A, B, C, D)
>> tf(SSmodel)

Transfer function:
   28 s + 131
--------------
s^2 + 7 s + 10

Note:

- The denominator is given by det (s I - A), showing that the poles are the eigenvalues of the A matrix.
- The B, C, and D matrices play no role in determining the poles, and thus the stability.
- For multiple inputs (columns in B) or multiple outputs (rows in C) the term C adj (s I - A) B would give an array of numerator polynomials, one for each input/output pair.
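The hand computation of Eqns (100)-(103) can also be checked in SciPy; a sketch:

```python
import numpy as np
from scipy import signal

A = np.array([[-2.0, 3.0], [0.0, -5.0]])
B = np.array([[2.0], [3.0]])
C = np.array([[5.0, 6.0]])
D = np.array([[0.0]])

num, den = signal.ss2tf(A, B, C, D)
print(num[0])   # ~ [0, 28, 131] -- the numerator 28 s + 131
print(den)      # ~ [1, 7, 10]   -- det(sI - A) = (s+2)(s+5)

# The poles are the eigenvalues of A
print(np.linalg.eigvals(A))   # [-2, -5] in some order
```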

8.1.2 DC gain

The DC gain is found by letting s -> 0:

KDC = lim (s -> 0) Y (s)/U (s) = lim (s -> 0) C (s I - A)⁻¹ B + D = C (-A)⁻¹ B + D      (104)

    = -C A⁻¹ B + D      (105)

8.1.3 Interpreting D

If D ≠ 0, then the number of zeros equals the number of poles.

If D = 0, we call the system strictly proper.

If D ≠ 0, we call the system proper, but not strictly proper.

## 8.2 Coordinate transformation

Given a state-variable model with state vector x (t)

ẋ (t) = A x (t) + B u (t)      (106)

y (t) = C x (t) + D u (t)      (107)

and an invertible transformation matrix T giving a new state vector z (t)

z (t) = T x (t)      (108)

we can derive a new state model based on state vector z (t). We can say that we have transformed the system from the coordinate system of x (t) to the coordinate system of z (t).

Derivation of the transformation is straightforward. From Eqn (108) we can solve for x (t):

x (t) = T⁻¹ z (t)      (109)

ẋ (t) = T⁻¹ ż (t)      (110)

Plugging (108) and (109) into Eqns (106) and (107) gives

T⁻¹ ż (t) = A T⁻¹ z (t) + B u (t)      (111)

From (111) we can write

ż (t) = T A T⁻¹ z (t) + T B u (t)      (112)

y (t) = C T⁻¹ z (t) + D u (t)

ż (t) = Â z (t) + B̂ u (t)      (113)

with

Â = T A T⁻¹      (114), (115)
B̂ = T B          (116)
Ĉ = C T⁻¹        (117)
D : unchanged

Equation (115) is a similarity transform. A similarity transform preserves eigenvalues,

eig (A) = eig (Â)      (118)

The poles of system II are the same as the poles of the original system.

Coordinate transformation is very powerful:

- We can convert a given model with x (t) to an equivalent model with z (t) by choice of any invertible matrix T
- The input and output are unchanged
- Note that the D matrix, which directly couples u (t) to y (t), is unchanged
- Only the internal representation of the system is changed

[Figure 30: Block diagrams of the original linear state-variable system {A, B, C, D} with state x(t), and the transformed system {Â, B̂, Ĉ, D} with state z(t); in both, u(t) is the input and y(t) the output.]
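Eqn (118) is easy to verify numerically; a Python/NumPy sketch using the suspension A matrix (2nd parameter set) and an arbitrary invertible T:

```python
import numpy as np

A = np.array([[0.0,     1.0,   0.0,  0.0],
              [-440.0, -40.0, 40.0, 40.0],
              [0.0,     0.0,   0.0,  1.0],
              [4.0,     4.0,  -4.0, -4.0]])
T = np.array([[1.0, 0.0, -1.0,  0.0],
              [0.0, 1.0,  0.0, -1.0],
              [0.0, 0.0,  1.0,  0.0],
              [0.0, 0.0,  0.0,  1.0]])

Ahat = T @ A @ np.linalg.inv(T)                    # Eqn (115)
eigA = np.sort_complex(np.linalg.eigvals(A))
eigAhat = np.sort_complex(np.linalg.eigvals(Ahat))
print(eigA)
print(eigAhat)   # same set: the similarity transform preserves eigenvalues
```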

## 8.2.1 Example coordinate transformation

Consider the example of section 7.2. Suppose we were interested in the suspension deflection

zs (t) = zw (t) - zv (t)

[Figure: quarter suspension schematic repeated; mass m2 (1/4 vehicle, position zv(t)) over spring ks and damper bs, mass m1 (wheel, position zw(t)) over tire stiffness kw, road profile r(t), inertial reference.]

The model in physical coordinates (repeated):

         [      0             1          0        0    ]           [   0   ]
ẋ (t) =  [ -(kw+ks)/m1     -bs/m1     +ks/m1   +bs/m1  ]  x (t) +  [ kw/m1 ] r (t)      (119)
         [      0             0          0        1    ]           [   0   ]
         [   +ks/m2        +bs/m2     -ks/m2   -bs/m2  ]           [   0   ]

y1 (t) = zv (t) = [ 0 0 1 0 ] x (t) + [0] r (t)

If our interest was such that we wanted a state model with zs (t) and żs (t) directly as states, we could introduce the transformation

         [ zs (t) ]   [ 1  0  -1   0 ] [ zw (t) ]
z (t) =  [ żs (t) ] = [ 0  1   0  -1 ] [ żw (t) ]  = T x (t)      (120)
         [ zv (t) ]   [ 0  0   1   0 ] [ zv (t) ]
         [ żv (t) ]   [ 0  0   0   1 ] [ żv (t) ]

In Matlab, the original model (SSmodel2b) is:

a =
         x1    x2    x3    x4
   x1     0     1     0     0
   x2  -440   -40    40    40
   x3     0     0     0     1
   x4     4     4    -4    -4

b =
        u1
   x1    0
   x2  400
   x3    0
   x4    0

c =
        x1  x2  x3  x4
   y1    0   0   1   0

d =
        u1
   y1    0
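The transformed matrices can be computed by hand or with NumPy; a sketch verifying Â = T A T⁻¹, B̂ = T B and Ĉ = C T⁻¹ for this T:

```python
import numpy as np

A = np.array([[0.0,     1.0,   0.0,  0.0],
              [-440.0, -40.0, 40.0, 40.0],
              [0.0,     0.0,   0.0,  1.0],
              [4.0,     4.0,  -4.0, -4.0]])
B = np.array([[0.0], [400.0], [0.0], [0.0]])
C1 = np.array([[0.0, 0.0, 1.0, 0.0]])
T = np.array([[1.0, 0.0, -1.0,  0.0],
              [0.0, 1.0,  0.0, -1.0],
              [0.0, 0.0,  1.0,  0.0],
              [0.0, 0.0,  0.0,  1.0]])
Tinv = np.linalg.inv(T)

Ahat = T @ A @ Tinv
Bhat = T @ B
Chat = C1 @ Tinv
print(Ahat)   # second row: [-444, -44, -400, 0]
```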

Introduce the transformation

>> T = [ 1 0 -1 0 ; 0 1 0 -1 ; 0 0 1 0 ; 0 0 0 1]
T =
     1     0    -1     0
     0     1     0    -1
     0     0     1     0
     0     0     0     1

>> Ahat = T * A * inv(T);   Bhat = T * B
>> Chat = C1 * inv(T);      Dhat = D1
>> SSmodelHat = ss(Ahat, Bhat, Chat, Dhat)

a =
         x1    x2    x3    x4
   x1     0     1     0     0
   x2  -444   -44  -400     0
   x3     0     0     0     1
   x4     4     4     0     0

b =
        u1
   x1    0
   x2  400
   x3    0
   x4    0

c =
        x1  x2  x3  x4
   y1    0   0   1   0

d =
        u1
   y1    0

Continuous-time model.

Of course, the step and frequency response are unchanged. The model transformation changes only the internal representation of the system.

[Figure: step and Bode responses of the transformed model, identical to those of SSmodel2b; figure data omitted.]

## 9 State-variable feedback control

[Figure 33: State-variable model of the open-loop system {Ap, Bp, Cp, Dp}. This is the plant before feedback control is applied, with u (t) as input, x (t) as state, and y (t) as output.]

Put in State Feedback control; the control signal u (t) is given as:

u (t) = -K x (t) + Nf r (t)      (121)

The control signal depends on the state vector and a reference input.

[Figure 34: State-variable model of the closed-loop system with feed-forward gain on the input: r (t) -> Nf -> u (t) -> plant {Ap, Bp, Cp, Dp} -> y (t), with state feedback -K from x (t) back to u (t). The closed-loop system has r (t) as input and y (t) as output.]

[Figure 35: State-space model of the closed-loop system {Acl, Bcl, Ccl, Dcl}, with r (t) as input and y (t) as output.]

Feedback control fundamentally transforms the system, changing the state-variable model from figure 33 to figure 35.

## To determine the state-variable model of the system with feedback, start

with the open-loop model (figure 33)

ẋ (t) = Ap x (t) + Bp u (t)      (16, repeated)

y (t) = Cp x (t) + Dp u (t)      (17, repeated)

## Plugging the control law

u (t) = -K x (t) + Nf r (t)      (121, repeated)

## into the state equation, we find

ẋ (t) = Ap x (t) + Bp u (t) = Ap x (t) - Bp K x (t) + Bp Nf r (t)
      = (Ap - Bp K) x (t) + Bp Nf r (t)

So we can write

ẋ (t) = Acl x (t) + Bcl r (t)      (122)

with

Acl = Ap - Bp K      (123)
Bcl = Bp Nf          (124)

Similarly for the output equation,

y (t) = Cp x (t) + Dp u (t) = Cp x (t) - Dp K x (t) + Dp Nf r (t) = (Cp - Dp K) x (t) + Dp Nf r (t)

So we can write

y (t) = Ccl x (t) + Dcl r (t)      (125)

with

Ccl = Cp - Dp K      (126)
Dcl = Dp Nf          (127)

## Eqns (122)-(127) describe how we determine the state-variable model of the

system with feedback control.

## State feedback control is fundamentally different from single-loop,

compensated feedback:

## There is no compensator transfer function Gc (s); the controller is a constant

gain acting on the state vector and reference inputs.
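Eqns (122)-(127) translate directly to code; a Python/NumPy sketch (the plant below is an illustrative double integrator, not the pendulum of the next section; `close_loop` is a hypothetical helper name):

```python
import numpy as np

def close_loop(Ap, Bp, Cp, Dp, K, Nf):
    """Closed-loop model for u = -K x + Nf r, Eqns (122)-(127)."""
    Acl = Ap - Bp @ K          # (123)
    Bcl = Bp * Nf              # (124)
    Ccl = Cp - Dp @ K          # (126)
    Dcl = Dp * Nf              # (127)
    return Acl, Bcl, Ccl, Dcl

# Illustrative plant: a double integrator
Ap = np.array([[0.0, 1.0], [0.0, 0.0]])
Bp = np.array([[0.0], [1.0]])
Cp = np.array([[1.0, 0.0]])
Dp = np.array([[0.0]])
K = np.array([[2.0, 3.0]])

Acl, Bcl, Ccl, Dcl = close_loop(Ap, Bp, Cp, Dp, K, 2.0)
print(Acl)                       # [[0, 1], [-2, -3]] -- closed-loop poles -1, -2
print(np.linalg.eigvals(Acl))
```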

## 9.2 State-variable feedback example: Inverted pendulum

An inverted pendulum is a mechanism comprising a cart and pendulum that holds the pendulum upright by feedback control, as illustrated in figure 36. The system is open-loop unstable.

[Figure 36: Mechanical schematic of an inverted pendulum: cart at position z(t), driven by applied force F(t), with friction force Ff = b ż(t); pendulum at angle θ(t) mounted on the cart.]

From Lagrange's equations, the equations of motion for the inverted pendulum:

(M + m) z̈ (t) + b ż (t) + m l θ̈ (t) cos (θ (t)) - m l θ̇² (t) sin (θ (t)) = F (t)      (128)

(I + m l²) θ̈ (t) + m g l sin (θ (t)) + m l z̈ (t) cos (θ (t)) = 0      (129)

Signals and parameters:

M         Mass of cart          [kg]           z (t)   Cart position    [m]
m         Mass of pendulum      [kg]           θ (t)   Pendulum Angle   [deg]
l         Length of Pendulum    [m]            F (t)   Applied force    [N]
b         Friction coef.        [N/(m/s)]
I         Inertia of pendulum   [kg m²]
g = 9.8   Accel. of gravity     [m/s²]

Linearizing the equations about the operating point θ0 = 0, θ̇0 = 0, and for simplification defining

p = I (M + m) + M m l²

with state vector x (t) = [ z (t) ; ż (t) ; θ (t) ; θ̇ (t) ], the linearized model is

      [ 0        1                  0                0 ]          [       0       ]
Ap =  [ 0  -(I + m l²) b / p   (m² g l²) / p         0 ]    Bp =  [ (I + m l²)/p  ]
      [ 0        0                  0                1 ]          [       0       ]
      [ 0   -(m l b) / p       m g l (M + m) / p     0 ]          [    m l / p    ]

Cp =  [ 1 0 0 0 ]      Dp =  [ 0 ]
      [ 0 0 1 0 ]            [ 0 ]

Example Data:

M = 0.5;    m = 0.2;    b = 0.1;
i = 0.006;  g = 9.8;    l = 0.3;

>> p = i*(M+m)+M*m*l^2
p = 0.0132

>> Ap = [ 0   1                 0                0 ;
          0  -(i+m*l^2)*b/p    (m^2*g*l^2)/p     0 ;
          0   0                 0                1 ;
          0  -(m*l*b)/p         m*g*l*(M+m)/p    0 ]
Ap =
         0    1.0000         0         0
         0   -0.1818    2.6727         0
         0         0         0    1.0000
         0   -0.4545   31.1818         0

>> Bp = [0; (i+m*l^2)/p; 0; m*l/p]
Bp =
         0
    1.8182
         0
    4.5455

>> Cp = [ 1 0 0 0 ;
          0 0 1 0 ]
>> Dp = [ 0 ; 0 ]

## 9.2.1 Designing a pole placement controller

>> Sdesired = [ -3 -4 -5 -6]
>> K = place(Ap, Bp, Sdesired)
K =
   -8.0816   -7.7776   36.2727    7.0310

>> Acl = Ap - Bp * K
Acl =
         0    1.0000         0         0
   14.6939   13.9592  -63.2776  -12.7837
         0         0         0    1.0000
   36.7347   34.8980 -133.6939  -31.9592

>> Nf = 1/(Cp*inv(-Acl)*Bp)
Nf =
   -8.0816
         0

>> SScl = ss(Acl, Nf(1)*Bp, Cp, Dp)

a =
          x1       x2       x3       x4
   x1      0        1        0        0
   x2  14.69    13.96   -63.28   -12.78
   x3      0        0        0        1
   x4  36.73     34.9   -133.7   -31.96

b =
           u1
   x1       0
   x2  -14.69
   x3       0
   x4  -36.73

Continuous-time model.
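SciPy's signal.place_poles plays the role of the pole-placement command here; a sketch reproducing the design with the numeric Ap, Bp from the example data (matrices entered to the precision printed above):

```python
import numpy as np
from scipy import signal

Ap = np.array([[0.0, 1.0,      0.0,     0.0],
               [0.0, -0.1818,  2.6727,  0.0],
               [0.0, 0.0,      0.0,     1.0],
               [0.0, -0.4545, 31.1818,  0.0]])
Bp = np.array([[0.0], [1.8182], [0.0], [4.5455]])

desired = [-3.0, -4.0, -5.0, -6.0]
K = signal.place_poles(Ap, Bp, desired).gain_matrix

Acl = Ap - Bp @ K            # closed loop with u = -K x
print(K)                     # ~ [-8.08, -7.78, 36.27, 7.03]
print(np.sort(np.linalg.eigvals(Acl).real))   # ~ [-6, -5, -4, -3]
```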

>> step(SScl)

[Figure 37: Step response of Inverted Pendulum with control; Out(1): cart position z(t), Out(2): pendulum angle θ(t); figure data omitted.]

## 9.2.2 Designing an LQR controller

With linear time-invariant systems there is a very nice optimal control result. Consider the cost function

J = ∫ from t=0 to ∞  ( x (t)ᵀ Q x (t) + u (t)ᵀ R u (t) ) dt      (130)

The linear quadratic regulator (LQR) gives the state-feedback gain K that minimizes J.

>> Q = eye(4)
Q =
     1     0     0     0
     0     1     0     0
     0     0     1     0
     0     0     0     1
>> R = 1
R = 1
>> K = lqr(Ap, Bp, Q, R)
K =
   -1.0000   -2.0408   20.3672    3.9302

>> Acl = Ap - Bp * K
Acl =
         0    1.0000         0         0
    1.8182    3.5287  -34.3585   -7.1458
         0         0         0    1.0000
    4.5455    8.8217  -61.3961  -17.8646

>> Nf = 1/(Cp*inv(-Acl)*Bp)
Nf =
   -1.0000
         0
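Without the Control Toolbox, the lqr call can be reproduced from the continuous algebraic Riccati equation; a Python/SciPy sketch (same Ap, Bp as above; the LQR gain is K = R⁻¹ Bᵀ X, with u = -K x):

```python
import numpy as np
from scipy import linalg

Ap = np.array([[0.0, 1.0,      0.0,     0.0],
               [0.0, -0.1818,  2.6727,  0.0],
               [0.0, 0.0,      0.0,     1.0],
               [0.0, -0.4545, 31.1818,  0.0]])
Bp = np.array([[0.0], [1.8182], [0.0], [4.5455]])
Q = np.eye(4)
R = np.array([[1.0]])

X = linalg.solve_continuous_are(Ap, Bp, Q, R)   # Riccati solution
K = np.linalg.solve(R, Bp.T @ X)                # LQR gain, u = -K x
Acl = Ap - Bp @ K
print(K)                        # ~ [-1.0000, -2.0408, 20.3672, 3.9302]
print(np.linalg.eigvals(Acl))   # all in the left half plane
```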

>> Poles1 = eig(Acl)
Poles1 =
   -8.3843
   -3.7476
   -1.1020 + 0.4509i
   -1.1020 - 0.4509i

[Figure 38: Step response of the first LQR controller; figure data omitted.]

## Second LQR controller example

Bryson's rules: choose elements of Q and R to be 1/xi², where xi is the allowed excursion of the ith state.

Example use of Bryson's rules: to accelerate the response, place greater cost on position error.

>> Q = diag( [100 1 100 1])
Q =
   100     0     0     0
     0     1     0     0
     0     0   100     0
     0     0     0     1
>> K = lqr(Ap, Bp, Q, R)
K = -10.0000   -8.2172   38.6503    7.2975
>> Acl = Ap - Bp * K
>> Nf = 1/(Cp*inv(-Acl)*Bp)
Nf = -10.0000
           0
>> Poles2 = eig(Acl)
Poles2 =
   -6.9654 + 2.7222i
   -6.9654 - 2.7222i
   -2.2406 + 1.7159i
   -2.2406 - 1.7159i

[Figure 39: Step response of second LQR controller; figure data omitted.]

10 Conclusions

We've seen:

- Some of the basic properties of system models, and classification of signals.
- How to build a state-variable model in 5 steps:
  1. Write the relevant relations for the system
  2. Identify the differential equations
  3. Write the differential equations in state-variable form
  4. Write the equation of the output signal (or signals) using the states and inputs
  5. Check units throughout, to verify correctness.
- (Student exercise) List at least three reasons why state-variable models are advantageous relative to differential equation modeling.
- Construction of several example state-variable models
- Basic operations on a state-variable model, including
  - Determining the transfer function
  - Coordinate transformation
  - State feedback control.

## Part 6: Solutions to the State Equation and Modes and Modal Coordinates

Contents

1 Modal coordinates
  1.1 Derivation of modal coordinates
    1.1.1 Choose the basis vectors to be the columns of the modal matrix
    1.1.2 Example transformation into modal coordinates
    1.1.3 Interpretation of the transformations
  1.3 Transformation of the state model to and from modal coordinates (Similarity transform from Ap to Am or back)
    1.3.1 Case of a full set of real, distinct eigenvalues (diagonalizing Ap)
2 Complex eigenvalue pairs
  2.2 General form for combining complex conjugate parts of a 2nd order mode
4 Conclusions

## 1 Modal coordinates

1.1 Derivation of modal coordinates

Considering the response of a linear system:

ẋ (t) = Ap x (t) + Bp u (t)      (1)

y (t) = Cp x (t) + Dp u (t)      (2)

where Ap, Bp, Cp, Dp form the model in physical coordinates.

Given a state vector x (t) ∈ Rⁿ and a set of basis vectors {ei} for Rⁿ, we know that we can represent x (t) on basis {ei} by

x (t) = Σ (i = 1 .. n)  ξi (t) ei      (3)

where

ξ (t) = [ ξ1 (t) ; ... ; ξn (t) ]

is the representation of x (t) on basis vectors { e1, e2, ..., en }.

Writing M = { e1, e2, ..., en },

x (t) = M ξ (t)   so   ξ (t) = M⁻¹ x (t)      (4)

Likewise, we can represent the input signal Bp u (t) on the same basis:

Σ (i = 1 .. n)  βi (t) ei = Bp u (t)      (5)

where the βi (t) are the basis coefficients representing Bp u (t) on {ei}. Writing

Σ (i = 1 .. n)  βi (t) ei = M β (t)      (6)

then

Bp u (t) = M β (t)   so   β (t) = M⁻¹ Bp u (t)

If we expand x (t) in the state equation using the representation on {ei}, we find, from ẋ (t) = Ap x (t) + Bp u (t):

Σ ξ̇i (t) ei = Σ ξi (t) Ap ei + Σ βi (t) ei      (7)

Rearranging the state equation gives (see Bay, Eqn (6.24))

Σ ( ξ̇i (t) ei - ξi (t) Ap ei - βi (t) ei ) = 0      (8)

Even though the ξi (t) and βi (t) terms in Eqn (8) are scalar, because of the middle term, with matrix Ap, Eqn (8) leads in general to the vector equation

Σ ( ξ̇i (t) I - ξi (t) Ap - βi (t) I ) ei = 0      (9)

which is not especially helpful.

1.1.1 Choose the basis vectors to be the columns of the modal matrix

However, when the ei are the columns of the modal matrix of Ap (assuming for now a complete set of independent eigenvectors), then the terms ξi (t) Ap ei become ξi (t) λi ei, and Eqn (8) becomes

Σ ( ξ̇i (t) ei - ξi (t) λi ei - βi (t) ei ) = Σ ( ξ̇i (t) - λi ξi (t) - βi (t) ) ei = 0      (10)

Since the ei are independent, Eqn (10) is verified only if each term in parentheses is zero, which gives a set of simultaneous scalar equations (see Bay Eqn (6.25))

ξ̇i (t) - λi ξi (t) - βi (t) = 0      (11)

Eqn (11) we know how to solve:

ξi (t) = ξi (t0) e^{λi (t - t0)} + ∫ from t0 to t  e^{λi (t - τ)} βi (τ) dτ      (12)

Representing the state and input on the modal matrix of Ap, an nth order coupled differential equation becomes a set of n first order uncoupled differential equations!

## EE/ME 701: Advanced Linear Systems

Section 1.1.1

The basis vectors e_i that decouple the system are the columns of the modal matrix

$$\left[\, e_1 \;\; e_2 \;\; \cdots \;\; e_n \,\right] = M, \quad \text{recall } A_p = M\, J\, M^{-1} \qquad (13)$$

Eqn (13) is the general form. If a complete set of independent eigenvectors exists, then

$$M = V, \qquad J = U \qquad \text{and} \qquad A_p = V\, U\, V^{-1} \qquad (14)$$

where V is the matrix of eigenvectors and U has the eigenvalues on the main diagonal.

1.1.2 Example transformation into modal coordinates

Consider the system governed by

$$\dot{x}(t) = \begin{bmatrix} -0.164 & -0.059 \\ -0.059 & -0.164 \end{bmatrix} x(t) + \begin{bmatrix} 1.085 \\ 0.031 \end{bmatrix} u(t) \qquad (15)$$

with initial condition

$$x(t_0) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$

The eigensystem is:

$$V = \begin{bmatrix} -0.7071 & 0.7071 \\ -0.7071 & -0.7071 \end{bmatrix}, \qquad U = \begin{bmatrix} -0.2231 & 0 \\ 0 & -0.1054 \end{bmatrix}$$

There is a complete set of independent eigenvectors. The modal matrix is given by:

$$M = -\frac{1}{0.707}\, V = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \qquad (16)$$

Eqn (16) illustrates that the basis vectors of the modal matrix can be scaled by a parameter (actually each vector can be scaled independently). By scaling the elements of M to 1.0, some of the coefficients below get simpler.
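As a quick numerical check (not part of the original notes), the eigensystem of this 2 x 2 example, and the diagonalization by the scaled modal matrix, can be reproduced with NumPy. The matrix entries follow the example as recovered from the printed eigenvalues.

```python
import numpy as np

# Sketch: reproduce the eigensystem of the 2x2 example of section 1.1.2.
Ap = np.array([[-0.164, -0.059],
               [-0.059, -0.164]])
lams, V = np.linalg.eig(Ap)
print(np.sort(lams))      # approximately -0.223 and -0.105

# Scaled modal matrix of Eqn (16): eigenvectors scaled to unit entries
M = np.array([[1.0, 1.0],
              [1.0, -1.0]])
Am = np.linalg.inv(M) @ Ap @ M
print(np.round(Am, 4))    # diagonal matrix of the eigenvalues
```

Note that the eigenvectors returned by `eig` are unit-norm; any nonzero column scaling of M leaves Am diagonal, which is the scaling freedom the notes point out.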


The β_i are given from Eqn (5), which can be rewritten as

$$M \begin{bmatrix} \beta_1(t) \\ \vdots \\ \beta_n(t) \end{bmatrix} = B_p\, u(t) \qquad (17)$$

so

$$\beta(t) = M^{-1} B_p\, u(t) = \begin{bmatrix} 0.558 \\ 0.527 \end{bmatrix} u(t) \qquad (18)$$

The transformation of the initial condition into modal coordinates is given by:

$$\xi(t_0) = M^{-1} x(t_0) = \begin{bmatrix} 0.5 \\ 0.5 \end{bmatrix} \qquad (19)$$

For a step input, which gives constant β_i(t), the form for the solution is:

$$\xi_i(t) = \xi_i(0)\, e^{\lambda_i t} + \beta_i\, \frac{1}{-\lambda_i} \left( 1 - e^{\lambda_i t} \right) \qquad (20)$$

which gives

$$\xi_1(t) = 0.5\, e^{-0.223t} + \frac{0.558}{0.223} \left( 1 - e^{-0.223t} \right), \qquad \xi_2(t) = 0.5\, e^{-0.105t} + \frac{0.527}{0.105} \left( 1 - e^{-0.105t} \right) \qquad (21)$$

And the two uncoupled solutions for a general input are (with t_0 = 0):

$$\xi_1(t) = 0.5\, e^{-0.223t} + 0.558 \int_0^t e^{-0.223(t - \tau)}\, u(\tau)\, d\tau$$

$$\xi_2(t) = 0.5\, e^{-0.105t} + 0.527 \int_0^t e^{-0.105(t - \tau)}\, u(\tau)\, d\tau$$

The transformation back to physical coordinates is given by:

$$x(t) = M\, \xi(t) = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \xi_1(t) + \begin{bmatrix} 1 \\ -1 \end{bmatrix} \xi_2(t) \qquad (22)$$

Eqn (22) can be used in two ways:

1. Eqn (22) shows that the output x(t) is the superposition of n contributions. Each contributing vector is a basis vector; they are the columns of the modal matrix. And each has a basis coefficient ξ_i(t).

2. We can add up the contributions for an individual x_i(t), such as:

$$x_1(t) = 0.5\, e^{-0.223t} + 2.5 \left( 1 - e^{-0.223t} \right) + 0.5\, e^{-0.105t} + 5 \left( 1 - e^{-0.105t} \right) \qquad (23)$$
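The modal step response of Eqn (23) can be checked against a direct simulation with the matrix exponential. This is a sketch (not in the notes), using the system matrices of the example as recovered here and SciPy's `expm`:

```python
import numpy as np
from scipy.linalg import expm

# Sketch: compare x1(t) from the modal form, Eqn (23), with a direct
# matrix-exponential computation of the step response.
Ap = np.array([[-0.164, -0.059], [-0.059, -0.164]])
Bp = np.array([1.085, 0.031])
x0 = np.array([1.0, 0.0])

t = 3.0
# Direct: x(t) = e^{At} x0 + A^{-1}(e^{At} - I) B  for a unit step input
xd = expm(Ap * t) @ x0 + np.linalg.inv(Ap) @ (expm(Ap * t) - np.eye(2)) @ Bp

# Modal: x1(t) from Eqn (23) (coefficients rounded as in the notes)
x1 = (0.5 * np.exp(-0.223 * t) + 2.5 * (1 - np.exp(-0.223 * t))
      + 0.5 * np.exp(-0.105 * t) + 5.0 * (1 - np.exp(-0.105 * t)))
print(abs(x1 - xd[0]) < 0.02)  # agreement to the rounding of the coefficients
```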


1.1.3 Summary

1. By transformation into modal coordinates, the states are uncoupled.
2. The uncoupled equations can be solved.
3. The solution in physical coordinates is found by transformation from modal coordinates back into physical coordinates.

From

$$\xi(t) = M^{-1} x(t) = \begin{bmatrix} r_1^T \\ \vdots \\ r_n^T \end{bmatrix} x(t) \qquad (24)$$

the jth row of M^{-1} gives the coupling of the physical states into the jth mode. We can write that

$$\xi_1(t) = r_1^T x(t) = \langle r_1,\, x(t) \rangle, \;\; \ldots, \;\; \xi_n(t) = r_n^T x(t) = \langle r_n,\, x(t) \rangle \qquad (25)$$

From

$$x(t) = M\, \xi(t) \qquad (26)$$

each mode contributes to the physical states according to a column of M, and each physical state is determined by a combination of modes according to a row of M.

The order of each equation in the uncoupled set can be:

- First, for a real pole: first order.
- Second, for a complex pole pair: Eqn (11) works directly for complex pole pairs, but it may be more convenient to organize each pair of complex poles into a 2nd order real-valued mode (see section 2).
- nth order for an n x n Jordan block. When there is not a complete set of eigenvectors, the Jordan form is used, and gives l coupled states for each chain of l eigenvectors (one regular and l - 1 generalized).

The response of a linear system can be thought of as a collection of first- and second-order mode responses with forcing functions. Through the M matrix, the response of the physical states of the system (voltages and velocities, say) generally involves a superposition of all of the modes of the system.


From

$$\beta = M^{-1} B_p \qquad (27)$$

the input is coupled to each mode according to the elements of β. Following the example of Eqn (25),

$$\beta_i = \langle r_i,\, B_p \rangle \qquad (28)$$

Notice in particular that if there is an element in β that is zero, there is no forcing function for that mode in Eqn (12):

$$\xi_i(t) = \xi_i(t_0)\, e^{\lambda_i (t - t_0)} + \int_{t_0}^{t} e^{\lambda_i (t - \tau)}\, \beta_i(\tau)\, d\tau \qquad \text{(Eqn (12), repeated)}$$

Since the modes are uncoupled, if a β_i is zero, the input u(t) is not connected to that mode.

We call the system expressed in terms of ξ(t) and using A_m, B_m, C_m, D_p the system in modal coordinates, because each state variable is uniquely associated with a mode.

1.2 Example: double pendulum

Consider two masses on pendula connected by a spring; this system will have two oscillatory modes:

- a lower frequency mode in which the masses swing in phase, and
- a higher frequency mode in which the masses swing in opposite phase.

(Figure: two pendulum masses m1 and m2, coupled by a spring of stiffness k, with input u applied to mass 1; the two mode shapes are "in phase" and "opposite phase".)

The linearized equations of motion are:

$$m_1\, l^2\, \ddot{\theta}_1 + d\, \dot{\theta}_1 + (k + m\, g\, l)\, \theta_1 = k\, \theta_2 + u \qquad (29)$$

$$m_2\, l^2\, \ddot{\theta}_2 + d\, \dot{\theta}_2 + (k + m\, g\, l)\, \theta_2 = k\, \theta_1 \qquad (30)$$

With m = 2, l = 1, d = 3, k = 20, and g = 9.8, we find:

$$x(t) = \begin{bmatrix} \dot{\theta}_1(t) \\ \theta_1(t) \\ \dot{\theta}_2(t) \\ \theta_2(t) \end{bmatrix}, \qquad A_p = \begin{bmatrix} -1.50 & -19.80 & 0.00 & 10.00 \\ 1.00 & 0.00 & 0.00 & 0.00 \\ 0.00 & 10.00 & -1.50 & -19.80 \\ 0.00 & 0.00 & 1.00 & 0.00 \end{bmatrix} \qquad (31)$$
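The A_p matrix of Eqn (31) can be formed directly from the parameters, and its eigenvalues checked. This is a sketch (not from the notes); the sign conventions follow the reconstruction of Eqns (29)-(31) above:

```python
import numpy as np

# Sketch: build the double-pendulum A_p of Eqn (31) from the parameters
# m = 2, l = 1, d = 3, k = 20, g = 9.8, and check its eigenvalues.
m, l, d, k, g = 2.0, 1.0, 3.0, 20.0, 9.8
ml2 = m * l**2
# state x = [th1', th1, th2', th2]
Ap = np.array([
    [-d / ml2, -(k + m * g * l) / ml2, 0.0, k / ml2],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, k / ml2, -d / ml2, -(k + m * g * l) / ml2],
    [0.0, 0.0, 1.0, 0.0],
])
lams = np.linalg.eig(Ap)[0]
print(np.round(sorted(lams, key=lambda s: -abs(s.imag)), 2))
# two complex pairs, about -0.75 +/- 5.41j and -0.75 +/- 3.04j
```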

A slower, in-phase mode is excited from the initial condition (figure 4)

$$x^T(0) = \left[\, 0 \;\; 1 \;\; 0 \;\; 1 \,\right]$$

(Figure 4: Two masses swinging in phase (mode 2, mode 1 not excited). Time plots of Theta 1 and Theta 2 [degrees].)

A faster, out-of-phase mode is excited from the initial condition (figure 3)

$$x^T(0) = \left[\, 0 \;\; 1 \;\; 0 \;\; -1 \,\right]$$

(Figure 3: Two masses swinging out of phase (mode 1, mode 2 not excited).)

We can excite the modes individually, by setting up initial conditions that lie in the column space formed by the basis vectors of the mode. General initial conditions give motion that is a superposition of all modes.

(Figure 5: Two masses swinging, both modes excited. Figure 5 is the superposition of figures 4 and 3.)

The unforced response (to initial conditions) is considered in figures 3-5. The forced response (to u(t)) can also be understood in terms of modes.

For an n dimensional system there may be n_1 first order modes and n_2 second order modes, where n = n_1 + 2 n_2. In modal coordinates, we can write the system dynamics as:

$$\dot{\xi}(t) = A_m\, \xi(t) + \beta(t) \qquad (32)$$

Modes give distinct contributions to the total output, such as

first order: $y(t) = a_1\, e^{\lambda_1 t}$, and second order: a decaying sinusoid, such as $y(t) = a_2\, e^{\sigma t} \cos(\omega t + \phi)$.

1.3 Transformation of the state model to and from modal coordinates (similarity transform from A_p to A_m or back)

We've seen that we can transform a state model to a new set of basis vectors with a transformation matrix. Choosing the modal matrix M as a special transformation matrix, and introducing ξ(t), the state vector in modal coordinates,

$$x(t) = M\, \xi(t), \qquad \xi(t) = M^{-1} x(t)$$

Then:

$$\dot{\xi}(t) = \left( M^{-1} A_p\, M \right) \xi(t) + M^{-1} B_p\, u(t)$$

Or:

$$\dot{\xi}(t) = A_m\, \xi(t) + B_m\, u(t), \qquad y(t) = C_m\, \xi(t) + D_p\, u(t) \qquad (35)$$

Where:

$$A_m = M^{-1} A_p\, M \qquad (33)$$

$$A_p = M\, A_m\, M^{-1} \qquad (34)$$

$$B_m = M^{-1} B_p, \qquad C_m = C_p\, M$$

The structure of A_m:

- If the system has only first order modes, A_m is diagonal (all blocks are 1x1).
- If the system has second order modes, A_m will have a 2x2 block for every pair of complex eigenvalues, one corresponding to each 2nd order mode.
- If the system requires the Jordan form, A_m will have an l x l block for every l x l block in the Jordan form.

The important physical property of modes is that they are uncoupled. In figure 3, mode 1 evolves without mode 2; in figure 4, mode 2 evolves without mode 1. In figure 5 both modes evolve, independently.

Modal coordinates are best (and perhaps only) understood in state space.

1.3.1 Case of a full set of real, distinct eigenvalues (diagonalizing A_p)

When we have real and distinct eigenvalues we will have a complete set of independent eigenvectors, and

$$V = \left[\, v_1 \;\; \cdots \;\; v_n \,\right], \qquad U = \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix}$$

Then

$$M = V \qquad \text{and} \qquad A_m = M^{-1} A_p\, M \qquad (36)$$

The system matrix in modal coordinates, A_m, is the diagonal matrix of eigenvalues.

1.3.2 Example, second-order system, 2 first-order modes

Given the system of section 1.1.2, recall that the solution for x(t) is of the form

$$x(t) = M\, \xi(t) = e_1\, \xi_1(t) + e_2\, \xi_2(t) = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \xi_1(t) + \begin{bmatrix} 1 \\ -1 \end{bmatrix} \xi_2(t)$$

Figure 6 is a plot of the phase portrait, showing the state trajectory from initial condition

$$x(0) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$$

A phase portrait is a plot of the state trajectory as time evolves:

- It can be in any coordinate frame.
- It will be an n-dimensional plot, for an nth order system.

(Figure 6: Response of a system with two first-order modes, plotted in the (x1, x2) plane with the basis vectors e1 and e2 shown. The x's show the state at 1.0 second intervals.)

2.1 Example, double pendulum revisited

Considering the double pendulum example of section 1.2, the A_p matrix gives the eigenvectors and eigenvalues:

$$A_m = \begin{bmatrix} -0.75 + j\,5.41 & 0 & 0 & 0 \\ 0 & -0.75 - j\,5.41 & 0 & 0 \\ 0 & 0 & -0.75 + j\,3.04 & 0 \\ 0 & 0 & 0 & -0.75 - j\,3.04 \end{bmatrix}$$

with the first two columns of V given by

$$v_1 = \begin{bmatrix} -0.70 + j\,0.00 \\ +0.02 + j\,0.13 \\ +0.70 + j\,0.00 \\ -0.02 - j\,0.13 \end{bmatrix}, \qquad v_2 = \bar{v}_1 = \begin{bmatrix} -0.70 - j\,0.00 \\ +0.02 - j\,0.13 \\ +0.70 - j\,0.00 \\ -0.02 + j\,0.13 \end{bmatrix}$$

When each pole is distinct, the state-response of a system is the superposition of the individual modal responses. Putting together Eqns (25) and (24) above, the unforced response is given by:

$$x(t) = \sum_{i=1}^{n} \xi_i(t)\, e_i = \sum_{i=1}^{n} e^{\lambda_i t}\, e_i\, r_i^T\, x(0) \qquad (37)$$

with each modal contribution

$$\xi_i(t)\, e_i = e^{\lambda_i t}\, e_i\, r_i^T\, x(0) \qquad (38)$$

When the poles are complex, Eqns (37), (38) are nonetheless valid. For the example above, the first two modal contributions, ξ_1(t) e_1 and ξ_2(t) e_2, involve $e^{(-0.75 + j\,5.41)t}$ and $e^{(-0.75 - j\,5.41)t}$, and are evaluated numerically below with

$$x(0) = \left[\, 1 \;\; 2 \;\; -1 \;\; -2 \,\right]^T \qquad \text{and} \qquad t = 0.6$$

If λ_i is complex, e_i and r_i will in general also be complex. In this case Eqn (38) can make a complex contribution to x(t), but complex terms always come in complex conjugate pairs, and two equations like (38) make up the mode. We can combine the two 1st order complex terms into a single 2nd order real term.

    >> [V, U] = eig(Ap);
    >> Vinv = inv(V);
    >> v1 = V(:,1)
    v1 =
      -0.6955 + 0.0000i
       0.0175 + 0.1262i
       0.6955
      -0.0175 - 0.1262i
    >> r1 = Vinv(1,:)
    r1 =
      -0.36 - 0.05i   0.00 - 1.98i   0.36 + 0.05i  -0.00 + 1.98i
    >> v2 = V(:,2)     %% = v1*
    >> r2 = Vinv(2,:)  %% = r1*
    >> x0 = [ 1; 2; -1; -2 ]
    >> t = 0.6
    >> xi1 = exp(U(1,1)*t) * v1 * r1 * x0
    xi1 =
       0.0477 - 3.5723i
      -0.6494 + 0.0813i
      -0.0477 + 3.5723i
       0.6494 - 0.0813i
    >> xi2 = exp(U(2,2)*t) * v2 * r2 * x0
    xi2 =
       0.0477 + 3.5723i
      -0.6494 - 0.0813i
      -0.0477 - 3.5723i
       0.6494 + 0.0813i
    >> RoundByRatCommand(xi1 + xi2)  %% First and second terms combine to
    ans =                            %% create a real-valued 2nd order mode
       0.0954
      -1.2988
      -0.0954
       1.2988

2.2 General form for combining complex conjugate parts of a 2nd order mode

Considering two terms from Eqn (37), a 2nd order mode makes a contribution to x(t) according to:

$$x(t) = \left[\, (e_r + j e_i) \;\; (e_r - j e_i) \,\right] \begin{bmatrix} e^{(\sigma + j\omega)t} & 0 \\ 0 & e^{(\sigma - j\omega)t} \end{bmatrix} \begin{bmatrix} (r_r + j r_i)^T \\ (r_r - j r_i)^T \end{bmatrix} x(0) \qquad (39)$$

Define a + j b as

$$a + j b = e^{(\sigma + j\omega)t} = e^{\sigma t} \left( \cos(\omega t) + j \sin(\omega t) \right) \qquad (40)$$

or, equivalently,

$$a = e^{\sigma t} \cos(\omega t), \qquad b = e^{\sigma t} \sin(\omega t) \qquad (41)$$

Multiplying out Eqn (39) gives:

$$x(t) = [\, a\, e_r r_r^T + j a\, e_r r_i^T + j b\, e_r r_r^T - b\, e_r r_i^T + j a\, e_i r_r^T - a\, e_i r_i^T - b\, e_i r_r^T - j b\, e_i r_i^T$$
$$\;\; + a\, e_r r_r^T - j a\, e_r r_i^T - j b\, e_r r_r^T - b\, e_r r_i^T - j a\, e_i r_r^T - a\, e_i r_i^T - b\, e_i r_r^T + j b\, e_i r_i^T \,]\, x(0)$$

All the imaginary terms cancel (as they must!) and the result reduces to:

$$x(t) = 2 \left[\, a\, e_r r_r^T - b\, e_r r_i^T - a\, e_i r_i^T - b\, e_i r_r^T \,\right] x(0)$$

which can be organized as

$$x(t) = 2 \left[\, e_r \;\; e_i \,\right] \begin{bmatrix} a & -b \\ -b & -a \end{bmatrix} \begin{bmatrix} r_r^T \\ r_i^T \end{bmatrix} x(0) \qquad (42)$$

Considering the numerical example above:

    >> alpha = real( U(1,1) )
    alpha = -0.7500
    >> omega = imag( U(1,1) )
    omega =  5.4072
    >> a = exp(alpha*t) * cos(omega*t)
    a = -0.6343
    >> b = exp(alpha*t) * sin(omega*t)
    b = -0.0654

2.3 Transformation to a real-valued modal form

Again starting from

$$\dot{x}(t) = A_p\, x(t) + B_p\, u(t)$$

convert to modal coordinates with the (complex) modal matrix

$$\tilde{M} = \left[\, \cdots \;\; v_i \;\; \bar{v}_i \;\; \cdots \,\right], \qquad v_i = e_r + j\, e_i \qquad (43, 44)$$

with

$$\tilde{A}_m = \tilde{M}^{-1} A_p\, \tilde{M}, \qquad \tilde{B}_m = \tilde{M}^{-1} B_p$$

To obtain a real-valued model, choose N to be block diagonal,

$$N = \begin{bmatrix} \ddots & & \\ & N_2 & \\ & & \ddots \end{bmatrix}, \qquad M' = \tilde{M}\, N, \qquad (M')^{-1} = N^{-1} \tilde{M}^{-1} \qquad (46)$$

The 2 x 2 element in N that selects the real and imaginary parts of v_i is:

$$N_2 = \frac{1}{2} \begin{bmatrix} 1 & -j \\ 1 & j \end{bmatrix}, \qquad N_2^{-1} = \begin{bmatrix} 1 & 1 \\ j & -j \end{bmatrix} \qquad (45)$$

so that the corresponding columns of $\tilde{M} N$ are $[\, e_r \;\; e_i \,]$. Now considering $\tilde{A}_m$: the complex pair of eigenvectors have complex conjugate eigenvalues $\lambda = \sigma + j\omega$ and $\bar{\lambda} = \sigma - j\omega$. Then

$$A_m' = N^{-1} \tilde{M}^{-1} A_p\, \tilde{M}\, N = N^{-1} \tilde{A}_m\, N$$

which, block by block, gives

$$N_2^{-1} \begin{bmatrix} \sigma + j\omega & 0 \\ 0 & \sigma - j\omega \end{bmatrix} N_2 = \begin{bmatrix} \sigma & \omega \\ -\omega & \sigma \end{bmatrix} \qquad (47)$$

2.4 Example: real modal form of the double pendulum

The double pendulum has two second order modes. Using the example from section 2.1, the columns of M' = M̃ N are the real and imaginary parts of v_1 and v_3. Note, if it is convenient, we can scale the columns of M' as desired. For example, scaling the largest element of each column to 1.0 gives (approximately)

$$M' = \begin{bmatrix} 1 & 0 & 1 & 0 \\ -0.03 & 1 & -0.08 & 1 \\ -1 & 0 & 1 & 0 \\ 0.03 & -1 & -0.08 & 1 \end{bmatrix} \qquad (48)$$

Finding the system matrix in modal coordinates, real:

$$A_m' = N^{-1} \tilde{M}^{-1} A_p\, \tilde{M}\, N = \begin{bmatrix} -0.75 & 5.41 & 0 & 0 \\ -5.41 & -0.75 & 0 & 0 \\ 0 & 0 & -0.75 & 3.04 \\ 0 & 0 & -3.04 & -0.75 \end{bmatrix}$$

The first mode oscillates in a vector subspace of state space, given by

$$S_1 = \{\, x : x = \alpha_1 e_1 + \alpha_2 e_2 \,\}$$

and the second mode oscillates in a vector subspace of state space, given by

$$S_2 = \{\, x : x = \alpha_3 e_3 + \alpha_4 e_4 \,\}$$

where e_1, e_2 are the first two columns of M' (the real and imaginary parts for mode 1), and e_3, e_4 are the last two.
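The real modal form of section 2.4 can be built numerically. The following sketch (not from the notes) takes one eigenvector from each complex pair of the double-pendulum A_p, stacks real and imaginary parts as columns, and checks that the similarity transform produces 2 x 2 blocks of the form [[sigma, omega], [-omega, sigma]]:

```python
import numpy as np

# Sketch: real modal form of the double pendulum via real/imag eigenvector parts.
Ap = np.array([
    [-1.5, -19.8, 0.0, 10.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 10.0, -1.5, -19.8],
    [0.0, 0.0, 1.0, 0.0],
])
lams, V = np.linalg.eig(Ap)
# one eigenvector from each conjugate pair (positive imaginary eigenvalue)
idx = [i for i in range(4) if lams[i].imag > 1e-9]
Mr = np.column_stack([np.column_stack([V[:, i].real, V[:, i].imag]) for i in idx])
Am = np.linalg.inv(Mr) @ Ap @ Mr
print(np.round(Am, 2))  # 2x2 blocks: sigma on the diagonal, +/-omega off it
```

This works because A (re v) = sigma re v - omega im v and A (im v) = omega re v + sigma im v for any eigenvector scaling, so the block structure does not depend on the normalization `eig` happens to return.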


## 3 $\dot{x}(t) = A_p\, x(t)$ defines A_p-invariant spaces

Considering the unforced response of

$$\dot{x}(t) = A_p\, x(t) + B_p\, u(t)$$

(that is, u(t) = 0), a real λ_i and e_i define a 1-D A_p-invariant subspace. That is, if

$$x(t = 0) \in \{\, x : x = a_i\, e_i \,\} \qquad \text{then} \qquad x(t) \in \{\, x : x = a_i\, e_i \,\} \quad \forall\, t > 0$$

For the example of section 1.1.2, the transformation of the initial condition into modal coordinates is:

$$x(0) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \qquad \xi(0) = M^{-1}\, x(0) = \begin{bmatrix} 1.0 \\ 0 \end{bmatrix} \qquad (49)$$

and the initial condition of Eqn (49) excites only the first (faster) mode. Likewise

$$x(0) = \begin{bmatrix} 1 \\ -1 \end{bmatrix}, \qquad \xi(0) = M^{-1}\, x(0) = \begin{bmatrix} 0 \\ 1.0 \end{bmatrix} \qquad (50)$$

and the initial condition of Eqn (50) excites only the second (slower) mode.

For a complex eigenvalue pair, the real and imaginary parts of the eigenvector define a 2-D A_p-invariant subspace. If

$$x(0) \in \{\, x : x = a_1\, e_r + a_2\, e_i \,\} \qquad (51)$$

with

$$e_r = \mathrm{real}(v_i) \qquad \text{and} \qquad e_i = \mathrm{imag}(v_i) \qquad (52)$$

then

$$x(t) \in \{\, x : x = a_1\, e_r + a_2\, e_i \,\} \quad \forall\, t > 0 \qquad (53)$$

Eqn (53) implies that the left vectors r_r, r_i of the other modes are orthogonal to {x : x = a_1 e_r + a_2 e_i}, which is verified numerically for the example above.

(Revised: Sep 10, 2012)


4 Conclusions

- The response of linear dynamic systems can be decomposed into the responses of individual modes.
- The modes are decoupled: if the input or initial condition excites only one mode, only that mode responds.
- Modes are either first order, second order, or correspond to a Jordan block.
- The basis vectors of the modes (vectors e_i in Eqn (3) et seq.) are the bases of A_p-invariant subspaces in state-space, defined by the modes.
- The modal matrix M is used to transform between modal and physical coordinates.
- The forcing function, u(t), can also be decomposed into forcing functions for each of the modes.

Part 7: Phase Portraits and Stability

Contents

1 Introduction to Stability Theory
2 The phase plane (or phase portrait)
   2.4 Determining local stability
3 Limit Cycles
4 Lyapunov stability
   4.1 Generalization of Lyapunov's energy function
   4.2 Quadratic Forms
   4.3 Lyapunov stability of a linear system
5 Summary

## 1 Introduction to Stability Theory

We will consider the stability of:

- Linear time-invariant systems: easy, given by the eigenvalues
- Linear time-varying systems: harder, some unexpected results
- Nonlinear systems: in general quite hard

Stability of linear time-invariant systems is relatively simple:

- Examine the poles (eigenvalues of the system matrix)
- LTI systems can show exponential decay, marginal stability, or exponential growth

Linear time-varying systems:

- We can't look at the succession of instantaneous systems
- Even if the eigenvalues are always in the left half-plane, the system can be unstable
- In discrete time, a sequence of stable systems can be unstable
- More powerful tools are needed

For nonlinear systems the picture is even more complex:

- Local stability: properties for small inputs may not extend to large inputs
- Stability may be input dependent
- Stability may be initial-condition dependent

## 2 The phase plane (or phase portrait)

A tool is needed to understand stability properties in general cases:

$$\dot{x}(t) = f(x, u, t) \qquad (1)$$

$$y(t) = g(x, u, t)$$

we need a tool for looking at the behavior of the system. One general and powerful tool is the phase plane.

For time-varying and nonlinear systems, there are various flavors of stability:

- Stable (or delta-epsilon stable): if the system starts within δ of the equilibrium point, it never goes outside of distance ε from the equilibrium point.
- Uniform stability: stability is not dependent on time (for time-varying systems).
- Asymptotic stability: as t goes to infinity, the state draws ever closer to the equilibrium point.
- Exponential stability: as t goes to infinity, the state draws ever closer to the equilibrium point at least as fast as some exponential curve.

Example: consider the undamped oscillator

$$\frac{d^2 y}{dt^2} + \omega^2\, y = 0, \qquad \dot{x}(t) = \begin{bmatrix} 0 & -\omega^2 \\ 1 & 0 \end{bmatrix} x(t), \quad x = \begin{bmatrix} \dot{y} \\ y \end{bmatrix}$$

(Figure: undamped oscillator response; a time plot of the state signals. An nth order system will have n curves in the time plot.)

Phase plane (continued)

To draw the phase portrait, take away time, and make the axes of the plot the states (a 2-D plot for a 2nd order system, a 3-D plot for a 3rd order system, etc.)

A path is called a phase trajectory, or simply a trajectory. We can consider any number of trajectories, to create a phase portrait.

(Figure 1: Undamped oscillator response: time plot of the state signals, and the corresponding phase-plane trajectory, Velocity [m/s] vs. Position [m].)

(Figure 2: Damped oscillator.)

What is state? It is all the information needed to determine the future trajectory of the system (given the inputs).

Corollary: the trajectory departing from a point x_a is a function only of x_a, with no dependence on how the trajectory arrived at state x_a.

At each point in phase-space (at each possible state) there is a direction of departure. We can plot the phase arrows along trajectories, or at any point, such as grid points.

(Figure 4: (a) phase arrows plotted along trajectories; (b) phase arrows plotted on a grid of points.)

To plot 4(b), we need only be able to compute ẋ(x); it is not necessary to solve the differential equation. This is a handy feature for nonlinear systems.

In 2-D we have a powerful theorem: phase trajectories can not cross.

The concept of a trajectory through phase space extends naturally to higher-order systems, but:

1. We can't easily plot 3rd and higher order phase portraits.
2. The theorem that phase trajectories can not cross is less useful.

An equilibrium point is a steady-state operating point of a system; it is a point x_e where:

$$\dot{x} = f(x_e, u, t) = 0 \qquad (2)$$

For a linear system, the origin is always an equilibrium point: ẋ = A 0 = 0. If A has a null space, there will be infinitely many equilibrium points (a null space in A corresponds to a pole at the origin; consider the dynamics of a car: there are infinitely many places your car can stop).

If there is an input, the equilibrium will be shifted:

$$\dot{x}(t) = A\, x(t) + B\, u, \qquad x_e : A\, x_e + B\, u = 0$$

Near an equilibrium point, the phase portrait of a 2nd order system takes one of four characteristic forms:

1. Stable node: the phase trajectories travel into x_e.
2. Unstable node: the phase trajectories travel away from x_e.
3. Saddle point
4. Center

(Figure 7: Saddle points; the system has both stable and unstable modes.)

(Figure 8: Phase portrait for marginal stability.)

As you would expect, a variety of terminology is applied to the characteristics of differential equations. An equilibrium point of

$$\dot{x} = f(x, u, t) \qquad (3)$$

may be referred to as:

- A fixed point
- An attractor (if the system is stable)

If the fixed point is called an attractor, we can call the region (of state space) from which trajectories converge to the attractor the basin of attraction. Figure 7.2 of Bay illustrates multiple attractors and basins of attraction of a pendulum. Notice the stable and unstable equilibria.

(Figure 9: Multiple attractors and basins of attraction (Bay figure 7.2).)

2.4 Determining local stability

With x = x_e + δx, ẋ(x) is given by the Taylor series expansion:

$$\dot{x}(x) = \dot{x}(x_e) + \left. \frac{\partial f(x, u, t)}{\partial x} \right|_{x = x_e} \delta x + O\!\left( \delta x^2 \right) = 0 + J\, \delta x \qquad (4)$$

The derivative of a vector is a matrix of the individual scalar derivatives. With $f(x, u, t) = \left[\, f_1(\cdot) \;\; f_2(\cdot) \;\; \cdots \;\; f_n(\cdot) \,\right]^T$:

$$J = \left. \frac{\partial f(x, u, t)}{\partial x} \right|_{x = x_e} = \left. \begin{bmatrix} \frac{\partial f_1}{\partial x_1} & \frac{\partial f_1}{\partial x_2} & \cdots & \frac{\partial f_1}{\partial x_n} \\ \frac{\partial f_2}{\partial x_1} & \frac{\partial f_2}{\partial x_2} & \cdots & \frac{\partial f_2}{\partial x_n} \\ \vdots & \vdots & & \vdots \\ \frac{\partial f_n}{\partial x_1} & \frac{\partial f_n}{\partial x_2} & \cdots & \frac{\partial f_n}{\partial x_n} \end{bmatrix} \right|_{x = x_e}$$

Near to x_e, the first-order term will dominate, and the stability properties (stable vs. unstable vs. saddle point vs. center) are determined by the eigenvalues of ∂f/∂x, which is sometimes called the Jacobian matrix.

Local stability is usually computable, because we can usually form the Jacobian matrix, even for non-linear systems.

Local stability is a limited result. For example, even for a stable fixed point, it may be difficult to determine the size of the basin of attraction.
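The Jacobian test of Eqn (4) can be carried out numerically. The sketch below (not from the notes) uses an illustrative damped pendulum, x1' = x2, x2' = -sin(x1) - 0.5 x2, forms the Jacobian by central differences, and classifies the two equilibria by the eigenvalue real parts:

```python
import numpy as np

# Illustrative nonlinear system (damped pendulum, not from the notes)
def f(x):
    return np.array([x[1], -np.sin(x[0]) - 0.5 * x[1]])

# Numerical Jacobian of f at xe by central differences
def jacobian(f, xe, h=1e-6):
    n = len(xe)
    J = np.zeros((n, n))
    for j in range(n):
        e = np.zeros(n)
        e[j] = h
        J[:, j] = (f(xe + e) - f(xe - e)) / (2 * h)
    return J

for xe in (np.array([0.0, 0.0]), np.array([np.pi, 0.0])):
    lams = np.linalg.eig(jacobian(f, xe))[0]
    print(xe, max(lams.real) < 0)  # stable at [0, 0]; a saddle at [pi, 0]
```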

## 3 Limit Cycles

Nonlinear systems can exhibit limit cycles: isolated closed trajectories that are the limit of neighboring trajectories.

Contrasting with marginal stability (figure 8): a phase portrait for marginal stability also has closed curves, but these are not isolated, and are not the limit of a trajectory.

Establishing the existence of a limit cycle has no general solution. There is a limited solution for the 2-D case:

Poincare's Index: for the region enclosed by a closed curve, write

$$n = N - S$$

where N is the number of centers, foci and nodes, and S is the number of enclosed saddle points. For the closed curve to be a limit cycle, it is necessary but not sufficient that n = 1.

Summary on stability so far:

- The local stability of a (sufficiently smooth) nonlinear system can be determined by examining the stability of the equilibrium points.
- Nonlinear systems can have a new kind of behavior: a limit cycle.
- Needed: a global stability result.

## 4 Lyapunov stability

Lyapunov method:

- Introduce an energy function (or, in general, a Lyapunov function).
- Show that the energy decays everywhere, except at the equilibrium point, where it is zero.

First published in Russian in 1892; translated into French in 1907.

Consider the second order mass-spring-damper system with nonlinear spring k(y) and nonlinear damper b(ẏ):

$$\ddot{y} + b(\dot{y}) + k(y) = 0$$

With state

$$x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} y \\ \dot{y} \end{bmatrix}, \qquad \dot{x} = \begin{bmatrix} x_2 \\ -k(x_1) - b(x_2) \end{bmatrix}$$

Assuming unit mass, the sum of potential and kinetic energy is given by:

$$E(x_1, x_2) = \frac{x_2^2}{2} + \int_0^{x_1} k(\sigma)\, d\sigma$$

Looking at the rate of change of energy in the system:

$$\frac{dE}{dt} = x_2\, \dot{x}_2 + k(x_1)\, \dot{x}_1 = x_2 \left[ \dot{x}_2 + k(x_1) \right]$$

But from the dynamics, $\dot{x}_2 = -k(x_1) - b(x_2)$, and so dE/dt becomes:

$$\frac{dE}{dt} = x_2 \left[ -k(x_1) - b(x_2) + k(x_1) \right] = -x_2\, b(x_2)$$

Observations:

- If b(·) = 0 and the system has only a spring, dE/dt = 0 and the total energy in the system remains constant.
- For any initial condition giving energy E_0, the system never goes to a state with E(x) > E_0, since E > 0 and dE/dt ≤ 0.
- When x_2 b(x_2) > 0 for x_2 ≠ 0 (e.g., b > 0 in the linear case), energy steadily decays, and the system converges to E = 0.

4.1 Generalization of Lyapunov's energy function

Introduce a scalar function of the system state:

$$V(x) \qquad (5)$$

which is positive-definite:

$$V(x) > 0, \quad x \neq 0 \qquad (6)$$

and show, using

$$\frac{dV}{dt} = \frac{\partial V}{\partial x} \frac{dx}{dt}$$

that

$$\frac{dV}{dt} < 0, \quad x \neq 0 \qquad (7)$$

which establishes stability, since then the state remains bounded:

$$V(t) \leq V(t_0) \quad \text{for all} \quad t > t_0 \qquad (8)$$

Challenge: how do we find V(x)?

- Energy function
- Clever choice

Typical choice:

$$V(x) = x^T P\, x \qquad (9)$$

Of course, failure to find a suitable V(x) does not prove instability.

4.2 Quadratic Forms

Physical energy is often quadratic in the state variables; consider $V_c = \frac{1}{2} C v^2$ for a capacitor. Because we want a positive definite function V, a quadratic form is often chosen for a Lyapunov function:

$$V(x) = x^T P\, x \qquad (10)$$

where P is a matrix. It can be shown (see chapter) that P may be restricted to symmetric matrices without loss of generality. (What is the advantage of a symmetric matrix, in terms of the eigensystem of P?)

Requirements for a positive-definite (PD) matrix:

- $x^T P\, x > 0$ for all $x \neq 0$
- All eigenvalues of P are > 0
- P may be written $P = R^T R$

(Student exercise: prove that any P written $P = R^T R$ is symmetric and positive (semi)definite, for arbitrary choice of R.)

P is positive-semidefinite if all eigenvalues are ≥ 0, and some eigenvalues may be 0.

4.3 Lyapunov stability of a linear system

Given

$$\dot{x}(t) = A\, x(t) \qquad (11)$$

and

$$V(x) = x^T P\, x \qquad (12)$$

then V̇(x) is given by (student exercise):

$$\dot{V}(x) = \dot{x}^T P\, x + x^T P\, \dot{x} = x^T A^T P\, x + x^T P\, A\, x = -x^T Q\, x \qquad (13)$$

where

$$A^T P + P\, A = -Q \qquad (14)$$

Q positive definite proves the stability of system (11). Eqn (14) is called the matrix Lyapunov equation. For a stable linear system, Eqn (14) has a solution P for every positive definite Q.

Equations (11)-(14) show the concept applied to a linear system, and provide a starting point for studying nonlinear systems.
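The matrix Lyapunov equation of Eqn (14) can be solved numerically. The sketch below (not from the notes; the stable A is illustrative) uses SciPy's continuous Lyapunov solver and checks that the resulting P is positive definite:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Sketch: solve A^T P + P A = -Q, Eqn (14), for an illustrative stable A.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # eigenvalues -1 and -2
Q = np.eye(2)
# solve_continuous_lyapunov(a, q) solves a X + X a^H = q; pass a = A^T, q = -Q
P = solve_continuous_lyapunov(A.T, -Q)

print(np.allclose(A.T @ P + P @ A, -Q))               # Eqn (14) holds
print(np.all(np.linalg.eigvalsh((P + P.T) / 2) > 0))  # P is positive definite
```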

## 5 Summary

We've seen that stability is more complex for nonlinear systems:

- It is a property of the system, and also of initial conditions and inputs (think of the stability of a plane in flight).
- A system at equilibrium and perturbed can:
  - Diverge
  - Go to a different equilibrium point
  - Go into a limit cycle

The Lyapunov stability method gives one approach for non-linear systems.

## Controllability and Observability of Linear Systems

Contents

1 Introduction
   1.1 Definitions
   1.2 Main facts about uncontrollable or unobservable modes
2 Tests for Controllability and Observability
   2.1 Controllability, discrete time system
   2.2 Controllability for continuous-time systems
      2.2.1 Polynomial expressions on matrices, and the Cayley-Hamilton theorem
   2.3 Controllability example
   2.5 Observability, continuous time system
   2.6 Observability example
   2.7 Popov-Belevitch-Hautus Tests

1 Introduction

No discussion of state-variable models would be complete without a discussion of observability and controllability (R. E. Kalman [1, 2]).

Modes correspond to energy storage elements. When a pole and a zero cancel, the mode nonetheless remains in the system:

- The mode disappears from the transfer function.
- The mode, and its energy storage element, remains in the system.

Either the signal of the mode does not appear in the output (unobservable, fig 1(B)); or the input signal can't reach the mode (uncontrollable, fig 1(C)).

(Figure 1: Block diagrams of u(t) driving (s+3)/((s+3)(s+4)) to produce y(t): (A) pole-zero cancellation; (B) unobservable system (state hidden from output); (C) uncontrollable system (state untouched by input).)

1.1 Definitions

Controllable System: The input signal reaches each of the modes, so that the system can be driven from any state, x_0, to the origin by suitable choice of input signal.

Observable System: The response of each mode reaches the output. An initial condition, x_0, can be determined by observing the output and input of the system.

Figure 1(B) shows an unobservable system: the zero blocks the mode from the output path. Figure 1(C) shows an uncontrollable system.

1.2 Main facts about uncontrollable or unobservable modes

- Even if a pole and zero cancel, the system still has that mode in its response.
- Challenge: if a pole and zero cancel, feedback control can do nothing to move that mode.
- If the mode is slow or lightly damped, it will remain slow or lightly damped.
- If the mode is fast and well damped, being uncontrollable or unobservable is often not a problem. Exception: a rocket, where the stress introduced by oscillation of an uncontrollable or unobservable mode can over-stress the structure.
- How do uncontrollable or unobservable modes get excited?
  - Disturbances (which don't go through zeros of the control input path),
  - Initial conditions,
  - Non-linearities, etc.

## 2 Tests for Controllability and Observability

2.1 Controllability, discrete time system

Consider example 3.10 in [Bay]. For the LTI discrete-time system:

$$x(k+1) = A\, x(k) + B\, u(k) \qquad (1)$$

Controllability, example 3.10: Given an arbitrary initial condition x(0), under what conditions will it be possible to find an input sequence, u(0), u(1), ..., u(l), which will drive the state vector to zero, x(l) = 0?

Solution: consider the state for a few samples of the input.

At time k = 0:

$$x(1) = A\, x(0) + B\, u(0) \qquad (2)$$

At time k = 1:

$$x(2) = A\, x(1) + B\, u(1) = A \left[ A\, x(0) + B\, u(0) \right] + B\, u(1) = A^2 x(0) + A\, B\, u(0) + B\, u(1) \qquad (3)$$

At time k = 2:

$$x(3) = A\, x(2) + B\, u(2) = A \left[ A^2 x(0) + A\, B\, u(0) + B\, u(1) \right] + B\, u(2)$$

and, in general,

$$x(l) = A^l x(0) + B\, u(l-1) + A\, B\, u(l-2) + \cdots + A^{l-1} B\, u(0) \qquad (4)$$

The ascending powers of A arise because x(k+1) depends on x(k) (recursion).

For 0 = x(l), rearranging Eqn (4) gives:

$$x(l) - A^l x(0) = -A^l x(0) = B\, u(l-1) + A\, B\, u(l-2) + \cdots + A^{l-1} B\, u(0) \qquad (5)$$

The right hand side of (5) can be put in matrix-vector form, to give:

$$-A^l x(0) = \left[\, B \;|\; A B \;|\; \cdots \;|\; A^{l-2} B \;|\; A^{l-1} B \,\right] \begin{bmatrix} u(l-1) \\ u(l-2) \\ \vdots \\ u(1) \\ u(0) \end{bmatrix} \qquad (6)$$

with

$$u = \begin{bmatrix} u(l-1) \\ u(l-2) \\ \vdots \\ u(1) \\ u(0) \end{bmatrix} \qquad (7)$$

then

$$-A^l x(0) = P\, u \qquad (8)$$

From Eqn (8), it is clear that if $-A^l x(0)$ lies in the column space of P, then a solution u exists, which is the control sequence that drives the initial state to the origin. Thus: if the rank of P = n, then a control sequence u is guaranteed to exist, and the system is controllable.
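The rank test and the drive-to-the-origin construction of Eqn (8) can be demonstrated numerically. This is a sketch (the 2-state system below is illustrative, not the matrices of Bay example 3.10):

```python
import numpy as np

# Sketch: discrete-time controllability test of Eqn (8), illustrative system.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
B = np.array([[0.0], [1.0]])
n = 2

# Controllability matrix P = [B | AB | ... | A^{n-1} B]
P = np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])
print(np.linalg.matrix_rank(P) == n)   # rank n: controllable

# Solve -A^n x(0) = P u for u = [u(n-1); ...; u(0)]
x0 = np.array([1.0, -1.0])
u = np.linalg.solve(P, -np.linalg.matrix_power(A, n) @ x0)

# Simulate x(k+1) = A x(k) + B u(k), applying u(0), then u(1)
x = x0.copy()
for k in range(n):
    x = A @ x + B.flatten() * u[n - 1 - k]
print(np.allclose(x, 0.0))             # driven to the origin in n steps
```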


## 2.2.1 Polynomial expressions on matrices, and the Cayley-Hamilton theorem

A polynomial in a scalar λ, and the corresponding polynomial in a matrix A, are written:

$$p(\lambda) = \lambda^n + a_{n-1} \lambda^{n-1} + \cdots + a_1 \lambda + a_0 \qquad (9)$$

$$p(A) = A^n + a_{n-1} A^{n-1} + \cdots + a_1 A + a_0 I \qquad (10)$$

Cayley-Hamilton Theorem: a matrix satisfies its own characteristic equation.

THEOREM (Cayley-Hamilton theorem): If the characteristic polynomial of an arbitrary n x n matrix A is denoted Δ(λ), computed as Δ(λ) = det(A - λI), then matrix A satisfies its own characteristic equation, denoted by Δ(A) = 0.

Writing out Δ(A) = 0, where the α_i are the coefficients of the characteristic equation, gives

$$\Delta(A) = A^n + \alpha_{n-1} A^{n-1} + \cdots + \alpha_1 A + \alpha_0 I = 0$$

so

$$A^n = -\alpha_{n-1} A^{n-1} - \cdots - \alpha_1 A - \alpha_0 I \qquad (11)$$

In any polynomial expression in A, we can always replace A^n and higher powers with the right hand side of Eqn (11). As a result, all matrix polynomial expressions are equivalent to an expression of order n - 1 or less.

2.2.2 Controllability for continuous-time systems

Controllability for a continuous time system follows from a similar analysis, except that integrals and the Cayley-Hamilton theorem are required.

THEOREM (Controllability): An n-dimensional continuous-time LTI system is controllable if and only if the matrix

$$P = \left[\, B \;|\; A B \;|\; \cdots \;|\; A^{n-1} B \,\right] \qquad (12)$$

has rank n.

PROOF: Recall that the solution of the LTI equations is:

$$x(t) = e^{A(t - t_0)} x(t_0) + \int_{t_0}^{t} e^{A(t - \tau)} B\, u(\tau)\, d\tau \qquad (13)$$

We want to establish whether a control signal u(t) exists such that x(t_1) = 0, so consider:

$$\int_{t_0}^{t_1} e^{A(t_1 - \tau)} B\, u(\tau)\, d\tau = -e^{A(t_1 - t_0)} x(t_0) \triangleq \xi_1 \in \mathbb{R}^n \qquad (14)$$

We don't need to solve for ξ_1 for the proof, only to know it exists.

The term $e^{A(t_1 - \tau)}$ has a polynomial expansion, so by the Cayley-Hamilton theorem, it can be expressed in terms of a low-order polynomial:

$$e^{A(t_1 - \tau)} = \sum_{i=1}^{n} \gamma_i(\tau)\, A^{n-i} \qquad (15)$$

We don't need to solve for the γ_i(τ) for the proof, only to know they exist.
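The Cayley-Hamilton replacement used in this proof, Δ(A) = 0, can be checked numerically. A short sketch (the matrix is arbitrary, not from the notes):

```python
import numpy as np

# Sketch: numerical check of the Cayley-Hamilton theorem, Delta(A) = 0.
A = np.array([[1.0, 2.0], [3.0, 4.0]])
coeffs = np.poly(A)            # characteristic polynomial coefficients of A
D = np.zeros_like(A)
for c in coeffs:               # Horner evaluation of Delta(A)
    D = D @ A + c * np.eye(2)
print(np.allclose(D, 0.0))     # A satisfies its own characteristic equation
```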


Substituting Eqn (15) into Eqn (14):

$$\xi_1 = \int_{t_0}^{t_1} \left[ \sum_{i=1}^{n} \gamma_i(\tau)\, A^{n-i} \right] B\, u(\tau)\, d\tau = \int_{t_0}^{t_1} \left[ \sum_{i=1}^{n} A^{n-i} B\, \gamma_i(\tau) \right] u(\tau)\, d\tau$$

$$= A^{n-1} B \int_{t_0}^{t_1} \gamma_1(\tau)\, u(\tau)\, d\tau + \cdots + A\, B \int_{t_0}^{t_1} \gamma_{n-1}(\tau)\, u(\tau)\, d\tau + B \int_{t_0}^{t_1} \gamma_n(\tau)\, u(\tau)\, d\tau \qquad (16)$$

Define:

$$\phi_i \triangleq \int_{t_0}^{t_1} \gamma_i(\tau)\, u(\tau)\, d\tau \qquad (17)$$

so

$$\xi_1 = \left[\, B \;|\; A B \;|\; \cdots \;|\; A^{n-2} B \;|\; A^{n-1} B \,\right] \begin{bmatrix} \phi_n \\ \vdots \\ \phi_1 \end{bmatrix} = P\, \phi \qquad (18)$$

As before, we don't need to be able to compute the φ_i, only to know they exist. From Eqn (14), ξ_1 is a vector depending on x(t_0); from Eqns (17) and (18), if P is full rank, then there exists a control input u(t), t ∈ [t_0, t_1], giving some φ such that x(t_1) = 0. Thus, the system is guaranteed to be controllable.

2.3 Controllability example

Consider the electrical circuit given in figure 2:

(Figure 2: RLC circuit: source v_s(t) behind a resistor R, with inductor current i_L(t), capacitor voltage v_C(t), and output voltage v_x(t).)

The state model (with matrix entries and signs as reconstructed here) is:

$$\frac{d}{dt} \begin{bmatrix} v_C(t) \\ i_L(t) \end{bmatrix} = \begin{bmatrix} -\frac{2}{RC} & -\frac{1}{C} \\ \frac{1}{L} & 0 \end{bmatrix} \begin{bmatrix} v_C(t) \\ i_L(t) \end{bmatrix} + \begin{bmatrix} \frac{1}{RC} \\ -\frac{1}{L} \end{bmatrix} v_s(t)$$

$$v_x(t) = \begin{bmatrix} -1 & 0 \end{bmatrix} \begin{bmatrix} v_C(t) \\ i_L(t) \end{bmatrix} + v_s(t) \qquad (19)$$

The controllability matrix is given as:

$$P = \left[\, B \;|\; A B \,\right] = \begin{bmatrix} \frac{1}{RC} & -\frac{2}{R^2 C^2} + \frac{1}{LC} \\ -\frac{1}{L} & \frac{1}{RLC} \end{bmatrix} \qquad (20)$$

When P becomes rank-deficient, the determinant will be zero. So we can compute the determinant to test for values of R which make the system uncontrollable.
Part 8: Controllability and Observability

Page 12
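The rank test of Eqn (20) can be exercised numerically. A Python/NumPy sketch (the notes use MATLAB; the values L = 1 H and C = 1 F are chosen here for illustration):

```python
import numpy as np

def ctrb_rank(R, L=1.0, C=1.0):
    """Rank of P = [B | A B] for the RLC circuit state model, Eqns (19)-(20)."""
    A = np.array([[-2.0/(R*C), 1.0/C],
                  [-1.0/L,     0.0 ]])
    B = np.array([[1.0/(R*C)],
                  [1.0/L    ]])
    P = np.hstack([B, A @ B])           # controllability matrix
    return np.linalg.matrix_rank(P)

print(ctrb_rank(R=2.0))   # 2: full rank, controllable
print(ctrb_rank(R=1.0))   # 1: R = sqrt(L/C), rank-deficient, uncontrollable
```

With L = C = 1 the critical value sqrt(L/C) is R = 1, where the second column of P becomes a multiple of B.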

Note: The controllability matrix doesn't have to be square (we'll see this example with an extra input below). A determinant calculation can only be used when P is square.

|P| = det [ 1/(RC)   −2/(R²C²) + 1/(LC) ; 1/L   −1/(RLC) ] = 1/(R²LC²) − 1/(L²C)        (21)

So rank(P) < 2 if R = sqrt(L/C).

What it means for the system to be uncontrollable: for arbitrary t0, t1 and x(t0), we can not find a control input vs(t), t ∈ [t0, t1] to give x(t1) = 0; in other words, to drive the state to 0.

It means that the transfer function has a pole-zero cancellation, and that the zero is effectively in the input path:

Vx(s)/Vs(s) = s (s + 1/(RC)) / ( s² + (2/(RC)) s + 1/(LC) )        (22)

If R = sqrt(L/C), then L = R²C and the denominator factors into:

Vx(s)/Vs(s) = s (s + 1/(RC)) / ( (s + 1/(RC)) (s + 1/(RC)) ) = s / (s + 1/(RC))        (23)

The second mode is not gone from the system, it is just unreachable from the input.

Consider adding another control input (one way to deal with an uncontrollable system):

[Figure: the RLC circuit with a second input, a current source Ib(t), added.]

## Figure 3: RLC circuit example with second input.

The state model becomes:

d/dt [ vC(t) ; iL(t) ] = [ −2/(RC)  1/C ; −1/L  0 ] [ vC(t) ; iL(t) ] + [ 1/(RC)  1/C ; 1/L  0 ] [ vs(t) ; Ib(t) ]        (24)

vx(t) = [ −1  0 ] [ vC(t) ; iL(t) ] + vs(t)

The controllability matrix is:

P = [ B | A B ] = [ 1/(RC)  1/C  |  −2/(R²C²) + 1/(LC)  −2/(RC²) ; 1/L  0  |  −1/(RLC)  −1/(LC) ]        (25)

Notice that P changes shape. It still has n rows, but now has 2n columns. A 2 × 4 matrix has to be quite special to have rank less than 2. In the present case, P is rank 2 as long as the two rows are independent. The zero in the (2,2) element assures the rows will be independent.
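The rank recovery is easy to confirm numerically. A Python/NumPy sketch, assuming the second input column of B is [1/C; 0] as in Eqn (24) (the current source feeding the capacitor node):

```python
import numpy as np

R, L, C = 1.0, 1.0, 1.0                 # R = sqrt(L/C): single-input case is uncontrollable
A  = np.array([[-2.0/(R*C), 1.0/C],
               [-1.0/L,     0.0 ]])
B2 = np.array([[1.0/(R*C), 1.0/C],     # input columns: vs(t), Ib(t)
               [1.0/L,     0.0 ]])

P = np.hstack([B2, A @ B2])             # 2 x 4 controllability matrix, Eqn (25)
r = np.linalg.matrix_rank(P)

print(P.shape)   # (2, 4)
print(r)         # 2: the second input restores controllability
```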

## 2.4 Observability, discrete-time system

Consider example 3.11 in [Bay]. For the LTI discrete-time system:

x(k+1) = A x(k) + B u(k)
y(k)   = C x(k) + D u(k)        (26)

Observability, example 3.11: Given an arbitrary initial condition x(0), when l samples of the inputs and outputs, u(k) and y(k), k ∈ {0, ..., l−1}, are known, under what conditions will it be possible to determine (observe) the initial condition x(0).

Duality: Duality situations are those where, with a systematic set of changes or exchanges, one system turns out to have properties equivalent to another. For example, circuit duality:

Capacitors    ↔  Inductors
Resistances   ↔  Conductances

and the new dual circuit will behave identically to the original.

In controls there are several dualities; an important one is the Observability / Controllability duality. As we will see, with an appropriate transpose and C substituted for B, the properties carry over.

Solution: Consider a few samples of the output.

At time k = 0:

y(0) = C x(0) + D u(0)        (27)

At time k = 1:

y(1) = C x(1) + D u(1)
     = C [ A x(0) + B u(0) ] + D u(1)
     = C A x(0) + C B u(0) + D u(1)        (28)

At time k = 2:

y(2) = C x(2) + D u(2)
     = C [ A² x(0) + A B u(0) + B u(1) ] + D u(2)        (29)

Generalizing, Eqn (29) becomes:

[ y(0) ; y(1) ; y(2) ; ... ; y(l−1) ] = [ C ; C A ; C A² ; ... ; C A^{l−1} ] x(0) + Big Term (based on u(k))        (30)
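Eqn (30) can be turned into a numerical recipe: stack the outputs, subtract the known input-dependent term, and solve for x(0). A Python/NumPy sketch with an arbitrary 2-state discrete-time system chosen for illustration:

```python
import numpy as np

# An arbitrary observable discrete-time system, for illustration
A = np.array([[0.9, 0.2], [0.0, 0.8]])
B = np.array([[1.0], [0.5]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

x0 = np.array([[2.0], [-1.0]])                     # initial condition to be observed
us = [np.array([[v]]) for v in (1.0, -0.5, 0.2, 0.0)]
l = len(us)

# Simulate y(0) .. y(l-1)
x, ys = x0.copy(), []
for k in range(l):
    ys.append(C @ x + D @ us[k])
    x = A @ x + B @ us[k]

# Q = [C; C A; C A^2; ...], Eqn (32)
Q = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(l)])

# The known "big term": y(k) = C A^k x0 + sum_j C A^(k-1-j) B u(j) + D u(k)
big = []
for k in range(l):
    t = D @ us[k]
    for j in range(k):
        t = t + C @ np.linalg.matrix_power(A, k - 1 - j) @ B @ us[j]
    big.append(t)

# Eqn (31): stacked outputs minus big term = Q x(0); solve for x(0)
x0_est = np.linalg.lstsq(Q, np.vstack(ys) - np.vstack(big), rcond=None)[0]
print(x0_est.ravel())   # recovers the initial state [2, -1]
```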

Observability (continued)

It doesn't matter what the big term is, as long as it is known. Write

[ y(0) ; y(1) ; y(2) ; ... ; y(l−1) ] − Big Term (based on u(k)) = Q x(0)        (31)

Q ≜ [ C ; C A ; C A² ; ... ; C A^{l−1} ]        (32)

From Eqn (31), it is clear that if x(0) lies in the row space of Q, then by knowing the left hand side it is possible to determine x(0). It is guaranteed that x(0) lies in the row space of Q if the rank of Q is n.

If the rank of Q = n, then x(0) can be determined from known samples of the input and output; and the system is observable.

## 2.5 Observability, continuous time system

The proof that Q must be full rank for a continuous time system to be observable is omitted. It can be constructed by considering y(t) and its derivatives.

For the RLC circuit example:

Q = [ C ; C A ] = [ −1  0 ; 2/(RC)  −1/C ]        (33)

which is independent of L and full rank for any values of R and C.
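The observation after Eqn (33) — that Q stays full rank regardless of the component values — can be spot-checked numerically. A Python/NumPy sketch (the R, L, C triples below are arbitrary):

```python
import numpy as np

def obsv_rank(R, L, C):
    """Rank of Q = [C; C A] for the RLC circuit, Eqn (33)."""
    A  = np.array([[-2.0/(R*C), 1.0/C],
                   [-1.0/L,     0.0 ]])
    Cm = np.array([[-1.0, 0.0]])
    Q  = np.vstack([Cm, Cm @ A])
    return np.linalg.matrix_rank(Q)

# Full rank everywhere -- including R = sqrt(L/C), where controllability is lost
for R, L, C in [(1.0, 1.0, 1.0), (2.0, 1.0, 1.0), (0.5, 2.0, 3.0)]:
    print(obsv_rank(R, L, C))   # 2 in every case
```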

These tests for controllability and observability do not involve testing any matrix for rank, but rather examine the left eigenvectors (eigenvectors) for orthogonality with B (C).

Left Eigenvectors: Left eigenvectors are row vectors vᵀ which have the property that

vᵀ A = λ vᵀ        (34)

For example,

[ 1  1 ] [ 3  1 ; 0  2 ] = 3 [ 1  1 ]        (35)

so vᵀ = [ 1  1 ] is a left eigenvector of this matrix. The left eigenvectors of A are the (right) eigenvectors of Aᵀ.

## 2.7.1 PBH Test of Controllability

LEMMA (PBH Test of Controllability): An LTI system is not controllable if and only if there exists a left eigenvector vᵀ of A such that vᵀ B = 0ᵀ.

PROOF: The text establishes the PBH test by direct proof. We will see that it is a direct consequence of examining controllability in modal coordinates.

## 2.7.2 PBH Test of Observability

LEMMA (PBH Test of Observability): An LTI system is not observable if and only if there exists a (right) eigenvector v of A such that C v = 0.

PROOF: We will see that it is a direct consequence of examining observability in modal coordinates.
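Since the left eigenvectors of A are the right eigenvectors of Aᵀ, the PBH test is a single eig call. A Python/NumPy sketch using the RLC circuit with L = C = 1 (the tolerance is a numerical judgment call, since the uncontrollable case has a defective A):

```python
import numpy as np

def pbh_uncontrollable(A, B, tol=1e-6):
    """PBH test: True if some left eigenvector v of A satisfies v^T B = 0."""
    _, W = np.linalg.eig(A.T)              # columns of W: left eigenvectors of A
    return any(np.linalg.norm(w @ B) < tol for w in W.T)

# RLC circuit with L = C = 1
A1 = np.array([[-2.0, 1.0], [-1.0, 0.0]])  # R = 1 = sqrt(L/C)
B1 = np.array([[1.0], [1.0]])
A2 = np.array([[-1.0, 1.0], [-1.0, 0.0]])  # R = 2
B2 = np.array([[0.5], [1.0]])

print(pbh_uncontrollable(A1, B1))   # True: uncontrollable
print(pbh_uncontrollable(A2, B2))   # False: controllable
```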

## 2.8 Controllability and Observability in modal coordinates

Recall that we can use the eigenvectors of A to transform the state variable model onto a new basis (modal coordinates) in which the modes are decoupled. The rotation to modal coordinates is given by

V = Evecs(A) ;  T = V⁻¹

Â = T A T⁻¹
B̂ = T B
Ĉ = C T⁻¹

Controllability: For any decoupled mode (all modes in modal coordinates) we must have an input into the mode, in order for the mode to be controllable.

The system is controllable if and only if all elements of the input matrix B̂ are non-zero in modal coordinates.

Observability: For any decoupled mode (all modes in modal coordinates) we must have an output from the mode, in order for the mode to be observable.

The system is observable if and only if all elements of the output matrix Ĉ are non-zero in modal coordinates.

Controllability and Observability in modal coordinates — Example:

Consider the original circuit (figure 3) with L = 1 [H], C = 1 [F] and R = 2 [Ω]. R ≠ sqrt(L/C), so the system is controllable. The state model is given by:

A = [ −1  1 ; −1  0 ] ,  B = [ 0.5 ; 1.0 ] ,  C = [ −1  0 ] ,  D = 1

(Revised: Sep 10, 2012)

The eigenvectors of A are given as:

>> A = [-2/(R*C), (1/C); -(1/L), 0]
>> B = [1/(R*C) ; (1/L)]
>> C = [ -1 0]; D = 1;
>> [V, U] = eig(A)
V =
   0.3536 - 0.6124i   0.3536 + 0.6124i
   0.7071             0.7071

Notice that the eigenvalues and eigenvectors are complex. This is fine. The complex terms will come in complex-conjugate pairs, with the imaginary components canceling in the output.
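The same computation in Python/NumPy — a sketch mirroring the MATLAB steps above for R = 2:

```python
import numpy as np

A = np.array([[-1.0, 1.0], [-1.0, 0.0]])   # R = 2, L = C = 1
B = np.array([[0.5], [1.0]])
C = np.array([[-1.0, 0.0]])

U, V = np.linalg.eig(A)                    # eigenvalues -0.5 +/- j0.866
T = np.linalg.inv(V)

Bhat = T @ B                               # input matrix in modal coordinates
Chat = C @ V                               # output matrix in modal coordinates

print(np.all(np.abs(Bhat) > 1e-9))         # True: every mode receives input
print(np.all(np.abs(Chat) > 1e-9))         # True: every mode reaches the output
```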

Controllability and Observability in modal coordinates — Example (continued)

>> T = inv(V)
T =
   0.0000 + 0.8165i   0.7071 - 0.4082i
  -0.0000 - 0.8165i   0.7071 + 0.4082i

>> Ahat = T * A * inv(T)
Ahat =
  -0.5000 + 0.8660i        0 + 0.0000i
  -0.0000 - 0.0000i  -0.5000 - 0.8660i

>> Bhat = T * B
Bhat =
   0.7071 - 0.0000i
   0.7071 + 0.0000i

>> Chat = C * inv(T)
Chat =
  -0.3536 + 0.6124i  -0.3536 - 0.6124i

B̂ has all non-zero elements, as does Ĉ.

When R = 1 [Ω]:

This is more difficult because of the repeated eigenvalue (the repeated pole at s = −1/(RC)).

It is a coincidence that the pole-zero cancellation occurs at the same value of R as the double pole, and that this A matrix has only 1 eigenvector.

Following the steps in section 4.4 to construct the T matrix (called the modal matrix M by Bay), the transformation is given by (V2 is the generalized eigenvector):

>> VV = [V(:,1), V2]
VV =
   0.7071  -0.3536
   0.7071   0.3536

>> T = inv(VV)
T =
   0.7071   0.7071
  -1.4142   1.4142

>> Ahat = T * A * inv(T)
Ahat =
  -1.0000   1.0000
        0  -1.0000

>> Bhat = T * B
Bhat =
   1.4142
   0.0000

>> Chat = C * inv(T)
Chat =
  -0.7071   0.3536
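The generalized eigenvector V2 can be obtained by solving (A − λI) v2 = v1. A Python/NumPy sketch reproducing the Jordan-form numbers above (least squares handles the singular coefficient matrix):

```python
import numpy as np

A = np.array([[-2.0, 1.0], [-1.0, 0.0]])   # R = 1 = sqrt(L/C)
B = np.array([[1.0], [1.0]])
lam = -1.0                                  # repeated eigenvalue

v1 = np.array([1.0, 1.0]) / np.sqrt(2.0)    # eigenvector of A
v2 = np.linalg.lstsq(A - lam * np.eye(2), v1, rcond=None)[0]   # generalized eigenvector

VV = np.column_stack([v1, v2])
T = np.linalg.inv(VV)

Ahat = T @ A @ VV
Bhat = T @ B

print(np.round(Ahat, 4))   # Jordan block [[-1, 1], [0, -1]]
print(np.round(Bhat, 4))   # [[1.4142], [0]]: zero row, so that mode gets no input
```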

With

ẋ(t) = Â x(t) + B̂ u(t)        (36)

notice:

1. The repeated poles at −1 in Â.
2. B̂ has a zero element (the system is not controllable).
3. Ĉ has all non-zero elements (the system is observable).

A double pendulum will be used to illustrate how uncontrollable or unobservable modes can arise in a physical system.

The example shows that such modes arise with choices of where or how actuators or sensors are placed.

A typical remedy for uncontrollable or unobservable modes is to re-engineer a system to change an actuator or sensor.

The double pendulum, seen in figures 4 and 5:

Has 4 energy storage elements (kinetic energy of each mass, potential energy of each mass);
Is a 4th order system;
Has 2 oscillatory modes.

[Figure: two pendulums of lengths l1 and l2 with masses M1 and M2, coupled by a spring.]

## Figure 4: Double pendulum: example system for unobservable and uncontrollable modes.

Controllability and observability example (continued)

Mode 1: Motion together, θ2 = θ1
(Potential energy, no spring energy)

Mode 2: Motion opposed, θ2 = −θ1
(Potential energy and spring energy, faster oscillation)

The modeling equations are:

M1 l1² θ̈1(t) = τ1(t) = −M1 g l1 θ1 − k (θ1 − θ2) − b θ̇1
M2 l2² θ̈2(t) = τ2(t) = −M2 g l2 θ2 − k (θ2 − θ1) − b θ̇2

With state vector x(t) = [ θ̇1  θ1  θ̇2  θ2 ]ᵀ we get the state variable model:

>> %% Double Pendulum, controllability/observability example
>> %% Setup parameters
>> M1 = 2;  M2 = M1;        % Ball mass
>> l1 = 1;  l2 = l1;        % Link length
>> b  = 0.1;                % Damping factor
>> k  = 20;                 % Spring stiffness
>> M1l2 = M1*l1^2;
>> M2l2 = M2*l2^2;
>> g  = 9.8;                % Gravity constant

>> Ap = [ -b/M1l2  -k/M1l2-g/l1   0        +k/M1l2      ;
           1        0             0         0           ;
           0       +k/M2l2       -b/M2l2   -k/M2l2-g/l2 ;
           0        0             1         0           ]
Ap =
        0  -19.8000         0   10.0000
   1.0000         0         0         0
        0   10.0000         0  -19.8000
        0         0    1.0000         0

>> OLPoles = eig(Ap)
OLPoles =
   0.0000 + 5.4589i    %% Fast mode
   0.0000 - 5.4589i
  -0.0000 + 3.1305i    %% Slow mode
  -0.0000 - 3.1305i

And if we apply an input force on M1 and measure θ2 as illustrated in figure 6, we get the input and output matrices:

>> Bp = [ l1/M1l2; 0; 0; 0 ]    %% Input is force on M1
>> Cp = [ 0 0 0 1 ]             %% Output is theta_2
>> Dp = [ 0 ]

This model gives the step response of figure 7, where both modes are present.

[Figure 6: applied force u(t) on M1, with the sensor measuring θ2(t).]

[Figure 7: step response, amplitude vs. time (sec); both oscillatory modes are visible.]

## 3.1 Testing Controllability and Observability

1. Controllability: construct the controllability matrix

>> %% Controllability matrix
>> Ccontrol = [Bp, Ap*Bp, Ap*Ap*Bp, Ap*Ap*Ap*Bp]
Ccontrol =
    0.5000         0   -9.9000         0
         0    0.5000         0   -9.9000
         0         0    5.0000         0
         0         0         0    5.0000

>> rank(Ccontrol)
ans = 4

2. Observability: construct the observability matrix

>> %% Observability matrix
>> Oobserve = [Cp; Cp*Ap; Cp*Ap*Ap; Cp*Ap*Ap*Ap]
Oobserve =
         0         0         0    1.0000
         0         0    1.0000         0
         0   10.0000         0  -19.8000
   10.0000         0  -19.8000         0

>> rank(Oobserve)
ans = 4
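The rank tests above can be reproduced in Python/NumPy. A sketch with the same parameter values (here b = 0.1 is kept in Ap, which does not change the ranks):

```python
import numpy as np

# Double pendulum parameters, as in the MATLAB setup
M1 = 2.0; l1 = 1.0; b = 0.1; k = 20.0; g = 9.8
Ml2 = M1 * l1**2                     # M2 = M1 and l2 = l1

Ap = np.array([[-b/Ml2, -k/Ml2 - g/l1,  0.0,     k/Ml2        ],
               [ 1.0,    0.0,           0.0,     0.0          ],
               [ 0.0,    k/Ml2,        -b/Ml2,  -k/Ml2 - g/l1 ],
               [ 0.0,    0.0,           1.0,     0.0          ]])
Bp = np.array([[l1/Ml2], [0.0], [0.0], [0.0]])   # force on M1
Cp = np.array([[0.0, 0.0, 0.0, 1.0]])            # measure theta_2

ctrb = np.hstack([np.linalg.matrix_power(Ap, i) @ Bp for i in range(4)])
obsv = np.vstack([Cp @ np.linalg.matrix_power(Ap, i) for i in range(4)])

print(np.linalg.matrix_rank(ctrb))   # 4: controllable
print(np.linalg.matrix_rank(obsv))   # 4: observable
```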

## 3.2 A badly placed sensor

Now let's model a system with a position sensor placed in the middle of the spring (rather than sensing position θ2):

[Figure: applied force u(t) on M1; the sensor output y(t) is now taken at the middle of the spring.]

## Figure 8: Double pendulum with sensor moved to a new location.

The output signal is given by:

y(t) = ( l1 θ1 + l2 θ2 ) / 2 ,   or   Cp1 = [ 0  l1/2  0  l2/2 ]        (37)

>> Cp1 = [ 0 l1/2 0 l2/2 ]
Cp1 = [ 0    0.5000    0    0.5000 ]

The controllability matrix

P = [ Bp  Ap Bp  Ap² Bp  Ap³ Bp ]

is unchanged. The observability matrix:

>> Oobserve1 = [Cp1; Cp1*Ap; Cp1*Ap*Ap; Cp1*Ap*Ap*Ap]
Oobserve1 =
         0    0.5000         0    0.5000
    0.5000         0    0.5000         0
         0   -4.9000         0   -4.9000
   -4.9000         0   -4.9000         0

>> rank(Oobserve1)
ans = 2
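The rank-2 result is easy to reproduce in Python/NumPy. A sketch (same parameters; it is the symmetry M1 = M2, l1 = l2 that hides a mode from this sensor):

```python
import numpy as np

M1 = 2.0; l1 = 1.0; b = 0.1; k = 20.0; g = 9.8
Ml2 = M1 * l1**2                     # symmetric pendulum: M2 = M1, l2 = l1

Ap = np.array([[-b/Ml2, -k/Ml2 - g/l1,  0.0,     k/Ml2        ],
               [ 1.0,    0.0,           0.0,     0.0          ],
               [ 0.0,    k/Ml2,        -b/Ml2,  -k/Ml2 - g/l1 ],
               [ 0.0,    0.0,           1.0,     0.0          ]])
Cp1 = np.array([[0.0, l1/2, 0.0, l1/2]])   # mid-spring position sensor, Eqn (37)

obsv1 = np.vstack([Cp1 @ np.linalg.matrix_power(Ap, i) for i in range(4)])
print(np.linalg.matrix_rank(obsv1))   # 2: the opposed-motion (fast) mode is unobservable
```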

Let's take a look at the poles and zeros.

Original system:

>> Sys0 = ss(Ap, Bp, Cp, Dp);
>> Poles0 = pole(Sys0)
Poles0 =
    0.0000 + 5.4589i    %% Fast mode
    0.0000 - 5.4589i
   -0.0000 + 3.1305i    %% Slow mode
   -0.0000 - 3.1305i
>> Zeros0 = zero(Sys0)
Zeros0 = Empty matrix: 0-by-1    %% No zeros

System with the sensor in the middle of the spring:

>> Sys1 = ss(Ap, Bp, Cp1, Dp);
>> Poles1 = pole(Sys1);    %% Poles1: unchanged
>> Zeros1 = zero(Sys1)
Zeros1 =
    0 + 5.4589i    %% Zeros overlap Fast mode
    0 - 5.4589i

The new sensor position introduces zeros which collide with the fast mode poles. This system is controllable but unobservable.

## 3.2.1 A badly placed actuator

Let's go back to the original system, and rather than applying a force to M1, let's put in a linear motor that acts between M1 and M2:

[Figure: linear motor producing f(t) = ks·va(t) between M1 and M2; applied input u(t); sensor θ2(t) unchanged.]

The new input matrix is:

Bp2 = [ l1/(M1 l1²) ; 0 ; −l2/(M2 l2²) ; 0 ]        (38)

The observability matrix

Q = [ Cp ; Cp Ap ; Cp Ap² ; Cp Ap³ ]

is unchanged.

This system is observable but uncontrollable; take a look at the system in modal coordinates:

>> Bp2
Bp2 =
    0.5000
         0
   -0.5000
         0

>> Ccontrol2 = [Bp2, Ap*Bp2, Ap*Ap*Bp2, Ap*Ap*Ap*Bp2]
Ccontrol2 =
    0.5000         0  -14.9000         0
         0    0.5000         0  -14.9000
   -0.5000         0   14.9000         0
         0   -0.5000         0   14.9000

>> rank(Ccontrol2)
ans = 2

>> %% Rotate to modal coordinates
>> [V, P] = eig(Ap);
>> mpT  = V;
>> mpT1 = inv(mpT);
>> Am = mpT1 * Ap * mpT
     = diag([0+j5.5, 0-j5.5, 0+j3.1, 0-j3.1])

Input matrix of the system with the badly placed actuator:

>> Bm = mpT1 * Bp2
      = [3.6; 3.6; 0; 0]

Output matrix of the system with the badly placed sensor:

>> Cm = Cp1 * mpT
      = [0, 0, 0+j0.22, 0-j0.22]

Let's take a look at the poles and zeros. System with the actuator between the links:

>> Sys2 = ss(Ap, Bp2, Cp, Dp);
>> Zeros2 = zero(Sys2)
Zeros2 =
    0 + 3.1305i    %% Zeros overlap Slow mode
    0 - 3.1305i

The new actuator design introduces zeros which collide with the slow mode poles.
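The dual failure can be reproduced the same way in Python/NumPy. A sketch (with the motor pushing the masses apart, only the opposed-motion mode is excited):

```python
import numpy as np

M1 = 2.0; l1 = 1.0; b = 0.1; k = 20.0; g = 9.8
Ml2 = M1 * l1**2                     # symmetric pendulum: M2 = M1, l2 = l1

Ap = np.array([[-b/Ml2, -k/Ml2 - g/l1,  0.0,     k/Ml2        ],
               [ 1.0,    0.0,           0.0,     0.0          ],
               [ 0.0,    k/Ml2,        -b/Ml2,  -k/Ml2 - g/l1 ],
               [ 0.0,    0.0,           1.0,     0.0          ]])
Bp2 = np.array([[l1/Ml2], [0.0], [-l1/Ml2], [0.0]])   # motor between M1 and M2, Eqn (38)

ctrb2 = np.hstack([np.linalg.matrix_power(Ap, i) @ Bp2 for i in range(4)])
print(np.linalg.matrix_rank(ctrb2))   # 2: the together-motion (slow) mode is uncontrollable
```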

The order of the system is based on the number of storage elements (or 1st-order differential equations). If a pole and a zero cancel, it does not remove an energy storage element from the system: the mode is still in there.

Controllability and observability are properties of the system, not the controller.

There is no way to fix a controllability or observability problem by choosing a different controller. Most often controllability or observability problems are fixed by changing an actuator or a sensor.

Unobservable or uncontrollable modes may not be a problem if they are sufficiently fast and well damped.

## 3.3.1 Uncontrollable and unobservable modes and rocketry

Space launch vehicles are one type of system where controllability and observability problems arise:

Flexible structures
Restricted freedom to place sensors or actuators

The poles and zeros move around vigorously during launch, as the aerodynamic condition changes and mass changes.

For rocketry:

Open-loop unstable mode (without control, the rocket will fall over)
Lightly damped structural modes:
Can be excited by vibrations during launch
Moderate oscillations can (and did!) exceed structural limits.

Transfer functions provide little insight into controllability and observability. State variable models were needed to see what was going on (R.E. Kalman).

This is partly why it takes a Ph.D. to design control for an air or space craft.

Bay also addresses time-varying systems and alternative definitions of Controllability and Observability.

## 4.1 Alternative definitions of Controllability and Observability

We have several alternative definitions for the concepts of Controllability and Observability. In most cases (except as noted), these are equivalent to basic Controllability and Observability for Linear Time-Invariant systems. The differences become more important (and subtle) for nonlinear or time-varying systems.

Controllability:
A system is controllable if there exists a u(t) to drive it from an arbitrary state x0 to the origin.

Reachability:
Can find u(t) to drive from an arbitrary state x0 to any second arbitrary state x1.

LTI systems: Controllability and reachability are equivalent for continuous systems. For discrete systems, there are (defective) discrete systems which are controllable in a trivial way; consider:

x_{k+1} = [ 0  0 ; 0  0 ] x_k + [ 0 ; 0 ] u_k

Reachability is actually the more interesting property, requiring invertibility (full rank) of the controllability matrix.

Stabilizability:
A system is stabilizable if its uncontrollable modes, if any, are open-loop stable. Its controllable modes may be stable or unstable.

## Definitions related to Observability

Observability:
Can the initial state be determined?

Reconstructability:
Can the final state be determined?

Detectability:
A system is detectable if its unobservable modes, if any, are open-loop stable. Its observable modes may be stable or unstable.

Stabilizability and Detectability:
Stabilizability and Detectability relate to whether a closed-loop system can be stabilized. For simple stability (as opposed to good performance), we don't need to observe or control open-loop stable modes. If any unobservable or uncontrollable modes are fast enough and well damped, and not excessively excited by disturbances, we may also be able to achieve good control.
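The controllability/reachability distinction in the defective example is easy to see numerically (a Python/NumPy sketch of the discrete system above):

```python
import numpy as np

A = np.zeros((2, 2))
B = np.zeros((2, 1))

x0 = np.array([[3.0], [-7.0]])
u0 = np.array([[123.0]])              # the input is irrelevant, since B = 0

x1 = A @ x0 + B @ u0
print(x1.ravel())                      # [0. 0.]: any state reaches the origin in one step

Pr = np.hstack([B, A @ B])             # reachability (controllability) matrix
print(np.linalg.matrix_rank(Pr))       # 0: no state other than the origin is reachable
```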

References

[1] R.E. Kalman. On the general theory of control systems. In Proc. 1st Inter. Congress on Automatic Control, pages 481-493. Moscow: IFAC, 1960.

[2] R.E. Kalman, Y.C. Ho, and K.S. Narendra. Controllability of Linear Dynamical Systems, Vol 1. New York: John Wiley and Sons, 1962.