Original title: 263lectures Slides (uploaded by Daniel Baker)



Lecture 1

Overview

course mechanics

some examples


Course mechanics

www.stanford.edu/class/ee263

course requirements:

weekly homework


Prerequisites

control systems

dynamics


Linear dynamical system

dx/dt = A(t)x(t) + B(t)u(t),    y(t) = C(t)x(t) + D(t)u(t)

where:

t ∈ R denotes time

for the time-invariant case (A, B, C, D constant):

ẋ = Ax + Bu,    y = Cx + Du
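The time-invariant equations above can be simulated numerically. A minimal sketch using a forward-Euler update (numpy assumed; the A, B, C, D values below are made-up examples, not from the course):

```python
import numpy as np

# Simulate x' = Ax + Bu, y = Cx + Du with the forward Euler step
# x(t+h) ~ x(t) + h*(A x(t) + B u(t)).  Example: lightly damped oscillator.
A = np.array([[0.0, 1.0], [-1.0, -0.1]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

h, T = 0.01, 10.0
x = np.array([[1.0], [0.0]])          # initial state
ys = []
for _ in range(int(T / h)):
    u = np.array([[0.0]])             # zero input: autonomous response
    ys.append(float(C @ x + D @ u))
    x = x + h * (A @ x + B @ u)

# damping makes the oscillation amplitude shrink over time
print(abs(ys[0]), abs(ys[-1]))
```

The same loop works for any state dimension; only the matrix shapes change.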


Some LDS terminology

- time-invariant: A, B, C, D are constant, i.e., don't depend on t
- autonomous: no input u
- when input & output signals are scalar, system is called single-input, single-output (SISO); when input & output signal dimensions are more than one, MIMO

discrete-time linear dynamical system has the same form, where

t ∈ Z₊ = {0, 1, 2, . . .}

Why study linear systems?

signal processing

communications

economics, finance

circuit analysis, simulation, design

mechanical and civil engineering

aeronautics

navigation, guidance


Usefulness of LDS

- applications depend on availability of computing power, which is increasing exponentially
- used for
  - analysis & design
  - implementation, embedded in real-time systems

Origins and history

. . . ) but with more emphasis on linear algebra

(just like digital signal processing, information theory, . . . )


why study linear systems?

- methods for linear systems often work unreasonably well, in practice, for nonlinear systems
- if you don't understand linear dynamical systems, you certainly can't understand nonlinear dynamical systems

Examples (ideas only, no details)

ẋ = Ax,    y = Cx

typical output:

[figure: two traces of the output y versus t, over 0 ≤ t ≤ 350 and 0 ≤ t ≤ 1000]

output is unpredictable

we'll see that such a solution can be decomposed into much simpler (modal) components

[figure: eight simpler (modal) components of the output, each plotted for 0 ≤ t ≤ 350]

Input design

problem: find appropriate u : R₊ → R² so that y(t) → ydes = (1, −2)

simple approach: consider static conditions (u, x, y constant):

ẋ = 0 = Ax + Bustatic,    y = ydes = Cx

solve for ustatic:

ustatic = −(CA⁻¹B)⁻¹ ydes = (−0.63, 0.36)

let's apply u = ustatic and just wait for things to settle:

[figure: u1, u2, y1, y2 plotted versus t for −200 ≤ t ≤ 1800; the outputs converge slowly to ydes]

using very clever input waveforms (EE263) we can do much better, e.g.:

[figure: u1, u2, y1, y2 versus t for 0 ≤ t ≤ 60; convergence is much faster]

in fact, by using larger inputs we do still better, e.g.:

[figure: u1, u2, y1, y2 versus t for −5 ≤ t ≤ 25; the inputs are larger, and convergence is faster still]

Estimation / filtering

[block diagram: signal u drives H(s), producing w; w is sampled and quantized by an A/D converter, producing y]

[figure: example signals u(t), s(t), w(t), y(t) for 0 ≤ t ≤ 10]

simple approach:

ignore quantization

approximate u as G(s)y


[figure: u(t) (solid) and û(t) (dotted) for 0 ≤ t ≤ 10]

EE263 Autumn 2008-09 Stephen Boyd

Lecture 2

Linear functions and examples

engineering examples

interpretations


Linear equations

consider a system of linear equations

y1 = a11x1 + a12x2 + · · · + a1nxn
y2 = a21x1 + a22x2 + · · · + a2nxn
  ...
ym = am1x1 + am2x2 + · · · + amnxn

can be written in matrix form as y = Ax, where y ∈ R^m collects the yi, A ∈ R^(m×n) has entries aij, and x ∈ R^n collects the xj

Linear functions

a function f : R^n → R^m is linear if

- f(x + y) = f(x) + f(y), for all x, y ∈ R^n
- f(αx) = αf(x), for all x ∈ R^n, α ∈ R

i.e., superposition holds

[figure: parallelogram showing x, y, x + y mapping to f(x), f(y), f(x + y)]

matrix multiplication function f(x) = Ax for some A ∈ R^(m×n) is linear; conversely, for a given linear function f there is only one matrix A for which f(x) = Ax for all x

Interpretations of y = Ax

y = Ax maps x ∈ R^n into y ∈ R^m

Interpretation of aij

yi = Σ_{j=1}^{n} aij xj

aij is the gain factor from the jth input (xj) to the ith output (yi)

thus, e.g.,

- a27 = 0 means the 2nd output (y2) doesn't depend on the 7th input (x7)
- |a52| ≫ |ai2| for i ≠ 5 means x2 affects mainly y5
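The gain interpretation can be checked directly: exciting one input at a time exposes one column of A. A small sketch with a made-up matrix (numpy assumed):

```python
import numpy as np

# aij is the gain from input x_j to output y_i.
# Hypothetical example matrix (not from the slides):
A = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 0.0]])

x = np.zeros(3)
x[1] = 1.0          # excite only the 2nd input
y = A @ x           # response is the 2nd column of A

# a_12 = 0, so y_1 does not depend on x_2
print(y)
```

Here `y[0]` is zero precisely because the (1, 2) gain is zero.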

- the ith row of A concerns the ith output
- the jth column of A concerns the jth input
- the sparsity pattern of A shows which xj affect which yi

example (elastic structure):

- xj is external force applied at some node, in some fixed direction
- yi is (small) deflection of some node, in some fixed direction

[figure: structure with applied forces x1, x2, x3, x4]

aij gives deflection i per unit force at j (in m/N)

Total force/torque on rigid body

[figure: rigid body with applied forces x1, x2, x3, x4; CG marks the center of gravity]

- xj is magnitude of jth applied force
- y ∈ R^6 is resulting total force & torque on body

(y1, y2, y3 are x-, y-, z- components of total force;
y4, y5, y6 are x-, y-, z- components of total torque)

we have y = Ax

A depends on geometry

(of applied forces and torques with respect to center of gravity CG)

jth column gives resulting force & torque for unit force/torque j

interconnection of resistors, linear dependent (controlled) sources, and

independent sources

[circuit diagram: independent sources x1, x2; circuit variables y1, y2, y3; ib labels a controlled-source current]

xj is the value of independent source j; yi is some circuit variable (voltage, current)

we have y = Ax

if xj are currents and yi are voltages, A is called the impedance or

resistance matrix

Final position/velocity of mass due to applied forces

unit mass, zero position/velocity at t = 0, subject to applied force f(t) for 0 ≤ t ≤ n

f(t) = xj for j − 1 ≤ t < j, j = 1, . . . , n

(x is the sequence of applied forces, constant in each interval)

y1, y2 are final position and velocity (i.e., at t = n)

we have y = Ax

a1j gives influence of applied force during j − 1 ≤ t < j on final position

a2j gives influence of applied force during j − 1 ≤ t < j on final velocity

Gravimeter prospecting

[figure: gravimeter measurements over a region divided into voxels]

- gi is the gravitational field at location i; gavg is the nominal field
- yi is measured gravity anomaly at location i, i.e., some component (typically vertical) of gi − gavg
- xj is (excess) mass density of earth in voxel j

y = Ax

A comes from physics and geometry

Thermal system

[figure: region with heating elements and temperature-measurement locations; "location 4" and "heating element 5" are labeled]

xj is power of jth heating element; yi is temperature change at location i

y = Ax

- aij gives influence of heater j at location i (in °C/W)
- jth column of A gives the temperature pattern resulting from 1W at heater j

Illumination with lamps

[figure: lamp j with power xj illuminating patch i; rij is the distance and θij the angle of incidence]

xj is power of jth lamp; yi is illumination level of patch i

y = Ax, where aij = rij⁻² max{cos θij, 0}

(cos ij < 0 means patch i is shaded from lamp j)

jth column of A shows illumination pattern from lamp j

Signal and interference power in wireless system

n transmitter/receiver pairs

transmitter j transmits to receiver j (and, inadvertently, to the other receivers)

pj is power of jth transmitter

si is received signal power of ith receiver

zi is received interference power of ith receiver

Gij is path gain from transmitter j to receiver i

we have s = Ap, z = Bp, where

Gii i = j 0 i=j

aij = bij =

0 i=6 j Gij i 6= j
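The matrices A and B are just the diagonal and off-diagonal parts of the path-gain matrix G. A minimal sketch (numpy assumed; G and p below are made-up values):

```python
import numpy as np

# Signal and interference powers: s = Ap, z = Bp, where A keeps the
# diagonal of the path-gain matrix G and B keeps the off-diagonal part.
G = np.array([[1.0, 0.2, 0.1],
              [0.1, 2.0, 0.1],
              [0.3, 0.3, 3.0]])   # hypothetical path gains
A = np.diag(np.diag(G))           # a_ij = G_ii if i == j, else 0
B = G - A                         # b_ij = G_ij if i != j, else 0

p = np.array([1.0, 0.5, 0.25])    # transmitter powers
s = A @ p                         # received signal powers
z = B @ p                         # received interference powers
print(s, z)
```

From s and z one can then form, e.g., signal-to-interference ratios `s / z`.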

Cost of production

- production inputs (materials, parts, labor, . . . ) are combined to make a number of products
- xj is the price per unit of production input j
- aij is the number of units of production input j required to manufacture one unit of product i
- yi is the production cost per unit of product i

we have y = Ax

- qi is the quantity of product i to be produced; rj is the total quantity of production input j needed

we have r = Aᵀq

total production cost is rᵀx = (Aᵀq)ᵀx = qᵀAx

n flows with rates f1, . . . , fn pass from their source nodes to their

destination nodes over fixed routes in a network

Aij = 1 if flow j goes over link i, and 0 otherwise

link delays and flow latency: with d the vector of link delays, the vector l of flow latencies (total travel time) of the flows is

l = Aᵀd

Linearization

if f : R^n → R^m is differentiable at x0 ∈ R^n, then for x near x0,

f(x) ≈ f(x0) + Df(x0)(x − x0)

where

Df(x0)ij = ∂fi/∂xj evaluated at x0

is the derivative (Jacobian) matrix of f at x0

with deviation variables δx := x − x0, δy := y − y0 (where y0 = f(x0)), we have δy ≈ Df(x0)δx, a linear function of δx

Navigation by range measurement

(x, y) is an unknown position; ranges to beacons at known positions (pi, qi) are measured

[figure: unknown position (x, y) and beacons 1–4 at (p1, q1), . . . , (p4, q4)]

range to beacon i:

ρi(x, y) = √((x − pi)² + (y − qi)²)

linearize around (x0, y0): δρ ≈ A (δx, δy), where

ai1 = (x0 − pi)/√((x0 − pi)² + (y0 − qi)²),    ai2 = (y0 − qi)/√((x0 − pi)² + (y0 − qi)²)

- ith row of A is the unit vector pointing from beacon i toward (x0, y0)
- δx, δy is the (small) shift in (x, y) from (x0, y0); δρ is the resulting change in range measurements
- (x0, y0) could be, e.g., the last estimated position, a short time later

Broad categories of applications

estimation or inversion

control or design

mapping or transformation

Estimation or inversion

y = Ax

xj is jth parameter to be estimated or determined

aij is sensitivity of ith sensor to jth parameter

sample problems:

- find x, given y
- find all x's that result in y (i.e., all x's consistent with the measurements)
- if there is no x such that y = Ax, find x such that y ≈ Ax (i.e., if the sensor readings are inconsistent, find x which is almost consistent)

Control or design

y = Ax

y is vector of results, or outcomes

A describes how input choices affect results

sample problems:

- find all x's that result in y = ydes (i.e., find all designs that meet specifications)
- among x's that satisfy y = ydes, find a small one (i.e., find a small or efficient x that meets specifications)

Mapping or transformation

sample problems:

- (if possible) find an x that maps to a given y
- find all x's that map to a given y
- if there is only one x that maps to y, find it (i.e., decode or undo the mapping)

Matrix multiplication as mixture of columns

write A ∈ R^(m×n) in terms of its columns:

A = [ a1  a2  · · ·  an ]

where aj ∈ R^m

then y = Ax can be written as

y = x1a1 + x2a2 + · · · + xnan

so y is a linear combination, or mixture, of the columns of A, with coefficients given by x

an important example: x = ej, the jth unit vector:

e1 = (1, 0, . . . , 0),   e2 = (0, 1, 0, . . . , 0),   . . . ,   en = (0, . . . , 0, 1)

then Aej = aj, the jth column of A

(ej corresponds to a pure mixture, giving only column j)

Matrix multiplication as inner product with rows

write A in terms of its rows ã1ᵀ, ã2ᵀ, . . . , ãmᵀ, where ãi ∈ R^n

then y = Ax can be written as

y = ( ã1ᵀx, ã2ᵀx, . . . , ãmᵀx )

thus yi = ⟨ãi, x⟩, i.e., yi is the inner product of the ith row of A with x

geometric interpretation: yi = ãiᵀx = α defines a hyperplane in R^n (normal to ãi)

[figure: parallel hyperplanes yi = ⟨ãi, x⟩ = 0, 1, 2, 3, with normal vector ãi]

Block diagram representation

y = Ax can be represented by a signal flow graph or block diagram

e.g. for m = n = 2, we represent

[ y1 ]   [ a11  a12 ] [ x1 ]
[ y2 ] = [ a21  a22 ] [ x2 ]

as

[block diagram: x1 and x2 feed y1 and y2 through the four gains a11, a21, a12, a22]

- aij is the gain along the path from the jth input to the ith output
- the diagram (by not drawing paths with zero gain) shows the sparsity structure of A
  (e.g., diagonal, block upper triangular, arrow . . . )

example: block upper triangular, i.e.,

A = [ A11  A12 ]
    [  0   A22 ]

with x and y partitioned conformably:

x = ( x1, x2 ),   y = ( y1, y2 )

block diagram:

[block diagram: x1 → A11 → y1; x2 → A22 → y2; x2 also feeds y1 through A12; y1 gets no contribution from x2's loop into y2, i.e., y2 doesn't depend on x1]

matrix multiplication: for A ∈ R^(m×n), B ∈ R^(n×p), the product C = AB ∈ R^(m×p) has entries

cij = Σ_{k=1}^{n} aik bkj

composition interpretation: if y = Ax and x = Bz, then y = (AB)z

[block diagram: z ∈ R^p → B → x ∈ R^n → A → y ∈ R^m is equivalent to z → AB → y]
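The composition interpretation can be verified numerically: mapping z through B and then A gives the same result as mapping through the single matrix AB. A sketch (numpy assumed, random example data):

```python
import numpy as np

# y = A(Bz) equals y = (AB)z: matrix product is composition of linear maps.
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))   # A in R^(2x3)
B = rng.standard_normal((3, 4))   # B in R^(3x4)
z = rng.standard_normal(4)

y1 = A @ (B @ z)                  # map z through B, then through A
y2 = (A @ B) @ z                  # single map through the product AB
print(np.allclose(y1, y2))

# entry formula: c_ij = sum_k a_ik b_kj
C = np.array([[sum(A[i, k] * B[k, j] for k in range(3)) for j in range(4)]
              for i in range(2)])
print(np.allclose(C, A @ B))
```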

Column and row interpretations

can write the product C = AB in terms of the columns of B:

C = [ c1  · · ·  cp ] = AB = [ Ab1  · · ·  Abp ]

i.e., the jth column of C is A acting on the jth column of B

similarly, the rows of C are the rows of A times B: ciᵀ = ãiᵀB

entry interpretation: cij = ãiᵀbj = ⟨ãi, bj⟩, the inner product of the ith row of A with the jth column of B

example: the entries of AᵀA are the inner products of the columns of A (gives inner product of each vector with the others)

Matrix multiplication interpretation via paths

[block diagram: x1, x2 feed z1, z2 through gains bij; z1, z2 feed y1, y2 through gains aij]

- aik bkj is the gain of the path from input j to output i that passes through k
- cij sums the gains over all paths from input j to output i
- e.g., the path x1 → z2 → y2 has gain a22b21


Lecture 3

Linear algebra review

change of coordinates


Vector spaces

a vector space (or linear space) consists of:

- a set V
- a vector sum + : V × V → V
- a scalar multiplication : R × V → V
- a distinguished element 0 ∈ V

which satisfy a list of properties, including:

- x + y = y + x, ∀x, y ∈ V (+ is commutative)
- (x + y) + z = x + (y + z), ∀x, y, z ∈ V (+ is associative)
- 0 + x = x, ∀x ∈ V (0 is additive identity)
- 1x = x, ∀x ∈ V

Examples

- V1 = R^n, with standard (componentwise) vector addition and scalar multiplication
- V2 = {0} (where 0 ∈ R^n)
- V3 = span(v1, v2, . . . , vk) = {α1v1 + · · · + αkvk | αi ∈ R}, where v1, . . . , vk ∈ R^n

Subspaces

- a subspace of a vector space is a subset of a vector space which is itself a vector space
- i.e., it is closed under vector addition and scalar multiplication

example (vector spaces of functions): V4 = set of differentiable functions x : R₊ → R^n, where vector sum is sum of functions:

(x + z)(t) = x(t) + z(t)

and scalar multiplication is defined by

(αx)(t) = αx(t)

V5 = {x ∈ V4 | ẋ = Ax}

(points in V5 are trajectories of the linear system ẋ = Ax)

V5 is a subspace of V4

Independent set of vectors

a set of vectors {v1, v2, . . . , vk} is independent if

α1v1 + α2v2 + · · · + αkvk = 0  ⟹  α1 = α2 = · · · = αk = 0

some equivalent conditions:

- the coefficients are uniquely determined: α1v1 + · · · + αkvk = β1v1 + · · · + βkvk implies α1 = β1, α2 = β2, . . . , αk = βk
- no vi can be expressed as a linear combination of the other vectors v1, . . . , vi−1, vi+1, . . . , vk

a set of independent vectors {v1, . . . , vk} is a basis for a vector space V if every v ∈ V can be expressed as v = α1v1 + · · · + αkvk

fact: for a given vector space V, the number of vectors in any basis is the same; this number is called the dimension of V

Nullspace of a matrix

the nullspace of A ∈ R^(m×n) is defined as

N(A) = { x ∈ R^n | Ax = 0 }

N(A) gives the ambiguity in x given y = Ax: if y = Ax and z ∈ N(A), then y = A(x + z) as well

Zero nullspace

A is called one-to-one if N(A) = {0}; equivalent conditions:

- x can always be uniquely determined from y = Ax (i.e., the linear transformation y = Ax doesn't lose information)
- the columns of A are independent
- det(AᵀA) ≠ 0
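The nullspace can be computed numerically from the SVD: right singular vectors corresponding to (near-)zero singular values span N(A). A sketch (numpy assumed; this is one standard technique, not the only one):

```python
import numpy as np

# Orthonormal basis for N(A) from the right singular vectors whose
# singular values are (numerically) zero.
def nullspace(A, tol=1e-10):
    U, s, Vt = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return Vt[rank:].T            # columns span N(A)

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])   # rank 1, so dim N(A) = 2
Z = nullspace(A)
print(Z.shape)                    # one column per nullspace dimension
print(np.allclose(A @ Z, 0))      # every column is mapped to zero
```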

Interpretations of nullspace

suppose z ∈ N(A)

- if y = Ax represents a measurement of x, then x and x + z are indistinguishable from the sensors: Ax = A(x + z)
- if y = Ax represents the output resulting from design x, then x and x + z have the same result

Range of a matrix

the range of A ∈ R^(m×n) is defined as

R(A) = { Ax | x ∈ R^n } ⊆ R^m

i.e., the set of vectors that can be "hit" by the mapping y = Ax

Onto matrices

A is called onto if R(A) = R^m; equivalent conditions:

- the columns of A span R^m
- N(Aᵀ) = {0}
- det(AAᵀ) ≠ 0

Interpretations of range

if y = Ax represents a measurement of x, then a measurement y ∈ R(A) is consistent, while y ∉ R(A) is impossible: the sensors have failed, or the model is wrong

Inverse

A ∈ R^(n×n) is invertible or nonsingular if det A ≠ 0

equivalent conditions:

- there is a matrix A⁻¹ with AA⁻¹ = A⁻¹A = I
- N(A) = {0}
- R(A) = R^n

Interpretations of inverse

- the mapping associated with A⁻¹ undoes the mapping associated with A (when applied either before or after!)
- x = By, with B = A⁻¹, is the unique solution of Ax = y

Dual basis interpretation

let ai denote the columns of A ∈ R^(n×n), and b̃iᵀ the rows of B = A⁻¹; from y = ABy we get

y = Σ_{i=1}^{n} (b̃iᵀy)ai

thus, inner product with the rows of the inverse matrix gives the coefficients in the expansion of a vector in the columns of the matrix

Rank of a matrix

the rank of A ∈ R^(m×n) is defined as rank(A) = dim R(A)

(nontrivial) facts:

- rank(A) = rank(Aᵀ)
- rank(A) is the maximum number of independent columns (or rows) of A; hence rank(A) ≤ min(m, n)
- rank(A) + dim N(A) = n

Conservation of dimension

interpretation of rank(A) + dim N(A) = n: each dimension of the input is either crushed to zero or ends up in the output

roughly speaking:

- n is the number of degrees of freedom in the input x
- dim N(A) is the number of degrees of freedom lost in the mapping from x to y = Ax
- rank(A) is the number of degrees of freedom in the output y

"coding" interpretation of rank: rank(A) = r means A can be factored as A = BC with B ∈ R^(m×r), C ∈ R^(r×n):

[block diagram: x ∈ R^n → A → y ∈ R^m is equivalent to x → C → r "lines" → B → y]

rank(A) = r is the minimum number of scalar intermediate signals ("lines") needed to reconstruct y from x

Application: fast matrix-vector multiplication

- need to compute y = Ax, where A has known factorization A = BC with small r
- computing y directly costs mn operations; computing z = Cx and then y = Bz costs rn + mr = (m + n)r operations, which can be much smaller

Full-rank matrices: A ∈ R^(m×n) is full rank if rank(A) = min(m, n)

- for skinny matrices (m ≥ n), full rank means columns are independent
- for fat matrices (m ≤ n), full rank means rows are independent

Change of coordinates

the "standard" basis vectors in R^n are (e1, e2, . . . , en), where

ei = (0, . . . , 0, 1, 0, . . . , 0)

(1 in ith component)

obviously we have

x = x1e1 + x2e2 + · · · + xnen

i.e., xi are the coordinates of x in the standard basis

if (t1, t2, . . . , tn) is another basis for R^n, we have

x = x̃1t1 + x̃2t2 + · · · + x̃ntn

where x̃i are the coordinates of x in the basis (t1, t2, . . . , tn)

define T = [ t1  t2  · · ·  tn ], so x = Tx̃, hence

x̃ = T⁻¹x

consider the linear transformation y = Ax, A ∈ R^(n×n); express y and x in terms of the new basis:

x = Tx̃,    y = Tỹ

so

ỹ = (T⁻¹AT)x̃

i.e., T⁻¹AT is the matrix of the transformation in the coordinates t1, t2, . . . , tn
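The coordinate-change formulas can be checked numerically: expanding x in the new basis reconstitutes x, and T⁻¹AT acts on new-basis coordinates exactly as A acts on standard coordinates. A sketch (numpy assumed; T and A below are made-up example values):

```python
import numpy as np

# x = T xt, xt = T^{-1} x, and the matrix of y = Ax in the new basis is T^{-1} A T.
T = np.array([[1.0, 1.0],
              [0.0, 1.0]])        # columns t1, t2 form the new basis
A = np.array([[2.0, 0.0],
              [0.0, 3.0]])

x = np.array([2.0, 5.0])
xt = np.linalg.solve(T, x)        # coordinates of x in the basis (t1, t2)
print(np.allclose(T @ xt, x))     # expansion reconstitutes x

At = np.linalg.solve(T, A @ T)    # T^{-1} A T
yt = At @ xt                      # transformation in new coordinates
print(np.allclose(T @ yt, A @ x)) # same transformation, expressed two ways
```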

(Euclidean) norm

for x ∈ R^n we define the (Euclidean) norm as

‖x‖ = √(x1² + x2² + · · · + xn²) = √(xᵀx)

‖x‖ measures the length of the vector (from the origin)

important properties:

- ‖αx‖ = |α|‖x‖ (homogeneity)
- ‖x + y‖ ≤ ‖x‖ + ‖y‖ (triangle inequality)
- ‖x‖ ≥ 0 (nonnegativity)
- ‖x‖ = 0 ⟺ x = 0 (definiteness)

RMS value and (Euclidean) distance

the RMS (root-mean-square) value of x ∈ R^n is

rms(x) = ( (1/n) Σ_{i=1}^{n} xi² )^(1/2) = ‖x‖/√n

the (Euclidean) distance between vectors x and y is ‖x − y‖

Inner product

the inner product of x, y ∈ R^n is

⟨x, y⟩ := x1y1 + x2y2 + · · · + xnyn = xᵀy

important properties:

- ⟨αx, y⟩ = α⟨x, y⟩
- ⟨x + y, z⟩ = ⟨x, z⟩ + ⟨y, z⟩
- ⟨x, y⟩ = ⟨y, x⟩
- ⟨x, x⟩ ≥ 0
- ⟨x, x⟩ = 0 ⟺ x = 0

f(y) = ⟨x, y⟩ is a linear function of y, with linear map given by the row vector xᵀ

Cauchy-Schwarz inequality and angle between vectors

- for any x, y ∈ R^n, |xᵀy| ≤ ‖x‖‖y‖ (Cauchy-Schwarz inequality)
- the (unsigned) angle between x and y is defined as

θ = ∠(x, y) = cos⁻¹( xᵀy / (‖x‖‖y‖) )

- the projection of x onto the line through y is ( xᵀy / ‖y‖² ) y

special cases:

- x and y aligned: θ = 0, xᵀy = ‖x‖‖y‖; (if x ≠ 0) y = αx for some α ≥ 0
- x and y opposed: θ = π, xᵀy = −‖x‖‖y‖; (if x ≠ 0) y = −αx for some α ≥ 0
- x and y orthogonal: θ = ±π/2, xᵀy = 0; denoted x ⊥ y

interpretation of xᵀy > 0 and xᵀy < 0:

- xᵀy > 0 means ∠(x, y) is acute
- xᵀy < 0 means ∠(x, y) is obtuse

the halfspace {x | xᵀy ≤ 0} has boundary passing through 0, normal to y


Lecture 4

Orthonormal sets of vectors and QR

factorization


a set of vectors {u1, . . . , uk} ⊆ R^n is

- normalized if ‖ui‖ = 1, i = 1, . . . , k (the ui are called unit vectors or direction vectors)
- orthogonal if ui ⊥ uj for i ≠ j
- orthonormal if both

orthonormality (like independence) is a property of a set of vectors, not of vectors individually

in terms of U = [u1 · · · uk], orthonormality means

UᵀU = Ik

- orthonormal vectors are independent (multiply α1u1 + α2u2 + · · · + αkuk = 0 by uiᵀ)
- hence they are an orthonormal basis for span(u1, . . . , uk) = R(U)
- warning: if k < n then UUᵀ ≠ I (more on this matrix later . . . )

Geometric properties

suppose the columns of U = [u1 · · · uk] are orthonormal; if w = Uz, then ‖w‖ = ‖z‖, i.e., multiplication by U does not change norm (the mapping is isometric)

inner products are also preserved: ⟨Uz, Uz̃⟩ = ⟨z, z̃⟩

if w = Uz and w̃ = Uz̃, then

⟨w, w̃⟩ = ⟨Uz, Uz̃⟩ = (Uz)ᵀ(Uz̃) = zᵀUᵀUz̃ = ⟨z, z̃⟩

since norms and inner products are preserved, so are angles: ∠(Uz, Uz̃) = ∠(z, z̃)

Orthonormal basis for R^n

suppose u1, . . . , un is an orthonormal basis for R^n; then U = [u1 · · · un] is called orthogonal, and satisfies

UᵀU = I

hence U⁻¹ = Uᵀ, so also UUᵀ = I, i.e.,

Σ_{i=1}^{n} uiuiᵀ = I

Expansion in orthonormal basis

suppose U is orthogonal, so x = UUᵀx, i.e.,

x = Σ_{i=1}^{n} (uiᵀx)ui

- uiᵀx is called the component of x in the direction ui
- a = Uᵀx resolves x into its ui components
- x = Ua = Σ_{i=1}^{n} aiui reconstitutes x from its components; this is called the (ui-) expansion of x

the identity I = UUᵀ = Σ_{i=1}^{n} uiuiᵀ is sometimes written (in physics) as

I = Σ_{i=1}^{n} |ui⟩⟨ui|

since

x = Σ_{i=1}^{n} |ui⟩⟨ui|x⟩

(but we won't use this notation)

Geometric interpretation

if U is orthogonal, the transformation w = Uz preserves norms and angles, so we interpret it as a rotation or reflection of R^n

examples: rotation by θ,

Uθ = [ cos θ  −sin θ ]
     [ sin θ   cos θ ]

and reflection,

Rθ = [ cos θ   sin θ ]
     [ sin θ  −cos θ ]

[figure: action of the rotation (left) and the reflection (right) on the standard basis vectors e1, e2 in the (x1, x2) plane]

Gram-Schmidt procedure

given independent vectors a1, . . . , ak ∈ R^n, the G-S procedure finds orthonormal vectors q1, . . . , qk s.t.

span(a1, . . . , ar) = span(q1, . . . , qr) for r ≤ k

rough idea of method: first orthogonalize each vector w.r.t. the previous ones; then normalize the result to have norm one

- step 1a. q̃1 := a1
- step 1b. q1 := q̃1/‖q̃1‖ (normalize)
- step 2a. q̃2 := a2 − (q1ᵀa2)q1 (remove q1 component from a2)
- step 2b. q2 := q̃2/‖q̃2‖ (normalize)
- step 3a. q̃3 := a3 − (q1ᵀa3)q1 − (q2ᵀa3)q2 (remove q1, q2 components)
- step 3b. q3 := q̃3/‖q̃3‖ (normalize)
- etc.

[figure: geometric view: q̃1 = a1 defines q1; q̃2 = a2 − (q1ᵀa2)q1 is a2 with its q1 component removed]

for i = 1, 2, . . . , k we have

ai = (q1ᵀai)q1 + (q2ᵀai)q2 + · · · + (qi−1ᵀai)qi−1 + ‖q̃i‖qi
   = r1iq1 + r2iq2 + · · · + riiqi

(note that the rij's come right out of the G-S procedure, and rii ≠ 0)
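The steps above translate directly into code; the rij collected along the way form the upper-triangular factor R. A minimal sketch (numpy assumed; classical G-S as described, not the numerically preferred modified variant):

```python
import numpy as np

# Gram-Schmidt on the columns of A: orthogonalize against previous q's,
# then normalize; record r_ji = q_j^T a_i and r_ii = ||qtilde_i||.
def gram_schmidt(A):
    n, k = A.shape
    Q = np.zeros((n, k))
    R = np.zeros((k, k))
    for i in range(k):
        qt = A[:, i].copy()
        for j in range(i):
            R[j, i] = Q[:, j] @ A[:, i]   # component along q_j
            qt -= R[j, i] * Q[:, j]       # remove it
        R[i, i] = np.linalg.norm(qt)      # nonzero since columns independent
        Q[:, i] = qt / R[i, i]            # normalize
    return Q, R

A = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
Q, R = gram_schmidt(A)
print(np.allclose(Q.T @ Q, np.eye(2)))    # orthonormal columns
print(np.allclose(Q @ R, A))              # A = QR
```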

QR decomposition

written in matrix form: A = QR, where A ∈ R^(n×k), Q ∈ R^(n×k), R ∈ R^(k×k):

[ a1  a2  · · ·  ak ] = [ q1  q2  · · ·  qk ] [ r11 r12 · · · r1k ]
                                              [  0  r22 · · · r2k ]
                                              [  :   :   ..   :   ]
                                              [  0   0  · · · rkk ]

- QᵀQ = Ik, and R is upper triangular & invertible
- called the QR decomposition (or factorization) of A
- usually computed using a variation on the Gram-Schmidt procedure which is less sensitive to numerical (rounding) errors

General Gram-Schmidt procedure

- in basic G-S we assume a1, . . . , ak are independent
- if aj is linearly dependent on a1, . . . , aj−1, then q̃j = 0
- modified algorithm: when we encounter q̃j = 0, skip to the next vector and continue:

r = 0;
for i = 1, . . . , k
{
    ã = ai − Σ_{j=1}^{r} qj qjᵀ ai;
    if ã ≠ 0 { r = r + 1; qr = ã/‖ã‖; }
}

on exit,

- q1, . . . , qr is an orthonormal basis for R(A) (hence r = rank(A))
- each ai is a linear combination of the previously generated qj's

in matrix notation we have A = QR with QᵀQ = Ir and R ∈ R^(r×k) in upper staircase form:

[figure: staircase pattern, possibly nonzero entries on and above the staircase, zero entries below]

can permute columns within R to get A = Q[R̃ S]P, where R̃ is upper triangular and invertible, and P is a permutation matrix (which moves forward the columns of A which generated a new q)

Applications

- directly yields an orthonormal basis for R(A)
- can check whether each ai is linearly dependent on the previous ones
- running G-S on a1, . . . , ak yields, along the way, the QR factorizations of [a1 · · · ap] for p = 1, . . . , k

Full QR factorization

with A = Q1R1 the QR factorization as above, write

A = [ Q1  Q2 ] [ R1 ]
               [  0 ]

where [Q1 Q2] is orthogonal, i.e., the columns of Q2 are orthonormal, and orthogonal to Q1

to find Q2:

- find any matrix Ã such that [A Ã] is full rank (e.g., Ã = I)
- apply general Gram-Schmidt to [A Ã]
- Q1 are the orthonormal vectors obtained from the columns of A
- Q2 are the orthonormal vectors obtained from the extra columns (Ã)

i.e., any set of orthonormal vectors can be extended to an orthonormal

basis for Rn

R(Q1) and R(Q2) are called complementary subspaces since

- they are orthogonal (i.e., every vector in the first subspace is orthogonal to every vector in the second subspace)
- their sum is R^n (i.e., every vector in R^n can be expressed as a sum of two vectors, one from each subspace)

this is written

R(Q1) ⊕ R(Q2) = R^n

(each subspace is the orthogonal complement of the other)

from Aᵀ = [R1ᵀ 0][Q1 Q2]ᵀ we see that

Aᵀz = 0  ⟺  Q1ᵀz = 0  ⟺  z ∈ R(Q2)

so R(Q2) = N(Aᵀ)

(in fact the columns of Q2 are an orthonormal basis for N(Aᵀ))

we conclude: R(A) and N(Aᵀ) are complementary subspaces:

- R(A) ⊕ N(Aᵀ) = R^n (recall A ∈ R^(n×k))
- every y ∈ R^n can be written uniquely as y = z + w, with z ∈ R(A), w ∈ N(Aᵀ) (we'll soon see what the vector z is . . . )
- can now prove most of the assertions from the linear algebra review lecture
- applying the same argument to Aᵀ gives the decomposition N(A) ⊕ R(Aᵀ) = R^k


Lecture 5

Least-squares

least-squares estimation

BLUE property


Overdetermined linear equations

consider y = Ax where A ∈ R^(m×n) is skinny, i.e., m > n (more equations than unknowns)

- for most y, cannot solve for x
- one approach: define the residual or error r = Ax − y, and find x = xls that minimizes ‖r‖

xls is called the least-squares (approximate) solution of y = Ax


Geometric interpretation

[figure: y, its projection Axls onto the plane R(A), and the residual r, orthogonal to R(A)]

to find xls, we'll minimize the norm of the residual squared:

‖r‖² = xᵀAᵀAx − 2yᵀAx + yᵀy

setting the gradient with respect to x to zero yields the normal equations AᵀAx = Aᵀy; our assumptions imply AᵀA invertible, so we have

xls = (AᵀA)⁻¹Aᵀy

- xls is a linear function of y
- A† := (AᵀA)⁻¹Aᵀ is a left inverse of (full rank, skinny) A:

A†A = (AᵀA)⁻¹AᵀA = I

Projection on R(A)

Axls is (by definition) the point in R(A) that is closest to y, i.e., it is the

projection of y onto R(A)

Axls = PR(A)(y)


Orthogonality principle

the optimal residual r = Axls − y is orthogonal to R(A):

⟨r, Az⟩ = 0 for all z ∈ R^n

[figure: y, Axls, and the residual r, orthogonal to the plane R(A)]

Completion of squares

since r = Axls − y is orthogonal to A(x − xls) for any x, we have

‖Ax − y‖² = ‖Axls − y‖² + ‖A(x − xls)‖²

which shows that x ≠ xls gives a strictly larger residual

Least-squares via QR factorization

factor A = QR with QᵀQ = In and R upper triangular, invertible

pseudo-inverse is

(AᵀA)⁻¹Aᵀ = (RᵀQᵀQR)⁻¹RᵀQᵀ = R⁻¹Qᵀ

so xls = R⁻¹Qᵀy
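Both routes to xls can be compared numerically; with well-conditioned data they agree, and the QR route avoids forming AᵀA. A sketch (numpy assumed, random example data):

```python
import numpy as np

# Least-squares two ways: normal equations and QR; both give x_ls.
rng = np.random.default_rng(1)
A = rng.standard_normal((10, 3))            # skinny, full rank (generically)
y = rng.standard_normal(10)

x_ne = np.linalg.solve(A.T @ A, A.T @ y)    # (A^T A)^{-1} A^T y
Q, R = np.linalg.qr(A)                      # thin QR: Q^T Q = I, R triangular
x_qr = np.linalg.solve(R, Q.T @ y)          # R^{-1} Q^T y
print(np.allclose(x_ne, x_qr))

# orthogonality principle: the optimal residual is orthogonal to R(A)
r = A @ x_qr - y
print(np.allclose(A.T @ r, 0))
```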

full QR factorization:

A = [Q1 Q2] [ R1 ]
            [  0 ]

with [Q1 Q2] ∈ R^(m×m) orthogonal, R1 ∈ R^(n×n) upper triangular, invertible

multiplication by an orthogonal matrix doesn't change the norm, so

‖Ax − y‖² = ‖ [Q1 Q2][R1; 0]x − y ‖²
          = ‖ [Q1 Q2]ᵀ[Q1 Q2][R1; 0]x − [Q1 Q2]ᵀy ‖²
          = ‖ [ R1x − Q1ᵀy ; −Q2ᵀy ] ‖²
          = ‖R1x − Q1ᵀy‖² + ‖Q2ᵀy‖²

- this is evidently minimized by the choice xls = R1⁻¹Q1ᵀy (which makes the first term zero)
- the residual with optimal x is Axls − y = −Q2Q2ᵀy

Least-squares estimation

many estimation problems have the form

y = Ax + v

- x is what we want to estimate or reconstruct
- y is our sensor measurement(s)
- v is an unknown noise or measurement error (assumed small)

least-squares estimation: choose as estimate the x̂ that minimizes

‖Ax̂ − y‖

i.e., the deviation between what we actually observed (y) and what we would observe if x = x̂, and there were no noise (v = 0)

the least-squares estimate is just x̂ = (AᵀA)⁻¹Aᵀy

BLUE property

linear measurement with noise:

y = Ax + v

with A full rank, skinny; consider a linear estimator of the form x̂ = By

- the estimator is called unbiased if x̂ = x whenever v = 0 (i.e., no estimation error when there is no noise)
- same as BA = I, i.e., B is a left inverse of A

the estimation error of an unbiased linear estimator is

x̂ − x = B(Ax + v) − x = Bv

fact: A† = (AᵀA)⁻¹Aᵀ is the smallest left inverse of A, in the following sense: for any B with BA = I,

Σ_{i,j} Bij²  ≥  Σ_{i,j} A†ij²

i.e., least-squares gives the best linear unbiased estimator (BLUE)

Example: navigation from range measurements

[figure: unknown position x and directions k1, . . . , k4 to four beacons]

the unknown position x ∈ R² is estimated from range measurements to the beacons; the measurements are (say) nearly exact

ranges y ∈ R⁴ measured, with measurement noise v:

y = [ k1ᵀ ]
    [ k2ᵀ ] x + v
    [ k3ᵀ ]
    [ k4ᵀ ]

(details not important)

measurement is y = (11.95, 2.84, 9.81, 2.81)

Just enough measurements method

compute estimate x̂ by inverting the top (2 × 2) half of A:

x̂ = Bje y = [ 0     1.0  0  0 ] y = [ 2.84 ]
            [ 1.12  0.5  0  0 ]     [ 11.9 ]

(i.e., only the first two measurements are used)

Least-squares method

compute estimate x̂ by least-squares:

x̂ = A†y = [ 0.23  0.48  0.04  0.44 ] y = [ 4.95  ]
          [ 0.47  0.02  0.51  0.18 ]     [ 10.26 ]

(i.e., all four measurements are used)

Example: signal reconstruction

[block diagram: u → H(s) → w → A/D → y]

- signal u is piecewise constant, with unit intervals:
  u(t) = xj,  j − 1 ≤ t < j,  j = 1, . . . , 10
- u is filtered, with impulse response h:
  w(t) = ∫₀ᵗ h(t − τ)u(τ) dτ

- w is sampled: ỹi = w(0.1i), i = 1, . . . , 100
- 3-bit quantization: yi = Q(ỹi), i = 1, . . . , 100, where Q is the 3-bit quantizer characteristic

example:

[figure: u(t), impulse response s(t), filtered signal w(t), and quantized samples y(t) overlaid on w(t), for 0 ≤ t ≤ 10]

we have y = Ax + v, where

- A ∈ R^(100×10) is given by Aij = ∫_{j−1}^{j} h(0.1i − τ) dτ
- v is the quantization error: vi = Q(ỹi) − ỹi (so |vi| ≤ 0.125)

least-squares estimate: xls = (AᵀA)⁻¹Aᵀy

[figure: u(t) (solid) and û(t) (dotted) for 0 ≤ t ≤ 10]

RMS error is ‖x − xls‖/√10 = 0.03

[figure: rows 2, 5, and 8 of the estimator matrix, plotted versus t for 0 ≤ t ≤ 10]

the ith row of the estimator shows how the measurements are combined to form the estimate of xi, here for i = 2, 5, 8

to estimate x5, which is the original input signal for 4 ≤ t < 5, we mostly use y(t) for 3 ≤ t ≤ 7


Lecture 6

Least-squares applications

system identification


Least-squares data fitting

we are given:

- functions f1, . . . , fn : S → R, called regressors or basis functions
- data or measurements (si, gi), i = 1, . . . , m, where si ∈ S and (usually) m ≫ n

problem: find coefficients x1, . . . , xn so that x1f1(si) + · · · + xnfn(si) ≈ gi

least-squares fit: choose x to minimize the total square fitting error:

Σ_{i=1}^{m} (x1f1(si) + · · · + xnfn(si) − gi)²

- using matrix notation, the total square fitting error is ‖Ax − g‖², where Aij = fj(si)
- hence, the least-squares fit is x = (AᵀA)⁻¹Aᵀg (assuming A skinny, full rank)
- the corresponding fitted function is f(s) = x1f1(s) + · · · + xnfn(s)

applications:

- interpolation, extrapolation, smoothing of data
- developing simple, approximate model of data

Least-squares polynomial fitting

problem: fit a polynomial of degree < n,

p(t) = a0 + a1t + · · · + an−1tⁿ⁻¹,

to data (ti, yi), i = 1, . . . , m

basis functions are fj(t) = tʲ⁻¹, so the matrix A has entries Aij = tiʲ⁻¹ (a Vandermonde matrix):

A = [ 1  t1  t1²  · · ·  t1ⁿ⁻¹ ]
    [ 1  t2  t2²  · · ·  t2ⁿ⁻¹ ]
    [ :   :               :   ]
    [ 1  tm  tm²  · · ·  tmⁿ⁻¹ ]

assuming tk ≠ tl for k ≠ l and m ≥ n, A is full rank:

- suppose Aa = 0
- then the polynomial p(t) = a0 + a1t + · · · + an−1tⁿ⁻¹ vanishes at the m points t1, . . . , tm
- but a polynomial of degree less than n cannot have n or more distinct zeros, so p is identically zero, and a = 0
- hence the columns of A are independent

Example

least-squares fit for degrees 1, 2, 3, 4 have RMS errors .135, .076, .025,

.005, respectively


[figure: least-squares polynomial fits p1(t), p2(t), p3(t), p4(t) to the data, plotted for 0 ≤ t ≤ 1]
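The polynomial fits above come from exactly the Vandermonde least-squares setup; a sketch with synthetic data (numpy assumed; the data here is made up, not the data from the slides):

```python
import numpy as np

# Least-squares polynomial fit via a Vandermonde matrix A_ij = t_i^(j-1).
rng = np.random.default_rng(2)
t = np.linspace(0.0, 1.0, 30)
g = np.sin(4 * t) + 0.05 * rng.standard_normal(30)   # synthetic data

def poly_fit_values(t, g, n):
    A = np.vander(t, n, increasing=True)   # columns 1, t, ..., t^(n-1)
    x, *_ = np.linalg.lstsq(A, g, rcond=None)
    return A @ x                           # fitted values at the data points

# RMS fit error is non-increasing as the degree grows (nested bases)
rms = [np.sqrt(np.mean((poly_fit_values(t, g, n) - g) ** 2))
       for n in range(1, 6)]
print(rms)
```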

Growing sets of regressors

consider the family of least-squares problems

minimize ‖ Σ_{i=1}^{p} xi ai − y ‖

for p = 1, . . . , n (the ai are called regressors)

- approximate y by a linear combination of a1, . . . , ap
- equivalently: regress y on a1, . . . , ap
- as p increases, the fit improves, so the optimal residual decreases

the solution for each p ≤ n is given by

xls⁽ᵖ⁾ = (ApᵀAp)⁻¹Apᵀy = Rp⁻¹Qpᵀy

where

- Ap = [a1 · · · ap] is the matrix of the first p regressors
- Ap = QpRp is its QR factorization, which comes for free from that of A

plot of the optimal residual of fitting y with a linear combination of a1, . . . , ap, as a function of p:

[figure: ‖residual‖ versus p, for p = 0, 1, . . . , 7; at p = 0 the residual is ‖y‖, at p = 1 it is min over x1 of ‖x1a1 − y‖, and at p = 7 it is min over x1, . . . , x7 of ‖Σ_{i=1}^{7} xi ai − y‖]

Least-squares system identification

we fit a model of an unknown system, based on measured I/O data u, y

example with scalar u, y (vector u, y readily handled): fit I/O data with a moving-average (MA) model with n delays

ŷ(t) = h0u(t) + h1u(t − 1) + · · · + hnu(t − n)

where h0, . . . , hn ∈ R

in matrix form, the predicted outputs are

[ ŷ(n)   ]   [ u(n)    u(n − 1)  · · ·  u(0)     ] [ h0 ]
[ ŷ(n+1) ] = [ u(n+1)  u(n)      · · ·  u(1)     ] [ h1 ]
[  :     ]   [  :                         :      ] [ :  ]
[ ŷ(N)   ]   [ u(N)    u(N − 1)  · · ·  u(N − n) ] [ hn ]

least-squares identification: choose the model (i.e., h) that minimizes the norm of the model prediction error ‖e‖, where e = y − ŷ
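Fitting the MA taps is then an ordinary least-squares problem over the stacked data matrix. A sketch with synthetic data (numpy assumed; the taps and signals below are made up):

```python
import numpy as np

# Fit yhat(t) = h0 u(t) + ... + hn u(t-n) by least-squares.
rng = np.random.default_rng(3)
N, n = 200, 3
u = rng.standard_normal(N + 1)
h_true = np.array([0.5, -0.3, 0.2, 0.1])      # hypothetical "true" taps

# rows [u(t), u(t-1), ..., u(t-n)] for t = n, ..., N
U = np.array([u[t - n:t + 1][::-1] for t in range(n, N + 1)])
y = U @ h_true + 0.01 * rng.standard_normal(U.shape[0])   # noisy output

h, *_ = np.linalg.lstsq(U, y, rcond=None)
print(np.round(h, 2))
```

With low noise the recovered taps land close to the generating ones.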

Example

[figure: measured input u(t) and output y(t) for 0 ≤ t ≤ 70]

for n = 7 we obtain MA model with

(h0, . . . , h7) = (.024, .282, .418, .354, .243, .487, .208, .441)

[figure: actual output y(t) (solid) and model prediction ŷ(t) (dashed), for 0 ≤ t ≤ 70]

Model order selection

obviously the larger n, the smaller the prediction error on the data used

to form the model

suggests using largest possible model order for smallest prediction error

[figure: relative prediction error ‖e‖/‖y‖ versus model order n, for n = 0, . . . , 50]

difficulty: for n too large the predictive ability of the model on other I/O

data (from the same system) becomes worse

Cross-validation

evaluate model predictive performance on another I/O data set not used to

develop model

[figure: validation I/O data ū(t) and ȳ(t) for 0 ≤ t ≤ 70]

now evaluate the models (developed from the modeling data) on the validation data:

[figure: relative prediction error versus n, on modeling data and on validation data; the validation error stops improving, then degrades, for large n]

for n = 50 the actual and predicted outputs on system identification and

model validation data are:

[figure: for n = 50, actual y(t) (solid) and predicted ŷ(t) (dashed), on identification data (top) and on validation data (bottom); the fit is excellent on identification data but poor on validation data]

loss of predictive ability when n is too large is called overmodeling

Growing sets of measurements

least-squares problem in "row" form:

minimize ‖Ax − y‖² = Σ_{i=1}^{m} (aiᵀx − yi)²

where the aiᵀ are the rows of A (ai ∈ R^n)

- each pair (ai, yi) corresponds to one measurement
- the solution is

xls = ( Σ_{i=1}^{m} aiaiᵀ )⁻¹ Σ_{i=1}^{m} yiai

- suppose the measurements (ai, yi) become available sequentially, i.e., m increases with time

Recursive least-squares

we can compute xls(m) = ( Σ_{i=1}^{m} aiaiᵀ )⁻¹ Σ_{i=1}^{m} yiai recursively:

- initialize P(0) = 0 ∈ R^(n×n), q(0) = 0 ∈ R^n
- for m = 0, 1, . . . ,
  P(m + 1) = P(m) + am+1am+1ᵀ,    q(m + 1) = q(m) + ym+1am+1
- if P(m) is invertible, xls(m) = P(m)⁻¹q(m)
- P(m) is invertible once a1, . . . , am span R^n (so, once P(m) becomes invertible, it stays invertible)

fast update: we can compute P(m + 1)⁻¹ from P(m)⁻¹ with the rank one update formula

( P + aaᵀ )⁻¹ = P⁻¹ − (1 / (1 + aᵀP⁻¹a)) (P⁻¹a)(P⁻¹a)ᵀ

- this gives an O(n²) method for updating the inverse
- standard methods for computing P(m + 1)⁻¹ from P(m + 1) are O(n³)
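The recursion and rank-one update can be exercised directly; here the inverse is maintained and updated in O(n²) per measurement. A sketch (numpy assumed; the large initial P⁻¹ acts as a tiny regularizer so the inverse exists from the start, an implementation convenience not in the slides):

```python
import numpy as np

# Recursive least-squares with the rank-one inverse update.
rng = np.random.default_rng(4)
n = 3
x_true = rng.standard_normal(n)               # parameter to recover

Pinv = 1e6 * np.eye(n)    # P^{-1} for P = 1e-6 I (tiny regularization)
q = np.zeros(n)
for _ in range(100):
    a = rng.standard_normal(n)                # measurement vector
    y = a @ x_true                            # noiseless measurement
    Pa = Pinv @ a
    Pinv = Pinv - np.outer(Pa, Pa) / (1.0 + a @ Pa)   # (P + a a^T)^{-1}
    q = q + y * a

x_hat = Pinv @ q
print(np.round(x_hat - x_true, 6))            # estimation error
```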

Verification of rank one update formula

(P + aaᵀ) ( P⁻¹ − (1 / (1 + aᵀP⁻¹a)) (P⁻¹a)(P⁻¹a)ᵀ )

  = I + aaᵀP⁻¹ − (1 / (1 + aᵀP⁻¹a)) P(P⁻¹a)(P⁻¹a)ᵀ − (1 / (1 + aᵀP⁻¹a)) aaᵀ(P⁻¹a)(P⁻¹a)ᵀ

  = I + aaᵀP⁻¹ − (1 / (1 + aᵀP⁻¹a)) aaᵀP⁻¹ − (aᵀP⁻¹a / (1 + aᵀP⁻¹a)) aaᵀP⁻¹

  = I


Lecture 7

Regularized least-squares and Gauss-Newton

method

multi-objective least-squares

regularized least-squares

nonlinear least-squares

Gauss-Newton method


Multi-objective least-squares

in many problems we have two (or more) objectives, e.g.,

- we want J1 = ‖Ax − y‖² small
- and also J2 = ‖Fx − g‖² small

(x ∈ R^n is the variable)

usually the objectives are competing: we can make one smaller, at the expense of making the other larger

Plot of achievable objective pairs

[figure: shaded region of achievable objective pairs in the (J2, J1) plane, with points labeled x(1), x(2), x(3); the lower-left boundary is the optimal trade-off curve]

note that x ∈ R^n, but this plot is in R²; the point labeled x(1) is really the pair (J2(x(1)), J1(x(1))) (for the two objectives ‖Ax − y‖², ‖Fx − g‖²)

Weighted-sum objective

to find points on the optimal trade-off curve, minimize the weighted-sum objective

J1 + μJ2 = ‖Ax − y‖² + μ‖Fx − g‖²

where the parameter μ ≥ 0 sets the relative weight between J1 and J2

sets of points with constant weighted sum, J1 + μJ2 = α, correspond to a line with slope −μ on the (J2, J1) plot

[figure: objective-pair region with the level line J1 + μJ2 = α tangent to the trade-off curve at the minimizing point]

Minimizing weighted-sum objective

can express the weighted-sum objective as an ordinary least-squares objective:

‖Ax − y‖² + μ‖Fx − g‖² = ‖ Ãx − ỹ ‖²

where

Ã = [  A  ],   ỹ = [  y  ]
    [ √μF ]        [ √μg ]

hence the solution is (assuming Ã full rank)

x = (ÃᵀÃ)⁻¹Ãᵀỹ = (AᵀA + μFᵀF)⁻¹(Aᵀy + μFᵀg)

Example

- the two objectives are J1 = (aᵀx − 1)² and J2 = ‖x‖² (see the trade-off curve below)
- weighted-sum objective: (aᵀx − 1)² + μ‖x‖²

optimal x:

x = (aaᵀ + μI)⁻¹ a

optimal trade-off curve:

[figure: J1 = (y − 1)² versus J2 = ‖x‖² (horizontal axis scaled by 10⁻³)]

Regularized least-squares

when F = I, g = 0, the minimizer of the weighted-sum objective,

x = (AᵀA + μI)⁻¹Aᵀy,

is called the regularized least-squares (approximate) solution of Ax ≈ y (also known as Tykhonov regularization)

estimation/inversion application:

- Ax − y is the sensor residual
- regularization expresses prior knowledge that x is small, or that the model is only accurate for x small
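The regularized solution can be computed either from its closed form or from the stacked least-squares formulation, and larger μ shrinks x. A sketch (numpy assumed, random example data):

```python
import numpy as np

# Regularized LS: x = (A^T A + mu I)^{-1} A^T y, via stacking [A; sqrt(mu) I].
rng = np.random.default_rng(5)
A = rng.standard_normal((8, 4))
y = rng.standard_normal(8)

def reg_ls(A, y, mu):
    m, n = A.shape
    At = np.vstack([A, np.sqrt(mu) * np.eye(n)])   # stacked matrix
    yt = np.concatenate([y, np.zeros(n)])
    x, *_ = np.linalg.lstsq(At, yt, rcond=None)
    return x

# stacked solve matches the closed form
x_cf = np.linalg.solve(A.T @ A + 2.0 * np.eye(4), A.T @ y)
print(np.allclose(reg_ls(A, y, 2.0), x_cf))

# larger mu trades fit for a smaller x
x_small_mu, x_big_mu = reg_ls(A, y, 1e-8), reg_ls(A, y, 1e3)
print(np.linalg.norm(x_small_mu), np.linalg.norm(x_big_mu))
```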

Nonlinear least-squares

nonlinear least-squares (NLLS) problem: find x ∈ R^n that minimizes

‖r(x)‖² = Σ_{i=1}^{m} ri(x)²,

where r : R^n → R^m

- r(x) is a vector of residuals
- reduces to (linear) least-squares if r(x) = Ax − y

Position estimation from ranges

estimate position x ∈ R² from approximate distances to beacons at locations b1, . . . , bm ∈ R², without linearizing

- we measure ρi = ‖x − bi‖ + vi (vi is range error, unknown but assumed small)
- NLLS estimate: choose x̂ to minimize

Σ_{i=1}^{m} ri(x)² = Σ_{i=1}^{m} (ρi − ‖x − bi‖)²

Gauss-Newton method for NLLS

NLLS: find x ∈ R^n that minimizes ‖r(x)‖² = Σ_{i=1}^{m} ri(x)², where r : R^n → R^m

- in general, very hard to solve exactly

many good heuristics to compute locally optimal solution

Gauss-Newton method:

repeat

linearize r near current guess

new guess is linear LS solution, using linearized r

until convergence

Gauss-Newton method (more detail):

- linearize r near the current iterate x(k):
  r(x) ≈ r(x(k)) + Dr(x(k))(x − x(k))
- write the linearized approximation as A(k)x − b(k), where
  A(k) = Dr(x(k)),    b(k) = Dr(x(k))x(k) − r(x(k))
- at the kth iteration, we approximate the NLLS problem by the linear LS problem:
  ‖r(x)‖² ≈ ‖A(k)x − b(k)‖²
- the next iterate solves this linearized LS problem:
  x(k+1) = (A(k)ᵀA(k))⁻¹A(k)ᵀb(k)
- repeat until convergence (which isn't guaranteed)
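For the range-measurement problem the Jacobian rows are unit vectors, so each Gauss-Newton step is a tiny linear least-squares solve. A sketch (numpy assumed; beacon layout, true position, initial guess, and noise level are made-up example values):

```python
import numpy as np

# Gauss-Newton for r_i(x) = rho_i - ||x - b_i|| (range-based localization).
rng = np.random.default_rng(6)
beacons = rng.uniform(-5, 5, size=(10, 2))     # hypothetical beacon layout
x_true = np.array([3.6, 3.2])
rho = np.linalg.norm(x_true - beacons, axis=1) + 0.05 * rng.standard_normal(10)

def residual(x):
    return rho - np.linalg.norm(x - beacons, axis=1)

x = np.array([1.2, 1.2])                       # initial guess
obj0 = np.linalg.norm(residual(x))
for _ in range(10):
    d = x - beacons                            # rows x - b_i
    dist = np.linalg.norm(d, axis=1)
    J = -d / dist[:, None]                     # dr_i/dx = -(x - b_i)/||x - b_i||
    step, *_ = np.linalg.lstsq(J, -residual(x), rcond=None)
    x = x + step                               # linearized LS step

obj = np.linalg.norm(residual(x))
print(np.round(x, 2), obj0, obj)
```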

Gauss-Newton example

10 beacons

+ true position (3.6, 3.2); initial guess (1.2, 1.2)

range estimates accurate to 0.5

[figure: the 10 beacon locations and the true position, in the square −5 ≤ x1, x2 ≤ 5]

[figure: surface plot of the NLLS objective ‖r(x)‖² over the square −5 ≤ x1, x2 ≤ 5]

bumps in objective due to strong nonlinearity of r

objective of Gauss-Newton iterates:

[figure: ‖r(x)‖² versus iteration, for iterations 1 through 10]

- final estimate is x̂ = (3.3, 3.3)
- estimation error is ‖x̂ − x‖ = 0.31

(substantially smaller than range accuracy!)

convergence of Gauss-Newton iterates:

[figure: iterates 1 through 6 plotted in the plane, converging to the final estimate]

useful variation on Gauss-Newton: add a regularization term,

‖A(k)x − b(k)‖² + μ‖x − x(k)‖²

so that the next iterate is not too far from the previous one (hence, the linearized model is still pretty accurate)


Lecture 8

Least-norm solutions of underdetermined
equations


we consider

y = Ax

where A ∈ R^(m×n) is fat (m < n), i.e., there are more variables than equations, so x is underspecified

we'll assume that A is full rank (m), so for each y ∈ R^m, there is a solution; the set of all solutions has the form

{ x | Ax = y } = { xp + z | z ∈ N(A) }

where xp is any (particular) solution

- z characterizes the available choices in the solution
- the solution set has dim N(A) = n − m "degrees of freedom"

Least-norm solution

one particular solution is

xln = Aᵀ(AAᵀ)⁻¹y

(AAᵀ is invertible since A is full rank); in fact, xln solves the problem

minimize ‖x‖
subject to Ax = y

to see this, suppose Ax = y, so A(x − xln) = 0, and

(x − xln)ᵀxln = (A(x − xln))ᵀ(AAᵀ)⁻¹y = 0

i.e., (x − xln) ⊥ xln, so ‖x‖² = ‖xln‖² + ‖x − xln‖² ≥ ‖xln‖²

[figure: solution set { x | Ax = y }, parallel to N(A) = { x | Ax = 0 }; xln is the point of the solution set closest to 0, orthogonal to N(A)]

- A† = Aᵀ(AAᵀ)⁻¹ is called the pseudo-inverse of full rank, fat A, and is a right inverse of A
- (for full rank, skinny A, the pseudo-inverse is A† = (AᵀA)⁻¹Aᵀ)

via QR factorization: Aᵀ = QR with Q ∈ R^(n×m), QᵀQ = Im; then

xln = QR⁻ᵀy,    ‖xln‖ = ‖R⁻ᵀy‖

Derivation via Lagrange multipliers

the least-norm solution solves

minimize xᵀx
subject to Ax = y

introduce the Lagrangian L(x, λ) = xᵀx + λᵀ(Ax − y); optimality conditions are

∇xL = 2x + Aᵀλ = 0,    ∇λL = Ax − y = 0

from the first condition, x = −Aᵀλ/2; substitute into the second to get λ = −2(AAᵀ)⁻¹y, hence x = Aᵀ(AAᵀ)⁻¹y

example: find the least norm force that transfers a mass a unit distance with zero final velocity, i.e., y = (1, 0)

[figure: least-norm force xln(t), the resulting position, and the resulting velocity, for 0 ≤ t ≤ 12; position moves from 0 to 1, and velocity returns to 0]

relation to regularized least-squares: define J1 = ‖Ax − y‖², J2 = ‖x‖²

- the least-norm solution minimizes J2 with J1 = 0
- the minimizer of the weighted-sum objective J1 + μJ2 = ‖Ax − y‖² + μ‖x‖² is

xμ = (AᵀA + μI)⁻¹Aᵀy

- fact: xμ converges to the least-norm solution as μ → 0

in matrix terms: as μ → 0,

(AᵀA + μI)⁻¹Aᵀ → Aᵀ(AAᵀ)⁻¹

(for full rank, fat A)

General norm minimization with equality constraints

consider the problem

minimize ‖Ax − b‖
subject to Cx = d

with variable x; equivalent to

minimize (1/2)‖Ax − b‖²
subject to Cx = d

the Lagrangian is

L(x, λ) = (1/2)‖Ax − b‖² + λᵀ(Cx − d)
        = (1/2)xᵀAᵀAx − bᵀAx + (1/2)bᵀb + λᵀCx − λᵀd

optimality conditions are

∇xL = AᵀAx − Aᵀb + Cᵀλ = 0,    ∇λL = Cx − d = 0

write in block matrix form as

[ AᵀA  Cᵀ ] [ x ]   [ Aᵀb ]
[  C    0 ] [ λ ] = [  d  ]

if the block matrix is invertible, we have

[ x ]   [ AᵀA  Cᵀ ]⁻¹ [ Aᵀb ]
[ λ ] = [  C    0 ]   [  d  ]

if AᵀA is invertible, we can derive a more explicit (and complicated) formula for x:

- from the first block equation, x = (AᵀA)⁻¹(Aᵀb − Cᵀλ)
- substitute into Cx = d: C(AᵀA)⁻¹(Aᵀb − Cᵀλ) = d
- so λ = ( C(AᵀA)⁻¹Cᵀ )⁻¹ ( C(AᵀA)⁻¹Aᵀb − d )
- and hence

x = (AᵀA)⁻¹Aᵀb − (AᵀA)⁻¹Cᵀ ( C(AᵀA)⁻¹Cᵀ )⁻¹ ( C(AᵀA)⁻¹Aᵀb − d )


Lecture 9

Autonomous linear dynamical systems

examples


continuous-time autonomous LDS has the form

ẋ = Ax,    x(t) ∈ R^n

(the system is time-invariant if A doesn't depend on t)

picture (phase plane):

[figure: trajectory x(t) in the (x1, x2) plane; the velocity ẋ(t) = Ax(t) is tangent to the trajectory]

example 1: ẋ = [ 1 0 ; 2 1 ] x

[figure: phase portrait of trajectories in the square −2 ≤ x1, x2 ≤ 2]

example 2: ẋ = [ 0.5 1 ; 1 0.5 ] x

[figure: phase portrait of trajectories in the square −2 ≤ x1, x2 ≤ 2]

Block diagram

block diagram representation of ẋ = Ax:

[block diagram: an n-channel integrator 1/s with output x(t), fed back through A to give ẋ(t)]

useful when A has structure, e.g., block upper triangular:

ẋ = [ A11  A12 ] x
    [  0   A22 ]

[block diagram: x2 has its own A22 loop and feeds the x1 loop through A12; x1 does not affect x2 at all]

Linear circuit

[circuit diagram: capacitors C1, . . . , Cp with voltages vc and currents ic, and inductors L1, . . . , Lr with currents il and voltages vl, interconnected by a linear static circuit]

circuit equations are

C dvc/dt = ic,    L dil/dt = vl,    ( ic, vl ) = F ( vc, il )

where C = diag(C1, . . . , Cp), L = diag(L1, . . . , Lr)

with state x = ( vc, il ), we have

ẋ = [ C⁻¹  0  ] F x
    [ 0   L⁻¹ ]

Chemical reactions

reaction kinetics: xi is the concentration of chemical species i; linear (first-order) kinetics give

dxi/dt = ai1x1 + · · · + ainxn

example: series reaction A →(k1) B →(k2) C with linear dynamics

ẋ = [ −k1   0   0 ] x
    [  k1  −k2  0 ]
    [  0    k2  0 ]

[figure: concentrations x1(t), x2(t), x3(t) versus t for 0 ≤ t ≤ 10; x1 decays from 1, x2 rises and then decays, and x3 accumulates toward 1]

Finite-state discrete-time Markov chain

z(t) ∈ {1, . . . , n} is a random sequence with transition probabilities Pij = Prob(z(t + 1) = i | z(t) = j)

represent the distribution of z(t) as an n-vector

p(t) = ( Prob(z(t) = 1), . . . , Prob(z(t) = n) )

then the distribution propagates as p(t + 1) = P p(t)

P is often sparse; the Markov chain is depicted graphically

example:

[state-transition graph on states 1, 2, 3, with transition probabilities 0.9, 0.1, 0.7, 0.1, 0.2, 1.0 labeling the edges]

state 1 is system OK

state 2 is system down

state 3 is system being repaired

p(t + 1) = [ 0.9  0.7  1.0 ] p(t)
           [ 0.1  0.1   0  ]
           [  0   0.2   0  ]
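Iterating p(t + 1) = P p(t) propagates the distribution; since the columns of P sum to one, the entries of p keep summing to one. A sketch using the transition matrix above (numpy assumed):

```python
import numpy as np

# Propagate the Markov-chain state distribution: p(t+1) = P p(t).
P = np.array([[0.9, 0.7, 1.0],
              [0.1, 0.1, 0.0],
              [0.0, 0.2, 0.0]])   # columns sum to one

p = np.array([1.0, 0.0, 0.0])     # start in state 1 (system OK)
for _ in range(50):
    p = P @ p

print(np.round(p, 4))             # distribution after 50 steps
```

After many steps p settles to a steady-state distribution satisfying Pp = p.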

Numerical integration of continuous system

compute an approximate solution of ẋ = Ax using the simple forward Euler method: for small h,

x(t + h) ≈ x(t) + hẋ(t) = (I + hA)x(t)

(more sophisticated methods work better)

High order linear dynamical systems

kth order linear dynamical system:

x(k) = Ak−1x(k−1) + · · · + A1x(1) + A0x

where x(m) denotes the mth derivative of x ∈ R^n

define the new variable z = ( x, x(1), . . . , x(k−1) ) ∈ R^(nk), so

ż = [ 0   I   0  · · ·  0   ] z
    [ 0   0   I  · · ·  0   ]
    [ :                 :   ]
    [ 0   0   0  · · ·  I   ]
    [ A0  A1  A2 · · · Ak−1 ]

a (first order) LDS, with a bigger state

block diagram:

[block diagram: a chain of integrators 1/s producing x(k−1), x(k−2), . . . , x, fed back through Ak−1, Ak−2, . . . , A0 and summed to form x(k)]

Mechanical systems

mechanical system undergoing small motions:

Mq̈ + Dq̇ + Kq = 0

- q(t) is the vector of generalized displacements
- M is the mass matrix
- K is the stiffness matrix
- D is the damping matrix

with state x = ( q, q̇ ) we have

ẋ = [ q̇ ] = [   0       I   ] x
    [ q̈ ]   [ −M⁻¹K  −M⁻¹D ]

Linearization near equilibrium point

nonlinear, time-invariant differential equation:

ẋ = f(x)

where f : R^n → R^n

suppose xe is an equilibrium point, i.e., f(xe) = 0; for x(t) near xe, with deviation δx(t) = x(t) − xe,

(d/dt)δx(t) = f(x(t)) ≈ f(xe) + Df(xe)(x(t) − xe) = Df(xe)δx(t)

i.e., the deviation approximately satisfies the linearized system δẋ = Df(xe)δx

we hope the solution of the linearized system is a good approximation of x(t) − xe (more later)

example: pendulum

[figure: pendulum of length l and mass m, angle θ from vertical, under gravity force mg]

rewrite as a first order DE with state x = ( θ, θ̇ ):

ẋ = ( x2, −(g/l) sin x1 )

linearizing near the equilibrium point xe = 0 gives

δẋ = [  0    1 ] δx
     [ −g/l  0 ]

Does linearization work?

the linearized system usually, but not always, gives a good idea of the system behavior near xe

example 1: ẋ = −x³ near xe = 0

for x(0) > 0, solutions have the form x(t) = ( x(0)⁻² + 2t )^(−1/2), which converges (slowly) to 0

the linearized system is δẋ = 0; solutions are constant

example 2: ż = z³ near ze = 0

for z(0) > 0, solutions have the form z(t) = ( z(0)⁻² − 2t )^(−1/2), which blows up in finite time

the linearized system is δż = 0; solutions are constant

[figure: z(t) (upper curve) and x(t) (lower curve) versus t, for 0 ≤ t ≤ 100]

Linearization along trajectory

- suppose xtraj satisfies ẋtraj(t) = f(xtraj(t), t)
- suppose x(t) is another trajectory, i.e., ẋ(t) = f(x(t), t), and is near xtraj(t)
- then

(d/dt)(x − xtraj) = f(x, t) − f(xtraj, t) ≈ Dxf(xtraj, t)(x − xtraj)

the (time-varying) LDS

δẋ = A(t)δx,    A(t) = Dxf(xtraj(t), t)

is called the linearized or variational system along the trajectory xtraj

used to study, e.g., how small deviations from a nominal trajectory propagate


Lecture 10

Solution via Laplace transform and matrix

exponential

Laplace transform

matrix exponential


Laplace transform of a matrix valued function

suppose z : R₊ → R^(p×q); its Laplace transform Z is defined by

Z(s) = ∫₀^∞ e^(−st) z(t) dt

(the integral is done entrywise)

the domain D of Z includes at least {s | ℜs > a}, where a satisfies |zij(t)| ≤ αe^(at) for t ≥ 0, i = 1, . . . , p, j = 1, . . . , q

Derivative property

L(ż) = sZ(s) − z(0)

to verify, integrate by parts:

L(ż)(s) = ∫₀^∞ e^(−st) ż(t) dt
        = e^(−st) z(t) evaluated from t = 0 to ∞, plus s ∫₀^∞ e^(−st) z(t) dt
        = sZ(s) − z(0)

applying this to ẋ = Ax gives sX(s) − x(0) = AX(s), so X(s) = (sI − A)⁻¹x(0), and

x(t) = L⁻¹( (sI − A)⁻¹ ) x(0)

Resolvent and state transition matrix

(sI − A)⁻¹ is called the resolvent of A; it is defined for all s ∈ C except
the eigenvalues of A, i.e., the s satisfying det(sI − A) = 0

Φ(t) = L⁻¹( (sI − A)⁻¹ ) is called the state-transition matrix; it maps
the initial state to the state at time t:

x(t) = Φ(t)x(0)

example 1: harmonic oscillator

ẋ = [  0  1 ] x
    [ −1  0 ]

(phase plane plot: trajectories are circles)

sI − A = [ s  −1 ],  so the resolvent is
         [ 1   s ]

(sI − A)⁻¹ = [  s/(s²+1)   1/(s²+1) ]
             [ −1/(s²+1)   s/(s²+1) ]

(eigenvalues are ±j)

Φ(t) = L⁻¹( (sI − A)⁻¹ ) = [  cos t   sin t ]
                           [ −sin t   cos t ]

so we have x(t) = [  cos t   sin t ] x(0)
                  [ −sin t   cos t ]

example 2: double integrator

ẋ = [ 0  1 ] x
    [ 0  0 ]

(phase plane plot: trajectories move parallel to the x1 axis)

sI − A = [ s  −1 ],  so the resolvent is
         [ 0   s ]

(sI − A)⁻¹ = [ 1/s   1/s² ]
             [ 0     1/s  ]

(eigenvalues are 0, 0)

Φ(t) = L⁻¹( (sI − A)⁻¹ ) = [ 1  t ]
                           [ 0  1 ]

so we have x(t) = [ 1  t ] x(0)
                  [ 0  1 ]

Characteristic polynomial

X (s) = det(sI − A) is called the characteristic polynomial of A;
its roots are the eigenvalues of A, and since A is real, complex eigenvalues
occur in conjugate pairs

Eigenvalues of A and poles of resolvent

by the cofactor formula, the i, j entry of the resolvent is

( (sI − A)⁻¹ )ij = (−1)^{i+j} det Δji / det(sI − A)

where Δji is sI − A with row j and column i deleted; so each entry has
form fij (s)/X (s), where fij is a polynomial with degree less than n

hence the poles of the entries of the resolvent are eigenvalues of A, but an
eigenvalue need not appear as a pole of every entry
(when there are cancellations between det Δji and X (s))

Matrix exponential

expanding the resolvent in powers of 1/s,

(sI − A)⁻¹ = (1/s)( I − A/s )⁻¹ = I/s + A/s² + A²/s³ + · · ·

(valid for |s| large enough), and taking the inverse Laplace transform term
by term,

Φ(t) = L⁻¹( (sI − A)⁻¹ ) = I + tA + (tA)²/2! + · · ·

looks like the ordinary power series

e^{at} = 1 + ta + (ta)²/2! + · · ·

with square matrices instead of scalars; this motivates the definition of the
matrix exponential: for M ∈ Rn×n,

e^M = I + M + M²/2! + · · ·

with this notation, the solution of ẋ = Ax is

x(t) = e^{tA} x(0)

in analogy with the scalar solution x(t) = e^{ta} x(0); the
matrix exponential is meant to look like the scalar exponential

some things you'd guess hold for the matrix exponential (by analogy
with the scalar exponential) do in fact hold, but others don't

example: you might guess that e^{A+B} = e^A e^B, but it's false (in general):

A = [  0  1 ],   B = [ 0  1 ]
    [ −1  0 ]        [ 0  0 ]

e^A = [  0.54  0.84 ],   e^B = [ 1  1 ]
      [ −0.84  0.54 ]          [ 0  1 ]

e^{A+B} = [  0.16  1.40 ]  ≠  e^A e^B = [  0.54   1.38 ]
          [ −0.70  0.16 ]               [ −0.84  −0.30 ]
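The non-commuting example above is easy to check numerically. A minimal sketch using only numpy; `expm_taylor` is a helper defined here (not a library routine), adequate for small-norm matrices:

```python
import numpy as np

def expm_taylor(M, terms=30):
    """Matrix exponential via truncated power series I + M + M^2/2! + ..."""
    E = np.eye(M.shape[0])
    term = np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        E = E + term
    return E

A = np.array([[0.0, 1.0], [-1.0, 0.0]])
B = np.array([[0.0, 1.0], [0.0, 0.0]])

eA, eB, eAB = expm_taylor(A), expm_taylor(B), expm_taylor(A + B)

# e^A is a rotation by 1 radian, as the lecture's Phi(t) formula predicts.
assert np.allclose(eA, [[np.cos(1), np.sin(1)], [-np.sin(1), np.cos(1)]])

# e^{A+B} != e^A e^B here, since AB != BA ...
assert not np.allclose(eAB, eA @ eB)
# ... but the identity does hold for commuting matrices, e.g. A with itself:
assert np.allclose(expm_taylor(2 * A), eA @ eA)
```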

however, e^{(τ+t)A} = e^{τA} e^{tA} does hold; with τ = −t we get

e^{tA} e^{−tA} = e^{tA−tA} = e^0 = I

so e^{tA} is always invertible, with

( e^{tA} )⁻¹ = e^{−tA}

example: let's find e^A, where A = [ 0  1 ]
                                   [ 0  0 ]

we already found

e^{tA} = L⁻¹( (sI − A)⁻¹ ) = [ 1  t ]
                             [ 0  1 ]

so, plugging in t = 1, we get e^A = [ 1  1 ]
                                    [ 0  1 ]

the power series gives the same answer:

e^A = I + A + A²/2! + · · · = I + A

since A² = A³ = · · · = 0

Time transfer property

for ẋ = Ax we know

x(τ + t) = e^{tA} x(τ)

interpretation: e^{tA} propagates the state forward in time by t seconds
(backward if t < 0)

recall the first order (forward Euler) approximate state update, for small t:

x(τ + t) ≈ x(τ) + t ẋ(τ) = (I + tA)x(τ)

i.e., forward Euler keeps just the first two terms of the exact propagator
e^{tA} = I + tA + (tA)²/2! + · · ·

Sampling: suppose ẋ = Ax, and sample with period h, defining
z(k) = x(kh); then

z(k + 1) = e^{hA} z(k),

a discrete-time LDS

Piecewise constant system

consider the time-varying LDS ẋ = A(t)x, with piecewise constant A(t):

A(t) = A0,  0 ≤ t < t1
       A1,  t1 ≤ t < t2
        ..

for t1 ≤ t < t2, for example, the solution is x(t) = e^{(t−t1)A1} e^{t1A0} x(0)

(the matrix on the righthand side is called the state transition matrix for the
system, and denoted Φ(t))

Qualitative behavior of x(t)

each entry of X(s) = (sI − A)⁻¹x(0) is a rational function of the form

Xi(s) = ai(s)/X (s)

thus the poles of Xi are all eigenvalues of A (but not necessarily the other
way around)

first assume the eigenvalues λi are distinct, so Xi(s) cannot have repeated
poles; then by partial fractions

xi(t) = Σ_{j=1}^{n} βij e^{λj t}

where the βij depend on x(0) (linearly)

hence the eigenvalues determine the possible qualitative behavior:

a real eigenvalue λ corresponds to an exponentially decaying or growing
term e^{λt} in the solution

a complex eigenvalue λ = σ + jω corresponds to a decaying or growing
sinusoidal term e^{σt} cos(ωt + φ) in the solution

ℜλj gives the exponential growth rate (if > 0), or exponential decay rate (if
< 0), of the term; ℑλj gives its oscillation frequency

now suppose A has repeated eigenvalues, so Xi can have repeated poles;
write the distinct eigenvalues as λ1, . . . , λr with multiplicities n1, . . . , nr
respectively (n1 + · · · + nr = n); then

xi(t) = Σ_{j=1}^{r} pij (t) e^{λj t}

where pij (t) is a polynomial of degree < nj (that depends linearly on x(0))

Stability

we say the system ẋ = Ax is stable if e^{tA} → 0 as t → ∞

meaning: x(t) → 0 as t → ∞, no matter what x(0) is

fact: the system is stable if and only if all eigenvalues have negative real
part:

ℜλi < 0,  i = 1, . . . , n

the "if" part is clear, since every term in the solution has the form
p(t)e^{λt} and

lim_{t→∞} p(t)e^{λt} = 0

whenever ℜλ < 0 (the "only if" part comes later)

more generally, max_i ℜλi determines the maximum asymptotic exponential
growth rate of x(t) (or decay, if < 0)


Lecture 11

Eigenvectors and diagonalization

eigenvectors

left eigenvectors

diagonalization

modal form

discrete-time stability


λ ∈ C is an eigenvalue of A ∈ Cn×n if

X (λ) = det(λI − A) = 0

equivalent to:

there exists nonzero v ∈ Cn with Av = λv; any such v is called an
eigenvector of A (associated with λ)

there exists nonzero w ∈ Cn with wT A = λwT ; any such w is called a
left eigenvector of A

if v is an eigenvector of A with eigenvalue λ, then so is αv, for any
α ∈ C, α ≠ 0

when A ∈ Rn×n and λ ∈ R, we can always find a real eigenvector
associated with λ: if Av = λv, with A ∈ Rn×n, λ ∈ R, and v ∈ Cn, then

A ℜv = λ ℜv,   A ℑv = λ ℑv

so ℜv and ℑv are real eigenvectors, if they are nonzero
(and at least one is)

conjugate symmetry: if A is real and v is an eigenvector associated
with λ ∈ C, then v̄ is an eigenvector associated with λ̄: taking the
conjugate of Av = λv we get Āv̄ = λ̄v̄, so

Av̄ = λ̄v̄

Scaling interpretation

(assume λ ∈ R for now)

if v is an eigenvector, the effect of A on v is especially simple: Av = λv is
just v scaled by λ (a generic x is rotated as well as scaled by A)

λ ∈ R, λ > 0: v and Av point in the same direction (λ < 0: opposite
directions; |λ| < 1: Av smaller than v; |λ| > 1: Av larger than v)

(we'll see later how this relates to stability of continuous- and discrete-time
systems. . . )

Dynamic interpretation

suppose Av = λv, v ≠ 0

if ẋ = Ax and x(0) = v, then

x(t) = e^{tA} v = ( I + tA + (tA)²/2! + · · · ) v
     = v + tλv + (tλ)²/2! v + · · ·
     = e^{λt} v

for λ ∈ C, the solution is complex (we'll interpret later); for now, assume
λ ∈ R

the resulting motion is very simple: the trajectory stays
always on the line spanned by v, and grows or decays as e^{λt}
(rate determined by the eigenvalue λ)

Invariant sets

a set S ⊆ Rn is invariant under ẋ = Ax if x(t) ∈ S implies
x(τ) ∈ S for all τ ≥ t

i.e.: once the trajectory enters S, it stays in S

suppose Av = λv, v ≠ 0, λ ∈ R

the line { tv | t ∈ R } is invariant
(in fact, the ray { tv | t > 0 } is invariant)

Complex eigenvectors

suppose Av = λv, v ≠ 0, λ is complex

for a ∈ C, the (complex) trajectory a e^{λt} v satisfies ẋ = Ax

hence so does the (real) trajectory

x(t) = ℜ( a e^{λt} v )

     = e^{σt} [ vre  vim ] [  cos ωt   sin ωt ] [  α ]
                           [ −sin ωt   cos ωt ] [ −β ]

where

v = vre + j vim,   λ = σ + jω,   a = α + jβ

the trajectory stays in the invariant plane span{ vre, vim };

σ gives the logarithmic growth/decay factor

ω gives the angular velocity of rotation in the plane

Dynamic interpretation: left eigenvectors

suppose wT A = λwT , w ≠ 0

then

d/dt ( wT x ) = wT ẋ = wT Ax = λ ( wT x )

i.e., wT x satisfies the (scalar) DE d(wT x)/dt = λ(wT x), so
wT x(t) = e^{λt} wT x(0)

if, e.g., λ ∈ R, λ < 0, the halfspace { z | wT z ≤ a } is invariant (for a ≥ 0)

for λ = σ + jω ∈ C, (ℜw)T x and (ℑw)T x both have the form

e^{σt} ( α cos(ωt) + β sin(ωt) )

Summary

right eigenvectors give initial conditions from which the resulting motion is
simple (i.e., remains on a line or in a plane)

left eigenvectors give linear functions of the state that are simple, for any
initial condition

example 1:  ẋ = [ −1  −10  −10 ] x
                [  1    0    0 ]
                [  0    1    0 ]

block diagram: a chain of three integrators 1/s with states x1, x2, x3, with
feedback gains −1, −10, −10 from the states into the first integrator

X (s) = s³ + s² + 10s + 10 = (s + 1)(s² + 10)

eigenvalues are −1, ± j√10

(plots of x1(t), x2(t), x3(t) over 0 ≤ t ≤ 5, from x(0) = (1, 0, 0): each
component shows a decaying oscillation)

left eigenvector associated with eigenvalue −1 is

g = [ 0.1 ]
    [ 0   ]
    [ 1   ]

(plot of gT x(t) over 0 ≤ t ≤ 5: a pure, non-oscillatory exponential decay
e^{−t}, even though the state components themselves oscillate)


eigenvector associated with eigenvalue j√10 is

v = [ 0.554 + j0.771 ]
    [ 0.244 + j0.175 ]
    [ 0.055 − j0.077 ]

giving the invariant plane spanned by

vre = [ 0.554 ],   vim = [  0.771 ]
      [ 0.244 ]          [  0.175 ]
      [ 0.055 ]          [ −0.077 ]

for example, with x(0) = vre, the plots of x1(t), x2(t), x3(t) are pure
sinusoids (frequency √10, no decay): the trajectory stays in the invariant
plane span{ vre, vim }

example: Markov chain

the probability distribution satisfies p(t + 1) = P p(t), where

pi(t) = Prob( z(t) = i ),  so  Σ_{i=1}^{n} pi(t) = 1

Pij = Prob( z(t + 1) = i | z(t) = j ),  so  Σ_{i=1}^{n} Pij = 1
(such matrices are called stochastic)

rewrite the column-sum property as:

[1 1 · · · 1] P = [1 1 · · · 1]

i.e., [1 1 · · · 1] is a left eigenvector of P with e.v. 1

hence det(I − P ) = 0, so there is a right eigenvector v ≠ 0 with P v = v

it can be shown that v can be chosen so that vi ≥ 0, hence we can
normalize v so that Σ_{i=1}^{n} vi = 1

interpretation: v is an equilibrium distribution; i.e., if p(0) = v then
p(t) = v for all t ≥ 0

(if v is unique it is called the steady-state distribution of the Markov chain)
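Finding the equilibrium distribution amounts to extracting the eigenvector for eigenvalue 1. A minimal numpy sketch, reusing the 3-state reliability matrix P from lecture 1 (the normalization step is the one described above):

```python
import numpy as np

# Transition matrix from the reliability example (columns sum to one).
P = np.array([[0.9, 0.7, 1.0],
              [0.1, 0.1, 0.0],
              [0.0, 0.2, 0.0]])

# [1 1 1] is a left eigenvector with eigenvalue 1:
assert np.allclose(np.ones(3) @ P, np.ones(3))

# Right eigenvector for eigenvalue 1 gives the equilibrium distribution.
w, V = np.linalg.eig(P)
i = np.argmin(np.abs(w - 1.0))
v = np.real(V[:, i])
v = v / v.sum()               # normalize so the entries sum to 1

assert np.all(v >= -1e-12)    # entries are nonnegative
assert np.allclose(P @ v, v)  # P v = v, so p(t) = v for all t
```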

Diagonalization

suppose v1, . . . , vn is a linearly independent set of eigenvectors of
A ∈ Rn×n:

Avi = λivi,  i = 1, . . . , n

express as

A [ v1 · · · vn ] = [ v1 · · · vn ] diag(λ1, . . . , λn)

define T = [ v1 · · · vn ] and Λ = diag(λ1, . . . , λn), so

AT = T Λ

and finally

T⁻¹AT = Λ

(T is invertible since its columns are independent)

similarity transformation by T diagonalizes A
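A quick numerical check of T⁻¹AT = Λ with numpy, using the 3×3 matrix from example 1 earlier in this lecture (its eigenvalues −1, ±j√10 are distinct, so it is diagonalizable):

```python
import numpy as np

A = np.array([[-1.0, -10.0, -10.0],
              [ 1.0,   0.0,   0.0],
              [ 0.0,   1.0,   0.0]])

lam, T = np.linalg.eig(A)          # columns of T are eigenvectors vi
Lam = np.linalg.inv(T) @ A @ T     # similarity transformation

# T^{-1} A T is diagonal, with the eigenvalues on the diagonal.
assert np.allclose(Lam, np.diag(lam), atol=1e-8)
# and A is recovered as T Lambda T^{-1}:
assert np.allclose(T @ np.diag(lam) @ np.linalg.inv(T), A)
```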

conversely, if there is a T = [ v1 · · · vn ] with

T⁻¹AT = Λ = diag(λ1, . . . , λn)

then AT = T Λ, i.e.,

Avi = λivi,  i = 1, . . . , n

we say A is diagonalizable if such a T exists, i.e., if

A has a set of n linearly independent eigenvectors

Not all matrices are diagonalizable

example: A = [ 0  1 ]
             [ 0  0 ]

the only eigenvalue is λ = 0, and the eigenvector equation Av = 0 reads

[ 0  1 ] [ v1 ] = 0
[ 0  0 ] [ v2 ]

so all eigenvectors have form v = [ v1 ] where v1 ≠ 0;
                                  [ 0  ]

thus A cannot have two independent eigenvectors, i.e., A is not
diagonalizable

Distinct eigenvalues

fact: if A has distinct eigenvalues, i.e., λi ≠ λj for i ≠ j, then A is
diagonalizable

(the converse is false: A can have repeated eigenvalues but still be
diagonalizable)

Diagonalization and left eigenvectors

rewrite T⁻¹AT = Λ as T⁻¹A = ΛT⁻¹; writing the rows of T⁻¹ as
w1T , . . . , wnT , this reads

[ w1T ]        [ w1T ]
[  ..  ] A = Λ [  ..  ]
[ wnT ]        [ wnT ]

thus

wiT A = λi wiT

i.e., the rows of T⁻¹ are (lin. indep.) left eigenvectors, normalized so that

wiT vj = δij

(i.e., left & right eigenvectors chosen this way are dual bases)

Modal form

suppose A is diagonalizable by T

define new coordinates by x = T x̃; then

T x̃˙ = AT x̃   ⟹   x̃˙ = T⁻¹AT x̃   ⟹   x̃˙ = Λ x̃

in the new coordinate system the system is diagonal (decoupled): a bank of
n independent integrators 1/s, the ith with feedback gain λi

trajectories consist of n independent modes, i.e.,

x̃i(t) = e^{λit} x̃i(0)

hence the name modal form

when eigenvalues (hence T ) are complex, the system can be put in real
modal form:

S⁻¹AS = diag( Λr, Mr+1, Mr+3, . . . , Mn−1 )

where Λr = diag(λ1, . . . , λr) collects the real eigenvalues, and

Mi = [  σi   ωi ],   λi = σi + jωi,   i = r + 1, r + 3, . . . , n
     [ −ωi   σi ]

where λi are the complex eigenvalues (one from each conjugate pair)

block diagram of a complex mode: two integrators 1/s, each with feedback
σi, cross-coupled through gains ωi and −ωi

diagonalization simplifies many matrix expressions; e.g., the resolvent:

(sI − A)⁻¹ = ( sT T⁻¹ − T ΛT⁻¹ )⁻¹
           = ( T (sI − Λ) T⁻¹ )⁻¹
           = T (sI − Λ)⁻¹ T⁻¹
           = T diag( 1/(s − λ1), . . . , 1/(s − λn) ) T⁻¹

powers (i.e., discrete-time solution):

A^k = ( T ΛT⁻¹ )^k
    = ( T ΛT⁻¹ ) · · · ( T ΛT⁻¹ )
    = T Λ^k T⁻¹
    = T diag( λ1^k, . . . , λn^k ) T⁻¹

exponential (i.e., continuous-time solution):

e^A = I + A + A²/2! + · · ·
    = I + T ΛT⁻¹ + ( T ΛT⁻¹ )²/2! + · · ·
    = T ( I + Λ + Λ²/2! + · · · ) T⁻¹
    = T e^Λ T⁻¹
    = T diag( e^{λ1}, . . . , e^{λn} ) T⁻¹

more generally, for an analytic function f (z) = β0 + β1z + β2z² + · · ·,
substituting A = T ΛT⁻¹, we have

f (A) = β0 I + β1 T ΛT⁻¹ + β2 ( T ΛT⁻¹ )² + · · ·
      = T ( β0 I + β1 Λ + β2 Λ² + · · · ) T⁻¹
      = T diag( f (λ1), . . . , f (λn) ) T⁻¹

Solution via diagonalization

assume A is diagonalizable, with T⁻¹AT = Λ and rows of T⁻¹ denoted
w1T , . . . , wnT

then

x(t) = e^{tA} x(0)
     = T e^{tΛ} T⁻¹ x(0)
     = Σ_{i=1}^{n} e^{λit} ( wiT x(0) ) vi

interpretation:

resolve x(0) into modal components wiT x(0)

propagate the ith mode forward t seconds via e^{λit}

reconstitute the state as a combination of the eigenvectors vi

application: for what x(0) do we have x(t) → 0 as t → ∞?

divide the eigenvalues into those with negative real parts,

ℜλ1 < 0, . . . , ℜλs < 0,

and the others,

ℜλs+1 ≥ 0, . . . , ℜλn ≥ 0

from

x(t) = Σ_{i=1}^{n} e^{λit} ( wiT x(0) ) vi

the condition for x(t) → 0 is:

x(0) ∈ span{ v1, . . . , vs },

or equivalently,

wiT x(0) = 0,  i = s + 1, . . . , n

suppose A is diagonalizable

consider the discrete-time LDS x(t + 1) = Ax(t)

if A = T ΛT⁻¹, then A^k = T Λ^k T⁻¹

then

x(t) = A^t x(0) = Σ_{i=1}^{n} λi^t ( wiT x(0) ) vi → 0  as t → ∞

for all x(0) if and only if

|λi| < 1,  i = 1, . . . , n.

we will see later that this is true even when A is not diagonalizable, so we

have

fact: x(t + 1) = Ax(t) is stable if and only if all eigenvalues of A have

magnitude less than one
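The discrete-time stability test is a one-liner on the spectral radius. A minimal numpy sketch (the two test matrices are illustrative, not from the lecture):

```python
import numpy as np

def dt_stable(A):
    """Discrete-time LDS x(t+1) = A x(t) is stable iff all |lambda_i| < 1."""
    return np.max(np.abs(np.linalg.eigvals(A))) < 1

# upper-triangular: eigenvalues 0.5 and 0.8, both inside the unit circle
assert dt_stable(np.array([[0.5, 1.0], [0.0, 0.8]]))

# rotation-like matrix: eigenvalues +/- j have magnitude exactly 1 -> not stable
assert not dt_stable(np.array([[0.0, 1.0], [-1.0, 0.0]]))
```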


Lecture 12

Jordan canonical form

generalized modes

Cayley-Hamilton theorem


any matrix A ∈ Cn×n can be put in Jordan canonical form by a similarity
transformation, i.e.

T⁻¹AT = J = [ J1        ]
            [    ...    ]
            [        Jq ]

where

Ji = [ λi  1           ]
     [     λi  ...     ]  ∈ Cni×ni
     [          ...  1 ]
     [              λi ]

is called a Jordan block of size ni with eigenvalue λi (so n = Σ_{i=1}^{q} ni)

J is upper bidiagonal; J diagonal is the special case of n Jordan blocks of
size ni = 1

more generally,

dim N( (λI − A)^k ) = Σ_{λi = λ} min{ k, ni }

which determines the sizes of the Jordan blocks associated with λ

factoring out T and T⁻¹, λI − A = T (λI − J)T⁻¹; for a single block with
eigenvalue λi (shown here for ni = 3),

λiI − Ji = [ 0 −1  0 ],   (λiI − Ji)² = [ 0  0  1 ],   (λiI − Ji)³ = 0
           [ 0  0 −1 ]                  [ 0  0  0 ]
           [ 0  0  0 ]                  [ 0  0  0 ]

while for a block with a different eigenvalue λj ≠ λi (again size 3),

(λiI − Jj )^k = [ (λi−λj )^k  −k(λi−λj )^{k−1}  (k(k−1)/2)(λi−λj )^{k−2} ]
                [ 0            (λi−λj )^k        −k(λi−λj )^{k−1}        ]
                [ 0            0                 (λi−λj )^k              ]

which is nonsingular

Generalized eigenvectors

express T as

T = [ T1  T2  · · ·  Tq ]

where Ti ∈ Cn×ni are the columns of T associated with the ith Jordan
block Ji

we have ATi = TiJi

let Ti = [ vi1  vi2  · · ·  vi,ni ]

then we have:

Avi1 = λivi1,

i.e., the first column of each Ti is an eigenvector associated with e.v. λi

for j = 2, . . . , ni,

Avij = vi,j−1 + λivij

Jordan form LDS

consider the LDS ẋ = Ax

by the change of coordinates x = T x̃, it can be put into the form x̃˙ = J x̃

the system is decomposed into independent Jordan block systems
x̃˙i = Ji x̃i

block diagram of a Jordan block system: a chain of ni integrators 1/s with
states x̃ni, x̃ni−1, . . . , x̃1, each integrator having local feedback λ and
each state feeding the next integrator in the chain

the resolvent of a k × k Jordan block with eigenvalue λ:

(sI − Jλ)⁻¹ = [ s−λ  −1            ]⁻¹
              [      s−λ  ...      ]
              [            ...  −1 ]
              [                s−λ ]

            = [ (s−λ)⁻¹  (s−λ)⁻²  · · ·  (s−λ)⁻ᵏ     ]
              [          (s−λ)⁻¹  · · ·  (s−λ)⁻ᵏ⁺¹   ]
              [                    ...    ..          ]
              [                          (s−λ)⁻¹      ]

by inverse Laplace transform, the exponential is:

e^{tJ} = e^{λt} ( I + tF1 + · · · + ( t^{k−1}/(k − 1)! ) Fk−1 )

       = e^{λt} [ 1   t   · · ·  t^{k−1}/(k − 1)! ]
                [     1   · · ·  t^{k−2}/(k − 2)! ]
                [           ...   ..              ]
                [                 1               ]

where Fi is the matrix with ones on the ith upper diagonal; Jordan blocks
thus yield repeated poles in the resolvent and terms t^p e^{λt} in e^{tA}

Generalized modes

with x(0) = Tia (i.e., an initial condition in the span of the generalized
eigenvectors of the ith block), the solution is

x(t) = Ti e^{tJi} a

such solutions are called generalized modes; with general x(0) we can write

x(t) = e^{tA} x(0) = T e^{tJ} T⁻¹ x(0) = Σ_{i=1}^{q} Ti e^{tJi} ( SiT x(0) )

where

T⁻¹ = [ S1T ]
      [  ..  ]
      [ SqT ]

hence: all solutions of ẋ = Ax are linear combinations of (generalized)
modes

Cayley-Hamilton theorem

Cayley-Hamilton theorem: for any A ∈ Rn×n we have X (A) = 0, where
X (s) = det(sI − A) (and where, for a polynomial p(s) = a0 + a1s + · · · ,
p(A) means a0I + a1A + · · · )

example: with A = [ 1  2 ] we have X (s) = s² − 5s − 2, so
                  [ 3  4 ]

X (A) = A² − 5A − 2I

      = [  7  10 ] − 5 [ 1  2 ] − 2I
        [ 15  22 ]     [ 3  4 ]

      = 0
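The worked example above is easy to verify numerically (numpy's `poly` returns the characteristic-polynomial coefficients when given a square matrix):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])

# characteristic polynomial X(s) = s^2 - 5s - 2  (trace 5, determinant -2)
coeffs = np.poly(A)
assert np.allclose(coeffs, [1.0, -5.0, -2.0])

# Cayley-Hamilton: X(A) = A^2 - 5A - 2I = 0
X_of_A = A @ A - 5.0 * A - 2.0 * np.eye(2)
assert np.allclose(X_of_A, np.zeros((2, 2)))
```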

corollary: for every p ∈ Z+, we have

A^p ∈ span{ I, A, A², . . . , A^{n−1} }

i.e., every power of A can be expressed as a linear combination of
I, A, . . . , A^{n−1}

for p = −1: write X (s) = s^n + a_{n−1}s^{n−1} + · · · + a0; then, since
X (A) = 0,

I = −(a1/a0)A − (a2/a0)A² − · · · − (1/a0)A^n

(A is invertible ⟺ a0 ≠ 0) so

A⁻¹ = −(a1/a0)I − (a2/a0)A − · · · − (1/a0)A^{n−1}

Proof of C-H theorem

first assume A is diagonalizable: T⁻¹AT = Λ, and

X (s) = (s − λ1) · · · (s − λn)

since

X (A) = X (T ΛT⁻¹) = T X (Λ) T⁻¹

it suffices to show X (Λ) = 0:

X (Λ) = (Λ − λ1I) · · · (Λ − λnI)
      = diag(0, λ2 − λ1, . . . , λn − λ1) · · · diag(λ1 − λn, . . . , λn−1 − λn, 0)
      = 0

(each diagonal factor has its zero in a different position, so the product
vanishes)

for the general case, use the Jordan form: it suffices to show X (Ji) = 0 for
each block; in

X (Ji) = (Ji − λ1I)^{n1} · · · (Ji − λiI)^{ni} · · · (Ji − λqI)^{nq}

the factor (Ji − λiI)^{ni} is the ni-th power of a nilpotent shift matrix,
hence zero


Lecture 13

Linear dynamical systems with inputs &

outputs

transfer matrix

examples


recall: the continuous-time time-invariant LDS has form

ẋ = Ax + Bu,    y = Cx + Du

Bu is called the input term (of ẋ)

(figure: the state derivative ẋ(t) = Ax(t) + Bu(t) shown as the
autonomous vector Ax(t) offset by Bu(t), for u(t) = 1 and u(t) = 1.5)

Interpretations

state derivative is sum of autonomous term (Ax) and one term per

input (biui)

each input ui gives another degree of freedom for x (assuming columns

of B independent)

write ẋ = Ax + Bu row by row as ẋi = ãiT x + b̃iT u, where ãiT , b̃iT are
the rows of A, B: the ith state derivative is a linear function of the state x
and the input u

Block diagram

u(t) → B → (sum) → 1/s → x(t) → C → (sum) → y(t), with feedback A
around the integrator and feedthrough D from u to y

interesting when there is structure, e.g., with x1 ∈ Rn1 , x2 ∈ Rn2 :

d/dt [ x1 ] = [ A11  A12 ] [ x1 ] + [ B1 ] u,    y = [ C1  C2 ] [ x1 ]
     [ x2 ]   [  0   A22 ] [ x2 ]   [  0 ]                      [ x2 ]

block diagram: u drives x1 through B1; x2 evolves autonomously via A22
and couples into x1 through A12; x1 and x2 feed the output through
C1 and C2

x2 affects y directly and through x1

Transfer matrix

take the Laplace transform of ẋ = Ax + Bu:

sX(s) − x(0) = AX(s) + BU (s)

hence

X(s) = (sI − A)⁻¹x(0) + (sI − A)⁻¹BU (s)

so, taking the inverse transform,

x(t) = e^{tA} x(0) + ∫₀^t e^{(t−τ)A} Bu(τ) dτ

e^{tA}x(0) is the familiar autonomous response; the second term, a
convolution, gives the effect of the input

with y = Cx + Du we have:

Y (s) = C(sI − A)⁻¹x(0) + ( C(sI − A)⁻¹B + D ) U (s)

so

y(t) = Ce^{tA} x(0) + ∫₀^t Ce^{(t−τ)A} Bu(τ) dτ + Du(t)

H(s) = C(sI − A)⁻¹B + D is called the transfer function or transfer
matrix; with x(0) = 0, it maps input transforms to output transforms:
Y (s) = H(s)U (s)

Impulse matrix

impulse matrix h(t) = Ce^{tA}B + Dδ(t)
(δ is the Dirac delta function)

with x(0) = 0, y = h ∗ u, i.e.,

yi(t) = Σ_{j=1}^{m} ∫₀^t hij (t − τ) uj (τ) dτ

interpretations:

hij (t) is the response of output i to a unit impulse at input j

hij (t − τ) gives the effect on output i now of input j applied t − τ
seconds ago

Step matrix

the step matrix or step response matrix is

s(t) = ∫₀^t h(τ) dτ

interpretations:

sij (t) is the response of output i to a unit step at input j (with the other
inputs and x(0) zero)

Example 1

example 1: three unit masses in a line, coupled by unit springs and unit
dampers; the inputs are the tensions u1 (between masses 1 and 2) and u2
(between masses 2 and 3); the state is x = [ yT  ẏT ]T, with y the vector
of displacements

system is:

ẋ = [  0   0   0   1   0   0 ]     [  0   0 ]
    [  0   0   0   0   1   0 ]     [  0   0 ]
    [  0   0   0   0   0   1 ] x + [  0   0 ] [ u1 ]
    [ −2   1   0  −2   1   0 ]     [  1   0 ] [ u2 ]
    [  1  −2   1   1  −2   1 ]     [ −1   1 ]
    [  0   1  −2   0   1  −2 ]     [  0  −1 ]

eigenvalues of A are

(plots of the impulse matrix entries over 0 ≤ t ≤ 15: h11, h21, h31 — the
response from u1 — and h12, h22, h32 — the response from u2)

roughly speaking: an impulse at u2 affects the first mass later than it
affects the other two

Example 2

interconnect circuit: an RC tree driven by a voltage source u, with node
capacitors C1, C2 near the source and C3, C4 farther away

xi is voltage across Ci

output is state: y = x

unit resistors, unit capacitors

step response matrix shows delay to each node

system is

ẋ = [ −3   1   1   0 ]     [ 1 ]
    [  1  −1   0   0 ] x + [ 0 ] u,    y = x
    [  1   0  −2   1 ]     [ 0 ]
    [  0   0   1  −1 ]     [ 0 ]

the eigenvalues of A are real and negative (A is symmetric here)

(plot of the step responses s1, s2, s3, s4 — the voltage at each node — over
0 ≤ t ≤ 15: each rises from 0 toward 1, with more delay for nodes farther
from the source)

DC or static gain matrix

if the system is stable and the input converges to a constant, the state and
output converge to constants satisfying

0 = ẋ = Ax + Bu,    y = Cx + Du

eliminate x to get y = H(0)u, where H(0) = −CA⁻¹B + D is the DC or
static gain matrix

if the system is stable,

H(0) = ∫₀^∞ h(t) dt = lim_{t→∞} s(t)

(recall: H(s) = ∫₀^∞ e⁻ˢᵗ h(t) dt,   s(t) = ∫₀^t h(τ) dτ )

if u(t) → u∞ ∈ Rm, then y(t) → y∞ ∈ Rp where y∞ = H(0)u∞

DC gain matrix for example 1 (mass-spring system, outputs = displacements):

H(0) = [  1/4   1/4 ]
       [ −1/2   1/2 ]
       [ −1/4  −1/4 ]

DC gain matrix for example 2 (RC circuit):

H(0) = [ 1 ]
       [ 1 ]
       [ 1 ]
       [ 1 ]

(i.e., every node voltage converges to the source voltage)

Discretization with piecewise constant inputs

linear system ẋ = Ax + Bu, y = Cx + Du

suppose ud : Z+ → Rm is a sequence, and u(t) = ud(k) for
kh ≤ t < (k + 1)h, k = 0, 1, . . .

define sequences xd(k) = x(kh), yd(k) = y(kh), k = 0, 1, . . .

h > 0 is called the sample interval (for x and y) or update interval (for u)

then

xd(k + 1) = x( (k + 1)h )
          = e^{hA} x(kh) + ∫₀^h e^{τA} B u( (k + 1)h − τ ) dτ
          = e^{hA} xd(k) + ( ∫₀^h e^{τA} dτ ) B ud(k)

so

xd(k + 1) = Ad xd(k) + Bd ud(k),    yd(k) = Cd xd(k) + Dd ud(k)

where

Ad = e^{hA},   Bd = ( ∫₀^h e^{τA} dτ ) B,   Cd = C,   Dd = D

called the discretized system

if A is invertible, we can express Bd using

∫₀^h e^{τA} dτ = A⁻¹ ( e^{hA} − I )
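The Ad, Bd formulas above can be computed in one matrix exponential via the standard augmented-matrix trick, exp(h [A B; 0 0]) = [Ad Bd; 0 I]. A numpy-only sketch (the series-based `expm_taylor` helper is defined here, not a library call; the scalar system used for the check is illustrative):

```python
import numpy as np

def expm_taylor(M, terms=40):
    """Matrix exponential via truncated power series (ok for small norms)."""
    E, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        E = E + term
    return E

def discretize(A, B, h):
    """Zero-order-hold discretization: Ad = e^{hA}, Bd = (int_0^h e^{tA} dt) B."""
    n, m = B.shape
    M = np.zeros((n + m, n + m))
    M[:n, :n], M[:n, n:] = A, B
    E = expm_taylor(h * M)
    return E[:n, :n], E[:n, n:]

# scalar check: xdot = -x + u  ->  Ad = e^{-h},  Bd = (1 - e^{-h})
A, B, h = np.array([[-1.0]]), np.array([[1.0]]), 0.1
Ad, Bd = discretize(A, B, h)
assert np.allclose(Ad, np.exp(-h))
assert np.allclose(Bd, 1.0 - np.exp(-h))
```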

stability: the eigenvalues of Ad = e^{hA} are e^{hλ1}, . . . , e^{hλn}, where
λi are the eigenvalues of A; since

ℜλi < 0   ⟺   | e^{hλi} | < 1   for h > 0

the discretized system is stable exactly when the continuous-time system is

extensions/variations: offset sampling times; multirate sampling
(usually integer multiples of a common interval h)

Dual system

the dual system associated with

ẋ = Ax + Bu,    y = Cx + Du

is given by

ż = AT z + C T v,    w = B T z + DT v

A is replaced by AT ; the roles of B and C are swapped

Dual via block diagram

original system: u(t) → B → 1/s → x(t) → C → y(t), feedback A,
feedthrough D

dual system: v(t) → C T → 1/s → z(t) → B T → w(t), feedback AT ,
feedthrough DT

i.e., the dual block diagram is the original with all arrows reversed and
each matrix transposed

Causality

interpretation of

x(t) = e^{tA} x(0) + ∫₀^t e^{(t−τ)A} Bu(τ) dτ

y(t) = Ce^{tA} x(0) + ∫₀^t Ce^{(t−τ)A} Bu(τ) dτ + Du(t)

for t ≥ 0:

current state (x(t)) and output (y(t)) depend on past input (u(τ) for
0 ≤ τ ≤ t)

i.e., the mapping from input to state and output is causal (with fixed
initial state)

more generally, for T ≤ t,

x(t) = e^{(t−T )A} x(T ) + ∫_T^t e^{(t−τ)A} Bu(τ) dτ,

so the state x(T ) summarizes the effect of all inputs before time T on the
future evolution

Idea of state

Change of coordinates

start with LDS ẋ = Ax + Bu, y = Cx + Du

change coordinates in Rn to x̃, with x = T x̃

then

x̃˙ = T⁻¹ẋ = T⁻¹( Ax + Bu ) = T⁻¹AT x̃ + T⁻¹Bu

so the LDS can be expressed as

x̃˙ = Ã x̃ + B̃ u,    y = C̃ x̃ + D̃ u

where

Ã = T⁻¹AT,   B̃ = T⁻¹B,   C̃ = CT,   D̃ = D

the transfer function is unchanged:

C̃ (sI − Ã)⁻¹ B̃ + D̃ = C (sI − A)⁻¹ B + D

Standard forms for LDS

change coordinates so that Ã is in a standard form (diagonal, real modal,
Jordan . . . )

for a diagonalizable A,

T⁻¹AT = diag(λ1, . . . , λn)

write

T⁻¹B = [ b̃1T ],    CT = [ c̃1  · · ·  c̃n ]
       [  ..  ]
       [ b̃nT ]

so

x̃˙i = λi x̃i + b̃iT u,    y = Σ_{i=1}^{n} c̃i x̃i

block diagram (here we assume D = 0): n parallel branches, the ith being
u → b̃iT → 1/s (with feedback λi, state x̃i) → c̃i, all summed to form y

Discrete-time systems

discrete-time LDS: x(t + 1) = Ax(t) + Bu(t), y(t) = Cx(t) + Du(t)

block diagram: u(t) → B → 1/z → x(t) → C → y(t), with feedback A
around the 1/z block and feedthrough D

interpretation of the 1/z block:

unit delay (shifts the sequence back in time one epoch)

latch (plus small delay to avoid race condition)

we have:

x(1) = Ax(0) + Bu(0),
x(2) = A²x(0) + ABu(0) + Bu(1),
. . .

x(t) = A^t x(0) + Σ_{τ=0}^{t−1} A^{t−1−τ} Bu(τ)

hence

y(t) = CA^t x(0) + (h ∗ u)(t)

where ∗ is discrete-time convolution and

h(0) = D,    h(t) = CA^{t−1}B for t > 0

is the discrete-time impulse matrix

Z-transform

for a sequence w : Z+ → Rp×q, the Z-transform is

W (z) = Σ_{t=0}^{∞} z^{−t} w(t)

Z-transform of the time-advanced signal v(t) = w(t + 1):

V (z) = Σ_{t=0}^{∞} z^{−t} w(t + 1)
      = z Σ_{t=1}^{∞} z^{−t} w(t)
      = zW (z) − zw(0)

applying this to x(t + 1) = Ax(t) + Bu(t) yields

zX(z) − zx(0) = AX(z) + BU (z)

hence X(z) = (zI − A)⁻¹ ( zx(0) + BU (z) ), and

Y (z) = H(z)U (z) + C(zI − A)⁻¹ z x(0)

where H(z) = C(zI − A)⁻¹B + D is the discrete-time transfer function


Lecture 14

Example: Aircraft dynamics

linearized dynamics

steady-state analysis

impulse matrices


(aircraft in level flight; the body axis makes a small angle with the
horizontal)

state (components):

u: velocity of aircraft along body axis

v: velocity of aircraft perpendicular to body axis (down is positive)

θ: angle between body axis and horizontal (up is positive)

q = θ̇: pitch rate

Inputs

disturbance inputs:

uw : velocity of wind along body axis

vw : velocity of wind perpendicular to body axis

control inputs:

δe: elevator angle

δt: thrust

Linearized dynamics

d/dt [ u ]   [ −.003   .039    0    −.322 ] [ u − uw ]   [  .01    1    ]
     [ v ] = [ −.065  −.319   7.74    0   ] [ v − vw ] + [ −.18   −.04  ] [ δe ]
     [ q ]   [  .020  −.101  −.429    0   ] [   q    ]   [ −1.16   .598 ] [ δt ]
     [ θ ]   [   0      0      1      0   ] [   θ    ]   [   0     0    ]

outputs of interest: the speed u and the climb rate ḣ

Steady-state analysis

DC gain from (uw , vw , δe, δt) to (u, ḣ):

H(0) = −CA⁻¹B + D = [ 1  0  −27.2   15.0 ]
                    [ 0  1  −1.34   24.9 ]

gives the steady-state change in speed & climb rate due to wind, elevator &
thrust changes

solve for the control variables in terms of wind velocities and desired speed
& climb rate:

[ δe ] = [ −.0379   .0229 ] [ u − uw ]
[ δt ]   [  .0020   .0413 ] [ ḣ + vw ]

in level flight, an increase in speed is obtained mostly by increasing
elevator (i.e., downwards)

the eigenvalues of A form two conjugate pairs: a fast, well-damped pair
with (complex) eigenvector xshort, and a slow, lightly damped pair with
eigenvector xphug — the short-period and phugoid modes examined below

Short-period mode

(plots of u(t) and ḣ(t) over 0 ≤ t ≤ 20 s, starting in the short-period
eigenplane)

period ≈ 7 sec, decays in ≈ 10 sec

Phugoid mode

(plots of u(t) and ḣ(t) over 0 ≤ t ≤ 2000 s, starting in the phugoid
eigenplane)

period ≈ 100 sec; decays in ≈ 5000 sec

impulse response matrix from (uw , vw ) to (u, ḣ)
(shows response to wind bursts)

(plots of h11, h12, h21, h22 over [0, 20]: dominated by the fast
short-period mode; over [0, 600]: the slow phugoid oscillation dominates)

impulse response matrix from (δe, δt) to (u, ḣ)
(shows response to elevator and thrust changes)

(plots of h11, h12, h21, h22 over [0, 20] and over [0, 600]: again a fast
short-period transient followed by the long phugoid oscillation)


Lecture 15

Symmetric matrices, quadratic forms, matrix

norm, and SVD

eigenvectors of symmetric matrices

quadratic forms

norm of a matrix


Eigenvalues of symmetric matrices

suppose A = AT ∈ Rn×n and Av = λv, with v ∈ Cn, v ≠ 0

then

v̄T Av = v̄T (Av) = λ v̄T v = λ Σ_{i=1}^{n} |vi|²

but also

v̄T Av = (A v̄)T v = ( λ̄ v̄ )T v = λ̄ Σ_{i=1}^{n} |vi|²

so λ = λ̄, i.e., λ ∈ R: the eigenvalues of a symmetric matrix are real

Eigenvectors of symmetric matrices

fact: there is a set of orthonormal eigenvectors of A = AT , i.e.,
q1, . . . , qn s.t.

Aqi = λiqi,    qiT qj = δij

in matrix form: there is an orthogonal Q s.t.

Q⁻¹AQ = QT AQ = Λ

hence we can express A as

A = QΛQT = Σ_{i=1}^{n} λi qi qiT

Interpretations

A = QΛQT , so computing Ax = QΛQT x amounts to:

resolve x into the qi coordinates (compute QT x)

scale the coordinates by λi

reconstitute with basis Q

or, geometrically,

rotate by QT

diagonal real scaling by Λ

rotate back by Q

decomposition

A = Σ_{i=1}^{n} λi qi qiT

expresses A as a linear combination of 1-dimensional projections

example:

A = [ −1/2   3/2 ]
    [  3/2  −1/2 ]

  = ( (1/√2) [ 1   1 ] ) [ 1   0 ] ( (1/√2) [ 1   1 ] )T
             [ 1  −1 ]   [ 0  −2 ]          [ 1  −1 ]

(figure: x is resolved into components q1q1T x and q2q2T x, each is scaled
by the corresponding λi, and the results are summed to form Ax)

proof (case of λi distinct): suppose we have normalized eigenvectors of A:

Avi = λivi,   ‖vi‖ = 1

evaluating viT Avj two ways (with A = AT ) gives λj viT vj = λi viT vj ,

so (λi − λj ) viT vj = 0; for λi ≠ λj this means viT vj = 0

in the general case (λi not distinct) we must say: eigenvectors can be
chosen to be orthogonal

Example: RC circuit

(figure: capacitors c1, . . . , cn with voltages v1, . . . , vn and currents
i1, . . . , in, terminating a resistive circuit)

ck v̇k = −ik ,    i = Gv

where G = GT is the conductance matrix of the resistive circuit, so
v̇ = −C⁻¹Gv with C = diag(c1, . . . , cn)

use state xi = √ci vi, so

ẋ = C^{1/2} v̇ = −C^{−1/2} G C^{−1/2} x

where C^{1/2} = diag(√c1, . . . , √cn)

we conclude: −C^{−1/2} G C^{−1/2} is symmetric, so its eigenvalues are
real and its eigenvectors can be chosen orthonormal

Quadratic forms

a function f : Rn → R of the form

f (x) = xT Ax = Σ_{i,j=1}^{n} Aij xi xj

is called a quadratic form; in a quadratic form we may as well assume
A = AT , since

xT Ax = xT ( (A + AT )/2 ) x

((A + AT )/2 is called the symmetric part of A)

uniqueness: if xT Ax = xT Bx for all x, with A = AT and B = B T , then
A = B

Examples

‖Bx‖² = xT B T Bx

Σ_{i=1}^{n−1} ( xi+1 − xi )²

‖F x‖² − ‖Gx‖²

(each is a quadratic form in x)

with A = AT = QΛQT and eigenvalues sorted λ1 ≥ · · · ≥ λn,

xT Ax = xT QΛQT x
      = ( QT x )T Λ ( QT x )
      = Σ_{i=1}^{n} λi ( qiT x )²
      ≤ λ1 Σ_{i=1}^{n} ( qiT x )²
      = λ1 ‖x‖²

a similar argument shows xT Ax ≥ λn ‖x‖², so we have

λn xT x ≤ xT Ax ≤ λ1 xT x

suppose A = AT ∈ Rn×n

we say A is positive semidefinite, denoted A ≥ 0, if xT Ax ≥ 0 for all x

A ≥ 0 if and only if λmin(A) ≥ 0, i.e., all eigenvalues are nonnegative

not the same as Aij ≥ 0 for all i, j

we say A is positive definite if xT Ax > 0 for all x ≠ 0;
denoted A > 0

A > 0 if and only if λmin(A) > 0, i.e., all eigenvalues are positive

Matrix inequalities

we say A ≤ B if B − A ≥ 0, A < B if B − A > 0, etc.

for example:

if A ≤ B and C ≤ D, then A + C ≤ B + D

if B ≤ 0 then A + B ≤ A

if A ≥ 0 and α ≥ 0, then αA ≥ 0

A² ≥ 0

if A > 0, then A⁻¹ > 0

but matrix inequality is only a partial order: we can have

A ≰ B,  B ≰ A

(such matrices are called incomparable)

Ellipsoids

if A = AT > 0, the set

E = { x | xT Ax ≤ 1 }

is an ellipsoid in Rn, centered at 0

the semi-axes are given by si = λi^{−1/2} qi, i.e.:

eigenvectors determine the directions of the semiaxes

eigenvalues determine their lengths (large λi ⟹ short axis)

note:

√( λmax/λmin ) gives the maximum eccentricity

Gain of a matrix in a direction

for A ∈ Rm×n and x ∈ Rn, ‖Ax‖/‖x‖ gives the amplification factor, or
gain, of A in the direction x; it varies with the direction

questions:

what is the maximum gain of A
(and corresponding maximum gain direction)?

what is the minimum gain of A
(and corresponding minimum gain direction)?

Matrix norm

the maximum gain

max_{x≠0} ‖Ax‖ / ‖x‖

is called the matrix norm or spectral norm of A, denoted ‖A‖; since

max_{x≠0} ‖Ax‖² / ‖x‖² = max_{x≠0} xT AT Ax / ‖x‖² = λmax(AT A)

we have ‖A‖ = √( λmax(AT A) )

similarly the minimum gain is

min_{x≠0} ‖Ax‖ / ‖x‖ = √( λmin(AT A) )

note that

AT A ∈ Rn×n is symmetric with AT A ≥ 0, so λmin, λmax ≥ 0

the max/min gain directions are the eigenvectors of AT A associated
with λmax and λmin

example: A = [ 1  2 ]
             [ 3  4 ]
             [ 5  6 ]

AT A = [ 35  44 ]
       [ 44  56 ]

     = Q diag(90.7, 0.265) QT ,   Q = [ 0.620   0.785 ]
                                      [ 0.785  −0.620 ]

then ‖A‖ = √( λmax(AT A) ) = 9.53:

A [ 0.620 ] = [ 2.18 ],   which has norm 9.53 (with unit input)
  [ 0.785 ]   [ 4.99 ]
              [ 7.78 ]

min gain is √( λmin(AT A) ) = 0.514:

A [  0.785 ] = [ −0.46 ],   which has norm 0.514 (with unit input)
  [ −0.620 ]   [ −0.14 ]
               [  0.18 ]

so for all x ≠ 0 we have

0.514 ≤ ‖Ax‖ / ‖x‖ ≤ 9.53

Properties of matrix norm

consistent with the vector norm: the matrix norm of a ∈ Rn×1 is

√( λmax(aT a) ) = √( aT a ) = ‖a‖

for any x, ‖Ax‖ ≤ ‖A‖ ‖x‖

scaling: ‖αA‖ = |α| ‖A‖

triangle inequality: ‖A + B‖ ≤ ‖A‖ + ‖B‖

definiteness: ‖A‖ = 0 ⟺ A = 0

norm of product: ‖AB‖ ≤ ‖A‖ ‖B‖
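The worked 3×2 example above is easy to reproduce: the spectral norm and the minimum gain are the extreme singular values. A minimal numpy sketch:

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

# spectral norm = sqrt(lambda_max(A^T A)) = largest singular value
sig = np.linalg.svd(A, compute_uv=False)
assert np.allclose(sig[0], np.sqrt(np.max(np.linalg.eigvalsh(A.T @ A))))
assert np.allclose(sig[0], np.linalg.norm(A, 2))

# the lecture's numbers: ||A|| = 9.53, min gain = 0.514
assert abs(sig[0] - 9.53) < 0.01 and abs(sig[-1] - 0.514) < 0.001

# the gain in any direction lies between sigma_min and sigma_max
rng = np.random.default_rng(0)
for _ in range(100):
    x = rng.standard_normal(2)
    g = np.linalg.norm(A @ x) / np.linalg.norm(x)
    assert sig[-1] - 1e-9 <= g <= sig[0] + 1e-9
```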

Singular value decomposition

a more complete picture of the gain properties of A is given by the singular
value decomposition (SVD) of A:

A = U ΣV T

where

A ∈ Rm×n, Rank(A) = r

U ∈ Rm×r , U T U = I

V ∈ Rn×r , V T V = I

Σ = diag(σ1, . . . , σr), with σ1 ≥ · · · ≥ σr > 0

with U = [ u1 · · · ur ] and V = [ v1 · · · vr ],

A = U ΣV T = Σ_{i=1}^{r} σi ui viT

the σi are the (nonzero) singular values of A; the vi are the right (input)
singular vectors; the ui are the left (output) singular vectors

AT A = ( U ΣV T )T ( U ΣV T ) = V Σ²V T

hence:

the vi are eigenvectors of AT A; σi = √( λi(AT A) )
(and λi(AT A) = 0 for i > r)

‖A‖ = σ1

similarly,

AAT = ( U ΣV T )( U ΣV T )T = U Σ²U T

hence:

the ui are eigenvectors of AAT ; σi = √( λi(AAT ) )
(and λi(AAT ) = 0 for i > r)

Interpretations

A = U ΣV T = Σ_{i=1}^{r} σi ui viT

the linear mapping y = Ax can be computed as:

resolve x along the input directions v1, . . . , vr (compute V T x)

scale the coefficients by σi

reconstitute along the output directions u1, . . . , ur

difference with the eigenvector decomposition of a symmetric matrix: the
input and output directions are different

v1 is the most sensitive (highest gain) input direction; u1 is the
corresponding highest gain output direction; Av1 = σ1u1

SVD gives a clearer picture of the gain as a function of input/output
directions

example: if σ1, σ2 ≈ 10 and the remaining σi are small, inputs along v1, v2
are amplified (by ≈ 10) and come out mostly along the plane spanned by
u1, u2, while the remaining input directions are strongly attenuated

(figure: A nonsingular; resolve x along v1, v2: v1T x = 0.5, v2T x = 0.6,
i.e., x = 0.5v1 + 0.6v2; then Ax = (0.5σ1)u1 + (0.6σ2)u2)


Lecture 16

SVD Applications

general pseudo-inverse

full SVD

SVD in estimation/inversion


General pseudo-inverse

if A ≠ 0 has SVD A = U ΣV T , the pseudo-inverse or Moore-Penrose
inverse of A is

A† = V Σ⁻¹U T

if A is skinny and full rank,

A† = ( AT A )⁻¹ AT

(which gives the least-squares approximate solution xls = A†y)

if A is fat and full rank,

A† = AT ( AAT )⁻¹

(which gives the least-norm solution xln = A†y)

in the general case, xpinv = A†y is the minimum-norm, least-squares
approximate solution

Pseudo-inverse via regularization

for μ > 0, let xμ be the (unique) minimizer of ‖Ax − y‖² + μ‖x‖², i.e.,

xμ = ( AT A + μI )⁻¹ AT y

here, AT A + μI > 0 and so is invertible; as μ → 0, xμ → A†y

in fact, we have lim_{μ→0} ( AT A + μI )⁻¹ AT = A†

(check this!)
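The regularization limit can indeed be checked numerically. A minimal numpy sketch, using the skinny full-rank matrix from the matrix-norm example (so A† = (AT A)⁻¹AT):

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])   # skinny, full rank
Apinv = np.linalg.pinv(A)

# skinny full-rank case: pinv equals (A^T A)^{-1} A^T
assert np.allclose(Apinv, np.linalg.inv(A.T @ A) @ A.T)
assert np.allclose(Apinv @ A, np.eye(2))             # left inverse

# regularization: (A^T A + mu I)^{-1} A^T -> pinv(A) as mu -> 0
for mu in [1e-2, 1e-4, 1e-6]:
    Areg = np.linalg.inv(A.T @ A + mu * np.eye(2)) @ A.T
    assert np.linalg.norm(Areg - Apinv) < 20 * mu    # error shrinks with mu
```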

Full SVD

the SVD of A ∈ Rm×n with Rank(A) = r:

A = U1Σ1V1T = [ u1 · · · ur ] diag(σ1, . . . , σr) [ v1 · · · vr ]T

find U2 ∈ Rm×(m−r) and V2 ∈ Rn×(n−r) s.t. U = [U1 U2] ∈ Rm×m and
V = [V1 V2] ∈ Rn×n are orthogonal

add zero rows/cols to Σ1 to form Σ ∈ Rm×n:

Σ = [ Σ1           0r×(n−r)      ]
    [ 0(m−r)×r     0(m−r)×(n−r)  ]

then we have

A = U1Σ1V1T = [ U1  U2 ] Σ [ V1T ]
                           [ V2T ]

i.e.:

A = U ΣV T

called the full SVD of A

Image of unit ball under linear transformation

the full SVD

A = U ΣV T

gives an interpretation of y = Ax:

rotate (by V T )

stretch along the axes by σi (σi = 0 for i > r)

zero-pad (if m > n) or truncate (if m < n) to get an m-vector

rotate (by U )

(figure: the unit ball is rotated by V T , stretched along the axes by the
σi, and rotated by U , yielding an ellipsoid with semiaxes σiui)

SVD in estimation/inversion

suppose y = Ax + v, where

y ∈ Rm is measurement

x ∈ Rn is vector to be estimated

v is a measurement noise or error

"norm-bound" model of noise: we assume ‖v‖ ≤ α but know nothing else
about v (α gives the max norm of the noise)

consider an estimator x̂ = By, with BA = I (i.e., unbiased: if v = 0 then
x̂ = x)

the estimation error is x̂ − x = Bv, so it lies in the set

Eunc = { Bv | ‖v‖ ≤ α }

and x ∈ x̂ + Eunc, i.e.:

the true x lies in the uncertainty ellipsoid Eunc, centered at the estimate x̂

the semiaxes of Eunc are ασiui (singular values & vectors of B), so the
worst-case error is ‖x̂ − x‖ ≤ α‖B‖

Bls = A† is the least-squares estimator

fact: among all B with BA = I, Bls is smallest, in the sense

BlsBlsT ≤ BB T

hence the uncertainty ellipsoids satisfy Els ⊆ E

in particular ‖Bls‖ ≤ ‖B‖: the least-squares estimator gives the smallest
worst-case error

example: four linear measurements of x ∈ R2,

y = [ k1T ]
    [ k2T ] x + v
    [ k3T ]
    [ k4T ]

where ki ∈ R2

estimator 1 uses only the first two measurements:

x̂ = [ [ k1T; k2T ]⁻¹   02×2 ] y

estimator 2 is least-squares, using all four:

x̂ = A† y

uncertainty regions (with α = 1):

(plots: the uncertainty ellipsoid for the two-measurement estimator is much
larger than the one for the least-squares estimator)

proof of fact: suppose A ∈ Rm×n, m > n, is full rank

SVD: A = U ΣV T , with V orthogonal

Bls = A† = V Σ⁻¹U T , and B satisfies BA = I

define Z = B − Bls, so B = Bls + Z

then ZA = ZU ΣV T = 0, so ZU = 0 (multiply by V Σ⁻¹ on the right)

therefore

BB T = ( Bls + Z )( Bls + Z )T
     = BlsBlsT + BlsZ T + ZBlsT + ZZ T
     = BlsBlsT + ZZ T
     ≥ BlsBlsT

(using BlsZ T = V Σ⁻¹U T Z T = V Σ⁻¹ (ZU )T = 0)

Sensitivity of linear equations to data error

consider y = Ax, with A ∈ Rn×n invertible; if y has error δy, the resulting
error in x is δx = A⁻¹ δy, so ‖δx‖ ≤ ‖A⁻¹‖ ‖δy‖

if ‖A⁻¹‖ is large,

small errors in y can lead to large errors in x

can't solve for x given y (with small errors)

hence, A can be considered singular in practice

for relative errors: since ‖y‖ ≤ ‖A‖ ‖x‖,

‖δx‖ / ‖x‖ ≤ κ(A) · ‖δy‖ / ‖y‖

where κ(A) = ‖A‖ ‖A⁻¹‖ = σmax(A)/σmin(A) is the condition number of A

we have: relative error in solution ≤ condition number × relative error in
data; we say A is well conditioned if κ is small, poorly conditioned if κ is
large (for a full-rank nonsquare A, κ = σmax(A)/σmin(A))

Low rank approximations

suppose A ∈ Rm×n, Rank(A) = r, with SVD

A = U ΣV T = Σ_{i=1}^{r} σi ui viT

we seek a matrix Â with Rank(Â) = p < r, s.t. Â ≈ A in the sense that
‖A − Â‖ is minimized

solution: the optimal rank-p approximant is

Â = Σ_{i=1}^{p} σi ui viT

hence ‖A − Â‖ = ‖ Σ_{i=p+1}^{r} σi ui viT ‖ = σp+1

interpretation: the SVD dyads uiviT are ranked in order of "importance";
take the top p to get the rank p approximant
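The truncated-SVD recipe above, with the approximation error ‖A − Â‖ = σ_{p+1}, can be checked directly (the random test matrix below is illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((8, 6))

U, s, Vt = np.linalg.svd(A, full_matrices=False)

p = 2
Ahat = U[:, :p] @ np.diag(s[:p]) @ Vt[:p, :]   # rank-p approximant

assert np.linalg.matrix_rank(Ahat) == p
# optimal approximation error in spectral norm is exactly sigma_{p+1}
assert np.allclose(np.linalg.norm(A - Ahat, 2), s[p])
```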

proof: suppose Rank(B) ≤ p

then dim N(B) ≥ n − p, while dim span{ v1, . . . , vp+1 } = p + 1

hence, the two subspaces intersect, i.e., there is a unit vector z ∈ Rn s.t.

Bz = 0,    z ∈ span{ v1, . . . , vp+1 }

then

( A − B )z = Az = Σ_{i=1}^{p+1} σi ui viT z

and

‖( A − B )z‖² = Σ_{i=1}^{p+1} σi² ( viT z )² ≥ σ²p+1 ‖z‖²

hence ‖A − B‖ ≥ σp+1 = ‖A − Â‖

Distance to singularity

another interpretation of σi:

σi = min{ ‖A − B‖ | Rank(B) ≤ i − 1 }

i.e., the distance (measured by matrix norm) to the nearest rank i − 1
matrix; in particular, for square A, σmin is the distance to the nearest
singular matrix

application: model simplification

suppose y = Ax + v, where A ∈ R100×30 and the singular values
σ5, . . . , σ30 are small compared to the noise level

then the terms σi ui viT x, for i = 5, . . . , 30, are substantially smaller than
the noise term v

simplified model:

y = Σ_{i=1}^{4} σi ui viT x + v


Lecture 17

Example: Quantum mechanics

discretization

preservation of probability

example


Quantum mechanics

single particle in the interval [0, 1], mass m

potential V : [0, 1] → R

Ψ : [0, 1] × R+ → C is the (complex-valued) wave function

interpretation: |Ψ(x, t)|² is probability density of particle at position x,
time t

(so ∫₀¹ |Ψ(x, t)|² dx = 1 for all t)

evolution of Ψ is governed by the Schrödinger equation:

iℏ Ψ̇ = ( −(ℏ²/2m) ∂²/∂x² + V ) Ψ = HΨ

where H is the Hamiltonian operator, i = √−1

Discretization

let's discretize position x into N discrete points, k/N , k = 1, . . . , N

the wave function is approximated as a vector Ψ(t) ∈ CN

the ∂²/∂x² operator is approximated as the matrix

∇² = N² [ −2   1               ]
        [  1  −2   1           ]
        [      1  −2   1       ]
        [          ...  ...    ]
        [               1  −2  ]

so w = ∇²v means

wk = ( (vk+1 − vk )/(1/N ) − (vk − vk−1)/(1/N ) ) / (1/N )

the discretized Schrödinger equation is the (complex) linear dynamical
system

Ψ̇ = (−i/ℏ)( V − (ℏ²/2m)∇² ) Ψ = (−i/ℏ) H Ψ

where V now denotes the diagonal matrix of potential samples; hence we
can analyze it using linear dynamical system theory (with complex vectors
& matrices)

the solution is Ψ(t) = e^{−(i/ℏ)tH} Ψ(0); the matrix e^{−(i/ℏ)tH}
propagates the wave function forward in time t seconds
(backward if t < 0)

Preservation of probability

using Ψ̇ = (−i/ℏ)HΨ and H = H T ∈ RN×N ,

d/dt ‖Ψ‖² = d/dt ( Ψ*Ψ )
          = Ψ̇*Ψ + Ψ*Ψ̇
          = ( (−i/ℏ)HΨ )* Ψ + Ψ* ( (−i/ℏ)HΨ )
          = (i/ℏ) Ψ*HΨ − (i/ℏ) Ψ*HΨ
          = 0

so ‖Ψ(t)‖² is constant: the discretized Schrödinger equation preserves
probability

it follows that U (t) = e^{−(i/ℏ)tH} is unitary, meaning U *U = I

unitary is the complex analog of real orthogonal: if U is unitary and
z ∈ CN , then ‖U z‖ = ‖z‖

Eigenvalues & eigenstates

H is symmetric, so its eigenvalues λ1, . . . , λN are real and its
eigenvectors v1, . . . , vN can be chosen real and orthonormal

from Hv = λv ⟺ (−i/ℏ)Hv = (−i/ℏ)λ v we see:

eigenvectors of (−i/ℏ)H are the same as the eigenvectors of H, i.e.,
v1, . . . , vN

eigenvalues of (−i/ℏ)H are (−i/ℏ)λ1, . . . , (−i/ℏ)λN (which are pure
imaginary)

the eigenvectors vk are called eigenstates, and λk is the energy of
eigenstate vk

for the mode Ψ(t) = e^{−(i/ℏ)λk t} vk , the probability density

| Ψm(t) |² = | e^{−(i/ℏ)λk t} vmk |² = | vmk |²

doesn't change with time

Example

(figure: the potential function V (x), a bump of height ≈ 1000 in the
middle of the interval; for this example, we set ℏ = 1, m = 1)

(figure: the four eigenstates with lowest energy, i.e., v1, v2, v3, v4,
plotted versus x)

now let's look at a trajectory of Ψ, with initial wave function Ψ(0)

(top plot: the initial probability density |Ψ(0)|²; bottom plot:
|vk*Ψ(0)|², i.e., the resolution of Ψ(0) into eigenstates)

time evolution, for t = 0, 40, 80, . . . , 320: (plots of |Ψ(t)|²)

the particle rolls half way up the potential bump, stops, then rolls back
down (perfectly elastic collision), and then repeats

(plot: the probability that the particle is in the left half of the well, i.e.,
Σ_{k=1}^{N/2} |Ψk (t)|², versus time t: it oscillates nearly periodically
between values close to 1 and close to 0)


Lecture 18

Controllability and state transfer

state transfer


State transfer

consider ẋ = Ax + Bu (or x(t + 1) = Ax(t) + Bu(t)) over the time
interval [ti, tf ]

we say the input u : [ti, tf ] → Rm steers or transfers the state from x(ti)
to x(tf ) (over time interval [ti, tf ])

(subscripts stand for initial and final)

questions:

where can x(ti) be transferred to at t = tf ?

how quickly can x(ti) be transferred to some xtarget?

how do we find a u that effects the transfer, and how large must it be?

Reachability

we say a point is reachable (in t seconds or epochs) if x(0) = 0 can be
steered to it by some input; the reachable set at time t is, for the
continuous-time system,

Rt = { ∫₀^t e^{(t−τ)A} Bu(τ) dτ | u : [0, t] → Rm }

and for the discrete-time system x(t + 1) = Ax(t) + Bu(t),

Rt = { Σ_{τ=0}^{t−1} A^{t−1−τ} Bu(τ) | u(τ) ∈ Rm }

Rt is a subspace of Rn

Rt ⊆ Rs if t ≤ s
(i.e., can reach more points given more time)

we define the reachable set R as the set of points reachable for some t:

R = ∪_{t≥0} Rt

Reachability for discrete-time LDS

for the discrete-time system, with x(0) = 0,

x(t) = Ct [ u(t − 1) ]
          [    ..    ]
          [   u(0)   ]

where Ct = [ B  AB  · · ·  A^{t−1}B ]

so Rt = range(Ct); by the C-H theorem, each A^k for k ≥ n is a linear
combination of A0, . . . , A^{n−1}

thus we have

Rt = range(Ct),  t < n
Rt = range(C),   t ≥ n

where C = Cn is called the controllability matrix

Controllable system

the system is called reachable or controllable if all states are reachable
(i.e., R = Rn); this is the case if and only if Rank(C) = n

example: x(t + 1) = [ 0  1 ] x(t) + [ 1 ] u(t)
                    [ 1  0 ]        [ 1 ]

controllability matrix is C = [ 1  1 ]
                              [ 1  1 ]

so the system is not controllable; the reachable set is

R = range(C) = { x | x1 = x2 }

General state transfer: with tf > ti,

x(tf ) = A^{tf −ti} x(ti) + C_{tf −ti} [ u(tf − 1) ]
                                      [     ..    ]
                                      [   u(ti)   ]

hence u can transfer x(ti) to x(tf ) = xdes

⟺ xdes − A^{tf −ti} x(ti) ∈ R_{tf −ti}

so general state transfer reduces to a reachability problem;
if the system is controllable, any state transfer can be achieved in ≤ n steps

important special case: driving the state to zero (sometimes called
regulating or controlling the state)

Least-norm input for reachability

(assume the system is reachable) the inputs must satisfy

xdes = Ct [ u(t − 1) ]
          [    ..    ]
          [   u(0)   ]

among all u that steer x(0) = 0 to x(t) = xdes, the one that minimizes

Σ_{τ=0}^{t−1} ‖u(τ)‖²

is given by

[ uln(t − 1) ]
[     ..     ] = CtT ( CtCtT )⁻¹ xdes
[   uln(0)   ]

uln is called the least-norm or minimum energy input that effects the state
transfer

we can express it as

uln(τ) = B T ( AT )^{t−1−τ} ( Σ_{s=0}^{t−1} A^s BB T ( AT )^s )⁻¹ xdes,

for τ = 0, . . . , t − 1

Emin, the minimum value of Σ_{τ=0}^{t−1} ‖u(τ)‖² required to reach
x(t) = xdes, is sometimes called the minimum energy required to reach
x(t) = xdes:

Emin = Σ_{τ=0}^{t−1} ‖uln(τ)‖²
     = ( CtT (CtCtT )⁻¹ xdes )T ( CtT (CtCtT )⁻¹ xdes )
     = xdesT ( CtCtT )⁻¹ xdes
     = xdesT ( Σ_{τ=0}^{t−1} A^τ BB T ( AT )^τ )⁻¹ xdes

Emin(xdes, t) measures how hard it is to reach x(t) = xdes from
x(0) = 0 (i.e., how large a u is required), and so gives the relevant
quantitative measure of reachability (as a function of xdes, t)

the ellipsoid { z | Emin(z, t) ≤ 1 } shows the states reachable at time t
with one unit of energy
(shows directions that can be reached with small inputs, and directions
that can be reached only with large inputs)

Emin as function of t: if t ≥ s then

Σ_{τ=0}^{t−1} A^τ BB T ( AT )^τ ≥ Σ_{τ=0}^{s−1} A^τ BB T ( AT )^τ

hence

( Σ_{τ=0}^{t−1} A^τ BB T (AT )^τ )⁻¹ ≤ ( Σ_{τ=0}^{s−1} A^τ BB T (AT )^τ )⁻¹

so Emin(xdes, t) ≤ Emin(xdes, s): allowing more time never takes more
energy

example: x(t + 1) = [ 1.75  0.8 ] x(t) + [ 1 ] u(t)
                    [ 0.95   0  ]        [ 0 ]

(plot of Emin versus t: decreases rapidly, from ≈ 10 at small t toward a
small limiting value)

(plots of the ellipsoids Emin(x, t) ≤ 1 for t = 3 and t = 10: the t = 10
ellipsoid is much larger, i.e., more states are reachable with unit energy)

the matrix

P = lim_{t→∞} ( Σ_{τ=0}^{t−1} A^τ BB T ( AT )^τ )⁻¹

always exists, and gives the minimum energy required to reach a point
xdes (with no limit on t):

min { Σ_{τ=0}^{t−1} ‖u(τ)‖² | x(0) = 0, x(t) = xdes } = xdesT P xdes

if A is not stable, then P can have a nonzero nullspace:

P z = 0, z ≠ 0 means we can get to z using u's with energy as small as
you like
(u just gives a little kick to the state; the instability carries it out to z
efficiently)

Continuous-time reachability

consider now ẋ = Ax + Bu with x(t) ∈ Rn

the reachable set at time t is

Rt = { ∫₀^t e^{(t−τ)A} Bu(τ) dτ | u : [0, t] → Rm }

fact: for any t > 0, Rt = R = range(C), where

C = [ B  AB  · · ·  A^{n−1}B ]

is the same controllability matrix as in discrete time; so in continuous
time, any reachable point can be reached as fast as you like (with large
enough u)

first let's show for any u (and x(0) = 0) we have x(t) ∈ range(C):
write e^{tA} as a power series,

e^{tA} = I + (t/1!) A + (t²/2!) A² + · · ·

by C-H, express A^n, A^{n+1}, . . . in terms of A0, . . . , A^{n−1} and
collect powers of A:

e^{tA} = α0(t) I + α1(t) A + · · · + αn−1(t) A^{n−1}

therefore

x(t) = ∫₀^t e^{τA} Bu(t − τ) dτ
     = ∫₀^t ( Σ_{i=0}^{n−1} αi(τ) A^i ) Bu(t − τ) dτ
     = Σ_{i=0}^{n−1} A^i B ( ∫₀^t αi(τ) u(t − τ) dτ )
     = C z

where zi = ∫₀^t αi(τ) u(t − τ) dτ

Impulsive inputs

suppose x(0−) = 0 and we apply the input u(t) = δ(k)(t) f , where δ(k)
denotes the kth derivative of δ and f ∈ Rm

then U (s) = s^k f , so

X(s) = ( sI − A )⁻¹ B s^k f
     = ( s⁻¹I + s⁻²A + · · · ) B s^k f
     = ( s^{k−1}I + · · · + sA^{k−2} + A^{k−1} + s⁻¹A^k + · · · ) Bf

where the terms with positive powers of s are impulsive terms; hence

x(t) = impulsive terms + A^k Bf + A^{k+1}Bf (t/1!) + A^{k+2}Bf (t²/2!) + · · ·

in particular, x(0+) = A^k Bf

now consider an input of the form

u(t) = δ(t) f0 + δ̇(t) f1 + · · · + δ(n−1)(t) fn−1

where fi ∈ Rm

by linearity we have

x(0+) = Bf0 + · · · + A^{n−1}Bfn−1 = C [  f0  ]
                                       [  ..  ]
                                       [ fn−1 ]

so with impulsive inputs we can reach any point in range(C) instantly

it can also be shown that any point in range(C) can be reached for any
t > 0 using nonimpulsive inputs

fact: if x(0) ∈ R, then x(t) ∈ R for all t (no matter what u is)

Example

two unit masses connected by a unit spring and unit damper; the input is
the tension u between the masses, and the state is x = [ yT  ẏT ]T

system is

ẋ = [  0   0   1   0 ]     [  0 ]
    [  0   0   0   1 ] x + [  0 ] u
    [ −1   1  −1   1 ]     [  1 ]
    [  1  −1   1  −1 ]     [ −1 ]

can we maneuver the state anywhere, starting from x(0) = 0?
if not, where can we maneuver the state?

controllability matrix is

C = [ B  AB  A²B  A³B ] = [  0   1  −2   2 ]
                          [  0  −1   2  −2 ]
                          [  1  −2   2   0 ]
                          [ −1   2  −2   0 ]

which has rank 2, with

R = range(C) = span{ [ 1; −1; 0; 0 ],  [ 0; 0; 1; −1 ] }

i.e., we can reach only differential motions: equal and opposite
displacements and velocities of the two masses

it's obvious — the internal force does not affect the center of mass
position or the total momentum!
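The rank computation above is a few lines of numpy (building C = [B AB A²B A³B] column-block by column-block):

```python
import numpy as np

A = np.array([[ 0.0,  0.0,  1.0,  0.0],
              [ 0.0,  0.0,  0.0,  1.0],
              [-1.0,  1.0, -1.0,  1.0],
              [ 1.0, -1.0,  1.0, -1.0]])
B = np.array([[0.0], [0.0], [1.0], [-1.0]])

# controllability matrix C = [B  AB  A^2B  A^3B]
blocks, v = [], B
for _ in range(A.shape[0]):
    blocks.append(v)
    v = A @ v
C = np.hstack(blocks)

# rank 2 < n = 4: not controllable; reachable set is range(C)
assert np.linalg.matrix_rank(C) == 2

# range(C) contains only "differential" motions (y1 = -y2, y1dot = -y2dot):
d = np.array([1.0, -1.0, 0.0, 0.0])
assert np.linalg.matrix_rank(np.column_stack([C, d])) == 2
```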

Least-norm input (continuous time, via discretization): assume x(0) = 0;
among inputs steering to x(t) = xdes, we minimize

∫₀^t ‖u(τ)‖² dτ

discretize the system with interval h = t/N ; then

x(t) = [ Bd  AdBd  · · ·  Ad^{N−1}Bd ] [ ud(N − 1) ]
                                       [     ..    ]
                                       [   ud(0)   ]

where

Ad = e^{hA},   Bd = ∫₀^h e^{τA} dτ B

the least-norm discretized input is

udln(k) = BdT ( AdT )^{N−1−k} ( Σ_{i=0}^{N−1} Ad^i Bd BdT ( AdT )^i )⁻¹ xdes

for k = 0, . . . , N − 1; for large N (small h), with τ = kt/N ,

BdT ( AdT )^{N−1−k} ≈ (t/N ) B T e^{(t−τ)AT}

and similarly

Σ_{i=0}^{N−1} Ad^i Bd BdT ( AdT )^i ≈ (t/N ) ∫₀^t e^{t̄A} BB T e^{t̄AT} dt̄

so the two (t/N ) factors cancel, and the least-norm continuous-time input
is approximately

uln(τ) = B T e^{(t−τ)AT} ( ∫₀^t e^{t̄A} BB T e^{t̄AT} dt̄ )⁻¹ xdes,   0 ≤ τ ≤ t

for large N

the minimum energy is

∫₀^t ‖uln(τ)‖² dτ = xdesT Q(t)⁻¹ xdes

where

Q(t) = ∫₀^t e^{τA} BB T e^{τAT} dτ

is the controllability Gramian; one can show that (A, B) is controllable if
and only if Q(s) > 0 for some s > 0 (in which case Q(t) > 0 for all t > 0)
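The Gramian and the minimum-energy formula can be evaluated numerically. A numpy-only sketch using a crude midpoint-rule quadrature and a series-based matrix exponential (both helpers are defined here, not library calls; the 2×2 system is an illustrative damped oscillator):

```python
import numpy as np

def expm_taylor(M, terms=40):
    E, term = np.eye(M.shape[0]), np.eye(M.shape[0])
    for k in range(1, terms):
        term = term @ M / k
        E = E + term
    return E

def gramian(A, B, t, steps=2000):
    """Q(t) = int_0^t e^{sA} B B^T e^{sA^T} ds, by midpoint-rule quadrature."""
    h = t / steps
    Q = np.zeros((A.shape[0], A.shape[0]))
    for k in range(steps):
        M = expm_taylor((k + 0.5) * h * A) @ B
        Q += h * (M @ M.T)
    return Q

A = np.array([[0.0, 1.0], [-1.0, -0.5]])   # damped oscillator (illustrative)
B = np.array([[0.0], [1.0]])

Q = gramian(A, B, 5.0)
assert np.min(np.linalg.eigvalsh(Q)) > 0   # controllable pair -> Q(t) > 0

# minimum energy to reach xdes at t = 5:
xdes = np.array([1.0, 0.0])
Emin = xdes @ np.linalg.inv(Q) @ xdes
assert Emin > 0
```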

Minimum energy over infinite horizon

the matrix

P = lim_{t→∞} ( ∫₀ᵗ e^{τA} B Bᵀ e^{τAᵀ} dτ )^{−1}

always exists, and gives the minimum energy required to reach a point x_des (with no limit on t):

min { ∫₀ᵗ ‖u(τ)‖² dτ : x(0) = 0, x(t) = x_des } = x_desᵀ P x_des

if A is not stable, then P can have nonzero nullspace

P z = 0, z ≠ 0 means we can get to z using u's with energy as small as you like (u just gives a little kick to the state; the instability carries it out to z efficiently)
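For stable A, the integral defining Q(t) converges as t → ∞, and its limit W satisfies the Lyapunov equation A W + W Aᵀ + B Bᵀ = 0, so P = W⁻¹ can be computed without quadrature. A sketch using scipy's Lyapunov solver (hypothetical stable system, not from the notes):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, expm

# hypothetical stable system
A = np.array([[0.0, 1.0], [-2.0, -1.0]])
B = np.array([[0.0], [1.0]])

# W = \int_0^inf e^{tau A} B B^T e^{tau A^T} dtau solves A W + W A^T + B B^T = 0
W = solve_continuous_lyapunov(A, -B @ B.T)
P = np.linalg.inv(W)            # minimum energy to reach xdes is xdes^T P xdes

# sanity check against a truncated numerical integral (midpoint rule)
h, N = 1e-3, 20000              # integrate out to t = 20
E = expm(0.5 * h * A)           # e^{tau A} at the first midpoint
M = expm(h * A)
Wnum = np.zeros((2, 2))
for _ in range(N):
    Wnum += h * E @ B @ B.T @ E.T
    E = M @ E
match = np.allclose(W, Wnum, atol=1e-4)
print(match)                    # True
```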

General state transfer

since

x(t_f) = e^{(t_f−t_i)A} x(t_i) + ∫_{t_i}^{t_f} e^{(t_f−τ)A} B u(τ) dτ

we can transfer the state from x(t_i) to x(t_f) if and only if x(t_f) − e^{(t_f−t_i)A} x(t_i) ∈ R:

in zero time with impulsive inputs

in any positive time with non-impulsive inputs

Example

a chain of three unit masses, with tension actuators between adjacent masses: u₁(t) applies equal and opposite forces between masses 1 and 2, and u₂(t) between masses 2 and 3

state is x = [yᵀ ẏᵀ]ᵀ, where y ∈ R³ gives the mass positions

system is

ẋ = [  0   0   0   1   0   0 ]     [  0   0 ]
    [  0   0   0   0   1   0 ]     [  0   0 ]
    [  0   0   0   0   0   1 ] x + [  0   0 ] [ u₁ ]
    [ −2   1   0  −2   1   0 ]     [  1   0 ] [ u₂ ]
    [  1  −2   1   1  −2   1 ]     [ −1   1 ]
    [  0   1  −2   0   1  −2 ]     [  0  −1 ]

objective: steer the state from x(0) = e₁ to x(t_f) = 0

E_min = ∫₀^{t_f} ‖u_ln(τ)‖² dτ vs. t_f :

[plot: E_min vs. t_f (E_min axis 10–50, t_f axis 0–12); E_min decreases as t_f increases]

the corresponding least-norm inputs:

[plots: u₁(t) and u₂(t), 0 ≤ t ≤ 4.5]

and for t_f = 4:

[plots: u₁(t) and u₂(t), 0 ≤ t ≤ 4.5]

output y₁ for u = 0:

[plot: y₁(t), 0 ≤ t ≤ 14]

output y₁ for u = u_ln with t_f = 3:

[plot: y₁(t), 0 ≤ t ≤ 4.5]

[plot: y₁(t), 0 ≤ t ≤ 4.5]

EE263 Autumn 2008-09 Stephen Boyd

Lecture 19

Observability and state estimation

state estimation

discrete-time observability

observability controllability duality

observers for noiseless case

continuous-time observability

least-squares observers

example


State estimation set-up

we consider the discrete-time system

x(t + 1) = Ax(t) + Bu(t) + w(t),   y(t) = Cx(t) + Du(t) + v(t)

w is state disturbance or noise

v is sensor noise or error

A, B, C, and D are known

u and y are observed over time interval [0, t − 1]

w and v are not known, but can be described statistically, or assumed small (e.g., in RMS value)

State estimation problem

estimate x(s), given u(τ), y(τ) for 0 ≤ τ ≤ t − 1

s = 0: estimate initial state

s = t − 1: estimate current state

s = t: estimate (i.e., predict) next state

an algorithm that produces such an estimate is called an observer or state estimator

the estimate of x(s), based on inputs and outputs up to time t − 1, is denoted x̂(s|t − 1) (read, "x̂(s) given t − 1")

Noiseless case

let's look at determining x(0), with no state or measurement noise:

x(t + 1) = Ax(t) + Bu(t),   y(t) = Cx(t) + Du(t)

then we have

[y(0); . . . ; y(t−1)] = O_t x(0) + T_t [u(0); . . . ; u(t−1)]

where

O_t = [ C; CA; . . . ; CA^{t−1} ]

T_t = [ D                                   ]
      [ CB          D                       ]
      [ . . .                               ]
      [ CA^{t−2}B   CA^{t−3}B  · · ·  CB  D ]

hence we have

O_t x(0) = [y(0); . . . ; y(t−1)] − T_t [u(0); . . . ; u(t−1)]

where the right-hand side is known; hence:

we can uniquely determine x(0) if and only if N(O_t) = {0}

the input u does not affect our ability to determine x(0): its effect can be subtracted out
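The matrices O_t and T_t can be assembled directly. A sketch (hypothetical observable system, not from the notes) showing exact recovery of x(0) in the noiseless case:

```python
import numpy as np

# hypothetical observable discrete-time system
rng = np.random.default_rng(1)
A = np.array([[0.9, 0.2], [-0.3, 0.8]])
B = np.array([[1.0], [0.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])
n, t = 2, 6
p, m = C.shape[0], B.shape[1]

# O_t = [C; CA; ...; CA^{t-1}]
Ot = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(t)])

# T_t: block lower triangular, D on the diagonal, C A^{i-j-1} B below it
Tt = np.zeros((p * t, m * t))
for i in range(t):
    for j in range(i + 1):
        blk = D if i == j else C @ np.linalg.matrix_power(A, i - j - 1) @ B
        Tt[i*p:(i+1)*p, j*m:(j+1)*m] = blk

# simulate, then solve O_t x(0) = Y - T_t U
x0 = rng.standard_normal((n, 1))
x, ys, us = x0, [], []
for _ in range(t):
    u = rng.standard_normal((m, 1))
    ys.append(C @ x + D @ u)
    us.append(u)
    x = A @ x + B @ u
Y, U = np.vstack(ys), np.vstack(us)
x0_hat, *_ = np.linalg.lstsq(Ot, Y - Tt @ U, rcond=None)
ok = np.allclose(x0_hat, x0)
print(ok)        # True: the initial state is recovered exactly
```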

Observability matrix

by C-H, each CA^k for k ≥ n is a linear combination of C, CA, . . . , CA^{n−1}; hence for t ≥ n, N(O_t) = N(O) where

O = O_n = [ C; CA; . . . ; CA^{n−1} ]

is called the observability matrix

if x(0) can be deduced from u and y over [0, t − 1] for any t, then x(0) can be deduced from u and y over [0, n − 1]

N(O) is called the unobservable subspace; it describes the ambiguity in determining the state from input and output

system is called observable if N(O) = {0}, i.e., rank(O) = n
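A rank test makes the definition concrete. The second system below (two identical, decoupled oscillators measured only through the sum of their positions) is a hypothetical example of an unobservable system:

```python
import numpy as np

def obsv(A, C):
    """Observability matrix O = [C; CA; ...; CA^{n-1}]."""
    n = A.shape[0]
    return np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])

# observable: measuring the position of a damped oscillator
A = np.array([[0.0, 1.0], [-1.0, -0.1]])
C = np.array([[1.0, 0.0]])
r1 = np.linalg.matrix_rank(obsv(A, C))
print(r1)        # 2: observable

# unobservable: two identical decoupled oscillators, sensor sees only the sum
A2 = np.block([[A, np.zeros((2, 2))], [np.zeros((2, 2)), A]])
C2 = np.array([[1.0, 0.0, 1.0, 0.0]])
r2 = np.linalg.matrix_rank(obsv(A2, C2))
print(r2)        # 2 < 4: N(O) is nontrivial (the difference mode is invisible)
```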

Observability-controllability duality

let (Ã, B̃, C̃, D̃) be the dual of system (A, B, C, D), i.e.,

Ã = Aᵀ,   B̃ = Cᵀ,   C̃ = Bᵀ,   D̃ = Dᵀ

the controllability matrix of the dual system is

C̃ = [B̃  ÃB̃  · · ·  Ã^{n−1}B̃] = [Cᵀ  AᵀCᵀ  · · ·  (Aᵀ)^{n−1}Cᵀ] = Oᵀ

the transpose of the observability matrix

similarly we have Õ = Cᵀ

thus, system is observable (controllable) if and only if dual system is controllable (observable)

in fact,

N(O) = range(Oᵀ)^⊥ = range(C̃)^⊥

i.e., unobservable subspace is orthogonal complement of controllable subspace of dual
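The identity C̃ = Oᵀ is easy to confirm numerically (hypothetical system, not from the notes):

```python
import numpy as np

def ctrb(A, B):
    """Controllability matrix [B AB ... A^{n-1}B]."""
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])

def obsv(A, C):
    """Observability matrix [C; CA; ...; CA^{n-1}]."""
    n = A.shape[0]
    return np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(n)])

# duality: controllability matrix of the dual (A^T, C^T) equals O^T
A = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [-1.0, -3.0, -3.0]])
C = np.array([[1.0, 1.0, 0.0]])
O = obsv(A, C)
Cdual = ctrb(A.T, C.T)
ok = np.allclose(Cdual, O.T)
print(ok)        # True
```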

Observers for noiseless case

suppose the system is observable (and noiseless); then for t ≥ n, O_t has rank n, so it has a left inverse F , i.e., F O_t = I

we can determine the initial state as

x(0) = F ( [y(0); . . . ; y(t−1)] − T_t [u(0); . . . ; u(t−1)] )

in fact we have

x(τ − t + 1) = F ( [y(τ−t+1); . . . ; y(τ)] − T_t [u(τ−t+1); . . . ; u(τ)] )

i.e., our observer estimates what the state was t − 1 epochs ago, given the past t − 1 inputs & outputs

the observer is a (multi-input, multi-output) finite impulse response filter, with inputs u and y, and output x̂

Invariance of unobservable set

fact: the unobservable subspace N(O) is invariant under A, i.e., if z ∈ N(O), then Az ∈ N(O)

proof: suppose z ∈ N(O), i.e., CA^k z = 0 for k = 0, . . . , n − 1

evidently CA^k(Az) = 0 for k = 0, . . . , n − 2; and by C-H,

CA^{n−1}(Az) = CAⁿz = − Σ_{i=0}^{n−1} α_i CA^i z = 0

where det(sI − A) = sⁿ + α_{n−1}s^{n−1} + · · · + α₀

Continuous-time observability

continuous-time system, with no sensor or state noise:

ẋ = Ax + Bu,   y = Cx + Du

can we deduce the state x from u and y? let's look at derivatives of y:

y = Cx + Du

ẏ = Cẋ + Du̇ = CAx + CBu + Du̇

ÿ = CA²x + CABu + CBu̇ + Dü

and so on

hence we have

[y; ẏ; . . . ; y^{(n−1)}] = O x + T [u; u̇; . . . ; u^{(n−1)}]

where O is the observability matrix and

T = [ D                                   ]
    [ CB          D                       ]
    [ . . .                               ]
    [ CA^{n−2}B   CA^{n−3}B  · · ·  CB  D ]

(the same matrices as in the discrete-time case)

rewrite as

O x = [y; ẏ; . . . ; y^{(n−1)}] − T [u; u̇; . . . ; u^{(n−1)}]

hence if N(O) = {0} we can deduce x(t) from derivatives of u(t), y(t) up to order n − 1; using a left inverse F of O,

x = F ( [y; ẏ; . . . ; y^{(n−1)}] − T [u; u̇; . . . ; u^{(n−1)}] )

impulsive inputs

A converse

suppose z ∈ N(O) (the unobservable subspace), u is any input, and x, y are the corresponding state and output, i.e.,

ẋ = Ax + Bu,   y = Cx + Du

then the state trajectory x̃ = x + e^{tA}z satisfies

dx̃/dt = Ax̃ + Bu,   y = Cx̃ + Du

(since Ce^{tA}z = Σ_k (tᵏ/k!) CAᵏz = 0 for z ∈ N(O))

hence if the system is unobservable, no signal processing of any kind applied to u and y can deduce x

the unobservable subspace N(O) gives a fundamental ambiguity in deducing x from u, y

Least-squares observers

discrete-time system, now with sensor noise:

x(t + 1) = Ax(t) + Bu(t),   y(t) = Cx(t) + Du(t) + v(t)

assuming the system is observable and t ≥ n, the least-squares estimate of the initial state is

x̂_ls(0) = O_t† ( [y(0); . . . ; y(t−1)] − T_t [u(0); . . . ; u(t−1)] )

where O_t† = (O_tᵀ O_t)^{−1} O_tᵀ

interpretation: x̂_ls(0) minimizes the discrepancy between

the output ŷ that would be observed, with input u and initial state x(0) (and no sensor noise), and

the output y that was measured,

measured as Σ_{τ=0}^{t−1} ‖ŷ(τ) − y(τ)‖²

we can express the estimate as

x̂_ls(0) = ( Σ_{τ=0}^{t−1} (Aᵀ)^τ Cᵀ C A^τ )^{−1} Σ_{τ=0}^{t−1} (Aᵀ)^τ Cᵀ ỹ(τ)

where ỹ = y − h ∗ u is the output with the effect of the input subtracted off (h is the impulse response)

since O_t† O_t = I, the estimation error is

x̃(0) = x̂_ls(0) − x(0) = O_t† [v(0); . . . ; v(t−1)]

in particular, x̂_ls(0) = x(0) if the sensor noise is zero

(i.e., the observer recovers the exact state in the noiseless case)
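A sketch of the least-squares observer in action, with and without sensor noise (hypothetical observable system, not from the notes; u = 0 so the T_t term drops out):

```python
import numpy as np

# hypothetical observable discrete-time system, autonomous (u = 0)
rng = np.random.default_rng(2)
A = np.array([[0.95, 0.10], [-0.10, 0.95]])
C = np.array([[1.0, 0.0]])
t = 50

# O_t = [C; CA; ...; CA^{t-1}]
Ot = np.vstack([C @ np.linalg.matrix_power(A, k) for k in range(t)])

x0 = np.array([[1.0], [-1.0]])
Y = Ot @ x0 + 0.1 * rng.standard_normal((t, 1))   # y(tau) = C A^tau x0 + v(tau)

# least-squares estimate of the initial state
x0_hat, *_ = np.linalg.lstsq(Ot, Y, rcond=None)
err = np.linalg.norm(x0_hat - x0)
print(err)       # modest, thanks to averaging over t measurements

# with zero sensor noise the estimate is exact
x0_exact, *_ = np.linalg.lstsq(Ot, Ot @ x0, rcond=None)
exact = np.allclose(x0_exact, x0)
print(exact)     # True
```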

now suppose the sensor noise is unknown, but has RMS value no more than α:

(1/t) Σ_{τ=0}^{t−1} ‖v(τ)‖² ≤ α²

then the set of possible estimation errors is an ellipsoid:

x̃(0) ∈ E_unc = { O_t† [v(0); . . . ; v(t−1)] : (1/t) Σ_{τ=0}^{t−1} ‖v(τ)‖² ≤ α² }

whose shape is determined by the matrix

( (1/t) O_tᵀ O_t )^{−1} = ( (1/t) Σ_{τ=0}^{t−1} (Aᵀ)^τ Cᵀ C A^τ )^{−1}

the maximum norm of the estimation error is

‖x̂_ls(0) − x(0)‖ ≤ √t α ‖O_t†‖

Infinite horizon uncertainty

the matrix

P = lim_{t→∞} ( Σ_{τ=0}^{t−1} (Aᵀ)^τ Cᵀ C A^τ )^{−1}

always exists, and gives the limiting uncertainty in estimating x(0) from u, y over longer and longer periods:

if A is stable, P > 0,

i.e., we can't estimate the initial state perfectly even with an infinite number of measurements u(t), y(t), t = 0, . . . (since memory of x(0) fades . . . )

if A is not stable, then P can have nonzero nullspace,

i.e., the initial state estimation error gets arbitrarily small (at least in some directions) as more and more of the signals u and y are observed
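For stable A the sum Σ_τ (Aᵀ)^τ Cᵀ C A^τ converges, and its limit W solves the discrete Lyapunov equation Aᵀ W A − W + Cᵀ C = 0, giving P = W⁻¹. A sketch using scipy's solver (hypothetical stable system, not from the notes):

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

# hypothetical stable discrete-time system (|eigenvalues| < 1)
A = np.array([[0.5, 0.4], [-0.4, 0.5]])
C = np.array([[1.0, 0.0]])

# W = sum_{tau>=0} (A^T)^tau C^T C A^tau solves A^T W A - W + C^T C = 0
W = solve_discrete_lyapunov(A.T, C.T @ C)

# check against the truncated sum (geometric decay makes 200 terms plenty)
Wnum = np.zeros((2, 2))
M = np.eye(2)
for _ in range(200):
    Wnum += M.T @ C.T @ C @ M
    M = A @ M
match = np.allclose(W, Wnum)
print(match)     # True; P = inv(W) > 0 since A is stable
```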

Example

a particle moves in R² with constant velocity; we take (linear, noisy) range measurements from directions −15°, 0°, 20°, 30°, once per second

range noises IID N(0, 1); can assume RMS value of v is not much more than 2

no assumptions about initial position & velocity

[figure: particle trajectory and the four range sensors]

the system is

x(t + 1) = [ 1 0 1 0 ]            [ k₁ᵀ ]
           [ 0 1 0 1 ] x(t), y(t)=[ ..  ] x(t) + v(t)
           [ 0 0 1 0 ]            [ k₄ᵀ ]
           [ 0 0 0 1 ]

where (x₁, x₂) is the particle position, (x₃, x₄) its velocity, and row kᵢᵀ gives the ith linearized range measurement

range measurements (& noiseless versions):

[plot: the four range measurements vs. t, 0 ≤ t ≤ 120]

estimate of the initial position, x̂(0|t); position estimate error is

√( (x̂₁(0|t) − x₁(0))² + (x̂₂(0|t) − x₂(0))² )

[plot: position estimate error vs. t, 10 ≤ t ≤ 120]

[plot: error measure vs. t, 10 ≤ t ≤ 120]

Continuous-time least-squares state estimation

the least-squares estimate of the initial state x(0), given u(τ), y(τ), 0 ≤ τ ≤ t: choose x̂_ls(0) to minimize the integral square residual

J = ∫₀ᵗ ‖ ỹ(τ) − Ce^{τA} x(0) ‖² dτ

where ỹ = y − h ∗ u is the output with the effect of the input subtracted off

let's expand as J = x(0)ᵀ Q x(0) + 2rᵀ x(0) + s, with

Q = ∫₀ᵗ e^{τAᵀ} Cᵀ C e^{τA} dτ,   r = − ∫₀ᵗ e^{τAᵀ} Cᵀ ỹ(τ) dτ,   s = ∫₀ᵗ ỹ(τ)ᵀ ỹ(τ) dτ

setting ∇_{x(0)} J to zero, we obtain the least-squares observer

x̂_ls(0) = −Q^{−1} r = ( ∫₀ᵗ e^{τAᵀ} Cᵀ C e^{τA} dτ )^{−1} ∫₀ᵗ e^{τAᵀ} Cᵀ ỹ(τ) dτ

the estimation error is

x̃(0) = x̂_ls(0) − x(0) = ( ∫₀ᵗ e^{τAᵀ} Cᵀ C e^{τA} dτ )^{−1} ∫₀ᵗ e^{τAᵀ} Cᵀ v(τ) dτ

therefore if v = 0 then x̂_ls(0) = x(0)
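The integrals Q and r can be approximated by quadrature; in the noiseless case the resulting estimate recovers x(0). A sketch (hypothetical system, midpoint rule; all parameters are arbitrary choices):

```python
import numpy as np
from scipy.linalg import expm

# hypothetical observable system; noiseless output ytilde(tau) = C e^{tau A} x0
A = np.array([[0.0, 1.0], [-1.0, 0.0]])
C = np.array([[1.0, 0.0]])
x0 = np.array([[0.5], [2.0]])
t, N = 3.0, 3000
h = t / N

Q = np.zeros((2, 2))
r = np.zeros((2, 1))
E = expm(0.5 * h * A)            # e^{tau A} at the first midpoint
M = expm(h * A)
for _ in range(N):
    y = C @ E @ x0               # noiseless output sample
    Q += h * E.T @ C.T @ C @ E   # Q = \int e^{tau A^T} C^T C e^{tau A} dtau
    r -= h * E.T @ C.T @ y       # r = -\int e^{tau A^T} C^T ytilde(tau) dtau
    E = M @ E

x0_hat = -np.linalg.solve(Q, r)  # least-squares observer xhat = -Q^{-1} r
ok = np.allclose(x0_hat, x0)
print(ok)                        # True
```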

EE263 Autumn 2008-09 Stephen Boyd

Lecture 20

Some parting thoughts . . .

linear algebra

levels of understanding

what's next?


Linear algebra

comes up in many practical contexts (EE, ME, CE, AA, OR, Econ, . . . )

nowadays the computations are routinely carried out, cf. 10 yrs ago (when it was mostly talked about)

500–1000 variables: easy with general purpose codes

much more possible with special structure, special codes (e.g., sparse, convolution, banded, . . . )

Levels of understanding

Platonic view:

Quantitative view:

must have understanding at one level before moving to next

What's next?

(plus lots of other EE, CS, ICME, MS&E, Stat, ME, AA courses on signal

processing, control, graphics & vision, adaptive systems, machine learning,

computational geometry, numerical linear algebra, . . . )
