
The Realm of Linear State Space Control Systems

Draft Version
Pantelis Sopasakis
Dipl. Chem. Eng., MSc. Appl. Math.,
National Technical University of Athens
November 5, 2011
Preface
Usually, the starting point for the study of Automatic Control is the Laplace
space of the complex variable $s \in \mathbb{C}$, sometimes referred to as the Complex
Frequency Domain, or the $z$-transform for digital systems. Using the Laplace
transform, the differential equations that accrue from the various principles of
mechanics or chemistry become algebraic equations which can be easily manipulated.
However, certain assumptions render this approach rather restrictive.
Firstly, the Laplace transform cannot be applied to arbitrary functions, and in
some cases it becomes cumbersome to invert. Moreover, the requirement for
zero initial conditions yields a rather stiff theory.
The State Space approach copes directly with the differential equations of the
system and exhibits the following advantages:
1. Single Input Single Output (SISO) and Multiple Input Multiple Output
(MIMO) systems are treated using the same framework.
2. New concepts such as Controllability and Observability are defined and
studied methodically. The state space representation is more appropriate
for the theoretical study of the underlying system.
3. It offers better insight into the system's structure, as its states are studied
explicitly, in contrast to a single-input single-output black box.
4. Analytical solutions are often available. In general, numerical solution
algorithms (e.g. the Runge-Kutta method) can be applied directly to the state equations.
5. The state-space formulation is the basis for the study of nonlinear systems.
The results outlined in these notes apply mainly to linear time-invariant
systems. The author has endeavoured to provide a concise and rigorous, yet
understandable, presentation of some of the most interesting results in linear
dynamical systems theory.
Finally, the State Space representation of dynamical systems is the basis for
the modern theory of Model Predictive Control and the ubiquitous Lyapunov
Theory of Stability.
Please cite these notes as follows:
Pantelis Sopasakis, The Realm of Linear State Space Control Systems, 2011,
Available online at http://users.ntua.gr/chvng/en (Accessed on: November 5,
2011).
Nobody knows why,
but the only theories which work are the mathematical ones.
(Michael Holt, in Mathematics in Art.)
Contents
1 Introduction
2 Classes of State Space Models
2.1 Linear Time Invariant Models
2.2 Linear Time Variant Models
2.3 Input Affine Nonlinear Models
2.4 Bilinear Models
3 Coordinates Transformations
4 Realizations of LTI systems
4.1 Equivalence of LTI systems
4.2 Diagonal Realization
4.2.1 Diagonalizability Criteria
4.2.2 Only real eigenvalues
4.2.3 Complex Diagonalization
4.2.4 Real Diagonalization with complex eigenvalues
4.2.5 Why is diagonalization useful?
4.3 Jordan Canonical Form
4.3.1 The Jordan Decomposition
4.3.2 Utility of the Jordan form
4.4 Canonical Controllable Form
4.5 Canonical Observable Form
5 Trajectories of LTI systems
5.1 General Properties of Solutions
5.2 The state transition matrix
5.3 Responses of LTI systems
5.3.1 The Impulse Response
5.3.2 The Step Response
6 Controllability
6.1 Controllability of LTI systems
6.2 Controllability and Equivalence
6.3 Advanced topics
6.3.1 Controllability under feedback
6.3.2 Connection to the Diagonal Realization
6.3.3 Controllability Gramian
6.4 Controllability and Cyclicity
6.4.1 Introduction to cyclicity
6.4.2 From cyclicity to controllability
7 Stability
7.1 Definitions
7.2 Some preliminary results
7.3 Pole Placement
7.3.1 Pole Placement for single input systems
7.3.2 Pole Placement for systems with multiple inputs
7.4 Stabilizability
7.5 Lyapunov Theory
7.5.1 Direct Lyapunov Theory
Chapter 1
Introduction
The Laplace transform and the use of transfer functions offer great flexibility
regarding the study of a system's dynamics and stability. However, the requirement
for the initial state to be the origin proves to be an important drawback.
The state space approach provides a more generic framework for the study of
dynamical systems. Although in what follows we will stick to the study of linear
systems, state space models can be used for the representation and study of
nonlinear systems and systems that are not Laplace-transformable. The system
dynamics evolves through the state variables:
\[ x(t) = [x_1(t)\ x_2(t)\ \cdots\ x_n(t)]^\top \in \mathbb{R}^n \]
which come to the outer world of the system as output variables:
\[ y(t) = [y_1(t)\ y_2(t)\ \cdots\ y_p(t)]^\top = h(x(t), u(t)) \in \mathbb{R}^p \]
where $h : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^p$ is a function that defines the state-to-output behaviour
of the system. Finally, let $u : \mathbb{R} \ni t \mapsto u(t) \in \mathbb{R}^m$ be the input variable to the
system, which we assume to be a controlled variable. In what follows we will
not consider any disturbances.
A state space model consists of two main parts which describe the input-to-state
dynamics and the state-to-output relations of the underlying system. In
particular these are:
1. The state equations: They describe the dynamic relationship between
the input variable $u$ and the state vector $x$. Usually these are ordinary
differential equations of the form $\dot{x}(t) = f(t, x(t), u(t))$ (time-variant
system) or $\dot{x}(t) = f(x(t), u(t))$ (time-invariant system). In these notes
we will study only time-invariant systems where the vector field
$f : \mathbb{R}^n \times \mathbb{R}^m \to \mathbb{R}^n$ is a linear function of the form $f(x, u) = Ax + Bu$,
where $A$ and $B$ are matrices of proper dimensions.
2. The output equations: They describe the state-to-output behaviour of the
system through relations of the form $y(t) = h(x(t), u(t))$.
Overall, a state space dynamical system admits the following general realization:
\[ \dot{x}(t) = f(x(t), u(t)) \tag{1.1a} \]
\[ y(t) = h(x(t), u(t)) \tag{1.1b} \]
Hereinafter we shall use the notation $\Sigma[x, u, y]$ to refer to a system with
state variable $x$, input variable $u$ and output variable $y$. The notion of an
equilibrium point applies to state space models as follows:
Definition 1 (Equilibrium point). A point $(x^\star, u^\star) \in \mathbb{R}^n \times \mathbb{R}^m$ is called an
equilibrium point for (1.1a) if $f(x^\star, u^\star) = 0$.
Chapter 2
Classes of State Space Models
First we consider the most general form of state space systems, which is
nonlinear and time-variant. We shall refer to this class of systems as $\mathcal{G}$.
These have the following structure:
\[ \dot{x}(t) = f(t, x(t), u(t)) \tag{2.1a} \]
\[ y(t) = h(t, x(t), u(t)) \tag{2.1b} \]
In what follows we use the symbols $n$, $m$ and $p$ respectively to refer to the
dimensions of the state, the input and the output vectors; usually $m \leq n$
and $p \leq n$. Little can be inferred from this general representation consisting
of (2.1a) and (2.1b) without any further assumptions. In most cases, state
space models have a more specific structure which allows us to derive particular
results. Predominantly, we have the following classes of systems, according to the
form of the vector fields $f$ and $h$.
2.1 Linear Time Invariant Models
These are models in which the vector field $f$ has the special form:
\[ f(t, x, u) = Ax(t) + Bu(t) \tag{2.2} \]
where $A \in M_n(\mathbb{R})$ is a square matrix and $B \in M_{n\times m}(\mathbb{R})$. The state-to-output
relation is described by:
\[ h(t, x, u) = Cx(t) + Du(t) \tag{2.3} \]
where $C \in M_{p\times n}(\mathbb{R})$ and $D \in M_{p\times m}(\mathbb{R})$; it is quite common that $D = 0$,
in which case Linear Time-Invariant (LTI) models appear even simpler. An
LTI system is represented as:
\[ \Sigma_{\mathrm{LTI}} = (A, B, C, D) \tag{2.4} \]
Here is an example of an LTI system with $n = 2$ and $m = p = 1$:
\[ \dot{x}(t) = \begin{bmatrix} 2 & 3 \\ 1 & 1 \end{bmatrix} x(t) + \begin{bmatrix} 1 \\ 0 \end{bmatrix} u(t) \tag{2.5} \]
\[ y(t) = \begin{bmatrix} 1 & 0 \end{bmatrix} x(t) = x_1(t) \tag{2.6} \]
For LTI systems, according to Definition 1, the pair $(x^\star, u^\star)$ is an equilibrium
point if $Ax^\star + Bu^\star = 0$. If we choose $x^\star = 0$ then the requirement becomes
$Bu^\star = 0$, or equivalently $u^\star \in \ker B$.
2.2 Linear Time Variant Models
Linear Time-Variant (LTV) models form a more general class than the LTI
class described above, in which the coefficients $A$, $B$, $C$, $D$ can vary with
time. LTV models are represented as follows:
\[ \dot{x}(t) = f(t, x(t), u(t)) = A(t)x(t) + B(t)u(t) \tag{2.7} \]
\[ y(t) = h(t, x(t), u(t)) = C(t)x(t) + D(t)u(t) \tag{2.8} \]
Fruitful results arise under the assumption that these matrices are periodic
functions of $t$. LTV systems will be referred to hereinafter as $\Sigma_{\mathrm{LTV}}$.
2.3 Input Affine Nonlinear Models
In this case the function $f$ has the following form:
\[ f(t, x, u) = \phi(x) + \sum_{i=1}^{m} g_i(x)\, u_i(t) = \phi(x) + g(x)\, u \tag{2.9} \]
where $\phi : \mathbb{R}^n \to \mathbb{R}^n$ and $g_i : \mathbb{R}^n \to \mathbb{R}^n$ are smooth functions. These models
describe a wide range of physical systems; however, in the most general case and
without any further assumptions the derived results are only local, holding in a
neighbourhood of an equilibrium point.
2.4 Bilinear Models
Bilinear models form a subclass of input affine models, particularly useful for
the modelling of chemical processes. Additionally, they possess a simpler structure
than that of input affine models, so one can derive richer results. Bilinear
models have the form:
\[ \dot{x}(t) = f(t, x(t), u(t)) = Ax(t) + Bu(t) + x^\top(t) S u(t) \tag{2.10} \]
\[ y(t) = h(t, x(t), u(t)) = Cx(t) + Du(t) \tag{2.11} \]
where $S \in M_{n\times m}(\mathbb{R})$.
Chapter 3
Coordinates Transformations
A coordinate transformation is a technique used to simplify the structure of
a given model and can be applied to any of the categories of Chapter 2. A
coordinate transformation, or change of coordinates, is defined as follows:
Definition 2 (Coordinates transformation). A function $\Phi : S \to S$, where
$S \subseteq \mathbb{R}^n$ is an open set, is called a coordinates transformation on $S$ (or a local
diffeomorphism on $S$) if the following hold:
1. $\Phi$ is surjective, i.e. for all $y \in S$ there is an $s \in S$ such that $\Phi(s) = y$.
2. $\Phi$ is invertible, i.e. there is a function $\Phi^{-1}$ such that $\Phi^{-1}(\Phi(x)) = x$ for
all $x \in S$.
3. The functions $\Phi$ and $\Phi^{-1}$ are infinitely many times differentiable.
Additionally, if $S = \mathbb{R}^n$, $\Phi$ is said to be a global change of coordinates.
Given a system $\Sigma[x, u]$ with state variable $x$ and input variable $u$, with
state equation:
\[ \dot{x}(t) = f(x, u) \tag{3.1} \]
and a change of coordinates
\[ z = \Phi(x) \tag{3.2} \]
we have
\[ \dot{z}(t) = \frac{d}{dt}\Phi(x(t)) = \frac{d\Phi}{dx}(x(t))\, \dot{x}(t) \tag{3.3} \]
\[ = \nabla\Phi(x(t))\, f(x, u) \tag{3.4} \]
But since $\Phi$ is invertible (see Definition 2), from (3.2) we have that $x = \Phi^{-1}(z)$.
Hence:
\[ \dot{z}(t) = \nabla\Phi(\Phi^{-1}(z))\, f(\Phi^{-1}(z), u) \triangleq \tilde{f}(z, u) \tag{3.5} \]
The output of the transformed system will be:
\[ y(t) = h(x, u) = h(\Phi^{-1}(z), u) \tag{3.6} \]
Example 1. Consider an LTI system $\Sigma_{\mathrm{LTI}} = (A, B, C, D)$ and a linear
coordinates transformation $z = \Phi(x) = Tx$, where $T$ is a nonsingular
(i.e. invertible) matrix. [It is easy for the reader to verify that $\Phi$ is indeed a
coordinates transformation.] Since $T$ is invertible, one has that:
\[ x = T^{-1}z \tag{3.7} \]
Substitution into the differential equation of the system yields:
\[ \dot{x} = Ax + Bu \tag{3.8} \]
\[ T^{-1}\dot{z} = AT^{-1}z + Bu \tag{3.9} \]
\[ \dot{z} = TAT^{-1}z + TBu \tag{3.10} \]
In the same way, the state-to-output relation in the new coordinates will be:
\[ y = Cx + Du \tag{3.11} \]
\[ = CT^{-1}z + Du \tag{3.12} \]
So the transformed system is again an LTI system $\tilde{\Sigma}_{\mathrm{LTI}}[z, u, y]$ where:
\[ \tilde{\Sigma}_{\mathrm{LTI}}[z, u, y] = (TAT^{-1}, TB, CT^{-1}, D) \tag{3.13} \]
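Equation (3.13) is easy to sanity-check numerically. The sketch below uses illustrative matrices (assumptions for the demo, not taken from the text): for any nonsingular $T$, the realization $(TAT^{-1}, TB, CT^{-1}, D)$ reproduces the state derivative (mapped through $T$) and the output of the original system.

```python
import numpy as np

# Illustrative LTI system (assumed for the demo, not from the text)
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

T = np.array([[1.0, 1.0], [0.0, 2.0]])   # any nonsingular T works
Tinv = np.linalg.inv(T)

# Transformed realization (TAT^{-1}, TB, CT^{-1}, D), cf. (3.13)
At, Bt, Ct = T @ A @ Tinv, T @ B, C @ Tinv

# z = Tx implies dz/dt = T dx/dt, and the outputs must coincide
x = np.array([[1.0], [2.0]])
u = np.array([[0.5]])
xdot = A @ x + B @ u
zdot = At @ (T @ x) + Bt @ u
assert np.allclose(zdot, T @ xdot)
assert np.allclose(Ct @ (T @ x), C @ x)
```

The same check works in any state dimension; only the nonsingularity of $T$ matters.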
We leave it to the reader as an exercise to verify that the mapping:
\[ z = \Phi(x) = \|x\|^2 x \tag{3.14} \]
is a coordinates transformation over $\mathbb{R}^n$. Show that:
\[ \frac{d\Phi}{dx}(x) = 2(\mathbf{1}^\top x)\, x \tag{3.15} \]
where $\mathbf{1} = [1\ 1\ \ldots\ 1]^\top$ and $\mathbf{1}^\top x = \sum_i x_i$. Show that the mapping is invertible.
Calculate the inverse mapping $x = \Phi^{-1}(z)$; for that purpose note that taking
the norm on both sides of (3.14) we have that $\|z\| = \|x\|^3$. Finally, apply (3.14)
to an LTI system using (3.5) and (3.6).
Definition 3. Two square matrices $A$ and $\tilde{A}$ are called similar if there is an
invertible matrix $T$ such that $\tilde{A} = TAT^{-1}$.
Proposition 1. Let $A, \tilde{A} \in M_n(\mathbb{R})$ be two similar matrices. Then they have
the same eigenvalues.
Proof. Since $A$ and $\tilde{A}$ are similar, there is a $T \in GL_n(\mathbb{R})$ so that $\tilde{A} = TAT^{-1}$.
Let $\lambda \in \mathbb{C}$ be an eigenvalue of $A$ and $v$ the corresponding eigenvector.
Then
\[ Av = \lambda v \]
Left-multiplying both sides by $T$ yields:
\[ TAv = \lambda Tv \]
Setting $u = Tv$, so that $v = T^{-1}u$ (which is possible since $T$ is invertible), we obtain:
\[ TAT^{-1}u = \lambda u \]
\[ \tilde{A}u = \lambda u \]
which proves that $\lambda$ is an eigenvalue of $TAT^{-1}$ with eigenvector $u = Tv$.
Proposition 2. Let $A$ and $B$ be two similar matrices with $B = TAT^{-1}$,
$T \in GL(n, \mathbb{R})$. If $v$ is an eigenvector of $A$, then $Tv$ is an eigenvector of $B$.
Proof. Let $v$ be an eigenvector of $A$ and $\lambda$ the corresponding eigenvalue. Then
it holds that:
\[ Av = \lambda v \tag{3.16} \]
\[ T^{-1}BTv = \lambda v \tag{3.17} \]
\[ B(Tv) = \lambda(Tv) \tag{3.18} \]
which shows that $Tv$ is an eigenvector of $B$.
Changes of coordinates are also used to establish the notion of equivalence
between dynamical systems. Intuitively speaking, two systems are said to be
equivalent if they exhibit qualitatively similar dynamic behaviour.
Definition 4 (Equivalent systems). Two systems $\Sigma_1[x, u]$ and $\Sigma_2[z, u]$ are said
to be equivalent if there is a change of coordinates $\Phi$ that transforms $\Sigma_1$ into
$\Sigma_2$. In that case we use the notation $\Sigma_1 \sim \Sigma_2$. In particular, when we need to
emphasize that two systems are equivalent using a change of coordinates $\Phi$, we
write $\Sigma_1 \sim_\Phi \Sigma_2$.
As expected, the binary relation $\sim$ is transitive: for three systems $\Sigma_1$,
$\Sigma_2$ and $\Sigma_3$, if $\Sigma_1 \sim \Sigma_2$ and $\Sigma_2 \sim \Sigma_3$, then $\Sigma_1 \sim \Sigma_3$. It is also easy to see that this
relation is symmetric, meaning that if $\Sigma_1 \sim \Sigma_2$ then $\Sigma_2 \sim \Sigma_1$ as well. Finally, for
every system $\Sigma$ it holds that $\Sigma \sim \Sigma$. As a result, $\sim$ is an equivalence relation.
Whenever we need to tell whether two systems $\Sigma_1$ and $\Sigma_2$ are equivalent,
according to the definition we have to find a coordinates transformation $\Phi$ so
that $\Sigma_1 \sim_\Phi \Sigma_2$. It is, however, often easier to determine a system $\tilde{\Sigma}$ so that
$\Sigma_1 \sim \tilde{\Sigma}$ and $\Sigma_2 \sim \tilde{\Sigma}$. In the next chapter we will describe this procedure in detail.
Example 2. The following LTI system is given:
\[ \dot{x}(t) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{bmatrix} x(t) + \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} u(t) \tag{3.19a} \]
\[ y(t) = x(t) \tag{3.19b} \]
And the change of coordinates $z = Tx$ where:
\[ T = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & \sqrt{2} \end{bmatrix} \tag{3.20} \]
The matrix $T$ is invertible (as it can easily be seen that $|T| = \sqrt{2} \neq 0$), so
$z = Tx$ establishes a linear coordinates transformation. The inverse matrix of
$T$ is:
\[ T^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & \tfrac{\sqrt{2}}{2} \\ 0 & 0 & \tfrac{\sqrt{2}}{2} \end{bmatrix} \tag{3.21} \]
This change of coordinates yields the following LTI system:
\[ \dot{z}(t) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix} z(t) + \begin{bmatrix} 1 \\ -1 \\ 3\sqrt{2} \end{bmatrix} u(t) \tag{3.22a} \]
\[ y(t) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & \tfrac{\sqrt{2}}{2} \\ 0 & 0 & \tfrac{\sqrt{2}}{2} \end{bmatrix} z(t) \tag{3.22b} \]
Notice that in the transformed system the matrix that multiplies the state vector
in (3.22a) is diagonal, which simplifies the structure of the system; any
conclusion can be derived much more easily.
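The transformation in Example 2 can be verified numerically. A minimal numpy sketch (the minus signs in $T$ and in the transformed input vector are easy to drop by accident, so it is worth checking):

```python
import numpy as np

s2 = np.sqrt(2.0)
A = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 2.0]])
B = np.array([[1.0], [2.0], [3.0]])
T = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, -1.0],
              [0.0, 0.0, s2]])
Tinv = np.linalg.inv(T)

At, Bt = T @ A @ Tinv, T @ B      # transformed (A, B); since C = I, Ct = Tinv

assert np.allclose(At, np.diag([1.0, 1.0, 2.0]))        # (3.22a) is diagonal
assert np.allclose(Bt.ravel(), [1.0, -1.0, 3.0 * s2])   # transformed input vector
assert np.allclose(Tinv, [[1.0, 0.0, 0.0],
                          [0.0, 1.0, s2 / 2.0],
                          [0.0, 0.0, s2 / 2.0]])        # (3.21)
```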
Chapter 4
Realizations of LTI systems
In this chapter we will introduce certain linear coordinates transformations that
allow us to simplify the structure of LTI systems. Equivalence between LTI
systems will be studied in more detail and various criteria for equivalence will
be formulated.
4.1 Equivalence of LTI systems
Consider the following state space model:
\[ \dot{x}(t) = Ax(t) + Bu(t) \tag{4.1a} \]
\[ y(t) = Cx(t) + Du(t) \tag{4.1b} \]
and let us assume that $x(0) = 0$. Let us use the notation $U(s) \triangleq \mathcal{L}\{u(t)\}(s)$
and $X(s) \triangleq \mathcal{L}\{x(t)\}(s)$ for the Laplace transforms of $u(t)$ and $x(t)$ respectively.
Applying the Laplace transform to (4.1a) and (4.1b) we get:
\[ \text{(4.1a)} \Rightarrow sX(s) = AX(s) + BU(s) \tag{4.2} \]
\[ (sI - A)X(s) = BU(s) \tag{4.3} \]
\[ X(s) = (sI - A)^{-1}BU(s) \tag{4.4} \]
and for the second equation we have:
\[ \text{(4.1b)} \Rightarrow Y(s) = CX(s) + DU(s) \tag{4.5} \]
\[ Y(s) = \left[ C(sI - A)^{-1}B + D \right] U(s) \tag{4.6} \]
So the way the input affects the output is reflected in the following transfer
function:
\[ H(s) = C(sI - A)^{-1}B + D \tag{4.7} \]
which is defined for all $s \in \mathbb{C}$ except for those for which the matrix $sI - A$ is
singular (i.e. its determinant is zero: $|sI - A| = 0$), that is, for the eigenvalues
of the matrix $A$. Before proceeding to the statement of a very important
proposition we give the following result:
Proposition 3. Let $\Sigma_1$ and $\Sigma_2$ be two LTI systems with $\Sigma_1 \sim \Sigma_2$. Then there is
a nonsingular matrix $T \in M_n(\mathbb{R})$ so that the change of coordinates $z = \Phi(x) = Tx$
is such that $\Sigma_1 \sim_\Phi \Sigma_2$.
Proof. The proof is left to the reader as an exercise. Hint: use (3.6).
The meaning of Proposition 3 is that if two linear time-invariant systems are
equivalent, then they are equivalent by means of a linear transformation. It is
quite easy to guess intuitively that this holds, but it is good to verify it rigorously.
A very interesting result is stated in the following proposition:
Proposition 4. Let $\Sigma_1 = (A, B, C, D)$ be an LTI system with transfer
function $H_1(s)$, and let $\Sigma_2$ be an LTI system with $\Sigma_1 \sim \Sigma_2$ and transfer function $H_2(s)$.
Then $H_1(s) = H_2(s)$.
Proof. Since $\Sigma_1$ is equivalent to $\Sigma_2 = (\tilde{A}, \tilde{B}, \tilde{C}, D)$, according to Proposition 3
there is a change of coordinates $z = F(x)$ so that $\Sigma_1 \sim_F \Sigma_2$, and since $\Sigma_1, \Sigma_2$ are
LTI, $F(x)$ is a linear function, that is, there is an invertible matrix $T$ such that
$F(x) = Tx$. The transfer function of $\Sigma_2$ will then be:
\[ H_2(s) = \tilde{C}(sI - \tilde{A})^{-1}\tilde{B} + D \tag{4.8} \]
\[ = CT^{-1}(sI - TAT^{-1})^{-1}TB + D \tag{4.9} \]
\[ = C\left(T^{-1}(sI - TAT^{-1})T\right)^{-1}B + D \tag{4.10} \]
\[ = C(sI - A)^{-1}B + D \tag{4.11} \]
\[ = H_1(s) \tag{4.12} \]
Note that in the above algebraic manipulations we used the fact that for any
invertible matrices $A$ and $B$, $AB$ is invertible and $(AB)^{-1} = B^{-1}A^{-1}$. Equation
(4.12) holds for all $s \in \mathbb{C}$ except for those that are eigenvalues of $A$. This
completes the proof.
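Proposition 4 lends itself to a quick numerical sanity check. The sketch below uses randomly generated matrices (illustrative, not from the text) and evaluates $H(s)$ at a few points away from the spectrum of $A$:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, p = 3, 2, 2
# Random illustrative system; the shift keeps the spectrum away from
# the sample points s used below.
A = rng.standard_normal((n, n)) - 3.0 * np.eye(n)
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))
D = rng.standard_normal((p, m))

T = rng.standard_normal((n, n))               # generically invertible
At, Bt, Ct = T @ A @ np.linalg.inv(T), T @ B, C @ np.linalg.inv(T)

def H(A, B, C, D, s):
    # H(s) = C (sI - A)^{-1} B + D, defined away from the eigenvalues of A
    return C @ np.linalg.solve(s * np.eye(A.shape[0]) - A, B) + D

for s in [0.0, 1.0, 2.5]:
    assert np.allclose(H(A, B, C, D, s), H(At, Bt, Ct, D, s))
```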
Corollary 5. Two LTI systems are equivalent if and only if they have the same
impulse response.
The following proposition allows us to compare any two LTI systems:
Proposition 6. Let $\Sigma_1 = (A, B, C, D)$ and $\Sigma_2 = (\tilde{A}, \tilde{B}, \tilde{C}, \tilde{D})$ be two LTI
systems, where $A, \tilde{A} \in M_n(\mathbb{R})$, $B, \tilde{B} \in M_{n\times m}(\mathbb{R})$, $C, \tilde{C} \in M_{p\times n}(\mathbb{R})$ and
$D, \tilde{D} \in M_{p\times m}(\mathbb{R})$. The following are equivalent:
1. $\Sigma_1 \sim \Sigma_2$
2. $D = \tilde{D}$ and for $k = 0, 1, 2, \ldots, n-1$ it holds that $CA^kB = \tilde{C}\tilde{A}^k\tilde{B}$
Proof. $1 \Rightarrow 2$. If the LTI systems $\Sigma_1$ and $\Sigma_2$ are equivalent, according to
Proposition 3 there is an invertible matrix $T \in M_n(\mathbb{R})$ so that $\Sigma_2 \sim_T \Sigma_1$. Then
$D = \tilde{D}$ and:
\[ \tilde{C}\tilde{A}^k\tilde{B} = CT^{-1}(TAT^{-1})^kTB \tag{4.13} \]
\[ = CT^{-1}TA^kT^{-1}TB \tag{4.14} \]
\[ = CA^kB \tag{4.15} \]
and this holds for all $k = 0, 1, \ldots, n-1$.
$2 \Rightarrow 1$. In order to show that the two systems are equivalent, according
to Corollary 5 it suffices to show that they have the same impulse response.
Without loss of generality we may assume that $D = \tilde{D} = 0$. The impulse
response of $\Sigma_1$ is:
\[ y_{\delta,1}(t) = \mathcal{L}^{-1}\{H_1(s)\} = Ce^{At}B \tag{4.16} \]
Applying the Taylor expansion formula to (4.16) around $t = 0$ we get:
\[ y_{\delta,1}(t) = CB + CABt + CA^2B\frac{t^2}{2!} + \cdots + CA^kB\frac{t^k}{k!} + \cdots \tag{4.17} \]
And the impulse response of $\Sigma_2$ will be:
\[ y_{\delta,2}(t) = \tilde{C}\tilde{B} + \tilde{C}\tilde{A}\tilde{B}t + \tilde{C}\tilde{A}^2\tilde{B}\frac{t^2}{2!} + \cdots + \tilde{C}\tilde{A}^k\tilde{B}\frac{t^k}{k!} + \cdots \tag{4.18} \]
According to the Cayley-Hamilton theorem from Linear Algebra, if $CA^kB = \tilde{C}\tilde{A}^k\tilde{B}$
holds for $k = 0, 1, 2, \ldots, n-1$ then it also holds for all $k \in \mathbb{N}$, and
consequently $y_{\delta,1} = y_{\delta,2}$, which completes the proof.
Note: The matrices $\Theta_k = CA^kB \in M_{p\times m}(\mathbb{R})$ for $k = 0, 1, \ldots, n-1$ are known
as the Markov parameters of the LTI system.
Example 3. We will show that the following systems are equivalent. System
$\Sigma_1$ is given by:
\[ \dot{x}(t) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{bmatrix} x(t) + \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} u(t) \tag{4.19} \]
\[ y(t) = x(t) \tag{4.20} \]
and $\Sigma_2$ is given by:
\[ \dot{z}(t) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix} z(t) + \begin{bmatrix} 1 \\ -1 \\ 3\sqrt{2} \end{bmatrix} u(t) \tag{4.21} \]
\[ y(t) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & \tfrac{\sqrt{2}}{2} \\ 0 & 0 & \tfrac{\sqrt{2}}{2} \end{bmatrix} z(t) \tag{4.22} \]
Here, we have $n = 3$. We calculate the Markov parameters of the two systems:
\[ CB = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} \quad \text{and} \quad
\tilde{C}\tilde{B} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & \tfrac{\sqrt{2}}{2} \\ 0 & 0 & \tfrac{\sqrt{2}}{2} \end{bmatrix}
\begin{bmatrix} 1 \\ -1 \\ 3\sqrt{2} \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = CB \tag{4.23} \]
In the same way we have:
\[ CAB = \begin{bmatrix} 1 \\ 5 \\ 6 \end{bmatrix} \tag{4.24} \]
and
\[ \tilde{C}\tilde{A}\tilde{B} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & \tfrac{\sqrt{2}}{2} \\ 0 & 0 & \tfrac{\sqrt{2}}{2} \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}
\begin{bmatrix} 1 \\ -1 \\ 3\sqrt{2} \end{bmatrix} = \begin{bmatrix} 1 \\ 5 \\ 6 \end{bmatrix} = CAB \tag{4.25} \]
And finally:
\[ CA^2B = \begin{bmatrix} 1 \\ 11 \\ 12 \end{bmatrix} \tag{4.28} \]
while
\[ \tilde{C}\tilde{A}^2\tilde{B} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & \tfrac{\sqrt{2}}{2} \\ 0 & 0 & \tfrac{\sqrt{2}}{2} \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix}^2
\begin{bmatrix} 1 \\ -1 \\ 3\sqrt{2} \end{bmatrix} = \begin{bmatrix} 1 \\ 11 \\ 12 \end{bmatrix} = CA^2B \tag{4.29} \]
Let us note here that if $\Lambda = \mathrm{diag}\{d_j\}_{j\in J}$ is a diagonal matrix, then its $k$-th
power is given by $\Lambda^k = \mathrm{diag}\{d_j^k\}_{j\in J}$.
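The Markov-parameter test of Proposition 6 is straightforward to script. A numpy sketch for the two systems of this example (with the transformed input vector $[1,\ -1,\ 3\sqrt{2}]^\top$ as in (4.21)):

```python
import numpy as np

s2 = np.sqrt(2.0)
A = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 1.0], [0.0, 0.0, 2.0]])
B = np.array([[1.0], [2.0], [3.0]])
C = np.eye(3)
Ad = np.diag([1.0, 1.0, 2.0])
Bd = np.array([[1.0], [-1.0], [3.0 * s2]])
Cd = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, s2 / 2], [0.0, 0.0, s2 / 2]])

def markov(A, B, C, n):
    """Markov parameters C A^k B for k = 0, ..., n-1."""
    out, Ak = [], np.eye(A.shape[0])
    for _ in range(n):
        out.append(C @ Ak @ B)
        Ak = Ak @ A
    return out

for M1, M2 in zip(markov(A, B, C, 3), markov(Ad, Bd, Cd, 3)):
    assert np.allclose(M1, M2)
```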
In the sequel we introduce certain linear changes of coordinates so that a
linear system acquires a convenient form which allows its further study.
4.2 Diagonal Realization
In Example 3 the matrix $\tilde{A}$ of the equivalent system had a diagonal form. This is
in general feasible when $A$ has $n$ linearly independent eigenvectors, as stated in
Theorem 7.
Theorem 7 (Spectral Decomposition). Assume that $A \in M_n(\mathbb{R})$ has $n$ linearly
independent eigenvectors. Then there are matrices $V \in M_n(\mathbb{C})$ and $\Lambda \in M_n(\mathbb{C})$
so that $A$ is decomposed into:
\[ A = V\Lambda V^{-1} \tag{4.32} \]
In particular,
\[ \Lambda = \begin{bmatrix} \lambda_1(A) & & & \\ & \lambda_2(A) & & \\ & & \ddots & \\ & & & \lambda_n(A) \end{bmatrix} \tag{4.33} \]
is a diagonal matrix with the eigenvalues of $A$, and $V = [v_1\ \cdots\ v_n]$ is the
matrix of eigenvectors of $A$, where $v_i$ is the eigenvector that corresponds to the
eigenvalue $\lambda_i(A)$. Then $A \sim_V \Lambda$ and $A$ is said to be diagonalizable.
Proof. Assume first that there is some $V \in GL(n, \mathbb{C})$ so that $V^{-1}AV = \Lambda =
\mathrm{diag}\{\lambda_i\}_{i=1}^n$ is diagonal (i.e. $A$ admits a diagonal realization). $A$ and $\Lambda$ are
similar, hence they have the same eigenvalues, namely $\{\lambda_i\}_{i=1}^n$. The corresponding
eigenvectors of $\Lambda$ are $e_1 = (1, 0, \ldots, 0)$, $e_2 = (0, 1, \ldots, 0)$, ... and
$e_n = (0, 0, \ldots, 1)$. Then the eigenvectors of $A$ are the vectors $Ve_1, Ve_2, \ldots, Ve_n$
(see Proposition 2), which are again linearly independent.
Assume now that $A$ has $n$ linearly independent eigenvectors, namely $\{t_i\}_{i=1}^n$,
and define:
\[ V \triangleq [t_1\ t_2\ \ldots\ t_n] \tag{4.34} \]
Let $\{\lambda_i\}_{i=1}^n$ be the eigenvalues of $A$. Then one has that:
\[ V \mathrm{diag}\{\lambda_i\}_{i=1}^n = [t_1\ t_2\ \ldots\ t_n]
\begin{bmatrix} \lambda_1 & & & \\ & \lambda_2 & & \\ & & \ddots & \\ & & & \lambda_n \end{bmatrix} \tag{4.35} \]
\[ = [\lambda_1 t_1\ \lambda_2 t_2\ \ldots\ \lambda_n t_n] \tag{4.36} \]
\[ = [At_1\ At_2\ \ldots\ At_n] \tag{4.37} \]
\[ = AV \tag{4.38} \]
Therefore,
\[ \mathrm{diag}\{\lambda_i\}_{i=1}^n = V^{-1}AV \tag{4.39} \]
which completes the proof.
The diagonalization of an LTI system is carried out using the linear
transformation $z = \Phi(x) = V^{-1}x$. Then the matrix $\tilde{A} = V^{-1}AV$ is diagonal (recall
that $\tilde{A} = TAT^{-1}$ and here $T = V^{-1}$). Of course, there are systems that do not
admit a diagonal form (in that case the Jordan Canonical Form is employed).
Diagonalization, if possible, provides complete input-to-state decoupling, i.e. it
becomes clear which inputs affect which states and how much. The dynamics
become much simpler and it becomes easy to solve the system equations
analytically, from which the state trajectory accrues. In what follows we assume that
$A$ is diagonalizable.
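In practice the decomposition of Theorem 7 is computed numerically, e.g. with numpy's `eig`. A small sketch (the matrix is illustrative; its three eigenvalues are distinct, so it is diagonalizable):

```python
import numpy as np

# Illustrative matrix with three distinct eigenvalues (2, 3, 5), hence
# three independent eigenvectors and a diagonal realization.
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 5.0]])
lam, V = np.linalg.eig(A)          # columns of V are eigenvectors

Lam = np.linalg.inv(V) @ A @ V     # V^{-1} A V, cf. (4.39)
assert np.allclose(Lam, np.diag(lam))
assert np.allclose(A, V @ np.diag(lam) @ np.linalg.inv(V))   # (4.32)
```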
4.2.1 Diagonalizability Criteria
We know that a matrix $A \in M_n(\mathbb{R})$ is diagonalizable if and only if it has $n$
linearly independent eigenvectors, i.e. when its eigenvectors span the whole of $\mathbb{R}^n$.
This implies that one needs to calculate all the eigenvectors of $A$ to check whether
it is diagonalizable, which is not an easy task from the computational point
of view. Here we give an alternative criterion which can be checked more easily. We
give the following necessary definitions from Linear Algebra:
Definition 5 (Characteristic Polynomial). Let $A \in M_n(K)$ where $K$ is $\mathbb{R}$ or
$\mathbb{C}$. The polynomial:
\[ \chi_A(\lambda) = \det(A - \lambda I) \tag{4.40} \]
is called the characteristic polynomial of $A$ and is a polynomial over $\mathbb{C}$.
Definition 6 (Eigenvalues, Algebraic Multiplicity). If $\lambda$ is a root of the polynomial
$\chi_A$ of multiplicity $\mu$, it is called an eigenvalue of $A$ and $\mu$ is its algebraic
multiplicity. An eigenvalue is called simple if its algebraic multiplicity is 1.
Hereinafter we shall denote the algebraic multiplicity of an eigenvalue $\lambda$ by $\mu(\lambda)$.
Definition 7 (Eigenspace, Geometric Multiplicity). Let $\lambda$ be an eigenvalue of
$A$. The vector space:
\[ \mathcal{V}(\lambda) = \ker(A - \lambda I) = \{x \in \mathbb{R}^n \mid (A - \lambda I)x = 0\} \tag{4.41} \]
is called the eigenspace of $\lambda$, and its dimension is the geometric multiplicity $\nu(\lambda)$
of the eigenvalue $\lambda$.
Note: The geometric multiplicity of an eigenvalue can be calculated using the
fact that for any matrix $X$ it holds that $\dim\ker X = n - \mathrm{rank}\, X$ (from the rank-nullity
theorem, see [Rom00, pp. 57-58]). Therefore $\nu(\lambda) = \dim\ker(A - \lambda I) =
n - \mathrm{rank}(A - \lambda I)$.
Proposition 8. For every eigenvalue $\lambda$ of a matrix $A$ it holds that
\[ \nu(\lambda) \leq \mu(\lambda) \tag{4.42} \]
Definition 8 (Semisimple eigenvalue). An eigenvalue $\lambda$ is said to be semisimple
if $\nu(\lambda) = \mu(\lambda)$.
Proposition 9. A matrix is diagonalizable if and only if all its eigenvalues are
semisimple.
Proofs of Propositions 8 and 9 can be found in [Mey00]. From Proposition 9 it
is clear that if a matrix in $M_n$ has $n$ distinct eigenvalues, then all of them are
simple and a fortiori semisimple, thus the matrix is diagonalizable.
Example 4 (Diagonalizability). We need to check whether the following matrix
is diagonalizable:
\[ A = \begin{bmatrix} 5 & 1 & 0 \\ 0 & 5 & 1 \\ 0 & 0 & 5 \end{bmatrix} \tag{4.43} \]
Its characteristic polynomial is:
\[ \chi_A(\lambda) = (5 - \lambda)^3 \tag{4.44} \]
So it has one eigenvalue $\lambda = 5$ with algebraic multiplicity $\mu(5) = 3$. The
corresponding eigenspace is:
\[ \mathcal{V}(5) = \ker(A - 5I) = \ker \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} \tag{4.45} \]
From the rank-nullity theorem we have that:
\[ \dim \mathcal{V}(5) = 3 - \mathrm{rank} \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix} = 1 \tag{4.46} \]
That is,
\[ \nu(5) = 1 < \mu(5) \tag{4.47} \]
Therefore $\lambda = 5$ is not semisimple and $A$ is not diagonalizable.
In this particular case we could reach the same conclusion a bit differently.
Using reductio ad absurdum, let us assume that $A$ is diagonalizable. Then it
should be similar to the following matrix:
\[ L = \begin{bmatrix} 5 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 5 \end{bmatrix} = 5I \tag{4.48} \]
Then there exists a $T \in GL(n, \mathbb{R})$ such that:
\[ A = T^{-1}LT \tag{4.49} \]
\[ = T^{-1}5IT \tag{4.50} \]
\[ = 5I \tag{4.51} \]
which is obviously not true. In fact, the only matrix that is similar to $L$ is $L$ itself!
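The multiplicity check of Example 4 maps directly onto the rank-nullity note above; a numpy sketch:

```python
import numpy as np

# The matrix of Example 4: mu(5) = 3 but nu(5) = 3 - rank(A - 5I) = 1,
# so the eigenvalue 5 is not semisimple and A is not diagonalizable.
A = np.array([[5.0, 1.0, 0.0],
              [0.0, 5.0, 1.0],
              [0.0, 0.0, 5.0]])
n = A.shape[0]
geometric = n - np.linalg.matrix_rank(A - 5.0 * np.eye(n))
assert geometric == 1

# By contrast, for 5I the eigenvalue 5 is semisimple: nu(5) = mu(5) = 3.
assert n - np.linalg.matrix_rank(5.0 * np.eye(n) - 5.0 * np.eye(n)) == n
```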
4.2.2 Only real eigenvalues
If $A$ is diagonalizable and has only real eigenvalues, then in the new
coordinates the system looks as follows:
\[ \dot{z}(t) = \mathrm{diag}\{\lambda_1, \lambda_2, \ldots, \lambda_n\}\, z(t) + \tilde{B}u(t) \tag{4.52} \]
where the rows of $\tilde{B} = V^{-1}B$ are $\tilde{b}_1^\top, \ldots, \tilde{b}_n^\top$ with $\tilde{b}_i \in \mathbb{R}^m$, and each state
(in the new coordinates $z = V^{-1}x$) satisfies an equation of the form:
\[ \frac{dz_i(t)}{dt} = \lambda_i z_i(t) + \tilde{b}_i^\top u(t) \tag{4.53} \]
which is easily solved to give (for details see Proposition 19):
\[ z_i(t) = e^{\lambda_i t} z_i(0) + \int_0^t e^{\lambda_i (t-s)}\, \tilde{b}_i^\top u(s)\, ds \tag{4.54} \]
Uncontrollable states show up as those whose coefficient $\tilde{b}_i$ is 0. This way it is
understood that there is no way the input can affect some states - a fact that
was not obvious in the initial formulation. Equation (4.54) for $\tilde{b}_i = 0$ becomes:
\[ z_i(t) = e^{\lambda_i t} z_i(0) \tag{4.55} \]
from which it is now obvious that $z_i$ is not affected by the input.
Note: To this end, we examine the special case where the input is scalar
($m = 1$) and we provide a coordinates transformation so that the elements
of $\tilde{B} = V^{-1}B$ are either 0 or 1. In that case the system dynamics are
given by an equation of the following form:
\[ \dot{z}(t) = \mathrm{diag}\{\lambda_1, \lambda_2, \ldots, \lambda_n\}\, z(t) + [\tilde{b}_1\ \tilde{b}_2\ \cdots\ \tilde{b}_n]^\top u(t) \tag{4.56} \]
For all $i = 1, \ldots, n$ such that $\tilde{b}_i \neq 0$ we introduce the variable
$\eta_i(t) = z_i(t)/\tilde{b}_i$, and in case $\tilde{b}_i = 0$ we set $\eta_i(t) = z_i(t)$. For example, in case
all $\tilde{b}_i \neq 0$ the system becomes:
\[ \dot{\eta}(t) = \mathrm{diag}\{\lambda_1, \lambda_2, \ldots, \lambda_n\}\, \eta(t) + [1\ 1\ \cdots\ 1]^\top u(t) \tag{4.57} \]
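Equation (4.54) is the scalar variation-of-constants formula; for a constant input it integrates in closed form, which makes a convenient numerical check. A sketch (the numerical values are illustrative):

```python
import numpy as np

# Scalar mode dz/dt = lam*z + b*u with constant input u; then (4.54) gives
#   z(t) = e^{lam t} z(0) + (b*u/lam) * (e^{lam t} - 1),   lam != 0.
lam, b, z0, u, t = -2.0, 1.5, 1.0, 0.8, 0.7
closed = np.exp(lam * t) * z0 + (b * u / lam) * (np.exp(lam * t) - 1.0)

# Cross-check with a crude forward-Euler integration of the same mode.
z, dt = z0, 1e-5
for _ in range(round(t / dt)):
    z += dt * (lam * z + b * u)
assert abs(z - closed) < 1e-3
```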
4.2.3 Complex Diagonalization
The aforementioned procedure applies also when $A$ has complex eigenvalues.
However, this leads to linear differential equations with complex coefficients that
are hard to interpret. So, for example, the system:
\[ \frac{d}{dt}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \end{bmatrix} u(t) \tag{4.58} \]
is diagonalized using the complex mapping $z = Tx$ with $T \in GL(n, \mathbb{C})$ (meaning
that $z \in \mathbb{C}^n$) given by:
\[ T = \begin{bmatrix} 1 & j \\ 1 & -j \end{bmatrix} \tag{4.59} \]
In the new coordinates the system becomes:
\[ \frac{d}{dt}\begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \begin{bmatrix} 1-j & 0 \\ 0 & 1+j \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} + \begin{bmatrix} 1+j \\ 1-j \end{bmatrix} u(t) \tag{4.60} \]
Obviously, the states of this system are complex - not exactly the simplified version
of the system we had in mind. Optionally, we may introduce new variables
in order to normalize all input coefficients to 1:
\[ \eta_1 = \frac{z_1}{1+j}, \qquad \eta_2 = \frac{z_2}{1-j} \tag{4.61} \]
and the system in $\eta$-coordinates becomes:
\[ \frac{d}{dt}\begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix} = \begin{bmatrix} 1-j & 0 \\ 0 & 1+j \end{bmatrix} \begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix} + \begin{bmatrix} 1 \\ 1 \end{bmatrix} u(t) \tag{4.62} \]
Although one may use the complex diagonalization of a system to draw conclusions
regarding its dynamics, it is generally advisable to use the real diagonalization
technique that is provided in the next section.
4.2.4 Real Diagonalization with complex eigenvalues
Complex eigenvalues of real matrices appear in conjugate pairs. So, if $\lambda_i$ is
a complex eigenvalue of $A$ with $\lambda_i = a + jb$ (where $j = \sqrt{-1}$), then there is
some other eigenvalue $\lambda_k$, $k \neq i$, of $A$ so that $\lambda_k = \bar{\lambda}_i = a - jb$ (without loss
of generality we assume that $k = i + 1$). This gives rise to a pair of differential
equations of the form:
\[ \dot{z}_i(t) = (a + jb)z_i(t) + u(t) \tag{4.63} \]
\[ \dot{z}_{i+1}(t) = (a - jb)z_{i+1}(t) + u(t) \tag{4.64} \]
Assume for simplicity, and without loss of generality, that the output is scalar;
then, at the same time, the state-to-output relation is given by:
\[ y(t) = [c_1\ c_2\ \cdots\ c_n]\, x(t) + d\, u(t) \tag{4.65} \]
From (4.63) and (4.64) we make the simple observation that $z_i(t)$ and $z_{i+1}(t)$ are
conjugate. Assume that $z_i(t)$ is written in the form:
\[ z_i(t) = \alpha(t) + j\beta(t) \tag{4.66} \]
where $\alpha(t)$ and $\beta(t)$ stand for the real and the imaginary parts of $z_i(t)$. Then
\[ z_{i+1}(t) = \alpha(t) - j\beta(t) \tag{4.67} \]
Therefore the dynamics of $z_{i+1}(t)$ can simply be retrieved from the dynamics of
$z_i(t)$ - a fact that renders $z_{i+1}(t)$ redundant. Now, if we substitute (4.66) into
(4.63) we get:
\[ \dot{z}_i(t) = \dot{\alpha}(t) + j\dot{\beta}(t) = (a + jb)(\alpha(t) + j\beta(t)) + u(t) \tag{4.68} \]
\[ = (a\alpha(t) - b\beta(t) + u(t)) + j(b\alpha(t) + a\beta(t)) \tag{4.69} \]
This yields the following pair of differential equations:
\[ \dot{\alpha}(t) = a\alpha(t) - b\beta(t) + u(t) \tag{4.70} \]
\[ \dot{\beta}(t) = b\alpha(t) + a\beta(t) \tag{4.71} \]
or, in matrix form:
\[ \frac{d}{dt}\begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \begin{bmatrix} a & -b \\ b & a \end{bmatrix} \begin{bmatrix} \alpha \\ \beta \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \end{bmatrix} u(t) \tag{4.72} \]
If we replace $z_i$ and $z_{i+1}$ with $\alpha$ and $\beta$ we come up with the following system
of equations:
\[ \dot{\zeta}(t) = \mathrm{block\,diag}\left\{\lambda_1, \ldots, \begin{bmatrix} a & -b \\ b & a \end{bmatrix}, \ldots, \lambda_n\right\} \zeta(t) + \begin{bmatrix} 1 \\ \vdots \\ 1 \\ 0 \\ \vdots \\ 1 \end{bmatrix} u(t) \tag{4.73} \]
where
\[ \zeta(t) \triangleq [z_1\ z_2\ \cdots\ \alpha\ \beta\ \cdots\ z_n]^\top \tag{4.74} \]
Notice that in this case the matrix that appears in (4.73) is no longer diagonal,
but block-diagonal. In particular, for every real eigenvalue of $A$ in the initial
coordinates, the corresponding diagonal entry is the eigenvalue itself, while for
each conjugate complex pair of eigenvalues of the form $a \pm jb$ the corresponding
diagonal entry is the block:
\[ \begin{bmatrix} a & -b \\ b & a \end{bmatrix} \tag{4.75} \]
To this end, only one question remains unanswered: what is the linear coordinates
transformation that yields the aforementioned equivalent realization as in
(4.73)? It suffices to find a linear transformation that takes the complex diagonal
form to the aforementioned real diagonal form. Thus, we need a similarity
transformation $S_0 \in M_2(\mathbb{C})$ which carries out the following transformation:
\[ \begin{bmatrix} a + jb & 0 \\ 0 & a - jb \end{bmatrix} \xrightarrow{\ S_0\ } \begin{bmatrix} a & -b \\ b & a \end{bmatrix} \tag{4.76} \]
It is easy to show (how?) that the following matrix is such a similarity
transformation:
\[ S_0 = \begin{bmatrix} j & 1 \\ 1 & j \end{bmatrix} \tag{4.77} \]
in the sense that:
\[ S_0 \begin{bmatrix} a + jb & 0 \\ 0 & a - jb \end{bmatrix} S_0^{-1} = \begin{bmatrix} a & -b \\ b & a \end{bmatrix} \tag{4.78} \]
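The claim in (4.78) can be confirmed in a couple of lines (a numpy sketch with arbitrary values of $a$ and $b$):

```python
import numpy as np

# Check of (4.78): S0 maps the complex pair diag(a+jb, a-jb) to the
# real block [[a, -b], [b, a]] by similarity. a, b are arbitrary.
a, b = 2.0, 3.0
S0 = np.array([[1j, 1.0], [1.0, 1j]])
Dc = np.diag([a + 1j * b, a - 1j * b])
R = S0 @ Dc @ np.linalg.inv(S0)

assert np.allclose(R, np.array([[a, -b], [b, a]]))
assert np.allclose(R.imag, 0.0)   # the result is real
```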
Now suppose that the complex diagonal form of a matrix is:
\[ \tilde{A} = \mathrm{diag}\{\lambda_1, \ldots, \lambda_{k-1}, a + jb, a - jb, \lambda_{k+2}, \ldots, \lambda_n\} \tag{4.79} \]
with the conjugate pair at positions $k$ and $k+1$. Then the similarity transformation
$S$ that yields the corresponding real block-diagonal form embeds $S_0$ in rows
and columns $k$ and $k+1$ of the identity matrix, that is:
\[ S = \mathrm{block\,diag}\{I_{k-1}, S_0, I_{n-k-1}\} \tag{4.81} \]
The same extends easily when more than one pair of complex eigenvalues is
present.
4.2.5 Why is diagonalization useful?
If a matrix is diagonalizable and we know its diagonal form, then first of all we have our system in a much simpler representation in which inputs and outputs are decoupled. It becomes clear which input affects which output (in the new coordinates) and how much - see for example equation (4.57). Additionally, if \Lambda = \mathrm{diag}\{\lambda_i\}_{i=1}^{k} is a diagonal matrix, then operations with it become much simpler. For instance, the powers of \Lambda are calculated using the formula:

\Lambda^m = \mathrm{diag}\{\lambda_i^m\}_{i=1}^{k}    (4.82)

Similarly, the inverse of the matrix (if it exists) is given by:

\Lambda^{-1} = \mathrm{diag}\{\lambda_i^{-1}\}_{i=1}^{k}    (4.83)

The matrix exponential is given by:

e^{\Lambda} = \mathrm{diag}\{e^{\lambda_i}\}_{i=1}^{k}    (4.84)

and in general, for a function f extended to diagonal matrices (in the sense defined below) it holds that:

f(\Lambda) = \mathrm{diag}\{f(\lambda_i)\}_{i=1}^{k}    (4.85)

In fact, this gives rise to the following extension of a function over the complex numbers to the set dg(n, C) of n-by-n diagonal matrices with entries from C [Spr00, Chapter 4]:
Definition 9 (Function Extension 1). Let f : C \to C be a function defined at a_1, a_2, \ldots, a_n and let \Lambda \in dg(n, C) be the diagonal matrix \mathrm{diag}\{a_i\}_{i=1}^{n}. Then we define the function \tilde{f} : dg(n, C) \to dg(n, C) so that \tilde{f}(\Lambda) = \mathrm{diag}\{f(a_i)\}_{i=1}^{n}.
This way we have extended all functions on complex numbers to the set of diagonal matrices over C. It is also possible to extend a function f : C \to C to a function over the set of diagonalizable matrices, which we shall denote by D(n, C).

Definition 10 (Function Extension 2). Let f : C \to C be a function and \tilde{f} its extension over dg(n, C). Let A be a diagonalizable matrix which is similar to the diagonal matrix \Lambda of its eigenvalues, i.e. there is a T \in M_n(C) such that A = T \Lambda T^{-1}. If f is defined at every eigenvalue of A, we define the function \tilde{f} : D(n, C) \to D(n, C) as \tilde{f}(A) = T \tilde{f}(\Lambda) T^{-1}.
Example 5. Let A \in D(n, R) and f : R \to R with f(t) = \sqrt{t}. Let A = TDT^{-1} with D \in dg(n, R) and T \in GL(n, R). Then we define the extension of f to be a function \tilde{f} : D(n, R) \to D(n, C) so that \tilde{f}(A) = T \, \mathrm{diag}\{\sqrt{d_i}\}_{i=1}^{n} \, T^{-1}. For example let us take

A = \begin{bmatrix} 1 & 2 \\ 2 & 1 \end{bmatrix} \in D(2, R)    (4.86)

This matrix is diagonalizable and

A = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} -1 & 0 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}^{-1}    (4.87)

Hence

\tilde{f}(A) = \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} j & 0 \\ 0 & \sqrt{3} \end{bmatrix} \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix}^{-1}    (4.88)

= \frac{1}{2} \begin{bmatrix} \sqrt{3}+j & \sqrt{3}-j \\ \sqrt{3}-j & \sqrt{3}+j \end{bmatrix}    (4.89)

Usually for brevity we use the same symbol for f and \tilde{f}, so in this example we can write \sqrt{A} instead of \tilde{f}(A).
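This computation can be sketched numerically (a hedged illustration; the eigendecomposition is obtained with NumPy rather than by hand):

```python
import numpy as np

# Extension of f(t) = sqrt(t) to a diagonalizable matrix via f(A) = T f(D) T^{-1},
# following Definition 10, for the matrix of Example 5.
A = np.array([[1.0, 2.0], [2.0, 1.0]])
w, T = np.linalg.eig(A)                     # A = T diag(w) T^{-1}
sqrt_A = T @ np.diag(np.sqrt(w.astype(complex))) @ np.linalg.inv(T)

# Sanity check: squaring the result recovers A.
print(np.round((sqrt_A @ sqrt_A).real, 10))
```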
There is another well-known approach for extending an analytic function from C to M_n(C) that employs the Taylor expansion of the initial function. According to this approach, if a function f over C admits the expansion:

f(t) = \sum_{i=0}^{\infty} f^{(i)}(0) \frac{t^i}{i!}    (4.90)

where f^{(0)}(t) = f(t) and f^{(i)}(t) is the i-th order derivative of f, then its extension over M_n(C) is defined to be a function \tilde{f} : M_n(C) \to M_n(C) so that:

\tilde{f}(A) = \sum_{i=0}^{\infty} f^{(i)}(0) \frac{A^i}{i!}    (4.91)

where A^i is defined through matrix multiplication and A^0 = I. An extension derived using the Taylor expansion approach is exactly the same as the one we get using diagonalization.
Diagonalization stands more as a theoretical tool than as a computational one. Without going into much detail, let us consider the following function, which plays a very important role in the study of stability:

f(t) = e^{At}, \quad t \in R    (4.92)

where A is a diagonalizable matrix; f is a mapping f : R \to M_n(R). We need to know whether the following limit exists:

\lim_{t \to \infty} f(t)    (4.93)
There is a T \in GL(n, R) so that A = TDT^{-1}, where D is the diagonal matrix with the eigenvalues of A. One has that:

f(t) = e^{At} = T e^{Dt} T^{-1} = T \begin{bmatrix} e^{\lambda_1 t} & & & \\ & e^{\lambda_2 t} & & \\ & & \ddots & \\ & & & e^{\lambda_n t} \end{bmatrix} T^{-1}    (4.94)
The limit converges if and only if all functions e^{\lambda_i t} converge. If \lambda_i \in R then e^{\lambda_i t} converges if and only if \lambda_i < 0. If \lambda_i \in C, that is \lambda_i = a_i + jb_i, then

e^{\lambda_i t} = e^{a_i t} e^{jb_i t} = e^{a_i t} (\cos b_i t + j \sin b_i t)    (4.95)

and

|e^{\lambda_i t}| = e^{a_i t}    (4.96)

It is now obvious that we have convergence in C if and only if Re(\lambda_i) < 0. As a result, f(t) converges as t \to \infty if and only if the real part of every eigenvalue of A is strictly negative. If there is at least one eigenvalue with positive real part, then \lim_{t\to\infty} \|f(t)\| = \infty. Note that throughout this procedure we did not diagonalize any matrices; we just used the fact that A can be represented in a simpler fashion to draw certain conclusions. However, our approach has the drawback that it works only for diagonalizable matrices. In the next section we provide a simple realization for matrices that are not necessarily diagonalizable.
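This eigenvalue test can be illustrated numerically (a hedged sketch; the example matrices are arbitrary choices, and SciPy's expm stands in for the matrix exponential):

```python
import numpy as np
from scipy.linalg import expm

# e^{At} -> 0 as t -> infinity iff every eigenvalue of A has negative real part.
A_stable = np.array([[-1.0, 2.0], [0.0, -3.0]])    # eigenvalues -1, -3
A_unstable = np.array([[0.5, 0.0], [1.0, -2.0]])   # eigenvalues 0.5, -2

for A in (A_stable, A_unstable):
    hurwitz = np.all(np.linalg.eigvals(A).real < 0)
    decayed = np.linalg.norm(expm(A * 50.0)) < 1e-6  # e^{At} at a large time
    print(hurwitz, decayed)
```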
4.3 Jordan Canonical Form
4.3.1 The Jordan Decomposition
Diagonalization applies to matrices whose eigenvectors are linearly independent (otherwise the linear transformation V as it was defined in Theorem 7 is singular). But what if A cannot be diagonalized? The requirement that the eigenvectors of A be linearly independent is quite restrictive. The Jordan decomposition of a matrix is an extension of the diagonal decomposition: all matrices admit a Jordan form which, in case the matrix is diagonalizable, coincides with its diagonal form.
Definition 11 (Jordan Block). A Jordan block of size k is defined as the following matrix:

J_k(\lambda) = \begin{bmatrix} \lambda & 1 & & \\ & \lambda & \ddots & \\ & & \ddots & 1 \\ & & & \lambda \end{bmatrix}    (4.97)

where J_k(\lambda) \in M_k(C) and \lambda \in C.

A special case of Jordan block appears when k = 1. Then J_1(\lambda) is a scalar:

J_1(\lambda) = \lambda    (4.98)

Accordingly, the 2-dimensional Jordan block looks like:

J_2(\lambda) = \begin{bmatrix} \lambda & 1 \\ 0 & \lambda \end{bmatrix}    (4.99)
Definition 12 (Jordan segment). A matrix that possesses the following block-diagonal form is called a Jordan segment:

J(\lambda) = \begin{bmatrix} J_{k_1}(\lambda) & & & \\ & J_{k_2}(\lambda) & & \\ & & \ddots & \\ & & & J_{k_s}(\lambda) \end{bmatrix}    (4.100)

where k_1 \geq k_2 \geq \ldots \geq k_s \geq 1. More explicitly, we will be referring to Jordan segments using the notation J_{k_1, k_2, \ldots, k_s}(\lambda).
Definition 13 (Jordan Matrix). A block-diagonal matrix of the following form is called a Jordan matrix:

J = \begin{bmatrix} J(\lambda_1) & & & \\ & J(\lambda_2) & & \\ & & \ddots & \\ & & & J(\lambda_p) \end{bmatrix}    (4.101)

where all J(\lambda_i) for i = 1, 2, \ldots, p are Jordan segments.
Example 6. The following matrix is in Jordan form:

J = \begin{bmatrix} J_{3,1}(7) & \\ & J_2(0) \end{bmatrix}    (4.102)

which is expanded as:

J = \begin{bmatrix} 7 & 1 & & & & \\ & 7 & 1 & & & \\ & & 7 & & & \\ & & & 7 & & \\ & & & & 0 & 1 \\ & & & & & 0 \end{bmatrix}    (4.103)
As we will see in what follows, every matrix in M_n(K) - where K is either R or C - is similar to a Jordan matrix with entries in C. Additionally, if a matrix is diagonalizable then its Jordan form coincides with its diagonal form, therefore the Jordan decomposition can be perceived as a generalization of diagonalization. First of all, we need to give the following definition:

Definition 14 (Index of eigenvalue). Let \lambda be an eigenvalue of a matrix A \in M_n(K) - where K is either R or C. We call index of the eigenvalue \lambda the smallest positive integer k so that

\mathrm{rank}(A - \lambda I)^k = \mathrm{rank}(A - \lambda I)^{k+1}    (4.104)

Hereinafter, we denote the index of an eigenvalue \lambda by i(\lambda).

Remark: Since \lambda is an eigenvalue of A, \det(A - \lambda I) = 0, from which it follows that the matrix A - \lambda I is not invertible and not full-rank: \mathrm{rank}(A - \lambda I) < n. It also holds that:

\mathrm{rank}(A - \lambda I) \geq \mathrm{rank}(A - \lambda I)^2 \geq \ldots \geq \mathrm{rank}(A - \lambda I)^k    (4.105)
We can now give the following theorem:

Theorem 10 (Jordan Form). Let A \in M_n(K) - where K is either R or C - with distinct eigenvalues \lambda_1, \lambda_2, \ldots, \lambda_p. Then A is similar to a Jordan matrix J \in M_n(C):

J = \begin{bmatrix} J(\lambda_1) & & & \\ & J(\lambda_2) & & \\ & & \ddots & \\ & & & J(\lambda_p) \end{bmatrix} \sim A    (4.106)
which contains as many Jordan segments as the number of distinct eigenvalues of A. Additionally:

1. Each Jordan segment J(\lambda_i) in J consists of \gamma(\lambda_i) Jordan blocks - where \gamma(\lambda_i) is the geometric multiplicity of \lambda_i.

2. The index of \lambda_i determines the size of the largest Jordan block in J(\lambda_i).

3. Let r_j(\lambda_i) \triangleq \mathrm{rank}(A - \lambda_i I)^j. Then the number of j-by-j Jordan blocks in J(\lambda_i) is given by:

c_j(\lambda_i) = r_{j-1}(\lambda_i) - 2 r_j(\lambda_i) + r_{j+1}(\lambda_i)    (4.107)

Note: If an eigenvalue is semisimple then (and only then) the corresponding Jordan segment in the Jordan matrix is a diagonal matrix.
Example 7. We will find the Jordan form of the following 5-by-5 matrix:

A = \begin{bmatrix} 2 & 5 & 7 & 6 & 7 \\ 2 & 3 & 2 & 8 & 2 \\ 0 & 1 & 5 & 2 & 0 \\ 1 & 4 & 0 & 9 & 1 \\ 3 & 4 & 3 & 4 & 8 \end{bmatrix}    (4.108)

The eigenvalues of A are \lambda_1 = 1 with algebraic multiplicity \mu(1) = 2 and \lambda_2 = 5 with \mu(5) = 3. We already know that the Jordan matrix consists of two Jordan segments (one for each eigenvalue). We now calculate the following ranks for the first eigenvalue:

r_1(1) = \mathrm{rank}(A - I) = 4    (4.109)
r_2(1) = \mathrm{rank}(A - I)^2 = 3    (4.110)
r_3(1) = \mathrm{rank}(A - I)^3 = 3    (4.111)

therefore i(1) = 2, thus the largest block in J(1) has dimensions 2-by-2. From the rank-nullity theorem (see [Rom00, pp. 57-58]) we have that \gamma(1) = 5 - r_1(1) = 1, therefore only one Jordan block corresponds to the eigenvalue \lambda_1 = 1. This is apparently J_2(1). Now for the eigenvalue \lambda_2 = 5 we have:

r_1(5) = \mathrm{rank}(A - 5I) = 3    (4.112)
r_2(5) = \mathrm{rank}(A - 5I)^2 = 2    (4.113)
r_3(5) = \mathrm{rank}(A - 5I)^3 = 2    (4.114)

therefore i(5) = 2, thus the largest block in J(5) has dimensions 2-by-2 - that will be J_2(5). Also, \gamma(5) = 5 - r_1(5) = 2, hence we have two Jordan blocks corresponding to this eigenvalue - one of which is J_2(5). The other Jordan block must have dimensions 1-by-1 so that J is 5-by-5 (but we can also use equation (4.107) for this purpose). Finally, the Jordan form of A is:

J = \begin{bmatrix} J_2(5) & & \\ & J_1(5) & \\ & & J_2(1) \end{bmatrix} = \begin{bmatrix} 5 & 1 & & & \\ & 5 & & & \\ & & 5 & & \\ & & & 1 & 1 \\ & & & & 1 \end{bmatrix}    (4.115)

What we have not yet determined here is a matrix T \in GL(n, C) so that J = TAT^{-1}.
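The rank computations above can be automated (a hedged sketch; rather than the matrix of Example 7, whose display may not reproduce every entry faithfully, we build a matrix with the known Jordan form blockdiag(J_2(5), J_1(5), J_2(1)), conjugate it by an invertible T, and recover the block counts via (4.107)):

```python
import numpy as np

def jordan_block_counts(A, lam, n):
    # r_j = rank (A - lam I)^j with r_0 = n; c_j = r_{j-1} - 2 r_j + r_{j+1}  (4.107)
    r = [n] + [np.linalg.matrix_rank(
                   np.linalg.matrix_power(A - lam * np.eye(n), j), tol=1e-6)
               for j in range(1, n + 2)]
    return {j: r[j - 1] - 2 * r[j] + r[j + 1] for j in range(1, n + 1)}

# Known Jordan form: blocks J_2(5), J_1(5), J_2(1), conjugated by an integer matrix T.
J = np.array([[5, 1, 0, 0, 0],
              [0, 5, 0, 0, 0],
              [0, 0, 5, 0, 0],
              [0, 0, 0, 1, 1],
              [0, 0, 0, 0, 1]], dtype=float)
T = np.eye(5) + np.triu(np.ones((5, 5)), 1)   # unit upper triangular, invertible
A = T @ J @ np.linalg.inv(T)

print(jordan_block_counts(A, 5.0, 5))  # one 2x2 and one 1x1 block at lambda = 5
print(jordan_block_counts(A, 1.0, 5))  # one 2x2 block at lambda = 1
```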
4.3.2 Utility of the Jordan form
In section 4.2.5 we expounded how the diagonalization of a matrix - if possible - allows us to easily calculate all functions of matrices (powers, the inverse matrix if it exists, polynomials, the matrix exponential and the matrix logarithm) - see Definition 10. The same is possible for non-diagonalizable matrices using the Jordan form of the matrix.
Definition 15 (Function Extension 3). Let f : C \to C be a function which is k-1 times differentiable at \lambda \in C. Then we define the extension of this function over the set of Jordan blocks to be

\tilde{f}(J_k(\lambda)) = \begin{bmatrix} f(\lambda) & f'(\lambda) & \frac{f''(\lambda)}{2!} & \cdots & \frac{f^{(k-1)}(\lambda)}{(k-1)!} \\ & f(\lambda) & f'(\lambda) & \ddots & \vdots \\ & & \ddots & \ddots & \frac{f''(\lambda)}{2!} \\ & & & f(\lambda) & f'(\lambda) \\ & & & & f(\lambda) \end{bmatrix}    (4.116)
Example 8 (Matrix Polynomial). Consider the function f : C \to C defined by f(z) = z^3 + 1. We want to apply the extension of f as given by Definition 15 to the Jordan block J_3(-1), that is:

J_3(-1) = \begin{bmatrix} -1 & 1 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & -1 \end{bmatrix}    (4.117)

f is twice differentiable with f'(z) = 3z^2 and f''(z) = 6z. Therefore, \tilde{f} applied to J_3(-1) gives:

\tilde{f}(J_3(-1)) = \begin{bmatrix} f(-1) & f'(-1) & \frac{f''(-1)}{2} \\ 0 & f(-1) & f'(-1) \\ 0 & 0 & f(-1) \end{bmatrix} = \begin{bmatrix} 0 & 3 & -3 \\ 0 & 0 & 3 \\ 0 & 0 & 0 \end{bmatrix}    (4.118)

The function \tilde{f} corresponds to the matrix polynomial \tilde{f}(J) = J^3 + I.
Example 9 (Matrix exponential). We extend the exponential function f(z) = e^z from the complex numbers to the set of Jordan blocks according to Definition 15. This way we define the function \tilde{f} over the set of Jordan blocks and we denote \tilde{f}(J) = e^J. According to the definition, let us calculate:

\tilde{f}(J_2(1)) = e^{J_2(1)} = \begin{bmatrix} f(1) & f'(1) \\ 0 & f(1) \end{bmatrix} = \begin{bmatrix} e & e \\ 0 & e \end{bmatrix}    (4.119)
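Formula (4.116) is easy to implement for a function with known derivatives (a hedged sketch; exp is used since all its derivatives coincide, and the result is checked against SciPy's general-purpose expm):

```python
import math
import numpy as np
from scipy.linalg import expm

def f_jordan_block(derivs, lam, k):
    # derivs[i] evaluates the i-th derivative of f; build the upper
    # triangular matrix of (4.116) with entries f^{(j-i)}(lam) / (j-i)!.
    F = np.zeros((k, k))
    for i in range(k):
        for j in range(i, k):
            F[i, j] = derivs[j - i](lam) / math.factorial(j - i)
    return F

exp_derivs = [math.exp] * 2              # every derivative of e^z is e^z
J2 = np.array([[1.0, 1.0], [0.0, 1.0]])  # the Jordan block J_2(1)

F = f_jordan_block(exp_derivs, 1.0, 2)
print(F)                                 # [[e, e], [0, e]], matching (4.119)
print(np.allclose(F, expm(J2)))
```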
A Jordan matrix J is a block-diagonal matrix consisting of Jordan blocks, i.e. J = \mathrm{blockdiag}\{J_{k_j}(\lambda_i)\}. A function is extended to the space of Jordan matrices as follows:

Definition 16 (Function Extension 4). Let f : C \to C be a k_1 - 1 times differentiable function at \lambda. Then we define the extension of this function over the set of Jordan segments to be

\tilde{f}(J_{k_1, k_2, \ldots, k_s}(\lambda)) = \mathrm{blockdiag}\{\tilde{f}(J_{k_j}(\lambda))\}_{j=1}^{s}    (4.120)

and the extension of f to the space of all Jordan matrices as:

\tilde{f}(J) = \mathrm{blockdiag}\{\tilde{f}(J(\lambda_i))\}_{i=1}^{p}    (4.121)
Example 10. We will calculate the matrix polynomial f(J) = J^2 + 3I (i.e. the extension of f(z) = z^2 + 3 for z \in C) for the Jordan matrix:

J = \begin{bmatrix} J_{3,2}(1) & & \\ & J_2(2) & \\ & & -1 \end{bmatrix}    (4.122)

that will be:

J^2 + 3I = \begin{bmatrix} J_{3,2}(1)^2 + 3I & & \\ & J_2(2)^2 + 3I & \\ & & (-1)^2 + 3 \end{bmatrix}    (4.123)

where:

J_{3,2}(1)^2 + 3I = \begin{bmatrix} J_3(1)^2 + 3I & \\ & J_2(1)^2 + 3I \end{bmatrix}    (4.124)

and it is now easy to employ equation (4.116) to calculate:

J_3(1)^2 + 3I = \begin{bmatrix} 4 & 2 & 1 \\ & 4 & 2 \\ & & 4 \end{bmatrix}    (4.125)

Following this procedure we arrive at the following result:

J^2 + 3I = \begin{bmatrix} 4 & 2 & 1 & & & & & \\ & 4 & 2 & & & & & \\ & & 4 & & & & & \\ & & & 4 & 2 & & & \\ & & & & 4 & & & \\ & & & & & 7 & 4 & \\ & & & & & & 7 & \\ & & & & & & & 4 \end{bmatrix}    (4.126)
However, what is most important is that we can extend any function (that is adequately smooth) to the whole of M_n(C).

Definition 17 (Function Extension 5). Let f : C \to C be an adequately smooth function. Then for every matrix A \in M_n(C) for which there exists a Jordan matrix J such that A = TJT^{-1}, we define the extension of f over the set M_n(C) to be a function \tilde{f} : M_n(C) \to M_n(C) such that:

\tilde{f}(A) = T \tilde{f}(J) T^{-1}    (4.127)

Note: Hereinafter we use a common symbol for f and all of its matrix extensions. In this sense we may write:

f(A) = T f(J) T^{-1}    (4.128)

For example:

e^A = T e^J T^{-1}    (4.129)

The following result is of extraordinary importance in control theory and is one of the most important results for the study of LTI systems stability.
Proposition 11. The function f : R \to R^n defined by f(t) = e^{At} x converges to 0 as t \to \infty for every x \in R^n if and only if the real part of every eigenvalue of A is strictly negative. If A has an eigenvalue with positive real part, then \lim_{t\to\infty} \|f(t)\| = \infty for some x \in R^n.

Proof. The proof is left to the reader as an exercise. Hint: Use the Jordan decomposition of A.

Note: A function f(t) : R \to R^n is said to converge to a vector x as t \to \infty (we denote \lim_{t\to\infty} f(t) = x) if:

\lim_{t\to\infty} \|f(t) - x\| = 0    (4.130)

Or, what is exactly the same, let f(t) = [f_1(t)\ f_2(t)\ \ldots\ f_n(t)]^\top, where f_i : R \to R; we say that f converges to x = [x_1\ x_2\ \ldots\ x_n]^\top as t \to \infty if and only if:

\lim_{t\to\infty} f_i(t) = x_i, \quad i = 1, 2, \ldots, n    (4.131)
4.4 Canonical Controllable Form

Let \Sigma = (A, B, C, D) be an LTI system with a single input. Using a proper coordinates transformation, and under certain conditions, it is possible to bring the system in the following form:

\dot{z} = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -a_1 & -a_2 & \cdots & \cdots & -a_n \end{bmatrix} z(t) + \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix} u(t)    (4.132)

This is known as the Canonical Controllable Form of an LTI system. Using the new system of coordinates z = Tx, we observe that the input directly affects only one of the state variables, namely z_n(t). Indeed, by the last row equation of (4.132), we have that:

\dot{z}_n(t) = -a_1 z_1(t) - a_2 z_2(t) - \ldots - a_n z_n(t) + u(t)    (4.133)

while for all other state variables it holds that:

\dot{z}_k(t) = z_{k+1}(t), \quad k = 1, 2, \ldots, n-1    (4.134)
For a given pair of matrices (A, B) and under certain conditions, there is a linear change of coordinates z = Tx such that the transformed system is in the canonical controllable form. Our goal here is to determine an invertible matrix T \in M_n(R) such that:

TAT^{-1} = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -a_1 & -a_2 & \cdots & \cdots & -a_n \end{bmatrix}    (4.135)
and

TB = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}    (4.136)

Multiplying (4.135) from the right with T yields:

TA = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -a_1 & -a_2 & \cdots & \cdots & -a_n \end{bmatrix} T    (4.137)

Now let T be written in the following form:

T = \begin{bmatrix} t'_1 \\ t'_2 \\ \vdots \\ t'_n \end{bmatrix}    (4.138)

where the t'_i \in R^n are row vectors. Substituting (4.138) into (4.137) one gets:
\begin{bmatrix} t'_1 \\ t'_2 \\ \vdots \\ t'_n \end{bmatrix} A = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -a_1 & -a_2 & \cdots & \cdots & -a_n \end{bmatrix} \begin{bmatrix} t'_1 \\ t'_2 \\ \vdots \\ t'_n \end{bmatrix}    (4.139)
Carrying out the multiplication we have:

\begin{bmatrix} t'_1 A \\ t'_2 A \\ \vdots \\ t'_{n-1} A \\ t'_n A \end{bmatrix} = \begin{bmatrix} t'_2 \\ t'_3 \\ \vdots \\ t'_n \\ -a_1 t'_1 - a_2 t'_2 - \ldots - a_n t'_n \end{bmatrix}    (4.140)
From the first n-1 rows we have that:

t'_1 A = t'_2, \quad t'_2 A = t'_3, \quad \ldots, \quad t'_{n-1} A = t'_n    (4.141)

Thus we have the recursive formula:

t'_{k+1} = t'_k A, \quad k = 1, 2, \ldots, n-1    (4.142)

So if we know t'_1 we can determine the whole sequence t'_i for i = 2, \ldots, n.
The vector t'_1 will be determined by equation (4.136) as follows:

(4.136), (4.138) \Rightarrow \begin{bmatrix} t'_1 \\ t'_2 \\ \vdots \\ t'_n \end{bmatrix} B = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}    (4.143)

\Rightarrow \begin{bmatrix} t'_1 \\ t'_1 A \\ \vdots \\ t'_1 A^{n-2} \\ t'_1 A^{n-1} \end{bmatrix} B = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}    (4.144)

\Rightarrow \begin{bmatrix} t'_1 B \\ t'_1 AB \\ \vdots \\ t'_1 A^{n-2} B \\ t'_1 A^{n-1} B \end{bmatrix} = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}    (4.145)
To this end we give the following definition:

Definition 18 (Controllable Pair). For a pair of matrices (A, B), the matrix

\mathcal{C}(A, B) \triangleq \begin{bmatrix} B & AB & A^2 B & \cdots & A^{n-1} B \end{bmatrix}    (4.146)

is called the controllability matrix of (A, B). If \mathcal{C}(A, B) is non-singular, then the pair (A, B) is called controllable.
If the pair (A, B) is controllable, then (4.145) can be solved as follows:

(4.145) \Rightarrow t'_1 \mathcal{C}(A, B) = \begin{bmatrix} 0 & \ldots & 0 & 1 \end{bmatrix}    (4.147)

\Rightarrow t'_1 = \begin{bmatrix} 0 & \ldots & 0 & 1 \end{bmatrix} \mathcal{C}(A, B)^{-1}    (4.148)

that is, the vector t'_1 is the last row of the matrix \mathcal{C}(A, B)^{-1}. Afterwards, t'_2, t'_3, \ldots, t'_n are calculated using the recursive formula (4.142). Equation (4.145) is in fact a linear system which can be solved using any of the available numerical solution methods (e.g. Gauss-Seidel, Krylov subspace methods etc). The linear system is:

\mathcal{C}(A, B)^\top t_1 = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}    (4.149)
Note: Regarding equation (4.147) we used the following fact. Let t'_1 \in M_{1 \times n}(R) be a row vector and x, y \in R^n be column vectors. Then

\underbrace{\begin{bmatrix} t'_1 x & t'_1 y \end{bmatrix}}_{1 \times 2} = t'_1 \underbrace{\begin{bmatrix} x & y \end{bmatrix}}_{n \times 2}    (4.150)

Thus, the left-hand side of equation (4.145) can be written as:

\begin{bmatrix} t'_1 B \\ t'_1 AB \\ \vdots \\ t'_1 A^{n-1} B \end{bmatrix} = \left( t'_1 \begin{bmatrix} B & AB & A^2 B & \cdots & A^{n-1} B \end{bmatrix} \right)^\top = \mathcal{C}(A, B)^\top t_1
Example 11. In this example we will convert the following system into its canonical controllable form:

\dot{x}(t) = \begin{bmatrix} -1 & 2 & -1 \\ 3 & -5 & 0 \\ 0 & 1 & 0 \end{bmatrix} x(t) + \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix} u(t)    (4.151)

We first calculate the controllability matrix of the system (note: in MATLAB, one may consider using the command ctrb):

\mathcal{C}(A, B) = \begin{bmatrix} 2 & -3 & 4 \\ 1 & 1 & -14 \\ 3 & 1 & 1 \end{bmatrix}    (4.152)

This matrix is invertible and its inverse is:

\mathcal{C}(A, B)^{-1} = \frac{1}{151} \begin{bmatrix} 15 & 7 & 38 \\ -43 & -10 & 32 \\ -2 & -11 & 5 \end{bmatrix}    (4.153)

Using (4.148) we have

t'_1 = \begin{bmatrix} 0 & 0 & 1 \end{bmatrix} \mathcal{C}(A, B)^{-1} = \frac{1}{151} \begin{bmatrix} -2 & -11 & 5 \end{bmatrix}    (4.154)

Then, using the recursive formula t'_{k+1} = t'_k A for k = 1, 2, we have:

t'_2 = t'_1 A = \frac{1}{151} \begin{bmatrix} -31 & 56 & 2 \end{bmatrix}    (4.155)

and

t'_3 = t'_2 A = \frac{1}{151} \begin{bmatrix} 199 & -340 & 31 \end{bmatrix}    (4.156)

which yields the similarity matrix:

T = \frac{1}{151} \begin{bmatrix} -2 & -11 & 5 \\ -31 & 56 & 2 \\ 199 & -340 & 31 \end{bmatrix}    (4.157)

which transforms the given system into its canonical controllable form:

\dot{z}(t) = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -3 & 1 & -6 \end{bmatrix} z(t) + \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} u(t)    (4.158)
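The computation in Example 11 can be verified numerically (a hedged check of the construction; the signs shown in A above were reconstructed, so the code recomputes everything from A and B):

```python
import numpy as np

# Build T from the last row of the inverse controllability matrix (4.148)
# and the recursion t'_{k+1} = t'_k A (4.142); then verify (4.135)-(4.136).
A = np.array([[-1.0, 2.0, -1.0], [3.0, -5.0, 0.0], [0.0, 1.0, 0.0]])
B = np.array([2.0, 1.0, 3.0])
n = 3

C = np.column_stack([np.linalg.matrix_power(A, i) @ B for i in range(n)])
t1 = np.linalg.inv(C)[-1]           # last row of C^{-1}
rows = [t1]
for _ in range(n - 1):
    rows.append(rows[-1] @ A)       # t'_{k+1} = t'_k A
T = np.vstack(rows)

print(np.round(T @ A @ np.linalg.inv(T), 8))  # companion form, last row [-3, 1, -6]
print(np.round(T @ B, 8))                     # [0, 0, 1]
```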
4.5 Canonical Observable Form
Chapter 5

Trajectories of LTI systems

In this chapter we study the input-to-state relationships in an LTI system. In particular we examine the various properties of these differential equations as well as of their solutions.
5.1 General Properties of Solutions

Definition 19. Any function \phi(t) : R \to R^n such that

\frac{d}{dt}\phi(t) = f(\phi(t), u(t))    (5.1)

for some function u(t) : R \to R^m, is called a solution of

\frac{d}{dt} x(t) = f(x(t), u(t))    (5.2)

Sometimes, in order to emphasize that \phi is a solution with respect to the input u, we use the notation \phi_u. A solution which satisfies the initial condition \phi(t_0) = x_0 is denoted by \phi(t; t_0, x_0) or \phi_u(t; t_0, x_0). A very important property of a solution \phi_u(t; t_0, x_0), which follows directly from (5.1), is the following:

\phi_u(t; t_0, x_0) = x_0 + \int_{t_0}^{t} f(\phi_u(\tau; t_0, x_0), u(\tau)) \, d\tau    (5.3)
The first result is due to Carathéodory and provides sufficient conditions for the existence of a solution of (5.2).

Theorem 12 (Carathéodory). Let f : R^n \times R^m \to R^n be continuous and Lipschitz in its first argument in the sense that for all \rho > 0 there exists an L_\rho > 0 such that \|f(x, u) - f(y, u)\| \leq L_\rho \|x - y\| for all x, y \in B_\rho(R^n) and u \in B_\rho(R^m), where B_\rho(R^n) is the Euclidean ball of R^n of radius \rho. Then for all x_0 \in R^n and every locally Lebesgue-integrable function u, the initial value problem:

\frac{d}{dt} x(t) = f(x(t), u(t))    (5.4)
x(t_0) = x_0    (5.5)

has a unique solution \phi_u(t; t_0, x_0).
For a proof as well as more results on existence and uniqueness see [Son98, Section C3]. We should now clarify the meaning of uniqueness in Carathéodory's theorem. The system (5.2) accepts a unique solution if for every two functions \phi_1(t), \phi_2(t) such that

\frac{d}{dt}\phi_1(t) = f(\phi_1(t), u(t))
\frac{d}{dt}\phi_2(t) = f(\phi_2(t), u(t))

the following implication holds true:

\phi_1(t_0) = \phi_2(t_0) \Rightarrow \phi_1(t) = \phi_2(t) \text{ for all } t \geq t_0    (5.6)
The second important result we will prove states that \phi_u(t; x_0) is continuous with respect to x_0, but for that we first need the following lemma, which is due to Bellman and Gronwall. Readers not familiar with measure theory may either skip the proof or read "locally integrable" as "continuous" and "for almost all" as "for all".

Lemma 1 (Bellman-Gronwall). Let I \subseteq R be an open interval in R and c \geq 0 a constant. Also, let \mu : I \to R_+ be a locally integrable function and \nu : I \to R_+ be continuous. Assume that for some t_0 \in I it holds that:

\nu(t) \leq c + \int_{t_0}^{t} \mu(\tau) \nu(\tau) \, d\tau    (5.7)

for all t \in I such that t \geq t_0. Then:

\nu(t) \leq c \, e^{\int_{t_0}^{t} \mu(\tau) d\tau}    (5.8)
Proof. We define

\Pi(t) \triangleq \int_{t_0}^{t} \mu(\tau) \nu(\tau) \, d\tau    (5.9)

for t \in I with t \geq t_0. Then \dot{\Pi}(t) = \mu(t)\nu(t) \leq \mu(t)(c + \Pi(t)), therefore

\dot{\Pi}(t) - \mu(t)(c + \Pi(t)) \leq 0    (5.10)

for almost all t. Now let us define:

p(t) \triangleq (c + \Pi(t)) \, e^{-\int_{t_0}^{t} \mu(\tau) d\tau}    (5.11)

Then p(t) is locally absolutely continuous and therefore differentiable almost everywhere with

\dot{p}(t) = \left[ \dot{\Pi}(t) - \mu(t)(c + \Pi(t)) \right] e^{-\int_{t_0}^{t} \mu(\tau) d\tau}    (5.12)

and obviously \dot{p}(t) \leq 0, so p is non-increasing and

p(t) \leq p(t_0)    (5.13)

where p(t_0) = c. Hence \nu(t) \leq c + \Pi(t) \leq c \, e^{\int_{t_0}^{t} \mu(\tau) d\tau}, which completes the proof.
Using the Bellman-Gronwall lemma we can prove the following:

Theorem 13. Let \phi_u(t; x) be the solution of (5.2) with f satisfying the conditions of the Carathéodory theorem. Then \phi_u(t; x) is continuous in x in the following sense: for every t \geq 0 and \varepsilon > 0 there is a \delta > 0 (which depends on t and \varepsilon) such that for every x_1, x_2 \in R^n,

\|x_1 - x_2\| < \delta \Rightarrow \|\phi_u(t; x_1) - \phi_u(t; x_2)\| < \varepsilon    (5.14)
Proof. Using the integral form of the solution as in (5.3), the norm \|\phi_u(t; x_1) - \phi_u(t; x_2)\| becomes:

\|\phi_u(t; x_1) - \phi_u(t; x_2)\| = \left\| x_1 + \int_0^t f(\phi_u(\tau; x_1), u(\tau)) \, d\tau - x_2 - \int_0^t f(\phi_u(\tau; x_2), u(\tau)) \, d\tau \right\|

and using the triangle inequality we proceed like:

\leq \|x_1 - x_2\| + \left\| \int_0^t \left[ f(\phi_u(\tau; x_1), u(\tau)) - f(\phi_u(\tau; x_2), u(\tau)) \right] d\tau \right\|

and since f is Lipschitz in x with some constant L \geq 0:

\leq \|x_1 - x_2\| + L \int_0^t \|\phi_u(\tau; x_1) - \phi_u(\tau; x_2)\| \, d\tau

Therefore:

\|\phi_u(t; x_1) - \phi_u(t; x_2)\| \leq \|x_1 - x_2\| + L \int_0^t \|\phi_u(\tau; x_1) - \phi_u(\tau; x_2)\| \, d\tau    (5.15)

Let \nu(t) = \|\phi_u(t; x_1) - \phi_u(t; x_2)\|. From (5.15) and by the Bellman-Gronwall lemma (with \mu(t) = L and c = \|x_1 - x_2\|) we have:

\|\phi_u(t; x_1) - \phi_u(t; x_2)\| \leq \|x_1 - x_2\| \, e^{\int_0^t L \, d\tau}    (5.16)
= \|x_1 - x_2\| \, e^{Lt}    (5.17)

For \delta(\varepsilon, t) = \varepsilon e^{-Lt}, the last equation gives \|\phi_u(t; x_1) - \phi_u(t; x_2)\| < \varepsilon for all \|x_1 - x_2\| < \delta, which completes the proof.
Note: This proposition offers not only a very important result on the continuity of the trajectories of a system with respect to its initial conditions, but also an estimate of the norm \|\phi_u(t; x_1) - \phi_u(t; x_2)\|, namely:

\|\phi_u(t; x_1) - \phi_u(t; x_2)\| \leq \|x_1 - x_2\| \, e^{Lt}    (5.18)
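The bound (5.18) can be probed numerically (a hedged sketch: a linear vector field f(x) = Ax is used, for which the induced 2-norm of A is a valid Lipschitz constant, and the trajectories are integrated with SciPy):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Check ||phi(t; x1) - phi(t; x2)|| <= ||x1 - x2|| e^{Lt} for f(x) = Ax.
A = np.array([[0.0, 1.0], [-2.0, -0.5]])
L = np.linalg.norm(A, 2)            # Lipschitz constant of x -> Ax
f = lambda t, x: A @ x

x1 = np.array([1.0, 0.0])
x2 = np.array([1.01, -0.02])
t_end = 5.0

s1 = solve_ivp(f, (0.0, t_end), x1, rtol=1e-9, atol=1e-12)
s2 = solve_ivp(f, (0.0, t_end), x2, rtol=1e-9, atol=1e-12)

gap = np.linalg.norm(s1.y[:, -1] - s2.y[:, -1])
bound = np.linalg.norm(x1 - x2) * np.exp(L * t_end)
print(gap <= bound)   # the Gronwall estimate holds (it is usually very loose)
```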
We will now state and prove that the solutions of systems with similar dynamics are also similar. In particular, we show that two systems \dot{x} = f(x, u) and \dot{x} = g(x, u), under the same initial conditions and the same input, have trajectories that are close to each other as long as f and g are close to each other. For that we need an upper bound on their trajectories which depends on the distance between f and g. In order to define the distance between the two flow functions f and g, for a given u we introduce the norm:

\|f - g\|_\infty \triangleq \sup_{x \in R^n} \|f(x, u) - g(x, u)\|    (5.19)
Proposition 14. Let \phi_u(t; x_0) and \psi_u(t; x_0) be the solutions of the systems \dot{x} = f(x, u) and \dot{x} = g(x, u) respectively, with initial condition x(0) = x_0. Assume that f is Lipschitz with constant L and that:

\|f - g\|_\infty \leq \varepsilon    (5.20)

Then

\|\phi_u(t; x_0) - \psi_u(t; x_0)\| \leq \frac{\varepsilon}{L} \left( e^{Lt} - 1 \right)    (5.21)
Proof. We have that:

\|\phi_u(t; x_0) - \psi_u(t; x_0)\| \leq \int_0^t \|f(\phi_u(\tau; x_0), u(\tau)) - g(\psi_u(\tau; x_0), u(\tau))\| \, d\tau    (5.22)

We now add and subtract f(\psi_u(\tau; x_0), u(\tau)) to get:

(5.22) \leq \int_0^t \|f(\phi_u(\tau; x_0), u(\tau)) - f(\psi_u(\tau; x_0), u(\tau))\| + \|f(\psi_u(\tau; x_0), u(\tau)) - g(\psi_u(\tau; x_0), u(\tau))\| \, d\tau    (5.23)

\leq \int_0^t \left( L \|\phi_u(\tau; x_0) - \psi_u(\tau; x_0)\| + \|f - g\|_\infty \right) d\tau    (5.24)

\leq \int_0^t \left( L \|\phi_u(\tau; x_0) - \psi_u(\tau; x_0)\| + \varepsilon \right) d\tau    (5.25)

Now, in order to apply the Bellman-Gronwall lemma, we perform the following manipulations:

\|\phi_u(t; x_0) - \psi_u(t; x_0)\| \leq L \int_0^t \left( \|\phi_u(\tau; x_0) - \psi_u(\tau; x_0)\| + \frac{\varepsilon}{L} \right) d\tau    (5.26)

Adding \frac{\varepsilon}{L} on both sides of the relation yields:

\|\phi_u(t; x_0) - \psi_u(t; x_0)\| + \frac{\varepsilon}{L} \leq \frac{\varepsilon}{L} + L \int_0^t \left( \|\phi_u(\tau; x_0) - \psi_u(\tau; x_0)\| + \frac{\varepsilon}{L} \right) d\tau    (5.27)

Applying the Bellman-Gronwall lemma with parameters \nu(t) = \|\phi_u(t; x_0) - \psi_u(t; x_0)\| + \frac{\varepsilon}{L}, c = \frac{\varepsilon}{L} and \mu(t) = L, we get the result right away.
Remark: Note that u was considered constant, otherwise \varepsilon is a function of u and the inequality becomes:

\|\phi_u(t; x_0) - \psi_u(t; x_0)\| \leq \frac{\varepsilon(u)}{L} \left( e^{Lt} - 1 \right)    (5.28)

With this proposition we have proven the following continuity property of the solutions of a dynamical system, which is known as continuity with respect to the flow - the term "flow" is usually used to refer to the vector field f.
Corollary 15 (Continuity with respect to the flow). For every t > 0 and \varepsilon > 0 there is a \delta > 0 (which depends on t and \varepsilon) so that:

\|f - g\|_\infty < \delta \Rightarrow \|\phi_u(t; x_0) - \psi_u(t; x_0)\| < \varepsilon    (5.29)
The previous proposition can be slightly modified to cater for time-varying systems.

Proposition 16. Let \phi_u(t; x_0) and \psi_u(t; x_0) be the solutions of the systems \dot{x} = f(t, x, u) and \dot{x} = g(t, x, u) with initial conditions x(0) = x_0. Assume that f is Lipschitz with constant L and that for some time interval T:

\sup_{t \in T} \sup_{x \in R^n} \|f(t, x, u) - g(t, x, u)\| \leq \varepsilon    (5.30)

Then for all t \in T

\|\phi_u(t; x_0) - \psi_u(t; x_0)\| \leq \frac{\varepsilon}{L} \left( e^{Lt} - 1 \right)    (5.31)
Proof. It is left to the reader as an exercise to repeat the steps taken in the proof of Proposition 14 and adapt them to this setting. Note that as before, u is handled as a constant input signal.
The last continuity property has to do with the input to the system.

Proposition 17 (Continuity with respect to the inputs). Let f : R^n \times R^m \to R^n be a function that satisfies the conditions of the Carathéodory theorem and let u(t) and v(t) be two locally Lebesgue-integrable functions. Let f be uniformly continuous with respect to its second argument and assume that for every t \geq 0:

\|u(t) - v(t)\| \leq \varepsilon    (5.32)

Then there is an M > 0 such that:

\|\phi_u(t; x_0) - \phi_v(t; x_0)\| \leq \frac{M}{L} \left( e^{Lt} - 1 \right)    (5.33)

Proof. The proof is based on Proposition 14 and is left to the reader as an exercise. Hint: Define F_u(x) = f(x, u) and F_v(x) = f(x, v). Based on the uniform continuity property of f we have that there is some M > 0 such that:

\|F_u(x) - F_v(x)\| < M    (5.34)

The rest is left as an exercise.
One of the most important properties of \phi_u(t; t_0, x_0) is the semigroup property, which is stated as follows:

Theorem 18 (Semigroup Property). Consider the system (5.2) and let u be a given input signal so that the system admits a unique solution. Then

\phi_u(t; t_0, x_0) = \phi_u(t - t_0; 0, x_0), \quad t \geq t_0    (5.35)
Proof. The function y(t) = \phi_u(t; t_0, x_0) is a solution of (5.2), thus

\dot{y}(t) = f(y(t), u(t))    (5.36a)
y(t_0) = x_0    (5.36b)

Let \psi(t) = \phi_u(t - t_0; 0, x_0). Then

\dot{\psi}(t) = \dot{\phi}_u(t - t_0; 0, x_0) = f(\phi_u(t - t_0; 0, x_0), u(t)) = f(\psi(t), u(t))

Additionally

\psi(t_0) = \phi_u(0; 0, x_0) = x_0 = y(t_0)

So \psi(t) is also a solution of (5.2) with the same initial value as y(t), which due to the uniqueness of solutions implies:

\psi(t) = y(t), \quad t \geq t_0
\Rightarrow \phi_u(t; t_0, x_0) = \phi_u(t - t_0; 0, x_0), \quad t \geq t_0

which completes the proof.
Note: This result is particularly useful for the study of time-invariant systems since one can see that the initial time is of no significance for the solution, hence we may always assume without loss of generality that t_0 = 0. For this reason one may omit t_0 and simply write \phi_u(t; x_0) to refer to \phi_u(t; t_0, x_0).
Proposition 19. The solution of an LTI system \Sigma = (A, B, C, D) with initial condition x(0) = x_0 is given by

\phi_u(t; x_0) = e^{At} x_0 + \int_0^t e^{A(t - \tau)} B u(\tau) \, d\tau    (5.37)
Proof. The proof is based upon the observation that:

\frac{d}{dt}\left( e^{-At} \phi(t) \right) = e^{-At} \dot{\phi}(t) - e^{-At} A \phi(t)    (5.38)

From the state equation of the LTI system we have:

\dot{\phi}(t) = A \phi(t) + B u(t)    (5.39)
\Rightarrow e^{-At} \dot{\phi}(t) - e^{-At} A \phi(t) = e^{-At} B u(t)    (5.40)
\Rightarrow \frac{d}{dt}\left( e^{-At} \phi(t) \right) = e^{-At} B u(t)    (5.41)

The last equation, by integration from 0 to t, gives:

\int_0^t \frac{d}{d\tau}\left( e^{-A\tau} \phi(\tau) \right) d\tau = \int_0^t e^{-A\tau} B u(\tau) \, d\tau    (5.42)
\Rightarrow e^{-At} \phi(t) - e^{-A \cdot 0} \phi(0) = \int_0^t e^{-A\tau} B u(\tau) \, d\tau    (5.43)
\Rightarrow e^{-At} \phi(t) = \phi(0) + \int_0^t e^{-A\tau} B u(\tau) \, d\tau    (5.44)

Left-multiplying by (e^{-At})^{-1} = e^{At} we have:

\phi(t) = e^{At} \phi(0) + e^{At} \int_0^t e^{-A\tau} B u(\tau) \, d\tau    (5.45)
\Rightarrow \phi(t) = e^{At} \phi(0) + \int_0^t e^{A(t - \tau)} B u(\tau) \, d\tau    (5.46)

Therefore

\phi_u(t; x_0) = e^{At} x_0 + \int_0^t e^{A(t - \tau)} B u(\tau) \, d\tau    (5.47)

which is exactly (5.37).
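Formula (5.37) can be cross-checked against a general-purpose ODE solver (a hedged sketch; the system, the input u(t) = sin t and the initial state are arbitrary test choices, and the convolution integral is evaluated by numerical quadrature):

```python
import numpy as np
from scipy.integrate import solve_ivp, quad_vec
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([0.0, 1.0])
x0 = np.array([1.0, -1.0])
u = np.sin
t = 2.0

# Variation-of-constants formula (5.37).
integral, _ = quad_vec(lambda tau: expm(A * (t - tau)) @ B * u(tau), 0.0, t)
phi_formula = expm(A * t) @ x0 + integral

# Direct numerical integration of xdot = Ax + Bu.
sol = solve_ivp(lambda s, x: A @ x + B * u(s), (0.0, t), x0,
                rtol=1e-10, atol=1e-12)
print(np.allclose(phi_formula, sol.y[:, -1], atol=1e-6))
```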
Proposition 20 (Superposition Principle). The solution \phi_u(t; 0) of an LTI system is linear with respect to u.

Proof. Let us pick two input functions u_1 and u_2 and two real numbers \alpha_1, \alpha_2 \in R. Then:

\phi_{\alpha_1 u_1 + \alpha_2 u_2}(t; 0) = \int_0^t e^{A(t-\tau)} B (\alpha_1 u_1(\tau) + \alpha_2 u_2(\tau)) \, d\tau    (5.48)
= \alpha_1 \int_0^t e^{A(t-\tau)} B u_1(\tau) \, d\tau    (5.49)
+ \alpha_2 \int_0^t e^{A(t-\tau)} B u_2(\tau) \, d\tau    (5.50)
= \alpha_1 \phi_{u_1}(t; 0) + \alpha_2 \phi_{u_2}(t; 0)    (5.51)

which completes the proof.
Note: Actually it is the superposition principle that qualifies a system as a Linear Dynamical System. There are systems that are linear in this sense (i.e. satisfy the superposition principle) but cannot be realized in the standard LTI form we have been studying so far. A quite simple example is a system with a single input u(t) and a single output y(t) whose dynamics are given by the following equation:

y(t) = \frac{d}{dt} u(t)    (5.52)

In principle, a linear system is represented in the form:

y(t) = \mathcal{L}(u)(t)    (5.53)

where \mathcal{L} : \mathcal{U} \to \mathcal{Y} is a linear mapping from the input vector space \mathcal{U} to the output vector space \mathcal{Y}. So for example the linear system (5.52) can be written as:

y(t) = Du(t)    (5.54)

where D : C^\infty \to C^\infty is the differentiation operator. A prime example of such linear systems are the ones that are defined through convolution. In particular, for h \in L_2(R) we define the following dynamical system:

y(t) = \int_{-\infty}^{\infty} h(\tau) u(t - \tau) \, d\tau    (5.55)

This system satisfies the superposition principle so it qualifies as a linear system, but again it does not admit a state space representation.
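The superposition principle of Proposition 20 is easy to confirm numerically (a hedged sketch; the system, the two inputs and the coefficients are arbitrary test choices, and each zero-state response is obtained with SciPy's ODE solver):

```python
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0], [-1.0, -1.0]])
B = np.array([0.0, 1.0])

def zero_state_response(u, t_end=3.0):
    # phi_u(t; 0): solution starting from the origin, driven by u.
    sol = solve_ivp(lambda t, x: A @ x + B * u(t), (0.0, t_end),
                    np.zeros(2), rtol=1e-10, atol=1e-12)
    return sol.y[:, -1]

u1, u2 = np.sin, np.cos
a1, a2 = 2.0, -0.5

lhs = zero_state_response(lambda t: a1 * u1(t) + a2 * u2(t))
rhs = a1 * zero_state_response(u1) + a2 * zero_state_response(u2)
print(np.allclose(lhs, rhs, atol=1e-7))
```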
5.2 The state transition matrix

If the initial state of the system is the origin, that is x_0 = 0, then the solution becomes:

\phi_u(t; 0) = \int_0^t e^{A(t-\tau)} B u(\tau) \, d\tau    (5.56)

So we can rewrite (5.37) as follows:

\phi_u(t; x_0) = e^{At} x_0 + \phi_u(t; 0)    (5.57)

We give the following definition:

Definition 20. Given an LTI system \Sigma = (A, B, C, D), the function

\Phi(r, s) \triangleq e^{A(r - s)}    (5.58)

is called the State Transition Matrix of the system.
Using the definition of the State Transition Matrix in (5.58), equation (5.57) becomes:

\phi_u(t; x_0) = \Phi(t, 0) x_0 + \phi_u(t; 0)    (5.59)

Applying the semigroup property, we have the following very important formula:

\phi_u(t; t_0, x_0) = \Phi(t, t_0) x_0 + \phi_u(t; t_0, 0)    (5.60)
The right-hand side of the last equation consists of the term \Phi(t, t_0) x_0, which describes the evolution of the system's state without the action of any input, and the term \phi_u(t; t_0, 0), which describes the state trajectory for x_0 = 0 under the input action u(t). Indeed, if no input is applied to the system, the solution becomes:

\phi(t; t_0, x_0) = \Phi(t, t_0) x_0    (5.61)

Another important remark is that under zero input action (u = 0), if the system is at its equilibrium point x = 0, it will remain there, meaning that

\phi_0(t; 0) = 0    (5.62)
The basic properties of the state transition matrix are summarized in the following proposition:

Proposition 21. For all r, s \in R:

1. \Phi(r, r) = I
2. \Phi(r, 0)\Phi(s, 0) = \Phi(r + s, 0), and in general for all t, v it is \Phi(r, t)\Phi(s, v) = \Phi(r + s, t + v)
3. \Phi(-r, 0) = \Phi(r, 0)^{-1}
4. If A is diagonal then \Phi(r, s) is also diagonal.
5. \frac{\partial}{\partial r}\Phi(r, s) = A\Phi(r, s) and \frac{\partial}{\partial s}\Phi(r, s) = -A\Phi(r, s)
The first property simply implies that \phi_u(t_0; t_0, x_0) = x_0 regardless of u. Using the second and third properties, the system's initial state can be retrieved from its state at some other time t. Left-multiplying both sides of (5.59) by \Phi(-t, 0) we get:

(5.59) \Rightarrow \Phi(-t, 0) \phi_u(t; x_0) = \Phi(-t, 0)\Phi(t, 0) x_0 + \Phi(-t, 0) \phi_u(t; 0)    (5.63)
\Rightarrow x_0 = \Phi(-t, 0) \phi_u(t; x_0) - \Phi(-t, 0) \phi_u(t; 0)    (5.64)
\Rightarrow x_0 = \Phi(t, 0)^{-1} \left( \phi_u(t; x_0) - \phi_u(t; 0) \right)    (5.65)

In particular, if no input is present, then \phi_u(t; 0) = 0 and the last equation becomes:

x_0 = \Phi(t, 0)^{-1} \phi_u(t; x_0)    (5.66)

The fourth property is due to the fact that if A is diagonal, in the form A = \mathrm{diag}(a_1, a_2, \ldots, a_n), then e^A = \mathrm{diag}(e^{a_1}, e^{a_2}, \ldots, e^{a_n}).
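The group-like properties of Proposition 21 are direct consequences of the matrix exponential and can be confirmed numerically (a hedged sketch with an arbitrary A; Phi(r, s) = e^{A(r-s)} is evaluated with SciPy's expm):

```python
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [-4.0, -1.0]])
Phi = lambda r, s: expm(A * (r - s))

r, s = 0.7, 1.9
print(np.allclose(Phi(r, r), np.eye(2)))                       # property 1
print(np.allclose(Phi(r, 0) @ Phi(s, 0), Phi(r + s, 0)))       # property 2
print(np.allclose(Phi(-r, 0), np.linalg.inv(Phi(r, 0))))       # property 3
```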
5.3 Responses of LTI systems

The dynamic behaviour of LTI systems is better understood by studying their responses to impulses and steps.

5.3.1 The Impulse Response

Assume that the input to an LTI system is given by:

u(t) = F\delta(t) = \begin{bmatrix} F_1 \\ \vdots \\ F_{m-1} \\ F_m \end{bmatrix} \delta(t)    (5.67)

where \delta(t) is the Dirac functional having the property

\lim_{\varepsilon \to 0^+} \int_{-\varepsilon}^{\varepsilon} \delta(\tau) \, d\tau = 1    (5.68)

while

\delta(t) = 0, \quad t \neq 0    (5.69)

Readers familiar with measure theory understand that this definition of \delta(t) is contradictory, since every almost-everywhere zero function has zero integral over its domain. The Dirac functional is sort of a convention; it is a symbol that facilitates certain calculus rather than a real function or functional (at least the way we use it here). A good reference for a rigorous definition and some basic facts on the Dirac functional is provided by Rudin in [Rud73, chap. 6]. In fact, the following property is used to define \delta(t):

\forall \varepsilon > 0 : \int_{-\varepsilon}^{\varepsilon} f(t)\delta(t) \, dt = f(0)    (5.70)

According to (5.37) we have:

\phi_u(t; x_0) = e^{At} x_0 + \int_0^t e^{A(t-\tau)} B F \delta(\tau) \, d\tau    (5.71)

And employing (5.70) we have:

\phi_u(t; x_0) = e^{At} x_0 + e^{At} B F = e^{At} (x_0 + BF)    (5.72)
5.3.2 The Step Response
The Heaviside function, also known as the step function, is defined as follows:

    H_γ(t) ≜ { 1,  t ≥ γ                                                    (5.73)
             { 0,  t < γ

One can notice that H_γ(t) has the following properties:

    H_γ(t + γ) = H_0(t)                                                     (5.74)
    H_0(t − γ) = H_γ(t)                                                     (5.75)

for all γ ∈ R. Consider now an LTI system with input:

    u(t) = F H_0(t) = [F_1  F_2  · · ·  F_{m−1}  F_m]' H_0(t)               (5.76)

Our task here is to determine the state trajectory as in (5.37) using this input.
Before proceeding we recall the following formula:

    ∫_0^t e^{Aτ} dτ = A^{-1} (e^{At} − I) = (e^{At} − I) A^{-1}              (5.77)

which of course holds only if A is nonsingular. In the general case, this integral
is calculated using the Taylor series expansion of e^{At} around t = 0, which is:

    e^{At} = I + At + A²t²/2! + . . . + A^k t^k / k! + . . .                 (5.78)

So the integral becomes:

    ∫_0^t e^{Aτ} dτ = ∫_0^t ( I + Aτ + A²τ²/2! + . . . + A^k τ^k/k! + . . . ) dτ   (5.79)
                    = It + A t²/2! + A² t³/3! + . . . + A^k t^{k+1}/(k+1)! + . . .  (5.80)
                    = Σ_{i=0}^{∞} A^i t^{i+1} / (i+1)!                       (5.81)

Notice that (5.81) boils down to (5.77) when A is nonsingular. With this,
we have all the necessary tools to proceed to the derivation of the step response.
According to (5.37) the step response of an LTI system is:

    φ_u(t; x_0) = e^{At} x_0 + ∫_0^t e^{A(t−τ)} B F H_0(τ) dτ                (5.82)
                = e^{At} x_0 + ∫_0^t e^{A(t−τ)} dτ B F                       (5.83)
                = e^{At} x_0 + e^{At} ∫_0^t e^{−Aτ} dτ B F                   (5.84)

So now we have to calculate the integral ∫_0^t e^{−Aτ} dτ. In case A is
nonsingular, using (5.77) one has that:

    φ_u(t; x_0) = e^{At} x_0 + e^{At} A^{-1} (I − e^{−At}) B F               (5.86)
                = e^{At} x_0 + A^{-1} (e^{At} − I) B F                       (5.87)

otherwise, using (5.81),

    φ_u(t; x_0) = e^{At} x_0 + e^{At} Σ_{i=0}^{∞} (−1)^i A^i t^{i+1}/(i+1)! B F   (5.88)
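The integral identity (5.77) and the series (5.81) can be compared numerically. A minimal sketch; the invertible 2 × 2 matrix A and the time t are arbitrary choices for illustration:

```python
# Compare the series sum_{i>=0} A^i t^{i+1}/(i+1)!  (eq. 5.81)
# with the closed form A^{-1}(e^{At} - I)           (eq. 5.77).
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def expm(A, t, terms=40):
    n = len(A)
    E = [[float(i == j) for j in range(n)] for i in range(n)]
    P = [[float(i == j) for j in range(n)] for i in range(n)]
    for k in range(1, terms):
        P = matmul(P, [[A[i][j] * t / k for j in range(n)] for i in range(n)])
        E = [[E[i][j] + P[i][j] for j in range(n)] for i in range(n)]
    return E

A = [[0.0, 1.0], [-2.0, -3.0]]     # det A = 2, so A is invertible
t = 0.8
n = 2

# Series (5.81), truncated after 30 terms
S = [[0.0] * n for _ in range(n)]
P = [[float(i == j) for j in range(n)] for i in range(n)]  # holds A^i
for i in range(30):
    for a in range(n):
        for b in range(n):
            S[a][b] += P[a][b] * t ** (i + 1) / math.factorial(i + 1)
    P = matmul(P, A)

# Closed form (5.77), with the 2x2 inverse written out explicitly
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
Ainv = [[A[1][1] / det, -A[0][1] / det], [-A[1][0] / det, A[0][0] / det]]
E = expm(A, t)
C = matmul(Ainv, [[E[0][0] - 1.0, E[0][1]], [E[1][0], E[1][1] - 1.0]])

err = max(abs(S[i][j] - C[i][j]) for i in range(2) for j in range(2))
```

Both expressions agree to floating-point precision, as the text asserts for nonsingular A.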
Chapter 6
Controllability
The controllability of a system is related to our ability to steer its state from
any initial state x_1 ∈ R^n to any final state x_2 ∈ R^n in finite time. The ability
to study the controllability of a state space system is absent in the Laplace
transform analysis.
6.1 Controllability of LTI systems
We give the definition of controllability for a state space system:

Definition 21. A system with state equation ẋ = f(t, x(t), u(t)) is said to
be controllable if for every (t_0, x_0) ∈ R_+ × R^n and x_1 ∈ R^n there is a
t_1 ≥ t_0 and a function u : [t_0, t_1] → R^m such that φ_u(t_1; t_0, x_0) = x_1.

If a system is time-invariant, i.e. the function f in its state equation is in-
dependent of time, then every solution φ_u(t; t_0, x_0) is actually independent of
the initial time t_0. Therefore, without infringing the generality we may assume
that t_0 = 0 and then we denote simply φ_u(t; t_0, x_0) ≡ φ_u(t; x_0), with the prop-
erty φ_u(0; x_0) = x_0. Then the definition of controllability is restated more simply as
follows:
Definition 22. A time-invariant system with state equation ẋ = f(x(t), u(t))
is said to be controllable if for every x_0 ∈ R^n and x_1 ∈ R^n there is a T ≥ 0 and
a function u : [0, T] → R^m such that φ_u(T; x_0) = x_1.

In general, for a nonlinear system, the controllability study is far from trivial
and usually only local results can be derived. For LTI systems, however, things
are much simpler. For example, an LTI system is controllable if and only if we
can steer its state from the origin to any state - an observation that simplifies
our analysis. Note that such a property is not true for nonlinear systems.

Proposition 22. For an LTI system, if for all x* ∈ R^n there is a T ≥ 0 and
a u : [0, T] → R^m such that φ_u(T; 0) = x*, then and only then the system is
controllable.
Proof. If the system is controllable, the stated property obviously holds. The
other way, the solution of an LTI system φ_u(t; x_0) is given by:

    φ_u(t; x_0) = e^{At} x_0 + ∫_0^t e^{A(t−s)} B u(s) ds                   (6.1)

and for x_0 = 0 we have

    φ_u(t; 0) = ∫_0^t e^{A(t−s)} B u(s) ds                                  (6.2)

Combining (6.1) and (6.2) we have

    φ_u(t; x_0) = e^{At} x_0 + φ_u(t; 0)                                    (6.3)

Let x* ∈ R^n. We choose a T ≥ 0 and a u : [0, T] → R^m so that

    φ_u(T; 0) = x* − e^{AT} x_0                                             (6.4)

Then

    φ_u(T; x_0) = e^{AT} x_0 + x* − e^{AT} x_0 = x*                         (6.5)

which completes the proof.
Another property, weaker than controllability, is known as reachability of the
origin. A system is said to have reachable origin if, starting from any state, we
can steer it to the origin in finite time. Formally, we give the following definition:

Definition 23. A system with state equation ẋ = f(t, x(t), u(t)) is said to
have reachable origin if for every (t_0, x_0) ∈ R_+ × R^n there is a t_1 ≥ t_0 and a
function u : [t_0, t_1] → R^m such that φ_u(t_1; t_0, x_0) = 0.

This definition is simplified for time-invariant systems:

Definition 24. A time-invariant system with state equation ẋ = f(x(t), u(t))
is said to have reachable origin if for every x_0 ∈ R^n there is a T ≥ 0 and a
function u : [0, T] → R^m such that φ_u(T; x_0) = 0.

Any controllable system has reachable origin; the converse, however, is not in
general true. For LTI systems we will show that controllability is identical to
the reachable-origin property. Therefore an LTI system is controllable if we can
steer its state from any initial state to the origin.
Proposition 23. An LTI system is controllable if and only if its origin is
reachable.

Proof. We assume that an LTI system has reachable origin and we will show that
it is controllable. Let x_1 ∈ R^n. Then there is a t_1 ≥ 0 and a u : [0, t_1] → R^m
so that:

    φ_u(t_1; x_1) = 0                                                        (6.6)
    Φ(t_1, 0) x_1 + φ_u(t_1; 0) = 0                                          (6.7)
    Φ(0, t_1)Φ(t_1, 0) x_1 + Φ(0, t_1) φ_u(t_1; 0) = 0                       (6.8)
    x_1 + Φ(0, t_1) φ_u(t_1; 0) = 0                                          (6.9)
    x_1 = −Φ(0, t_1) φ_u(t_1; 0)                                             (6.10)
    x_1 = −∫_0^{t_1} e^{−As} B u(s) ds                                       (6.11)

Now we left-multiply both sides of (6.11) with the matrix e^{At_1}, so we get

    −e^{At_1} x_1 = ∫_0^{t_1} e^{A(t_1−s)} B u(s) ds                          (6.12)

and, setting Ā ≜ −e^{At_1},

    Ā x_1 = φ_u(t_1; 0)                                                       (6.13)

So starting from the initial state 0, it is possible to reach all states z = Ā x with
x ∈ R^n. Since the matrix Ā is invertible, any state y ∈ R^n can be written as
y = Ā (Ā^{-1} y), that is, setting x_1 = Ā^{-1} y we have y = Ā x_1. So equation
(6.13) reveals that there exists a proper u : [0, t_1] → R^m so that the state of
the system becomes y in finite time, and this for any y ∈ R^n, which according to
proposition 22 implies that the system is controllable.
One intuitively understands at this point that 0 in the previous two propo-
sitions can be replaced by any other point in R^n. Indeed, this is stated in the
following proposition. The proof is left to the reader as an exercise.

Proposition 24. Let w ∈ R^n. The following are equivalent:
1. The LTI system ẋ = Ax + Bu is controllable
2. For every x_1 ∈ R^n there is a t_1 ≥ 0 and a u : [0, t_1] → R^m so that
   φ_u(t_1; x_1) = w
3. For every x_1 ∈ R^n there is a t_1 ≥ 0 and a u : [0, t_1] → R^m so that
   φ_u(t_1; w) = x_1
These propositions have given us an insight regarding the controllability of
LTI systems, but the question "how do we know whether a given LTI system
is controllable" is still to be answered. The answer is given in the following
theorem, which makes use of the controllability matrix of the LTI system (see
definition 18).

Theorem 25. An LTI system is controllable if and only if its controllability
matrix is full rank.

Proof. According to proposition 22, an LTI system is controllable if it can reach
any state in R^n in finite time starting from 0 using some input u, that is, if

    φ_u(t_1; 0) = ∫_0^{t_1} e^{A(t_1−s)} B u(s) ds                            (6.14)

spans R^n (can be equal to any n-dimensional vector). For this to be true, the
following condition should hold [the only vector that is normal to all other
vectors in R^n is the zero vector]:

    y' ∫_0^{t_1} e^{A(t_1−s)} B u(s) ds = 0 for all u  ⟹  y = 0               (6.15)

where y ∈ R^n. Assume that y' ∫_0^{t_1} e^{A(t_1−s)} B u(s) ds = 0 for every
admissible input u. Then for all s ∈ [0, t_1] it holds that y' e^{A(t_1−s)} B = 0.
Differentiating the last equation at s = t_1 we have:

    d/ds [ y' e^{A(t_1−s)} B ]_{s=t_1} = 0  ⟹  y' A B = 0                     (6.16)

and again, taking higher-order derivatives on both sides of y' e^{A(t_1−s)} B = 0,
we have:

    (−1)^i y' A^i B = 0 for i = 0, 1, . . .                                    (6.17)

According to the Cayley-Hamilton theorem, it suffices to keep only the first n
terms, therefore (6.17) becomes

    y' [B  AB  A²B  · · ·  A^{n−1}B] = 0  ⟺  y' C(A, B) = 0                   (6.18)

In order to have y = 0 for all y that satisfy (6.18), the controllability matrix has
to be full rank.
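Theorem 25 can be turned directly into a rank test. A minimal sketch with exact rational arithmetic; the two pairs below are small illustrative examples, not taken from the text:

```python
# Build the controllability matrix [B AB ... A^{n-1}B] and check its rank.
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def ctrb(A, B):
    """Controllability matrix, built column-block by column-block."""
    n = len(A)
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(matmul(A, blocks[-1]))
    return [[x for blk in blocks for x in blk[i]] for i in range(n)]

def rank(M):
    """Exact rank via Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

# A controllable pair (a double integrator) ...
r1 = rank(ctrb([[0, 1], [0, 0]], [[0], [1]]))
# ... and an uncontrollable one: the first state is not influenced by u.
r2 = rank(ctrb([[-1, 0], [0, 0]], [[0], [1]]))
```

Here `r1 = 2 = n` (controllable) while `r2 = 1 < n` (not controllable).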
Remark: A system is controllable if it is possible for an external system (e.g.
a controller) to steer its state anywhere from any initial state in finite time.
This might be considered too demanding. For example, in practice we
might need only some of the state variables to be controllable, while all other
state variables behave in an asymptotically stable manner. In the sequel we
give an example of a non-controllable system which is controlled by a simple
linear feedback controller.
Example 12. The following system

    ẋ(t) = [ −1  0 ] x(t) + [ 0 ] u(t)                                       (6.19)
           [  0  0 ]        [ 1 ]
    y(t) = x(t)                                                              (6.20)

is not controllable. Indeed, one can verify that its controllability matrix is:

    C(A, B) = [ 0  0 ]                                                       (6.21)
              [ 1  0 ]

which obviously has rank 1. Let us use the following linear feedback

    u(t) = −[0  1] y(t)                                                      (6.22)

Then, the closed-loop system becomes:

    ẋ(t) = [ −1  0 ] x(t) − [ 0 ] [0  1] y(t)                                (6.23)
           [  0  0 ]        [ 1 ]
          = [ −1   0 ] x(t)                                                  (6.24)
            [  0  −1 ]

The solution of this system is then:

    φ_u(t; x_0) = [ e^{−t}    0    ] x_0                                     (6.25)
                  [   0     e^{−t} ]

Notice now that for every x_0 ∈ R^n we have lim_{t→∞} φ_u(t; x_0) = 0, i.e. the
trajectory of the closed-loop system will approach the origin asymptotically as
t → ∞. Although the system (6.19) does not have reachable origin, we can stabilize
it asymptotically using a simple linear feedback controller. Note that it is not
possible for the trajectory to reach 0 in finite time (not with this, nor with any
other feedback controller), otherwise the system would be controllable.

Note: A notion close to controllability but slightly weaker is that of
stabilizability. A system is said to be stabilizable if all its uncontrollable states
are stable; hence, even if they cannot be controlled, all of them remain bounded,
while the controllable states can be moved to the desired position.
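The closed loop of example 12 can be checked numerically: the feedback u = −[0 1]y makes the closed-loop matrix −I, so the state decays as e^{−t}x_0 without reaching 0 in finite time. A minimal sketch:

```python
# Example 12: closed-loop matrix A + B(-[0 1]) = -I and its transition matrix.
import math

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def expm(A, t, terms=40):
    n = len(A)
    E = [[float(i == j) for j in range(n)] for i in range(n)]
    P = [[float(i == j) for j in range(n)] for i in range(n)]
    for k in range(1, terms):
        P = matmul(P, [[A[i][j] * t / k for j in range(n)] for i in range(n)])
        E = [[E[i][j] + P[i][j] for j in range(n)] for i in range(n)]
    return E

A = [[-1.0, 0.0], [0.0, 0.0]]
B = [[0.0], [1.0]]
K = [[0.0, -1.0]]                       # feedback u(t) = -[0 1] y(t)
BK = matmul(B, K)
Acl = [[A[i][j] + BK[i][j] for j in range(2)] for i in range(2)]   # = -I

t = 2.0
Phi = expm(Acl, t)                      # should be diag(e^{-t}, e^{-t})
err = max(abs(Phi[0][0] - math.exp(-t)), abs(Phi[1][1] - math.exp(-t)),
          abs(Phi[0][1]), abs(Phi[1][0]))
```

The transition matrix matches diag(e^{−t}, e^{−t}), the solution given in (6.25).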
6.2 Controllability and Equivalence
The property of controllability is preserved under coordinate transformations (or,
otherwise put, controllability does not depend on the choice of coordinates for
the system). The following is a very useful result:
Proposition 26. Let Σ_1 and Σ_2 be two equivalent LTI systems. Then Σ_1 is
controllable if and only if Σ_2 is controllable.

Proof. Let Σ_1 = (A, B) and Σ_2 = (Ā, B̄) with Σ_2 ~_T Σ_1, that is, there is an
invertible matrix T ∈ M_n(R) so that Ā = TAT^{-1} and B̄ = TB. Let us assume
that Σ_1 is controllable. The controllability matrix of Σ_1,

    C(A, B) = [B  AB  · · ·  A^{n−1}B]                                        (6.26)

is full rank. The controllability matrix of Σ_2 is then:

    C(Ā, B̄) = [B̄  ĀB̄  · · ·  Ā^{n−1}B̄]                                    (6.27)
             = [TB  TAT^{-1}TB  · · ·  TA^{n−1}T^{-1}TB]                      (6.28)
             = [TB  TAB  TA²B  · · ·  TA^{n−1}B]                              (6.29)
             = T C(A, B)                                                      (6.30)

and since T is invertible and C(A, B) is full rank, C(Ā, B̄) is also full rank and
Σ_2 is controllable.
It is also easy to show that:

Proposition 27. If the matrices A and Ā are similar and the pairs (A, B)
and (Ā, B̄) are controllable, then the systems ẋ = Ax + Bu and ż = Āz + B̄u
are equivalent.

Proof. The proof is left to the reader as an exercise.

It is quite straightforward to prove the following proposition, which provides
a useful equivalent to controllable LTI systems: their Canonical Controllable
Form.

Proposition 28. Every controllable LTI system is equivalent to its Canonical
Controllable Form (see section 4.4).

Proof. The proof is straightforward from (4.147): let Σ be a given controllable
LTI system. According to theorem 25, its controllability matrix is full rank,
therefore the transformation in (4.147) can be calculated. Notice that, according
to proposition 26, and since Σ is equivalent to its CCF which is controllable, it
is itself controllable.
6.3 Advanced topics
Before we proceed to the next result, we should recall the following property
from Linear Algebra. Let K, L ∈ M_n(R) with K full rank. Then:

    rank(KL) = rank(L)                                                        (6.31)

Also, any matrix H ∈ M_{l×k}(R) can be written column-wise as:

    H = [h_1  h_2  · · ·  h_k]                                                (6.32)

where h_i ∈ R^l, and it holds that, for all i = 1, 2, . . . , k,

    rank H = rank [h_1  h_2  · · ·  h_k | h_i]                                (6.33)

That is, the rank of a matrix does not change if we augment it by any column
vector that is already in it. The same holds for any linear combination of the
columns of H:

    rank H = rank [h_1  h_2  · · ·  h_k | Σ_{i=1}^k α_i h_i]                  (6.34)
Proposition 29. Let A ∈ M_n(R), B ∈ M_{n×m}(R) and C ∈ M_{p×n}(R). Assume
that the pair (A, B) is controllable and that the matrix

    J = [ A  B ]                                                              (6.35)
        [ C  0 ]

has full (row) rank n + p. Then the augmented system

    ż = [ A  0 ] z + [ B ] v                                                  (6.36)
        [ C  0 ]     [ 0 ]

is controllable.

Proof. First of all, let us define

    Ã ≜ [ A  0 ] ,  Γ ≜ [ B ]                                                 (6.37)
        [ C  0 ]        [ 0 ]

so that the augmented system reads ż = Ãz + Γv. One verifies by induction that

    Ã^i Γ = [ A^i B ; C A^{i−1} B ]  for i = 1, 2, . . .                       (6.38)

Since n ≤ n + p − 1, the columns of [Γ  ÃΓ  · · ·  Ã^n Γ] are among the columns
of the controllability matrix

    C(Ã, Γ) = [Γ  ÃΓ  · · ·  Ã^{n+p−1}Γ]                                      (6.39)

and therefore

    rank C(Ã, Γ) ≥ rank [ÃΓ  Ã²Γ  · · ·  Ã^n Γ  Γ]                            (6.40)

Using (6.38) and substituting the blocks, we have

    rank [ÃΓ  · · ·  Ã^n Γ  Γ] = rank [ AB  A²B  · · ·  A^n B      B ]        (6.41)
                                      [ CB  CAB  · · ·  CA^{n−1}B  0 ]

                               = rank ( [ A  B ] [ C(A, B)   0  ] )
                                      ( [ C  0 ] [    0     I_m ] )

The first factor is J, which has full row rank n + p by assumption; the second
factor has full row rank n + m, because C(A, B) has full row rank n (the pair
(A, B) being controllable). The product of two matrices of full row rank has
full row rank, so the rank above equals n + p. Hence rank C(Ã, Γ) = n + p,
which corroborates our initial claim and completes the proof.
The controllability matrix has very interesting algebraic properties. Let us
recall from Linear Algebra that every matrix defines a linear subspace through
its image. Let us briefly give a couple of definitions:

Definition 25 (Image). Let K ∈ M_n(R). The image (or column space) of a
matrix is the following set:

    Im K ≜ R(K) = { y ∈ R^n | ∃ x ∈ R^n : y = Kx } ⊆ R^n                      (6.42)

It is easy to verify that for every K ∈ M_n(R), its image is a linear subspace
of R^n. The rank of a matrix is actually the dimension of its image, and if a
matrix is full rank then its image is R^n. We now need to introduce the notion
of complementary linear spaces:
of complementary linear spaces:
Denition 26 (Complementary Spaces). Let 1 be a linear subspace of R
n
(we
denote 1 R
n
). There is a linear space J R
n
with the following properties
1. Their only common vector is the zero vector, i.e. 1 J = 0
2. For every x R
n
there is a v 1 and a w J such that x = v + w,
i.e. 1 and J together span R
n
; we denote this by 1 +J = R
n
.
3. For every v 1 and w J it holds that v

w = w

v = 0, i.e. v w.
Then J is called the complementary space with respect to 1 and we denote
J = 1

.
Note 1: We leave it to the reader as an exercise to verify that there is such
a subspace (the denition is well-posed) and that every x R
n
is written in a
unique way as x = v +w for v 1 and w J.
Note 2: As a result of the rank-nullity theorem, if 1 R
n
then:
dim1 + dim1

= n (6.43)
Example 13 (Complementary Space). Consider a subspace V of R^3 that is
spanned by the following vectors:

    x_1 = [ 1 ]  and  x_2 = [ 0 ]                                              (6.44)
          [ 1 ]             [ 1 ]
          [ 0 ]             [ 1 ]

i.e. V = span{x_1, x_2} or, what is the same, V is the image of the matrix
T = [x_1  x_2]. From the rank-nullity theorem we know that dim V^⊥ = 1. Hence
we are looking for a vector y such that y ⊥ x_1 and y ⊥ x_2, i.e. x_1'y = 0 and
x_2'y = 0. This can be written in matrix notation as follows:

    T'y = 0  ⟺  y ∈ ker T'                                                     (6.45)

This way we find:

    y = [  1 ]                                                                 (6.46)
        [ −1 ]
        [  1 ]
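The orthogonality conditions of example 13 are easy to verify in code. A minimal sketch:

```python
# Example 13: y = (1, -1, 1) is orthogonal to both spanning vectors of V,
# i.e. y lies in ker T', the one-dimensional complement V-perp.
x1 = [1, 1, 0]
x2 = [0, 1, 1]
y = [1, -1, 1]

dot = lambda a, b: sum(p * q for p, q in zip(a, b))
d1 = dot(x1, y)   # x_1' y
d2 = dot(x2, y)   # x_2' y
print(d1, d2)
```

Both inner products vanish, confirming y ∈ ker T'.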
We now give the definition of an A-invariant linear subspace:

Definition 27 (Invariant Subspace). Let A ∈ M_n(R) and V ⊆ R^n. V is said
to be A-invariant if

    v ∈ V ⟹ Av ∈ V                                                             (6.47)

For the study of controllability of linear systems, the linear space induced by
the controllability matrix plays an important role. This defines the controlla-
bility space as follows:

Definition 28 (Controllability Space). The image of the controllability matrix
of an LTI system is called the controllability space of the system; we write
C = Im C(A, B).

The following result characterizes the algebraic structure of the controllabil-
ity space and is an extremely useful tool for proving many important properties.

Lemma 2. The controllability space C = Im C(A, B) is A-invariant and contains
the image of B.

Proof. The proof is easy and is left as an exercise. Hint: use the
Cayley-Hamilton theorem where needed.
Lemma 3. Let A and Ā = TAT^{-1} be two similar matrices in M_n(R). Then
their images are related by Im Ā = T Im A; in particular, they have the same
dimension (the same rank).

Proof. The proof is left as an exercise. Hint: let T ∈ M_n(R) be a nonsingular
matrix. Then Im T = R^n, which means that every x ∈ R^n can be written as
x = Tz for some z ∈ R^n.

Lemma 4. Let A and Ā = TAT^{-1} be two similar matrices in M_n(R). Then a
linear subspace W ⊆ R^n is A-invariant if and only if TW is Ā-invariant.
So far we have examined properties of controllable linear systems and we have
also stated criteria for their controllability. In proposition 28 we provided a
structural description of controllable LTI systems. The following proposition
gives a better understanding of what a non-controllable system looks like in
terms of equivalence. In fact, n − rank C(A, B) is introduced as a measure of the
non-controllability of the system.
Proposition 30 (Kalman Input-State Decomposition). Consider an LTI
system with matrices A and B that is not controllable, with rank C(A, B) =
s < n. Then there exists an invertible matrix T such that Ã = TAT^{-1} and
B̃ = TB have the following structure:

    Ã = [ A_11  A_12 ]                                                         (6.48)
        [  0    A_22 ]

with A_11 ∈ M_s(R), A_12 ∈ M_{s×(n−s)}(R), 0 ∈ M_{(n−s)×s}(R) and
A_22 ∈ M_{(n−s)×(n−s)}(R), and

    B̃ = [ B_1 ]                                                               (6.49)
        [  0  ]

where B_1 ∈ M_{s×m}(R) and 0 = 0_{(n−s)×m}. Additionally, the pair (A_11, B_1) is
controllable.
Proof. The linear subspace C = Im C(A, B) ⊆ R^n has dimension s, therefore
there are exactly s linearly independent vectors {c_i}_{i=1}^s which span C. According
to the basis extension theorem of Linear Algebra, there are n − s vectors {w_i}_{i=1}^{n−s}
which span the vector space C^⊥ (the complementary linear subspace of C in R^n),
that is:

    C ⊕ C^⊥ = R^n                                                               (6.50)

which implies that for every x ∈ C and y ∈ C^⊥ we have x'y = y'x = 0. Now
let:

    T_1 ≜ [ c_1  c_2  · · ·  c_s ]                                               (6.51)

and

    T_2 ≜ [ w_1  w_2  · · ·  w_{n−s} ]                                           (6.52)

We now define

    T ≜ [ T_1' ]                                                                 (6.53)
        [ T_2' ]

Then

    TB = [ T_1' ] B = [ T_1'B ]                                                  (6.54)
         [ T_2' ]     [ T_2'B ]

But due to (6.50), and because Im B ⊆ C (lemma 2), it is T_2'B = 0; therefore

    TB = [ T_1'B ]                                                               (6.55)
         [   0   ]

which is (6.49) with B_1 = T_1'B. Let now x ∈ R^n be any vector; it can be
partitioned (in a unique way) as

    x = [ x_1 ]                                                                  (6.56)
        [ x_2 ]

where x_1 ∈ R^s and x_2 ∈ R^{n−s}. Let the matrix Ã be written as follows:

    Ã = [ A_11  A_12 ]                                                           (6.57)
        [ A_21  A_22 ]

Then

    Ã x = [ A_11 x_1 + A_12 x_2 ]                                                (6.58)
          [ A_21 x_1 + A_22 x_2 ]

But C is A-invariant (lemma 2), so by lemma 4 the subspace T C is Ã-invariant;
moreover, T C = R^s × {0}, since Tc = [T_1'c; 0] for every c ∈ C. It follows
that:

    Ã [ x_1 ] = [ A_11 x_1 ] ,  x_1 ∈ R^s                                        (6.59)
      [  0  ]   [    0     ]

which implies that A_21 = 0. Now, in order to prove that the pair (A_11, B_1) is
controllable, we calculate the following rank (recall from the proof of proposi-
tion 26 that C(Ã, B̃) = T C(A, B), so rank C(Ã, B̃) = s):

    s = rank C(Ã, B̃) = rank [ B_1  A_11 B_1  · · ·  A_11^{n−1} B_1 ]             (6.60)
                            [  0      0      · · ·       0        ]
      = rank [ B_1  A_11 B_1  · · ·  A_11^{n−1} B_1 ]                             (6.61)
      = rank C(A_11, B_1)                                                         (6.62)

where the last equality follows from the Cayley-Hamilton theorem applied to the
s × s matrix A_11 (s ≤ n). This completes the proof.
Remark: Proposition 30 tells us that the system in the new coordinates z = Tx
admits the representation:

    ż = [ A_11  A_12 ] z + [ B_1 ] u                                              (6.63)
        [  0    A_22 ]     [  0  ]

This allows us to partition the vector z in the following form:

    z = [ z_co ]                                                                  (6.64)
        [ z_uc ]

where z_co ∈ R^s stands for the controllable states of z and z_uc ∈ R^{n−s} for the
uncontrollable ones. From (6.63) we have that:

    ż_co = A_11 z_co + A_12 z_uc + B_1 u                                          (6.65)
    ż_uc = A_22 z_uc                                                              (6.66)
Example 14 (Decomposition into controllable and uncontrollable subsys-
tems). We will apply the methodology outlined within the proof of the previous
theorem to the following 3 × 3 system with one input:

    ẋ = [ 1   1  −1 ] x + [ 3 ] u                                                 (6.67)
        [ 2  −1   0 ]     [ 6 ]
        [ 1  −2   1 ]     [ 3 ]

The system is not controllable, since its controllability matrix is not full rank:

    C(A, B) = [ 3   6  12 ]                                                       (6.68)
              [ 6   0  12 ]
              [ 3  −6   0 ]

and rank C(A, B) = 2 < 3 or, what is the same, dim C = 2. Let y ∈ C. Then
there is an x ∈ R^n so that:

    y = C(A, B) x                                                                 (6.69)
      = [ 3   6  12 ] [ x_1 ]                                                     (6.70)
        [ 6   0  12 ] [ x_2 ]
        [ 3  −6   0 ] [ x_3 ]
      = 3 [ x_1 + 2x_2 + 4x_3 ]                                                   (6.71)
          [ 2x_1 + 4x_3       ]
          [ x_1 − 2x_2        ]

For x^(1) = [1/3  0  0]' we get y^(1) = [1  2  1]' and for x^(2) = [0  1/3  0]'
we get y^(2) = [2  0  −2]'. The vectors y^(1) and y^(2) span C. Let:

    T_1 = [ y^(1)  y^(2) ] = [ 1   2 ]                                            (6.72)
                             [ 2   0 ]
                             [ 1  −2 ]

We now need to determine a vector y^(3) so that y^(3) ⊥ y^(1) and y^(3) ⊥ y^(2).
This vector should belong to the kernel of T_1' (see example 13). Hence we find:

    y^(3) = [  1 ] = T_2                                                          (6.73)
            [ −1 ]
            [  1 ]

So T is:

    T = [ T_1' ] = [ 1   2   1 ]                                                  (6.74)
        [ T_2' ]   [ 2   0  −2 ]
                   [ 1  −1   1 ]

and the system becomes:

    ż = [  0    1.5     3   ] z + [ 18 ] u                                        (6.75)
        [ 4/3    1   −10/3  ]     [  0 ]
        [  0     0      0   ]     [  0 ]

This allows us now to decompose the system into controllable and uncontrollable
parts, which in particular are:

    ż_co = [  0   1.5 ] z_co + [   3   ] z_uc + [ 18 ] u                          (6.76)
           [ 4/3   1  ]        [ −10/3 ]        [  0 ]
    ż_uc = 0                                                                      (6.77)
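The computations of example 14 can be double-checked without inverting T, since TAT^{-1} = Ã is equivalent to TA = ÃT. A minimal sketch with exact rational arithmetic, using the matrices of the example above:

```python
# Example 14: check the similarity relation in the form T A = A~ T,
# together with T B = B~, using exact fractions.
from fractions import Fraction as F

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A  = [[1, 1, -1], [2, -1, 0], [1, -2, 1]]
B  = [[3], [6], [3]]
T  = [[1, 2, 1], [2, 0, -2], [1, -1, 1]]
At = [[F(0), F(3, 2), F(3)],            # A~ from (6.75)
      [F(4, 3), F(1), F(-10, 3)],
      [F(0), F(0), F(0)]]
Bt = [[18], [0], [0]]                   # B~ from (6.75)

lhs = matmul(T, A)      # T A
rhs = matmul(At, T)     # A~ T
ok_A = all(lhs[i][j] == rhs[i][j] for i in range(3) for j in range(3))
ok_B = matmul(T, B) == Bt
```

Both checks pass exactly, and the zero last row of Ã confirms the uncontrollable part ż_uc = 0.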
The following result is known as the Hautus Lemma and states an equivalent
condition for controllability that is easily checked in practice and can prove
quite useful in proving other statements on controllability.

Lemma 5 (Hautus). The following are equivalent:
1. The pair (A, B) is controllable
2. rank [λI − A  B] = n for all λ ∈ C
3. rank [λI − A  B] = n for all λ ∈ C that are eigenvalues of A.
Proof. 1. ⟹ 2. Let us assume that (A, B) is controllable but there is some
λ_0 ∈ C such that rank [λ_0 I − A  B] < n. This means that there is a nonzero
vector z ∈ C^n such that:

    z' [λ_0 I − A  B] = 0'  where 0 ∈ R^{n+m}                                     (6.78)

Therefore:

    z' (λ_0 I − A) = 0'  ⟺  λ_0 z' = z'A  where 0 ∈ R^n                           (6.79)

and

    z' B = 0'  where 0 ∈ R^m                                                      (6.80)

Thus one has that z' A^k B = λ_0^k z' B = 0, which further implies that:

    z' C(A, B) = 0                                                                (6.81)

hence C(A, B) is not full rank, which contradicts the controllability assumption.

2. ⟹ 1. Assume that for all λ ∈ C it holds that rank [λI − A  B] = n, and
let us assume that (A, B) is not controllable - say rank C(A, B) = s < n.
Then the system admits a Kalman decomposition, i.e. there is a T ∈ GL(n, R)
so that

    Ã ≜ TAT^{-1} = [ A_11  A_12 ]                                                 (6.82)
                   [  0    A_22 ]

and

    B̃ ≜ TB = [ B_1 ]                                                             (6.83)
              [  0  ]

Let λ be an eigenvalue of A_22 and let v be a corresponding left eigenvector of
A_22 (a left eigenvector of a matrix A associated with an eigenvalue λ is a vector
ℓ such that ℓ'A = λℓ'), that is:

    v' (λI − A_22) = 0'                                                           (6.84)

We now define:

    w ≜ [ 0 ]                                                                     (6.85)
        [ v ]

Notice that w is a left eigenvector of Ã (i.e. w'(Ã − λI) = 0', where λ is
the eigenvalue of A_22 that corresponds to v) and that w'B̃ = 0'. Define the
nonzero vector p = T'w ∈ C^n and notice that:

    p' [(λI − A)T^{-1}  B] = w' [(λI − Ã)  TB] = 0'                               (6.86)

Since T^{-1} is full rank, the condition p'(λI − A)T^{-1} = 0' implies that
p'(λI − A) = 0', and (6.86) becomes:

    p' [λI − A  B] = 0'  with p ≠ 0                                               (6.87)

This implies that rank [λI − A  B] < n and contradicts our initial assump-
tion.

2. ⟺ 3. The direction 2. ⟹ 3. is obvious; it remains to show that 3. ⟹ 2.
If λ is not an eigenvalue of A, then λI − A is nonsingular, so the block
[λI − A  B] already has rank n; for the eigenvalues of A, the rank condition
holds by assumption.
Corollary 31. If the pair (A, B) is controllable then rank [A  B] = n.

Note: As a result of this lemma, a system is controllable if and only if the
following implication holds for all λ ∈ C:

    z' [λI − A | B] = 0'_{1×(n+m)}  ⟹  z = 0_{n×1}                               (6.88)

where z ∈ C^n. This is a consequence of the second condition of the Hautus
lemma. This condition in turn means that, for all λ ∈ C:

    z'(λI − A) = 0' and z'B = 0'  ⟹  z = 0                                        (6.89)
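The Hautus test is easy to run in code. A minimal sketch using the uncontrollable pair of example 12, whose eigenvalues are −1 and 0: the rank of [λI − A  B] drops below n exactly at the uncontrollable eigenvalue λ = −1.

```python
# Hautus test: rank [lambda*I - A, B] at each eigenvalue of A.
from fractions import Fraction

def rank(M):
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def hautus_block(A, B, lam):
    """Rows of the block matrix [lam*I - A, B]."""
    n = len(A)
    return [[lam * (i == j) - A[i][j] for j in range(n)] + B[i]
            for i in range(n)]

A = [[-1, 0], [0, 0]]
B = [[0], [1]]
r_at_0  = rank(hautus_block(A, B, 0))    # = 2 = n: the test passes at 0
r_at_m1 = rank(hautus_block(A, B, -1))   # < n: uncontrollable mode at -1
```

By item 3 of the lemma, the rank drop at λ = −1 certifies that (A, B) is not controllable, in agreement with the rank-1 controllability matrix found in example 12.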
6.3.1 Controllability under feedback
Controllability is a property that is preserved under feedback. If the system with
matrices (A, B) is controllable, then the following system is also controllable:

    ẋ = (A + BK')x + Bv                                                           (6.90)

The structure of this system is presented in figure 6.1.

Figure 6.1: System (A + BK', B) with input v

Proposition 32. If (A, B) is a controllable pair, then for every K ∈ M_{n×m}(R)
the pair (A + BK', B) is also controllable.

Proof. For the proof we employ the Hautus lemma. For λ ∈ C and z ∈ C^n, let
us assume that:

    z' [λI − (A + BK')  B] = 0'                                                    (6.91)

Therefore

    z' (λI − (A + BK')) = 0'                                                       (6.92)
    z' B = 0'                                                                      (6.93)

From the first equation one has that:

    λz' − z'A − z'BK' = 0'                                                         (6.94)

And due to (6.93) we have:

    z'(λI − A) = 0' and z'B = 0'  ⟹  z' [λI − A  B] = 0'                           (6.95)

and again, because of the Hautus lemma (and because the pair (A, B) is con-
trollable):

    z = 0                                                                          (6.96)

which completes the proof (read the note under lemma 5).
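Proposition 32 can be spot-checked numerically: for a controllable pair, the controllability matrix of (A + BK', B) stays full rank. A minimal sketch; the pair and the gains below are arbitrary illustrations (K is written directly as the row vector K'):

```python
# Feedback preserves controllability: rank of ctrb(A + BK', B) for several K.
from fractions import Fraction

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def ctrb(A, B):
    n = len(A)
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(matmul(A, blocks[-1]))
    return [[x for blk in blocks for x in blk[i]] for i in range(n)]

def rank(M):
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

A = [[0, 1], [0, 0]]          # a controllable pair (double integrator)
B = [[0], [1]]
ranks = []
for Kt in ([[0, 0]], [[-2, -3]], [[5, 1]], [[7, -4]]):
    BK = matmul(B, Kt)                     # B K', with Kt the 1 x n row K'
    Acl = [[A[i][j] + BK[i][j] for j in range(2)] for i in range(2)]
    ranks.append(rank(ctrb(Acl, B)))
```

Every closed-loop pair remains controllable (rank 2), as the proposition guarantees.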
6.3.2 Connection to the Diagonal Realization
Earlier, in section 4.2, we referred to the connection between the diagonal re-
alization of a system and its controllability. Consider a system in the form
(diagonalized, and with the coefficients of the input vector normalized to 1):

    d/dt [ ξ_1 ]   [ λ_1              ] [ ξ_1 ]   [ 1 ]
         [ ξ_2 ] = [      λ_2         ] [ ξ_2 ] + [ 1 ] u(t)                       (6.97)
         [  ⋮  ]   [          ⋱       ] [  ⋮  ]   [ ⋮ ]
         [ ξ_n ]   [             λ_n  ] [ ξ_n ]   [ 1 ]

Then the controllability matrix of this system is:

    C = [ 1  λ_1  λ_1²  · · ·  λ_1^{n−1} ]                                         (6.98)
        [ 1  λ_2  λ_2²  · · ·  λ_2^{n−1} ]
        [ ⋮   ⋮    ⋮             ⋮       ]
        [ 1  λ_n  λ_n²  · · ·  λ_n^{n−1} ]

Observe now that if two eigenvalues are equal, e.g. λ_1 = λ_2, then C right away
has two identical rows and its rank becomes less than n; therefore the system
is not controllable. C possesses a very special form known as a Vandermonde
matrix over the set of eigenvalues (Vandermonde matrices have numerous
applications in numerical analysis and the Discrete Fourier Transform).

Proposition 33. Consider the system

    ẋ = Λx + Bu                                                                    (6.99)

with Λ ∈ dg(n, R) and B ∈ M_{n×1}({0, 1}), i.e. Λ is a diagonal matrix and the
elements of B are either 0 or 1. This system is controllable if and only if B has
no zero entries and Λ does not have repeated entries on its diagonal.

The above result can be extended to systems with many inputs. This way
we state the following condition for controllability.

Proposition 34 (Modal Controllability Criterion). Consider the LTI system
with state equation ẋ = Ax + Bu where A is diagonalizable, i.e. there is a
T ∈ GL(n, R) so that Λ = TAT^{-1} is diagonal. The diagonal form of the
system is:

    ẋ = Λx + B̃u                                                                   (6.100)

where B̃ = TB. If any of the rows of B̃ is zero, then the system is not control-
lable.
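The Vandermonde structure of (6.98) makes proposition 33 easy to check in code. A minimal sketch; the eigenvalue lists are arbitrary illustrations:

```python
# Diagonal system with unit input vector: the controllability matrix is a
# Vandermonde matrix, full rank iff the diagonal entries are pairwise distinct.
from fractions import Fraction

def rank(M):
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def vandermonde(eigs):
    """Row i is [1, lam_i, lam_i^2, ...], exactly as in (6.98)."""
    n = len(eigs)
    return [[lam ** j for j in range(n)] for lam in eigs]

r_distinct = rank(vandermonde([1, 2, 3]))   # full rank: controllable
r_repeated = rank(vandermonde([1, 2, 2]))   # two identical rows: rank drops
```

With distinct eigenvalues the rank is n = 3; a repeated eigenvalue duplicates a row and the rank drops to 2.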
6.3.3 Controllability Gramian
Until now we just know that if a system is controllable then, for every pair of
states x_1 and x_2, there exists an input function u(t) (depending on these states)
that can steer the system's state from x_1 (at t = 0) to x_2 at some desired
finite time t = T. What we do not know yet is how to determine an input signal
u(t) over [0, T] so that x(0) = x_1 and x(T) = φ_u(T; x_1) = x_2. For that purpose,
let us introduce the following function, known as the Controllability Gramian of
the system:
Definition 29 (Controllability Gramian). For a pair of matrices (A, B) that
correspond to an LTI system, we define the Controllability Gramian of the sys-
tem to be the mapping W_c : R → M_n(R) with:

    W_c(t) ≜ ∫_0^t e^{A(t−τ)} B B' e^{A'(t−τ)} dτ                                  (6.101)

A simple change of variables σ(τ) = t − τ, meaning dσ = −dτ, yields:

    W_c(t) = ∫_{σ(t)}^{σ(0)} e^{Aσ} B B' e^{A'σ} dσ = ∫_0^t e^{Aσ} B B' e^{A'σ} dσ  (6.102)

The first thing to notice here is that W_c is positive semidefinite (we denote
W_c(t) ⪰ 0 for all t ≥ 0), i.e. for every vector z it holds that:

    z' W_c(t) z ≥ 0 for all t ≥ 0                                                  (6.103)

Indeed, let z ∈ R^n. Then:

    z' W_c(t) z = ∫_0^t z' e^{Aτ} B B' e^{A'τ} z dτ                                (6.104)
                = ∫_0^t (B' e^{A'τ} z)' (B' e^{A'τ} z) dτ                          (6.105)
                = ∫_0^t ‖B' e^{A'τ} z‖² dτ ≥ 0                                     (6.106)
On Inner Product Spaces: In order to understand what follows we need to
make some remarks meticulously. So let V be an inner product space. A vector
y is said to be perpendicular to x if ⟨y, x⟩ = 0; we also denote y ⊥ x. Let
A ⊆ V; then we say that y is perpendicular to A (we denote y ⊥ A) if it is
perpendicular to all vectors in A. If A is a linear subspace of V and B_A is
a basis for it, then y is perpendicular to A if it is perpendicular to all vectors in
B_A (y ⊥ B_A). Finally, the only vector in V that is perpendicular to V is 0:

    x ⊥ V  ⟺  ⟨x, y⟩ = 0 for all y ∈ V  ⟺  x = 0                                   (6.107)

So for example in R^n, recall that the inner product between two vectors is
defined as

    ⟨x, y⟩ ≜ y'x                                                                   (6.108)

Let {e_i}_{i=1}^n be a basis for R^n and y ∈ R^n such that:

    ⟨y, e_i⟩ = 0 for i = 1, 2, . . . , n                                            (6.109)

Then it follows that y = 0.
A more interesting example involves the vector space L²([0, 1]), which is defined
as follows:

    L²([0, 1]) ≜ { f : [0, 1] → R ; ∫_0^1 f²(t) dt < ∞ }                            (6.110)

This is an inner product space and its inner product is defined as follows:

    ⟨f, g⟩ ≜ ⟨f, g⟩_{L²([0,1])} ≜ ∫_0^1 f(t) g(t) dt                                (6.111)

So once again f ⊥ L²([0, 1]) means that:

    ⟨f, g⟩ = 0 for all g ∈ L²([0, 1])                                               (6.112)

or equivalently

    ∫_0^1 f(t) g(t) dt = 0 for all g ∈ L²([0, 1])                                   (6.113)

which implies that f = 0, i.e. f(t) = 0 for all t ∈ [0, 1].
Proposition 35. An LTI system is controllable if and only if its controllability
Gramian is nonsingular.

Proof. 1. Assume that the system is controllable. Then, as we already know, it
is possible to steer the system from the origin x = 0 to any desired state x* at
some time instant T. Formally speaking, for every T > 0 and for every x* ∈ R^n,
there is an input signal u(·) (defined over [0, T]) such that φ_u(T; 0) = x*, or, as
we say, φ_u(t; 0) spans the whole R^n - span_{t>0} φ_u(t; 0) = R^n. (Recall that
the only vector of R^n that is normal to all other vectors in the space is 0.)
Therefore, the following implication holds:

    z ⊥ φ_u(t; 0) for all t ∈ [0, T] and all u ∈ L²([0, T])  ⟹  z = 0               (6.114)

or, what is the same:

    z' φ_u(T; 0) = 0 for all u ∈ L²([0, T])  ⟹  z = 0                               (6.115)

The left-hand side of this implication can be rewritten as follows:

    z' ∫_0^T e^{A(T−τ)} B u(τ) dτ = 0 for all u ∈ L²([0, T])                        (6.116)

which implies that

    z' e^{A(T−τ)} B = 0 for all τ ∈ [0, T]                                          (6.117)

From which we have that:

    ∫_0^t ( z' e^{A(t−τ)} B ) ( z' e^{A(t−τ)} B )' dτ = 0                           (6.118)
    ⟺  z' W_c(t) z = 0 for all t ∈ [0, T]                                           (6.119)

Therefore (6.114) is rewritten as follows: for all t > 0,

    z' W_c(t) z = 0  ⟹  z = 0                                                       (6.120)

And since we know that W_c(t) is positive semidefinite for all t > 0, we conclude
that it is positive definite, i.e. for all t > 0:

    W_c(t) ⪰ 0                                                                      (6.121)
    z' W_c(t) z = 0  ⟹  z = 0                                                       (6.122)

Hence W_c(t) is nonsingular (positive definite matrices have only strictly posi-
tive eigenvalues and are nonsingular).
2. We will now assume that W_c(t) is nonsingular and we will prove that the
system is controllable. Let x_0 and x_1 be two distinct states in R^n and T > 0. We
will find an input signal u(t) (defined over [0, T]) so that φ_u(T; x_0) = x_1. Let:

    u(t) = B' e^{A'(T−t)} W_c(T)^{-1} ( x_1 − e^{AT} x_0 )                           (6.123)

We then verify that:

    φ_u(T; x_0) = e^{AT} x_0 + ∫_0^T e^{A(T−τ)} B B' e^{A'(T−τ)} W_c(T)^{-1}
                  · ( x_1 − e^{AT} x_0 ) dτ                                          (6.124)
                = e^{AT} x_0 + ( ∫_0^T e^{A(T−τ)} B B' e^{A'(T−τ)} dτ )
                  W_c(T)^{-1} ( x_1 − e^{AT} x_0 )                                   (6.126)
                = e^{AT} x_0 + W_c(T) W_c(T)^{-1} ( x_1 − e^{AT} x_0 )               (6.128)
                = x_1                                                                (6.129)

which completes the proof.

Equation (6.123) provides a way to move the state from any initial to any final
point at some specified time instant. However, this solution is not unique, as
there are lots of other ways to move from one state to another. In some cases
constant or piecewise constant (over t) inputs are appropriate. Plausibly, one
wonders how such a complicated equation as (6.123) came up. The answer will
come later as we study the Linear Quadratic Regulator.
The Controllability Gramian is defined through (6.101), but when it comes to
its calculation we prefer to use the following proposition. The controllability
Gramian satisfies an Initial Value Problem (a differential equation plus an ini-
tial condition). This is either solved analytically - if possible - or a numerical
approximation method is chosen (e.g. Runge-Kutta).

Proposition 36. The controllability Gramian satisfies the following IVP:

    Ẇ_c = BB' + AW_c + W_c A'                                                        (6.130)
    W_c(0) = 0                                                                       (6.131)
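Proposition 36 can be checked numerically by integrating the IVP with Euler steps and comparing against a Riemann sum of the defining integral (6.102). A minimal sketch; the controllable pair (A, B) is an arbitrary illustration, and the tolerances are loose because both approximations are only first-order accurate in the step size:

```python
# Gramian via the IVP (6.130)-(6.131) vs. a Riemann sum of (6.102).

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def expm(A, t, terms=40):
    n = len(A)
    E = [[float(i == j) for j in range(n)] for i in range(n)]
    P = [[float(i == j) for j in range(n)] for i in range(n)]
    for k in range(1, terms):
        P = matmul(P, [[A[i][j] * t / k for j in range(n)] for i in range(n)])
        E = [[E[i][j] + P[i][j] for j in range(n)] for i in range(n)]
    return E

def transpose(M):
    return [list(col) for col in zip(*M)]

A = [[0.0, 1.0], [-2.0, -3.0]]
B = [[0.0], [1.0]]
T, N = 1.0, 2000
h = T / N
BBt = matmul(B, transpose(B))

# Euler integration of W' = BB' + AW + WA'.  Since W stays symmetric,
# (WA')[i][j] = (AW)[j][i], which the update below uses.
W = [[0.0, 0.0], [0.0, 0.0]]
for _ in range(N):
    AW = matmul(A, W)
    dW = [[BBt[i][j] + AW[i][j] + AW[j][i] for j in range(2)] for i in range(2)]
    W = [[W[i][j] + h * dW[i][j] for j in range(2)] for i in range(2)]

# Riemann sum of (6.102), stepping g(tau) = e^{A tau} B forward by e^{Ah}
Wq = [[0.0, 0.0], [0.0, 0.0]]
Phi_h = expm(A, h)
g = B
for _ in range(N):
    ggt = matmul(g, transpose(g))
    Wq = [[Wq[i][j] + h * ggt[i][j] for j in range(2)] for i in range(2)]
    g = matmul(Phi_h, g)

err = max(abs(W[i][j] - Wq[i][j]) for i in range(2) for j in range(2))
detW = W[0][0] * W[1][1] - W[0][1] * W[1][0]
```

The two approximations agree, and the positive determinant confirms that W_c(T) is nonsingular for this controllable pair, as proposition 35 requires.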
6.4 Controllability and Cyclicity
In this section we give some results on controllability which will introduce us to
its deeply algebraic nature. First we provide the definition of a cyclic
subspace and a brief introduction to the properties of cyclic spaces; in the sequel
we try to grasp their connection to controllability.
6.4.1 Introduction to cyclicity
Definition 30 (Cyclic subspace). Let A ∈ M_n(R) and p ∈ R^n. Let R[t] be the
ring of polynomials over R. We define the set:

    Z(A, p) ≜ { P(A)p ; P ∈ R[t] }                                                   (6.132)

which is called the A-cyclic subspace of R^n generated by p. The vector p is called
a generator of Z(A, p).

Note 1: Z(A, p) is an A-invariant space. Indeed, let w ∈ Z(A, p). Then
there is a P ∈ R[t] so that w = P(A)p, therefore Aw = AP(A)p = Q(A)p,
where Q is another polynomial (in particular, Q = S·P with S(t) = t).
Thus Aw ∈ Z(A, p). Since Z(A, p) is A-invariant, if w ∈ Z(A, p) then
Aw, A²w, . . . , A^k w, . . . ∈ Z(A, p).
Note 2: Z(A, p) is the smallest A-invariant subspace of R^n that contains
p. Formally:

    Z(A, p) = ∩ { T ⊆ R^n : AT ⊆ T, p ∈ T }                                          (6.133)

Indeed, it is easy to see that if T ⊆ R^n and T is A-invariant with p ∈ T, then
Z(A, p) ⊆ T: from the invariance property, A^k p ∈ T for k = 0, 1, . . ., thus for
all P ∈ R[t] we have that P(A)p ∈ T.
Note 3: A polynomial of R[t], evaluated at A, is a finite linear combination of
the elements:

    I, A, A^2, ..., A^k, ...    (6.134)

i.e. for some positive integer N it has the form:

    P(A) = a_0 I + a_1 A + ... + a_N A^N = Σ_{i=0}^{N} a_i A^i    (6.135)

There is a minimum integer k such that A^k is a linear combination of lower
powers of A:

    A^k = Σ_{i=0}^{k−1} α_i A^i    (6.136)

Right-multiplying by p yields:

    A^k p = Σ_{i=0}^{k−1} α_i A^i p    (6.137)

This gives rise to the following important definition:
Definition 31 (A-annihilator of p). An A-annihilator of p is any monic polynomial
φ ∈ R[t] such that

    φ(A)p = 0    (6.138)

(A polynomial is called monic if its highest-order coefficient is 1.)
Note (Existence of A-annihilators). Given a vector x ∈ R^n, consider the
sequence of vectors x, Ax, .... We know that there is some p ∈ N, 0 ≤ p ≤ n,
so that A^p x is a linear combination of x, Ax, ..., A^{p−1} x, that is:

    A^p x = β_0 x + β_1 Ax + ... + β_{p−1} A^{p−1} x    (6.139)

Let us define the polynomial φ ∈ R[t] as:

    φ(t) = t^p − β_{p−1} t^{p−1} − ... − β_1 t − β_0    (6.140)

It is easy to see that:

    φ(A)x = 0    (6.141)

so φ is an A-annihilator of x according to the definition.
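The construction above is effectively an algorithm: keep extending the sequence x, Ax, A^2 x, ... until the next power becomes a linear combination of its predecessors, then read off the coefficients β_i of (6.139). A minimal numpy sketch (the matrix and vector are made-up illustrations):

```python
import numpy as np

def vector_annihilator(A, x, tol=1e-9):
    """Coefficients (highest power first) of the monic phi with phi(A) x = 0,
    built exactly as in (6.139)-(6.140)."""
    vecs = [np.asarray(x, dtype=float)]
    while True:
        nxt = A @ vecs[-1]
        K = np.column_stack(vecs)                      # Krylov matrix [x, Ax, ...]
        beta, *_ = np.linalg.lstsq(K, nxt, rcond=None)
        if np.linalg.norm(K @ beta - nxt) < tol:       # nxt is in span of predecessors
            # phi(t) = t^p - beta_{p-1} t^{p-1} - ... - beta_0
            return np.concatenate(([1.0], -beta[::-1]))
        vecs.append(nxt)

# Made-up example: a 3x3 nilpotent matrix; for x = e_3 the sequence
# e_3, A e_3, A^2 e_3 is independent and A^3 e_3 = 0, so phi(t) = t^3.
A = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 1.0],
              [0.0, 0.0, 0.0]])
phi = vector_annihilator(A, [0.0, 0.0, 1.0])
print(phi)  # coefficients of t^3 (degree 3)
```

The loop always terminates after at most n steps, since n + 1 vectors in R^n are necessarily dependent.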
Definition 32 (Minimal polynomial of a vector). The monic polynomial of
lowest degree that is an A-annihilator of p is called the minimal polynomial
of the vector p with respect to A.

Note: Following equation (6.137), the minimal polynomial of p is given by:

    m(t) = t^k − Σ_{i=0}^{k−1} α_i t^i    (6.142)

The minimal polynomial of a vector p (with respect to the matrix A) is always a
divisor of the minimal polynomial of A. We recall the definition of the minimal
polynomial:
Definition 33 (Minimal polynomial of a matrix). For a matrix A ∈ M_n(R) we
define its minimal polynomial to be the monic polynomial ψ ∈ R[t] of minimum
degree so that ψ(A) = 0.

As is known from Linear Algebra, the minimal polynomial of a matrix
is uniquely defined; it always divides the characteristic polynomial of the
matrix and it also divides any polynomial q such that q(A) = 0. With the fol-
lowing proposition we establish a relationship between the minimal polynomial
of A and the minimal polynomials of the vectors of a basis of the whole space.
Proposition 37. Let A ∈ M_n(R) and B = {b_1, b_2, ..., b_n} be a generator of
R^n. Let φ_i ∈ R[t] be the minimal polynomial of b_i with respect to A. Then the
minimal polynomial ψ of A is the least common multiple (lcm) of φ_1, φ_2, ..., φ_n:

    ψ(t) = lcm{φ_1, φ_2, ..., φ_n}    (6.143)

(The least common multiple of two polynomials φ_1 and φ_2 is a polynomial μ
such that φ_1 | μ, φ_2 | μ and, for f ∈ R[t], φ_1 | f and φ_2 | f imply μ | f.)
Proof. Let us define:

    μ(t) ≜ lcm{φ_1, φ_2, ..., φ_n}    (6.144)

Claim 1. The minimal polynomial ψ of A divides μ(t).
According to the definition of μ, φ_i | μ, and φ_i(A)b_i = 0 for every i. Thus
μ(A)b_i = 0 for each i, that is, b_i ∈ ker(μ(A)) for each i. Therefore B ⊆
ker(μ(A)) ⟹ ker(μ(A)) = R^n ⟹ μ(A) = 0. The minimal polynomial ψ of A is
known to divide all polynomials μ with the property μ(A) = 0, hence ψ | μ.
Claim 2. It also holds that μ | ψ.
We perform the Euclidean division of ψ by each φ_i and prove that the
remainder must be 0. The degree of φ_i is less than or equal to the degree of ψ, thus
there exist polynomials d_i and r_i with deg r_i < deg φ_i such that:

    ψ(t) = d_i(t) φ_i(t) + r_i(t)    (6.145)

We have that:

    ψ(A)b_i = d_i(A) φ_i(A) b_i + r_i(A) b_i    (6.146)

Notice that ψ(A)b_i = 0 and φ_i(A)b_i = 0. Therefore r_i(A)b_i = 0. This
should mean that r_i = 0; if not, then r_i (normalized to be monic) would be an
A-annihilator of b_i with degree less than the degree of φ_i, which is impossible
according to the definition of the minimal polynomial of a vector (wrt A).
Therefore r_i is the zero polynomial, i.e. φ_i | ψ for every i, and consequently
μ = lcm{φ_1, ..., φ_n} | ψ.
Note: It is a direct consequence of this proposition that if x ∈ R^n is an
arbitrary vector whose minimal polynomial with respect to A is φ ∈ R[t], then
φ | ψ (where ψ is the minimal polynomial of the matrix A). We can now
state and prove the following proposition:
Proposition 38 (Description of Z(A, p)). Let k be the minimum positive in-
teger such that A^k p is in the span of A^i p for i = 0, 1, ..., k−1, as in (6.136). The
set B_Z ≜ {p, Ap, ..., A^{k−1} p} is a basis for Z(A, p).

Proof. By the definition of k it holds that A^k p ∈ span B_Z and, likewise, for
every s > k it also holds that A^s p ∈ span B_Z. Therefore, for every P ∈ R[t] we
have P(A)p ∈ span B_Z. It is left to the reader as an exercise to verify that B_Z
is linearly independent.
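Proposition 38 also gives a practical way to compute dim Z(A, p): it is the rank of the Krylov matrix [p, Ap, ..., A^{n−1}p]. A short numerical illustration (the matrices below are made up):

```python
import numpy as np

def cyclic_subspace_dim(A, p):
    """dim Z(A, p) = rank of the Krylov matrix [p, Ap, ..., A^{n-1} p]."""
    n = A.shape[0]
    cols, v = [], np.asarray(p, dtype=float)
    for _ in range(n):
        cols.append(v)
        v = A @ v
    return int(np.linalg.matrix_rank(np.column_stack(cols)))

# Repeated eigenvalue with two independent eigenvectors: no vector can
# generate more than a 2-dimensional cyclic subspace of R^3.
A = np.diag([2.0, 2.0, 3.0])
print(cyclic_subspace_dim(A, [1.0, 1.0, 1.0]))   # 2

# Distinct eigenvalues: the Krylov matrix is a Vandermonde-type matrix
# and has full rank, so this vector generates the whole space.
B = np.diag([1.0, 2.0, 3.0])
print(cyclic_subspace_dim(B, [1.0, 1.0, 1.0]))   # 3
```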
Definition 34 (Cyclic matrix). A matrix A ∈ M_n(R) is called cyclic if there
exists a p ∈ R^n such that Z(A, p) = R^n. The vector p is called a generator of
R^n with respect to A.

Note: According to this definition, a matrix A is cyclic if there exists a vector
p ∈ R^n such that:

    span{ p, Ap, ..., A^{n−1} p } = R^n    (6.147)
Proposition 39 (Criterion for cyclicity). A matrix A is cyclic if and only if
the degree of its minimal polynomial is n, i.e. if the minimal polynomial of A
equals its characteristic polynomial.

Proof. This is left to the reader as an exercise.

Remark: A direct consequence of this proposition is that if a matrix has dis-
tinct (not repeated) eigenvalues then it is cyclic.

The following proposition is a partial converse to proposition 37 and pro-
vides a sufficient condition for a vector to be a generator of R^n with respect to
a cyclic matrix:
Proposition 40. Let A ∈ M_n(R) be a cyclic matrix with minimal polynomial
ψ. Let φ ∈ R[t] be the minimal polynomial of p ∈ R^n (wrt A). If φ = ψ,
then p is a generator of R^n with respect to A.

Proof. Since A is cyclic, its minimal polynomial has degree n, i.e. has the form:

    ψ(t) = t^n + ψ_{n−1} t^{n−1} + ... + ψ_1 t + ψ_0    (6.148)

for some reals ψ_0, ψ_1, ..., ψ_{n−1}. We know that ψ(A)p = 0, which is written as:

    A^n p + ψ_{n−1} A^{n−1} p + ... + ψ_1 Ap + ψ_0 p = 0    (6.149)

ψ (i.e. φ) is the monic polynomial of the smallest degree so that ψ(A)p = 0. If
q ∈ R[t] is of degree < n then q(A)p ≠ 0. One can now verify (this is an
exercise) that the set {p, Ap, ..., A^{n−1} p} is linearly independent and thus generates
R^n (Hint: use the definition of a linearly independent set of vectors).
An important remark: Already from proposition 38 it has become clear that
deg φ = dim Z(A, p); the degree of the minimal polynomial of a vector is equal
to the dimension of the cyclic subspace it generates. If p is to generate
the whole space R^n (with respect to A), it is necessary that deg φ = n.
Before we state and prove a particularly important result related to cyclicity,
we recall the definition of relatively prime or coprime polynomials:
Definition 35 (Coprime polynomials). Let p, q ∈ K[t] be two polynomials over
a field K (e.g. K = R). These are called relatively prime or coprime if their
greatest common divisor (gcd) is 1. (The greatest common divisor of two
polynomials p and q is a polynomial d ∈ K[t] such that d | p, d | q and
s | p, s | q ⟹ s | d.)
The following theorem is of high importance for coprime polynomials:

Theorem 41 (GCD and coprime polynomials). For any two polynomials p, q ∈
K[t], their greatest common divisor is a linear combination of p and q with coef-
ficients in K[t], i.e. there exist two polynomials a, b ∈ K[t] such that:

    gcd{p(t), q(t)} = a(t)p(t) + b(t)q(t)    (6.150)

It is immediate that if p and q are coprime then there are polynomials a, b ∈ K[t]
such that:

    a(t)p(t) + b(t)q(t) = 1    (6.151)

Proof. A proof can be found in [Kli+99, Thm. 1.16]. More on coprime polyno-
mials can be found in [Bea96, Chap. 4].
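Theorem 41 is constructive: the extended Euclidean algorithm for polynomials produces the coefficients a and b explicitly. A minimal sketch using numpy's polynomial helpers (the sample polynomials are made up; coefficients are stored lowest order first):

```python
import numpy as np
from numpy.polynomial import polynomial as P

def poly_egcd(p, q, tol=1e-9):
    """Extended Euclid for polynomials: returns (g, a, b) with a*p + b*q = g."""
    r0, r1 = np.atleast_1d(p).astype(float), np.atleast_1d(q).astype(float)
    a0, a1 = np.array([1.0]), np.array([0.0])
    b0, b1 = np.array([0.0]), np.array([1.0])
    while np.max(np.abs(r1)) > tol:
        quo, rem = P.polydiv(r0, r1)
        r0, r1 = r1, rem
        a0, a1 = a1, P.polysub(a0, P.polymul(quo, a1))
        b0, b1 = b1, P.polysub(b0, P.polymul(quo, b1))
    return r0, a0, b0

p = P.polymul([-1.0, 1.0], [-2.0, 1.0])   # (t-1)(t-2)
q = np.array([-3.0, 1.0])                  # (t-3): coprime with p
g, a, b = poly_egcd(p, q)
# Bezout identity (6.150): a*p + b*q equals the (constant) gcd
lhs = P.polyadd(P.polymul(a, p), P.polymul(b, q))
```

For coprime inputs the returned gcd is a nonzero constant, which can be normalized to 1 as in (6.151).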
The following result is known as the Birkhoff-MacLane lemma and is useful to
establish a relationship between cyclicity and controllability in section 6.4.2.
The proof is extracted from [Won67]:

Lemma 6 (Birkhoff-MacLane). Let A be a cyclic matrix with minimal polyno-
mial ψ. Let b be a generator of R^n with respect to A (i.e. Z(A, b) = R^n).
Let θ ∈ R[t] be a polynomial that is relatively prime with ψ. Then the vector
p = θ(A)b is a generator of R^n with respect to A.
Proof. The proof goes along the lines of the proof of proposition 37. Let, first of
all, φ ∈ R[t] be the minimal polynomial of p, i.e. φ(A)p = 0. According to
proposition 40, it suffices to show that φ = ψ; we first prove that ψ | φ and then
that φ | ψ. Since ψ and θ are coprime, there are polynomials α, β ∈ R[t] such that
for all t ∈ R:

    α(t)θ(t) + β(t)ψ(t) = 1    (6.152)
    α(A)θ(A) + β(A)ψ(A) = I    (6.153)
    α(A)θ(A)b + β(A)ψ(A)b = b    (6.154)
    α(A)p = b    (6.155)

(the last step uses ψ(A) = 0 and p = θ(A)b). Let x ∈ R^n = Z(A, b) be an
arbitrary vector. Then, there exists a polynomial π ∈ R[t] such that:

    x = π(A)b = π(A)α(A)p    (6.156)

Hence

    φ(A)x = π(A)α(A)φ(A)p = 0    (6.157)

This implies that φ(A) = 0. Indeed, (6.157) tells us that x ∈ ker(φ(A)) for all
x ∈ R^n, thus ker(φ(A)) = R^n or, what is the same, φ(A) = 0. Since φ(A) = 0
it follows immediately that ψ | φ. From claim 2 within the proof of proposition
37 it follows that φ | ψ. Overall φ = ψ and, according to proposition 40, p is a
generator of R^n with respect to A.
6.4.2 From cyclicity to controllability
The notion of cyclicity allows us to make a connection with that of control-
lability. It is straightforward to notice that A is cyclic with generator p if and
only if (A, p) is controllable. Another simple result is stated as follows:
Proposition 42. Let A ∈ M_n(R) and b ∈ R^n. The following are equivalent:

1. (A, b) is controllable and A is cyclic.
2. The set B ≜ {b, Ab, ..., A^{n−1} b} is linearly independent.

Proof. The proof is simple and is left to the reader as an exercise.
Now we will state the main result of this section, which is due to W.M.
Wonham [Won67]. It is often the case that none of the inputs to a system
can guarantee controllability on its own, although the overall system is controllable.
For instance, consider the system with 2 states and 2 inputs:

    ẋ(t) = [ 1  1 ; 2  1 ] x(t) + [ 1  1 ; √2  −√2 ] u(t)    (6.158)

This system has controllability matrix:

    C = [ 1  1  1+√2  1−√2 ; √2  −√2  2+√2  2−√2 ]    (6.159)

which is full-rank, so the system is controllable. If we eliminate the second
input we come up with the single-input system:

    ẋ(t) = [ 1  1 ; 2  1 ] x(t) + [ 1 ; √2 ] u_1(t)    (6.160)

This system is not controllable. Exactly the same happens if we eliminate
the first input. The system becomes:

    ẋ(t) = [ 1  1 ; 2  1 ] x(t) + [ 1 ; −√2 ] u_2(t)    (6.161)

which is again not controllable. We understand that we need both inputs to
completely control the system. A question that comes up is whether there is
any other input that leads to a controllable system. The answer is positive if
A is cyclic (see theorem 43). In this example you can see that if we change the
input into v(t) = u_1(t) + u_2(t), or in matrix-vector form:

    v(t) = [ 1  1 ] u(t)    (6.162)

the new system with input v(t) will be:

    ẋ(t) = [ 1  1 ; 2  1 ] x(t) + [ 2 ; 0 ] v(t)    (6.163)

which is controllable by means of a single input.
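The rank computations in this example are easy to reproduce; the sketch below assumes the matrix entries of (6.158) as read here (A = [1 1; 2 1], input columns [1, √2]^T and [1, −√2]^T):

```python
import numpy as np

A = np.array([[1.0, 1.0], [2.0, 1.0]])
B = np.array([[1.0, 1.0], [np.sqrt(2.0), -np.sqrt(2.0)]])

def ctrb_rank(A, B):
    """Rank of the controllability matrix [B, AB, ..., A^{n-1} B]."""
    n = A.shape[0]
    B = np.asarray(B, dtype=float).reshape(n, -1)
    blocks = [B]
    for _ in range(n - 1):
        blocks.append(A @ blocks[-1])
    return int(np.linalg.matrix_rank(np.hstack(blocks)))

print(ctrb_rank(A, B))                          # 2: both inputs together
print(ctrb_rank(A, B[:, 0]))                    # 1: u1 alone is not enough
print(ctrb_rank(A, B[:, 1]))                    # 1: neither is u2 alone
print(ctrb_rank(A, B @ np.array([1.0, 1.0])))   # 2: v = u1 + u2 works
```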
Theorem 43 (Wonham's theorem). Let A ∈ M_n(R) and B ∈ M_{n×m}(R). If
(A, B) is controllable and A is cyclic then there is a b ∈ R(B) so that (A, b)
is controllable. (R(B) denotes the image/column space of the matrix B.)

Proof. Finding a b ∈ R(B) such that (A, b) is controllable is equivalent to
determining such a vector so that it is a generator of R^n with respect to A (see
definition 34). For this reason we shall use the Birkhoff-MacLane lemma.
Let b_i ∈ R^n be the i-th column vector of B and φ_i its minimal polynomial
with respect to A. Let ψ be the minimal polynomial of A. According to
proposition 37 it is:

    ψ = lcm{φ_1, φ_2, ..., φ_m}    (6.164)

Let z ∈ R^n be a generator of R^n with respect to A. For each i = 1, 2, ..., m
there is a θ_i ∈ R[t] so that b_i = θ_i(A)z. We need to determine numbers
r_1, r_2, ..., r_m so that the vector b̂ = r_1 b_1 + r_2 b_2 + ... + r_m b_m is a generator of
R^n wrt A. According to the Birkhoff-MacLane lemma, it suffices to choose r_i so
that the polynomial:

    θ(t) = r_1 θ_1(t) + r_2 θ_2(t) + ... + r_m θ_m(t)    (6.165)

is coprime with ψ. The roots of ψ are the eigenvalues of A in C. In order for θ
and ψ to be coprime we require that θ(λ_j) ≠ 0 for all λ_j ∈ σ(A).

Claim 3. The polynomials ψ, θ_1, θ_2, ..., θ_m are coprime:

    gcd{ψ, θ_1, θ_2, ..., θ_m} = 1    (6.166)
Proof of the claim. Let δ = gcd{ψ, θ_1, θ_2, ..., θ_m}. By the definition of the gcd,
δ | ψ, thus there is a polynomial ε so that ψ = δε. Similarly, for every i =
1, 2, ..., m we have that δ | θ_i, hence there exist polynomials ε_i such that θ_i =
δ ε_i. Then:

    ε(A)b_i = ε(A)θ_i(A)z    (6.167)
            = ε(A)δ(A)ε_i(A)z    (6.168)
            = ψ(A)ε_i(A)z    (6.169)
            = 0    (6.170)

This means that ε is an A-annihilator of b_i, hence the minimal polynomial of
b_i, namely φ_i, divides ε:

    φ_i | ε for all i = 1, 2, ..., m    (6.171)
    lcm{φ_i}_{i=1}^m | ε    (6.172)
    ψ | ε    (6.173)

It is obvious that ε | ψ (indeed, ψ = δε). Therefore ψ | ε and ε | ψ, hence ε = ψ, which
implies that δ = 1. The proof of the claim is complete.

By equation (6.166), and by the fact that for each eigenvalue λ_j of A it is
ψ(λ_j) = 0, it is not possible that θ_i(λ_j) = 0 simultaneously for all i = 1, 2, ..., m.
Hence, for each λ_j, the coefficient vectors (r_1, ..., r_m) for which θ(λ_j) = 0 form a
proper subspace of R^m; choosing (r_1, ..., r_m) outside the union of these finitely
many subspaces gives θ(λ_j) ≠ 0 for all j, so θ is coprime with ψ and b̂ = r_1 b_1 +
... + r_m b_m ∈ R(B) is a generator of R^n, which completes the proof.
Chapter 7
Stability
7.1 Definitions
In this section we give the basic definitions regarding stability of state space
systems. Readers might already be familiar with the kind of stability known
as BIBO (bounded-input bounded-output) stability and various criteria based on
the transfer function of the system, such as the Routh criterion or the Nyquist
one. Here we tackle the notion of stability considering always the closed-loop
form of the system instead of an input-output formulation. From this point of
view, any state space system of the form:

    ẋ = f(x, u)    (7.1)

using a feedback law u = κ(x) becomes

    ẋ = f(x, κ(x)) = h(x)    (7.2)

In what follows we always assume that h : R^n → R^n is Lipschitz, that is, there
exists a constant L ≥ 0 so that for every x, y ∈ R^n:

    ||h(x) − h(y)|| ≤ L ||x − y||    (7.3)

So, our analysis here takes into account the closed-loop form of the system. The
feedback law will be properly chosen so that the closed-loop system exhibits
the desired dynamic characteristics. For consistency we recall the definition of
an equilibrium point:

Definition 36. Consider the following autonomous state space system in closed-
loop form:

    ẋ = h(x)    (7.4)

A point x* ∈ R^n is said to be an equilibrium point for (7.4) if h(x*) = 0.

If h is linear, i.e. h(x) = Ax, then an equilibrium point for h has the
property:

    Ax* = 0 ⟺ x* ∈ ker A    (7.5)

where ker A is the kernel of the matrix A, defined as:

    ker A ≜ { y ∈ R^n | Ay = 0 }    (7.6)
If A is invertible then ker A is a singleton: ker A = {0}. In general, ker A is a
linear subspace of R^n with dimension dim(ker A) = n − rank A (as follows
from the dimension theorem of Linear Algebra).
Example 15. The following LTI system is given:

    ẋ(t) = [ 1  2  −1 ; 2  4  3 ; 1  2  1 ] x(t)    (7.7)

We need to determine all its equilibrium points. So, let x* = [x*_1  x*_2  x*_3]^T be
an equilibrium point for this system. Then:

    [ 1  2  −1 ; 2  4  3 ; 1  2  1 ] [ x*_1 ; x*_2 ; x*_3 ] = 0 ⟺
    { x*_1 + 2x*_2 − x*_3 = 0,  2x*_1 + 4x*_2 + 3x*_3 = 0,  x*_1 + 2x*_2 + x*_3 = 0 }    (7.8)

Subtracting the third from the first row yields:

    x*_3 = 0    (7.9)

and by means of the second-row equation we have that

    2x*_1 + 4x*_2 = 0 ⟺ x*_1 = −2x*_2    (7.10)

So if x*_1 = α then x*_2 = −α/2. Hence any equilibrium point of (7.7) has the
form:

    x* = [ α ; −α/2 ; 0 ] = α [ 1 ; −1/2 ; 0 ]    (7.11)

The set of all equilibrium points, for α ∈ R, forms a linear subspace of R^3 of
dimension 1 (i.e. the set of equilibrium points is a line in the 3-dimensional
space).
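The equilibrium line of example 15 can also be recovered numerically as the null space of A (entries as read here), e.g. with an SVD:

```python
import numpy as np

A = np.array([[1.0, 2.0, -1.0],
              [2.0, 4.0,  3.0],
              [1.0, 2.0,  1.0]])

# Right singular vectors with (numerically) zero singular value span ker A.
U, s, Vt = np.linalg.svd(A)
rank = int(np.sum(s > 1e-10 * s.max()))
kernel = Vt[rank:]                 # rows form a basis of ker A
v = kernel[0] / kernel[0, 0]       # normalize so the first component is 1
print(v)  # approximately [1, -0.5, 0], the direction found above
```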
If x* is an equilibrium point for (7.4) and φ(t; x*) is the solution with φ(0; x*) =
x*, then for all t ≥ 0 we also have φ(t; x*) = x*; that is, every solution starting
from an equilibrium point will remain on it. But what happens if we start
from a point x_0 that is δ-close to that equilibrium point? For linear systems the
following three options are possible:

1. The equilibrium point will attract the initial point and eventually the
trajectory will converge to it, in the sense that lim_{t→∞} φ(t; x_0) = x*.
2. The equilibrium point will repel the initial point and the state trajectory
will move to infinity: lim_{t→∞} ||φ(t; x_0)|| = ∞.
3. The trajectory will retain a constant distance from the equilibrium point.

These are incorporated into the definition of stability, which is stated as follows
for nonlinear systems:
Definition 37. Consider a system in the form (7.4) and let x* be an equilib-
rium point for it. This point is called a stable equilibrium point if for every
ε > 0 there is a δ(ε) > 0 so that ||x − x*|| < δ ⟹ ||φ(t; x) − x*|| < ε for all t ≥ 0.
According to this definition, for every ball B_ε(x*) around the equilibrium
point x*, we can find some other ball B_δ(x*) so that, starting from therein, the
whole trajectory remains in B_ε(x*). It is implied in the definition that δ ≤ ε
and a fortiori B_δ(x*) ⊆ B_ε(x*). However, if a trajectory retains a constant
distance from an equilibrium point (e.g. the trajectory is a circle with center
x* and radius δ), then it is again a locally stable equilibrium point. It is clear
that stability, as it is given in definition 37, does not suggest that the trajectory
converges to the equilibrium point. For that reason we introduce the notion of
a locally asymptotically stable point.
Definition 38. Consider a system in the form (7.4) and let x* be an equi-
librium point for it. This point is called a locally asymptotically stable (LAS)
equilibrium point if it is a stable equilibrium point and there is a δ > 0 such
that lim_{t→∞} φ(t; x) = x* for all x with ||x − x*|| < δ.

Remark: The limit in this definition is meant in the norm sense, that is:

    lim_{t→∞} φ(t; x) = x* ⟺ lim_{t→∞} ||φ(t; x) − x*|| = 0    (7.12)
If, in definition 38, we let δ → ∞, we come up with the definition of
global asymptotic stability, which is given as follows:

Definition 39. Consider a system in the form (7.4) and let x* be an equilib-
rium point for it. This point is called a globally asymptotically stable equilibrium
point if it is a stable equilibrium point and lim_{t→∞} φ(t; x) = x* for all x ∈ R^n.
For a locally asymptotically stable point x* we define its domain of attraction
to be the set of all initial points that will eventually converge to it:

Definition 40. Consider a system in the form (7.4) and let x* be a locally
asymptotically stable equilibrium point. The following set is called the domain
of attraction of x*:

    D(x*) = { x ∈ R^n | lim_{t→∞} φ(t; x) = x* }    (7.13)

It is obvious that, in any case, if x* is a locally asymptotically stable point
then x* ∈ D(x*). Additionally, x* is globally asymptotically stable if and only
if D(x*) = R^n.
Asymptotic stability can be understood using the so-called KL-functions.
We give the following definitions:

Definition 41. We define the following four classes of functions:

1. A function α : R_+ → R_+ is said to be of class K, and we denote α ∈ K, if it
is continuous, strictly increasing and α(0) = 0.
2. A function α : R_+ → R_+ is said to be of class K_∞, and we denote α ∈ K_∞, if
α ∈ K and it is unbounded.
3. A γ : R_+ → R_+ is said to be of class L, and we denote γ ∈ L, if it is
continuous, strictly decreasing and lim_{τ→∞} γ(τ) = 0.
4. A β : R_+ × R_+ → R_+ is said to be of class KL, and we denote β ∈ KL, if
β(·, t) ∈ K for all t ∈ R_+ and β(r, ·) ∈ L for all r ∈ R_+.
7.2 Some preliminary results
In this section we state some results that hold for nonlinear systems in general.
In the following proposition we use the notion of KL-class functions, as defined
above, to state a very useful result on asymptotic stability.

Proposition 44. Consider a system in the form (7.4) and let x* be an equilib-
rium point for it. The point is locally asymptotically stable if and only if there
is a β ∈ KL and a constant δ > 0 so that:

    ||φ(t; x_0) − x*|| ≤ β(||x_0 − x*||, t)    (7.14)

for all x_0 such that ||x_0 − x*|| < δ and for all t ≥ 0.
Proof. Let us assume that x* satisfies (7.14). Let us choose an ε > 0; β(·, 0) is
of class K, so we may choose a δ' such that 0 < δ' < δ and β(δ', 0) ≤ ε. Now for
||x_0 − x*|| < δ' we have:

    ||φ(t; x_0) − x*|| ≤ β(||x_0 − x*||, t) ≤ β(||x_0 − x*||, 0) ≤ β(δ', 0) ≤ ε    (7.15)

So if (7.14) holds then x* is stable, and more:

    0 ≤ lim_{t→∞} ||φ(t; x_0) − x*|| ≤ lim_{t→∞} β(||x_0 − x*||, t) = 0    (7.16)

Therefore

    lim_{t→∞} ||φ(t; x_0) − x*|| = 0 ⟺ lim_{t→∞} φ(t; x_0) = x*    (7.17)

Conversely, let us assume that x* is an asymptotically stable equilibrium point.
Let us first consider a fixed x_0 ∈ D(x*) and let g(t) ≜ ||φ(t; x_0) − x*||. Since
x* is LAS, it follows from the definition that lim_{t→∞} φ(t; x_0) = x*, therefore
lim_{t→∞} ||φ(t; x_0) − x*|| = 0, that is, lim_{t→∞} g(t) = 0. Since g : R_+ → R_+
is uniformly bounded from above (i.e. there is a constant M ≥ 0 such that
g(t) ≤ M for all t ≥ 0), we can find a γ ∈ L so that γ(t) ≥ g(t) for all t ≥ 0.
Let us now fix a t ≥ 0. Following a similar procedure to the one in the
proof of theorem 13, one can see that there is a function α : R_+ → R_+ with
α ∈ K so that ||φ(t; x_0) − x*|| ≤ α(||x_0 − x*||) [the reader should go back to
theorem 13 and adjust the steps taken therein]. This completes the proof.
The result stated in proposition 44 offers an alternative definition of local
asymptotic stability, which has been used by some authors as the definition itself.
Furthermore, it offers a better insight into the notion of asymptotic stability.
To this end we may say that asymptotic stability consists of two components:

1. Stability in the sense of Lyapunov: for each ε > 0 there is a δ > 0 such
that ||φ(t; x_0) − x*|| < ε for all t ≥ 0 and for all x_0 with ||x_0 − x*|| < δ -
in other words, x* is a stable equilibrium point.
2. Attractivity: for each ε > 0 there is a δ > 0 and a T > 0 such that
||φ(t; x_0) − x*|| < ε for all t ≥ T and for all x_0 with ||x_0 − x*|| < δ.
In theorem 13 we found out that the solutions of any dynamic system are contin-
uous with respect to the initial conditions for fixed t, and one can notice that this
result can be extended to any finite t; but no clue is given for this dependence
as t → ∞. In the following proposition we prove that the continuity argument
holds as t → ∞ for initial conditions that are attracted by the same LAS equi-
librium point. This is easily understood intuitively: since the trajectories are
attracted by, and converge to, a point, they will approach each other at any desired
distance after some time.
Proposition 45. Let x* be a locally asymptotically stable point of (7.4) and
x, y ∈ D(x*). Then for every ε > 0 there is a t_0 ≥ 0 so that for all t ≥ t_0 it
holds that ||φ(t; x) − φ(t; y)|| < ε.
Proof. Since x* is an asymptotically stable equilibrium point, according to
proposition 44 there is a β ∈ KL and a constant δ > 0 so that:

    ||φ(t; x_0) − x*|| ≤ β(||x_0 − x*||, t)    (7.18)

for all x_0 such that ||x_0 − x*|| < δ and for all t ≥ 0. Then it is easy to see that:

    ||φ(t; x) − φ(t; y)|| = ||(φ(t; x) − x*) − (φ(t; y) − x*)||    (7.19)
                         ≤ ||φ(t; x) − x*|| + ||φ(t; y) − x*||    (7.20)
                         ≤ β(||x − x*||, t) + β(||y − x*||, t)    (7.21)

Due to the properties of KL-class functions we may find t_1 and t_2 so that:

    β(||x − x*||, t) < ε/2,  t ≥ t_1    (7.22)

and

    β(||y − x*||, t) < ε/2,  t ≥ t_2    (7.23)

Setting t_0 = max{t_1, t_2} completes the proof.
Sometimes, systems have asymptotically stable equilibrium points such that
trajectories converge to them exponentially. This gives rise to the following
definition:

Definition 42 (Exponential stability). If x* is an asymptotically stable point
for (7.4) and satisfies (7.14) with

    β(r, t) = E r e^{−ct}    (7.24)

for some constants E, c > 0, then it is called exponentially stable.

As we will see in the following section, the notions of local asymp-
totic stability, global asymptotic stability and exponential asymptotic stability
coincide for the class of LTI systems.
Proposition 46 (Stability of LTI systems). For a linear time-invariant au-
tonomous system ẋ = Ax the following are equivalent:

1. 0 is a locally asymptotically stable equilibrium point.
2. 0 is a globally asymptotically stable equilibrium point.
3. 0 is an exponentially stable equilibrium point.
4. All eigenvalues of A have strictly negative real part.
Note: Let λ_1, λ_2, ..., λ_n be the eigenvalues of A and let λ_{i_0} be the eigenvalue
with the largest real part, i.e. Re(λ_{i_0}) ≥ Re(λ_i) for i = 1, 2, ..., n. Then it
can easily be shown, using the Jordan canonical form of A, that there is an E > 0
so that:

    ||φ(t; x_0)|| ≤ E ||x_0|| e^{Re(λ_{i_0}) t}    (7.25)

If Re(λ_{i_0}) < 0 it follows that ||φ(t; x_0)|| → 0 as t → ∞ (for a fixed x_0). Moreover,
the system is exponentially stable with:

    β(r, t) = E r e^{Re(λ_{i_0}) t}    (7.26)

If a system is asymptotically stable then, starting from any initial state in R^n, the
trajectory of the system converges to 0 as t → ∞. All eigenvalues of the system
have negative real part, and the one with the largest real part (the slowest mode)
determines the speed of the convergence according to equation (7.26). Further-
more, the existence of complex conjugate eigenvalues of the form −a ± jb (with
a, b > 0) implies that some of the modes of the system exhibit oscillatory behaviour.

Definition 43 (Hurwitz). A matrix is called Hurwitz if all its eigenvalues have
negative real part.

Note: It follows from proposition 46 that A being Hurwitz is necessary and
sufficient for asymptotic stability. Next we give an equivalent condition for sta-
bility of LTI systems.
Proposition 47. The origin of ẋ = Ax is stable if and only if for every eigen-
value λ_i of A it holds that Re(λ_i) ≤ 0, and for every eigenvalue with Re(λ_i) = 0
and algebraic multiplicity μ_i ≥ 2 it holds that rank(A − λ_i I) = n − μ_i.
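Definition 43 and proposition 47 reduce to plain eigenvalue computations; a small numpy sketch (the test matrices are made up):

```python
import numpy as np

def is_hurwitz(A, tol=1e-12):
    """Definition 43: all eigenvalues in the open left half-plane."""
    return bool(np.all(np.linalg.eigvals(A).real < -tol))

def is_stable(A, tol=1e-9):
    """Proposition 47: Re(l) <= 0 for all eigenvalues, and every eigenvalue on
    the imaginary axis is semisimple: rank(A - l I) = n - multiplicity."""
    n = A.shape[0]
    eigs = np.linalg.eigvals(A)
    if np.any(eigs.real > tol):
        return False
    for l in eigs[np.abs(eigs.real) <= tol]:
        mult = int(np.sum(np.abs(eigs - l) <= 1e-6))
        if np.linalg.matrix_rank(A - l * np.eye(n)) != n - mult:
            return False
    return True

print(is_hurwitz(np.array([[-1.0, 1.0], [0.0, -2.0]])))   # True
print(is_stable(np.array([[0.0, 1.0], [0.0, 0.0]])))      # False: Jordan block at 0
print(is_stable(np.array([[0.0, 1.0], [-1.0, 0.0]])))     # True: pure rotation
```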
7.3 Pole Placement
7.3.1 Pole Placement for single input systems
According to proposition 46, stability depends solely on the eigenvalues of A.
Although the proposition is stated for autonomous systems (without input), it
has very important implications for non-autonomous systems and it actually
provides a way to control them. Consider a single-input system in the standard
LTI form:

    ẋ = Ax + Bu    (7.27)

We will determine a linear feedback law u(t; x) = k^T x(t) - where k ∈ R^n - so
that the closed-loop system

    ẋ = Ax + Bk^T x = (A + Bk^T)x    (7.28)

is (locally/globally/exponentially) asymptotically stable (referring to the equi-
librium point x* = 0). So, according to proposition 46, we require that the matrix
G ≜ A + Bk^T has all its eigenvalues with strictly negative real parts. This is
what is known as the pole placement problem (we call poles of the closed-loop
system the eigenvalues of G) and is stated as follows:
Problem (Pole Placement): For a given pair of matrices A ∈ M_n(R) and
B ∈ R^n, determine under what assumptions there is a vector k ∈ R^n so that
G ≜ A + Bk^T has the desired eigenvalues λ_1, λ_2, ..., λ_n (all of which should
have negative real parts for the closed-loop system to be stable).
The necessary condition for the pole placement problem to be solvable is that the
pair (A, B) be controllable. If so, then the initial system, with a change of
coordinates, can be brought to its Canonical Controllable Form (CCF). The matrix Ā
in CCF has the following useful algebraic property:

Proposition 48. For the matrix

    Ā = [ 0  1  0  ...  0 ; 0  0  1  ...  0 ; ... ; 0  0  0  ...  1 ;
          −a_1  −a_2  ...  −a_{n−1}  −a_n ] ∈ M_n(R)    (7.29)

its determinant is:

    det Ā = (−1)^n a_1    (7.30)

and its characteristic polynomial is:

    χ(λ) = det(Ā − λI)    (7.31)
         = (−1)^n (λ^n + a_n λ^{n−1} + ... + a_2 λ + a_1)    (7.32)
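Proposition 48 is easy to check numerically for a concrete companion matrix (the coefficients a_1 = 6, a_2 = 11, a_3 = 6 below are a made-up example whose characteristic polynomial λ^3 + 6λ^2 + 11λ + 6 has roots −1, −2, −3):

```python
import numpy as np

# Companion ("canonical controllable") matrix with last row [-a1, -a2, -a3]
a = np.array([6.0, 11.0, 6.0])
n = len(a)
Abar = np.zeros((n, n))
Abar[:-1, 1:] = np.eye(n - 1)   # ones on the superdiagonal
Abar[-1, :] = -a                # last row [-a1, -a2, ..., -an]

eigs = np.sort(np.linalg.eigvals(Abar).real)
print(eigs)                     # roots of l^3 + 6 l^2 + 11 l + 6: about [-3, -2, -1]
print(np.linalg.det(Abar))      # (-1)^n a1 = -6, matching (7.30)
```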
Theorem 49 (Pole placement). Let A ∈ M_n(R) and B ∈ R^n be the matrices
of an LTI system with a single input u : R_+ → R, and assume that the pair
(A, B) is controllable. Let p be any monic polynomial of degree n, i.e. a
polynomial of the form p(λ) = p_1 + p_2 λ + ... + p_n λ^{n−1} + λ^n. Then there is a
vector k ∈ R^n so that the characteristic polynomial of A + Bk^T is p (up to the
sign (−1)^n).
Proof. Since (A, B) is controllable, there is a T ∈ GL(n, R) so that:

    Ā ≜ TAT^{−1} = [ 0  1  0  ...  0 ; 0  0  1  ...  0 ; ... ; 0  0  0  ...  1 ;
                     −a_1  −a_2  ...  −a_n ]    (7.33)

and

    B̄ ≜ TB = [ 0 ; ... ; 0 ; 1 ]    (7.34)

Then for any k̃ ∈ R^n with k̃ = [ k̃_1  k̃_2  ...  k̃_n ]^T we have:

    Ā + B̄k̃^T = [ 0  1  0  ...  0 ; 0  0  1  ...  0 ; ... ; 0  0  0  ...  1 ;
                  −a_1+k̃_1  −a_2+k̃_2  ...  −a_n+k̃_n ]    (7.35)
Now, according to the previous proposition, the characteristic polynomial of
Ā + B̄k̃^T is

    χ(λ) = (−1)^n [ λ^n + (a_n − k̃_n)λ^{n−1} + ... + (a_2 − k̃_2)λ + (a_1 − k̃_1) ]    (7.36)

We require that χ and p have the same roots, i.e. p = (−1)^n χ, which means that:

    p_1 = a_1 − k̃_1 ⟺ k̃_1 = a_1 − p_1    (7.37)
    p_2 = a_2 − k̃_2 ⟺ k̃_2 = a_2 − p_2    (7.38)
    ...
    p_n = a_n − k̃_n ⟺ k̃_n = a_n − p_n    (7.39)

These equations are used to calculate the vector k̃. But it holds that:

    Ā + B̄k̃^T = TAT^{−1} + TBk̃^T    (7.40)
              = TAT^{−1} + TBk̃^T T T^{−1}    (7.41)
              = T(A + Bk̃^T T)T^{−1}    (7.42)

which means that the matrix Ā + B̄k̃^T is similar to the matrix A + Bk^T with
similarity matrix T; therefore they have the same characteristic polynomial.
Thus, the closed-loop matrix A + Bk^T, with k given by:

    k^T = k̃^T T    (7.43)

has characteristic polynomial p (up to the sign (−1)^n), i.e. the prescribed roots.
Example 16. We need to design a linear feedback controller u = k^T x for the
following system:

    ẋ = [ 1  2  1 ; −1  3  2 ; −4  6  0 ] x + [ 1 ; 1 ; 1 ] u    (7.44)

so that the eigenvalues of the closed-loop system ẋ = (A + Bk^T)x are exactly
λ_1 = −1 and λ_{2,3} = −4 ± j. This means that the characteristic polynomial of
the closed-loop matrix A + Bk^T must be:

    p(λ) = (λ − λ_1)(λ − λ_2)(λ − λ_3)    (7.45)
         = (λ + 1)(λ + 4 + j)(λ + 4 − j)    (7.46)
         = (λ + 1)((λ + 4)^2 + 1)    (7.47)
         = λ^3 + 9λ^2 + 25λ + 17    (7.48)

We now need to transform the system into its canonical controllable form. The
controllability matrix of the system is:

    C(A, B) = [ 1  4  14 ; 1  4  12 ; 1  2  8 ]    (7.49)

We now solve the following linear system to determine the vector t_1 (see equation
(4.147)):

    C(A, B)^T t_1 = [ 0 ; 0 ; 1 ]    (7.50)

from which we get:

    t_1^T = [ 0.5  −0.5  0 ]    (7.51)
    t_2^T = t_1^T A = [ 1  −0.5  −0.5 ]    (7.52)
    t_3^T = t_2^T A = [ 3.5  −2.5  0 ]    (7.53)

Therefore, we define:

    T = [ t_1^T ; t_2^T ; t_3^T ] = [ 0.5  −0.5  0 ; 1  −0.5  −0.5 ; 3.5  −2.5  0 ]    (7.54)

This matrix transforms our system into the following canonical controllable
form:

    ż = [ 0  1  0 ; 0  0  1 ; −22  3  4 ] z + [ 0 ; 0 ; 1 ] u    (7.55)

with z = Tx; the coefficients of the characteristic polynomial are therefore
a_1 = 22, a_2 = −3, a_3 = −4. So, according to the previous proposition:

    k̃_1 = a_1 − p_1 = 22 − 17 = 5    (7.56)

In the same way we calculate k̃_2 and k̃_3, and we come up with:

    k̃ = [ 5 ; −28 ; −13 ]    (7.57)

and eventually:

    k^T = k̃^T T = [ −71  44  14 ]    (7.58)

Using this gain, the closed-loop system becomes:

    ẋ = [ −70  46  15 ; −72  47  16 ; −75  50  14 ] x    (7.59)

which has exactly the desired eigenvalues.
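The arithmetic of this example can be double-checked numerically (matrix entries as read here):

```python
import numpy as np

A = np.array([[ 1.0, 2.0, 1.0],
              [-1.0, 3.0, 2.0],
              [-4.0, 6.0, 0.0]])
B = np.array([[1.0], [1.0], [1.0]])
k = np.array([[-71.0, 44.0, 14.0]])   # the gain k^T computed above

cl = A + B @ k                         # closed-loop matrix A + B k^T
eigs = np.linalg.eigvals(cl)
print(np.sort(eigs.real))              # approximately [-4, -4, -1]
print(np.sort(eigs.imag))              # approximately [-1,  0,  1]
```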
The pole placement theorem presented above can be simplified a bit. The
requirement that the closed-loop system have a given characteristic polynomial
is stronger than what we actually need, namely that the system have the prescribed
eigenvalues. Let us notice that for any matrix A ∈ M_n(R), its characteris-
tic polynomial χ_A(λ) = det(A − λI) has the same roots as the polynomial
χ̄_A(λ) = det(λI − A). In particular, it holds that:

    χ̄_A(λ) = (−1)^n χ_A(λ)    (7.60)

For a matrix Ā that is in the canonical controllable form we have that:

    χ̄_Ā(λ) = λ^n + a_n λ^{n−1} + ... + a_2 λ + a_1    (7.61)

We can now state and prove the following theorem:
Theorem 50 (Pole placement 2). Let A ∈ M_n(R) and B ∈ R^n be the matrices
of an LTI system with a single input u : R_+ → R, and assume that the pair
(A, B) is controllable. Let {λ_i}_{i=1}^n ⊆ C be a self-conjugate¹ set of desired
eigenvalues for the closed-loop system. Then there is a vector k ∈ R^n so that
A + Bk^T has eigenvalues {λ_i}_{i=1}^n.

Proof. We will just sketch the proof and leave it to the reader as a necessary
exercise. The proof follows the same steps as the proof of the first pole placement
theorem. Here we require that p = χ̄, where p is the polynomial given by:

    p(λ) = Π_{i=1}^n (λ − λ_i)    (7.62)

and χ̄ is the polynomial:

    χ̄(λ) = det(λI − (A + Bk^T))    (7.63)
The following result is known as the Zubov-Wonham theorem. A full proof
can be found in [LeoShu10]. According to this theorem, the ability to place
the poles at arbitrary positions characterizes controllable systems. Before
stating it we need to brush up on some Linear Algebra first. We recall the definition of
the spectrum of a matrix:

Definition 44 (Spectrum). The set of eigenvalues of a matrix A is called the
spectrum of the matrix and is denoted by σ(A).

We now give the following result for block matrices without a proof.

Proposition 51. If W is the block matrix

    W = [ A  B ; 0  D ]    (7.64)

then the following hold:

1. det W = det A · det D
2. σ(W) = σ(A) ∪ σ(D)

¹ A set S ⊆ C is called self-conjugate if for every element of S its conjugate is also in S.
Formally, s ∈ S ⟹ s̄ ∈ S.

Proof. For a proof of the first claim, and more on determinants of block matrices,
read [John00]. The second claim can be easily deduced from the first one.
Theorem 52 (Zubov-Wonham). The pair (A, B) is completely controllable if
and only if for every choice of a self-conjugate set Λ = {λ_i}_{i=1}^n ⊆ C there is a
matrix K ∈ M_{n×m}(R) such that σ(A + BK^T) = Λ (i.e. the closed-loop matrix
of the system ẋ = Ax + Bu with feedback u = K^T x has as eigenvalues the numbers
in Λ).

Note: For the single-input case, where K ∈ R^n, we know that the necessity
condition of the Zubov-Wonham theorem holds (see theorems 49 and 50). Here
we give a proof of the sufficiency of the theorem, i.e. we show that if we
can arbitrarily place the poles of a system via feedback, then the system is
controllable.
Proof. Assume that the system has n − s uncontrollable modes, with n − s ≠ 0.
Then, there is a T ∈ GL(n, R) so that the system is written in the Kalman
decoupled form:

    ż = [ A11  A12 ; 0  A22 ] z + [ B1 ; 0 ] u    (7.65)

with A11 ∈ M_s(R), A12 ∈ M_{s×(n−s)}(R), A22 ∈ M_{n−s}(R) and B1 ∈ M_{s×m}(R);
the pair (A11, B1) is controllable and z = Tx is the state of the system in the
new coordinates. Given a K^T = [ K1^T  K2^T ] ∈ M_{m×n}(R), with K1^T ∈ M_{m×s}(R)
and K2^T ∈ M_{m×(n−s)}(R), the closed-loop system with feedback u = K^T z becomes:

    ż = ( [ A11  A12 ; 0  A22 ] + [ B1 ; 0 ] [ K1^T  K2^T ] ) z    (7.66)
       = ( [ A11  A12 ; 0  A22 ] + [ B1 K1^T  B1 K2^T ; 0  0 ] ) z    (7.67)
       = [ A11 + B1 K1^T   A12 + B1 K2^T ; 0   A22 ] z    (7.68)

The spectrum of the closed-loop matrix becomes:

    σ( [ A11 + B1 K1^T   A12 + B1 K2^T ; 0   A22 ] ) = σ(A11 + B1 K1^T) ∪ σ(A22)    (7.69)

Even if we assume that σ(A11 + B1 K1^T) is admissible by means of a proper
choice of K1^T, we notice that σ(A22) is completely unaffected by the choice of
K^T. Notice also that the number of eigenvalues of the matrix A11 + B1 K1^T is s
(counting multiplicities), while we need to be able to assign n eigenvalues, which
is impossible.
Another interesting result is known as Ackermann's formula for pole placement
and is given in the following proposition:

Proposition 53 (Ackermann's Formula). If (A, B) is a controllable pair with
B ∈ R^n (single input system) and p is a polynomial having as roots the desired
poles, then the unique feedback for pole placement is:

    K^T = −[ 0  ···  0  1 ] C(A, B)^{−1} p(A)    (7.70)
Example 17 (Pole placement using Ackermann's formula). We need to determine
a linear feedback u = K^T x that places the poles of the system:

    ẋ = [ 1  2 ; 0  1 ] x + [ 1 ; 3 ] u    (7.71)

at 1 and 2. For that reason we consider the following polynomial which has
roots exactly 1 and 2:

    p(s) = (s − 1)(s − 2)    (7.72)

We now calculate p(A), which is:

    p(A) = (A − I)(A − 2I) = [ 0  −2 ; 0  0 ]    (7.73)

The controllability matrix of the system is:

    C(A, B) = [ 1  7 ; 3  3 ]    (7.74)

which is invertible. Using Ackermann's formula we have:

    K^T = [ 0   1/3 ]    (7.75)

and the closed loop system is then:

    ẋ = (1/3) [ 3  7 ; 0  6 ] x    (7.76)

which has the desired poles.
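The calculation in Example 17 can be reproduced with a few lines of numpy. The sketch below implements formula (7.70) directly; the function name `ackermann` and the Horner evaluation of p(A) are our own choices, not part of the text:

```python
import numpy as np

def ackermann(A, b, poles):
    """Gain row k^T such that A + b k^T has the desired poles,
    via k^T = -[0 ... 0 1] C(A, b)^{-1} p(A)."""
    n = A.shape[0]
    # Controllability matrix C(A, b) = [b, Ab, ..., A^{n-1} b]
    C = np.hstack([np.linalg.matrix_power(A, i) @ b for i in range(n)])
    coeffs = np.poly(poles)            # monic polynomial with the given roots
    pA = np.zeros_like(A)
    for c in coeffs:                   # Horner evaluation of p at the matrix A
        pA = pA @ A + c * np.eye(n)
    e_last = np.zeros((1, n))
    e_last[0, -1] = 1.0
    return -e_last @ np.linalg.inv(C) @ pA

A = np.array([[1.0, 2.0], [0.0, 1.0]])
b = np.array([[1.0], [3.0]])
kT = ackermann(A, b, poles=[1.0, 2.0])
closed_loop = A + b @ kT
print(np.round(kT, 6))                          # [[0.       0.333333]]
print(np.sort(np.linalg.eigvals(closed_loop)))  # [1. 2.]
```

The gain agrees with (7.75) and the closed-loop matrix with (7.76).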
7.3.2 Pole Placement for systems with multiple inputs

So far so good, but only for systems with a single input. What happens
with systems with more than one input? It turns out that the single input
theory applies directly and without any modification to multiple input systems.
The matrix B ∈ M_{n×m}(R) of a multiple input system has the form
B = [ b_1  b_2  ...  b_m ] (where the b_i are the columns of B). Assume that B
contains no zero columns; if it does we may remove them along with the corresponding
input variables and reduce the dimension of the input vector. Let b_{i0}
be a nonzero column of B and assume that (A, B) is controllable. The open
loop system is described by:

    ẋ = Ax + b_1 u_1 + ... + b_m u_m    (7.77)

We single out b_{i0} and we rewrite the system as follows:

    ẋ = Ax + B'_{i0} u'_{i0} + b_{i0} u_{i0}    (7.78)

where B'_{i0} ∈ M_{n×(m−1)}(R) is B without the column i0 and u'_{i0} ∈ R^{m−1} is the
input vector u without the input i0. We now determine a feedback u'_{i0} = K^T x,
where K ∈ M_{n×(m−1)}(R), so that the closed loop system becomes:

    ẋ = (A + B'_{i0} K^T) x + b_{i0} u_{i0}    (7.79)
Figure 7.1: Pole placement for a system with multiple inputs
If we choose K so that the pair (A + B'_{i0} K^T, b_{i0}) is controllable, we can
use Ackermann's formula to find a state feedback u_{i0} = F^T x that places the poles
of the overall system at the desired positions. So the only question we have to
answer is: how do we choose a K so that (A + B'_{i0} K^T, b_{i0}) is controllable? This
is actually quite simple: we can choose one randomly! The probability that we
haven't chosen a proper K is zero! Let us first give an example of pole placement
for a system with 3 states and 3 inputs and afterwards we will explain why we
can use almost any K.
Example 18 (Multiple Inputs: Pole placement). Consider the following system
with 3 states and 3 inputs:

    ẋ = [ 1  1  2 ; 3  1  0 ; 1  4  1 ] x + [ 1  1  3 ; 1  2  1 ; 0  1  5 ] u    (7.80)

where x, u ∈ R^3. In particular u = [ u_1  u_2  u_3 ]^T. It is easy to see that this
system is controllable. Let us define the input vector without the third input to be
u'_3 = [ u_1  u_2 ]^T (we have chosen i0 = 3). Then the given system is rewritten
as:

    ẋ = [ 1  1  2 ; 3  1  0 ; 1  4  1 ] x + [ 1  1 ; 1  2 ; 0  1 ] u'_3 + [ 3 ; 1 ; 5 ] u_3    (7.81)

We now choose randomly a K ∈ M_{3×2}(R), which can be simply:

    K = [ 1  0 ; 0  1 ; 0  0 ]    (7.82)

Then, using the partial feedback law u'_3 = K^T x, the system becomes:

    ẋ = [ 2  2  2 ; 4  3  0 ; 1  5  1 ] x + [ 3 ; 1 ; 5 ] u_3    (7.83)

where we write M = A + B'_3 K^T for the state matrix and H = b_3 for the input
column.
Assume now that we need to place the poles of the system (M, H) at 1, 2
and 3. Using Ackermann's formula we get the feedback law:

    u_3 = F^T x    (7.84)

where

    F = [ 2.2018972 ; 0.7733927 ; 0.5241832 ]    (7.85)

The overall control law then becomes:

    u = [ K  F ]^T x    (7.86)
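The two-stage construction of this example is easy to script. The following numpy sketch (the helper names are ours, not from the text) reproduces the matrix M of (7.83) and confirms that the resulting single-input pair (M, b_3) is controllable, which is exactly what Ackermann's formula needs in the second stage:

```python
import numpy as np

A = np.array([[1.0, 1, 2], [3, 1, 0], [1, 4, 1]])
B = np.array([[1.0, 1, 3], [1, 2, 1], [0, 1, 5]])
B3, b3 = B[:, :2], B[:, 2:]   # B without column i0 = 3, and the singled-out column

def ctrb(A, B):
    """Controllability matrix [B, AB, ..., A^{n-1} B]."""
    n = A.shape[0]
    return np.hstack([np.linalg.matrix_power(A, i) @ B for i in range(n)])

# Stage 1: the simple (almost arbitrary) K of (7.82) for the partial feedback.
K = np.array([[1.0, 0], [0, 1], [0, 0]])
M = A + B3 @ K.T                            # the matrix M of (7.83)
print(M)
# Stage 2 is single-input pole placement for (M, b3); Ackermann's formula
# applies because the pair turns out to be controllable:
print(np.linalg.matrix_rank(ctrb(M, b3)))   # 3
```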
As we mentioned earlier, the probability that we choose an inappropriate K
is zero. This is based upon the fact that the family of uncontrollable systems
forms a set of measure zero and is nowhere dense inside the set of all systems.
This is formally stated as follows [Son98]:

Proposition 54. Let S_{n,m} = M_n(R) × M_{n×m}(R) and let S^c_{n,m}
contain all controllable pairs in S_{n,m}. Then S^c_{n,m} is an open and dense set in
S_{n,m}.

Proof. For a proof and more on topological properties of S^c_{n,m} see [Son98].
7.4 Stabilizability

While controllability is necessary (and sufficient) for pole placement, it is not
necessary for stabilization. Even if we cannot control the system completely,
and hence cannot determine a feedback so as to place the closed loop poles
arbitrarily, we might still be able to find a feedback so that the origin is an
asymptotically stable point of the closed loop system. Let us demonstrate this
through the following example.
Example 19. The following system is not controllable:

    ẋ = [ 1  2 ; 0  −2 ] x + [ 1 ; 0 ] u    (7.87)

This is obvious since the system is written in the Kalman input-state decomposition
form. However, if we try the feedback law:

    u(x) = [ −2  0 ] x    (7.88)

then the closed loop system is:

    ẋ = [ −1  2 ; 0  −2 ] x    (7.89)

from which it is obvious that its eigenvalues are −1 and −2, which renders the
origin an asymptotically stable point. Trying various other feedbacks we will find
out that one of the two eigenvalues of the closed loop system is always −2! The
second eigenvalue can be placed arbitrarily anywhere on R.
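A quick numerical check of this example (a numpy sketch with our own variable names) confirms both the loss of controllability and the eigenvalue pinned at −2:

```python
import numpy as np

A = np.array([[1.0, 2.0], [0.0, -2.0]])
b = np.array([[1.0], [0.0]])

# Not controllable: the controllability matrix [b, Ab] has rank 1 < 2.
C = np.hstack([b, A @ b])
print(np.linalg.matrix_rank(C))                  # 1

# Nevertheless the feedback u = [-2, 0] x stabilizes the origin:
kT = np.array([[-2.0, 0.0]])
print(np.sort(np.linalg.eigvals(A + b @ kT)))    # [-2. -1.]

# Whatever feedback we try, one closed-loop eigenvalue stays pinned at -2:
for trial in ([[5.0, 7.0]], [[-1.0, 3.0]]):
    eigs = np.linalg.eigvals(A + b @ np.array(trial))
    print(np.any(np.isclose(eigs, -2.0)))        # True (both times)
```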
Proposition 55 (Partial Pole Placement - Exercise). Consider an LTI system
with n states and m inputs whose controllability matrix has rank s < n. Then:

1. Under any linear feedback u(x) = K^T x, the closed loop system has n − s
immovable eigenvalues, namely there is a set J = {μ_i}_{i=1}^{n−s} so that:

    J ⊆ σ(A + BK^T) for all K ∈ M_{n×m}(R)    (7.90)

2. Given a self-conjugate set Λ = {λ_j}_{j=1}^s, there is a linear feedback u(x) =
K^T x with K^T = [ K_1^T  0 ], where K_1 ∈ M_{s×m}(R), such that:

    σ(A + BK^T) = Λ ∪ J    (7.91)

Proof. The proof is based on facts that have been explained before and is left
to the reader as an exercise. Hint: Use the Kalman input-state decoupled form
of the system.
To this end we give the definition of stabilizability, which is necessary and
sufficient for the existence of a feedback that asymptotically stabilizes the closed
loop system (renders the origin an asymptotically stable equilibrium point).

Definition 45 (Asymptotically Stabilizable). A control system ẋ = f(x, u) is
(asymptotically) stabilizable if there is a feedback law u(x) = φ(x) so that the
closed loop system ẋ = f(x, φ(x)) is asymptotically stable.

Note: It follows from the definition that if a system is controllable then it is
stabilizable. The converse though is not true.
Example 20. Consider a control system in the Kalman input-state decoupled
form:

    ẋ = [ A11  A12 ; 0  A22 ] x + [ B1 ; 0 ] u    (7.92)

where A11 ∈ M_s(R). Let x be decomposed into its controllable and uncontrollable
modes as x = [ x_co ; x_uc ], with x_co ∈ R^s and x_uc ∈ R^{n−s}, where s is the rank
of the controllability matrix of the system. Then:

    ẋ_co = A11 x_co + A12 x_uc + B1 u    (7.93)
    ẋ_uc = A22 x_uc    (7.94)

Let us assume that all eigenvalues of A22 have strictly negative real part, i.e.
equation (7.94) has asymptotically stable dynamics. This means that if we denote
by φ_uc(t) the solution of (7.94), it holds that:

    ‖φ_uc(t)‖ → 0 as t → ∞

For the controllable pair (A11, B1) let us choose a feedback K1 so that the matrix
A11 + B1 K1^T has all its eigenvalues with negative real part. Applying this feedback
to the system yields:

    ẋ_co = (A11 + B1 K1^T) x_co + A12 x_uc    (7.95)
    ẋ_uc = A22 x_uc    (7.96)
Let us now assume that the initial condition of the system is:

    x_0 = [ x^0_co ; x^0_uc ]    (7.97)

Then the trajectory of the closed loop system is:

    φ(t) = [ φ_co(t) ; φ_uc(t) ]    (7.98)

with

    φ_co(t) = e^{(A11 + B1 K1^T) t} x^0_co + ∫_0^t e^{(A11 + B1 K1^T)(t−τ)} A12 φ_uc(τ) dτ
    φ_uc(t) = e^{A22 t} x^0_uc

Notice that φ(t) converges to the origin from any initial point (why?).
We now state the following stabilizability criterion [Hes09]:

Proposition 56 (Popov-Belevitch-Hautus Stabilizability Criterion). The following
conditions are equivalent:

1. The pair (A, B) is stabilizable.
2. rank [ λI − A   B ] = n for all λ ∈ C with non-negative real part.
3. rank [ λI − A   B ] = n for all λ ∈ C that are eigenvalues of A and have
non-negative real part.
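Condition 3 turns the criterion into a finite computation: it suffices to check the rank of the pencil at the finitely many eigenvalues of A with non-negative real part. A minimal numpy sketch (the function name and tolerance handling are our own assumptions):

```python
import numpy as np

def is_stabilizable(A, B, tol=1e-9):
    """PBH test (condition 3): rank [lam*I - A, B] must equal n at every
    eigenvalue lam of A with non-negative real part."""
    n = A.shape[0]
    for lam in np.linalg.eigvals(A):
        if lam.real >= -tol:
            pencil = np.hstack([lam * np.eye(n) - A, B])
            if np.linalg.matrix_rank(pencil, tol=tol) < n:
                return False
    return True

# The uncontrollable system of Example 19 is nevertheless stabilizable...
A = np.array([[1.0, 2.0], [0.0, -2.0]])
B = np.array([[1.0], [0.0]])
print(is_stabilizable(A, B))    # True
# ...but moving the uncontrollable mode to +2 destroys stabilizability:
A2 = np.array([[1.0, 2.0], [0.0, 2.0]])
print(is_stabilizable(A2, B))   # False
```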
Another result, which is due to Wonham, provides an approach to stabilizability
through cyclicity. Let ψ ∈ R[t] be the minimal polynomial of A. This
polynomial can be factored as follows:

    ψ(t) = ψ⁺(t) ψ⁻(t)    (7.99)

where ψ⁺(t) has all its roots in the closed right half-plane and ψ⁻(t) has all its
roots in the open left half-plane.
7.5 Lyapunov Theory

Lyapunov theory splits into two categories. Firstly, the direct Lyapunov theory
applies to linear and nonlinear systems to provide results on the stability
properties of their equilibrium points (assumed to be the origin). The direct
approach tackles the system as is, without any transformations or approximations.
Secondly, the indirect approach offers local stability results regarding the
linearization of a nonlinear system around an equilibrium point. This approach
uses mainly tools from linear systems theory but is more limited compared to
the direct approach.
7.5.1 Direct Lyapunov Theory

We first give the definition of a directional derivative (or Lie derivative) along
the trajectories of a dynamical system:

Definition 46 (Derivative along the trajectories of a system). Given a dynamical
system Σ : ẋ(t) = f(x(t)), where f : R^n → R^n, and a function V : R^n → R,
we define the derivative of V along the trajectories of the system as:

    dV/dt |_Σ ≜ dV/dt |_{x(t)} = [ (∂V/∂z) f(z) ] |_{z = x(t)}    (7.100)
Notation: Various notational schemes have been adopted by different authors.
Some of the most widely used are (L_f V)(x) and V̇(x). The former is the most
concise and clear in my opinion.
Example 21. Consider the LTI system Σ : ẋ = Ax and the function V(z) =
z^T P z, where P ∈ S_n(R) is a symmetric matrix. The Lie derivative of V along
the trajectories of Σ is calculated as follows:

    dV/dt |_{x(t)} = d(z^T P z)/dt |_{z = x(t)}    (7.101)
                   = ẋ^T P x + x^T P ẋ    (7.102)
                   = x^T A^T P x + x^T P A x    (7.103)
                   = x^T (A^T P + P A) x    (7.104)

Recall that every scalar is equal to its transpose, so for example for any vector
y ∈ R^n and any matrix Q ∈ M_n(R), it is true that y^T Q y = (y^T Q y)^T = y^T Q^T y.
Therefore, equation (7.103) becomes:

    x^T A^T P x + x^T P A x = (x^T A^T P x)^T + x^T P A x    (7.105)
                            = x^T P^T A x + x^T P A x    (7.106)
                            = x^T (P^T + P) A x    (7.107)

But, since P is assumed to be symmetric, one has that:

    dV/dt |_{x(t)} = 2 x^T P A x    (7.108)

We will end up with the same result if we use the fact that dV/dt |_Σ = L_f V =
(∂V/∂x) f(x). In particular we have:

    ∂V(x)/∂x = ∂(x^T P x)/∂x = x^T (P + P^T) = 2 x^T P    (7.109)

Hence:

    dV/dt |_Σ = (L_f V)(x) = 2 x^T P A x    (7.110)
Note 1: Note that nowhere did we have to calculate explicitly the trajectory
of the system in order to come up with the Lie derivative of V along the trajectories
of the system.

Note 2: The fact that 2x^T P A x = x^T (A^T P + P A) x in no way implies
that 2PA = A^T P + PA! Also, if P is assumed to be symmetric (and A does
not possess any special structure, as is usually the case), then A^T P + PA is also
symmetric, while PA is not guaranteed to be so. If we have to choose between
these two we should definitely go for the second one. Symmetric matrices are
rich in properties that facilitate algebraic manipulations. For instance, symmetric
matrices have only real eigenvalues, they are always diagonalizable and
specifically we can choose their eigenvectors to form an orthonormal basis for
R^n. I guess the trick is already clear... Given any quadratic form V(x) = x^T Q x
we can convert it to a symmetric form as follows:

    V(x) = x^T Q x = x^T ((Q + Q^T)/2) x = x^T Q̃ x    (7.111)

The matrix Q̃ = (1/2)(Q + Q^T) is symmetric. Let us close this note with the
following equation:

    (L_f V)(x) = 2 x^T P A x = 2 x^T (((PA)^T + PA)/2) x = x^T (A^T P + P A) x    (7.112)
Example 22. In this example we will be a bit more explicit. Consider the
dynamical system given by:

    d/dt [ x_1 ; x_2 ] = [ x_2 ; −2x_2 − x_1 ] =: f(x_1, x_2)    (7.113)

and the function V : R^2 → R given by:

    V(x) = V(x_1, x_2) = (1/2)(x_1² + x_2²) = (1/2) x^T x    (7.114)

The gradient of V is:

    ∇_x V(x) = [ ∂V(x_1, x_2)/∂x_1   ∂V(x_1, x_2)/∂x_2 ] = [ x_1   x_2 ]    (7.115)

And the Lie derivative of V along the trajectories of (7.113) is the function
V̇ : R^2 → R given by:

    V̇(x_1, x_2) = [ x_1   x_2 ] [ x_2 ; −2x_2 − x_1 ] = −2x_2²    (7.116)
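The same Lie derivative can be checked numerically without any symbolic computation: approximate the gradient of V by central differences and dot it with the vector field. A small numpy sketch (the function names are our own):

```python
import numpy as np

def f(x):
    """Vector field of (7.113)."""
    return np.array([x[1], -2.0 * x[1] - x[0]])

def V(x):
    """Candidate function of (7.114)."""
    return 0.5 * (x[0] ** 2 + x[1] ** 2)

def lie_derivative(V, f, x, h=1e-6):
    """Numerical (L_f V)(x): central-difference gradient dotted with f(x)."""
    grad = np.array([(V(x + h * e) - V(x - h * e)) / (2 * h)
                     for e in np.eye(len(x))])
    return grad @ f(x)

x = np.array([0.7, -1.3])
val = lie_derivative(V, f, x)
print(val)   # approx -2 * x2^2 = -3.38
```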
Property of L_f V: Denote by φ(t, x_0) the solution of the time-invariant nonlinear
system ẋ = f(x) with initial condition x(0) = x_0. Then it holds for all t ≥ 0 that:

    V(φ(t, x_0)) − V(φ(0, x_0)) = ∫_0^t V̇(φ(τ, x_0)) dτ    (7.117)

which is simply written as:

    V(φ(t, x_0)) − V(x_0) = ∫_0^t V̇(φ(τ, x_0)) dτ    (7.118)

We need a couple more definitions before stating the main result of this
section. The notion of positive (semi)definite functions extends the notion of
positive (non-negative) numbers on R.

Definition 47 (Positive (semi)definite function). A function V : D → R, where
D ⊆ R^n, is said to be positive semidefinite if V(0) = 0 and for every x ∈ D it holds
that V(x) ≥ 0. It is called positive definite if additionally for every x ∈ D∖{0}
it is true that V(x) > 0. A function V is called negative (semi)definite if −V
is positive (semi)definite.

Note: Candidate positive definite functions are in many cases quadratic functions,
that is, functions of the form V(x) = x^T Q x where Q ∈ M_n(R) is an
appropriate matrix.
Definition 48 (Positive (semi)definite matrix). A matrix Q ∈ M_n(R) is called
positive (semi)definite if the corresponding quadratic function V(x) = x^T Q x is
positive (semi)definite.

An algebraic criterion exists to test whether a given symmetric matrix is
positive definite or positive semidefinite.

Proposition 57 (Criterion for Positive Semi-definiteness). Let Q ∈ M_n(R) be
a symmetric matrix - we denote by S_n(R) the set of all symmetric matrices in
M_n(R). Then, Q is positive semi-definite if and only if all its eigenvalues are
non-negative.
Proof. 1. (⇒) We first prove that if Q is symmetric and positive semidefinite,
then all its eigenvalues are non-negative. Recall from linear algebra that a
symmetric matrix has only real eigenvalues, thus we can compare them with 0.
If λ ∈ R is an eigenvalue of Q and v ∈ R^n is the corresponding eigenvector,
then Qv = λv. Left-multiplying by v^T yields:

    v^T Q v = λ v^T v    (7.119)
    ⟹ λ = (v^T Q v)/(v^T v) = (v^T Q v)/‖v‖₂² ≥ 0    (7.120)

Note that the division by v^T v = ‖v‖₂² is allowed since v ≠ 0.

2. (⇐) We now prove that if all eigenvalues of the symmetric Q are non-negative,
then the matrix is positive semi-definite. It is known that symmetric
matrices can be diagonalized by a unitary matrix U ∈ U_n(R), i.e. if Λ ∈ dg(n, R)
is the diagonal matrix of eigenvalues of Q, then there is a unitary matrix U so
that:

    Q = U^{−1} Λ U = U^T Λ U    (7.121)

This means that the columns of U^T (equivalently, the rows of U), namely
u_1, u_2, ..., u_n, are eigenvectors of Q. These eigenvectors form an orthonormal
basis of R^n (i.e. ‖u_i‖ = 1 and u_i^T u_j = 0 for i ≠ j), so any vector x ∈ R^n
can be represented as a linear combination:

    x = Σ_{i=1}^n ξ_i u_i    (7.122)

Let λ_i denote the eigenvalue that corresponds to u_i. We now calculate the
following:

    x^T Q x = Σ_{i=1}^n Σ_{j=1}^n ξ_i ξ_j u_i^T Q u_j    (7.123)
            = Σ_{i=1}^n Σ_{j=1}^n ξ_i ξ_j λ_j u_i^T u_j    (7.124)
            = Σ_{i=1}^n ξ_i² λ_i u_i^T u_i    (7.125)
            = Σ_{i=1}^n ξ_i² λ_i ‖u_i‖₂²    (7.126)

From this we conclude that x^T Q x ≥ 0 for all x if and only if λ_i ≥ 0 for i =
1, 2, ..., n.
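Proposition 57 translates directly into a numerical test: symmetrize first, as in (7.111), then inspect the eigenvalues. A minimal numpy sketch (the function name is our own):

```python
import numpy as np

def is_positive_semidefinite(Q, tol=1e-10):
    """Symmetrize Q (cf. (7.111)) and test that all eigenvalues are >= 0."""
    Qs = 0.5 * (Q + Q.T)
    # eigvalsh is for symmetric matrices and returns real eigenvalues
    return bool(np.all(np.linalg.eigvalsh(Qs) >= -tol))

print(is_positive_semidefinite(np.array([[2.0, 1.0], [1.0, 2.0]])))  # True
print(is_positive_semidefinite(np.array([[1.0, 3.0], [3.0, 1.0]])))  # False
```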
Proposition 58 (Exercise). If L ∈ M_n(R) is any matrix, then Q = L^T L is
symmetric and positive semidefinite.

Proof. This proposition provides a way to construct positive semidefinite matrices.
Its proof is left to the reader as an exercise.

Theorem 59 (Cholesky factorization). Let Q be a symmetric and positive
definite matrix. Then, there is a unique lower triangular matrix L with strictly
positive diagonal entries such that Q = LL^T. If Q is a positive semidefinite
and symmetric matrix, then the uniqueness of L is dropped and the diagonal
entries of L are non-negative.
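numpy exposes exactly this factorization: `numpy.linalg.cholesky` returns the lower triangular factor L with Q = LL^T for a symmetric positive definite input (and raises `LinAlgError` otherwise). A quick check:

```python
import numpy as np

Q = np.array([[4.0, 2.0], [2.0, 3.0]])   # symmetric, positive definite
L = np.linalg.cholesky(Q)                # lower triangular, Q = L L^T
print(np.round(L, 6))                    # [[2. 0.] [1. 1.414214]]
print(np.allclose(L @ L.T, Q))           # True
print(np.all(np.diag(L) > 0))            # True: strictly positive diagonal
```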
Notation: We adopt the following practical notation for positive and negative
(semi)definite matrices. If A is a positive semidefinite matrix, we write A ⪰ 0,
while if it is positive definite we write A ≻ 0. For negative matrices we use
the notation A ⪯ 0 and A ≺ 0 respectively. The symbol A ⪰ B should be
interpreted as A − B ⪰ 0 (the matrix A − B is positive semidefinite). We
need to underline that ⪰ is only a partial order, not a total one, i.e. there are pairs of
matrices A and B such that neither A ⪰ B nor B ⪰ A.
The Cholesky factorization theorem allows us to prove that a symmetric and
positive definite quadratic form defines a vector norm, while a symmetric and
positive semidefinite one defines a seminorm. For completeness we provide first
the definition of a vector norm and a vector seminorm.

Definition 49 (Norm/Seminorm). A mapping ‖·‖ : R^n ∋ x ↦ ‖x‖ ∈ [0, ∞) is
called a (vector) norm if it satisfies the following properties:

1. For every λ ∈ R it holds that ‖λx‖ = |λ| ‖x‖
2. For all x, y ∈ R^n it is ‖x + y‖ ≤ ‖x‖ + ‖y‖
3. ‖x‖ = 0 ⟹ x = 0

If the third property is dropped, then the mapping is called a seminorm.

We are already familiar with various norms in R^n, the most widely used one
being the Euclidean norm denoted as ‖·‖₂. The Euclidean norm is defined
as ‖x‖₂ = √(x^T x). Norms are useful because they allow us to define and study
convergence in vector spaces. We have previously clarified what we mean when
we say that the trajectory of a linear system converges to 0 in the sense of ‖·‖₂.
It is:

    lim_{t→∞} φ(t, x_0) = 0 ⟺ lim_{t→∞} ‖φ(t, x_0)‖₂ = 0    (7.127)

where the left-hand side equation has no meaning whatsoever unless we specify
that the convergence is meant in the ‖·‖₂-sense. Imagine now that someone
tells us that a trajectory converges to 0 in the ‖·‖_a sense, where ‖·‖_a is some
other norm. Should we assume that it converges in the ‖·‖₂-sense as well?
The answer is positive if and only if the two norms are equivalent. Here is the
definition of equivalent norms:

Definition 50 (Equivalent norms). Let ‖·‖_a and ‖·‖_b be two norms in R^n.
The two norms are called equivalent if there are numbers m, M > 0 so that for
all x ∈ R^n it holds:

    m ‖x‖_a ≤ ‖x‖_b ≤ M ‖x‖_a    (7.128)
We now prove that positive definite quadratic forms, i.e. functions of the
form f(x) = x^T P x with P ≻ 0, induce norms.

Proposition 60. Let P ∈ S_n(R) with P^T = P ≻ 0. The mapping ‖·‖_P : R^n →
[0, ∞) defined by:

    ‖x‖_P = √(x^T P x)    (7.129)

is a vector norm in R^n. If P ⪰ 0, then ‖·‖_P is a seminorm.

Proof. Assume first that P is symmetric and positive definite. Then, there
is a unique lower triangular matrix L with strictly positive diagonal entries
(Cholesky factorization theorem) so that P = LL^T. Then:

    ‖x‖_P = √(x^T P x) = √(x^T L L^T x) = √((L^T x)^T (L^T x)) = ‖L^T x‖₂    (7.130)

For λ ∈ R we have:

    ‖λx‖_P = ‖λ L^T x‖₂ = |λ| ‖L^T x‖₂ = |λ| ‖x‖_P    (7.131)

Second, let x, y ∈ R^n. Then:

    ‖x + y‖_P = ‖L^T x + L^T y‖₂ ≤ ‖L^T x‖₂ + ‖L^T y‖₂ = ‖x‖_P + ‖y‖_P    (7.132)

The third property of the vector norm accrues from the fact that L, being a
lower triangular matrix with strictly positive diagonal entries, is invertible, hence
ker L^T = {0}. Therefore:

    ‖x‖_P = 0 ⟺ ‖L^T x‖₂ = 0 ⟺ L^T x = 0 ⟺ x ∈ ker L^T = {0}    (7.133)

This last property does not hold if L is lower triangular with a zero element on
its diagonal. The determinant of a triangular matrix equals the product of its
diagonal elements; in that case it would be |L| = |L^T| = 0, rendering the matrix
non-invertible. Then the kernel of L^T would have dimension at least one and
there would be non-zero vectors such that L^T x = 0.
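The proof is constructive: ‖x‖_P can be computed as ‖L^T x‖₂ once the Cholesky factor of P is available. A short numpy sketch (the helper name is our own):

```python
import numpy as np

def p_norm(x, P):
    """||x||_P = sqrt(x^T P x), computed as ||L^T x||_2 where P = L L^T,
    following the proof of Proposition 60."""
    L = np.linalg.cholesky(P)
    return np.linalg.norm(L.T @ x)

P = np.array([[4.0, 2.0], [2.0, 3.0]])
x = np.array([1.0, -1.0])
print(p_norm(x, P))                             # sqrt(x^T P x) = sqrt(3)
print(np.isclose(p_norm(x, P) ** 2, x @ P @ x)) # True
```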
We are now ready to state and prove the main result of this section (the
proof can also be found in [Kha02]):

Theorem 61 (Lyapunov stability theorem). Let x = 0 be an equilibrium point
for the system ẋ = f(x) with x ∈ R^n, and let D ⊆ R^n be a domain containing the
origin. The vector field f is locally Lipschitz (so that the differential equation
admits unique solutions). Let V : R^n → [0, ∞) be a continuously differentiable
function such that:

1. V is positive definite in D.
2. L_f V is negative semidefinite in D.

Then x = 0 is stable. If additionally L_f V is negative definite in D, then the
origin is locally asymptotically stable.

Proof. For any ε > 0 take r ∈ (0, ε] such that:

    B_r(0) ≜ {x ∈ R^n : ‖x‖ ≤ r} ⊆ D    (7.134)

Let α = min_{‖x‖ = r} V(x). V is positive definite, so α > 0. Take β ∈ (0, α) and
define the set:

    Ω_β = {x ∈ B_r : V(x) ≤ β}    (7.135)

The trajectory of the system starting inside Ω_β will remain in Ω_β for all future
times t ≥ 0. This is a consequence of condition #2. To corroborate this claim
we use equation (7.118). Let x_0 ∈ Ω_β and φ(t, x_0) be the trajectory starting
from x_0. Then:

    V(φ(t, x_0)) − V(x_0) = ∫_0^t (L_f V)(φ(τ, x_0)) dτ ≤ 0    (7.136)

That is, for all t ≥ 0:

    V(φ(t, x_0)) ≤ V(x_0) ≤ β ⟹ φ(t, x_0) ∈ Ω_β    (7.137)

Since V is continuous and V(0) = 0, we can find a δ > 0 so that:

    B_δ ⊆ Ω_β ⊆ B_r    (7.138)

which further means that any trajectory starting from B_δ (i.e. from inside Ω_β)
will remain in Ω_β, thus it will remain in B_r. Formally speaking:

    x(0) = x_0 ∈ B_δ ⟹ x_0 ∈ Ω_β ⟹ φ(t; x_0) ∈ Ω_β ⟹ φ(t; x_0) ∈ B_r    (7.139)

The same thing in norm notation:

    ‖x_0‖ < δ ⟹ ‖φ(t; x_0)‖ < r for all t ≥ 0    (7.140)

which states exactly that the origin is a stable equilibrium. We shall now prove
the second part of the theorem regarding asymptotic stability. We need to show
that:

    lim_{t→∞} ‖φ(t, x_0)‖ = 0    (7.141)

for x_0 in a neighbourhood of the origin; the details of this argument can be
found in [Kha02].
Let us now investigate the implications of the Lyapunov theorem on linear
and time-invariant systems.

Theorem 62. Given an LTI system ẋ = Ax and any positive definite matrix
Q ≻ 0, the following are equivalent:

1. The origin is an asymptotically stable equilibrium point for the system.
2. There is a unique positive definite and symmetric matrix P^T = P ≻ 0 such
that:

    A^T P + P A = −Q    (7.142)
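Equation (7.142) is linear in the entries of P and can be solved by vectorization, using the identity vec(AXB) = (B^T ⊗ A) vec(X) with column-major vec. The helper below is our own sketch, not code from the text:

```python
import numpy as np

def lyapunov_solve(A, Q):
    """Solve A^T P + P A = -Q for P by vectorization:
    (I kron A^T + A^T kron I) vec(P) = vec(-Q), column-major vec."""
    n = A.shape[0]
    lhs = np.kron(np.eye(n), A.T) + np.kron(A.T, np.eye(n))
    p = np.linalg.solve(lhs, -Q.flatten(order='F'))
    return p.reshape((n, n), order='F')

A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # eigenvalues -1, -2: stable
P = lyapunov_solve(A, np.eye(2))
print(P)                                   # [[1.25 0.25] [0.25 0.25]]
print(np.all(np.linalg.eigvalsh(P) > 0))   # True: P is positive definite
```

Since A is Hurwitz, the theorem guarantees this P is the unique symmetric positive definite solution.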
References
[Bea96] J.A. Beachy, Abstract Algebra, 2nd edition, 1996, ISBN 0-88133-866-4,
Available online at http://www.math.niu.edu/beachy/aaol/contents.html.
[Bha82] S.P. Bhattacharyya and E. de Souza, Pole assignment via Sylvesters
equation, Systems & Control Letters, 1(4), 1982, pp.261-263.
[Boy11] S. Boyd, Lecture Slides (Stanford University) 2010-11. Available
online at http://www.stanford.edu/boyd/ee263/lectures.html. Accessed:
November 5, 2011.
[Con99] E.H. Connell, Elements of Abstract and Linear Algebra, 2000, Available
online at http://www.math.miami.edu/ec/book/.
[Gan77] F.R. Gantmacher, The theory of matrices, Chelsea, New York, 1977.
ISBN: 0-8284-0131-4.
[Hes09] J.P. Hespanha, Linear Systems Theory, Princeton University Press,
Princeton, New Jersey, 2009, ISBN: 978-0-691-14021-6.
[John00] John R. Silvester, Determinants of Block Matrices, The Math-
ematical Gazette, 84(501), 2000, pp.460-467, Available online at
www.jstor.org/stable/3620776. Access date: November 5, 2011.
[Kha02] H.K. Khalil, Nonlinear Systems, Prentice Hall (3rd edition),
New Jersey, 2002, ISBN: 0-13-067389-7.
[Kli+99] R.E. Kliman, N. Sigmon and E. Stitzinger, Applications of Abstract
Algebra with MAPLE, CRC Press, London, ISBN 0-8493-8170-3.
[LeoShu10] G.A. Leonov and M.M. Shumanov, Stabilization of Controllable Lin-
ear Systems, Nonlinear Dynamics and Systems Theory, 10(3), 2010, pp.235-
268.
[MehXu96] V. Mehrmann and H. Xu, An analysis of the pole placement problem.
I. The single input case, Electronic Transactions on Numerical Analysis, 4,
1996, pp.89-105.
[Mes90] Robert Messer, Linear Algebra: gateway to mathematics, HarperCollins
College Publishers, New York 1990, ISBN: 0-06-501728-5.
[Mey00] Carl D. Meyer, Matrix Analysis and Applied Linear Algebra, SIAM
Editions, Available online at matrixanalysis.com.
[Rom00] Steven Roman, Advanced Linear Algebra, Springer editions, New York,
2000, ISBN: 0-387-24766-1, doi: 10.1007/0-387-27474-X.
[Rud73] Walter Rudin, Functional Analysis, McGraw-Hill editions, 1973, ISBN
0-07-054225-2.
[Sha96] Sharipov R. A. Course of Linear Algebra and Multidimensional Geom-
etry: the textbook, Publications of Bashkir State University Ufa, 1996.
ISBN 5-7477-0099-5.
[Sho00] Thomas S. Shores, Applied Linear Algebra and Matrix Analysis,
Springer-Verlag Editions, Undergraduate Texts in Mathematics, 2007, doi:
10.1007/978-0-387-48947-6.
[Son98] Eduardo D. Sontag, Mathematical Control Theory: Determin-
istic Finite Dimensional Systems, Springer Editions, Second edi-
tion, New York, 1998, ISBN: 0-387-984895, available online at
www.math.rutgers.edu/sontag.
[Spr00] Springer T. A., Linear Algebraic Groups, Birkhauser editions,
Boston 2000, ISBN: 978-0-8176-4839-8, e-ISBN: 978-0-8176-4840-4, doi:
10.1007/978-0-8176-4840-4.
[Won67] Wonham W. M., On Pole Assignment in Multi-Input Controllable Lin-
ear Systems, IEEE Transactions on Automatic Control, 12(6), 1967, pp.
660-665.