
System Identification and Parameter Estimation

PeSi/0/1

Introduction part 1: System Identification


What is system identification?
A set of methods to obtain a mathematical model of a dynamic system from
experimental data.
What is the goal of system identification?
The obtained model should on the one hand be compact and on the other
hand be adequate for the intended purpose. Such purpose may be the
design of a controller for that system.
How is system identification being carried out?
In dedicated experiments measurements are carried out on a system. The
obtained system response is analysed, e.g. with MATLAB's System Identification Toolbox ident.

Ronald Aarts

PeSi/1/2

Today:
- Systems and signals
- Continuous and discrete time systems
- Time and frequency domain
- Assessment / examination

PeSi/1/1

Lecturer:
Ronald Aarts
Mechanical Automation / Werktuigbouwkundige Automatisering (Wa)
Room: Horstring Z 229
Phone: (053) 489 2557
Email: R.G.K.M.Aarts@utwente.nl

Course number: 191131700
BlackBoard: http://blackboard.utwente.nl/
Software: Matlab

Course topics and course material: Introduction of System identification and Parameter estimation.

Example: Data from the piezo mechanism of Project F; input: piezo current, output: position, Ts = 33 µs.

[Figure: input u and output y signals; the full record (0 to 0.8 s, left) and a zoomed view (0.5 to 0.505 s, right).]

This data will be used more often during the lectures and is available for download.

PeSi/1/3

How is this toolbox being used?

Often with a GUI. Click-and-play??

And what is the use of these lectures?
An overview of the applied methods, the background of the algorithms, the pitfalls and suggestions for use.

PeSi/1/4

Introduction part 2: Parameter Estimation

What is parameter estimation (in this lecture)?
A set of methods to obtain a physical model of a dynamic system from experimental data.
What is the goal of parameter estimation?
The obtained parameters should represent certain physical properties of the system being examined.
How is parameter estimation being carried out?
The parameters of the model are fitted to obtain the best representation of the experimental data, e.g. with MATLAB's Optimization Toolbox.

PeSi/1/5

Physical models vs. Mathematical models

Physical (microscopic):
- First principles: conservation laws, physical properties, parameter estimation
- PDEs: ∞-dimensional, non-linear
- no general-purpose technique

Mathematical (macroscopic):
- Experimental: planned experiments, look for relations in data, system identification
- ODEs: finite dimension, LTI
- standard techniques

LTI: Linear Time Invariant system, e.g. described by means of a transfer function

  y(t) / u(t) = G(s)   or   Y(s) / U(s) = G(s)

Where possible, non-linearities are removed from the data by preprocessing prior to system identification.

PeSi/1/6

Approach for (1) system identification and (2) parameter estimation:

Input/output selection
Experiment design
Collection of data
Choice of the model structure (set of possible solutions)
Estimation of the parameters
Validation of the obtained model (preferably with separate
validation data).

1. The models used for system identification are mathematical:
- The underlying physical behaviour is unknown or incompletely known.
- To the extent possible, physical insight is used, e.g. to select the model structure (representation, order).
2. The parameters in the models for parameter estimation have, to a greater or lesser extent, a physical meaning.


PeSi/1/7

Overview topics: 1. System identification (first part of the lectures):
- Introduction, discrete systems, basic signal theory
- Open-loop LTI SISO systems, time domain, frequency domain
- Subspace identification
- Non-parametric identification: correlations and spectral analysis
- Identification with Prediction Error methods: prediction, model structure, approximate models, order selection, validation
- Experiment design

PeSi/1/8

Overview topics: 2. Parameter estimation (first and second part of the lectures):
- Non-linear model equations
- Identifiability of parameters
- Linearity in the parameters
- Error propagation
- Experiment design

PeSi/1/9

Overview topics: 3. Miscellaneous (second part of the lectures):
- MIMO systems
- Identification in the frequency domain
- Identification of closed loop systems
- Non-linear optimisation
- Cases

PeSi/1/10

Software:
MATLAB (any recent version) + identification toolbox + optimization toolbox. E.g. available from NCS.

Course material:
- Lecture notes / slides in PDF format from the BlackBoard site.
- On-line MATLAB documentation of the toolboxes (selection).

Examination (5 EC):
- Exercises parallel to the lectures, usually to be handed in the next week.
- One exercise includes a lab experiment (during the break in March / April).
- The answers will be graded and may be discussed during an oral exam.
- Grades will be available on-line.

Grades:
- Approximately 65% of maximum: 5.5
- 100% of maximum: 10.0
- In between: linear interpolation

PeSi/1/11

Systems and signals

  u → [ System T ] → y

A description of the system T should specify how the output signal(s) y depend on the input signal(s) u.
The signals depend on time: continuous or at discrete time instants.

- Continuous versus discrete time
- Time domain versus frequency domain

Several system descriptions are available. Examples for LTI systems:
- Frequency Response Function (FRF) or Bode plot
- Impulse response or step response
- State space model
- Transfer function

PeSi/2/1

The Bode plot of a system G(iω) is the amplitude and phase plot of the complex function G depending on the real (angular) frequency ω.
It specifies for each frequency the input-output relation for harmonic signals (after the transient behaviour has vanished):
- The amplification equals the absolute value |G(iω)|.
- The phase shift is found from the phase angle of G(iω).

The output y(t) for an arbitrary input signal u(t) can be found by considering all frequencies in the input signal: Fourier transform.
Procedure (in principle): transform the input into the frequency domain, multiply with the Bode plot or FRF, transform the result back to the time domain.

PeSi/2/2

The impulse response g(t) of an LTI system can also be used to compute the output y(t) for an arbitrary input signal u(t).
The input signal can be considered as a sequence of impulses with some (time dependent) amplitude u(t). The outputs due to all impulses are added.

Continuous time signals and systems: convolution integral

  y(t) = (g ∗ u)(t) = ∫₀^∞ g(τ) u(t − τ) dτ

Discrete time signals and systems: summation

  y(k) = (g ∗ u)(k) = Σ_{l=0}^{∞} g(l) u(k − l)   (k = 0, 1, 2, ...)

PeSi/2/3

Discrete time dynamic systems

Signals are sampled at discrete time instants t_k with sample time Ts (k = 0, 1, 2, ...):

  u_{k−1} = u(t_{k−1}) = u(t_k − Ts)
  u_k     = u(t_k)
  u_{k+1} = u(t_{k+1}) = u(t_k + Ts)
  u_{k+2} = u(t_{k+2}) = u(t_k + 2Ts)

Continuous time vs. discrete time:

  differential operator:  du/dt = s u,   ∫ u dt = s⁻¹ u
  shift operator:  (1 − q) u_k = u_k − u_{k+1},   (1 − q⁻¹) u_k = u_k − u_{k−1}

Transfer function (in the z-domain): y = G(z) u + H(z) e

State space equations:

Continuous time:
  ẋ(t) = A x(t) + B u(t) + K v(t)
  y(t) = C x(t) + D u(t) + v(t)

Discrete time:
  x(t_{k+1}) = A x(t_k) + B u(t_k) + K v(t_k)
  y(t_k) = C x(t_k) + D u(t_k) + v(t_k)

PeSi/2/4

Discrete time dynamic systems: stability

Continuous time vs. discrete time:

  Laplace transform:                          z-transform:
  G(s) = (n1 s + n0) / (d2 s² + d1 s + d0)    G(z) = (b1 z⁻¹ + b2 z⁻²) / (1 + a1 z⁻¹ + a2 z⁻²)

  Stability:                                  Stability:
  Poles in the LHP: Re(s) < 0                 Poles inside the unit circle: |z| < 1

  Undamped poles:                             Undamped poles:
  Imaginary axis: s = iω                      Unit circle: z = e^{iωTs}

Relation between s and z: z = e^{s Ts} (Franklin (8.19))

PeSi/2/5

Discrete systems and phase lag

Zero order hold (ZOH) discretisation introduces a phase lag. The phase lag depends on the frequency ω and the sample time Ts: ωTs/2 (in rad).

[Figure: ZOH reconstruction of a sine (Ts = 0.1 s) and the phase lag (deg) versus frequency (rad/s).]

So the phase lag is 90° at the Nyquist frequency ωN = ωs/2 = π/Ts.

PeSi/2/6

Signal characterisation

MATLAB's identification toolbox ident works with time domain data. Even then, the frequency domain will appear to be very important. Furthermore, identification can also be applied in the frequency domain.

- Frequency content: Fourier transform
- Deterministic or stochastic signals?
- Energy / Power

PeSi/2/7

Fourier transforms

Continuous-time deterministic signals u(t): Fourier integral:

  U(ω) = ∫_{−∞}^{∞} u(t) e^{−iωt} dt,   u(t) = (1/2π) ∫_{−∞}^{∞} U(ω) e^{iωt} dω

For a finite number (N) of discrete-time samples u_d(t_k): Fourier summation:

  U_N(ω_l) = Σ_{k=0}^{N−1} u_d(t_k) e^{−iω_l t_k},   u_d(t_k) = (1/N) Σ_{l=0}^{N−1} U_N(ω_l) e^{iω_l t_k}

U_N(ω_l) with ω_l = (l/N) ω_s = (l/N)(2π/Ts), l = 0, ..., N−1, is the discrete Fourier transform (DFT) of the signal u_d(t_k) with t_k = k Ts, k = 0, ..., N−1.
For N equal to a power of 2, the Fast Fourier Transform (FFT) algorithm can be applied.

PeSi/2/8
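These sums can be checked numerically. A minimal pure-Python sketch (illustrative only; the course material uses MATLAB's fft, and the function names dft/idft below are ad hoc):

```python
import cmath

def dft(u):
    """U_N(w_l) = sum_k u(k) e^{-i w_l t_k}, with w_l t_k = 2*pi*l*k/N."""
    N = len(u)
    return [sum(u[k] * cmath.exp(-2j * cmath.pi * l * k / N) for k in range(N))
            for l in range(N)]

def idft(U):
    """u(k) = (1/N) sum_l U(w_l) e^{+i 2 pi l k / N}."""
    N = len(U)
    return [sum(U[l] * cmath.exp(2j * cmath.pi * l * k / N) for l in range(N)) / N
            for k in range(N)]

Ts = 33e-6                                     # sample time of the piezo data
u = [1.0, 2.0, 0.0, -1.0]                      # four illustrative samples
U = dft(u)
u_back = idft(U)                               # recovers u up to rounding
freqs = [l / (len(u) * Ts) for l in range(len(u))]  # grid spacing 1/(N*Ts) Hz
```

The frequency grid spacing 1/(N·Ts) is exactly the 0.925 Hz step quoted later for the 32768-sample piezo record.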

Example: 32768 data samples of the piezo mechanism (left).
DFT (right) in 16384 frequencies computed with MATLAB's fft command.
Horizontal axis in steps of 1/(total measurement time) = 0.925 Hz.

[Figure: output signal y versus time t (left) and DFT magnitude |Y_N| versus frequency f (right).]

Energy and power (continuous time signals)

Energy spectrum: Φ_u(ω) = |U(ω)|²

Energy:

  E_u = ∫_{−∞}^{∞} u(t)² dt = (1/2π) ∫_{−∞}^{∞} Φ_u(ω) dω

Power spectrum:

  Φ_u(ω) = lim_{T→∞} (1/T) |U_T(ω)|²

Power:

  P_u = lim_{T→∞} (1/T) ∫_T u(t)² dt = (1/2π) ∫_{−∞}^{∞} Φ_u(ω) dω

[ U_T(ω) is the Fourier transform of a continuous time signal with a finite duration ]

PeSi/2/9

Signal types

- Deterministic with finite energy:
    E_u = ∫ Φ_u(ω) dω bounded;  Φ_u(ω) = |U(ω)|² limited.
- Deterministic with finite power:
    P_u = ∫ Φ_u(ω) dω finite;  Φ_u(ω) = lim_{T→∞} (1/T) |U_T(ω)|² unbounded.
- Stochastic with finite power:
    P_u = ∫ Φ_u(ω) dω finite;  Φ_u(ω) = lim_{T→∞} (1/T) |U_T(ω)|² bounded.

PeSi/2/11

Energy and power (discrete time signals)

[Figure: example discrete-time signal versus time.]

Energy spectrum: Φ_u(ω) = |U(ω)|²

Energy:

  E_u = Σ_{k=−∞}^{∞} u_d(k)²

Power spectrum (periodogram, from the DFT):

  Φ_u(ω) = lim_{N→∞} (1/N) |U_N(ω)|²

Power:

  P_u = lim_{N→∞} (1/N) Σ_{k=0}^{N−1} u_d(k)² = (1/ω_s) ∫_{ω_s} Φ_u(ω) dω

PeSi/2/12

Convolutions

Continuous time:

  y(t) = (g ∗ u)(t) = ∫₀^∞ g(τ) u(t − τ) dτ

After Fourier transform: Y(ω) = G(ω) U(ω)

Discrete time:

  y(k) = (g ∗ u)(k) = Σ_{l=0}^{∞} g(l) u(k − l)   (k = 0, 1, 2, ...)

After Fourier transform: Y(ω) = G(ω) U(ω)

Example: u is the input and g(k), k = 0, 1, 2, ..., is the impulse response of the system, that is the response to an input signal that equals 1 for t = 0 and equals 0 elsewhere.
Then, with the expressions above, y(k) is the output of the system.

PeSi/2/13

Stochastic signals

A realisation of a signal x(t) is not only a function of time t, but depends also on the ensemble behaviour.
An important property is the expectation: E f(x(t)).

Examples:  mean E x(t),  power E (x(t) − E x(t))².

Cross-covariance: R_xy(τ) = E [x(t) − E x(t)] [y(t − τ) − E y(t − τ)]
Autocovariance:  R_x(τ) = E [x(t) − E x(t)] [x(t − τ) − E x(t − τ)]

White noise: e(t) is not correlated with the signals e(t − τ) for any τ ≠ 0.
Consequence: R_e(τ) = 0 for τ ≠ 0.

PeSi/2/14

Power density or Power Spectral Density:

  Continuous time:  Φ_x(ω) = ∫_{−∞}^{∞} R_x(τ) e^{−iωτ} dτ
  Discrete time:    Φ_xd(ω) = Σ_{k=−∞}^{∞} R_xd(k) e^{−iωkT}

With (inverse) Fourier transform:

  R_x(τ) = (1/2π) ∫_{−∞}^{∞} Φ_x(ω) e^{iωτ} dω
  R_xd(k) = (T/2π) ∫_{−π/T}^{π/T} Φ_xd(ω) e^{iωkT} dω

Power:

  E (x(t) − E x(t))² = R_x(0) = (1/2π) ∫ Φ_x(ω) dω
  E (x_d(t) − E x_d(t))² = R_xd(0) = (T/2π) ∫ Φ_xd(ω) dω

PeSi/2/15

Systems and models

A system is defined by a number of external variables (signals) and the relations that exist between those variables (causal behaviour).

Signals:
- measurable input signal(s) u
- measurable output signal(s) y
- unmeasurable disturbances v (noise, non-linearities, ...)

PeSi/2/16
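The covariance definitions above translate directly into sample averages over one realisation. A small pure-Python sketch (not from the lecture notes) checking the white-noise property R_e(τ) = 0 for τ ≠ 0:

```python
import random

def autocovariance(x, tau):
    """Sample autocovariance R_x(tau) = (1/N) sum_t [x(t)-mean][x(t-tau)-mean]."""
    N = len(x)
    m = sum(x) / N
    return sum((x[t] - m) * (x[t - tau] - m) for t in range(tau, N)) / N

random.seed(0)
e = [random.gauss(0.0, 1.0) for _ in range(20000)]   # white noise, variance 1

R0 = autocovariance(e, 0)    # close to the variance (1)
R5 = autocovariance(e, 5)    # close to 0: samples are uncorrelated
```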

Estimators

Suppose we would like to determine a vector θ with n real coefficients of which the unknown (true) values equal θ0.
An estimator θ̂_N has been computed from N measurements.

This estimator is
- unbiased if E θ̂_N = θ0.
- consistent if for N → ∞ the distribution of the estimator θ̂_N resembles a δ-function, or in other words the certainty of the estimator improves for increasing N.

The estimator is consistent if it is unbiased and the asymptotic covariance vanishes:

  lim_{N→∞} cov θ̂_N = lim_{N→∞} E (θ̂_N − E θ̂_N)(θ̂_N − E θ̂_N)ᵀ = 0

PeSi/2/17

Non-parametric (system) identification

  t-domain: impulse or step response   (IDENT: cra)
  f-domain: Bode plot                  (IDENT: spa & etfe)

- These give models with many numbers, so we do not obtain models with a small number of parameters.
- The results are no simple mathematical relations.
- The results are often used to check the simple mathematical relations that are found with (subsequent) parametric identification.
- Non-parametric identification is often the first step.

Note: The IDENT commands impulse & step use a different approach that is related to the parametric identification to be discussed later.

PeSi/3/1

Correlation analysis

Ident manual: Tutorial pages 3-9,10,15; Function reference cra (4-42,43).

  y(t) = G0(z) u(t) + v(t)

with u the input, v a disturbance and G0 the unknown system.

Using the impulse response g0(k), k = 0, 1, 2, ..., of the system G0(z):

  y(t) = Σ_{k=0}^{∞} g0(k) u(t − k) + v(t)   (t = 0, 1, 2, ...)

So the transfer function can be written as

  G0(z) = Σ_{k=0}^{∞} g0(k) z⁻ᵏ

- Impulse response of infinite length.
- Assumption that the real system is linear and v is a disturbance (noise, not related to the input u).

PeSi/3/2

The Finite Impulse Response (FIR) ĝ(k), k = 0, 1, 2, ..., M, is a model estimator for the system G0(z) for sufficiently high order M:

  y(t) ≈ Σ_{k=0}^{M} ĝ(k) u(t − k)   (t = 0, 1, 2, ...)

Note: In an analysis the lower limit of the summation can be taken less than 0 (e.g. −m) to verify the (non-)existence of a non-causal relation between u(t) and y(t).

How do we compute the estimator ĝ(k)?

u(t) and v(t) are uncorrelated (e.g. no feedback from y to u!!!).
Multiply the expression y(t) = Σ g0(k) u(t − k) + v(t) with u(t − τ) and compute the expectation.
This leads to the Wiener-Hopf equation:

  R_yu(τ) = Σ_{k=0}^{∞} g0(k) R_u(τ − k)

PeSi/3/3
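For a white input (R_u(τ) = σ_u² δ(τ)) the Wiener-Hopf equation reduces to ĝ(τ) = R̂_yu(τ)/σ_u², so the impulse response can be read off from the sample cross covariance. A pure-Python sketch on a simulated FIR system (the system g0 and the noise level are made up for illustration):

```python
import random

random.seed(1)
N = 50000
g0 = [0.5, 1.0, -0.3]                       # true (finite) impulse response
sigma_u2 = 1.0                              # variance of the white input
u = [random.gauss(0.0, 1.0) for _ in range(N)]
v = [0.1 * random.gauss(0.0, 1.0) for _ in range(N)]   # disturbance

# y(t) = sum_k g0(k) u(t-k) + v(t)
y = [sum(g0[k] * u[t - k] for k in range(len(g0)) if t - k >= 0) + v[t]
     for t in range(N)]

def cross_cov(y, u, tau):
    """Sample cross covariance R_yu(tau) = (1/N) sum_t y(t) u(t-tau),
    for zero-mean signals."""
    return sum(y[t] * u[t - tau] for t in range(tau, len(y))) / len(y)

g_hat = [cross_cov(y, u, tau) / sigma_u2 for tau in range(5)]
# g_hat[0..2] approximate g0; g_hat[3], g_hat[4] are close to 0
```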

Wiener-Hopf: R_yu(τ) = Σ_{k=0}^{∞} g0(k) R_u(τ − k)

If u(t) is a white noise signal, then R_u(τ) = σ_u² δ(τ), so

  g0(τ) = R_yu(τ) / σ_u²   and   ĝ(τ) = R̂_yu(τ) / σ_u²

How do we compute the estimator R̂_yu(τ) for the cross covariance from N measurements?
The sample covariance function

  R̂ᴺ_yu(τ) = (1/N) Σ_{t=τ}^{N} y(t) u(t − τ)

is asymptotically unbiased (so for N → ∞).

PeSi/3/4

And what if u(t) is not a white noise signal?

Option 1: Estimate the autocovariance, e.g. with

  R̂ᴺ_u(τ) = (1/N) Σ_{t=τ}^{N} u(t) u(t − τ)

and solve the linear set of M equations for ĝ(k):

  R̂ᴺ_yu(τ) = Σ_{k=0}^{M} ĝ(k) R̂ᴺ_u(τ − k)

Not so favorable ....

PeSi/3/5

And what if u(t) is not a white noise signal (continued)?

Option 2: Filter input and output with a pre-whitening filter L(z).
Suppose we know a filter L(z) such that u_F(k) = L(z) u(k) is a white noise signal.
Apply this filter to the output as well (y_F(k) = L(z) y(k)), then

  y_F(t) = G0(z) u_F(t) + L(z) v(t)

so the impulse response ĝ(k) can also be estimated from u_F(k) and y_F(k).

OK, but how do we find such a pre-whitening filter L(z)?

PeSi/3/6

Finding a pre-whitening filter L(z) for u_F(k) = L(z) u(k):

Try to use a linear model of order n (default in ident: n = 10),

  L(z) = 1 + a1 z⁻¹ + a2 z⁻² + ... + an z⁻ⁿ,

and look for a best fit of the n parameters such that e.g.

  V_N = (1/N) Σ_{k=1}^{N} u_F²(k)

is minimised. (Exercise 3.)

PeSi/3/7
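For a first-order filter (n = 1) this minimisation can be written out in closed form, which makes a small pure-Python sketch possible (the AR(1) input and the order n = 1 are illustration choices; ident defaults to n = 10):

```python
import random

# Generate a non-white input: u(k) = 0.8 u(k-1) + w(k)  (AR(1) process)
random.seed(2)
N = 20000
w = [random.gauss(0.0, 1.0) for _ in range(N)]
u = [0.0] * N
for k in range(1, N):
    u[k] = 0.8 * u[k - 1] + w[k]

# Pre-whitening filter of order n = 1: u_F(k) = u(k) + a1*u(k-1).
# Minimising V_N = (1/N) sum u_F(k)^2 over a1 gives the closed-form LSE:
num = sum(u[k] * u[k - 1] for k in range(1, N))
den = sum(u[k - 1] ** 2 for k in range(1, N))
a1 = -num / den                          # close to -0.8 for this input

uF = [u[k] + a1 * u[k - 1] for k in range(1, N)]
# uF is (approximately) white: its lag-1 covariance is near 0
lag1 = sum(uF[k] * uF[k - 1] for k in range(1, len(uF))) / len(uF)
```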

An example (1): The piezo mechanism.

Warning: All equations starting from y(t) = G0(z) u(t) + v(t) do not account for offsets due to non-zero means in input and/or output. So detrend!

  load piezo;
  Ts = 33e-6;
  u = piezo(:,2);
  y = piezo(:,1);
  piezo = iddata(y,u,Ts);
  piezod = detrend(piezo);
  plot(piezod);

[Figure: input and output signals of the detrended piezo data.]

PeSi/3/8

Impulse response of the piezo mechanism.

  [ir,R,cl] = cra(piezod,200,10,2);

[Figure: "Covf for filtered y", "Covf for prewhitened u", "Correlation from u to y (prewh)" and "Impulse response estimate".]

- Upper right: u is indeed whitened.
- Lower right: The impulse response is causal.
- The horizontal axes count the time samples, so the values should be scaled with Ts = 33 µs.

PeSi/3/9

Second example: Known system

Ident manual page 2-15:

  y_k − 1.5 y_{k−1} + 0.7 y_{k−2} = u_{k−1} + 0.5 u_{k−2} + e_k − e_{k−1} + 0.2 e_{k−2}

So

  G0(q) = (q⁻¹ + 0.5 q⁻²) / (1 − 1.5 q⁻¹ + 0.7 q⁻²)   or   G0(z) = (z + 0.5) / (z² − 1.5 z + 0.7)

and

  H0(q) = (1 − q⁻¹ + 0.2 q⁻²) / (1 − 1.5 q⁻¹ + 0.7 q⁻²)   or   H0(z) = (z² − z + 0.2) / (z² − 1.5 z + 0.7)

Simulation:
  N = 4096
  T = 1 s, fs = 1 Hz
  u(t) = binary signal in the frequency band 0..0.3 fs
  e(t) = white noise (random signal) with variance 1

PeSi/3/10

Impulse response of known system

[Figure: impulse response estimates compared with the exact G0: cra with n = 10, and cra with n = 1 (not enough pre-whitening).]

PeSi/3/11
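The exact impulse response used as the reference curve above can be reproduced by iterating the difference equation with a unit pulse at the input and e = 0. A pure-Python sketch (illustrative; the lectures use MATLAB):

```python
# Impulse response of G0(z) = (z + 0.5)/(z^2 - 1.5 z + 0.7):
# iterate y(k) = 1.5 y(k-1) - 0.7 y(k-2) + u(k-1) + 0.5 u(k-2)
# with a unit pulse at the input and the noise e set to zero.
def impulse_response(n):
    u = [1.0] + [0.0] * (n - 1)          # unit pulse at k = 0
    y = [0.0] * n
    for k in range(n):
        y[k] = ((1.5 * y[k - 1] if k >= 1 else 0.0)
                - (0.7 * y[k - 2] if k >= 2 else 0.0)
                + (u[k - 1] if k >= 1 else 0.0)
                + (0.5 * u[k - 2] if k >= 2 else 0.0))
    return y

g = impulse_response(40)
# g[0] = 0 (no direct feedthrough), g[1] = 1, g[2] = 1.5 + 0.5 = 2.0, ...
```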

Spectral analysis

Ident manual: Tutorial pages 3-10,15,16; Function reference etfe (4-53,54), spa (4-193..4-195).

  y(t) = G0(z) u(t) + v(t)

Fourier transform (without v): Y(ω) = G0(e^{iωT}) U(ω), so

  G0(e^{iωT}) = Y(ω) / U(ω)

Estimator for G0(e^{iωT}) using N measurements:

  Ĝ_N(e^{iωT}) = Y_N(ω) / U_N(ω)

Effect of v:

  Ĝ_N(e^{iωT}) = G0(e^{iωT}) + V_N(ω) / U_N(ω)

PeSi/3/12

The estimator Ĝ_N
(a) is unbiased.
(b) has an asymptotic variance Φ_v(ω) / ((1/N) |U_N(ω)|²), unequal to 0!
(c) is asymptotically uncorrelated for different frequencies ω.

Difficulty: For N → ∞ there is more data, but there are also estimators at more (= N/2) frequencies, all with a finite variance.

Solutions:
1. Define a fixed period N0 and consider an increasing number of measurements N = r N0 with r → ∞. Carry out the spectral analysis for each period and compute the average to obtain a good estimator in N0/2 frequencies.
2. Smoothen the spectrum in the f-domain.

PeSi/3/13
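Solution 1 can be sketched in a few lines of pure Python: with a periodic input, one spectral estimate is computed per period and the estimates are averaged, so the variance drops roughly as 1/r (the cosine input, gain G0 = 2 and noise level are made-up illustration values):

```python
import cmath, math, random

def dft_at(x, l):
    """Single DFT coefficient X(w_l) = sum_k x(k) e^{-2*pi*i*l*k/N}."""
    N = len(x)
    return sum(x[k] * cmath.exp(-2j * cmath.pi * l * k / N) for k in range(N))

random.seed(3)
N0, r, l = 64, 50, 5                    # period length, number of periods, bin
u1 = [math.cos(2 * math.pi * l * k / N0) for k in range(N0)]  # one period
u = u1 * r                              # periodic input, r periods
y = [2.0 * uk + 0.5 * random.gauss(0.0, 1.0) for uk in u]     # G0 = 2 + noise

# One spectral estimate per period, then average over the r periods:
G_blocks = [dft_at(y[p * N0:(p + 1) * N0], l) / dft_at(u1, l)
            for p in range(r)]
G_avg = sum(G_blocks) / r               # much less noisy than one block
```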

An example: 32768 data samples of the output signal (position) of the piezo mechanism.

Split the DFT in parts: DFT with 16384, and 2048 and 256 frequencies.

[Figure: DFT of the output signal at decreasing frequency resolution.]

PeSi/3/14

Filter the DFT with a Hamming filter: the original DFT and with Hamming filters of width 8, 64 and 512.
(Signal Processing Toolbox: filters neighbouring frequencies with a cosine function.)

[Figure: DFT of the output signal after Hamming filtering of increasing width.]

PeSi/3/15

With IDENT (1): periodogram (data spectra) of input and output signals

[Figure: periodograms of output #1 and input #1 versus frequency (Hz).]

- The output signal shows a low-frequency slope of −2, which is caused by the existence of a pure integrator 1/s in the system (slope −1 in a Bode plot, and squared in the periodogram).
- The input signal is reasonably white.

PeSi/3/16

With IDENT (2): identification

ETFE (Empirical Transfer Function Estimate, manual page 4-53):
Estimate the transfer function G with Fourier transforms,

  Ĝ_N(e^{iωT}) = Y_N(ω) / U_N(ω),

in 128 (default) frequencies.
Smoothing is applied and depends on a parameter M, which equals the width of the Hamming window of a filter that is applied to input and output (small M means more smoothing).

SPA (SPectral Analysis, manual page 4-193):
Estimate the transfer function G with Fourier transforms of the covariances,

  Ĝ_N(e^{iωT}) = Φ̂_yu(ω) / Φ̂_u(ω),

in 128 (default) frequencies.
Smoothing is also applied and again depends on a parameter M, that sets the width of the Hamming window of the applied filter.

PeSi/3/17

ETFE of the piezo example

[Figure: frequency response (amplitude and phase) for window parameter M = 15, 30, 60 (*), 90 and 120.]

What is the real width of the peak near 2 kHz?

PeSi/3/18

Choice between ETFE and SPA?

- Not always straightforward to predict which method will perform best. Why not try both?
- ETFE is preferred for systems with clear peaks in the spectrum.
- SPA also estimates the spectrum of the noise v(t) = y(t) − G0(z) u(t), according to

    Φ̂_v(ω) = Φ̂_y(ω) − |Φ̂_yu(ω)|² / Φ̂_u(ω)

- A measure of the signal to noise ratio is the coherence spectrum

    κ_yu(ω) = √( |Φ̂_yu(ω)|² / (Φ̂_u(ω) Φ̂_y(ω)) )

PeSi/3/19

SPA of the piezo example

[Figure: frequency response (amplitude and phase) for window parameter M = 15, 30, 60, 90 (*) and 120.]

PeSi/3/20

Second example: Spectral analysis of the known system

Ident manual page 2-15:

  y_k − 1.5 y_{k−1} + 0.7 y_{k−2} = u_{k−1} + 0.5 u_{k−2} + e_k − e_{k−1} + 0.2 e_{k−2}

So

  G0(q) = (q⁻¹ + 0.5 q⁻²) / (1 − 1.5 q⁻¹ + 0.7 q⁻²)   or   G0(z) = (z + 0.5) / (z² − 1.5 z + 0.7)

and

  H0(q) = (1 − q⁻¹ + 0.2 q⁻²) / (1 − 1.5 q⁻¹ + 0.7 q⁻²)   or   H0(z) = (z² − z + 0.2) / (z² − 1.5 z + 0.7)

Simulation:
  N = 4096
  T = 1 s, fs = 1 Hz
  u(t) = binary signal in the frequency band 0..0.3 fs
  e(t) = white noise (random signal) with variance 1

[Figure: frequency response (magnitude and phase) of the exact G0 compared with the ETFE (M = 30) and SPA (M = 30) estimates.]

PeSi/3/21

Spectra: periodic input, r periods.

[Figure: frequency response estimates for r = 1, 4 and 16 periods.]

PeSi/3/22

Smoothen the ETFE: filtering with a Hamming window of width w.

[Figure: frequency response estimates for w = 1, 2 and 4.]

PeSi/3/23

Intermezzo: Linear regression and Least squares estimate

Going from many to just a few parameters: a first step.
Idea: Try to recognise features in the data.
- Immediately in u(k) and y(k)???
- In the spectral models: Are there features, e.g. peaks, as are expected in the Bode plots / FRF (eigenfrequency, ...) of a system with a complex pole pair?
- In the impulse response (measured or identified):
  (1) Recognise features (settling time, overshoot, ...).
  (2) Realisation algorithms (to be discussed next).

PeSi/3/24

Regression:
Prediction of variable y on the basis of information provided by other measured variables φ1, ..., φd.

Collect φ = [φ1 ... φd]ᵀ.

Problem: find a function of the regressors g(φ) that minimises the difference y − g(φ) in some sense.
So ŷ = g(φ) should be a good prediction of y.
Example in a stochastic framework: minimise E [y − g(φ)]².

PeSi/4/1

Linear regression:

The regression function g(φ) is parameterised. It depends on a set of parameters θ = [θ1 ... θd]ᵀ.

Special case: the regression function g(φ) is linear in the parameters θ.
Note that this does not imply any linearity with respect to the variables from φ.
Special case: g(φ) = θ1 φ1 + θ2 φ2 + ... + θd φd, so g(φ) = φᵀ θ.

PeSi/4/2

Linear regression Examples:

Linear fit y = ax + b:
g(φ) = φᵀ θ with input vector φ = [x 1]ᵀ and parameter vector θ = [a b]ᵀ, so g(φ) = [x 1] [a; b].

Quadratic function y = c2 x² + c1 x + c0:
g(φ) = φᵀ θ with input vector φ = [x² x 1]ᵀ and parameter vector θ = [c2 c1 c0]ᵀ, so g(φ) = [x² x 1] [c2; c1; c0].

[Figure: data points with a linear fit and with a quadratic fit.]

PeSi/4/3
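A worked pure-Python version of the linear fit example (the data values are made up; the 2-by-2 system is the standard normal equations, derived on the following slides):

```python
# Linear fit y = a*x + b written as g(phi) = phi^T theta with phi = [x, 1].
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8, 11.0]          # roughly y = 2x + 3

# Entries of the normal equations for theta = [a, b]:
Sxx = sum(x * x for x in xs); Sx = sum(xs); S1 = float(len(xs))
Sxy = sum(x * y for x, y in zip(xs, ys)); Sy = sum(ys)

# Solve the 2x2 system [Sxx Sx; Sx S1][a; b] = [Sxy; Sy] by Cramer's rule:
det = Sxx * S1 - Sx * Sx
a = (S1 * Sxy - Sx * Sy) / det           # -> 1.97
b = (Sxx * Sy - Sx * Sxy) / det          # -> 3.06
```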

Linear least-squares estimate (1):

Least-squares estimate (LSE):
N measurements y(t), φ(t), t = 1, ..., N.

Minimise

  V_N(θ) = (1/N) Σ_{t=1}^{N} [y(t) − g(φ(t))]²

So a suitable θ is θ̂_N = arg min_θ V_N(θ).

Linear case:

  V_N(θ) = (1/N) Σ_{t=1}^{N} [y(t) − φᵀ(t) θ]²

is a quadratic function of θ. It can be minimised analytically: all partial derivatives ∂V_N(θ)/∂θ have to be zero in the minimum:

  (1/N) Σ_{t=1}^{N} 2 φ(t) [y(t) − φᵀ(t) θ] = 0

The solution of this set of equations is the parameter estimate θ̂_N.

PeSi/4/4

Linear least-squares estimate (2):

A global minimum is found for θ̂_N that satisfies a set of linear equations, the normal equations:

  [ (1/N) Σ_{t=1}^{N} φ(t) φᵀ(t) ] θ̂_N = (1/N) Σ_{t=1}^{N} φ(t) y(t)

If the matrix on the left is invertible, the LSE is

  θ̂_N = [ (1/N) Σ_{t=1}^{N} φ(t) φᵀ(t) ]⁻¹ (1/N) Σ_{t=1}^{N} φ(t) y(t)

PeSi/4/5

Linear least-squares estimate Matrix formulation:

Collect the output measurements in the vector Y_N = [y(1); ...; y(N)] and the inputs in the N × d regression matrix Φ_N = [φᵀ(1); ...; φᵀ(N)].

Cost function (in vector form): V_N = (1/N) ||Y_N − Φ_N θ||₂².

Normal equations: Φ_Nᵀ Φ_N θ̂_N = Φ_Nᵀ Y_N.

Estimate θ̂_N = Φ_N⁺ Y_N with the (Moore-Penrose) pseudoinverse of Φ_N: Φ_N⁺ = (Φ_Nᵀ Φ_N)⁻¹ Φ_Nᵀ.

Note: Φ_N⁺ Φ_N = I.

PeSi/4/6, 4/7
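The matrix formulation θ̂_N = (Φ_Nᵀ Φ_N)⁻¹ Φ_Nᵀ Y_N can be sketched in pure Python with a small Gaussian-elimination solver standing in for MATLAB's backslash (helper names solve/lse are ad hoc; data is noise-free for illustration):

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]   # augmented matrix
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(M[r][i]))
        M[i], M[p] = M[p], M[i]
        for r in range(i + 1, n):
            f = M[r][i] / M[i][i]
            for c in range(i, n + 1):
                M[r][c] -= f * M[i][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][c] * x[c] for c in range(i + 1, n))) / M[i][i]
    return x

def lse(Phi, Y):
    """theta = (Phi^T Phi)^{-1} Phi^T Y via the normal equations."""
    d = len(Phi[0])
    PtP = [[sum(r[i] * r[j] for r in Phi) for j in range(d)] for i in range(d)]
    PtY = [sum(r[i] * y for r, y in zip(Phi, Y)) for i in range(d)]
    return solve(PtP, PtY)

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
Phi = [[x, 1.0] for x in xs]            # regression matrix for y = a x + b
Y = [2.0 * x + 3.0 for x in xs]         # exact data, no noise
theta = lse(Phi, Y)                     # -> [a, b] = [2.0, 3.0]
```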

Linear least-squares estimate in Matlab:

Solution x of an overdetermined system Ax = b with a rectangular matrix A:
- more equations than unknowns, or
- more rows than columns, or
- A is m-by-n with m > n and full rank n.

Then the least squares solution is x̂ = A⁺ b.

In Matlab:
  x = A\b;               % Preferred
  x = pinv(A)*b;
  x = inv(A'*A)*A'*b;

PeSi/4/8

Example 1: The well-known linear fit y = ax + b

Measurements x_i and y_i for i = 1, ..., N.

1) Manual solution:

Cost function V_N = (1/N) Σ_i (y_i − a x_i − b)².

∂V_N/∂a = 0 and ∂V_N/∂b = 0, so

  Σ x_i (y_i − a x_i − b) = 0
  Σ (y_i − a x_i − b) = 0

or

  [ Σ x_i²  Σ x_i ] [ a ]   [ Σ x_i y_i ]
  [ Σ x_i   Σ 1   ] [ b ] = [ Σ y_i     ]

Parameter estimate:

  [ a ]   [ Σ x_i²  Σ x_i ]⁻¹ [ Σ x_i y_i ]
  [ b ] = [ Σ x_i   Σ 1   ]   [ Σ y_i     ]

PeSi/4/9

Example 1 (continued): 2) Matrix solution:

  Y_N = [ y(1); ...; y(N) ],   Φ_N = [ x(1) 1; ...; x(N) 1 ]   and   θ = [ a; b ]

Cost function in vector form: V_N = (1/N) ||Y_N − Φ_N θ||₂² with best fit θ̂_N = Φ_N⁺ Y_N.

Estimate θ̂_N = Φ_N⁺ Y_N = (Φ_Nᵀ Φ_N)⁻¹ Φ_Nᵀ Y_N. In Matlab: theta = Phi\Y;

PeSi/4/10

Example 2: Pre-whitening filter L(z) for correlation analysis

Find a linear filter of order n (e.g. n = 10)

  L(z) = 1 + a1 z⁻¹ + a2 z⁻² + ... + an z⁻ⁿ

that filters a given signal u(t) with u_F(k) = L(z) u(k), such that

  V_N = (1/N) Σ_{k=1}^{N} u_F²(k)

is minimised, with

  u_F(k) = u(k) + a1 u(k−1) + a2 u(k−2) + ... + an u(k−n).

PeSi/4/11

For both solutions substitute u_F(k) in V_N = (1/N) Σ_{k=1}^{N} u_F²(k):

1) Partial derivatives can be computed, with ∂V_N/∂a_i = 0 for all i = 1, 2, ..., n.

2) Vector form:

  V_N = (1/N) || [ u(n+1) + a1 u(n)   + a2 u(n−1) + ... + an u(1)  ;
                   u(n+2) + a1 u(n+1) + a2 u(n)   + ... + an u(2)  ;
                   ... ;
                   u(N)   + a1 u(N−1) + a2 u(N−2) + ... + an u(N−n) ] ||₂²

Parameter vector θ = [a1 a2 ... an]ᵀ.
Recognise Y_N and Φ_N, then compute the best fit θ̂_N = Φ_N⁺ Y_N.

PeSi/4/12

Example 3: An ARX fit y_k = (B(z)/A(z)) u_k + (1/A(z)) e_k   (ARX models will be discussed in one of the next chapters)

Measurements u_k and y_k for k = 1, ..., N.

Consider A(z) = 1 + a1 z⁻¹ and B(z) = b1 z⁻¹ + b2 z⁻², so

  (1 + a1 z⁻¹) y_k = (b1 z⁻¹ + b2 z⁻²) u_k + e_k.

Cost function V_N = (1/N) Σ (y_k + a1 y_{k−1} − b1 u_{k−1} − b2 u_{k−2})².

Matrix solution: Collect θ = [ a1; b1; b2 ],  Y_N = [ y(3); ...; y(N) ]  and

  Φ_N = [ −y(2)    u(2)    u(1)   ;
          ... ;
          −y(N−1)  u(N−1)  u(N−2) ]

Estimate θ̂_N = (Φ_Nᵀ Φ_N)⁻¹ Φ_Nᵀ Y_N.

PeSi/4/13

Least-squares estimate (LSE) from normal equations Summary:

- Write the cost function as V_N = (1/N) ||Y_N − Φ_N θ||₂².
- Recognise Y_N and Φ_N.
- Compute the LSE from the normal equations

    Φ_Nᵀ Φ_N θ̂_N = Φ_Nᵀ Y_N

  with pseudo-inverse Φ_N⁺ = (Φ_Nᵀ Φ_N)⁻¹ Φ_Nᵀ, so θ̂_N = Φ_N⁺ Y_N, or PHIN\YN.

Is that all? Often yes, but ...

PeSi/4/14

Least-squares estimate (LSE) from normal equations Pitfalls:

- Analytical difficulty: What if Φ_Nᵀ Φ_N is singular?
- Numerical difficulty: What if Φ_Nᵀ Φ_N is near-singular?
- How to solve the normal equations anyhow?
- How about the accuracy of the estimate θ̂_N?

Specific answers can be given for easy problems like y = ax + b.
For a more generic treatment we will look at the Singular Value Decomposition (svd).

PeSi/4/15
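The singular case is easy to produce. A pure-Python sketch using the regressors of the y = a1 x + a2 (10 − x) + b example treated below (det3 is an ad-hoc helper):

```python
# The regressors x, (10 - x) and 1 are linearly dependent: (10 - x) = 10*1 - x,
# so Phi^T Phi is singular and the normal equations have no unique solution.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
Phi = [[x, 10.0 - x, 1.0] for x in xs]

d = 3
PtP = [[sum(r[i] * r[j] for r in Phi) for j in range(d)] for i in range(d)]

def det3(M):
    """Determinant of a 3x3 matrix (cofactor expansion along the first row)."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
          - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
          + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

print(det3(PtP))   # 0 up to rounding: the LSE is not unique
```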

Singular Value Decomposition (svd):

  Φ = U Σ Vᵀ

- Φ is an N × d matrix with N (rows) > d (columns).
- U is an orthogonal N × N matrix.
- V is an orthogonal d × d matrix.
- Σ is an N × d matrix with the singular values Σ_ii = σ_i ≥ 0 (i = 1, 2, ..., d) and 0 elsewhere.
- Usually the singular values are sorted: σ1 ≥ σ2 ≥ ... ≥ σd ≥ 0.

PeSi/4/16

Svd and (near) singularity:

- The condition number of matrix Φ is the ratio of the largest and smallest singular values, σ1/σd.
- If all σ_i > 0 (i = 1, 2, ..., d), then the condition number is finite and ΦᵀΦ is invertible. Matrix Φ has full (column) rank.
- If σ_i = 0 for i = r+1, r+2, ..., d, the condition number is infinite and ΦᵀΦ is singular; r is the rank of matrix Φ.
- Large condition numbers indicate near-singularity of Φ_Nᵀ Φ_N and will lead to numerical inaccuracies.

PeSi/4/17

Svd and condition number (1):

Consider the quadratic function y = c2 x² + c1 x + c0.

Regression matrix

  Φ_N = [ x²(1) x(1) 1 ; ... ; x²(N) x(N) 1 ]

depends only on x(i).

[Figure: two quadratic fits.]

Left plot:
  svd(phi) = [160.3; 5.2; 1.2]
  cond(phi) = 131
  phi\y = [0.43; 2.33; 2.25]
  Exact: [0.40; 2.00; 3.00]

Right plot:
  svd(phi) = [842.5; 9.1; 0.1]
  cond(phi) = 6364
  phi\y = [0.43; 10.9; 63.8]
  Exact: [0.40; 10.0; 57.0]

PeSi/4/18

Svd and condition number (2):

Consider the linear functions y = ax + b and y = a1 x + a2 (10 − x) + b.

Regression matrices

  Φ_N = [ x(1) 1 ; ... ; x(N) 1 ]   and   Φ_N = [ x(1) 10−x(1) 1 ; ... ; x(N) 10−x(N) 1 ]

Left plot (y = ax + b):
  svd(phi) = [19.8; 1.75]
  cond(phi) = 11.3
  phi\y = [1.97; 2.76]
  Exact: [2.00; 3.00]

Right plot (y = a1 x + a2 (10 − x) + b):
  svd(phi) = [23.7; 14.8; 0.0]
  cond(phi) = 5.7·10¹⁶
  phi\y = [1.70; 0.28; 0]
  (Warning: Rank deficient, rank = 2 tol = 4.8e-14.)
  Exact: [2.00; 0.00; 3.00]

PeSi/4/19

Svd and (near) singularity (2):

Step 1: Split Σ in Σ1 with the non-zero singular values and Σ2 ≈ 0:

  Φ = U Σ Vᵀ = [ U1 U2 ] [ Σ1 0 ; 0 Σ2 ] [ V1 V2 ]ᵀ

- Φ is an N × d matrix with N (rows) > d (columns).
- U is an orthogonal N × N matrix. U1 are the first r columns.
- V is an orthogonal d × d matrix. V1 are the first r columns.
- Σ1 is an r × r diagonal matrix with the nonzero singular values σ1 ≥ σ2 ≥ ... ≥ σr > 0. The rest of the matrix Σ is zero.

PeSi/4/20

Step 1 (continued): Then only U1, Σ1 and V1 contribute to Φ, so

  Φ = U Σ Vᵀ = U1 Σ1 V1ᵀ

Step 2: Compute the (pseudo) inverse by only considering this part:

  Φ⁺ = V1 Σ1⁻¹ U1ᵀ   with   Σ1⁻¹ = diag(1/σ1, 1/σ2, ..., 1/σr).

Then the LSE is

  θ̂_N = V1 Σ1⁻¹ U1ᵀ Y_N

This is the output of the MATLAB command pinv(phi)*yn.
Note: If the rank r of Φ is less than d, the solution is not unique.

PeSi/4/21

SVD and LSE example

Consider again the linear function y = a1 x + a2 (10 − x) + b with regression matrix Φ_N = [ x(1) 10−x(1) 1 ; ... ; x(N) 10−x(N) 1 ]:

  [u,s,v]=svd(phi);
  r=1:2; % rank
  u1=u(:,r); s1=s(r,r); v1=v(:,r);
  % Compute pinv(phi)*y:
  v1*diag(1./diag(s1))*u1'*y = [ 1.68; -0.29; 0.14 ]
  v2=v(:,3) = [ 0.099; 0.099; -0.99 ]

The parameters can be modified with any multiple of V2, e.g. into [1.97; 0; 2.76].

PeSi/4/22

SVD and LSE summary part 1

- Singular Value Decomposition (svd) can provide the numerical tools to compute the linear least squares estimate.
- It can detect a (near) singular regression matrix. The pseudo inverse of the regression matrix Φ_N can always be computed reliably.
- r parameters or linear combinations of parameters are obtained. In case of non-uniqueness all solutions can be found by taking into account matrix V2.

What is left? The accuracy of the estimate θ̂_N. This will be discussed after some system identification topics .....

PeSi/4/23
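The non-uniqueness via V2 can be verified directly: Φ v2 = 0, so shifting the parameters along v2 leaves the fit unchanged. A pure-Python sketch (theta_a is the shifted solution [1.97; 0; 2.76] quoted on the slide; alpha and the x grid are arbitrary illustration values):

```python
import math

# For y = a1*x + a2*(10 - x) + b the regression matrix has a null space:
# Phi v2 = 0 for v2 proportional to [1, 1, -10], which matches the V2 column
# [0.099; 0.099; -0.99] on the slide.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
Phi = [[x, 10.0 - x, 1.0] for x in xs]

n = math.sqrt(1.0 + 1.0 + 100.0)
v2 = [1.0 / n, 1.0 / n, -10.0 / n]       # unit null vector

residual = [sum(p * v for p, v in zip(row, v2)) for row in Phi]   # Phi v2 = 0

theta_a = [1.97, 0.0, 2.76]              # parameter vector from the slide
alpha = 0.3
theta_b = [t + alpha * v for t, v in zip(theta_a, v2)]
y_a = [sum(p * t for p, t in zip(row, theta_a)) for row in Phi]
y_b = [sum(p * t for p, t in zip(row, theta_b)) for row in Phi]
# y_a == y_b up to rounding: the data cannot distinguish theta_a from theta_b
```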

Approximate realisations (not part of ident)

Given: the impulse response g(k) for k = 0, 1, ..., ∞ (Markov parameters) of a finite-dimensional linear time invariant system.

Wanted: a state space model of minimal order, like

  x(k+1) = A x(k) + B u(k)
  y(k)   = C x(k) + D u(k)

Observations:

  G(z) = D + C (zI − A)⁻¹ B = Σ_{k=0}^{∞} g(k) z⁻ᵏ   and   g(k) = C A^{k−1} B   (k = 1, 2, ...)

PeSi/5/1

Algorithm by Ho & Kalman (1966) to construct an exact model (A, B, C, D) from the impulse response.

Compose the Hankel matrix H:

  H_{nr,nc} = [ g(1)    g(2)      g(3)      ...  g(nc)
                g(2)    g(3)      g(4)      ...  g(nc+1)
                g(3)    g(4)      g(5)      ...  g(nc+2)
                ...
                g(nr)   g(nr+1)   g(nr+2)   ...  g(nr+nc−1) ]

Then

  H_{nr,nc} = [ C ; CA ; ... ; CA^{nr−1} ] [ B  AB  ...  A^{nc−1}B ]

so rank(H) is the minimal order n (provided nr and nc are sufficiently large).

PeSi/5/2

To determine the rank n and to compute the decomposition, singular value decomposition (SVD) will be applied:

  H = U_n Σ_n V_nᵀ

with U_n and V_n unitary matrices (U_nᵀ U_n = V_nᵀ V_n = I_n) and Σ_n a diagonal matrix with the positive singular values σ1 ≥ σ2 ≥ ... ≥ σn > 0 on the diagonal.
Note that zero singular values are removed in this singular value decomposition.

Then also H = (U_n Σ_n^{1/2}) (Σ_n^{1/2} V_nᵀ), and compute

- B from the first column of H2 = Σ_n^{1/2} V_nᵀ, and
- C from the first row of H1 = U_n Σ_n^{1/2}.

PeSi/5/3

A is Σ_n^{−1/2} U_nᵀ H̄ V_n Σ_n^{−1/2}, using the shifted Hankel matrix

  H̄_{nr,nc} = [ g(2)      g(3)      g(4)      ...  g(nc+1)
                g(3)      g(4)      g(5)      ...  g(nc+2)
                g(4)      g(5)      g(6)      ...  g(nc+3)
                ...
                g(nr+1)   g(nr+2)   g(nr+3)   ...  g(nr+nc) ]

PeSi/5/4

g(nc + 1)
g(nc + 2)
g(nc + 3)
..
g(nr + nc)
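The Ho & Kalman steps can be sketched compactly (Python/NumPy; the Markov parameters are generated from the difference equation of the example system used on the following slides):

```python
import numpy as np

# Markov parameters of G0(z) = (z + 0.5) / (z^2 - 1.5 z + 0.7), i.e. the
# impulse response of y(k) = 1.5 y(k-1) - 0.7 y(k-2) + u(k-1) + 0.5 u(k-2)
Ng = 40
g = np.zeros(Ng)
for k in range(1, Ng):
    g[k] = (1.5 * g[k-1] - (0.7 * g[k-2] if k >= 2 else 0.0)
            + (1.0 if k == 1 else 0.0) + (0.5 if k == 2 else 0.0))

nr = nc = 15
H  = np.array([[g[i + j + 1] for j in range(nc)] for i in range(nr)])
Hs = np.array([[g[i + j + 2] for j in range(nc)] for i in range(nr)])  # shifted

U, s, Vt = np.linalg.svd(H)
n = int(np.sum(s > 1e-8 * s[0]))          # minimal order, here n = 2
S_half = np.diag(np.sqrt(s[:n]))
Un, Vn = U[:, :n], Vt[:n].T

C = (Un @ S_half)[0:1, :]                 # first row of H1 = Un Sn^(1/2)
B = (S_half @ Vn.T)[:, 0:1]               # first column of H2 = Sn^(1/2) Vn^T
Si = np.diag(1.0 / np.sqrt(s[:n]))
A = Si @ Un.T @ Hs @ Vn @ Si              # uses the shifted Hankel matrix
D = np.array([[g[0]]])

# The realization reproduces the Markov parameters: g(k) = C A^(k-1) B
g_hat = [(C @ np.linalg.matrix_power(A, k - 1) @ B).item() for k in range(1, 10)]
```

This is a minimal sketch of the exact (noise-free) algorithm; the realization (A, B, C, D) is unique only up to a state transformation.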

What if there is noise and g(k) is only a finite series for k = 0, 1, ..., N?

- Construct a Hankel matrix with nc + nr = N.
- Apply SVD (H = U Σ Vᵀ) and find out how many (n) singular values are significantly nonzero.
- Next construct an approximating Hankel matrix of rank n according to Hn = Un Σn Vnᵀ, in which Un and Vn are the first n columns of U and V. Σn is the diagonal matrix with the first n (nonzero) singular values.
- Apply the original Ho & Kalman algorithm to Hn to compute B and C. For A the shifted Hankel matrix H→ with the original series g(k) is used.

PeSi/5/5

Example: The estimated impulse response (CRA) of the simulated data from

G0(z) = (z + 0.5) / (z² − 1.5z + 0.7)

SVDs: 7.410, 3.602, 0.033, 0.033, 0.023, 0.023, 0.016, 0.015, 0.009, ...

Conclusion: order n = 2.

PeSi/5/6

Identified system (with Ho & Kalman algorithm):

A = [ 0.829  0.379 ; 0.379  0.673 ],  B = [ 1.392 ; 0.968 ],
C = [ 1.3915  0.9681 ],  D = 0

As a transfer function: Ĝ(z) = (0.999z + 0.496) / (z² − 1.502z + 0.701)

Pole-zero-plot: Poles (x) and Zeros (o) of G0 and Ĝ [figure].

PeSi/5/7

Subspace identification (E.g. n4sid in ident)

- More recently developed realisation method (early 90s, KU Leuven), using directly the input and output data (so without an explicit impulse response).
- No explicit model equations needed (specify only the order n, very well suited for MIMO).
- Mathematically elegant (robust, reliable) and efficient (optimisation with linear equations).
- Not all mathematical issues of the optimisation are settled.
- The obtained solution is sub-optimal, but is well suited as an initial guess to obtain better PE-models (these will be discussed afterwards and may need non-linear iterative optimisation algorithms).

PeSi/5/8

Typical state space model structure:

x_{k+1} = A x_k + B u_k
y_k = C x_k + D u_k + v_k

- Measured input and output u_k and y_k
- Noise source v_k

Characterisation of the noise source according to the innovations form:

x_{k+1} = A x_k + B u_k + K e_k
y_k = C x_k + D u_k + e_k

- Measured input and output u_k and y_k
- White noise source e_k

Equivalent to y_k = G(z) u_k + H(z) e_k with

G(z) = C(zI − A)⁻¹B + D
H(z) = C(zI − A)⁻¹K + I

This overview is rather black box alike. So-called PEM models will be discussed in much more detail later.

PeSi/5/9

Subspace algorithm (van Overschee and de Moor (1994, 1996); ident manual pages 3-13,17 & 4-140)

Solution approach:

If the states x_k, input u_k and output y_k were known, then:
- Compute C and D with linear regression,
- compute e_k,
- compute A, B and K also with linear regression.

How do we find the states x_k?
- Determine the order n of the system: from a Singular Value Decomposition of the correct matrix.
- Then find an estimator for the states.
- Compute the state matrices as above.

PeSi/5/10
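The "if the states were known" regression step can be sketched as follows (Python/NumPy; the system matrices are made up and the data is noise-free, so K plays no role here):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical known quantities: states x_k, input u_k and output y_k of a
# 2nd-order system (matrices are illustrative, not from the slides)
A0 = np.array([[0.8, 0.2], [-0.1, 0.7]]); B0 = np.array([[1.0], [0.5]])
C0 = np.array([[1.0, -1.0]]);             D0 = np.array([[0.2]])
N = 200
u = rng.standard_normal((N, 1))
x = np.zeros((N + 1, 2))
y = np.zeros((N, 1))
for k in range(N):
    y[k] = C0 @ x[k] + D0 @ u[k]
    x[k + 1] = A0 @ x[k] + B0 @ u[k]

# Step 1: C and D from the linear regression y_k = [C D] [x_k; u_k]
Phi = np.hstack([x[:N], u])
CD = np.linalg.lstsq(Phi, y, rcond=None)[0].T      # shape (1, 3)
C_hat, D_hat = CD[:, :2], CD[:, 2:]

# Step 2: A and B from x_{k+1} = [A B] [x_k; u_k]
AB = np.linalg.lstsq(Phi, x[1:], rcond=None)[0].T  # shape (2, 3)
A_hat, B_hat = AB[:, :2], AB[:, 2:]
```

The hard part of subspace identification is estimating the states x_k in the first place; once they are available, everything reduces to linear least squares as shown.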

Example subspace identification of simulated data from

G0(z) = (z + 0.5) / (z² − 1.5z + 0.7)

Order estimation: plot of model singular values vs order (red: default choice) suggests order 2.
Pole-zero-plot: Poles (x) and Zeros (o) of G0 and Ĝn4s2 [figure].

PeSi/5/11

Example subspace identification of the piezo data

Order estimation: plot of model singular values vs order (red: default choice). Matlab 7.0.4: 4th order? (May be different for other versions.)
Pole-zero-plot: Poles (x) and Zeros (o) of Ĝn4s4 and Ĝn4s5 [figure]. The poles of Ĝn4s4 are quite unlikely as the main resonance peak is overlooked.

PeSi/5/12

Frequency response

Ĝn4s4 and Ĝn4s5 compared with a non-parametric ETFE100-model [Bode plot: amplitude and phase (deg) vs frequency (Hz)].

Poles (using MATLAB's pole and damp commands after exporting the model from ident's GUI to the workspace and transforming into a tf model):
- 0.9988: almost pure integrator
- 0.96 ± 0.27i: frequency 1.3 kHz and relative damping 0.031
- 0.82 ± 0.56i: frequency 2.9 kHz and relative damping 0.013

PeSi/5/13

Prediction Error identification Methods: Introduction

A model with the essential dynamics of a system with a finite (and limited) number of parameters is wanted for
- Simulation of the process.
- Prediction of future behaviour.
- Controller design.
- ...

Motivation:
- The applicability of non-parametric models is limited.
- Subspace models may be suboptimal.

PeSi/6/1

PEM: system description and disturbances

LTIFD (Linear Time Invariant Finite Dimensional) system

y(t) = G0(z) u(t) + v(t)

with measured input u(t) and output y(t).
[block diagram: e → H0 → v; u → G0 → + → y]

The non-measurable disturbance v(t) includes:
- measurement noise,
- process disturbances,
- effects of non-measured inputs,
- non-linearities.

Noise model: v(t) with power spectrum Φv(ω) that can be written as

v(t) = H0(z) e(t)

The white noise e(t) has a variance σe², and the transfer function H0(z) is stable, monic (H0(∞) = 1) and minimum phase (so it has a stable inverse).

PeSi/6/2

PEM: prediction and prediction error (ident manual page 3-16)

Assume the dynamic system is given as y(t) = G(z)u(t) + H(z)e(t) with observations Y_{t−1} = {y(s), s ≤ t−1} and U_{t−1} = {u(s), s ≤ t−1}.
What is the best prediction for y(t)?

Situation I: u = 0, so y(t) = v(t) = H(z)e(t) and V_{t−1} is known.

Then v(t) = e(t) + m(t−1), with m(t−1) = Σ_{k=1}^∞ h(k) e(t−k).

As H(z) has a stable inverse, also E_{t−1} is known, and then also m(t−1) = [1 − H⁻¹(z)] v(t).

The expectation for v(t) is:

v̂(t|t−1) := E{v(t)|V_{t−1}} = m(t−1).

PeSi/6/3
PeSi/6/3

Situation II: u ≠ 0, so y(t) = G(z)u(t) + v(t) and U_{t−1} and Y_{t−1} are known. Then V_{t−1} is also known.

So E{y(t)|U_{t−1}, Y_{t−1}} = G(z)u(t) + E{v(t)|V_{t−1}},
and ŷ(t|t−1) = G(z)u(t) + [1 − H⁻¹(z)] v(t).

With v(t) = y(t) − G(z)u(t) we find

the predictor            ŷ(t|t−1) = H⁻¹(z)G(z)u(t) + [1 − H⁻¹(z)] y(t),
and the prediction error ε(t) := y(t) − ŷ(t|t−1) = H⁻¹(z)[y(t) − G(z)u(t)].

Predictor models: Tuning of the transfer functions G(z) and H(z). When they both are exactly equal to the real G0(z) and H0(z), the prediction error ε(t) is a white noise signal.
In practice: tune the estimates to minimise the error ε(t) with a least squares fit.

PeSi/6/4

Model structures

Predictor model: Two rational functions (G(z), H(z)).

Model candidates from the model set M = {(G(z,θ), H(z,θ)) | θ ∈ Θ ⊂ R^d}.

Parameterisation: models are represented by real valued parameters θ from a parameter set Θ.

Examples of parameterisations:
- Coefficients of series → impulse response models
- Coefficients in state space matrices → subspace identification
- Coefficients in fractions of polynomials → PE methods

PeSi/6/5

ARX Model structure (ident manual page 3-11)  (AutoRegressive with eXogenous input)

A(z⁻¹, θ) y(t) = B(z⁻¹, θ) u(t) + e(t)

or G(z, θ) = B(z⁻¹,θ) / A(z⁻¹,θ),  H(z, θ) = 1 / A(z⁻¹,θ),

with B(z⁻¹, θ) = b₁ + b₂z⁻¹ + ... + b_{nb} z^{−(nb−1)}
     A(z⁻¹, θ) = 1 + a₁z⁻¹ + ... + a_{na} z^{−na}
and θ = [a₁ a₂ ... a_{na} b₁ b₂ ... b_{nb}]ᵀ ∈ R^{na+nb}.

Predictor: ŷ(t|t−1; θ) = B(z⁻¹, θ) u(t) + [1 − A(z⁻¹, θ)] y(t)
is linear in the parameters θ.

PeSi/6/6

Model structures

Name  | equation                                           | G(z,θ)            | H(z,θ)
ARX   | A(z⁻¹) y(t) = B(z⁻¹) u(t) + e(t)                   | B(z⁻¹,θ)/A(z⁻¹,θ) | 1/A(z⁻¹,θ)
ARMAX | A(z⁻¹) y(t) = B(z⁻¹) u(t) + C(z⁻¹) e(t)            | B(z⁻¹,θ)/A(z⁻¹,θ) | C(z⁻¹,θ)/A(z⁻¹,θ)
OE    | y(t) = [B(z⁻¹)/F(z⁻¹)] u(t) + e(t)                 | B(z⁻¹,θ)/F(z⁻¹,θ) | 1
FIR   | y(t) = B(z⁻¹) u(t) + e(t)                          | B(z⁻¹,θ)          | 1
BJ    | y(t) = [B(z⁻¹)/F(z⁻¹)] u(t) + [C(z⁻¹)/D(z⁻¹)] e(t) | B(z⁻¹,θ)/F(z⁻¹,θ) | C(z⁻¹,θ)/D(z⁻¹,θ)

PeSi/6/7
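Because the ARX predictor is linear in θ, the least squares estimate follows from a single linear regression. A sketch (Python/NumPy, with simulated second-order ARX data; coefficients chosen as in the example system of these slides):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate ARX data:
# y(t) - 1.5 y(t-1) + 0.7 y(t-2) = 1.0 u(t-1) + 0.5 u(t-2) + e(t)
N = 2000
u = rng.choice([-1.0, 1.0], size=N)          # binary excitation
e = 0.1 * rng.standard_normal(N)
y = np.zeros(N)
for t in range(2, N):
    y[t] = 1.5*y[t-1] - 0.7*y[t-2] + u[t-1] + 0.5*u[t-2] + e[t]

# Linear regression: y(t) = [-y(t-1) -y(t-2) u(t-1) u(t-2)] theta + e(t)
Phi = np.column_stack([-y[1:N-1], -y[0:N-2], u[1:N-1], u[0:N-2]])
theta = np.linalg.lstsq(Phi, y[2:N], rcond=None)[0]
# theta estimates [a1, a2, b1, b2] = [-1.5, 0.7, 1.0, 0.5]
```

Since the noise enters the true system exactly as in the ARX structure, this estimator is consistent, which is the point of the consistency discussion that follows.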

Properties of the model structures

- Linearity-in-the-parameters: if H⁻¹(z)G(z) and [1 − H⁻¹(z)] are polynomials, then the predictor ŷ is a linear function in θ.
  Result: the least squares criterium for ε(t) is quadratic in θ and there is an analytical solution for the best fit (fast algorithm, suitable as order estimator).
  OK: ARX, FIR. Not: ARMAX, OE, BJ.
- Independent parameterisation of G(z,θ) and H(z,θ): no common parameters in G and H.
  Result: independent identification of G and H.
  OK: OE, FIR, BJ. Not: ARX, ARMAX.

The structure and the order define the model set M.

PeSi/6/8

Identification criterium

Ingredients for the identification:
- the experimental data {(y(t), u(t)), t = 1, ..., N},
- the model set M,
- the identification criterium.

PE-methods: consider the prediction error ε(t, θ) = y(t) − ŷ(t|t−1, θ) for all t as a function of θ.

Least squares criterium: minimisation of the scalar function of ε(t, θ):

VN = (1/N) Σ_{t=1}^N ε(t, θ)²

or a filtered error: VN = (1/N) Σ_{t=1}^N (L(z) ε(t, θ))².

Instrumental variable (IV) techniques: try to obtain an uncorrelated signal ε(t, θ).

PeSi/6/9

Consistency properties of the least squares criterium

If
- the system (G0, H0) is in the chosen model set and
- Φu(ω) ≠ 0 in sufficient frequencies (sufficiently exciting input)
then G(z, θ̂N) and H(z, θ̂N) are consistent estimators.

If
- the system G0 is in the chosen model set,
- G and H are parameterised independently (FIR, OE, BJ) and
- Φu(ω) ≠ 0 in sufficient frequencies (sufficiently exciting input)
then G(z, θ̂N) is a consistent estimator.

Note: The IV-method combined with an ARX model set can also provide a consistent estimator for G0 even when the noise model is not in the chosen model set.

PeSi/6/10

Example of the consistency properties

Model system from the ident manual page 2-14:

y_k − 1.5y_{k−1} + 0.7y_{k−2} = u_{k−1} + 0.5u_{k−2} + e_k − e_{k−1} + 0.2e_{k−2}.

Or G0(z) = (z⁻¹ + 0.5z⁻²) / (1.0 − 1.5z⁻¹ + 0.7z⁻²)
and H0(z) = (1.0 − 1.0z⁻¹ + 0.2z⁻²) / (1.0 − 1.5z⁻¹ + 0.7z⁻²)

This is an ARMAX structure with
B(z⁻¹) = z⁻¹ + 0.5z⁻²
A(z⁻¹) = 1.0 − 1.5z⁻¹ + 0.7z⁻²
C(z⁻¹) = 1.0 − 1.0z⁻¹ + 0.2z⁻²

Simulation (as before):
N    = 4096
T    = 1 s
fs   = 1 Hz
u(t) = binary signal in frequency band 0..0.3fs
e(t) = white noise (random signal) with variance 1

PeSi/6/11
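The simulation of this model system can be sketched as follows (Python/NumPy; a plain random binary sequence stands in for the band-limited binary signal that ident's idinput would generate, which is an assumption, not the original setup):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 4096
u = rng.choice([-1.0, 1.0], size=N)   # stand-in for the band-limited binary input
e = rng.standard_normal(N)            # white noise with variance 1

# ARMAX recursion:
# y(k) = 1.5 y(k-1) - 0.7 y(k-2) + u(k-1) + 0.5 u(k-2)
#        + e(k) - e(k-1) + 0.2 e(k-2)
y = np.zeros(N)
for k in range(2, N):
    y[k] = (1.5*y[k-1] - 0.7*y[k-2]
            + u[k-1] + 0.5*u[k-2]
            + e[k] - e[k-1] + 0.2*e[k-2])
```

On such data an armax2221 fit recovers both G0 and H0, whereas arx221 and oe221 only satisfy the weaker consistency conditions of the previous slide.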

Parametric models:

  G0, H0      (the true system)
  armax2221   na = 2, nb = 2, nc = 2, nk = 1   G0 and H0 ∈ M.
  arx221      na = 2, nb = 2, nk = 1           only G0 ∈ M.
  oe221       nb = 2, nf = 2, nk = 1           H = 1.
  arx871      na = 8, nb = 7, nk = 1           High order guess.

Spectral analysis / frequency response of the models [Bode amplitude and phase plots vs frequency (rad/s)].

PeSi/6/12

Comparison of the same models:
Noise spectrum (|H|) [Power Spectrum plot], Pole-zero-plot [Poles (x) and Zeros (o)], Step response [plot vs time].

PeSi/6/13

Approximate modelling:
What happens if the model set does not fit?

Error: ε(t, θ) = H⁻¹(z, θ)[y(t) − G(z, θ)u(t)]
              = H⁻¹(z, θ)[(G0(z) − G(z, θ))u(t) + v(t)]

Power spectrum:

Φε(ω, θ) = [ |G0(e^iω) − G(e^iω, θ)|² Φu(ω) + Φv(ω) ] / |H(e^iω, θ)|²

with Φv(ω) = σe² |H0(e^iω)|².

Cost function: VN = (1/N) Σ_{t=1}^N ε(t, θ)².

Limit for N → ∞: VN → V̄(θ) = E ε(t, θ)².

With Parseval: V̄(θ) = (1/2π) ∫ Φε(ω, θ) dω

shows the limit to which the estimator θ̂N converges.

PeSi/6/14

Asymptotic variance

For systems that fit in the model set, it can be proven for n → ∞, N → ∞ and n/N → 0 (n ≪ N) that

var(ĜN(e^iω)) ≈ (n/N) · Φv(ω)/Φu(ω)
var(ĤN(e^iω)) ≈ (n/N) · Φv(ω)/σe² = (n/N) · |H0(e^iω)|²

PeSi/6/15

Minimising

V̄(θ) = (1/2π) ∫ [ |G0(e^iω) − G(e^iω, θ)|² Φu(ω) + Φv(ω) ] / |H(e^iω, θ)|² dω

Two mechanisms:
- Minimising |G0(e^iω) − G(e^iω, θ)|² Φu(ω) / |H(e^iω, θ)|².
- Fitting of numerator and denominator.

Observation: The way G0 is approximated by G(z, θ̂N) depends on the noise model H(z, θ) (so in fact filtering occurs).

PeSi/6/16

Example (Ljung 1999, Example 8.5, page 268)

y(t) = G0(z)u(t), with white noise input (Φu(ω) ≡ 1) and 4th order G0

G0(z) = 0.001z⁻²(10 + 7.4z⁻¹ + 0.924z⁻² + 0.1764z⁻³) / (1 − 2.14z⁻¹ + 1.553z⁻² − 0.4387z⁻³ + 0.042z⁻⁴)

Approximate models:

2nd order OE (oe221):   y(t) = (b₁z⁻¹ + b₂z⁻²)/(1 + f₁z⁻¹ + f₂z⁻²) u(t) + e(t)

2nd order ARX (arx221): (1 + a₁z⁻¹ + a₂z⁻²) y(t) = (b₁z⁻¹ + b₂z⁻²) u(t) + e(t)

Amplitude Bode-plot of G0, 2nd order OE (oe221) and 2nd order ARX (arx221) (left) [figure].

PeSi/6/17

Approximate modelling: using a fixed noise model

That means: H(e^iω, θ) = H(z) (fixed)

With Parseval we obtain the limit to which the estimator θ̂N converges now by minimising

(1/2π) ∫ [ |G0(e^iω) − G(e^iω, θ)|² Φu(ω) + Φv(ω) ] / |H(e^iω)|² dω,  or

(1/2π) ∫ |G0(e^iω) − G(e^iω, θ)|² Φu(ω) / |H(e^iω)|² dω.

Mechanism:
Find the least squares estimate G(e^iω, θ) of G0(e^iω) by applying a frequency domain weighting function Φu(ω)/|H(e^iω)|².
Example: For the OE-model H(z) = 1.

PeSi/6/18

[Amplitude Bode plots vs frequency (rad/s), left: |G|, right: |A|]

Background: In the ARX estimation there is an extra filtering with the (a priori unknown) function (right):

1/|H(e^iω, θ)|² = |A(e^iω, θ)|²

PeSi/6/19

Approximate modelling: using a prefilter L(z)

New cost function:

VN = (1/N) Σ_{t=1}^N εF(t, θ)² = (1/N) Σ_{t=1}^N (L(z) ε(t, θ))².

Limit for N → ∞: VN → V̄F(θ) = E εF(t, θ)².

Again with Parseval the limit to which the estimator θ̂N converges:

V̄F(θ) = (1/2π) ∫ { |G0(e^iω) − G(e^iω, θ)|² Φu(ω) + Φv(ω) } · |L(e^iω)|² / |H(e^iω, θ)|² dω

Observation: The way G0 is approximated by G(z, θ̂N) depends on the modified noise model L⁻¹(z)H(z, θ), of which the prefilter L(z) can be tuned by the user.

PeSi/6/20

Example (extended with prefilter for identification)

y(t) = G0(z)u(t), with PRBS input (Φu(ω) ≈ 1) and 4th order G0

G0(z) = 0.001z⁻²(10 + 7.4z⁻¹ + 0.924z⁻² + 0.1764z⁻³) / (1 − 2.14z⁻¹ + 1.553z⁻² − 0.4387z⁻³ + 0.042z⁻⁴)

Bode amplitude plots of 2nd order (approximate) estimates [figure]:
- G0
- OE, L(z) = 1
- ARX, L(z) = 1
- OE, L(z) = L1(z)
- ARX, L(z) = L2(z)

Applied 5th order Butterworth prefilters:
- L1(z): high-pass, cut-off 0.5 rad/s.
- L2(z): low-pass, cut-off 0.1 rad/s.

PeSi/6/21
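Prefiltered identification can be sketched as follows (Python/NumPy; a first-order low-pass stands in for the 5th order Butterworth filters, and a toy first-order plant stands in for G0, both assumptions for illustration only):

```python
import numpy as np

def prefilter(x, a=0.9):
    # First-order low-pass L(z) = (1 - a) / (1 - a z^-1): a simple stand-in
    # for the Butterworth prefilters mentioned above
    xf = np.zeros_like(x)
    xf[0] = (1.0 - a) * x[0]
    for t in range(1, len(x)):
        xf[t] = a * xf[t-1] + (1.0 - a) * x[t]
    return xf

rng = np.random.default_rng(3)
N = 1000
u = rng.choice([-1.0, 1.0], size=N)      # binary excitation
y = np.zeros(N)
for t in range(1, N):                    # noise-free toy plant y(t) = 0.8 y(t-1) + u(t-1)
    y[t] = 0.8 * y[t-1] + u[t-1]

# Filter input and output identically: uF = L(z) u, yF = L(z) y,
# then identify on the filtered data (here a first-order ARX fit)
uF, yF = prefilter(u), prefilter(y)
Phi = np.column_stack([-yF[:-1], uF[:-1]])
theta = np.linalg.lstsq(Phi, yF[1:], rcond=None)[0]
```

Filtering u and y with the same L(z) leaves the underlying G unchanged but reweights the fit criterion, which is exactly the mechanism described by the V̄F(θ) expression above.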

Remarks:

- Both the 2nd order OE model without prefilter and the 2nd order ARX model with prefilter L2(z) give a good model estimate at low frequencies.
- For the obtained ARX-model the Bode amplitude plots of the prefilter |L2(e^iω)| (dashed) and the ultimate weighting function |L2(e^iω)A(e^iω)| (solid) are shown [figure].
- Note that ARX models (with and without a prefilter) can be computed uniquely and quickly from a linear optimisation problem, whereas the identification of OE models involves an iterative algorithm for a nonlinear optimisation.

PeSi/6/22

- In ident's GUI a Butterworth filter of a specified order can be applied to input and output data.
- At MATLAB's command prompt any prefilter L(z) can be applied by computing uF(t) = L(z) u(t) and yF(t) = L(z) y(t), and next using the filtered data for the identification. See also e.g. help idfilt.

The estimated model depends of course on the chosen model structure. In addition the result can be tuned by
- the chosen input spectrum Φu(ω),
- the chosen prefilter L(z) and
- (when possible) the chosen noise model H(z) (actually complementary to L(z)).

V̄F(θ) = (1/2π) ∫ { |G0(e^iω) − G(e^iω, θ)|² Φu(ω) + Φv(ω) } · |L(e^iω)|² / |H(e^iω, θ)|² dω

PeSi/6/23

Approximate modelling: time domain analysis

The asymptotic estimator of an ARX model (order denominator = na, order numerator = nb − 1, delay nk = 0) with an input signal that satisfies Ru(τ) = 0 for τ ≠ 0 always gives:

- a stable G(z, θ*),
- an exact result for the first nb samples of the impulse response g0(k), k = 0, ..., nb − 1.

On the contrary, in an OE model the (total) squared difference between the impulse responses of system and model is minimised.

PeSi/6/24

Example 4th order G0:

G0(z) = 0.001z⁻²(10 + 7.4z⁻¹ + 0.924z⁻² + 0.1764z⁻³) / (1 − 2.14z⁻¹ + 1.553z⁻² − 0.4387z⁻³ + 0.042z⁻⁴)

Impulse responses of: G0, arx120, arx230, arx340, oe210 and oe320 [figures: Impulse Response vs time].

Note: nb ≤ 2 does not give meaningful results.

PeSi/6/25

What is a good model?

Depends on the goal: Is it e.g. good enough to design a controller or does it simulate the most important dynamic behaviour?

Choice of the model set

- The chosen model parameterisation, so writing [G(z,θ) and H(z,θ)] as a polynomial fraction.
- The choice of the model structure in the transfer functions [G(z,θ), H(z,θ)]: ARX / ARMAX / OE / BJ:
  - Linearity in the parameters (ARX, FIR) or not.
  - Independent parameterisation of G(z,θ) and H(z,θ) (OE, FIR, BJ) or not.
- The choice of the model complexity, so the number of parameters or the order of the polynomials.

What are the criteria to make these choices in order to obtain a good model for a fair price?

PeSi/6/26

More general: small bias and small variance.

Conflict:
- For a small bias a large model set is necessary (high order, flexible model structure).
- A small variance is obtained easier with a small number of parameters.

Large model set:
- Good fit with data (small residue)
- Large variance
- Model depends on the specific noise realisation

Small model set:
- Bad fit with data (large residue)
- Small variance
- Better distinction between stochastic and structured effects

PeSi/6/27

What makes up the price of the model?
- Price of the identification: Complexity of the algorithm, amount of work.
- Price of model usage: Order in the case of controller design or simulations.

Methodology for the choice of the model set
- A priori considerations
- Analysis of the data
- A posteriori comparison of the models
- Validation of the models

PeSi/6/28

A priori considerations

- Physical insight regarding the (minimal) order of the model.
- Physical insight regarding the nature of the noise disturbance.
- Relation between the number of data points N and the number of parameters to be estimated Nθ:
  General: Nθ ≪ N. Rule of thumb: N > 10·Nθ.

Note that the required number of points depends strongly on the signal to noise ratio.

PeSi/6/29

Analysis of the data

Information regarding the model order from:

- Non-parametric identification: (anti)resonance peaks, phase behaviour.
- Approximate realisation methods, by doing an SVD-analysis of the Hankel-matrix, or subspace identification.
- Evaluation of the rank of the Toeplitz matrix:

  R̂(n) = (1/N) Σ_{t=1}^N φn(t) φn(t)ᵀ

  with φn(t) = [y(t−1) ... y(t−n) u(t) ... u(t−n)]ᵀ.
  For noiseless data the matrix R̂(n) becomes singular when the order n is taken too large (compare with the ARX order estimator).

PeSi/6/30

A posteriori comparison of the identified models

For a chosen model structure, compare the obtained results for the criterium VN(θ̂N, Z^N) as a function of the parameter sets θ̂N of different model orders.

ARX-example (system with na = 2, nb = 3) in left graph: VN decreases for increasing number of parameters (i.e. model order), so the best fit is obtained with the most complex model!

[figures: loss function vs # of parameters, for the estimation data set (left) and the validation data set (right)]

PeSi/6/31

Solution to avoid this overfit: cross-validation:

- Split data into two parts: Z^N = Z^(1) ∪ Z^(2).
- Identify θ̂N from Z^(1) and evaluate the criterium for Z^(2) (right graph): There is a minimum VN near 5 parameters, but it is not very distinct.

[figures: loss function vs # of parameters, estimation data set (left) and validation data set (right)]

PeSi/6/32

ARX order selection

For ARX models automatic (and reliable) selection criteria exist (considering E V̄N(θ̂N, Z^N)) (ident manual page 3-70 ff, 4-183 ff):

- Akaike's Information Criterion (AIC):
  AIC = log((1 + 2n/N) · V)
- Akaike's Final Prediction Error criterion (FPE):
  FPE = ((1 + n/N) / (1 − n/N)) · V
- Rissanen's minimum description length (MDL):
  MDL = (1 + log N · n/N) · V

PeSi/6/33
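The three criteria are simple functions of the loss V, the number of parameters n and the number of data points N; a sketch with made-up loss values:

```python
import math

def aic(V, n, N):
    """Akaike's Information Criterion, as on the slide."""
    return math.log((1.0 + 2.0 * n / N) * V)

def fpe(V, n, N):
    """Akaike's Final Prediction Error criterion."""
    return (1.0 + n / N) / (1.0 - n / N) * V

def mdl(V, n, N):
    """Rissanen's Minimum Description Length."""
    return (1.0 + math.log(N) * n / N) * V

# Hypothetical loss-function values V_N for increasing numbers of parameters
N = 1000
losses = {2: 0.0120, 4: 0.0101, 6: 0.0095, 8: 0.0094, 10: 0.0094}
best_n = min(losses, key=lambda n: aic(losses[n], n, N))
```

All three add a complexity penalty to the raw loss, so once V stops decreasing (here beyond 8 parameters) the penalised criterion starts to grow again.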

ARX order estimation: 4th order ARX model system with noise

G0(z) = 0.001z⁻²(10 + 7.4z⁻¹ + 0.924z⁻² + 0.1764z⁻³) / (1 − 2.14z⁻¹ + 1.553z⁻² − 0.4387z⁻³ + 0.042z⁻⁴)

Model fit vs # of parameters (% unexplained of output variance):
with one data set (left) and with validation data (right) [figures].
Blue = AIC and MDL criterium, Red = Best fit.

PeSi/6/34

Comparison between arx432 and 4th order G0

Garx432(z) = 0.001z⁻²(10 + 7.3z⁻¹ + 0.8z⁻²) / (1 − 2.147z⁻¹ + 1.557z⁻² − 0.4308z⁻³ + 0.0371z⁻⁴)

G0(z) = 0.001z⁻²(10 + 7.4z⁻¹ + 0.924z⁻² + 0.1764z⁻³) / (1 − 2.14z⁻¹ + 1.553z⁻² − 0.4387z⁻³ + 0.042z⁻⁴)

Bode [Frequency response figure] and pole/zero plot [Poles (x) and Zeros (o) figure].

PeSi/6/35

ARX order estimation: Piezo data

Problem: The suggested arx 8 10 4 puts too much emphasis on high frequencies. The right graph shows a comparison with the non-parametric etfe 60.

First eigenfrequency ≈ 1300 Hz (relative damping ≈ 0.03).

Suggestion: filter with a low-pass filter in the frequency range 0...3788 Hz and resample with a factor 4.

PeSi/6/36

Comparison between arx542 and a (new) etfe200:

Bode [Frequency response figure] and pole/zero plot [Poles (x) and Zeros (o) figure].

Clearly more emphasis on the low-frequent behaviour (sufficient?)

PeSi/6/37

Validation of the models

The ultimate answer to the question whether the identified model is good enough, so also for the choice of the model structure and order.

Techniques:

- Comparison of model G(z, θ̂N) with previous results:
  - Non-parametric estimates (spectra, impulse response).
  - Model reduction: pole-zero-cancellation (within the relevant confidence interval) may indicate too high model order.
  - Large confidence intervals for one or more parameters may indicate too high model order (or too few data).

PeSi/6/38

- Simulation of the model: Consider the simulation error
  esim(t) = y(t) − ysim(t) = y(t) − G(z, θ̂N) u(t).
  Optimal case: esim(t) = v(t) = H0(z)e(t).
  In order to avoid overfit, cross-validation is preferable.
- Residual test: For a consistent model the prediction error ε(t) (the residue) converges to a white noise signal. Two tests:
  1. Is ε(t, θ̂N) a realisation of white noise?
     Evaluate the autocovariance R̂εN(τ) ∼ δ(τ). (Test for both Ĝ and Ĥ)
  2. Is ε(t, θ̂N) uncorrelated with the input?
     Evaluate the cross-covariance R̂εuN(τ) ≈ 0. (Test for Ĝ)

PeSi/6/39

Method 1: Autocovariance R̂εN(τ) = (1/N) Σ_{t=1}^{N−τ} ε(t + τ)ε(t) for τ ≥ 0.

Requirement: R̂εN(τ) small for τ > 0 and N large:

|R̂εN(τ)| / R̂εN(0) ≤ Nα/√N

with Nα the confidence level for the confidence interval with probability α (e.g. N95% = 1.96, N99% ≈ 3).

Method 2: Cross-covariance R̂εuN(τ) = (1/N) Σ_{t=1}^{N−τ} ε(t + τ)u(t) for τ ≥ 0.

Requirement: |R̂εuN(τ)| ≤ Nα·√(P̂)/√N, with estimator for P:

P̂N = Σ_{k=−∞}^{∞} R̂εN(k) R̂uN(k)

PeSi/6/40

Example: 4th order G0 with noise (arx442-model).
[figures: Autocorrelation of residuals for output 1; Cross correlation for input 1 and output 1 residuals, vs samples]

          System model | Noise model
arx432    OK           | OK
oe342     OK           | not OK
arx222    not OK       | not OK   (autocovariance does not fit in the plot)

PeSi/6/41
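Method 1 can be sketched as follows (Python/NumPy; the residuals are replaced by a made-up white noise sequence, so the test should mostly pass here by construction):

```python
import numpy as np

def autocovariance(eps, max_lag):
    """R_eps(tau) = (1/N) * sum_t eps(t+tau)*eps(t), for tau = 0..max_lag."""
    N = len(eps)
    return np.array([np.dot(eps[tau:], eps[:N - tau]) / N
                     for tau in range(max_lag + 1)])

rng = np.random.default_rng(4)
N = 5000
eps = rng.standard_normal(N)          # stand-in for residuals eps(t, theta_N)

R = autocovariance(eps, 20)
ratio = np.abs(R[1:]) / R[0]          # |R(tau)| / R(0) for tau > 0
bound = 1.96 / np.sqrt(N)             # 95% confidence bound N_95% / sqrt(N)
frac_inside = float(np.mean(ratio < bound))
```

For truly white residuals roughly 95% of the lags fall inside the band, so a few isolated excursions are expected and do not by themselves reject the model.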

Identification design: Introduction

The user can tune the identified model by choosing
- the model set (model structure and order),
- the experimental conditions like the number of data samples N, the sample time Ts, the input spectrum Φu(ω), etc.,
- a prefilter L(q).

Which requirements should be satisfied by the model?

- Consistency? No problem with validation and a sufficiently exciting input signal. However, is the requirement for a correct model set realistic?
- Adequate for the intended purpose: simulation, prediction, diagnostics, controller design.

PeSi/7/1

Design variables for the estimation of the transfer functions

The asymptotic estimator of the parameters minimises

V̄F(θ) = (1/2π) ∫ { |G0(e^iω) − G(e^iω, θ)|² Φu(ω) + Φv(ω) } · |L(e^iω)|² / |H(e^iω, θ)|² dω

in which a compromise between two mechanisms plays a role:

- Minimisation of (1/2π) ∫ |G0(e^iω) − G(e^iω, θ)|² Q(ω, θ) dω,
  with Q(ω, θ) = Φu(ω) |L(e^iω)|² / |H(e^iω, θ)|².
- Fitting of |H(e^iω, θ)|² with the error spectrum.

The limit depends on the model set, the prefilter L(q) and the input spectrum Φu(ω).

PeSi/7/2

Experiment design

Fitting of G0(e^iω): How can we get a fit in terms of the usual Bode plots?

- The vertical axis of the Bode amplitude plot shows log|G|, which is approximated better with a small relative error. We are minimising
  (1/2π) ∫ |(G0(e^iω) − G(e^iω, θ)) / G0(e^iω)|² |G0(e^iω)|² Q(ω, θ) dω,
  so Q(ω, θ) should be large when G0(e^iω) becomes small.
- The horizontal axis shows log ω, which means that high frequencies dominate. This can be compensated by looking at
  ∫ |G0(e^iω)|² Q(ω, θ) dω = ∫ |G0(e^iω)|² Q(ω, θ) ω d(log ω).
  So the fit at low frequencies improves if ω|G0(e^iω)|² Q(ω, θ) is larger in that frequency region.

The factor Q can be manipulated by modification of the design variables: input spectrum Φu(ω), noise model set {H(e^iω, θ)}, prefilter L(q).

PeSi/7/3

First analysis of the process (preparation):
- What are the possible input(s) and output(s) of the system?
- How are these signals being measured?
- What are usual and extreme amplitudes of the signals?
- At which frequencies do we expect essential dynamic behaviour?

Preliminary experiments (not specifically aiming at identification):
- Measurement of the output signal (noise) for a constant input signal.
- Measurement of a step response:
  → Linearity, relevant frequencies, static gain, selection of the input signal.
- Non-parametric identification (correlation and frequency analysis) with a harmonic or broadband input signal:
  → Relevant frequencies, duration of the experiment, sample frequency, selection of the input signal.

PeSi/7/4

Input signals

Quite often a broadband input signal is desirable: sufficiently exciting, and the data contains information of the system in a large range of frequencies.

Matlab-tool: u = idinput(N,type,band,levels)

Possibilities for (type):
- RGS: Random, Gaussian Signal: discrete white noise with a flat spectrum.
- RBS: Random, Binary Signal (default).
- PRBS: Pseudo-Random, Binary Signal.
- SINE: sum of harmonic signals (sine functions).

PeSi/7/5

Pseudo-Random, Binary Signal

Signal that can be generated easily by a computer algorithm using n shift registers and modulo-2 addition (⊕):

s(t) = xn(t)
xi(t+1) = x_{i−1}(t),  2 ≤ i ≤ n
x1(t+1) = a1·x1(t) ⊕ a2·x2(t) ⊕ ... ⊕ an·xn(t)

with 0⊕0 = 1⊕1 = 0 and 0⊕1 = 1⊕0 = 1.

[diagram: clock-driven shift register with states x1, x2, ..., xn, feedback coefficients a1, ..., an and output xn = s]

Given: initial condition and binary coefficients a1, ..., an.
Output: Deterministic periodic signal. Pseudo-random?

PeSi/7/6

Maximum length PRBS: The period of the PRBS is as large as possible and equals M = 2ⁿ − 1 if the coefficients are chosen correctly, for example:

n | ai = 1 | n  | ai = 1   | n  | ai = 1      | n  | ai = 1
3 | 1,3    | 7  | 1,7      | 11 | 9,11        | 15 | 14,15
4 | 1,4    | 8  | 1,2,7,8  | 12 | 6,8,11,12   | 16 | 4,13,15,16
5 | 2,5    | 9  | 4,9      | 13 | 9,10,12,13  | 17 | 14,17
6 | 1,6    | 10 | 3,10     | 14 | 4,8,13,14   | 18 | 11,18

The binary signal s(t) can be transformed into a signal u(t) with amplitude c and mean m with u(t) = m + c(−1 + 2s(t)).

For M → ∞ the autocovariance Ru(τ) converges to a white noise autocovariance:

E u(t) = m + c/M
Ru(0) = (1 − 1/M²)·c²
Ru(τ) = −(c²/M)·(1 + 1/M),  τ = 1, ..., M − 1

The binary character is advantageous in the case of (presumed) non-linearities as it has a maximum power for a limited amplitude.

PeSi/7/7

[figure: PRBS time signals for Nc = 1 and Nc = 2]

The actual clock period can also be taken equal to a multiple of the sample time: u_{Nc}(t) = u(ent(t/Nc)). The signal is stretched.
The spectrum is no longer flat and even has some frequencies with Φu(ω) = 0.

[figure: Φu(ω)/c² for Nc = 1, 3, 5, 10]

The power spectrum can be modified by filtering of the signal with a (linear) filter, but then the binary character is no longer guaranteed.

PeSi/7/8

Random Binary Signal

This is also a binary signal, but it is a stochastic signal (instead of deterministic like the PRBS).

Two classes:

1. Generated from u(t) = c·sign[R(q)u(t−1) + w(t)], in which w(t) is a stochastic white noise process, and R(q) is a stable linear filter. The power spectral density Φu(ω) depends on R(q). For R(q) ≡ 0 the output spectrum is flat.

2. A random binary signal u(t) that equals ±c, according to
   Pr(u(t) = u(t−1)) = p and
   Pr(u(t) = −u(t−1)) = 1 − p,
   in which p is the probability that the signal does not change at the next sample time (0 < p < 1).
   The power spectral density depends on p: for p = 1/2 the spectrum is flat.

PeSi/7/9

[figure: Φu(ω)/c² for p = 0.90, 0.75, 0.50]

PeSi/7/10
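A maximum length PRBS generator following the shift-register recursion above can be sketched as (Python/NumPy; the taps come from the table, the all-ones initial condition is an arbitrary nonzero choice):

```python
import numpy as np

def prbs(n, taps, n_samples):
    """Maximum-length PRBS from an n-stage shift register with modulo-2
    (XOR) feedback on the given taps (1-based, as in the table above)."""
    x = [1] * n                              # nonzero initial condition
    out = []
    for _ in range(n_samples):
        out.append(x[-1])                    # s(t) = x_n(t)
        fb = 0
        for i in taps:                       # x_1(t+1) = sum of a_i x_i (mod 2)
            fb ^= x[i - 1]
        x = [fb] + x[:-1]                    # shift: x_i(t+1) = x_{i-1}(t)
    return np.array(out)

# n = 5 with taps (2, 5) from the table: period M = 2^5 - 1 = 31
M = 2**5 - 1
s = prbs(5, (2, 5), 2 * M)

# Transform to amplitude c and mean m: u(t) = m + c*(-1 + 2 s(t))
c, m = 1.0, 0.0
u = m + c * (-1.0 + 2.0 * s)
```

With correctly chosen taps the state cycles through all 2ⁿ − 1 nonzero states, giving the maximum period and 2ⁿ⁻¹ ones per period.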

Periodic sum of harmonic signals (sine functions)

u(t) = Σ_{k=1}^r αk sin(ωk t + φk)

with a user defined set of excitation frequencies {ωk}, k = 1, ..., r and associated amplitudes {αk}, k = 1, ..., r.

Selection of the phases {φk}, k = 1, ..., r:
- For a minimal amplitude in the time domain: Schroeder-phased sines.
- Randomly chosen.

Information of the system is obtained in a limited number of frequencies.

PeSi/7/11

Sample frequency ωs = 2π/Ts

Two different situations:
- data acquisition,
- identification & application of the model.

Advice for data acquisition: collect as many data as possible.
- For a fixed total measurement time TN = N·Ts and unlimited N the sample time Ts is as small as possible.
- For a fixed (or limited) N the sample time Ts should not be too small in order to capture also the slowest responses of the system.

Rule of thumb:
- Upper limit Ts, lower limit ωs: Nyquist frequency ωN = ωs/2 > highest relevant frequency.
  For a first order system with bandwidth ωb: ωs ≈ 10ωb.
- Lower limit TN > 5-10 times the largest relevant time constant.

Make sure to apply (analog) anti-aliasing filters when needed.

PeSi/7/12

The advice for parametric identification & application of the model is different:
numerical aspects define an upper limit for ω_s.
   A continuous system with state space matrix Ac and ZOH discretisation has a
   discrete state space matrix Ad = e^{Ac·Ts}.
   For Ts → 0 it appears that Ad → I: all poles cluster near z = 1.
   The physical length of the range of the difference equation becomes smaller:
   the prediction horizon (one step ahead) becomes small.
   PE-methods also emphasise high frequencies if Ts is too small:
   for smaller Ts / larger ω_s also higher frequencies are included in the
   minimisation of

      ∫ |G0(e^{iω}) − G(e^{iω}, θ)|² Q(ω, θ) dω.

   In e.g. an ARX model it is very likely that the weighting of (too) high frequencies
   is too large (Q(ω, θ) = Φu(ω) |A(e^{iω}, θ)|²).

Rule of thumb for the upper limit for a first order system: ω_s ≈ 30ω_b.
Combined (for a first order system): 10ω_b ≲ ω_s ≲ 30ω_b.

Strategy:
   Data acquisition and preliminary experiments at a high ω_s.
   Where needed reduction of ω_s for parametric identification by applying a
   digital filter.

Parameter identification: Model parameters have physical meaning.

Data processing:
   If necessary an anti-aliasing and/or noise filter should be used during the
   measurement: analog low-pass filters.
   Outliers / spikes: visual inspection and manual removal.
   Means and drift: remove or model explicitly.
   Scaling of signals: very important for MIMO systems.
   Delays: compensate or model explicitly.
   Adaptation of the sample frequency for the identification: after digital
   anti-aliasing filtering (explicit or automatic in a MATLAB command).
   Take notice of this step when the input data is generated: do not put
   energy in a frequency band that is later removed (e.g. for a PRBS: set Nc).

Example: double pendulum with actuators (motors) driving the rotations θ1 and θ2
with torques T1 and T2.

Non-linear equations of motion
(see e.g. Dynamics of Machines (113173)):

(Figure: double pendulum with link lengths l1, l2, masses m1, m2, m3 and
inertias J1, J2.)

   M̄ [θ̈1; θ̈2] + [ −m3 l1 l2 (θ̇2² + 2θ̇1θ̇2) sin θ2 + (m2 + m3) l1 g sin θ1 + m3 l2 g sin(θ1 + θ2) ;
                    m3 l1 l2 θ̇1² sin θ2 + m3 l2 g sin(θ1 + θ2) ] = [ T1 ; T2 ]

with (reduced) mass matrix

   M̄ = [ m2 l1² + m3 (l1² + l2² + 2 l1 l2 cos θ2) + J1 + J2    m3 (l2² + l1 l2 cos θ2) + J2 ;
         m3 (l2² + l1 l2 cos θ2) + J2                           m3 l2² + J2 ]

Parameter-linear form
Parameters:
   Link lengths (l1 and l2) are known.
   Gravity g is known.
   Masses and inertias (m1, J1, m2, J2 and m3) are to be estimated.
   Measurements of the torques (T1 and T2).
   Angles known as function of time (θ1(t) and θ2(t)).

Parameter-linear form (2)
Can the equations of motion be expressed in a parameter-linear form

   τ = Φ(q, q̇, q̈) p

with
   measurements vector τ = [T1, T2]ᵀ,
   parameter vector p including m1, J1, m2, J2 and m3,
   regression matrix Φ(q, q̇, q̈) depending only on known kinematic quantities
   with q = [θ1, θ2]ᵀ?

   [ T1 ; T2 ] = Φ(q, q̇, q̈) [ m1 ; J1 ; m2 ; J2 ; m3 ]   ???

Mass m1 does not show up in the equations of motion, so it can not be
identified!

Parameter-linear form (3)

The other parameters are collected in p = [J1, J2, m2, m3]ᵀ. Then the
elements of Φ can be written as

   φ11 = θ̈1
   φ12 = θ̈1 + θ̈2
   φ13 = l1 g sin θ1 + l1² θ̈1
   φ14 = l1 l2 ((2θ̈1 + θ̈2) cos θ2 − (θ̇2² + 2θ̇1θ̇2) sin θ2)
         + l1 g sin θ1 + l2 g sin(θ1 + θ2) + l1² θ̈1 + l2² (θ̈1 + θ̈2)
   φ21 = 0
   φ22 = θ̈1 + θ̈2
   φ23 = 0
   φ24 = l1 l2 (θ̇1² sin θ2 + θ̈1 cos θ2) + l2 g sin(θ1 + θ2) + l2² (θ̈1 + θ̈2)

So matrix Φ indeed depends only on known quantities and not on any of the
parameters in p: a parameter-linear form can be obtained:

   τ = Φ p.

Experimental approach
Measure the torques τ_i = [T1, T2]ᵀ at N time instances t_i (i = 1..N) with
known angles, velocities and accelerations q_i = [θ1, θ2]ᵀ, q̇_i = [θ̇1, θ̇2]ᵀ and
q̈_i = [θ̈1, θ̈2]ᵀ.

Collect b = [ τ_1 ; τ_2 ; ... ; τ_N ] and A = [ Φ_1 ; Φ_2 ; ... ; Φ_N ] in

   b = A p.

With the pseudoinverse A† the LS solution is p̂ = A† b.
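The pseudoinverse step can be illustrated in Python/NumPy (a toy 2-parameter regression, not the pendulum model itself; the data here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)
p_true = np.array([2.0, 0.5])                 # hypothetical parameter vector
A = rng.standard_normal((100, 2))             # stacked regression matrices Phi_i
b = A @ p_true + 0.01 * rng.standard_normal(100)  # stacked torque measurements

p_hat = np.linalg.pinv(A) @ b                 # LS solution p = A^+ b
# equivalently: p_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
```

In MATLAB the same solution is obtained with pinv(A)*b or the backslash operator.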


Experimental approach (2)
In order to compute the solution p̂ = A†b, matrix A has to be known with
sufficient accuracy, so all angles, velocities and accelerations have to be known
with sufficient accuracy.
Furthermore, the pseudoinverse of matrix A has to exist. In other words: AᵀA
may not be singular, i.e. all columns of A should be linearly independent (full
column rank). This is the case when:
   All parameters are independent.
   The input angles, velocities and accelerations q, q̇, q̈ are sufficiently exciting.
The rank of matrix A can be checked with an SVD analysis:
all singular values should be (sufficiently) unequal to zero.

Experimental approach (3)
An upper bound for the relative error of the estimated parameter vector is

   ||δp|| / ||p|| ≤ cond(A) · ||e_n|| / ||b||,

where e_n is the measurement noise and the condition number is

   cond(A) = σ_max(A) / σ_min(A),

with the largest and smallest singular values σ_max(A) and σ_min(A),
respectively.
The input angles, velocities and accelerations q, q̇, q̈ can be optimised to
minimise the condition number.
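Both checks (rank via the singular values, and the condition number) are a one-liner per quantity; a Python/NumPy sketch with a deliberately rank-deficient matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((100, 4))
A[:, 3] = 2.0 * A[:, 0]                  # make column 4 depend on column 1

s = np.linalg.svd(A, compute_uv=False)   # singular values, descending
rank = int(np.sum(s > 1e-10 * s[0]))     # numerical rank
cond = s[0] / s[-1]                      # condition number sigma_max / sigma_min
```

Here the numerical rank is 3 instead of 4 and the condition number blows up, signalling that one parameter combination cannot be estimated.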


Experimental approach (4)
How can such an optimisation be carried out?
How can velocities and accelerations be computed accurately when only
position sensor data is available?
Possible solution: use finite Fourier series:

   θ_i(t) = Σ_{l=1}^{Ni} ( a_il sin(ω_f l t) + b_il cos(ω_f l t) ) + θ_i(0)

   θ̇_i(t) = Σ_{l=1}^{Ni} ( a_il ω_f l cos(ω_f l t) − b_il ω_f l sin(ω_f l t) )

   θ̈_i(t) = Σ_{l=1}^{Ni} ( −a_il ω_f² l² sin(ω_f l t) − b_il ω_f² l² cos(ω_f l t) )

For each θ_i there are 2·Ni + 1 parameters (a_il, b_il for l = 1..Ni and θ_i(0))
that can be optimised for optimal excitation while e.g. motion constraints are
satisfied, e.g. with the MATLAB command fmincon.

Experimental approach (5)
   The use of harmonic functions makes it possible to compute velocities and
   accelerations analytically.
   From experimental data the time derivatives can be computed by considering
   only the relevant frequencies after the Fourier transform.
   Due to periodicity several periods can be averaged for improved signal to
   noise ratio.
   The fundamental pulsation ω_f of the Fourier series should match the total
   measurement time.
   The input signal contains only the selected frequencies.
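The three series above can be evaluated together, so velocities and accelerations come for free once the coefficients are known. A Python/NumPy sketch (coefficients are arbitrary illustrative values):

```python
import numpy as np

def fourier_traj(t, a, b, theta0, wf):
    """Finite Fourier series trajectory and its analytic derivatives.

    a, b:   sine/cosine coefficients a_l, b_l for l = 1..Ni
    wf:     fundamental pulsation, theta0: offset angle
    """
    l = np.arange(1, len(a) + 1)
    wl = wf * l
    s = np.sin(np.outer(t, wl))
    c = np.cos(np.outer(t, wl))
    th = s @ a + c @ b + theta0
    dth = c @ (a * wl) - s @ (b * wl)           # analytic velocity
    ddth = -s @ (a * wl**2) - c @ (b * wl**2)   # analytic acceleration
    return th, dth, ddth

t = np.linspace(0.0, 2.0, 500)
th, dth, ddth = fourier_traj(t, a=np.array([0.3, 0.1]),
                             b=np.array([0.0, 0.05]), theta0=0.2, wf=np.pi)
```

A quick sanity check is to compare dth with a finite-difference derivative of th; they should agree up to discretisation error.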


Identifiability

Column rank deficiency of matrix A is not always easy to see beforehand as a
linear dependency of parameters can be hidden well.
Consider e.g. a horizontal pendulum, so with no gravity. Then with parameter
vector p = [J1, J2, m2, m3]ᵀ the elements of Φ can be written as

   φ11 = θ̈1
   φ12 = θ̈1 + θ̈2
   φ13 = l1² θ̈1
   φ14 = l1 l2 ((2θ̈1 + θ̈2) cos θ2 − (θ̇2² + 2θ̇1θ̇2) sin θ2) + l1² θ̈1 + l2² (θ̈1 + θ̈2)
   φ21 = 0
   φ22 = θ̈1 + θ̈2
   φ23 = 0
   φ24 = l1 l2 (θ̇1² sin θ2 + θ̈1 cos θ2) + l2² (θ̈1 + θ̈2)

Identifiability (2)

Now the first and third columns of Φ are dependent as

   [ φ13 ; φ23 ] = l1² [ φ11 ; φ21 ]

It implies that p1 = J1 and p3 = m2 can not be estimated separately. Instead
only the linear combination p1 + l1² p3 = J1 + l1² m2 can be estimated.
By defining a new parameter vector p = [J1 + l1² m2, J2, m3]ᵀ a matrix
with full rank can be constructed and the pseudoinverse exists (provided
sufficient excitation).

Such a parameter vector is denoted a base parameter vector.

Measurement errors
Suppose parameter set p̂ is obtained for measurement set ŷ and parameter
set p̂ + δp is obtained for measurement set ŷ + δŷ.
How are δp and δŷ related?
For cost function VN(p, ŷ) and the estimated parameter sets:

   ∂VN/∂p (p̂, ŷ) = 0  and  ∂VN/∂p (p̂ + δp, ŷ + δŷ) = 0

Taylor series for the second expression:

   ∂²VN/∂p² (p̂, ŷ) δp + ∂²VN/∂p∂ŷ (p̂, ŷ) δŷ = 0

so

   δp = −[ ∂²VN/∂p² (p̂, ŷ) ]⁻¹ ∂²VN/∂p∂ŷ (p̂, ŷ) δŷ

Measurement errors (2)

For least squares criterion VN(p, ŷ) = ½ ε(p, ŷ)ᵀ ε(p, ŷ):

First derivative:
   ∂VN/∂p (p, ŷ) = ε(p, ŷ)ᵀ S(p, ŷ)  with sensitivity matrix S(p, ŷ) = ∂ε(p, ŷ)/∂p.

Second derivative:
   ∂²VN/∂p² = ε(p, ŷ)ᵀ ∂S(p, ŷ)/∂p + S(p, ŷ)ᵀ S(p, ŷ)

Small residue, so ignore the first term, and

   δp = −(S(p, ŷ)ᵀ S(p, ŷ))⁻¹ S(p, ŷ)ᵀ δŷ = −S†(p, ŷ) δŷ

Non-linear parameter optimisation (as not the whole world is linear)

Numerical minimisation of function VN by iterative search methods

   θ̂(i+1) = θ̂(i) + α f(i)

with search direction f(i) and (positive) step size factor α:
1. using function values only, or
2. using function values and the gradient, or
3. using function values, the gradient and the Hessian (second derivative
   matrix).

Typical member of group 3: Newton algorithms with search direction

   f(i) = −[ V″(θ̂(i)) ]⁻¹ V′(θ̂(i))

A subclass of group 2 are quasi-Newton methods: estimate the Hessian.

Non-linear least-squares problem

Quadratic criterion: VN(θ) = ε(θ)ᵀ ε(θ)
Gradient: VN′(θ) = S(θ)ᵀ ε(θ) with sensitivity matrix S.
Hessian: VN″(θ) = S(θ)ᵀ S(θ) + ε(θ)ᵀ S′(θ), or VN″(θ) ≈ S(θ)ᵀ S(θ)

Gauss-Newton (λ = 0, α = 1) / damped Gauss-Newton (λ = 0) /
Levenberg-Marquardt (λ > 0):

   θ̂N(i+1) = θ̂N(i) − α [ S(θ̂N(i))ᵀ S(θ̂N(i)) + λI ]⁻¹ S(θ̂N(i))ᵀ ε(θ̂N(i))

see e.g. MATLAB's Optimization Toolbox.
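The Levenberg-Marquardt iteration above can be sketched compactly; this is an illustrative Python/NumPy toy (a hypothetical exponential-decay fit, not course code), with a simple accept/reject rule for the damping λ:

```python
import numpy as np

def levenberg_marquardt(eps, jac, theta0, lam=1e-2, iters=50):
    """Minimal damped Gauss-Newton / Levenberg-Marquardt sketch.

    eps(theta): residual vector, jac(theta): sensitivity matrix S = d eps / d theta.
    """
    theta = np.asarray(theta0, float)
    for _ in range(iters):
        e = eps(theta)
        S = jac(theta)
        # LM step: (S^T S + lambda I)^-1 S^T eps
        step = np.linalg.solve(S.T @ S + lam * np.eye(len(theta)), S.T @ e)
        theta_new = theta - step
        if np.sum(eps(theta_new)**2) < np.sum(e**2):
            theta, lam = theta_new, lam / 2   # accept step, relax damping
        else:
            lam *= 4                          # reject step, increase damping
    return theta

# hypothetical example: fit y = theta2 * exp(-theta1 * t)
t = np.linspace(0.0, 2.0, 50)
y = 3.0 * np.exp(-1.5 * t)
eps = lambda th: th[1] * np.exp(-th[0] * t) - y
jac = lambda th: np.column_stack([-th[1] * t * np.exp(-th[0] * t),
                                  np.exp(-th[0] * t)])
theta = levenberg_marquardt(eps, jac, [1.0, 1.0])
```

For λ → 0 this reduces to (damped) Gauss-Newton; a large λ turns the step into a short gradient-descent step.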

Outline:
Parameter identification: a more complex model, a six-axis industrial robot.

Modelling: Which effects are included in the (dynamic) model?
   Masses: acceleration and gravity: Yes.
   Friction: Yes.
   Gravity compensation spring: Yes.
   Driving system: Yes.
   Elasticity of the mechanism: No, so only low frequencies.

Modelling: Complete the equations of motion and indicate the parameters.
Identification: Linear in the parameters or not?

Figure 1. Staubli RX90B six-axis industrial robot. Courtesy of
Staubli, Faverges, France.
Figure 2. Finite element model of the Staubli RX90B
(base, links 1–6, beams 1–6, hinges 1–6, joints 1–6 and a slider truss).

From Dynamics of Machines (113173):

   M̄ q̈ + DF(x)ᵀ M (D²F(x) q̇) q̇ − f + DF^(e,c)ᵀ σ^(c) = T,

   q are the six degrees of freedom, i.e. the joint angles.
   The first two terms are the inertia effects:
      reduced mass matrix M̄,
      accelerations q̈,
      first order geometric transfer functions (Jacobians) DF.
   Force f includes gravity.
   Total stress in the gravity compensating spring σ^(c).
   Joint torques T.
Modelling (1): Masses and spring

Lumped masses: each link k is described by a
symmetric rotational inertia matrix J^(k), a mass
m^(k) and a vector s^(k) defining the center of
gravity with respect to the corresponding element
node at which the body is lumped.
(Figure: lumped mass m^(k), J^(k) at an element node with local axes;
Toon Hardeman, 2005.)

Modelling (2): Masses and spring parameters

For each link element a lumped parameter vector p^(l,k) is defined as

   p^(l,k) = (m, m s_x, m s_y, m s_z, J_xx, J_xy, J_xz, J_yy, J_yz, J_zz)^(k)   (3)

containing the components of m^(k), s^(k) and J^(k), respectively.

Total stress in the pre-stressed spring:

   σ^(c) = σ^(c,0) + k_s e^(c),   (4)

which depends on the elongation e^(c) and the parameters
pre-stress σ^(c,0) and stiffness k_s.

Modelling (3): Driving system

(Figure: layout of the driving system with motor inertias J_j^(m), gear ratios n_j,
motor constants k_j^(m), currents i_j and joint angles q_j;
(a) joints 1 to 4, (b) joints 5 and 6.)

   Generated torques depend on known motor constants k_j^(m) and
   currents i_j: T_j^(m) = k_j^(m) i_j.
   Part of the torque is needed for the acceleration of the motor inertia J_j^(m).
   All friction losses are combined into a single term T_j^(f).

The input-output relation of the complete driving system:

   T^(m) − Γᵀ J^(m) Γ q̈ − T^(f) = T,   (6)

with gear ratios Γ and all motor inertias collected in J^(m).

Modelling (4): Friction, first try

Friction torques have been measured in single joint constant velocity
experiments. Commonly used model equation (Coulomb + Stribeck + viscous):

   T_j^(f) = T_j^(C,0) + T_j^(s,0) e^(−q̇_j/q̇_j^(s)) + T_j^(v,0) q̇_j

(Figure: normalised friction torque T^(f)/T^(max) versus angular velocity q̇ [rad/s];
(a) full velocity range, (b) low velocity range.
Dots (•): experiments.
Dashed (- - -): estimated in the full velocity range.
Solid (—): estimated in the range from 0 to 0.5 rad/s.)

Modelling (5): Friction, improved model

After looking carefully at the experimental data and with physical insight:

   T_j^(f) = T_j^(a,0) e^(−(q̇_j/q̇_j^(s))^δ_j^(a)) + T_j^(v,0) q̇_j^(1−δ_j^(v)).   (7)

For each friction torque there are five unknown parameters:
   the static friction torque T_j^(a,0),
   the Stribeck velocity q̇_j^(s),
   the Stribeck velocity power δ_j^(a),
   the viscous friction coefficient T_j^(v,0) and
   the viscous friction power δ_j^(v).

All parameters can be obtained with a non-linear parameter estimation,
provided a reasonable initial guess is available.

(Figure: normalised friction torque T^(f)/T^(max) versus angular velocity q̇ [rad/s];
(a) full velocity range, (b) low velocity range.)

Modelling (6): Friction, conclusion

This friction model gives an accurate fit in the full velocity range with a minimal
parameter set and a physically sound model structure (Rob Waiboer, 2005).

Parameters T_j^(a,0) and T_j^(v,0) depend strongly on temperature and have to
be identified again in each experiment.
Fortunately, these parameters appear linearly in the model!

Modelling (7): Putting it all together

All submodels can be combined in a single equation:

   M̄^(m) q̈ + DF(x)ᵀ M (D²F(x) q̇) q̇ − f + DF^(e,c)ᵀ σ^(c) + T^(f) = T^(m).   (8)

   The motor inertias J^(m) are included in the new reduced mass matrix M̄^(m).
   The joint driving torques T^(m) are computed from the measured motor
   currents.
   This acceleration-linear form is well suited for simulations.

Modelling (8): Parameter linear form

   Φ(q̈, q̇, q) p = T^(m)   (9)

It can be proven that a parameter-linear form can be obtained, provided:
   the proposed model structure is used to model the mass
   and inertia properties (Toon Hardeman, 2004) and
   the non-linear parameters in the friction model are taken
   constant.

The parameter-linear form is well suited for a linear least squares fit of the
model parameters p:
   p^(l): 60 lumped inertia parameters of the six links.
   p^(s): 2 parameters of the spring (pre-stress σ^(c,0) and stiffness k_s).
   p^(m): 6 parameters with the motor inertias.
   p^(f): 14 parameters representing the linear part of seven equations for
   friction torques.
So in total 82 parameters!

Identification experiment

   Move the robot along a trajectory for the joint angles q(t).
   Identification experiment: multi-sine (well-defined input in the frequency domain).
   Record the joint driving torques T^(m)(t) at n time instances.
   Compute all matrices Φ_i(q̈_i, q̇_i, q_i) from the (actual) trajectory q(t).
   Collect all data in the regression matrix A and measurement vector b:

      A = [ Φ_1(q̈_1, q̇_1, q_1) ; ... ; Φ_n(q̈_n, q̇_n, q_n) ]  and
      b = [ T_1^(m) ; ... ; T_n^(m) ]   (15)

Parameter identification: LS solution

Estimate parameter vector p by minimising the residual ρ in the linear system
of equations

   A p = b + ρ.   (14)

The least-squares solution

   p̂ = arg min ||ρ||₂²   (16)

can be obtained using the pseudo-inverse A†:

   p̂ = A† b = (AᵀA)⁻¹ Aᵀ b,   (19)

provided the inverse (AᵀA)⁻¹ exists, so A has full rank.

Parameter identification (2): SVD

Unfortunately, A is rank deficient:

   rank(A) = r < m,   (20)

where m is the length of the parameter vector p (m = 82).

Apply singular value decomposition

   A = U Σ Vᵀ,   (21)

with orthogonal 6n × 6n matrix U and m × m matrix V.

   Σ = [ S ; 0 ]   (22)

contains the nonnegative singular values S = diag(σ1, σ2, ..., σm) in
descending magnitude.

Parameter identification (3): SVD result

(Figure: singular values of the regression matrix for two different trajectories,
singular value number 1 to 82.)

For two different trajectories the rank of the regression matrix
appears to be r = 55.

Does that mean that 55 parameters can be estimated meaningfully?

Parameter identification (4): essential parameters and null space

Split S in a nonzero part S1 (r × r) and a zero part S2:

   S = [ S1 0 ; 0 S2 ],   (23)

and accordingly

   U = [ U1 U2 ]  and  V = [ V1 V2 ].   (24)

Transform and partition the parameter vector p with the right singular matrix V:

   p = V1 ξ^(E) + V2 ξ^(N)   (25)

   ξ^(E) is the essential parameter vector that contains the parameter
   combinations that can be estimated independently,
   ξ^(N) is associated with the null space: it can have any value!

Parameter identification (5): parameter solution and ambiguity

Transform the measurement vector b with the left singular matrix U:

   g = Uᵀ b,   (30)

then the original LS problem can be replaced by

   ξ̂ = arg min || [ S ; 0 ] ξ − g ||₂²,   (31)

with solution for the essential parameters

   ξ̂_i^(E) = g_i / σ_i,  for i = 1, 2, ..., r.   (32)

An infinite number of parameter vectors is found:

   p̂ = V1 ξ̂^(E) + V2 ξ^(N)   (34)

as ξ^(N) can take any value.

Parameter identification (6): residual error and parameter accuracy

An estimate of the variance of the residual is

   s² = var(ρ) = 1/(6n − r) Σ_{i=1}^{6n} ρ_i²   (38)

Now the variance of the LS estimate ξ̂_i equals

   var(ξ̂_i) = s² / σ_i².   (39)

So for small singular values, the accompanying ξ̂_i can not be estimated
accurately. Such inaccurate estimates will ruin the parameters in p̂!!!

Parameter identification (7): truncated SVD

To avoid degradation of the parameter accuracy: do not take into account too
small singular values.
In other words: take only r singular values into account where r is smaller
than the rank of the regression matrix A.
This is the truncated or partial SVD method.

How large should r be?
   Large enough to avoid model errors and to obtain a small residue.
   Small enough to avoid degradation of the accuracy of the
   parameter estimate and overfit.
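The truncated SVD solution follows the formulas above directly: transform b with U, divide by the retained singular values, and map back with V1. A Python/NumPy sketch on a hypothetical rank-deficient toy problem:

```python
import numpy as np

def truncated_svd_solve(A, b, r_keep):
    """LS solution keeping only the r_keep largest singular values."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    g = U.T @ b                        # transformed measurement vector g = U^T b
    xi = g[:r_keep] / s[:r_keep]       # essential parameters xi_i = g_i / sigma_i
    return Vt[:r_keep].T @ xi          # map back: p = V_1 xi

rng = np.random.default_rng(3)
A = rng.standard_normal((50, 5))
A[:, 4] = A[:, 0] + A[:, 1]            # rank-deficient regression matrix
b = A @ np.ones(5)
p_hat = truncated_svd_solve(A, b, r_keep=4)
```

With all nonzero singular values retained this yields the minimum-norm least-squares solution; dropping further singular values trades a larger residual for a smaller parameter variance.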

Parameter identification (8): Residual error

Consider the residual error ||ρ||₂² as a function of the number of essential
parameters r.
(Figure: residual error ||ρ||₂² versus the number of singular values r = 1 ... 55.)

   About 20 parameters seem to be sufficient.
   How can we determine r more exactly?
   It depends on the error in the parameter estimate that is accepted.

Intermezzo: Scaling

Scaling is important for any MISO, SIMO or MIMO identification!
   The absolute errors in all torques are weighted equally.
   E.g. scaling with the maximum torques for each joint.
   The resulting errors in the parameters are also absolute errors.
   E.g. scaling with some nominal parameter set changes this effectively into
   relative errors of all parameters.

Parameter identification (9): Applying truncated SVD

(Figure: singular values σ_i, |g_i| and the 10% threshold versus singular value
number 1 to 55.
Dots (•): singular values σ_i.
Solid (—): magnitude of g_i.)

   Parameter scaling: g_i should be near σ_i, which is true for about
   the first 30 parameters.
   From an estimate of the variance s² and a maximum parameter error of 10%,
   the value of the smallest acceptable singular value is computed:
   the dashed line (- - -).
   So only 22 or 23 parameters can be estimated sufficiently accurately.

Simulation results

Measured and simulated joint torques (22 essential parameters).

Figure 8. The simulated and measured normalised joint torques along the
trajectory as a function of time for (a) joint 1, (b) joint 3 and (c) joint 6.
The simulation has been carried out with model M1, showing the measured
torque, the simulated torque and the residual torque.

The remaining residual error appears to be mainly unmodelled high-frequent
dynamics.

Closed loop identification

Identification of a system G0 being controlled by a control system C can be
desirable / necessary because of:
   instability of the system,
   safety requirements,
   impossibility to remove the control system,
   controller relevant identification.

(Block diagram: controller C, system G_0, noise filter H_0;
signals r_1, r_2, u, y and e.)

Situation:  y(t) = G0(q) u(t) + H0(q) e(t)
            u(t) = C(q) [r2(t) − y(t)] + r1(t)
Prerequisites: the control system C(q) is known or not.
Wanted: estimate Ĝ of system G0 from measurements of the system output
y(t) and furthermore the system input u(t) and/or feedforward r1(t) and/or
tracking reference r2(t).

Problem: input u(t) is correlated with the disturbance v(t) =
H0(q) e(t) because of the controller C(q).

Notations:
By defining the signal r(t) = r1(t) + C(q) r2(t) the feedback is written as:

   u(t) = r(t) − C(q) y(t).

With the input sensitivity S0(q) = (I + C(q)G0(q))⁻¹
and the output sensitivity W0(q) = (I + G0(q)C(q))⁻¹  (SISO: S0 = W0)
it follows

   u(t) = S0(q) r(t) − C(q) W0(q) H0(q) e(t)
   y(t) = G0(q) S0(q) r(t) + W0(q) H0(q) e(t)

Closed loop identification problems:

When is it possible
(a) to identify a model [G(q, θ̂N), H(q, θ̂N)] consistently,
(b) to identify a system model G(q, θ̂N) consistently,
(c) to formulate an explicit approximation criterium that characterises the
    asymptotic estimator G(q, θ*) independent of Φv(ω),
(d) to set a specific order for G0 in the model set,
(e) to identify an unstable process G0,
(h) to guarantee that the identified model for G0 will be stabilised by the
    applied control system C.

The solution strategy depends also on the starting point:
   Measurements of the system output y(t).
   Measurements of the system input u(t).
   Knowledge regarding presence and properties of signals r1(t) and r2(t).
   Measurements of r1(t) and/or r2(t).
   Knowledge of controller C(q).

For most solution methods it is required that system G0 (and its model) have at
least one delay.

What can go wrong?
Example of spectral analysis: SISO closed loop system

   Measurements of system input u(t) and output y(t).
   Background information: r2(t) ≡ 0, r1(t) and e(t) uncorrelated.
   Define: z(t) = C(q) H0(q) e(t).

Then:  y(t) = S0(q) [G0(q) r1(t) + C(q)⁻¹ z(t)]
       u(t) = S0(q) [r1(t) − z(t)]

So:
   Φu(ω)  = |S0(e^{iω})|² [Φr1(ω) + Φz(ω)]
   Φyu(ω) = |S0(e^{iω})|² [G0(e^{iω}) Φr1(ω) − C(e^{iω})⁻¹ Φz(ω)]

Suppose that accurate estimators for Φu(ω) and Φyu(ω) exist (N → ∞), then
an estimator for G0 is given by:

   Ĝ(e^{iω}) = Φ̂yu(ω)/Φ̂u(ω)
             = [G0(e^{iω}) Φr1(ω) − Φz(ω)/C(e^{iω})] / [Φr1(ω) + Φz(ω)]

Situation 1: no noise, so z(t) ≡ 0 and Φz(ω) = 0, then:

   Ĝ(e^{iω}) = G0(e^{iω})

Situation 2: no external signal, so r1(t) ≡ 0 and Φr1(ω) = 0, then:

   Ĝ(e^{iω}) = −1/C(e^{iω})

in other words the inverse controller is estimated!

In general: a weighted combination of both situations.

Overview of techniques: direct, IV, indirect, joint input/output, tailor-made,
two-steps, coprime factorisation and dual-Youla methods.
(Table: for each method whether it offers consistency of (Ĝ, Ĥ), consistency of
Ĝ only, a tunable bias, a fixed model order, applicability to an unstable system,
a guaranteed stable closed loop (G(q, θ̂N), C), and whether C must be known.
Only some of these methods are treated in this course.)

Example of parametric identification: SISO system with proportional feedback

Background information: r1(t) ≡ r2(t) ≡ 0, and assume a system with a
proportional controller: u(t) = −f y(t) with f ∈ ℝ.
Consider a first order ARX model with two parameters θ = (a, b)ᵀ:

   ε(t, θ) = y(t) + a y(t−1) − b u(t−1)

With feedback, so:

   ε(t, θ) = y(t) + (a + bf) y(t−1)

In other words all models with the same a + bf predict an identical error and
can not be distinguished.

Direct closed-loop identification

Idea: neglect the feedback and estimate a model for G0 and/or H0 with common
open-loop techniques from measured {u(t), y(t)}.
Consistent estimators for G0 and/or H0 are possible if
a. The system (G0, H0) is in the model set.
b. There is at least one sample delay in CG0.
c. There is a sufficiently exciting (measurable or unmeasurable) signal r(t)
   and/or controller C is sufficiently complex (e.g. nonlinear or of order equal
   to or larger than G0).

Conclusion: measurements of {u(t), y(t)} seem sufficient.
However: requirement (a) is not practical.

Open-loop asymptotic identification criterium

The open-loop asymptotic criterium

   θ* = arg min (1/2π) ∫ [ |G0(e^{iω}) − G(e^{iω}, θ)|² Φu(ω) + |H0(e^{iω})|² λe ]
                          / |H(e^{iω}, θ)|² dω

is for the closed-loop situation (direct identification)

   θ* = arg min (1/2π) ∫ { |G0(e^{iω}) − G(e^{iω}, θ)|² |S0(e^{iω})|² Φr(ω) / |H(e^{iω}, θ)|²
                           + |H0(e^{iω})|² |S0(e^{iω})|² λe / (|H(e^{iω}, θ)|² |S(e^{iω}, θ)|²) } dω

with sensitivity S(e^{iω}, θ) = (1 + C(e^{iω}) G(e^{iω}, θ))⁻¹.

Independent parameterisation is no longer advantageous, e.g. OE (H = 1):
G(e^{iω}, θ) = G0(e^{iω}) minimises the first term, but G also appears in S in the
second term.
Requirements for a small bias: an accurate (and fixed) noise model H0 and/or a
large signal-to-noise ratio at the input u.

Indirect closed-loop identification

Idea: (1) identify the closed-loop system with common open-loop techniques and
measurements of {r(t), y(t)}, and (2) next compute the open-loop system with
knowledge of the controller.

System:     y(t) = S0(q) [G0(q) r(t) + H0(q) e(t)].
Model set:  y(t) = Gc(q, θ) r(t) + Hc(q, θ) e(t).
Step 1 identifies in fact S0G0 and S0H0.
Step 2: solve the open-loop model from

   G(q, θ̂) / (1 + G(q, θ̂) C(q)) = Gc(q, θ̂N)
   H(q, θ̂) / (1 + G(q, θ̂) C(q)) = Hc(q, θ̂N)

giving

   G(q, θ̂) = Gc(q, θ̂N) / (1 − C(q) Gc(q, θ̂N))
   H(q, θ̂) = Hc(q, θ̂N) / (1 − C(q) Gc(q, θ̂N))

with parameter vector θ̂ for the open-loop system.

Properties of the indirect method:
   Consistency is comparable with the direct method.
   The order of the model can not be chosen beforehand:
   the order of Ĝ is usually the sum of the orders of Ĝc and C.
   A consistent estimation of only G0 is possible using independent
   parameterisation.
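Step 2 is just pointwise frequency-response algebra. A Python/NumPy sketch with a hypothetical first-order system and proportional controller (all values illustrative; in practice Gc comes from the step-1 identification):

```python
import numpy as np

w = np.linspace(0.01, np.pi, 200)   # frequency grid [rad/sample]
z = np.exp(1j * w)

C = 2.0                             # hypothetical proportional controller
G0 = 0.1 / (z - 0.9)                # hypothetical true open-loop system
Gc = G0 / (1 + C * G0)              # closed-loop response identified in step 1

G_hat = Gc / (1 - C * Gc)           # step 2: recover the open-loop system
```

Note that any error in Gc is amplified where |1 − C·Gc| is small, which is one reason the model order of Ĝ is hard to control with this method.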

Frequency domain identification

In case a frequency response is available, e.g. from
   an f-domain measurement, e.g. with a spectrum analyser: often applied
   for mechanical systems,
   a t-domain measurement with a multi-sine excitation (e.g. from
   idinput), processed with FFT,
   an estimate from a non-parametric identification.

Advantages of identification in the frequency domain:
   data reduction,
   simplicity of processing: filtering is a multiplication,
   evaluation of models in the f-domain: aiming at the application,
   choice between discrete and continuous models.

The data:

   Ĝ(ω_j),  j = {1, 2, ..., N}

   Data is complex (amplitude/phase).
   The frequency grid can be chosen (design).

Suppose: Ĝ(ω_j) = G0(e^{iω_j}) + V(ω_j), with
the real response G0(e^{iω_j}) and
a stochastic noise process V(ω_j).

Identification problem: find θ̂N by minimising

   Σ_{k=1}^{N} |Ĝ(ω_k) − G(e^{iω_k}, θ)|² |W(ω_k)|²

with G(e^{iω_k}, θ) = B(e^{iω_k}, θ) / A(e^{iω_k}, θ)
and W(ω_k) a chosen weighting function.

This resembles the time domain OE-problem (non-linear in the parameters).
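A common first step linearises this criterion by minimising |A·Ĝ − B|² instead, which is linear in the coefficients of A and B (a Levy-type fit; the Sanathanan-Koerner iteration reweights this repeatedly). An illustrative Python/NumPy toy, not the fdident/freqid implementation:

```python
import numpy as np

def levy_fit(w, G, na=1, nb=1, W=None):
    """Linearised fit of G ~ B(z)/A(z): minimise |W|^2 |A G - B|^2."""
    if W is None:
        W = np.ones_like(w)
    z = np.exp(1j * w)
    # unknowns: a_1..a_na (A = 1 + a_1 z^-1 + ...) and b_0..b_nb
    cols = [-G * z**(-k) for k in range(1, na + 1)]
    cols += [z**(-k) for k in range(0, nb + 1)]
    M = np.column_stack(cols) * W[:, None]
    rhs = G * W
    # solve the complex LS problem in real form
    Mr = np.vstack([M.real, M.imag])
    rr = np.concatenate([rhs.real, rhs.imag])
    theta, *_ = np.linalg.lstsq(Mr, rr, rcond=None)
    return theta[:na], theta[na:]          # a-coefficients, b-coefficients

# toy data: G0(z) = 0.5 z^-1 / (1 - 0.8 z^-1), noise-free
w = np.linspace(0.01, 3.0, 100)
z = np.exp(1j * w)
G = 0.5 * z**-1 / (1 - 0.8 * z**-1)
a, b = levy_fit(w, G, na=1, nb=1)
```

On noise-free data the true coefficients are recovered exactly; on noisy data the implicit |A| weighting biases the fit towards high frequencies, which is exactly why the reweighting tricks of the next slide are used.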

In practice:
   Accurate initial estimate with linear regression techniques.
   Trick: an ARX-like problem (linear in the parameters) can be obtained by
   rewriting the optimisation problem (implicit weighting with |A(e^{iω_k}, θ)| of the
   previous iteration).
   Subsequent non-linear optimisation (e.g. the Sanathanan-Koerner
   algorithm).
   The weighting function is important: emphasis or not on high-frequent
   behaviour.

Implementation examples:
   Commercial fdident toolbox from ELEC, Vrije Universiteit Brussel.
   Freely downloadable freqid toolbox developed by R.A. de Callafon,
   initially at the TU Delft.

Example: the freqid toolbox
   Starting point are frequency domain data:
      measured FRF,
      time domain data + FFT.
   The GUI has some similarities with the ident GUI.

Example: the freqid toolbox (2)
   Models are fitted using a non-linear weighted least squares fit, e.g. using the
   accompanying lsfits toolbox.
   Identification can be carried out using both discrete-time and continuous-time
   models.

Example: the freqid toolbox (3)
   Typical weighting functions:
      none,
      inverse of data,
      user defined, e.g. a frequency range,
      supplied weighting function.
   Often used weighting function: the coherence spectrum, see MATLAB's help
   cohere.

The coherence spectrum Cyu(ω) equals
   1 for a perfect correlation between input u and output y,
   0 for no correlation between input u and output y.

Estimate:

   Ĉyu^N(ω) = sqrt( |Φ̂yu^N(ω)|² / (Φ̂y^N(ω) Φ̂u^N(ω)) )

Identification is possible for both open-loop and closed-loop systems, but in
the latter case the coherence spectrum should be used with care.
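The estimator above can be sketched in Python/NumPy with a Welch-like segment-averaged estimate (this toy returns the squared coherence; MATLAB's cohere/mscohere and scipy.signal.coherence do the same job with windowing and overlap):

```python
import numpy as np

def coherence(u, y, nseg=8):
    """Magnitude-squared coherence estimate by plain segment averaging."""
    N = len(u) // nseg
    U = np.array([np.fft.rfft(u[i*N:(i+1)*N]) for i in range(nseg)])
    Y = np.array([np.fft.rfft(y[i*N:(i+1)*N]) for i in range(nseg)])
    Suu = np.mean(np.abs(U)**2, axis=0)        # auto-spectrum of u
    Syy = np.mean(np.abs(Y)**2, axis=0)        # auto-spectrum of y
    Syu = np.mean(Y * np.conj(U), axis=0)      # cross-spectrum
    return np.abs(Syu)**2 / (Suu * Syy)

rng = np.random.default_rng(4)
u = rng.standard_normal(4096)
y = 2.0 * u                     # perfectly correlated, noise-free output
C = coherence(u, y)
```

Averaging over segments is essential: with a single segment the estimate is identically 1 regardless of the noise.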


Aspects of MIMO systems

Polynomials become polynomial matrices, e.g. ARX:

   A(q⁻¹) y(t) = B(q⁻¹) u(t) + e(t)

with ny × ny matrix A(q⁻¹) = I_ny + A1 q⁻¹ + ... + A_na q⁻ⁿᵃ, or

   A(q⁻¹) = [ a11(q⁻¹) a12(q⁻¹) ... a1ny(q⁻¹) ;
              a21(q⁻¹) a22(q⁻¹) ... a2ny(q⁻¹) ;
              ... ;
              any1(q⁻¹) any2(q⁻¹) ... anyny(q⁻¹) ]

with elements akj(q⁻¹) = δkj + a¹kj q⁻¹ + ... + aⁿᵃkj q^(−na_kj),

and the ny × nu matrix B(q⁻¹) = B0 + B1 q⁻¹ + ... + B_nb q⁻ⁿᵇ, or

   B(q⁻¹) = [ b11(q⁻¹) b12(q⁻¹) ... b1nu(q⁻¹) ;
              b21(q⁻¹) b22(q⁻¹) ... b2nu(q⁻¹) ;
              ... ;
              bny1(q⁻¹) bny2(q⁻¹) ... bnynu(q⁻¹) ]

with elements bkj(q⁻¹) = b¹kj q^(−nk_kj) + ... + bⁿᵇkj q^(−nk_kj − nb_kj + 1).

Usually: choice of a specific model structure, e.g. a common denominator
polynomial a(z). E.g. the transfer function G(z) = A(z)⁻¹ B(z) of a 2 × 2
system:

   G(z) = [ b11(z)/a(z)  b21(z)/a(z) ;
            b12(z)/a(z)  b22(z)/a(z) ]
        = (a(z) I_{2×2})⁻¹ [ b11(z) b21(z) ; b12(z) b22(z) ]

Note: this expression has a large number of parameters, especially when the
poles differ for each of the elements in the matrix. Then the denominator has to
contain all poles and in each element some of them need to be compensated
by zeros of bij(z).

Available in the ident toolbox:

   Models of polynomials with ARMAX-, OE- and BJ-structure for
   MISO systems.
   Models of polynomials with ARX-structure for MIMO systems.
   Arbitrary state space models for MIMO systems with subspace
   identification.
   Structured state space models for MIMO systems with PEM
   techniques: OE- and ARMAX-structure.

Minimisation of

   (1/N) Σ_{t=1}^{N} Σ_{i=1}^{p} w_i ε_i²(t, θ).

   Simpler representation by means of state space models: MIMO is a
   straightforward extension of SISO. Example: subspace techniques.
   Extra requirements regarding fixed parameters, available orders and
   structures may be necessary, e.g. for the physical model / identifiability /
   uniqueness.

Closed-loop MIMO identification: for the direct, indirect and two-steps
methods it depends on the capabilities of the applied open-loop identification.

Frequency domain MIMO identification: the algorithm is also applicable for m
inputs and p outputs: measure all m × p frequency responses Ĝjl(ω_k) and
minimise

   Σ_{j=1}^{p} Σ_{l=1}^{m} Σ_{k=1}^{N} |Ĝjl(ω_k) − Gjl(e^{iω_k}, θ)|² |Wjl(ω_k)|²

Also in this case there is a large variation of available parameterisations for
MIMO systems, e.g. a left MFD (for the number of outputs p ≤ number of inputs
m):

   G(q, θ) = A(q, θ)⁻¹ B(q, θ)  with
   A(q, θ) = I + A1 q⁻¹ + ... + A_na q⁻ⁿᵃ
   B(q, θ) = B0 + B1 q⁻¹ + ... + B_nb q⁻ⁿᵇ
