This book will guide you from a review of continuous-time signals and systems,
through the world of digital signal processing, up to some of the most advanced theory and techniques in adaptive systems, time-frequency analysis
and sparse signal processing.
It provides simple examples and explanations for every transform, method, algorithm, or approach presented, including the most complex ones. The most sophisticated results in signal processing theory are illustrated on simple numerical examples.
The book is written for students learning digital signal processing and for engineers and researchers refreshing their knowledge in this area. The selected topics are intended for advanced courses and for preparing the reader to solve problems in some of the state-of-the-art areas of signal processing.
Ljubiša Stanković

DIGITAL SIGNAL PROCESSING
with selected topics

ADAPTIVE SYSTEMS
TIME-FREQUENCY ANALYSIS
SPARSE SIGNAL PROCESSING

2015
ISBN-13: 978-1514179987
ISBN-10: 1514179989
All rights reserved. Printed and bound in the United States of America.
Ljubiša Stanković
To
my parents
Božo and Cana,
my wife Snežana,
and our
Irena, Isidora, and Nikola.
Contents
I Review 13

Chapter 1 Continuous-Time Signals and Systems 15
Chapter 2 53
Chapter 3 101

Chapter 4 z-Transform 163
4.1 Definition of the z-transform 163
4.2 Properties of the z-transform 165
4.2.1 Linearity 165
4.2.2 Time-Shift 165
4.2.3 Multiplication by exponential signal: Modulation 166
4.2.4 Differentiation 166
4.2.5 Convolution in time 167
4.2.6 Table of the z-transform 167
4.2.7 Initial and Stationary State Signal Value 168
4.3 Inverse z-transform 168
4.3.1 Direct Power Series Expansion 168
4.3.2 Theorem of Residues Based Inversion 172
4.4 Discrete systems and the z-transform 174
4.5 Difference equations 177
4.5.1 Solution Based on the z-transform 177
4.5.2 Solution of Difference Equations in the Time Domain 180
4.6 Relation of the z-transform to other Transforms 185
4.7 Problems 187
4.8 Solutions 191
4.9 Exercise 207
Chapter 5 211

Chapter 6 261

Chapter 7 313
7.1.4 Variance 325
7.2 Second-Order Statistics 330
7.2.1 Correlation and Covariance 330
7.2.2 Stationarity and Ergodicity 331
7.2.3 Power Spectral Density 332
7.3 Noise 334
7.3.1 Uniform Noise 334
7.3.2 Binary Noise 335
7.3.3 Gaussian Noise 338
7.3.4 Complex Gaussian Noise and Rayleigh Distribution 343
7.3.5 Impulsive Noises 344
7.3.6 Noisy Signals 346
7.4 Discrete Fourier Transform of Noisy Signals 346
7.4.1 Detection of a Sinusoidal Signal Frequency 350
7.5 Linear Systems and Random Signals 354
7.5.1 Spectral Estimation of Narrowband Signals 360
7.6 Detection and Matched Filter 362
7.6.1 Matched Filter 363
7.7 Optimal Wiener Filter 366
7.8 Quantization effects 370
7.8.1 Input signal quantization 371
7.8.2 Quantization of the results 376
7.9 Problems 388
7.10 Solutions 394
7.11 Exercise 412
Chapter 8 Adaptive Systems 417
8.1 Introduction 417
8.2 Linear Adaptive Adder 421
8.2.1 Error Signal 423
8.2.2 Autocorrelation Matrix Eigenvalues and Eigenvectors 432
8.2.3 Error Signal Analysis 437
8.2.4 Orthogonality Principle 439
8.3 Steepest Descent Method 440
8.4 LMS Algorithm 451
8.4.1 Convergence of the LMS algorithm 452
8.5 LMS Application Examples 454
Chapter 9 Time-Frequency Analysis 515
9.1 Short-Time Fourier Transform 516
9.2 Windows 523
9.2.1 Rectangular Window 523
9.2.2 Triangular (Bartlett) Window 524
9.2.3 Hann(ing) Window 525
9.2.4 Hamming Window 527
9.2.5 Blackman and Kaiser Windows 528
9.2.6 Discrete Form and Realizations of the STFT 529
9.2.7 Recursive STFT Realization 535
9.2.8 Filter Bank STFT Implementation 536
9.2.9 Signal Reconstruction from the Discrete STFT 540
Index 805
Preface
Introduction
Figure 1 A continuous-time signal x(t), the corresponding discrete-time signal x(n), and the digital signal xd(n), quantized to 16 levels (4-bit binary codes 0000 to 1111).
Figure 2 An analog system with impulse response ha(t), producing y(t) from x(t), and a digital system consisting of an A/D converter, a discrete system with impulse response h(n) producing y(n) from x(n), and a D/A converter.
Part I

Review
Chapter 1
Continuous-Time Signals and Systems
Most discrete-time signals are obtained by sampling continuous-time signals. In many applications, the result of signal processing is presented and interpreted in the continuous-time domain. Throughout the course of digital signal processing, the results will be discussed and related to the continuous-time forms of signals and their parameters. This is the reason why the first chapter is dedicated to a review of signals and transforms in the continuous-time domain. This review will help in establishing proper correspondence and notation for the presentation that follows in the next chapters.

1.1 CONTINUOUS-TIME SIGNALS
and
∫_{-∞}^{∞} δ(t) dt = 1.   (1.2)
The delta function satisfies
∫_{-∞}^{∞} x(t-τ)δ(τ) dτ = ∫_{-∞}^{∞} x(τ)δ(t-τ) dτ = x(t).   (1.3)
Its relation to the unit-step signal u(t) is
∫_{-∞}^{∞} δ(τ)u(t-τ) dτ = ∫_{-∞}^{t} δ(τ) dτ = u(t), with du(t)/dt = δ(t).   (1.4)
A sinusoidal signal, with amplitude A, frequency Ω0, and initial phase φ, is a signal of the form
x(t) = A sin(Ω0 t + φ).   (1.5)
It is periodic with period T,
x(t) = x(t + T), T = 2π/Ω0.   (1.6)
A complex sinusoidal signal
x(t) = A e^{j(Ω0 t + φ)}   (1.7)
is also periodic with period T = 2π/Ω0. Fig. 1.1 depicts basic continuous-time signals.
Figure 1.1 Continuous-time signals: (a) unit-step signal, (b) impulse signal, (c) boxcar signal, and (d) sinusoidal signal.
A sum of harmonically related complex sinusoids,
x(t) = Σ_{n=0}^{∞} A_n e^{jnΩ0 t},
is also periodic, with period T = 2π/Ω0.

Example 1.2. Find the periods of the signals: x1(t) = sin(2πt/36), x2(t) = cos(4πt/15 + 2), x3(t) = exp(j0.1t), x4(t) = x1(t) + x2(t), and x5(t) = x1(t) + x3(t).

Periods are calculated according to (1.6). For x1(t) the period follows from 2πT1/36 = 2π, as T1 = 36. Similarly, T2 = 15/2 and T3 = 20π. The period of x4(t) is the smallest interval containing an integer number of periods T1 and T2. It is T4 = 180 (5 periods of x1(t) and 24 periods of x2(t)). For signal x5(t), with component periods T1 = 36 and T3 = 20π, there is no common interval T5 such that the periods T1 and T3 are contained an integer number of times, since their ratio is irrational. Thus, the signal x5(t) is not periodic.
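The periodicity test in Example 1.2 can be automated: a sum of periodic components is periodic exactly when the ratios of the component periods are rational. A minimal sketch (the function name and the rational-approximation tolerance are our choices, not the book's):

```python
from fractions import Fraction
import math

def common_period(T1, T2, max_den=100):
    """Least common period of two periodic components, or None when
    T1/T2 is not a ratio of small integers (the sum is aperiodic)."""
    ratio = Fraction(T1 / T2).limit_denominator(max_den)
    if abs(T1 / T2 - float(ratio)) > 1e-9:
        return None                      # irrational ratio -> aperiodic sum
    # T = q*T1 = p*T2 with p/q the reduced ratio T1/T2
    return ratio.denominator * T1

print(common_period(36, 15 / 2))         # x1 + x2: 180, as in Example 1.2
print(common_period(36, 20 * math.pi))   # x1 + x3: None (not periodic)
```

The small `max_den` bound is what distinguishes a genuinely rational ratio from a close irrational one at floating-point precision.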
24
max | x (t)| ,
<t<
Signal energy
Ex =
"
| x (t)|2 dt,
(1.8)
(1.9)
(1.10)
PAV
1
= lim
T 2T
"T
| x (t)|2 dt.
1.2 FOURIER SERIES

A periodic signal x(t), with period T, can be expanded into the Fourier series
x(t) = Σ_{n=-∞}^{∞} Xn e^{j2πnt/T}   (1.11)
if the Dirichlet conditions are met: (1) the signal x(t) has a finite number of discontinuities within the period T; (2) it has a finite average value within the period T; and (3) the signal has a finite number of maxima and minima within the period T. Since signal analysis deals with real-world physical signals, rather than with mathematical generalizations, these conditions are almost always met.
The basis functions e^{j2πnt/T} form an orthonormal set, since
⟨e^{j2πmt/T}, e^{j2πnt/T}⟩ = (1/T) ∫_{-T/2}^{T/2} e^{j2πmt/T} e^{-j2πnt/T} dt
= 1, for m = n,
= sin(π(m-n))/(π(m-n)) = 0, for m ≠ n.
It means that the inner product of any two different basis functions is zero (orthogonal set), while the inner product of a function with itself is 1 (normal set). In the case of an orthonormal set of basis functions, it is easy to show that the weighting coefficients Xn can be calculated as projections of x(t) onto the basis functions e^{j2πnt/T},
Xn = ⟨x(t), e^{j2πnt/T}⟩ = (1/T) ∫_{-T/2}^{T/2} x(t) e^{-j2πnt/T} dt.   (1.12)
This relation follows after a simple multiplication of the right and left sides of (1.11) by e^{-j2πmt/T} and an integration within the period, (1/T)∫_{-T/2}^{T/2} (·) dt.
Example 1.3. Show that the Fourier series coefficients Xn of a periodic signal x(t) can be obtained by minimizing the mean square error between the signal and Σ_{n=-N}^{N} Xn e^{j2πnt/T} within the period T.

The mean square error is
I = (1/T) ∫_{-T/2}^{T/2} | x(t) - Σ_{n=-N}^{N} Xn e^{j2πnt/T} |² dt.
From ∂I/∂Xm* = 0 follows
(1/T) ∫_{-T/2}^{T/2} ( x(t) - Σ_{n=-N}^{N} Xn e^{j2πnt/T} ) e^{-j2πmt/T} dt = 0,   (1.13)
producing Xm = (1/T) ∫_{-T/2}^{T/2} x(t) e^{-j2πmt/T} dt, in agreement with (1.12).
For a complex variable z = x + jy and a function F(z) = F(x, y), the derivatives with respect to z and its conjugate z* are defined by
∂F(z)/∂z = (∂/∂x - j ∂/∂y) F(x, y),
∂F(z)/∂z* = (∂/∂x + j ∂/∂y) F(x, y).
Often a half of these values is used in the definition, which does not change our results.
In order to prove the form ∂I/∂Xm* = 0, let us denote Xm by z = x + jy, all terms in x(t) - Σ_{n=-N}^{N} Xn e^{j2πnt/T} = f(z) that do not depend on z by a + jb, and e^{j2πmt/T} = e^{jθ}; then we have to show that
∂F(z)/∂z* = ∂|f(z)|²/∂z* = 2 e^{-jθ} f(z).
In our case,
|f(z)|² = |a + jb + e^{jθ}(x + jy)|²,
with
∂|f(z)|²/∂x = 2 Re{e^{-jθ} f(z)}   (1.14)
and
∂|f(z)|²/∂y = 2 Im{e^{-jθ} f(z)}.   (1.15)
Therefore, all calculations with the two real-valued equations (1.14) and (1.15) are the same as using one complex-valued relation
∂F(z)/∂z* = ∂|f(z)|²/∂x + j ∂|f(z)|²/∂y = 2 e^{-jθ} f(z).
Since the signal and the basis functions are periodic with period T, in all previous integrals we can use
(1/T) ∫_{-T/2}^{T/2} x(t) e^{-j2πnt/T} dt = (1/T) ∫_{-T/2+λ}^{T/2+λ} x(t) e^{-j2πnt/T} dt   (1.16)
for any λ.
The signal x(t) = cos²(πt/4) can be written as
x(t) = (1/4) e^{-jπt/2} + 1/2 + (1/4) e^{jπt/2}
= (1/4) e^{-jπ2t/4} + 1/2 + (1/4) e^{jπ2t/4}.   (1.17)
Thus, comparing the signal definition with the basis functions e^{jnπt/4}, we may write X_{-2} = 1/4, X_0 = 1/2, and X_2 = 1/4. Other coefficients are equal to zero.
Example 1.5. Calculate the Fourier series coefficients of a periodic signal x(t) defined as
x(t) = Σ_{n=-∞}^{∞} x0(t + 2n),   (1.18)
with x0(t) = 1 for |t| ≤ 1/4 and x0(t) = 0 elsewhere.

The period is T = 2, so that
Xn = (1/2) ∫_{-1/4}^{1/4} 1 · e^{-j2πnt/2} dt = sin(nπ/4)/(nπ),   (1.19)
with X0 = 1/4. The signal reconstructed with 2M+1 coefficients is
xM(t) = Σ_{n=-M}^{M} Xn e^{jnπt}.
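The coefficients (1.19) and the partial-sum reconstruction can be checked numerically. A minimal sketch, assuming NumPy; the pulse width 1/2 and period T = 2 follow Example 1.5:

```python
import numpy as np

T = 2.0
t = np.linspace(-T / 2, T / 2, 40000, endpoint=False)
dt = t[1] - t[0]
x0 = (np.abs(t) <= 0.25).astype(float)      # one period of the pulse

def Xn_numeric(n):
    # direct numerical evaluation of (1.12)
    return np.sum(x0 * np.exp(-1j * 2 * np.pi * n * t / T)) * dt / T

def Xn_closed(n):
    # closed form (1.19): sin(n*pi/4)/(n*pi), with X0 = 1/4
    return 0.25 if n == 0 else np.sin(n * np.pi / 4) / (n * np.pi)

print(abs(Xn_numeric(3) - Xn_closed(3)))    # ~0

# reconstruction with 2M+1 terms; increasing M shows the convergence
# (and the Gibbs ringing at the discontinuities) seen in Fig. 1.3
M = 30
xM = sum(Xn_closed(n) * np.exp(1j * 2 * np.pi * n * t / T)
         for n in range(-M, M + 1)).real
```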
Figure 1.2 The periodic signal x(t), equal to 1 for |t| ≤ 1/4 within the period T = 2, and its Fourier series coefficients Xn.

Figure 1.3 Illustration of signal reconstruction by using a finite Fourier series with: (a) coefficients Xn within -1 ≤ n ≤ 1, (b) coefficients Xn within -2 ≤ n ≤ 2, (c) coefficients Xn within -6 ≤ n ≤ 6, and (d) coefficients Xn within -30 ≤ n ≤ 30.
The Fourier series coefficients can be written as
Xn = (1/T) ∫_{-T/2}^{T/2} x(t) cos(2πnt/T) dt - j (1/T) ∫_{-T/2}^{T/2} x(t) sin(2πnt/T) dt = (An - jBn)/2.   (1.20)
The coefficients An and Bn are
An = Xn + X_{-n} = (2/T) ∫_{-T/2}^{T/2} x(t) cos(2πnt/T) dt,
Bn = j(Xn - X_{-n}) = (2/T) ∫_{-T/2}^{T/2} x(t) sin(2πnt/T) dt.   (1.21)
The Fourier series of a real-valued signal can then be written as
x(t) = Σ_{n=-∞}^{∞} Xn e^{j2πnt/T} = X0 + Σ_{n=1}^{∞} X_{-n} e^{-j2πnt/T} + Σ_{n=1}^{∞} Xn e^{j2πnt/T}   (1.22)
= A0/2 + Σ_{n=1}^{∞} [ An cos(2πnt/T) + Bn sin(2πnt/T) ],   (1.23)
with |Xn| = sqrt(An² + Bn²)/2. For real-valued signals the integrals in (1.21), corresponding to An and Bn, are even and odd functions with respect to n. Therefore, it is possible to calculate
Hn = (1/T) ∫_{-T/2}^{T/2} x(t) [ cos(2πnt/T) + sin(2πnt/T) ] dt   (1.24)
and to get
An = Hn + H_{-n}
Bn = Hn - H_{-n}.
The coefficients calculated by (1.24) are the Hartley series coefficients. For a real-valued and even signal, x(t) = x(-t), this transform reduces to the cosine series with coefficients
Cn = (1/T) ∫_{-T/2}^{T/2} x(t) cos(2πnt/T) dt.
Example 1.6. Consider the signal x(t) = t for 0 ≤ t < 1/2. Find the Fourier series coefficients and the reconstruction formula when this signal is: (a) periodically extended with period T = 1/2; (b) extended with zero values within 1/2 ≤ t < 1 and then periodically extended with period T = 1; (c) extended with its reversed version and then periodically extended with period T = 1; (d) comment on the convergence of the coefficients in all cases.

(a) For the period T = 1/2 the coefficients are
Xn = (1/(1/2)) ∫_{0}^{1/2} t e^{-j2πnt/(1/2)} dt = -1/(j4πn), n ≠ 0,
with X0 = 1/4, so that the reconstruction with M terms is
xM(t) = 1/4 - Σ_{n=1}^{M} sin(4πnt)/(2πn).
(b) For the period T = 1 the coefficients are
Xn = ∫_{0}^{1/2} t e^{-j2πnt} dt = t e^{-j2πnt}/(-j2πn) |_{0}^{1/2} + (1/(j2πn)) ∫_{0}^{1/2} e^{-j2πnt} dt
= -(-1)^n/(j4πn) + ((-1)^n - 1)/(2πn)²,
Figure 1.4 Reconstruction of a signal using the Fourier series. The reconstructed signal is denoted by xM(t), where M indicates the number of coefficients used in the reconstruction.
with X0 = 1/8. Note that the relation between the Fourier coefficients in (a) and (b) is 2X_{2n}^{(b)} = X_n^{(a)}. The reconstruction is presented in Fig. 1.5.
(c) For the signal xc(t), extended with its reversed version, follows
Cn = ∫_{0}^{1/2} t e^{-j2πnt} dt + ∫_{1/2}^{1} (1 - t) e^{-j2πnt} dt = 2 ∫_{0}^{1/2} t cos(2πnt) dt = ((-1)^n - 1)/(2π²n²),
with C0 = 1/4. The reconstruction is
xM(t) = 1/4 + Σ_{n=1}^{M} (((-1)^n - 1)/(π²n²)) cos(2πnt) = 1/4 - (2/π²) Σ_{n=1}^{M} cos(2π(2n-1)t)/(2n-1)².
32
0.6
0.6
0.4
0.4
x2(t)
x1(t)
0.2
0
-0.5
0
t
0.5
0
-1
0.6
0.6
0.4
0.4
x30(t)
x6(t)
-1
(a)
0.2
0.2
0
-1
(c)
-0.5
0
t
0.5
(b)
-0.5
0
t
0.5
-0.5
0
t
0.5
0.2
0
-1
(d)
0.6
0.6
0.4
0.4
x2(t)
x1(t)
Figure 1.5 Reconstruction of a periodic signal, with a zero interval extension before using the
Fourier series.
Figure 1.6 Reconstruction of a periodic signal after an even extension before using the Fourier series (cosine Fourier series).
A system transforms the input signal x(t) into the output signal
y(t) = T{x(t)}.   (1.25)
A system is linear if, for any two signals x1(t) and x2(t) and arbitrary constants a1 and a2, it holds
y(t) = T{a1 x1(t) + a2 x2(t)} = a1 T{x1(t)} + a2 T{x2(t)}.   (1.26)
A system is time-invariant if
T{x(t - t0)} = y(t - t0),   (1.27)
for any t0.
Linear time-invariant (LTI) systems are fully described by their response to the impulse signal. If we know the impulse response of such a system,
h(t) = T{δ(t)},
then for an arbitrary signal x(t) at the input, the output can be calculated, by using (1.3), as
y(t) = T{x(t)} = T{ ∫_{-∞}^{∞} x(τ)δ(t-τ) dτ }
= [linearity] = ∫_{-∞}^{∞} x(τ) T{δ(t-τ)} dτ = [time-invariance] = ∫_{-∞}^{∞} x(τ) h(t-τ) dτ.   (1.28)
This convolution in time is denoted by y(t) = x(t) * h(t).   (1.29)
Example 1.7. Find the convolution of the signals x(t) = u(t+1) - u(t-1) and h(t) = e^{-t}u(t).

y(t) = ∫_{-∞}^{∞} x(τ)h(t-τ) dτ = ∫_{-1}^{1} 1 · e^{-(t-τ)} u(t-τ) dτ = ∫_{t-1}^{t+1} e^{-λ} u(λ) dλ
= ∫_{t-1}^{t+1} e^{-λ} dλ = e^{-t}(e - 1/e), for t ≥ 1,
= ∫_{0}^{t+1} e^{-λ} dλ = 1 - e^{-(t+1)}, for -1 ≤ t < 1,
= 0, for t < -1.
A system is stable if any bounded input, |x(t)| ≤ Mx < ∞, produces a bounded output. A linear time-invariant system is stable if its impulse response is absolutely integrable,
∫_{-∞}^{∞} |h(τ)| dτ < ∞,   (1.30)
since
|y(t)| = | ∫_{-∞}^{∞} x(t-τ)h(τ) dτ | ≤ ∫_{-∞}^{∞} |x(t-τ)||h(τ)| dτ ≤ Mx ∫_{-∞}^{∞} |h(τ)| dτ < ∞
if (1.30) holds.
It can be shown that the absolute integrability of the impulse response is the necessary condition for a linear time-invariant system to be stable as well.
1.3 FOURIER TRANSFORM
The Fourier series has been introduced and presented for periodic signals, with a period T. Assume now that we extend the period to infinity, while not changing the signal. This case corresponds to the analysis of an aperiodic signal x(t). Its transform, the Fourier series coefficients normalized by the period, is given by
lim_{T→∞} Xn T = lim_{T→∞} ∫_{-T/2}^{T/2} x(t) e^{-j2πnt/T} dt = ∫_{-∞}^{∞} x(t) e^{-jΩt} dt.   (1.31)
The resulting transform,
X(Ω) = ∫_{-∞}^{∞} x(t) e^{-jΩt} dt,   (1.32)
is called the Fourier transform (FT) of a signal x(t). For the existence of the Fourier transform it is sufficient that the signal is absolutely integrable. There are some signals that do not satisfy this condition, whose Fourier transform exists in the form of generalized functions, such as the delta function.
The inverse Fourier transform (IFT) can be obtained by multiplying both sides of (1.32) by e^{jΩτ} and integrating over Ω,
∫_{-∞}^{∞} X(Ω) e^{jΩτ} dΩ = ∫_{-∞}^{∞} ∫_{-∞}^{∞} x(t) e^{jΩ(τ-t)} dt dΩ.
Using ∫_{-∞}^{∞} e^{jΩ(τ-t)} dΩ = 2πδ(τ-t), the inversion formula follows as
x(t) = (1/2π) ∫_{-∞}^{∞} X(Ω) e^{jΩt} dΩ.   (1.33)
Example 1.8. Calculate the Fourier transform of x(t) = Ae^{-at}u(t), a > 0.

According to the Fourier transform definition we get
X(Ω) = ∫_{0}^{∞} A e^{-at} e^{-jΩt} dt = A/(a + jΩ).

Example 1.9. Find the Fourier transform of the sign function
x(t) = sign(t) = 1 for t > 0, 0 for t = 0, -1 for t < 0.   (1.34)

Consider the signal
xa(t) = e^{-at} for t > 0 and xa(t) = -e^{at} for t < 0,   (1.35)
where a > 0 is a real-valued constant. It is obvious that
lim_{a→0} xa(t) = x(t),
so that X(Ω) = lim_{a→0} Xa(Ω), where
Xa(Ω) = -∫_{-∞}^{0} e^{at} e^{-jΩt} dt + ∫_{0}^{∞} e^{-at} e^{-jΩt} dt = -j2Ω/(a² + Ω²).   (1.36)
It results in
X(Ω) = 2/(jΩ).   (1.37)
Example 1.10. Find the Fourier transform of δ(t), x(t) = 1, and u(t).

The Fourier transform of δ(t) is
FT{δ(t)} = ∫_{-∞}^{∞} δ(t) e^{-jΩt} dt = 1.   (1.38)
By the symmetry of the transformation pair, FT{1} = 2πδ(Ω). Since the unit-step signal can be written as
u(t) = (sign(t) + 1)/2,
its Fourier transform is
FT{u(t)} = 1/(jΩ) + πδ(Ω).   (1.39)
For an input signal x(t) = Ae^{j(Ω0 t + φ)} to a linear time-invariant system, the output is
y(t) = ∫_{-∞}^{∞} A e^{j(Ω0(t-τ) + φ)} h(τ) dτ = A e^{j(Ω0 t + φ)} ∫_{-∞}^{∞} h(τ) e^{-jΩ0 τ} dτ = H(Ω0) x(t),   (1.40)
where
H(Ω) = ∫_{-∞}^{∞} h(t) e^{-jΩt} dt   (1.41)
is the Fourier transform of h(t). The linear time-invariant system does not change the form of an input complex harmonic signal Ae^{j(Ω0 t + φ)}. It remains a complex harmonic signal after passing through the system, with the same frequency Ω0. The amplitude of the input signal x(t) is scaled by |H(Ω0)| and the phase is shifted by arg{H(Ω0)}.
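Relation (1.40) can be illustrated numerically for the system of Example 1.8 with A = 1, a = 1, where H(Ω) = 1/(1 + jΩ). A minimal sketch, assuming NumPy; the frequency, evaluation instant, and truncation of h(t) are our choices:

```python
import numpy as np

dt = 0.001
tau = np.arange(0, 20, dt)          # truncated support of h(t) = e^{-t}u(t)
h = np.exp(-tau)
Om0 = 3.0
t0 = 5.0                            # evaluate the output at one instant

# y(t0) = integral of h(tau) * exp(j*Om0*(t0 - tau)) d tau
y = np.sum(h * np.exp(1j * Om0 * (t0 - tau))) * dt
H = 1 / (1 + 1j * Om0)              # closed-form H(Omega) at Omega = Om0
print(abs(y - H * np.exp(1j * Om0 * t0)))   # ~0: y(t) = H(Om0) x(t)
```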
1.3.2 Properties of the Fourier Transform
The Fourier transform satisfies the following properties:
1. Linearity
FT{a1 x1(t) + a2 x2(t)} = a1 X1(Ω) + a2 X2(Ω).   (1.42)
2. Time reversal
FT{x(-t)} = ∫_{-∞}^{∞} x(-t) e^{-jΩt} dt = ∫_{-∞}^{∞} x(t) e^{jΩt} dt = X(-Ω) = X*(Ω),   (1.43)
if x(t) = x*(t).
3. Modulation
FT{x(t) e^{jΩ0 t}} = ∫_{-∞}^{∞} x(t) e^{jΩ0 t} e^{-jΩt} dt = X(Ω - Ω0).   (1.44)
4. Time shift
FT{x(t - t0)} = ∫_{-∞}^{∞} x(t - t0) e^{-jΩt} dt = X(Ω) e^{-jΩt0}.   (1.45)
5. Time-scaling
FT{x(at)} = ∫_{-∞}^{∞} x(at) e^{-jΩt} dt = (1/|a|) X(Ω/a).   (1.46)
6. Convolution
FT{x(t) * h(t)} = ∫_{-∞}^{∞} ∫_{-∞}^{∞} x(τ) h(t-τ) e^{-jΩt} dτ dt = X(Ω) H(Ω).   (1.47)
7. Multiplication
FT{x(t)h(t)} = (1/2π) ∫_{-∞}^{∞} x(t) ∫_{-∞}^{∞} H(θ) e^{jθt} dθ e^{-jΩt} dt
= (1/2π) ∫_{-∞}^{∞} H(θ) X(Ω - θ) dθ = (1/2π) X(Ω) * H(Ω) = (1/2π) H(Ω) * X(Ω).   (1.48)
8. Parseval's theorem
∫_{-∞}^{∞} x(t) y*(t) dt = (1/2π) ∫_{-∞}^{∞} X(Ω) Y*(Ω) dΩ.   (1.49)
For y(t) = x(t) it reduces to ∫_{-∞}^{∞} |x(t)|² dt = (1/2π) ∫_{-∞}^{∞} |X(Ω)|² dΩ.
9. Differentiation
FT{dx(t)/dt} = FT{ (d/dt) (1/2π) ∫_{-∞}^{∞} X(Ω) e^{jΩt} dΩ } = jΩ X(Ω).   (1.50)
10. Integration
The Fourier transform of ∫_{-∞}^{t} x(τ) dτ follows from
∫_{-∞}^{t} x(τ) dτ = ∫_{-∞}^{∞} x(τ) u(t-τ) dτ = x(t) * u(t),
so that
FT{ ∫_{-∞}^{t} x(τ) dτ } = FT{x(t)} FT{u(t)} = X(Ω) [ 1/(jΩ) + πδ(Ω) ] = X(Ω)/(jΩ) + πX(0)δ(Ω).   (1.51)
The Fourier transform of the analytic part of a signal x(t) is defined by
Xa(Ω) = 2X(Ω) for Ω > 0, X(0) for Ω = 0, and 0 for Ω < 0.   (1.52)
It can be written as
Xa(Ω) = X(Ω) + X(Ω) sign(Ω) = X(Ω) + jXh(Ω),   (1.53)
where Xh(Ω) is the Fourier transform of the Hilbert transform of the signal,
xh(t) = (1/π) p.v. ∫_{-∞}^{∞} x(τ)/(t - τ) dτ,   (1.54)
where p.v. stands for the Cauchy principal value of the considered integral.
1.3.3 Relationship Between the Fourier Series and the Fourier Transform
Consider an aperiodic signal x(t), with the Fourier transform X(Ω). Assume that the signal is of limited duration (i.e., x(t) = 0 for |t| > T0/2). Then,
X(Ω) = ∫_{-T0/2}^{T0/2} x(t) e^{-jΩt} dt.   (1.55)
Let this signal be periodically extended, forming xp(t) = Σ_{n=-∞}^{∞} x(t + nT), with T > T0. The periodic signal xp(t) can be expanded into a Fourier series with the coefficients
Xn = (1/T) ∫_{-T/2}^{T/2} xp(t) e^{-j2πnt/T} dt = (1/T) ∫_{-T0/2}^{T0/2} x(t) e^{-jΩt} dt |_{Ω=2πn/T}   (1.56)
or
Xn = (1/T) X(Ω)|_{Ω=2πn/T}.   (1.57)
It means that the Fourier series coefficients are the samples of the Fourier transform, divided by T. The only condition in the derivation of this relation is that the signal duration is shorter than the period of the periodic extension (i.e., T > T0). The sampling interval in frequency is
ΔΩ = 2π/T, ΔΩ < 2π/T0.
It should be smaller than 2π/T0, where T0 is the duration of the signal x(t). This is a form of the sampling theorem in the frequency domain. The sampling theorem in the time domain will be discussed later.
In order to write the Fourier series coefficients in the Fourier transform form, note that a periodic signal xp(t), formed by a periodic extension of x(t) with period T, can be written as
xp(t) = Σ_{n=-∞}^{∞} x(t + nT) = x(t) *_t Σ_{n=-∞}^{∞} δ(t + nT).   (1.58)
Its Fourier transform is
Xp(Ω) = X(Ω) (2π/T) Σ_{n=-∞}^{∞} δ(Ω - 2πn/T) = (2π/T) Σ_{n=-∞}^{∞} X(2πn/T) δ(Ω - 2πn/T),   (1.59)
since
FT{ Σ_{n=-∞}^{∞} δ(t + nT) } = Σ_{n=-∞}^{∞} ∫_{-∞}^{∞} δ(t + nT) e^{-jΩt} dt = Σ_{n=-∞}^{∞} e^{jnΩT} = (2π/T) Σ_{n=-∞}^{∞} δ(Ω - 2πn/T).   (1.60)
When a signal is given in the form
x(t) = A(t) e^{jφ(t)},   (1.61)
its Fourier transform X(Ω) = ∫ A(t) e^{j(φ(t) - Ωt)} dt can be estimated by the method of stationary phase. The stationary phase point t0 is defined by
φ'(t0) = Ω, with φ''(t0) ≠ 0.
Around this point the phase can be expanded into a Taylor series as
φ(t) - Ωt = [φ(t0) - Ωt0] + [φ'(t0) - Ω](t - t0) + (1/2)φ''(t0)(t - t0)² + ...
The Fourier transform is then approximated by
X(Ω) = ∫_{-∞}^{∞} A(t) e^{j(φ(t) - Ωt)} dt ≈ A(t0) e^{j(φ(t0) - Ωt0)} ∫_{-∞}^{∞} e^{j(1/2)φ''(t0)(t - t0)²} dt = A(t0) e^{j(φ(t0) - Ωt0)} sqrt(2πj/φ''(t0)),
where A(t) ≈ A(t0) is also used, along with
∫_{-∞}^{∞} e^{j(1/2)at²} dt = sqrt(2πj/|a|).
For example, for a signal of the form (1.61) with phase φ(t) = 4t², the stationary phase point is t0 = Ω/8, with φ''(t0) = 8. The amplitude of X(Ω) is then   (1.63)-(1.64)
|X(Ω)| ≈ A(t0) |sqrt(2πj/φ''(t0))| = A(Ω/8) sqrt(2π/8).
The signal, the stationary phase approximation of the Fourier transform, and the numerical value of the Fourier transform amplitude are shown in Fig. 1.7.
44
x(t)
1
0
-1
-4
-3
-2
-1
0
t
|X()|
0
-100
-80
-60
-40
-20
20
40
60
80
100
-20
20
40
60
80
100
|X()|
Numeric calculation
0.5
0
-100
-80
-60
-40
Figure 1.7 The signal (top), along with the stationary phase method approximation of its
Fourier transform and the Fourier transform obtained by a numeric calculation (bottom).
For a signal x(t) = A(t) e^{jat^{2N}}, the stationary phase point, for a given Ω, is
t0 = (Ω/(2Na))^{1/(2N-1)}
and
φ''(t0) = 2N(2N-1) a (Ω/(2Na))^{(2N-2)/(2N-1)}.   (1.65)
The squared amplitude of the Fourier transform is
|X(Ω)|² ≈ A²(t0) 2π / | 2N(2N-1) a (Ω/(2Na))^{(2N-2)/(2N-1)} |,
while its phase is
arg{X(Ω)} ≈ φ(t0) - Ωt0 + π/4 = ((1-2N)/(2N)) Ω (Ω/(2aN))^{1/(2N-1)} + π/4
for a large value of a.
For N = 1 and A(t) = 1, we get |X(Ω)|² = |π/a| and arg{X(Ω)} = -Ω²/(4a) + π/4.
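The N = 1 case can be checked by direct numerical integration: for x(t) = e^{jat²} the stationary phase method predicts |X(Ω)|² = π/a for every Ω. A minimal sketch, assuming NumPy; the finite integration limits and step size only approximate the infinite integral:

```python
import numpy as np

a = 10.0
dt = 1e-4
t = np.arange(-30.0, 30.0, dt)
phase = a * t**2

mags = {}
for Om in (5.0, 20.0):
    X = np.sum(np.exp(1j * (phase - Om * t))) * dt   # X(Omega), truncated
    mags[Om] = abs(X) ** 2

print(mags, np.pi / a)   # both values ~pi/a = 0.314..., independent of Omega
```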
For the inverse Fourier transform of B(Ω)e^{jψ(Ω)},   (1.67)
the method of stationary phase states that if the Fourier transform phase function ψ(Ω) is monotonous and the amplitude B(Ω) is a sufficiently smooth function, then
x(t) = (1/2π) ∫_{-∞}^{∞} B(Ω) e^{jψ(Ω)} e^{jΩt} dΩ ≈ B(Ω0) e^{jψ(Ω0) + jΩ0 t} sqrt( j / (2π|ψ''(Ω0)|) ),   (1.68)
where the stationary phase point Ω0 follows from
ψ'(Ω0) = -t,
and
tg = -ψ'(Ω)
is the group delay.
46
tb
a
(0 ) = a.
<
j
2 | (0 )|
=
*
+
t b 2 j((tb)2 /(2a)+/4)
1
= exp(
)e
.
a
2a
h(t))
2
exp(20 )e ja0 /2 jb0 + j0 t
Example 1.14. For a system with frequency response H(Ω) = |H(Ω)| e^{jφ0(Ω)} the impulse response is h(t). Find the impulse responses of the systems with the transfer functions shown in Fig. 1.8:
(a) Ha(Ω) = |H(Ω)| e^{jφ0(Ω)} e^{-j4Ω},
(b) Hb(Ω) = |H(Ω)| e^{jφ0(Ω)} e^{-j2Ω²}, and
(c) Hc(Ω) = Hb(Ω) (3/4 + (1/4)cos(2Ω)).

(a) The additional factor e^{-j4Ω} is a linear phase term, so ha(t) = h(t - 4).
(b) The impulse response is
hb(t) = (1/2π) ∫_{-∞}^{∞} H(Ω) e^{-j2Ω²} e^{jΩt} dΩ.
Using the quadratic-phase stationary point Ω0 = t/4, with the phase term (t²/8 + π/4), the amplitude is
|hb(t)| ≈ |H(t/4)| / sqrt(8π).
Figure 1.8 Frequency response of systems (amplitude, top row, and phase, middle row) with corresponding impulse responses (amplitude, bottom row).
(c) Using 3/4 + (1/4)cos(2Ω) = 3/4 + (1/8)e^{j2Ω} + (1/8)e^{-j2Ω}, the impulse response is
hc(t) = (1/8) hb(t + 2) + (1/8) hb(t - 2) + (3/4) hb(t).
1.5 LAPLACE TRANSFORM

The Laplace transform of a signal x(t) is defined by
X(s) = ∫_{-∞}^{∞} x(t) e^{-st} dt,   (1.69)
where s = σ + jΩ is a complex variable. For example, for x(t) = e^{-at}u(t),
X(s) = ∫_{0}^{∞} e^{-at} e^{-st} dt = -e^{-(s+a)t}/(s + a) |_{0}^{∞} = 1/(s + a)
if Re{s + a} > 0, which defines the region of convergence.
For s = σ + jΩ, the Laplace transform is the Fourier transform of the signal x(t)e^{-σt},
FT{x(t)e^{-σt}} = ∫_{-∞}^{∞} x(t) e^{-σt} e^{-jΩt} dt = ∫_{-∞}^{∞} x(t) e^{-st} dt = X(s).   (1.70)
The inverse Laplace transform is
x(t) = (1/2πj) lim_{T→∞} ∫_{σ-jT}^{σ+jT} X(s) e^{st} ds,
where the integration is performed along a path in the region of convergence of X(s).
Example 1.16. Consider a signal x(t) such that x(t) = 0 for |t| > T (time-limited signal). Its Fourier transform is X(Ω). Derive the relation to calculate the Laplace transform X(s), for any s within the region of convergence, based on the values of X(Ω).

Based on X(Ω), the signal values are x(t) = (1/2π) ∫_{-∞}^{∞} X(Ω) e^{jΩt} dΩ. The Laplace transform is
X(s) = ∫_{-T}^{T} (1/2π) ∫_{-∞}^{∞} X(Ω) e^{jΩt} dΩ e^{-st} dt
= (1/2π) ∫_{-∞}^{∞} X(Ω) ∫_{-T}^{T} e^{-st+jΩt} dt dΩ = (1/π) ∫_{-∞}^{∞} X(Ω) sinh((jΩ - s)T)/(jΩ - s) dΩ.   (1.71)
50
dx (t) st
e dt = s
dt
"
x (t)est dt = sX (s).
This relation follows by integration in part of the first integral, with the
assumption that the values of x (t)est are zero as t .
In many applications it has been assumed that the systems are causal
with corresponding causal signals used in calculations. In these cases x (t) =
0 for t < 0, i.e., x (t) = x (t)u(t). Then the so called one-sided Laplace transform (unilateral Laplace transform) is used. Its definition is
X (s) =
"
x (t)est dt.
dx (t) st
e dt = x (t)est |0 + s
dt
"
0
Ljubia Stankovic
51
dt
The integration property of the one-sided transform is
L{ ∫_{0}^{t} x(τ) dτ } = L{u(t) *_t x(t)} = (1/s) X(s),
since the convolution with the unit-step signal corresponds to the integration of x(t).
Consider a system described by the differential equation
a_N d^N y(t)/dt^N + ... + a_1 dy(t)/dt + a_0 y(t) = b_M d^M x(t)/dt^M + ... + b_1 dx(t)/dt + b_0 x(t)
with zero initial conditions, y(0) = y'(0) = ... = y^{(N-1)}(0) = 0 and x(0) = x'(0) = ... = x^{(M-1)}(0) = 0. The Laplace transform of both sides of this differential equation is
a_N s^N Y(s) + ... + a_1 sY(s) + a_0 Y(s) = b_M s^M X(s) + ... + b_1 sX(s) + b_0 X(s).
The transfer function of this system is of the form
H(s) = Y(s)/X(s) = (b_M s^M + ... + b_1 s + b_0) / (a_N s^N + ... + a_1 s + a_0).
52
s+5
A1
A2
A3
=
.
+
+
s+4
s+1
s+2
(s + 4)(s2 + 3s + 2)
For example,
A1 = ( s + 4)
s+5
= 1/6.
(s + 4)(s2 + 3s + 2) |s=4
1 4t
3
4
e u(t) e2t u(t) + et u(t).
6
2
3
A short table of one-sided Laplace transform pairs and properties:

Signal x(t)                Laplace transform X(s)
t u(t)                     1/s²
e^{at} cos(Ω0 t) u(t)      (s - a)/((s - a)² + Ω0²)
e^{at} sin(Ω0 t) u(t)      Ω0/((s - a)² + Ω0²)
t e^{at} u(t)              1/(s - a)²
dx(t)/dt                   sX(s) - x(0)
-t x(t)                    dX(s)/ds
x(t)/t                     ∫_{s}^{∞} X(λ) dλ
e^{at} x(t)                X(s - a)
∫_{0}^{t} x(τ) dτ          X(s)/s
1.6 BUTTERWORTH FILTER

The most common processing systems in communications and signal processing are filters, used to selectively pass a part of the input signal in the frequency domain and to reduce possible interferences. The basic form is a lowpass filter. Here we will present a simple Butterworth lowpass filter.

Figure 1.9 The squared frequency response |H(jΩ)|² of the Butterworth lowpass filter for N = 2, N = 4, and N = 32.
The squared frequency response of the Butterworth lowpass filter is
|H(jΩ)|² = 1 / (1 + (Ω/Ωc)^{2N}).
Since |H(jΩ)|² = H(jΩ)H(-jΩ), the poles follow from
H(s)H(-s) = 1 / (1 + (s/(jΩc))^{2N}) for s = jΩ.
Figure 1.10 The poles of H(s)H(-s) in the complex s-plane for N = 3, N = 4, and N = 5.

The poles of H(s)H(-s) follow from
(sk/(jΩc))^{2N} = -1 = e^{j(2k+1)π}
as
sk = Ωc e^{jθk}, θk = (2k+1)π/(2N) + π/2, for k = 0, 1, 2, ..., 2N-1.
For a given filter order N and frequency Ωc the only remaining decision is to select the half of the poles sk that belong to H(s) and to declare that the remaining half of the poles belong to H(-s). Since we want the filter to be stable, we choose the poles
s0, s1, ..., s_{N-1}
within the left side of the s-plane, where Re{s} < 0, i.e., π/2 < θk < 3π/2. The symmetric poles with Re{s} > 0 are the poles of H(-s). They are not used in the filter design.
Example 1.18. Design a lowpass Butterworth filter with:
(a) N = 3 and Ωc = 1,
(b) N = 4 and Ωc = 3.

(a) The pole angles are
θk = (2k+1)π/6 + π/2, for k = 0, 1, 2,
so that the poles of H(s) are
s0 = cos(2π/3) + j sin(2π/3) = -1/2 + j sqrt(3)/2
s1 = cos(π) + j sin(π) = -1
s2 = cos(4π/3) + j sin(4π/3) = -1/2 - j sqrt(3)/2,
with
H(s) = 1 / ((s + 1/2 - j sqrt(3)/2)(s + 1/2 + j sqrt(3)/2)(s + 1)) = 1 / ((s² + s + 1)(s + 1)).
(b) The pole angles are
θk = (2k+1)π/8 + π/2, for k = 0, 1, 2, 3,
so that
s0 = 3 cos(π/8 + π/2) + j3 sin(π/8 + π/2)
s1 = 3 cos(3π/8 + π/2) + j3 sin(3π/8 + π/2)
s2 = 3 cos(5π/8 + π/2) + j3 sin(5π/8 + π/2)
s3 = 3 cos(7π/8 + π/2) + j3 sin(7π/8 + π/2),
with
H(s) = Ωc⁴ / ((s² + 2.296s + 9)(s² + 5.543s + 9)) = 81 / ((s² + 2.296s + 9)(s² + 5.543s + 9)).
In practice we usually do not know the filter order, but rather the passband frequency Ωp and the stopband frequency Ωs, with a maximal attenuation in the passband ap [dB] and a minimal attenuation in the stopband as [dB], as shown in Fig. 1.11. Based on these values we can calculate the order N and the critical frequency Ωc needed for the filter design.
Figure 1.11 The squared frequency response specification of a lowpass filter, with amplitude Ap at the passband edge Ωp and amplitude As at the stopband edge Ωs.

The design conditions are
1 + (Ωp/Ωc)^{2N} ≤ 1/Ap²
1 + (Ωs/Ωc)^{2N} ≥ 1/As²,   (1.72)
producing the filter order
N ≥ ln[ (1/Ap² - 1)/(1/As² - 1) ] / (2(ln Ωp - ln Ωs)).
The nearest greater integer is assumed for the filter order N. Then we can use either of the relations in (1.72), with the equality sign, to calculate Ωc. If we choose the first one, then Ωc will satisfy |H(jΩp)|² = Ap², while if we use the second relation, the value of Ωc will satisfy |H(jΩs)|² = As². These two values differ. However, both of them are within the defined criteria for the transfer function.
The relation
a = -20 log A, i.e., A = 10^{-a/20},
should be used for the attenuation given in [dB].
All other filter forms, like bandpass and highpass, may be obtained from a lowpass filter with appropriate signal modulations. These modulations will be discussed for discrete-time filter forms in Chapter 5.
Chapter 4
z-Transform
The Fourier transform of discrete signals and the DFT are used for direct signal processing and calculations. A transform that generalizes these transforms, in the same way as the Laplace transform generalizes the Fourier transform of continuous signals, is the z-transform. This transform provides an efficient tool for the qualitative analysis and design of discrete systems.
4.1 DEFINITION OF THE Z-TRANSFORM

The z-transform of a discrete-time signal x(n) is defined by
X(z) = Σ_{n=-∞}^{∞} x(n) z^{-n},   (4.1)
where z is a complex variable. The region of the complex z-plane where this sum converges is the region of convergence.
Figure 4.1 Regions of convergence of the z-transform in the complex z-plane.
For a causal signal x(n) = a^n u(n) + b^n u(n), with |b| > |a|, the z-transform is
X(z) = Σ_{n=0}^{∞} a^n z^{-n} + Σ_{n=0}^{∞} b^n z^{-n} = 1/(1 - a/z) + 1/(1 - b/z) = z/(z - a) + z/(z - b),
with the region of convergence |z| > |b|.

Example. Consider the signal x(n) = a^n u(n-1) - b^n u(-n-1) + 2δ(n+2), where a and b are complex numbers, |b| > |a|. Find the z-transform of x(n) and its region of convergence.

The z-transform is
X(z) = Σ_{n=1}^{∞} a^n z^{-n} - Σ_{n=-∞}^{-1} b^n z^{-n} + 2z²
= (a/z)/(1 - a/z) - (z/b)/(1 - z/b) + 2z² = a/(z - a) + z/(z - b) + 2z².
The region of convergence is |a| < |z| < |b|.
Figure 4.2 Regions of convergence of the z-transform for right-sided, left-sided, and two-sided signals.

4.2 PROPERTIES OF THE Z-TRANSFORM
4.2.1 Linearity
The z-transform is linear since
Z{ax(n) + by(n)} = Σ_{n=-∞}^{∞} [ax(n) + by(n)] z^{-n} = aX(z) + bY(z),
with the region of convergence being at least the intersection of the regions
of convergence of X (z) and Y (z). In special cases the region can be larger
than the intersection of the regions of convergence of X (z) and Y (z) if
some poles, defining the region of convergence, cancel out in the linear
combination of transforms.
4.2.2 Time-Shift
For a shifted signal x(n - n0) the z-transform is
Z{x(n - n0)} = Σ_{n=-∞}^{∞} x(n - n0) z^{-n} = Σ_{n=-∞}^{∞} x(n) z^{-(n + n0)} = X(z) z^{-n0}.
For a causal signal shifted ahead, with n0 > 0,
Z{x(n + n0)u(n)} = Σ_{n=0}^{∞} x(n + n0) z^{-n} = z^{n0} Σ_{n=0}^{∞} x(n + n0) z^{-(n + n0)}
= z^{n0} [ X(z) - x(0) - x(1)z^{-1} - ... - x(n0 - 1)z^{-n0+1} ].
For n0 = 1 follows Z{x(n+1)u(n)} = zX(z) - zx(0). Note that for this signal x(n + n0)u(n) ≠ x(n + n0)u(n + n0).
4.2.3 Multiplication by an Exponential Signal: Modulation

Z{a^n x(n)} = Σ_{n=-∞}^{∞} x(n) a^n z^{-n} = X(z/a),
Z{e^{jω0 n} x(n)} = Σ_{n=-∞}^{∞} x(n) e^{jω0 n} z^{-n} = X(z e^{-jω0}).
4.2.4 Differentiation

For a causal signal, X(z) = Σ_{n=0}^{∞} x(n) z^{-n} and
dX(z)/dz = -Σ_{n=0}^{∞} n x(n) z^{-n-1},
so that
Z{n x(n) u(n)} = -z dX(z)/dz.
Repeated differentiation relates higher-order terms to d^N X(z)/dz^N.
4.2.5 Convolution in Time

Z{x(n) * y(n)} = Z{ Σ_{m=-∞}^{∞} x(m) y(n - m) } = Σ_{n=-∞}^{∞} Σ_{m=-∞}^{∞} x(m) y(n - m) z^{-n}
= Σ_{m=-∞}^{∞} Σ_{l=-∞}^{∞} x(m) y(l) z^{-m-l} = X(z) Y(z),
with the region of convergence being at least the intersection of the regions of convergence of X(z) and Y(z). In the case of a product of two z-transforms it may happen that some poles are canceled out, so that the resulting region of convergence is larger than the intersection of the individual regions of convergence.
4.2.6 Table of the z-transform

Signal x(n)               z-transform X(z)
δ(n)                      1
u(n)                      z/(z - 1), |z| > 1
a^n u(n)                  z/(z - a), |z| > |a|
n a^{n-1} u(n)            z/(z - a)², |z| > |a|
-a^n u(-n-1)              z/(z - a), |z| < |a|
a^n x(n)                  X(z/a)
a^{|n|}, |a| < 1          z(1 - a²)/((z - a)(1 - az)), |a| < |z| < 1/|a|
x(n - n0)                 z^{-n0} X(z)
n x(n) u(n)               -z dX(z)/dz
n(n-1) x(n) u(n)          z² d²X(z)/dz²
cos(ω0 n) u(n)            (1 - z^{-1} cos ω0)/(1 - 2z^{-1} cos ω0 + z^{-2})
sin(ω0 n) u(n)            (z^{-1} sin ω0)/(1 - 2z^{-1} cos ω0 + z^{-2})
(1/n!) u(n)               exp(z^{-1})
Σ_{k=0}^{n} x(k)          X(z) z/(z - 1)
4.2.7 Initial and Stationary State Signal Value

The initial value of a causal signal follows directly from the definition,
x(0) = lim_{z→∞} X(z).   (4.2)
The stationary state value is
lim_{n→∞} x(n) = lim_{z→1} (z - 1)X(z).   (4.3)
It follows from
lim_{z→1} Σ_{n=0}^{∞} [x(n+1) - x(n)] z^{-n} = lim_{N→∞} [x(N+1) - x(0)].
Thus,
lim_{N→∞} [x(N+1) - x(0)] = lim_{z→1} [zX(z) - zx(0) - X(z)],
producing the stationary state value (4.3).
INVERSE Z-TRANSFORM
n=
Xn z n
Ljubia Stankovic
175
1
= 1 + q + q2 + ... = qn
1q
n =0
Example. For the z-transform
X(z) = 1/(1 - (1/2)z^{-1}) + 1/(1 - 3z),
identify possible regions of convergence and find the inverse z-transform for each of them.

Obviously, the z-transform has the poles z1 = 1/2 and z2 = 1/3. Since there are no poles in the region of convergence, there are three possibilities for the region of convergence: 1) |z| > 1/2, 2) 1/3 < |z| < 1/2, and 3) |z| < 1/3. The signals are obtained by using a power series expansion for each case.
1) For the region of convergence |z| > 1/2 the z-transform should be written in the form
X(z) = 1/(1 - 1/(2z)) - 1/(3z(1 - 1/(3z))),
with
1/(1 - 1/(2z)) = Σ_{n=0}^{∞} (1/(2z))^n = Σ_{n=0}^{∞} (1/2^n) z^{-n} for |1/(2z)| < 1 or |z| > 1/2,
1/(1 - 1/(3z)) = Σ_{n=0}^{∞} (1/(3z))^n = Σ_{n=0}^{∞} (1/3^n) z^{-n} for |1/(3z)| < 1 or |z| > 1/3.
Both of these sums converge for |z| > 1/2. The resulting power series expansion of X(z) is
X(z) = Σ_{n=0}^{∞} (1/2^n) z^{-n} - (1/(3z)) Σ_{n=0}^{∞} (1/3^n) z^{-n} = Σ_{n=0}^{∞} (1/2^n) z^{-n} - Σ_{n=1}^{∞} (1/3^n) z^{-n},
producing the signal
x(n) = (1/2^n) u(n) - (1/3^n) u(n-1).
2) For 1/3 < |z| < 1/2 the z-transform should be written in the form
X(z) = -2z/(1 - 2z) - 1/(3z(1 - 1/(3z))),
with
1/(1 - 2z) = Σ_{n=0}^{∞} (2z)^n = Σ_{n=0}^{∞} 2^n z^n for |2z| < 1 or |z| < 1/2,
1/(1 - 1/(3z)) = Σ_{n=0}^{∞} (1/3^n) z^{-n} for |z| > 1/3.
They converge for 1/3 < |z| < 1/2. The resulting power series expansion is
X(z) = -2z Σ_{n=0}^{∞} 2^n z^n - (1/(3z)) Σ_{n=0}^{∞} (1/3^n) z^{-n} = -Σ_{n=-∞}^{-1} (1/2^n) z^{-n} - Σ_{n=1}^{∞} (1/3^n) z^{-n},
producing
x(n) = -(1/2^n) u(-n-1) - (1/3^n) u(n-1).
3) For |z| < 1/3 the z-transform should be written in the form
X(z) = -2z/(1 - 2z) + 1/(1 - 3z),
with
1/(1 - 2z) = Σ_{n=0}^{∞} 2^n z^n for |2z| < 1 or |z| < 1/2,
1/(1 - 3z) = Σ_{n=0}^{∞} 3^n z^n for |3z| < 1 or |z| < 1/3.
The resulting power series expansion is
X(z) = -Σ_{n=-∞}^{-1} (1/2^n) z^{-n} + Σ_{n=-∞}^{0} (1/3^n) z^{-n},
producing
x(n) = -(1/2^n) u(-n-1) + (1/3^n) u(-n).
For X(z) = e^{a/z}, using the power series expansion
X(z) = e^{a/z} = 1 + a/z + (1/2!)(a/z)² + (1/3!)(a/z)³ + ...
follows
x(n) = δ(n) + aδ(n-1) + (a²/2!)δ(n-2) + (a³/3!)δ(n-3) + ... = (a^n/n!) u(n).
Example. For the z-transform
X(z) = (z² + 1)/((z - 1/2)(z² - 3z/4 + 1/8)),
find the signal x(n) if the region of convergence is |z| > 1/2.

The z-transform can be written as
X(z) = (z² + 1)/((z - 1/2)(z - z1)(z - z2)) = (z² + 1)/((z - 1/2)²(z - 1/4)),
where z1 = 1/2 and z2 = 1/4. Writing X(z) in the form of partial fractions,
X(z) = A/(z - 1/2)² + B/(z - 1/2) + C/(z - 1/4).   (4.4)
The coefficients are A = 5, B = -16, and C = 17, so that
X(z) = 5/(z - 1/2)² - 16/(z - 1/2) + 17/(z - 1/4).
For the region of convergence |z| > 1/2 and a parameter |a| ≤ 1/2 holds
1/(z - a) = (1/z)(1 + a z^{-1} + a² z^{-2} + ...) = Σ_{n=1}^{∞} a^{n-1} z^{-n}.
Its differentiation with respect to a,
(d/da)(1/(z - a)) = 1/(z - a)² = Σ_{n=2}^{∞} (n - 1) a^{n-2} z^{-n},
produces the signal
x(n) = 5(n - 1)(1/2^{n-2}) u(n-2) - 16 (1/2^{n-1}) u(n-1) + 17 (1/4^{n-1}) u(n-1).
In general, (1/m!)(d^m/da^m)(1/(z - a)) = 1/(z - a)^{m+1}, written as a power series in z^{-1},
produces the inverse z-transform for higher-order poles.
4.3.2 Theorem of Residues Based Inversion

The inverse z-transform can be calculated as
x(n) = (1/2πj) ∮_C X(z) z^{n-1} dz,
where C is any closed contour line within the region of convergence. The complex plane origin is within the contour. By multiplying both sides of X(z) by z^{m-1}, after integration along the closed contour within the region of convergence we get
(1/2πj) ∮_C z^{m-1} X(z) dz = (1/2πj) ∮_C Σ_{n=-∞}^{∞} x(n) z^{m-n-1} dz = x(m),
since (1/2πj) ∮_C z^{m-n-1} dz equals 1 for n = m and 0 otherwise. The integral is calculated using residues,
x(n) = Σ_{zi} (1/(k-1)!) d^{k-1}/dz^{k-1} [ z^{n-1} X(z)(z - zi)^k ] |_{z=zi},
where zi are the poles of z^{n-1} X(z) within the integration contour C, which is in the region of convergence, and k is the pole order. If the signal is causal, n ≥ 0, and all poles of z^{n-1} X(z) within contour C are simple (first-order poles with k = 1), then, for a given instant n,
x(n) = Σ_{zi} [ z^{n-1} X(z)(z - zi) ] |_{z=zi}.   (4.5)
Example. Find the causal signal x(n) whose z-transform is
X(z) = (2z + 3)/((z - 1/2)(z - 1/4)),
with the region of convergence |z| > 1/2.

For n ≥ 1, the poles of z^{n-1} X(z) are z = 1/2 and z = 1/4, so that
x(n) = [ z^{n-1}(2z + 3)/(z - 1/4) ] |_{z=1/2} + [ z^{n-1}(2z + 3)/(z - 1/2) ] |_{z=1/4}
= (1/2^{n-1}) · 4/(1/4) + (1/4^{n-1}) · (7/2)/(-1/4) = 16 (1/2^{n-1}) - 14 (1/4^{n-1}).
For n = 0 there is an additional pole at z = 0, with
x(0) = [ z^{-1}(2z + 3)/((z - 1/2)(z - 1/4)) z ] |_{z=0}
+ [ z^{-1}(2z + 3)/(z - 1/4) ] |_{z=1/2} + [ z^{-1}(2z + 3)/(z - 1/2) ] |_{z=1/4} = 24 + 32 - 56 = 0.
Therefore,
x(n) = 16 (1/2^{n-1}) u(n-1) - 14 (1/4^{n-1}) u(n-1).
It has been assumed that the signal is causal. Using the theorem of residues, prove that x(n) = 0 for n < 0 with |z| > 1/2.
Hint: Since for each n < 0 there is a pole at z = 0 of the order |n| + 1, to avoid different derivatives for each n we can make a substitution of variables z = 1/p, with dz = −dp/p². The new region of convergence in the complex plane p will be |p| < 2. All poles are now outside this region and outside the integration contour, producing the zero-valued integral.
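The closed-form result above can be cross-checked numerically by long division of X(z) = (2z + 3)/((z − 1/2)(z − 1/4)) into a power series in z^{−1}. This is a minimal sketch (the recursive-division routine below is a standard technique, not a procedure from the book):

```python
# X(z) in powers of z^{-1}: numerator b, denominator a (coefficient of z^{-k}),
# obtained by multiplying (2z + 3)/(z^2 - 3z/4 + 1/8) by z^{-2}/z^{-2}.
b = [0.0, 2.0, 3.0]
a = [1.0, -0.75, 0.125]

def series(b, a, n_max):
    """Power-series coefficients x(n) of b(z)/a(z) by recursive long division."""
    x = []
    for n in range(n_max):
        acc = b[n] if n < len(b) else 0.0
        for k in range(1, min(n, len(a) - 1) + 1):
            acc -= a[k] * x[n - k]
        x.append(acc / a[0])
    return x

x = series(b, a, 10)
closed = [0.0] + [16 / 2 ** (n - 1) - 14 / 4 ** (n - 1) for n in range(1, 10)]
assert all(abs(u - v) < 1e-9 for u, v in zip(x, closed))
```

The first coefficients 0, 2, 4.5, 3.125, ... agree with x(0) = 0 and x(n) = 16/2^{n−1} − 14/4^{n−1} for n ≥ 1.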
4.4
DISCRETE SYSTEMS AND THE z-TRANSFORM

The output of a linear time-invariant discrete system is the convolution

y(n) = Σ_{m=−∞}^{∞} x(m) h(n − m),

where h(n) is the impulse response. The transfer function of the system is

H(z) = Σ_{n=−∞}^{∞} h(n) z^{−n}.

A system is stable if

Σ_{m=−∞}^{∞} |h(m)| < ∞.

It means that the z-transform exists at |z| = 1, i.e., that the circle |z| = 1 is within the region of convergence.
Figure 4.3 Regions of convergence in the z-plane, with poles marked by "x", and the impulse responses h1(n), h2(n), and h3(n) of the three systems.
For the given systems H1(z), H2(z), and H3(z), plot the regions of convergence and discuss the stability and causality. Find and plot the impulse response for each case.
The regions of convergence are shown in Fig.4.3. The system described by H1 (z) is causal but not stable. The system H2 (z) is stable but not
causal, while the system H3 (z) is both stable and causal. Their impulse responses are presented in Fig.4.3 as well.
|H(e^{jω})| = |H(z)||_{z=e^{jω}}.

Consider a discrete system whose transfer function assumes the form of a ratio of two polynomials,

H(z) = B0 Π_{i=1}^{M} (z − z0i) / (A0 Π_{i=1}^{N} (z − z_pi)),

where z0i are zeros and z_pi are poles of the transfer function. For the amplitude of the frequency response we may write

|H(e^{jω})| = |B0/A0| (T_{O1} T_{O2} ... T_{OM}) / (T_{P1} T_{P2} ... T_{PN}),

where T_{Oi} is the distance from the point z = e^{jω} on the unit circle to the ith zero, and T_{Pi} is its distance to the ith pole. Consider the first-order notch filter

H(z) = (z − e^{jπ/3}) / (z − 0.95 e^{jπ/3}).
Figure 4.4 Poles and zeros of a first-order notch filter (left). The frequency response of this
notch filter (right).
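The pole-zero distance picture can be evaluated directly. A minimal sketch for the notch filter above (function and variable names are illustrative, not from the book):

```python
import cmath, math

z0 = cmath.exp(1j * math.pi / 3)   # zero on the unit circle at w = pi/3
zp = 0.95 * z0                     # pole slightly inside the circle

def amp(w):
    """Amplitude of the frequency response as a ratio of distances T_O1/T_P1."""
    z = cmath.exp(1j * w)          # point on the unit circle
    return abs(z - z0) / abs(z - zp)

assert amp(math.pi / 3) == 0.0          # exact null at the zero frequency
assert abs(amp(-math.pi / 3) - 1.0) < 0.1   # nearly flat elsewhere
```

Away from ω = π/3 the distances to the zero and to the nearby pole almost cancel, reproducing the flat response with a sharp notch shown in Fig. 4.4.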
4.5
DIFFERENCE EQUATIONS
An important class of discrete systems can be described by difference equations. They are obtained by converting corresponding differential equations
or by describing an intrinsically discrete system relating the input and output signal in a recursive way. A general form of a linear difference equation
with constant coefficients, that relates the output signal at an instant n with
the input signal x (n) and the previous input and output samples, is
y(n) + A1 y(n−1) + ... + A_N y(n−N) = B0 x(n) + B1 x(n−1) + ... + B_M x(n−M).
4.5.1 Solution Based on the z-transform
The z-transform of the linear difference equation, assuming zero-valued
initial conditions, is
Y(z) = [(B0 + B1 z^{−1} + ... + B_M z^{−M}) / (1 + A1 z^{−1} + ... + A_N z^{−N})] X(z).    (4.6)
If the input signal is x(n) = (1/4^n) u(n), find the output signal.

The z-transform of the difference equation gives

Y(z) = 1/(1 − (5/6) z^{−1} + (1/6) z^{−2}) X(z).

The z-transform of the input signal is X(z) = 1/(1 − (1/4) z^{−1}) for |z| > 1/4. The output signal z-transform is

Y(z) = z³ / ((z − 1/2)(z − 1/3)(z − 1/4)).
For a causal system the region of convergence is |z| > 1/2. The output signal is the inverse z-transform of Y(z). For n ≥ 0 it is

y(n) = Σ_{zi=1/2,1/3,1/4} Res{z^{n−1} Y(z)}
     = [z^{n+2}/((z − 1/3)(z − 1/4))]|_{z=1/2} + [z^{n+2}/((z − 1/2)(z − 1/4))]|_{z=1/3} + [z^{n+2}/((z − 1/2)(z − 1/3))]|_{z=1/4}
     = (6 (1/2^n) − 8 (1/3^n) + 3 (1/4^n)) u(n).
Note: This kind of solution assumes the initial values, following from the system causality and x(n), as y(0) = x(0) = 1 and y(1) − 5y(0)/6 = x(1), i.e., y(1) = 13/12.
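The example can be checked by iterating the recursive form of the system, y(n) = x(n) + (5/6) y(n−1) − (1/6) y(n−2), and comparing with the closed-form output. A short sketch (the coefficients are the ones reconstructed above):

```python
# y(n) - (5/6) y(n-1) + (1/6) y(n-2) = x(n),  with x(n) = (1/4)**n for n >= 0
N = 12
x = [0.25 ** n for n in range(N)]
y = []
for n in range(N):
    yn = x[n]
    if n >= 1:
        yn += (5 / 6) * y[n - 1]   # recursion on the previous output
    if n >= 2:
        yn -= (1 / 6) * y[n - 2]
    y.append(yn)

closed = [6 / 2 ** n - 8 / 3 ** n + 3 / 4 ** n for n in range(N)]
assert all(abs(u - v) < 1e-9 for u, v in zip(y, closed))
```

In particular y(0) = 1 and y(1) = 13/12, matching the initial values quoted in the note.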
Consider a first-order system

y(n) + A1 y(n−1) = B0 x(n) + B1 x(n−1).    (4.7)

Find its impulse response and discuss its behavior in terms of the system coefficients.
For the impulse response calculation the input signal is x(n) = δ(n) with X(z) = 1. Then we have

(1 + A1 z^{−1}) Y(z) = B0 + B1 z^{−1}

Y(z) = (B0 + B1 z^{−1}) / (1 + A1 z^{−1}).

The pole of this system is z = −A1. There are two possibilities for the region of convergence, |z| > |A1| and |z| < |A1|. For a causal system the region of convergence is |z| > |A1|. Thus, the z-transform Y(z) can be expanded into a geometric series with q = −A1 z^{−1}, |A1/z| < 1,

Y(z) = (B0 + B1 z^{−1})(1 − A1 z^{−1} + A1² z^{−2} − A1³ z^{−3} + ... + (−A1 z^{−1})^n + ...)
     = B0 + B0 Σ_{n=1}^{∞} (−A1)^n z^{−n} + B1 Σ_{n=1}^{∞} (−A1)^{n−1} z^{−n},

with

y(n) = B0 δ(n) + (−A1)^{n−1}(B1 − A1 B0) u(n−1).
We can conclude that, in general, the impulse response has an infinite duration for any A1 ≠ 0. It is a result of the recursive relation between the output y(n) and its previous value(s) y(n−1). Such systems are referred to as infinite impulse response (IIR) systems or recursive systems. If the value of coefficient A1 is A1 = 0, then there is no recursion and

y(n) = B0 δ(n) + B1 δ(n−1).

Then we have a system with a finite impulse response (FIR). This kind of system produces an output to a signal x(n) as

y(n) = B0 x(n) + B1 x(n−1).

Such systems are called moving average (MA) systems. Systems without recursion are always stable, since a finite sum of finite signal values is always finite. Systems that would contain only x(n) and the output recursions, in this case

y(n) + A1 y(n−1) = B0 x(n),

are auto-regressive (AR) systems or all-pole systems. These systems can be unstable, due to the recursion. In our case the system is obviously unstable if |A1| > 1. Systems (4.7) are in general auto-regressive moving average (ARMA) systems.
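The closed-form impulse response derived above can be verified for sample coefficient values (the numbers chosen here are purely illustrative):

```python
A1, B0, B1 = -0.5, 2.0, 1.0     # illustrative ARMA(1,1) coefficients
N = 10
h = []
for n in range(N):
    xn = 1.0 if n == 0 else 0.0                      # x(n) = delta(n)
    yn = B0 * xn + (B1 if n == 1 else 0.0)           # B0 x(n) + B1 x(n-1)
    if n >= 1:
        yn -= A1 * h[n - 1]                          # recursion -A1 y(n-1)
    h.append(yn)

# y(n) = B0 delta(n) + (-A1)**(n-1) (B1 - A1 B0) u(n-1)
closed = [B0] + [(-A1) ** (n - 1) * (B1 - A1 * B0) for n in range(1, N)]
assert all(abs(u - v) < 1e-12 for u, v in zip(h, closed))
```

With |A1| < 1 the response decays geometrically; with |A1| > 1 the same recursion diverges, which is the instability discussed above.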
If the region of convergence were |z| < |A1| then the function Y(z) would be expanded into a series with q = −z/A1, |z/A1| < 1, as

Y(z) = (B0 + B1 z^{−1}) / (A1 z^{−1}(z/A1 + 1)) = (B0 + B1 z^{−1}) (z/A1) Σ_{n=0}^{∞} (−z/A1)^n
     = −B0 Σ_{n=−∞}^{−1} (−A1)^n z^{−n} + (B1/A1) Σ_{n=−∞}^{0} (−A1)^n z^{−n},

with

y(n) = −B0 (−A1)^n u(−n−1) + (B1/A1)(−A1)^n u(−n).

This system would be stable if |1/A1| < 1 and unstable if |1/A1| > 1, having in mind that y(n) is nonzero for n ≤ 0. This is an anticausal system, since its impulse response satisfies h(n) = 0 for n ≥ 1.
Here, we have just introduced the notions. These systems will be considered in Chapter 5 in detail.
A difference equation can also be solved directly in the discrete-time domain. The homogeneous part of the difference equation,

y(n) + A1 y(n−1) + ... + A_N y(n−N) = 0,    (4.8)

has solutions of the form

y_i(n) = C_i λ_i^n,    (4.9)

where λ_i are the roots of the characteristic polynomial

λ^N + A1 λ^{N−1} + ... + A_N = 0.

The homogeneous solution is their linear combination, y_h(n) = Σ_{i=1}^{N} C_i λ_i^n.
Consider the difference equation

y(n) − (5/6) y(n−1) + (1/6) y(n−2) = x(n).    (4.10)

The characteristic polynomial λ² − (5/6)λ + 1/6 = 0 has the roots λ1 = 1/2 and λ2 = 1/3, so the homogeneous solution is of the form

y_h(n) = C1 (1/2^n) + C2 (1/3^n).

With a particular solution included, the general solution is of the form

y(n) = C1 (1/2^n) + C2 (1/3^n) + 3n + 1.    (4.11)

The constants C1 and C2 follow from the initial conditions, for example

C1/2 + C2/3 + 4 = 5.
Consider a periodic signal x(n) with a period N and its DFT values X(k),

x(n) = (1/N) Σ_{k=0}^{N−1} X(k) e^{j2πnk/N}.    (4.12)
Each component of this signal,

x_k(n) = (1/N) X(k) e^{j2πnk/N},    (4.13)

considered within 0 ≤ n ≤ N−1, has the z-transform

Z{x_k(n)} = Z{(1/N) X(k) e^{j2πnk/N}} = (1/N) X(k) Σ_{n=0}^{N−1} e^{j2πnk/N} z^{−n} = (1/N) X(k) (1 − e^{j2πk} z^{−N}) / (1 − e^{j2πk/N} z^{−1}).    (4.14)
Note that the zero of the numerator at z = e^{j2πk/N} cancels the pole of 1/(1 − e^{j2πk/N} z^{−1}), since

(1/2πj) ∮ z^{N−2} Y_k(z) dz = 0    (4.15)

and

[z^{N−1} X(k0) (1 − e^{j2πk0} z^{−N}) / (N(1 − e^{j2πk0/N} z^{−1}))]|_{z=e^{j2πk0/N}} = X(k0) lim_{z→e^{j2πk0/N}} (z^N − e^{j2πk0}) / (N z^{N−1}(z − e^{j2πk0/N})) = X(k0).
Figure 4.5 Zeros and the pole in Z{x_k(n)} (left), the pole in 1/(1 − e^{j2πk0/N} z^{−1}) for k ≠ k0 (middle), and the pole in 1/(1 − e^{j2πk0/N} z^{−1}) for k = k0 (right). Illustration is for N = 16.
A recursive system for the calculation of one DFT coefficient follows from

Y(z) = X(z) / (1 − e^{j2πk0/N} z^{−1}).    (4.16)

Multiplying the numerator and the denominator by 1 − e^{−j2πk0/N} z^{−1}, a system with real-valued coefficients in the denominator is obtained,

Y(z) = X(z) (1 − e^{−j2πk0/N} z^{−1}) / (1 − 2 cos(2πk0/N) z^{−1} + z^{−2}).    (4.17)
Denoting by y1(n) the output of the system 1/(1 − 2 cos(2πk0/N) z^{−1} + z^{−2}) to the input x(n), the DFT value is obtained as

X(k0) = y1(N) − e^{−j2πk0/N} y1(N−1).    (4.18)

It requires just one additional complex multiplication for the last instant and for one frequency. The total number of multiplications is 2N + 4. It is reduced with respect to the previously needed 4N real multiplications. The total number of additions is 4N + 2. It is increased. However, the time needed for a multiplication is much longer than the time needed for an addition. Thus, the overall efficiency is improved. The efficiency is even more improved having in mind that (4.18) is the same for the calculation of X(k0) and X(−k0) = X(N − k0).
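The recursion can be checked numerically. A minimal sketch (names are illustrative): the real-coefficient filter y1(n) = x(n) + 2 cos(2πk0/N) y1(n−1) − y1(n−2) is run for n = 0, ..., N with x(N) = 0, and X(k0) is read out as in (4.18):

```python
import cmath, math

def dft_bin(x, k0):
    """One DFT value X(k0) via the second-order recursion (Goertzel-style)."""
    N = len(x)
    c = 2 * math.cos(2 * math.pi * k0 / N)
    y1_prev2, y1_prev1 = 0.0, 0.0            # y1(n-2), y1(n-1)
    for n in range(N + 1):                    # one extra step with x(N) = 0
        xn = x[n] if n < N else 0.0
        y1 = xn + c * y1_prev1 - y1_prev2
        y1_prev2, y1_prev1 = y1_prev1, y1
    # X(k0) = y1(N) - exp(-j 2 pi k0 / N) * y1(N-1), as in (4.18)
    return y1_prev1 - cmath.exp(-2j * math.pi * k0 / N) * y1_prev2

x = [1.0, 2.0, 0.5, -1.0, 0.25, 3.0, -2.0, 1.5]
k0 = 3
direct = sum(x[n] * cmath.exp(-2j * math.pi * k0 * n / len(x)) for n in range(len(x)))
assert abs(dft_bin(x, k0) - direct) < 1e-9
```

Only one cosine multiplication per sample is needed inside the loop, which is the multiplication count advantage discussed above.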
4.6
By sampling a signal x(t), the Laplace transform integral can be approximated by a sum

X(s) = ∫_{−∞}^{∞} x(t) e^{−st} dt ≅ Σ_{n=−∞}^{∞} x(nΔt) e^{−snΔt} Δt = Σ_{n=−∞}^{∞} x(n) e^{−snΔt}

with x(n) = x(nΔt)Δt. Comparing this relation with the z-transform definition we can conclude that the Laplace transform of x(t) corresponds to the z-transform of its samples with

z = exp(sΔt),

that is,

X(s) ≅ X(z)|_{z=exp(sΔt)}.    (4.19)
The Fourier transform of a discrete-time signal corresponds to the z-transform along the unit circle,

X(e^{jω}) = Σ_{n=−∞}^{∞} x(n) z^{−n}|_{z=e^{jω}}.
Example 4.14. A causal discrete-time signal x (n) has the Fourier transform X (e j ).
Write its z-transform in terms of the Fourier transform of the discrete-time
signal, i.e., write the z-transform value based on its values on the unit circle.
Starting from the inversion formula for the Fourier transform of discrete-time signals,

X(z) = Σ_{n=0}^{∞} x(n) z^{−n} = Σ_{n=0}^{∞} [(1/2π) ∫_{−π}^{π} X(e^{jθ}) e^{jθn} dθ] z^{−n}
     = (1/2π) ∫_{−π}^{π} X(e^{jθ}) Σ_{n=0}^{∞} e^{jθn} z^{−n} dθ = (1/2π) ∫_{−π}^{π} X(e^{jθ}) / (1 − e^{jθ} z^{−1}) dθ.
The DFT values are the samples of the z-transform on the unit circle,

X(k) = Σ_{n=0}^{N−1} x(n) z^{−n}|_{z=e^{j2πk/N}}.
Example 4.15. Consider a discrete-time signal with N samples different from zero within 0 ≤ n ≤ N−1. Show that all values of X(z), for any z, can be calculated based on its N samples on the unit circle in the z-plane.
Within 0 ≤ n ≤ N−1 the signal is

x(n) = (1/N) Σ_{k=0}^{N−1} X(k) e^{j2πnk/N}.

Thus, the z-transform of x(n), using only the values of the IDFT where the original signal is nonzero, 0 ≤ n ≤ N−1, is

X(z) = (1/N) Σ_{k=0}^{N−1} Σ_{n=0}^{N−1} X(k) e^{j2πnk/N} z^{−n} = (1/N) Σ_{k=0}^{N−1} X(k) (1 − z^{−N} e^{j2πk}) / (1 − z^{−1} e^{j2πk/N}).
Figure 4.6 Illustration of the z-transform relation with the Laplace transform (left), the Fourier transform of discrete signals (middle), and the DFT (right).
4.7
For a causal signal formed by periodically extending x(n) for all n ≥ 0, the z-transform is

X(z) = Σ_{n=0}^{∞} (1/N) Σ_{k=0}^{N−1} X(k) e^{j2πnk/N} z^{−n} = (1/N) Σ_{k=0}^{N−1} X(k) · 1/(1 − e^{j2πk/N} z^{−1}).
PROBLEMS
Problem 4.1. Find the z-transform and the region of convergence for the
following signals:
(a) x(n) = δ(n − 2),
(b) x(n) = a^{|n|},
(c) x(n) = (1/2^n) u(n) + (1/3^n) u(n).
Problem 4.2. Find the z-transform and the region of convergence for the
following signals:
(a) x(n) = δ(n + 1) + δ(n) + δ(n − 1),
(b) x(n) = (1/2^n)[u(n) − u(n − 10)].
Problem 4.3. Knowing that

Y(z) = −z dX(z)/dz

is the z-transform of the signal

y(n) = n x(n) u(n)
in the discrete-time domain, with the same region of convergence for X (z)
and Y (z), find a causal signal whose z-transform is
(a) X(z) = e^{a/z}, |z| > 0,
(b) X(z) = ln(1 + a z^{−1}), |z| > |a|.
Problem 4.4. (a) How is the z-transform of x(−n) related to the z-transform of x(n)?
(b) If the signal x(n) is real-valued, show that its z-transform satisfies X(z) = X*(z*).
Problem 4.5. If X (z) is the z-transform of a signal x (n) find the z-transform
of
y(n) = Σ_{k=−∞}^{∞} x(k) x(n + k).
Problem 4.6. Find the inverse z-transform of

X(z) = 1/(3z − 2)

if the region of convergence is |z| > 2/3.

Problem 4.7. Find the inverse z-transform of the causal signal with

X(z) = (z + 1) / ((2z − 1)(3z + 2)).

Problem 4.8. Find the impulse response of a stable system whose transfer function is

H(z) = (3 − (5/6) z^{−1}) / ((1 − (1/4) z^{−1})(1 − (1/3) z^{−1})).

Problem 4.9. Find the impulse response of a causal system whose transfer function is

H(z) = (1/4) / ((1 − 4z)((5/8) − (3/2) z + z²)).
Problem 4.10. Find the impulse response of a causal system whose transfer
function is
H(z) = (z + 2) / ((z − 2) z²).
Problem 4.11. Find the inverse z-transform of
X(z) = z² / (z² + 1).
Problem 4.12. Find the impulse response of the causal system described by the difference equation

y(n) − y(n−1) + (5/16) y(n−2) − (1/32) y(n−3) = 3x(n) − (5/4) x(n−1) + (3/16) x(n−2).
Problem 4.15. Find the output of the causal system

y(n) − (11/6) y(n−1) + (1/2) y(n−2) = 2x(n) − (3/2) x(n−1)

to the input signal x(n) = δ(n) − (3/2) δ(n−1).
Problem 4.18. Find the output of the system

y(n) − (√2/2) y(n−1) + (1/4) y(n−2) = x(n)    (4.20)

to the input signal x(n) = (1/3^n) u(n) by a direct solution of the difference equation in the discrete-time domain and by using the z-transform. The initial conditions are y(n) = 0 for n < 0.
Problem 4.19. The first backward difference is defined as

∇x(n) = x(n) − x(n−1),

and the mth backward difference is defined by

∇^m x(n) = ∇^{m−1} x(n) − ∇^{m−1} x(n−1).

The first forward difference is

Δx(n) = x(n+1) − x(n),

with the mth forward difference being

Δ^m x(n) = Δ^{m−1} x(n+1) − Δ^{m−1} x(n).
Find the z-transforms of these differences.
Problem 4.20. Based on the pole and zero geometry, plot the amplitude of the frequency response of the system

y(n) = x(n) − √2 x(n−1) + x(n−2) + r√2 y(n−1) − r² y(n−2).
Problem 4.21. Plot the frequency response of the discrete system (comb filter)

H(z) = (1 − z^{−N}) / (1 − r z^{−N}),

whose poles are at the distance r^{1/N} from the origin.
4.8
SOLUTIONS
Solution 4.1. (a) The z-transform is

X(z) = Σ_{n=−∞}^{∞} δ(n − 2) z^{−n} = z^{−2}

for any z ≠ 0.
(b) For this signal,

X(z) = Σ_{n=−∞}^{∞} a^{|n|} z^{−n} = Σ_{n=−∞}^{−1} a^{−n} z^{−n} + Σ_{n=0}^{∞} a^n z^{−n} = (1 − a²) z / ((1 − az)(z − a))

for |z| < 1/|a| and |z| > |a|. If |a| < 1 then the region of convergence is |a| < |z| < 1/|a|.
(c) In this case,

X(z) = Σ_{n=0}^{∞} (1/2^n) z^{−n} + Σ_{n=0}^{∞} (1/3^n) z^{−n} = 1/(1 − 1/(2z)) + 1/(1 − 1/(3z))
     = (2 − (5/6) z^{−1}) / ((1 − (1/2) z^{−1})(1 − (1/3) z^{−1})) = z(2z − 5/6) / ((z − 1/2)(z − 1/3))

for |z| > 1/2 and |z| > 1/3. The region of convergence is |z| > 1/2.
Solution 4.2. (a) The z-transform is
X(z) = Σ_{n=−∞}^{∞} x(n) z^{−n} = z + 1 + z^{−1},

for any 0 < |z| < ∞.
Figure 4.7 Zeros (on the circle of radius 1/2) and poles of X(z) in Solution 4.2(b).
X(z) = Σ_{n=−∞}^{∞} x(n) z^{−n} = Σ_{n=0}^{9} (1/2^n) z^{−n} = (1 − (2z)^{−10}) / (1 − (2z)^{−1}) = (z^{10} − (1/2)^{10}) / (z^9 (z − 1/2)).
The expression for X(z) is written in this way in order to find the region of convergence, observing the zero-pole locations in the z-plane, Fig. 4.7. Poles are at z_p1 = 0 and z_p2 = 1/2. Zeros are z_0i = (1/2) e^{j2πi/10}, i = 0, 1, ..., 9, Fig. 4.7. Since the z-transform has a zero at z0 = 1/2, it will cancel out the pole z_p2 = 1/2. The resulting region of convergence will include the whole z-plane, except the point at z = 0.
Solution 4.3. (a) For X(z) = e^{a/z} holds

dX(z)/dz = −(a/z²) e^{a/z} = −X(z) a/z².

Therefore

Z{n x(n)} = −z dX(z)/dz = (a/z) X(z),

meaning that n x(n) = a x(n−1). Since x(0) = lim_{z→∞} X(z) = 1, it means that

x(1) = a, x(2) = a²/2, x(3) = a³/(2·3), ...

or

x(n) = (a^n/n!) u(n).
(b) In this case,

dX(z)/dz = d(ln(1 + a z^{−1}))/dz = −a z^{−2}/(1 + a z^{−1}).

Therefore

Z{n x(n)} = −z dX(z)/dz = a z^{−1}/(1 + a z^{−1}),

producing

n x(n) = a(−a)^{n−1} u(n−1), that is, x(n) = −((−a)^n/n) u(n−1).
Solution 4.4. (a) The z-transform of x(n) is X(z) = Σ_{n=−∞}^{∞} x(n) z^{−n}. For the time-reversed signal,

Z{x(−n)} = Σ_{n=−∞}^{∞} x(−n) z^{−n} = Σ_{m=−∞}^{∞} x(m) z^{m} = X(1/z).

(b) For a real-valued signal x(n) write X*(z*) as

X*(z*) = [Σ_{n=−∞}^{∞} x(n) (z*)^{−n}]*.

Since ((z*)^{−n})* = z^{−n} we get

X*(z*) = Σ_{n=−∞}^{∞} x*(n) z^{−n} = Σ_{n=−∞}^{∞} x(n) z^{−n} = X(z),
Solution 4.5. The z-transform of y(n) is

Y(z) = Σ_{n=−∞}^{∞} y(n) z^{−n} = Σ_{n=−∞}^{∞} Σ_{k=−∞}^{∞} x(k) x(n + k) z^{−n}.

With the substitution n + k = m it becomes

Y(z) = Σ_{k=−∞}^{∞} x(k) z^{k} Σ_{m=−∞}^{∞} x(m) z^{−m} = X(1/z) X(z).
Solution 4.6. Write X(z) in the form

X(z) = 1/(3z − 2) = (1/(3z)) · 1/(1 − 2/(3z)).

Now the condition |2/(3z)| < 1, that is |z| > 2/3, corresponds to the problem formulation region of convergence. In order to obtain the inverse z-transform, write

X(z) = (1/(3z)) · 1/(1 − 2/(3z)) = (1/(3z)) X1(z),

where

X1(z) = 1/(1 − 2/(3z)).

For |z| > 2/3,

X1(z) = Σ_{n=0}^{∞} (2/(3z))^n = Σ_{n=0}^{∞} (2/3)^n z^{−n},    (4.21)

so that

X(z) = (1/(3z)) Σ_{n=0}^{∞} (2/3)^n z^{−n} = Σ_{n=0}^{∞} (1/3)(2/3)^n z^{−(n+1)}.

The signal x(n) is obtained by comparing this expression with the z-transform definition,

x(n) = (1/3)(2/3)^{n−1}, for n = 1, 2, ..., and x(n) = 0 elsewhere,

or

x(n) = (1/3)(2/3)^{n−1} u(n−1).
Solution 4.7. Since the signal is causal the region of convergence is outside
the pole with the largest radius (outside the circle passing through this pole).
The poles are z_p1 = 1/2 and z_p2 = −2/3.
The z-transform X(z) can be written in the form of partial fractions as

(z + 1) / ((2z − 1)(3z + 2)) = A/(2z − 1) + B/(3z + 2),

with

A = 3/7, B = −1/7.

The terms in X(z) should be written in such a way that they represent sums of geometric series for the given region of convergence. From the solution of the previous problem, we conclude that

X(z) = (A/(2z)) · 1/(1 − 1/(2z)) + (B/(3z)) · 1/(1 + 2/(3z)).

For |z| > 1/2,

(A/(2z)) · 1/(1 − 1/(2z)) = A Σ_{n=0}^{∞} (1/2)^{n+1} z^{−(n+1)},

and for |z| > 2/3,

(B/(3z)) · 1/(1 + 2/(3z)) = (B/(3z)) Σ_{n=0}^{∞} (−2/(3z))^n = −(B/2) Σ_{n=0}^{∞} (−2/3)^{n+1} z^{−(n+1)}.

The z-transform, with m = n + 1, assumes the form

X(z) = A Σ_{m=1}^{∞} (1/2)^m z^{−m} − (B/2) Σ_{m=1}^{∞} (−2/3)^m z^{−m}.

Replacing the values for A and B it follows

X(z) = (3/7) Σ_{m=1}^{∞} (1/2)^m z^{−m} + (1/14) Σ_{m=1}^{∞} (−2/3)^m z^{−m}.

The signal x(n) is obtained by comparing this transform with the z-transform definition,

x(n) = [(3/7)(1/2)^n + (1/14)(−2/3)^n] u(n−1).
Solution 4.8. Write the transfer function in the form

H(z) = (3 − (5/6) z^{−1}) / ((1 − (1/4) z^{−1})(1 − (1/3) z^{−1})) = A/(1 − (1/4) z^{−1}) + B/(1 − (1/3) z^{−1}),

with A = 1, B = 2.
(a) The region of convergence must contain |z| = 1 for a stable system. It is |z| > 1/3. From

H(z) = 1/(1 − (1/4) z^{−1}) + 2/(1 − (1/3) z^{−1}) = Σ_{n=0}^{∞} (1/4)^n z^{−n} + 2 Σ_{n=0}^{∞} (1/3)^n z^{−n}, for |z| > 1/4 and |z| > 1/3,

the impulse response is h(n) = (1/4^n) u(n) + 2 (1/3^n) u(n).
(b) For the other regions of convergence the anticausal expansions

2/(1 − (1/3) z^{−1}) = −2 Σ_{n=1}^{∞} (3z)^n = −2 Σ_{m=−∞}^{−1} 3^{−m} z^{−m}, for |z| < 1/3,

and

1/(1 − (1/4) z^{−1}) = −Σ_{n=1}^{∞} (4z)^n = −Σ_{m=−∞}^{−1} 4^{−m} z^{−m}, for |z| < 1/4,

with m = −n, are used.
Solution 4.9. The transfer function

H(z) = (1/4) / ((1 − 4z)((5/8) − (3/2) z + z²))

can be written as

H(z) = (1/4) / ((1 − 4z)(z − 3/4 + j(1/4))(z − 3/4 − j(1/4))).
Solution 4.10. The transfer function can be written in the form of partial fractions,

(z + 2) / (z²(z − 2)) = A/(z − 2) + B/z + C/z²,

with

(A + B) z² + (−2B + C) z − 2C = z + 2.

The coefficients follow from

A + B = 0
−2B + C = 1
−2C = 2,

as A = 1, B = −1, and C = −1. The transfer function is

H(z) = z^{−1} · 1/(1 − 2 z^{−1}) − 1/z − 1/z².

The region of convergence for a causal system is |z| > 2. The inverse z-transform for a causal system is the system impulse response

h(n) = 2^{n−1} u(n−1) − δ(n−2) − δ(n−1) = δ(n−2) + 2^{n−1} u(n−3).

The system is not stable.
Solution 4.11. The z-transform X(z) can be written in the form

X(z) = z²/(z² + 1) = z/(2(z + j)) + z/(2(z − j)).

For the region of convergence defined by |z| > 1 the signal is causal and

x(n) = (1/2)[1 + (−1)^n] j^n u(n) = (1/2)[1 + (−1)^n] e^{jπn/2} u(n).

For n = 4k, where k ≥ 0 is an integer, x(n) = 1, while for n = 4k + 2 the signal values are x(n) = −1. For other n the signal is x(n) = 0.
For |z| < 1 the inverse z-transform is

x(n) = −(1/2)[1 + (−1)^n] j^n u(−n − 1).
Solution 4.12. For the impulse response, x(n) = δ(n) and the z-transform of the output is

Y(z) = H(z) = (3 − (5/4) z^{−1} + (3/16) z^{−2}) / (1 − z^{−1} + (5/16) z^{−2} − (1/32) z^{−3})
     = (3 − (5/4) z^{−1} + (3/16) z^{−2}) / ((1 − (1/2) z^{−1} + (1/16) z^{−2})(1 − (1/2) z^{−1}))
     = 5/(1 − (1/2) z^{−1}) − 1/(1 − (1/4) z^{−1})² − 1/(1 − (1/4) z^{−1}),

since 1 − (1/2) z^{−1} + (1/16) z^{−2} = (1 − (1/4) z^{−1})². For a causal system the region of convergence is outside of the pole z = 1/2, that is |z| > 1/2. Since

d/da [1/(1 − a z^{−1})]|_{a=1/4} = z^{−1}/(1 − a z^{−1})²|_{a=1/4}

and

d/da Σ_{n=0}^{∞} a^n z^{−n} = Σ_{n=0}^{∞} n a^{n−1} z^{−n},

it follows that

1/(1 − (1/4) z^{−1})² = Σ_{n=0}^{∞} (n + 1)(1/4)^n z^{−n}.

The impulse response is

h(n) = 5 (1/2^n) u(n) − (n + 1)(1/4^n) u(n) − (1/4^n) u(n).
Solution 4.13. The transfer function of the system is

H(z) = 1 − (3/4) z^{−1} + (1/8) z^{−2}.

The z-transform of the input signal x(n) = (1/4^n) u(n) is

X(z) = 1/(1 − (1/4) z^{−1}),

with the region of convergence |z| > 1/4. The output signal z-transform is

Y(z) = H(z) X(z) = (1 − (1/2) z^{−1})(1 − (1/4) z^{−1}) / (1 − (1/4) z^{−1}) = 1 − (1/2) z^{−1},

so that y(n) = δ(n) − (1/2) δ(n−1).
Solution 4.14. The z-transform of the output signal is

Y(z) = [1/(1 − (1/3) z^{−1})] · [(1 − z^{−6}) / (1 − z^{−1})] = (1 − z^{−6}) / ((1 − z^{−1})(1 − (1/3) z^{−1})) = Y1(z) − Y1(z) z^{−6},

where

Y1(z) = 1/((1 − z^{−1})(1 − (1/3) z^{−1})) = (3/2)/(1 − z^{−1}) − (1/2)/(1 − (1/3) z^{−1}),

with

y1(n) = [3/2 − (1/2)(1/3)^n] u(n).
Figure 4.8 Poles and zeros of the system (left), input signal z-transform (middle), and the
z-transform of the output signal (right).
The output signal is

y(n) = y1(n) − y1(n−6) = [3/2 − (1/2)(1/3)^n] u(n) − [3/2 − (1/2)(1/3)^{n−6}] u(n−6).
Solution 4.15. The z-transform of the difference equation gives

Y(z)(1 − (11/6) z^{−1} + (1/2) z^{−2}) = X(z)(2 − (3/2) z^{−1}),

so that the transfer function is

H(z) = (2 − (3/2) z^{−1}) / (1 − (11/6) z^{−1} + (1/2) z^{−2}).

The poles are at z_p1 = 1/3 and z_p2 = 3/2, with the region of convergence |z| > 3/2. It means that the system is not stable, Fig. 4.8.
The z-transform of the input signal is

X(z) = 1 − (3/2) z^{−1} for |z| > 0.

The output signal transform is

Y(z) = (2 − (3/2) z^{−1})(1 − (3/2) z^{−1}) / ((1 − (1/3) z^{−1})(1 − (3/2) z^{−1})) = (2 − (3/2) z^{−1}) / (1 − (1/3) z^{−1}).

The output signal transform does not have a pole at z = 3/2, since this pole is canceled out. The output signal is

y(n) = 2 (1/3^n) u(n) − (3/2)(1/3^{n−1}) u(n−1).
Solution 4.16. The z-transform can be written as

X(z) = z/(z² + 3z + 2) = 1/(1 + z^{−1}) − 1/(1 + 2 z^{−1}).

The inverse z-transform of X(z), for a causal signal, is

x(n) = [(−1)^n − (−2)^n] u(n).

Solution 4.17. Write X(z) in the form

X(z) = z/((z − a)(z − 1)) = (1/(1 − a)) [z/(z − 1) − z/(z − a)].

The inverse z-transform is

x(n) = (1/(1 − a))[u(n−1) − a^n u(n−1)] = ((1 − a^n)/(1 − a)) u(n−1) = Σ_{k=0}^{n−1} a^k, n > 0.
Solution 4.18. First find the solution of the homogeneous equation

y(n) − (√2/2) y(n−1) + (1/4) y(n−2) = 0    (4.22)

in the form y_i(n) = C_i λ_i^n. The characteristic polynomial is

λ² − (√2/2) λ + 1/4 = 0,

with λ_{1,2} = √2/4 ± j√2/4. The homogeneous part of the solution is

y_h(n) = C1 (√2/4 + j√2/4)^n + C2 (√2/4 − j√2/4)^n = C1 (1/2^n) e^{jπn/4} + C2 (1/2^n) e^{−jπn/4}.
2
A particular solution is of the input signal x (n) = 31n u(n) form. It is y p (n) =
A 31n u(n). The constant A is obtained by replacing this signal into (4.20)
1
2
1
1
1
1
+ A n 2 = n
A
A n
3
2 3n 1
4 3
3
3 2 9
+ ) = 1.
A (1
2
4
Its value is A = 0.886. The general solution is
The general solution is

y(n) = y_h(n) + y_p(n) = C1 (1/2^n) e^{jπn/4} + C2 (1/2^n) e^{−jπn/4} + 0.886 (1/3^n).

Since the system is causal, with y(n) = 0 for n < 0, the constants C1 and C2 may be obtained from the initial conditions following from

y(0) = x(0) = 1
y(1) = (√2/2) y(0) + x(1) = √2/2 + 1/3,    (4.23)

producing

C1 + C2 + 0.886 = 1
C1 (√2/2 + j√2/2)/2 + C2 (√2/2 − j√2/2)/2 + 0.886/3 = √2/2 + 1/3.

The solution is C1 = C2* = 0.9984 e^{−j1.5137}, so that

y(n) = 2 · 0.9984 (1/2^n) cos(πn/4 − 1.5137) + 0.886 (1/3^n).
Using the z-transform, the difference equation (4.20) transforms to

Y(z) − (√2/2) Y(z) z^{−1} + (1/4) Y(z) z^{−2} = X(z),

with X(z) = 1/(1 − (1/3) z^{−1}) and

Y(z) = 1/((1 − (√2/2) z^{−1} + (1/4) z^{−2})(1 − (1/3) z^{−1})) = z³/((z − (√2 + j√2)/4)(z − (√2 − j√2)/4)(z − 1/3)).
Using, for example, the residue based inversion of the z-transform,

y(n) = Σ_{zi} [z^{n−1} Y(z)(z − z_i)]|_{z=z_i}, z_{1,2,3} = (√2 ± j√2)/4, 1/3,

we get

y(n) = [z^{n+2}/((z − (√2 − j√2)/4)(z − 1/3))]|_{z=(√2+j√2)/4} + [z^{n+2}/((z − (√2 + j√2)/4)(z − 1/3))]|_{z=(√2−j√2)/4} + [z^{n+2}/((z − (√2 + j√2)/4)(z − (√2 − j√2)/4))]|_{z=1/3}

with the last term equal to (1/3^{n+2})/(1/9 − (1/3)(√2/2) + 1/4) = 0.886 (1/3^n). Summing the residues gives

y(n) = 2 · 0.9984 (1/2^n) cos(πn/4 − 1.5137) + 0.886 (1/3^n),

which is the same as the solution obtained in the discrete-time domain.
Solution 4.19. The z-transform of the first backward difference is

Z{∇x(n)} = X(z) − z^{−1} X(z) = (1 − z^{−1}) X(z).

For the second backward difference,

Z{∇² x(n)} = (1 − z^{−1})² X(z),

and, in general,

Z{∇^m x(n)} = (1 − z^{−1})^m X(z).

The z-transform of the first forward difference of a causal signal is

Z{Δx(n)} = (z − 1) X(z) − z x(0),

with

Z{Δ^m x(n)} = (z − 1)^m X(z) − z Σ_{j=0}^{m−1} (z − 1)^{m−j−1} Δ^j x(0).
Solution 4.20. The transfer function of the system is

H(z) = (1 − √2 z^{−1} + z^{−2}) / (1 − r√2 z^{−1} + r² z^{−2})
     = [1 − (√2/2 + j√2/2) z^{−1}][1 − (√2/2 − j√2/2) z^{−1}] / ([1 − r(√2/2 + j√2/2) z^{−1}][1 − r(√2/2 − j√2/2) z^{−1}]).

The zeros are at z = e^{±jπ/4} and the poles are at z = r e^{±jπ/4}. The amplitude of the frequency response is

|H(e^{jω})| = |B0/A0| (T_{O1} T_{O2}) / (T_{P1} T_{P2}) = (T_{O1} T_{O2}) / (T_{P1} T_{P2}).

The values of T_{P1} and T_{O1}, and of T_{P2} and T_{O2}, are almost the same for any ω except ω = π/4, where the distance to the transfer function zero is
Figure 4.9 Zeros and poles of the notch system (left) and the amplitude of its frequency response (right).
0, while the distance to the corresponding pole is small but finite. Based on
this analysis the amplitude of frequency response is presented in Fig.4.9.
The input discrete-time signal is

x(n) = x(nΔt)Δt = [2 cos(πn/6) − sin(πn/4) + 0.5 e^{jπn/3}] π/60.

This system will filter out the signal component at ω = π/4. The output discrete-time signal is

y(n) = [2 cos(πn/6) + 0.5 e^{jπn/3}] π/60.

The corresponding continuous-time output signal is

y(t) = 2 cos(10t) + 0.5 e^{j20t}.
Solution 4.21. The zeros of the system are the solutions of

z_o^N = 1 = e^{j2πm}, z_om = e^{j2πm/N}, m = 0, 1, ..., N−1,

while the poles are z_pm = r^{1/N} e^{j2πm/N}. The transfer function can be written as

H(z) = Π_{m=0}^{N−1} (z − z_om)/(z − z_pm) = Π_{m=0}^{N−1} (z − e^{j2πm/N}) / (z − r^{1/N} e^{j2πm/N}).

For r close to 1,

|H(e^{jω})| ≅ 1 for z = e^{jω} ≠ e^{j2πm/N}

and

|H(e^{jω})| = 0 for z = e^{j2πm/N}.
EXERCISE
Exercise 4.1. Find the z-transform and the region of convergence for the
following signals:
(a) x(n) = δ(n − 3) − δ(n + 3),
(b) x(n) = u(n) − u(n − 20) + 3δ(n),
(c) x(n) = (1/3)^{|n|} + (1/2^n) u(n),
(d) x(n) = 3^n u(n) + 2^n u(n),
(e) x(n) = n (1/3)^n u(n),
(f) x(n) = cos(πn/2).
Exercise 4.2. Find the z-transform and the region of convergence for the
signals:
(a) x(n) = 3^n u(n) − (−2)^n u(n) + n² u(n),
(b) x(n) = Σ_{k=0}^{n} 2^k 3^{n−k},
(c) x(n) = Σ_{k=0}^{n} 3^k.
Exercise 4.3. Find the inverse z-transform of:
Exercise 4.3. Find the inverse z-transform of:
(a) X(z) = z^{−8}/(1 − z) + 3, if X(z) is the z-transform of a causal signal x(n),
(b) X(z) = (z + 2)/((z − 2) z²), if X(z) is the z-transform of a causal signal x(n),
(c) X(z) = (6z² + 3z − 2)/(6z² − 5z + 1), if X(z) is the z-transform of an unlimited-duration signal x(n). Find Σ_{n=−∞}^{∞} x(n) in this case.
Exercise 4.4. If X(z) is the z-transform of a signal x(n), find the z-transforms of

y(n) = Σ_{k=−∞}^{∞} x(n − kN), where N is an integer,

and

y(n) = Σ_{k=−∞}^{∞} x(n − k) x(n + k).
Find the inverse z-transform of

X(z) = (2 − z) / ((1 − 4z)(1 − 3z)).    (4.24)
Hint: Since y(0) does not follow from (4.25), obviously the system output was "preloaded" before the input is applied. This fact can be taken into account by changing the input signal at n = 0 to produce the initial output. It is x(n) = (1/4^n) u(n) + δ(n). Now the initial conditions are y(0) = 2 and y(1) = 5/3 + 1/4 = 23/12, and we can apply the z-transform with this new input signal.
Exercise 4.11. Solve the difference equation using the z-transform
x(n + 2) − x(n + 1) + (1/2) x(n) = 0
with initial condition x (0) = 0 and x (1) = 1/2. The signal x (n) is causal.
Exercise 4.12. Using the basic trigonometric transformations, show that a real-valued signal y(n) = cos(2πk0 n/N + φ) is a solution of the homogeneous difference equation

y(n) − 2 cos(2πk0/N) y(n−1) + y(n−2) = 0,
with similar conclusions as in the complex-valued signal case.
Exercise 4.13. For the system
H(z) = [(1 − z^{−1})(1 + z^{−1}) / ((1 − r z^{−1})(1 + r z^{−1}))] Π_{k=1}^{3} (1 − 2 cos(2πk/8) z^{−1} + z^{−2}) / (1 − 2r cos(2πk/8) z^{−1} + r² z^{−2})
and r = 0.9999 plot the amplitude of the frequency response and find the
output to the signal
x(n) = cos(πn/3 + π/4) + sin(πn/2) + (−1)^n.
Chapter 7
Discrete-Time Random Signals

RANDOM signals cannot be described by simple mathematical functions. Their values are not known in advance. These signals can be described by stochastic tools only. Here we will restrict the analysis to the discrete-time random signals. The first-order and the second-order statistics will be considered.

7.1

Consider a realization of a random signal,

{x(n)}, n = 1, 2, ..., N.    (7.1)

Its mean value can be estimated by averaging the available samples,

(1/N)(x(1) + x(2) + ... + x(N)).
Example 7.1. Consider a random signal x (n) whose one realization is given
in Table 7.1. Find the mean value of this signal. Find how many samples of
the signal are within the intervals [1, 10], [11, 20],...,[91, 100]. Plot the number
of occurrences of signal x (n) samples within these intervals as a function of
the interval range.
Table 7.1 A realization of random signal

54  56  23  31  37  35  67  61  40  66
120 62  53  26  55  12  55  56  84  77
39  58  38  66  52  54  54  42  48  52
50  51  61  47  23  42  55  66  67  63
31  70  28  69  60  67  49  50  71  57
11  43  69  71  34  95  77  47  74  42
75  99  87  69  83  89  18  49  35  44
45  52  41  81  39  67  64  25  59  64
62  57  72  68  66  42  73  50  60  36
60  57  72  68  66  42  73  50  60  36
Figure 7.1 A realization of the random signal x(n) and its mean value mean(x).

The mean value of this realization is

(1/100) Σ_{n=1}^{100} x(n) = 55.76.
Figure 7.2 Histogram of random signal x(n) with 10 intervals [10i + 1, 10i + 10], i = 0, 1, 2, ..., 9.
From Table 7.1 or the graph in Fig. 7.1 we can count that, for example,
there is no signal sample whose value is within the interval [1, 10]. Within
[11, 20] there are two signal samples (x (42) = 12 and x (95) = 11). In a similar
way, the number of signal samples within other intervals are counted and
presented in Fig.7.2. This kind of random signal presentation is called a
histogram of x (n), with defined intervals.
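The interval counting described above is easy to reproduce. A minimal sketch, using the first 20 values of the realization in Table 7.1 (the function name is illustrative):

```python
def histogram(data, width=10):
    """Count how many samples fall in each interval [width*i + 1, width*(i+1)]."""
    counts = {}
    for v in data:
        i = (v - 1) // width          # interval index: 12 -> 1, i.e. [11, 20]
        counts[i] = counts.get(i, 0) + 1
    return counts

# first 20 values of the realization in Table 7.1
data = [54, 56, 23, 31, 37, 35, 67, 61, 40, 66,
        120, 62, 53, 26, 55, 12, 55, 56, 84, 77]
counts = histogram(data)
assert counts[1] == 1                 # only the value 12 lies in [11, 20]
assert sum(counts.values()) == len(data)
```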
Example 7.2. For the signal x (n) from the previous example assume that a new
random signal y(n) is formed as
y(n) = int{(x(n) + 5)/10},
where int{·} denotes the nearest integer. It means that y(n) = 1 for 1 ≤ x(n) ≤ 10, y(n) = 2 for 11 ≤ x(n) ≤ 20, ..., y(n) = i for 10(i−1) + 1 ≤ x(n) ≤ 10i, up to i = 10. Plot the new signal y(n). What is the set of possible values of y(n)? Present on a graph how many times each of the possible values of y(n)
appeared in this signal realization. Find the mean value of the new signal
y(n) and discuss the result.
The signal y(n) is shown in Fig.7.3. This signal assumes values from
the set {2, 3, 4, 5, 6, 7, 8, 9, 10}.
For the signal y(n), instead of histogram we can plot a diagram of the
number of occurrences of each value that y(n) can assume. It is presented in
Figure 7.3 The signal y(n) and its mean value mean(y).

The mean value of the new signal is

(1/100) Σ_{n=1}^{100} y(n) = 6.13.
The mean value can also be written, by grouping the same values of y(n), as

μ_y = (1/100)(1·n1 + 2·n2 + 3·n3 + ... + 10·n10) = 1·(n1/N) + 2·(n2/N) + 3·(n3/N) + ... + 10·(n10/N),
where N = 100 is the total number of signal values and ni is the number
showing how many times each of the values i appeared in y(n). If there is a
sufficient number of occurrences for each outcome value i then
P_y(i) = n_i/N

can be considered as the probability that the value i appears. In that sense

μ_y = 1·P_y(1) + 2·P_y(2) + 3·P_y(3) + ... + 10·P_y(10) = Σ_{i=1}^{10} i P_y(i),
Figure 7.4 Number of appearances of each possible value of y(n) (left) and the probabilities that the random signal y(n) takes a value i = 1, 2, . . . , 10 (right).

with

Σ_{i=1}^{10} P_y(i) = 1.
In general, the mean for each signal sample could be different. For
example, if the signal values represent the highest daily temperature during
a year then the mean value is highly dependent on the considered sample.
In order to calculate the mean value of temperature, we have to have several
realizations of these random signals (measurements over M years), denoted
by { xi (n)}, where argument n = 1, 2, 3, ... is the cardinal number of the day
within a year and i = 1, 2, ..., M is the index of realization (year index). The
mean value is then calculated as
μ_x(n) = (1/M)(x1(n) + x2(n) + ... + x_M(n)) = (1/M) Σ_{i=1}^{M} x_i(n),    (7.2)
for each n. In this case we have a set (a signal) of mean values { x (n)}, for
n = 1, 2, ..., 365.
Example 7.3. Consider a signal x (n) with realizations given in Table 7.2. Its values
are equal to the monthly average of maximal daily temperatures in a city
measured from year 2001 to 2015. Find the mean temperature for each month
over the considered period of years. What is the mean value of temperature
over all months and years? What is the mean temperature for each year?
Table 7.2
Averages of maximal daily temperatures within each month over 15 years, 2001-2015.
Jan: 10  6 10  3  7  7  7 12  7  8  8  4  3 11  6
Feb:  4  7 11 11 10 11 12  5 12 12 10  6  6 12 13
Mar: 18 11 10 13 13 17 13  9 13 10 13 15 16 14  8
Apr: 17 23 16 19 21 17 19 20 23 17 24 18 17 18 22
May: 22 22 21 22 27 27 23 21 27 27 23 25 27 22 22
Jun: 29 32 26 26 29 25 32 37 33 33 33 26 28 29 29
Jul: 30 35 32 34 30 37 34 34 29 38 33 27 30 34 30
Aug: 28 33 31 29 34 34 38 34 31 32 31 33 32 34 34
Sep: 27 22 23 26 24 33 21 27 25 23 27 23 29 23 23
Oct: 17 26 19 22 20 22 21 22 21 20 21 23 24 21 18
Nov: 17 22 17 12 16 14 12 20  6 15 16 13 12 20 15
Dec:  5  8  4  9 11 14 10  7 11  9  8 11 10 11  8
The signal for years 2001 to 2007 is presented in Fig.7.5. The mean
temperature for the nth month, over the considered years, is
μ_x(n) = (1/15) Σ_{i=1}^{15} x_{20i}(n),

where the notation 20i is symbolic in the sense 2001, 2002, ..., 2015, for i = 01, 02, ..., 15. The mean-value signal is presented in the last subplot of Fig. 7.5. The mean value over all months and years is

(1/(15·12)) Σ_{n=1}^{12} Σ_{i=1}^{15} x_{20i}(n) = 19.84.

The mean temperature for year 20i is

(1/12) Σ_{n=1}^{12} x_{20i}(n).
Figure 7.5 Several realizations of a random signal x_{20i}(n), for i = 01, 02, ..., 07, and the mean value for each sample (month) over 15 available realizations.
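The per-month and overall means quoted above follow directly from Table 7.2; a small sketch that recomputes them:

```python
# Averages of maximal daily temperatures (Table 7.2), one list per month.
table = {
    "Jan": [10, 6, 10, 3, 7, 7, 7, 12, 7, 8, 8, 4, 3, 11, 6],
    "Feb": [4, 7, 11, 11, 10, 11, 12, 5, 12, 12, 10, 6, 6, 12, 13],
    "Mar": [18, 11, 10, 13, 13, 17, 13, 9, 13, 10, 13, 15, 16, 14, 8],
    "Apr": [17, 23, 16, 19, 21, 17, 19, 20, 23, 17, 24, 18, 17, 18, 22],
    "May": [22, 22, 21, 22, 27, 27, 23, 21, 27, 27, 23, 25, 27, 22, 22],
    "Jun": [29, 32, 26, 26, 29, 25, 32, 37, 33, 33, 33, 26, 28, 29, 29],
    "Jul": [30, 35, 32, 34, 30, 37, 34, 34, 29, 38, 33, 27, 30, 34, 30],
    "Aug": [28, 33, 31, 29, 34, 34, 38, 34, 31, 32, 31, 33, 32, 34, 34],
    "Sep": [27, 22, 23, 26, 24, 33, 21, 27, 25, 23, 27, 23, 29, 23, 23],
    "Oct": [17, 26, 19, 22, 20, 22, 21, 22, 21, 20, 21, 23, 24, 21, 18],
    "Nov": [17, 22, 17, 12, 16, 14, 12, 20, 6, 15, 16, 13, 12, 20, 15],
    "Dec": [5, 8, 4, 9, 11, 14, 10, 7, 11, 9, 8, 11, 10, 11, 8],
}

monthly_mean = {m: sum(v) / len(v) for m, v in table.items()}   # mean per month
overall = sum(sum(v) for v in table.values()) / (12 * 15)        # mean over all
assert round(overall, 2) == 19.84        # matches the value in the text
```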
3) The sum of probabilities that x(n) takes any value ξ_i over the set A of all possible values of x(n) is a certain event. Its probability is 1,

Σ_{ξ_i ∈ A} P_{x(n)}(ξ_i) = 1.    (7.3)
Example 7.4. Consider a random signal whose values are equal to the numbers appearing in a die tossing. The set of possible signal values is ξ_i ∈ {1, 2, 3, 4, 5, 6}. Find

Probability{x(n) = 2 or x(n) = 5}

and

Probability{x(n) = 2 and x(n+1) = 5}.

★Events that x(n) = 2 and x(n) = 5 are obviously mutually exclusive. Thus,

Probability{x(n) = 2 or x(n) = 5} = P_{x(n)}(2) + P_{x(n)}(5) = 1/6 + 1/6 = 1/3.

The outcomes of two different tosses are independent, so that

Probability{x(n) = 2 and x(n+1) = 5} = (1/6)(1/6) = 1/36.
Example 7.5. Assume that a signal x(n) length is N and that the number of samples not affected by a high noise is N_I. The probability that one randomly chosen sample is not affected by the high noise is

P(1) = N_I/N,

since there are N samples in total and N_I of them are noise-free. Probability that the first randomly chosen sample is not affected by high noise and that, at the same time, the second randomly chosen sample is not affected by high noise is equal to the product of their probabilities,

P(2) = (N_I/N) · ((N_I − 1)/(N − 1)).

In general, the probability that M randomly chosen samples are all noise-free is

P(M) = Π_{i=0}^{M−1} (N_I − i)/(N − i).
The mean value is calculated as a sum over the set of possible amplitudes, weighted by the corresponding probabilities,

μ_x(n) = E{x(n)} = Σ_i ξ_i P_{x(n)}(ξ_i).    (7.4)

For random signals whose amplitudes are continuous, a probability density function p_{x(n)}(ξ) is used, with

∫_{−∞}^{∞} p_{x(n)}(ξ) dξ = 1.    (7.5)

The probability of an event that a value of signal x(n) is within a ≤ x(n) < b is

Probability{a ≤ x(n) < b} = ∫_a^b p_{x(n)}(ξ) dξ.

The mean value is

μ_x(n) = E{x(n)} = ∫_{−∞}^{∞} ξ p_{x(n)}(ξ) dξ.    (7.6)
7.1.3 Median
In addition to the mean value, a median is used for description of a set of
random values. The median is a value in the middle of the set, after the
members of the set are sorted. If we denote the sorted values of x (n) as s(n)
s(n) = sort{ x (n)}, n = 1, 2, ..., N
then the median value is
median{x(n)} = s((N + 1)/2), for an odd N.

If N is an even number then the median is defined as the mean value of the two samples nearest to (N + 1)/2,

median{x(n)} = [s(N/2) + s(N/2 + 1)]/2, for an even N.
The median will not be influenced by a possible small number of big outliers
(signal values being significantly different from the values of the rest of
data).
Figure 7.6 Sorted values sort{x(n)} and the median for: (a) A = {1, 1, 2, 4, 6, 9, 0}, (b) B = {1, 1, 1367, 4, 35, 9, 0}, and (c) the signal x(n) from Example 7.1.
In some cases the number of big outliers is small. Thus the median will neglect many signal values that could produce a good estimate of the mean value. In such cases, the best choice would be to use not only the mid-value in the sorted signal, but several samples of the signal around its median, and to calculate their mean, for odd N, as

LSmean{x(n)} = (1/(2L+1)) Σ_{i=−L}^{L} s((N+1)/2 + i).

With L = (N − 1)/2 all signal values are used and LSmean{x(n)} is the standard mean of the signal. With L = 0 the value of LSmean{x(n)} is the median.
For a random signal x (n) whose values are available in M realizations the
variance can be estimated as a mean square deviation of the signal values
from their corresponding mean values x (n),
x2 (n) =
#
1 "
| x1 (n) x (n)|2 + ... + | x M (n) x (n)|2 .
M
#
1 "
| x1 (n) x (n)|2 + ... + | x M (n) x (n)|2 .
M
(7.7)
For a small number of samples, this estimate tends to produce lower values
of the standard deviation. Thus, an adjusted version, the sample standard
deviation, is also used. It reads
x (n) =
#
1 "
|( x1 (n) x (n))|2 + ... + | x M (n) x (n)|2 .
M1
This form confirms the fact that in the case when only one sample is
available, M = 1, we should not be able to estimate the standard deviation.
For the case of random signals whose amplitude is continuous, the variance, in terms of the probability density function p_x(n)(ξ), is

σ_x²(n) = ∫ |ξ − μ_x(n)|² p_x(n)(ξ) dξ.
Table 7.3 Random signal z(n)

55 55 47 49 50 50 58 57 51 58
120 57 55 48 55 44 55 55 62 60
51 56 51 58 54 55 55 58 53 54
54 54 56 53 47 50 55 58 58 57
49 59 48 58 56 58 53 54 59 55
44 52 59 59 50 58 60 53 60 52
60 66 63 59 62 63 46 54 50 52
52 54 52 61 51 58 57 48 56 57
57 56 59 58 58 52 59 54 56 50
56
Figure 7.7 A realization of the random signal z(n) from Table 7.3, along with its mean value mean(z).
Example 7.7. For the signal x(n) from Example 7.1 calculate the mean and variance. Compare them with the mean and variance of the signal z(n) given in Table 7.3.
The mean value and variance for the signal x(n) are μ_x = 55.76 and σ_x² = 314.3863. The standard deviation is σ_x = 17.7309. It is a measure of the signal value deviations from the mean value. For the signal z(n) the mean value is μ_z = 55.14 (very close to μ_x), while the variance is σ_z² = 18.7277 and the standard deviation is σ_z = 4.3275. Deviations of z(n) from the mean value are much smaller. If the signals x(n) and z(n) were measurements of the same physical quantity, then the individual measurements from z(n) would be much more reliable than the individual measurements from x(n).
Example 7.8. A random signal x(n) can take values from the set {0, 1, 2, 3, 4, 5}. It is known that for k = 0, 1, 2, 3, 4 the probability of x(n) = k is twice as high as the probability of x(n) = k + 1. Find the probabilities P{x(n) = k}. Find the mean value and variance of the signal.

With P{x(n) = k} = 2P{x(n) = k + 1}, the probabilities are proportional to 32, 16, 8, 4, 2, 1:

k          0    1    2   3   4   5
P{x(n)=k}  32A  16A  8A  4A  2A  A

Since the probabilities sum up to one, 63A = 1, that is, A = 1/63.
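The normalization and moments can be verified with exact arithmetic (a sketch restating the doubling rule from the example):

```python
from fractions import Fraction

# P{x(n) = k} = 2 P{x(n) = k+1}, so the probabilities are proportional to 32,...,1
weights = [2 ** (5 - k) for k in range(6)]        # 32, 16, 8, 4, 2, 1
A = Fraction(1, sum(weights))                     # 63 A = 1  ->  A = 1/63
P = [w * A for w in weights]
mean = sum(k * P[k] for k in range(6))
var = sum((k - mean) ** 2 * P[k] for k in range(6))
print(A, mean, var)                               # 1/63, 19/21, 626/441
```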
Example 7.9. Consider a real-valued random signal x(n) with samples whose probability density function is uniform, p_x(n)(ξ) = 1/2 for −1 ≤ ξ < 1. Its mean and variance are

μ_x(n) = ∫_{−1}^{1} ξ (1/2) dξ = 0

σ_x²(n) = ∫_{−1}^{1} (ξ − μ_x(n))² (1/2) dξ = ∫_{−1}^{1} ξ² (1/2) dξ = 1/3.
Let y(n) = x²(n). Its distribution function is

F_y(ξ) = 0 for ξ < 0,
F_y(ξ) = P{x²(n) ≤ ξ} = ∫_{−√ξ}^{√ξ} (1/2) dζ = √ξ for 0 ≤ ξ ≤ 1,
F_y(ξ) = 1 for ξ > 1,

so the probability density function of y(n) is

p_y(n)(ξ) = dF_y(ξ)/dξ = 1/(2√ξ) for 0 < ξ ≤ 1, and 0 otherwise.

The mean value and variance of signal y(n) are

μ_y(n) = ∫_0^1 ξ (1/(2√ξ)) dξ = 1/3

σ_y²(n) = ∫_0^1 (ξ − 1/3)² (1/(2√ξ)) dξ = 4/45.
Probabilities that two random signals x(n) and y(m) take values within given ranges follow from their joint probability density function,

P{a ≤ x(n) < b, c ≤ y(m) < d} = ∫_c^d ∫_a^b p_x(n),y(m)(ξ, η) dξ dη.

For mutually independent signals p_x(n),y(m)(ξ, η) = p_x(n)(ξ) p_y(m)(η). A special case of the previous relations is obtained when y(m) = x(m).
Example 7.10. Signal x(n) is defined as x(n) = a(n) + b(n) + c(n), where a(n), b(n), and c(n) are mutually independent random signals with a uniform probability density function over the range [−1, 1). Find the probability density function of signal x(n), its mean μ_x, and variance σ_x².
The probability density function of a sum of independent signals is the convolution of their probability density functions. For the sum of a(n) and b(n),

p_{a+b}(ξ) = ∫ p_b(n)(ξ − η) p_a(n)(η) dη,

and one more convolution with p_c(n)(ξ) produces

p_x(n)(ξ) = (ξ + 3)²/16 for −3 ≤ ξ ≤ −1
p_x(n)(ξ) = (3 − ξ²)/8 for −1 < ξ ≤ 1
p_x(n)(ξ) = (3 − ξ)²/16 for 1 < ξ ≤ 3
p_x(n)(ξ) = 0 for |ξ| > 3.
The mean value and variance can be calculated from p_x(n)(ξ), or in a direct way, as

μ_x = E{x(n)} = E{a(n)} + E{b(n)} + E{c(n)} = 0

σ_x² = E{(x(n) − μ_x)²} = E{(a(n) + b(n) + c(n))²}
= E{a²(n)} + E{b²(n)} + E{c²(n)} + 2(μ_a μ_b + μ_a μ_c + μ_b μ_c)
= 1/3 + 1/3 + 1/3 = 1.
7.2
SECOND-ORDER STATISTICS

The autocorrelation function of a random signal x(n) is defined as r_xx(n, m) = E{x(n)x*(m)}. Based on M available realizations it can be estimated as

r_xx(n, m) = (1/M) Σ_{i=1}^{M} x_i(n) x_i*(m).    (7.8)

In terms of the second-order probability density function it is

r_xx(n, m) = ∫∫ ξ₁ ξ₂ p_x(n),x(m)(ξ₁, ξ₂) dξ₁ dξ₂.    (7.10)

The autocovariance function c_xx(n, m) = E{(x(n) − μ_x(n))(x(m) − μ_x(m))*} can be estimated as

c_xx(n, m) = (1/M) Σ_{i=1}^{M} (x_i(n) − μ_x(n)) (x_i(m) − μ_x(m))*.    (7.11)
A signal is stationary in the wide sense if its mean value is constant,

μ_x(n) = μ_x,    (7.13)

and its autocorrelation function depends on the time difference only,

r_xx(n, m) = r_xx(n − m).    (7.14)
A signal is stationary in the strict sense (SSS) if all order statistics are
invariant to a shift in time. The relations introduced for the second-order
statistics may be extended to the higher-order statistics. For example, the
third-order moment of a signal x (n) is defined by
Mxxx (n, m, l ) = E{ x (n) x (m) x (l )}.
(7.15)
The power spectral density of a stationary signal is the Fourier transform of its autocorrelation function,

S_xx(e^{jω}) = Σ_{n=−∞}^{∞} r_xx(n) e^{−jωn},    (7.16)

with the inverse relation

r_xx(n) = (1/2π) ∫_{−π}^{π} S_xx(e^{jω}) e^{jωn} dω.    (7.17)
Consider a signal composed of K sinusoids with random phases,

x(n) = Σ_{k=1}^{K} a_k e^{j(ω_k n + φ_k)},    (7.18)

where the phases φ_k are mutually independent random variables, uniformly distributed over [−π, π). The mean value of this signal is

μ_x(n) = Σ_{k=1}^{K} a_k E{e^{j(ω_k n + φ_k)}} = Σ_{k=1}^{K} a_k (1/2π) ∫_{−π}^{π} e^{j(ω_k n + φ_k)} dφ_k = 0.

The autocorrelation is

r_xx(n) = E{ Σ_{k=1}^{K} a_k e^{j(ω_k (m+n) + φ_k)} Σ_{l=1}^{K} a_l e^{−j(ω_l m + φ_l)} } = Σ_{k=1}^{K} a_k² e^{jω_k n},

since the cross terms with k ≠ l average to zero. The corresponding power spectral density is

S_xx(e^{jω}) = 2π Σ_{k=1}^{K} a_k² δ(ω − ω_k).
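The zero mean and the autocorrelation of the random-phase sinusoids can be checked over an ensemble of realizations (a sketch; the amplitudes, frequencies, and ensemble size are our assumptions):

```python
import cmath
import random

random.seed(0)
amps, freqs = [1.0, 0.5], [0.4, 1.1]       # assumed a_k and omega_k values
M = 20_000                                  # number of realizations

def sample(n, phases):
    return sum(a * cmath.exp(1j * (w * n + p))
               for a, w, p in zip(amps, freqs, phases))

all_phases = [[random.uniform(-cmath.pi, cmath.pi) for _ in amps] for _ in range(M)]
mean0 = sum(sample(0, ph) for ph in all_phases) / M
lag = 3
r_est = sum(sample(lag, ph) * sample(0, ph).conjugate() for ph in all_phases) / M
r_theory = sum(a * a * cmath.exp(1j * w * lag) for a, w in zip(amps, freqs))
print(abs(mean0), abs(r_est - r_theory))    # both close to zero
```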
Recall that the average power of a signal x(n) has been defined as

P_AV = lim_{N→∞} (1/(2N+1)) Σ_{n=−N}^{N} |x(n)|².

In analogy, for random signals a power spectral density may be defined as

P_xx(e^{jω}) = lim_{N→∞} (1/(2N+1)) E{ | Σ_{n=−N}^{N} x(n) e^{−jωn} |² }.    (7.19)

Different notation is used since the previous two definitions, (7.16) and (7.19), of power spectral density will not produce the same result, in general. We can write

P_xx(e^{jω}) = lim_{N→∞} (1/(2N+1)) Σ_{m=−N}^{N} Σ_{n=−N}^{N} E{x(m) x*(n)} e^{−jω(m−n)}.

For a stationary signal

P_xx(e^{jω}) = lim_{N→∞} (1/(2N+1)) Σ_{m=−N}^{N} Σ_{n=−N}^{N} r_xx(m − n) e^{−jω(m−n)}
= lim_{N→∞} Σ_{k=−2N}^{2N} (1 − |k|/(2N+1)) r_xx(k) e^{−jωk} = lim_{N→∞} Σ_{k=−2N}^{2N} w_B(k) r_xx(k) e^{−jωk}.
The function w_B(k) corresponds to a Bartlett window over the calculation interval. If the values of the autocorrelation function r_xx(k) are such that the term Σ_k (|k|/(2N+1)) r_xx(k) e^{−jωk} is negligible as compared to Σ_k r_xx(k) e^{−jωk}, then

P_xx(e^{jω}) = lim_{N→∞} Σ_{k=−2N}^{2N} r_xx(k) e^{−jωk} = S_xx(e^{jω}).
7.3
NOISE

A noise ε(n) is white if its autocorrelation function is r_εε(n) = σ_ε² δ(n), with the power spectral density

S_εε(e^{jω}) = FT{r_εε(n)} = σ_ε².    (7.20)

The spectral density of this kind of noise is constant (as is the case with white light). If this property is not satisfied, then the power spectral density is not constant. Such a noise is referred to as colored.

Regarding the distribution of the noise ε(n) amplitudes, the most common types of noise in signal processing are: uniform, binary, Gaussian, and impulsive noise.

7.3.1 Uniform Noise

The uniform noise is a signal with the probability density function

p_ε(n)(ξ) = 1/Δ for −Δ/2 < ξ ≤ Δ/2, and 0 otherwise.    (7.21)
Figure 7.8 A realization of uniform noise (left) with its probability density function (right).

The mean of this noise is zero, and its variance is

σ_ε² = ∫_{−Δ/2}^{Δ/2} ξ² (1/Δ) dξ = Δ²/12.
7.3.2 Binary Noise

The binary noise takes values from the set {−1, 1}, with probabilities P_ε(−1) = 1 − p and P_ε(1) = p. Its mean value is

μ_ε = Σ_{ξ=−1,1} ξ P_ε(ξ) = (−1)(1 − p) + 1 · p = 2p − 1.

The variance is

σ_ε² = Σ_{ξ=−1,1} (ξ − μ_ε)² P_ε(ξ) = 4p(1 − p).

A special case is when the values from the set {−1, 1} are equally probable, that is, when p = 1/2. Then we get μ_ε = 0 and σ_ε² = 1.
Example 7.12. Consider a set of N balls, with N being large. An equal number of balls is marked with 1 (white) and 0 (black). A random signal x(n) corresponds to drawing four balls in a row. It has four values x(0), x(1), x(2), and x(3), equal to the marks on the drawn balls. Write all possible realizations of x(n). If k is the number of appearances of the value 1 in the signal, write the probabilities for each value of k.

The signal realizations, with k being the number of appearances of the digit 1 in each realization, are given in the next table.
x(0) x(1) x(2) x(3) | k
0 0 0 0 | 0
0 0 0 1 | 1
0 0 1 0 | 1
0 0 1 1 | 2
0 1 0 0 | 1
0 1 0 1 | 2
0 1 1 0 | 2
0 1 1 1 | 3
1 0 0 0 | 1
1 0 0 1 | 2
1 0 1 0 | 2
1 0 1 1 | 3
1 1 0 0 | 2
1 1 0 1 | 3
1 1 1 0 | 3
1 1 1 1 | 4
The corresponding probabilities are

P(0) = (1/2)(1/2)(1/2)(1/2) = 1/16,
P(1) = 4 (1/2)(1/2)(1/2)(1/2) = 4/16,
P(2) = 6 (1/2)(1/2)(1/2)(1/2) = 6/16,
P(3) = 4 (1/2)(1/2)(1/2)(1/2) = 4/16, and
P(4) = 1 (1/2)(1/2)(1/2)(1/2) = 1/16.

These probabilities correspond to the terms of the binomial expansion

(a + b)⁴ = C(4,0) a⁴ + C(4,1) a³b + C(4,2) a²b² + C(4,3) ab³ + C(4,4) b⁴

with a = 1/2 and b = 1/2. For the case when N is a finite number see Problem 7.6.
An interesting form of random variable that can assume only two possible values, {−1, 1} or {No, Yes} or {A, B}, is the binomial random variable. It has been introduced through the previous simple example. In general, if a signal x(n) assumes the value B from the set {A, B} with probability p, then the probability that there are exactly k values of B in a sequence of N samples of x(n) is

P(k) = C(N, k) p^k (1 − p)^{N−k} = (N!/(k!(N−k)!)) p^k (1 − p)^{N−k}.
Let y be the number of appearances of the value B. Its mean value is

μ_y = E{y} = Σ_{k=0}^{N} k P(k) = Σ_{k=0}^{N} k (N!/(k!(N−k)!)) p^k (1 − p)^{N−k}.

Since the first term in the summation is 0, we can shift the summation by one and reindex it to

μ_y = E{y} = Σ_{k=0}^{N−1} (N(N−1)!/(k!((N−1)−k)!)) p^{k+1} (1 − p)^{(N−1)−k}
= Np Σ_{k=0}^{N−1} ((N−1)!/(k!((N−1)−k)!)) p^k (1 − p)^{(N−1)−k} = Np,

since Σ_{k=0}^{N−1} C(N−1, k) p^k (1 − p)^{(N−1)−k} = (p + (1 − p))^{N−1} = 1. For the variance we also need

E{y(y−1)} = Σ_{k=0}^{N} k(k−1) P(k) = Σ_{k=0}^{N} k(k−1) (N!/(k!(N−k)!)) p^k (1 − p)^{N−k}.
Since the first two terms are 0, we can reindex the summation into

E{y(y−1)} = Σ_{k=0}^{N−2} (N!/(k!((N−2)−k)!)) p^{k+2} (1 − p)^{(N−2)−k}
= N(N−1)p² Σ_{k=0}^{N−2} ((N−2)!/(k!((N−2)−k)!)) p^k (1 − p)^{(N−2)−k}.

The relation

Σ_{k=0}^{N−2} ((N−2)!/(k!((N−2)−k)!)) p^k (1 − p)^{(N−2)−k} = (p + (1 − p))^{N−2} = 1

is used to get

E{y(y−1)} = N(N−1)p².
The variance of y is now

σ_y² = E{y²} − (E{y})² = E{y(y−1)} + E{y} − (E{y})² = N(N−1)p² + Np − N²p² = Np(1 − p),

σ_y = sqrt(Np(1 − p)).

As the total number of values N increases, the variance of the relative frequency y/N decreases, so a finite realization of x(n) will produce a more reliable estimate of the mean value p.
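The binomial mean Np and variance Np(1 − p) can be verified with exact arithmetic for a concrete case (a sketch; N and p are assumed values):

```python
from fractions import Fraction
from math import comb

N, p = 20, Fraction(1, 4)
P = [comb(N, k) * p ** k * (1 - p) ** (N - k) for k in range(N + 1)]
mean = sum(k * P[k] for k in range(N + 1))
second_fact = sum(k * (k - 1) * P[k] for k in range(N + 1))   # E{y(y-1)}
var = second_fact + mean - mean ** 2
print(mean == N * p, second_fact == N * (N - 1) * p ** 2, var == N * p * (1 - p))
# True True True
```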
7.3.3 Gaussian Noise
The Gaussian (normal) noise is used to model a disturbance caused by many small independent factors. Namely, the central limit theorem states that the sum of a large number of statistically independent random variables, with any distribution, tends to the Gaussian (normal) distribution.

The Gaussian zero-mean noise has the probability density function

p_ε(n)(ξ) = (1/(√(2π)σ)) e^{−ξ²/(2σ²)}.    (7.22)
Figure 7.9 A realization of Gaussian noise ε(n) (left) with its probability density function (right).
The probability that the zero-mean Gaussian noise ε(n) takes a value within ±λ is

P{−λ < ε(n) < λ} = ∫_{−λ}^{λ} (1/(√(2π)σ)) e^{−ξ²/(2σ²)} dξ = erf(λ/(√2 σ)),    (7.23)

where

erf(λ) = (2/√π) ∫_0^λ e^{−ξ²} dξ.    (7.24)

For example,

Probability{−σ < ε(n) < σ} = erf(1/√2) = 0.6827,
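These probabilities follow directly from the error function available in the standard library (a quick numeric check):

```python
from math import erf, sqrt

# P{-k*sigma < eps(n) < k*sigma} = erf(k / sqrt(2)) for zero-mean Gaussian noise
for k in (1, 2, 3):
    print(k, round(erf(k / sqrt(2)), 4))   # 0.6827, 0.9545, 0.9973

p = erf(2.5 / (sqrt(2) * 1.031))           # sigma = 1.031, interval (-2.5, 2.5)
print(round(p, 4))                         # 0.9847
```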
Figure 7.10 Probability density function with intervals corresponding to −σ < ε(n) < σ, −2σ < ε(n) < 2σ, and −3σ < ε(n) < 3σ. The value σ = 1 is used.
For example, the probability that a zero-mean Gaussian noise with σ = 1.031 takes a value within ±2.5 is

P = (1/(1.031√(2π))) ∫_{−2.5}^{2.5} e^{−ξ²/(2 · 1.031²)} dξ = erf(2.5/(√2 · 1.031)) = 0.9847.

Consider now N independent samples of a Gaussian noise with the probability density function

p_x(n)(ξ) = (1/(√(2π)σ)) e^{−ξ²/(2σ²)}.

The probability that one of these samples, at n = n₀, is smaller than a value Ξ can be written using (7.23) as

P(Ξ) = Probability{x(n) < Ξ, n = n₀} = 0.5 + 0.5 erf(Ξ/(√2 σ)).
The probability that all N − 1 noise-only values of x(n), n ≠ n₀, are smaller than Ξ is

P_{N−1}(Ξ) = Probability{all N − 1 values of x(n) < Ξ, n ≠ n₀} = [0.5 + 0.5 erf(Ξ/(√2 σ))]^{N−1}.
The sample at n = n₀ contains the signal, with the probability density function

p_x(n₀)(ξ) = (1/(√(2π)σ)) e^{−(ξ−A)²/(2σ²)}.

The probability that the random variable x(n₀) takes a value around Ξ, Ξ ≤ x(n₀) < Ξ + dΞ, is

P_{n₀}⁺(Ξ) = Probability{Ξ ≤ x(n₀) < Ξ + dΞ} = (1/(√(2π)σ)) e^{−(Ξ−A)²/(2σ²)} dΞ.    (7.25)
The probability that all N − 1 noise-only values are below a given x(n₀) = Ξ is

P_A(Ξ) = P_{N−1}(Ξ) (1/(√(2π)σ)) e^{−(Ξ−A)²/(2σ²)} dΞ = [0.5 + 0.5 erf(Ξ/(√2 σ))]^{N−1} (1/(√(2π)σ)) e^{−(Ξ−A)²/(2σ²)} dΞ,

while the total probability that all x(n), 0 ≤ n ≤ N − 1, n ≠ n₀, are below x(n₀) is an integral over all possible values of Ξ,

P_A = ∫_{−∞}^{∞} [0.5 + 0.5 erf(ξ/(√2 σ))]^{N−1} (1/(√(2π)σ)) e^{−(ξ−A)²/(2σ²)} dξ.    (7.26)
Example 7.15. Random signal x(n) is a Gaussian noise with the mean μ_x = 1 and variance σ_x² = 1. A random sequence y(n) is obtained by omitting the samples of signal x(n) that are either negative or higher than 1. Find the probability density function of the sequence y(n). Find its μ_y and σ_y.

The probability density function of y(n) keeps the Gaussian form on the remaining interval,

p_y(n)(ξ) = B (1/√(2π)) e^{−(ξ−1)²/2} for 0 ≤ ξ ≤ 1, and 0 otherwise,

where the constant B follows from the condition ∫ p_y(n)(ξ) dξ = 1, resulting in

B = 2/erf(1/√2).
Now we have

μ_y(n) = ∫_0^1 ξ (2/(erf(1/√2)√(2π))) e^{−(ξ−1)²/2} dξ = 1 − 2(1 − e^{−1/2})/(erf(1/√2)√(2π)) ≈ 0.54

and

σ_y²(n) = ∫_0^1 (ξ − μ_y(n))² (2/(erf(1/√2)√(2π))) e^{−(ξ−1)²/2} dξ ≈ 0.08.
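The mean and variance of the truncated Gaussian can be confirmed by numerical integration (a sketch; a simple midpoint rule is used):

```python
from math import erf, exp, pi, sqrt

B = 2 / erf(1 / sqrt(2))                   # normalization constant from the example

def p_y(xi):                               # pdf of y(n) on [0, 1]
    return B * exp(-(xi - 1) ** 2 / 2) / sqrt(2 * pi)

K = 100_000                                # midpoint-rule integration over [0, 1]
xs = [(i + 0.5) / K for i in range(K)]
total = sum(p_y(x) for x in xs) / K
mu = sum(x * p_y(x) for x in xs) / K
var = sum((x - mu) ** 2 * p_y(x) for x in xs) / K
print(round(total, 4), round(mu, 2), round(var, 2))   # 1.0, 0.54, 0.08
```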
Example 7.16. Consider a random signal x(n) that can assume the values {No, Yes} with probabilities 1 − p and p. A random realization of this signal is available with N = 1000 samples, and the event Yes appeared 555 times. Find the interval where the true p will be with probability 0.95. Denote by y the number of observed Yes values divided by N. We can assume that the mean value estimates for various realizations are Gaussian distributed.

The relative frequency y follows from a binomial random variable, with the mean p and the variance

σ_y² = p(1 − p)/N ≈ (555/1000)(1 − 555/1000)/1000 = 0.2470/1000,

σ_y = 0.0157.

A Gaussian random variable stays within two standard deviations of its mean with probability 0.95, so with this probability the estimate y = 555/1000 lies within [p − 0.0314, p + 0.0314], that is, the true p is within 0.555 ∓ 0.0314.
Consider a complex-valued Gaussian noise ε(n) = ε_r(n) + jε_i(n), with independent zero-mean real and imaginary parts of unit variance. Their joint probability density function is

p_{ε_r,ε_i}(ξ, η) = (1/2π) e^{−(ξ²+η²)/2}.

The distribution function of the magnitude |ε(n)| is

F_{|ε(n)|}(λ) = P{ sqrt(ε_r²(n) + ε_i²(n)) < λ } = ∬_{ξ²+η²<λ²} (1/2π) e^{−(ξ²+η²)/2} dξ dη.

With ξ = ρ cos θ and η = ρ sin θ (the Jacobian of the polar coordinate transformation is |J| = ρ) we get

F_{|ε(n)|}(λ) = (1/2π) ∫_0^{2π} ∫_0^λ ρ e^{−ρ²/2} dρ dθ = ∫_0^λ ρ e^{−ρ²/2} dρ = 1 − e^{−λ²/2},

so the magnitude has the Rayleigh probability density function

p_{|ε(n)|}(ξ) = dF_{|ε(n)|}(ξ)/dξ = ξ e^{−ξ²/2} u(ξ).    (7.27)

For an arbitrary variance σ² of the real and imaginary parts, the Rayleigh probability density function is

p(ξ) = (ξ/σ²) e^{−ξ²/(2σ²)} u(ξ).

The Laplacian noise, often used as a model for impulsive disturbances, has the probability density function

p(ξ) = (1/(2ν)) e^{−|ξ|/ν}.
Figure 7.11 The Gaussian and Laplacian noise histograms (with 10000 realizations), with the corresponding probability density functions (dots).
An impulsive noise may also be generated as the ratio of two independent Gaussian noises, ε(n) = ε₁(n)/ε₂(n). Both the mean and the variance are signal dependent in the case of multiplicative noise.
7.4
NOISY SIGNALS IN THE DFT DOMAIN

Consider a noisy signal

x(n) = s(n) + ε(n),    (7.28)

where s(n) is a deterministic useful signal and ε(n) is an additive noise. The DFT of this signal is

X(k) = Σ_{n=0}^{N−1} x(n) e^{−j2πkn/N} = Σ_{n=0}^{N−1} s(n) e^{−j2πkn/N} + Σ_{n=0}^{N−1} ε(n) e^{−j2πkn/N} = S(k) + Ξ(k).    (7.29)
For a zero-mean white noise ε(n) with variance σ_ε², the DFT noise component Ξ(k) is zero-mean,

E{Ξ(k)} = Σ_{n=0}^{N−1} E{ε(n)} e^{−j2πkn/N} = 0,    (7.30)

with the variance

σ_Ξ²(k) = Σ_{n₁=0}^{N−1} Σ_{n₂=0}^{N−1} E{ε(n₁) ε*(n₂)} e^{−j2πk(n₁−n₂)/N} = σ_ε² Σ_{n₁=0}^{N−1} Σ_{n₂=0}^{N−1} δ(n₁ − n₂) e^{−j2πk(n₁−n₂)/N},    (7.31)

σ_Ξ²(k) = σ_ε² N.    (7.32)

If the signal is a complex sinusoid,

s(n) = A e^{j2πk₀n/N},    (7.33)

with a frequency adjusted to the grid, ω₀ = 2πk₀/N, then its DFT is

S(k) = AN δ(k − k₀).

The peak signal-to-noise ratio, being the relevant parameter for the DFT-based estimation of frequency, is

PSNR_out = |S(k₀)|²/σ_Ξ²(k₀) = N²A²/(σ_ε²N) = NA²/σ_ε².    (7.34)
Figure 7.12 Illustration of a signal x(n) = cos(6πn/64) and its DFT (top row); the same signal corrupted by additive zero-mean real-valued Gaussian noise of variance σ² = 1/4, along with its DFT (bottom row).
The input signal-to-noise ratio is

SNR_in = E_s/E_ε = Σ_{n=0}^{N−1} |s(n)|² / Σ_{n=0}^{N−1} E{|ε(n)|²} = NA²/(Nσ_ε²) = A²/σ_ε².    (7.35)
If the maximal DFT value is detected, then only this value can be used for the signal reconstruction (equivalent to a notch filter at k = k₀ being used). The DFT of the output signal is then

Y(k) = X(k) δ(k − k₀).

The output signal in the discrete-time domain is

y(n) = (1/N) Σ_{k=0}^{N−1} Y(k) e^{j2πkn/N} = (X(k₀)/N) e^{j2πk₀n/N} = A e^{j2πk₀n/N} + (Ξ(k₀)/N) e^{j2πk₀n/N}.
The output signal-to-noise ratio is

SNR_out = E_s/E_Ξ = Σ_{n=0}^{N−1} |A e^{j2πk₀n/N}|² / Σ_{n=0}^{N−1} E{ |(Ξ(k₀)/N) e^{j2πk₀n/N}|² } = NA²/(N σ_ε²N/N²) = N A²/σ_ε² = N SNR_in.
Taking 10 log(·) of both sides, we get the signal-to-noise ratio relation in dB,

SNR_out [dB] = 10 log N + SNR_in [dB].    (7.36)
Example 7.18. If the DFT of a noisy signal s(n) + ε(n) is calculated using a window function w(n), find its mean and variance. The noise is white, r_εε(n) = σ_ε² δ(n), with zero mean.

Here,

X(k) = Σ_{n=0}^{N−1} w(n)[s(n) + ε(n)] e^{−j2πkn/N},

with the mean

E{X(k)} = Σ_{n=0}^{N−1} w(n) s(n) e^{−j2πkn/N}

and the variance

σ_X²(k) = Σ_{n₁=0}^{N−1} Σ_{n₂=0}^{N−1} w(n₁) w*(n₂) E{ε(n₁) ε*(n₂)} e^{−j2πk(n₁−n₂)/N} = σ_ε² Σ_{n=0}^{N−1} |w(n)|² = σ_ε² E_w,    (7.37)

where E_w is the window energy.
Example 7.19. The DFT definition, for a given frequency index k, can be understood as

X(k) = Σ_{n=0}^{N−1} x(n) e^{−j2πkn/N} = N mean_{n=0,1,...,N−1} { x(n) e^{−j2πkn/N} }.    (7.38)

In some cases, a median-based estimate

X_R(k) = N median_{n=0,1,...,N−1} { Re[x(n) e^{−j2πkn/N}] } + jN median_{n=0,1,...,N−1} { Im[x(n) e^{−j2πkn/N}] }    (7.39)

can produce better results than (7.38). Calculate the value X(0) using (7.38) and estimate it by (7.39) for s(n) = exp(j4πn/N) with N = 8 and the noise ε(n) = 2001δ(n) − 204δ(n − 3). Which one is closer to the noise-free DFT value?
If we can expect a strong impulsive noise, then the mean value will be highly sensitive to this noise. The median-based calculation is less sensitive to strong impulsive noise. For the given signal

s(n) = exp(jπn/2) = [1, j, −1, −j, 1, j, −1, −j]

and noise ε(n), the value of X(0) is

X(0) = 0 + 2001 − 204 = 1797.    (7.40)

The median-based estimate is

X_R(0) = 8 median_{n=0,1,...,N−1} {2002, 0, −1, −204, 1, 0, −1, 0} + j8 median_{n=0,1,...,N−1} {0, 1, 0, −1, 0, 1, 0, −1} = 0,    (7.41)

which is equal to the noise-free DFT value X(0) = 0, since S(0) = 0 for this signal.
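The two estimates can be checked quickly (a Python sketch; the signal and noise are taken as s(n) = exp(jπn/2) and ε(n) = 2001δ(n) − 204δ(n − 3)):

```python
from statistics import median

N = 8
s = [1, 1j, -1, -1j] * 2                   # exp(j*pi*n/2), n = 0,...,7
eps = [0] * N
eps[0], eps[3] = 2001, -204                # impulsive noise samples
x = [si + ei for si, ei in zip(s, eps)]

X0 = sum(x)                                # mean-based DFT value at k = 0
XR0 = N * median(z.real for z in x) + 1j * N * median(z.imag for z in x)
print(X0, XR0)                             # (1797+0j) versus 0 (the noise-free value)
```

The two impulses dominate the sum, while the median of the real and imaginary parts ignores them completely.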
Since the noise is Gaussian, the probability density function of the error e(n, θ) = x(n) − b e^{j2πkn/N} is

p(e(n, θ)) = (1/(2πσ²)) e^{−|e(n,θ)|²/(2σ²)}.

The joint probability density function for all samples from the data set is equal to the product of the individual probability density functions,

p_e(e(0, θ), e(1, θ), ..., e(N−1, θ)) = (1/(2πσ²)^N) e^{−Σ_{n=0}^{N−1} |e(n,θ)|²/(2σ²)}.

Its maximization over the parameter b is equivalent to the minimization of

ε(θ) = Σ_{n=0}^{N−1} |e(n, θ)|² = Σ_{n=0}^{N−1} | x(n) − b e^{j2πkn/N} |².    (7.42)
(7.42)
1
N
#
"
1
x (n)e j2kn/N = mean x (n)e j2kn/N = X (k ).
N
n =0
N 1
N 1
n =0
| x (n) be
j2kn/N 2
| =
N 1
n =0
| x (n)|
N | b |2 .
If the additive noise were, for example, Laplacian, then the probability density function would be p(e(n, θ)) = (1/(2ν)) e^{−|e(n,θ)|/ν}, and the solution of the ε(θ) = Σ_{n=0}^{N−1} |e(n, θ)| minimization would follow from

X(k) = N median_{n=0,1,...,N−1} { x(n) e^{−j2πkn/N} }.
Consider again the DFT of a complex sinusoid s(n) = A e^{j2πk₀n/N} in white Gaussian noise, with σ_Ξ²(k) = σ²N and E{Ξ(k)} = 0. The real and imaginary parts of the DFT X(k₀) at the signal position k = k₀ are Gaussian random variables, with total variance σ²N, or

Re{X(k₀)} ~ N(NA, σ²N/2), Im{X(k₀)} ~ N(0, σ²N/2),    (7.43)

while at the noise-only positions k ≠ k₀ both parts are N(0, σ²N/2).
Next, we will find the probability that a DFT value of noise at any k ≠ k₀ is higher than the signal DFT value at k = k₀. This case corresponds to a false detection of the signal frequency. The absolute DFT value at a noise-only position is Rayleigh distributed,

p(ξ) = (2ξ/(σ²N)) e^{−ξ²/(σ²N)}, ξ ≥ 0.

The DFT at a noise-only position takes a value greater than Ξ with probability

Q(Ξ) = ∫_Ξ^∞ (2ξ/(σ²N)) e^{−ξ²/(σ²N)} dξ = exp(−Ξ²/(σ²N)).    (7.44)
The probability that a DFT value of noise only is lower than Ξ is [1 − Q(Ξ)]. The total number of noise-only points in the DFT is M = N − 1. The probability that M independent DFT noise-only values are all lower than Ξ is [1 − Q(Ξ)]^M. The probability that at least one of the M DFT noise-only values is greater than Ξ is

G(Ξ) = 1 − [1 − Q(Ξ)]^M.    (7.45)
The probability density function of the absolute DFT value at the position of the signal (whose real and imaginary parts are described by (7.43)) is the Rice distribution,

p(ξ) = (2ξ/(σ²N)) e^{−(ξ² + N²A²)/(σ²N)} I₀(2NAξ/(σ²N)), ξ ≥ 0,    (7.46)

where I₀(·) is the zeroth-order modified Bessel function.
The probability of an error in the DFT-based detection of the signal frequency is

P_E = ∫_0^∞ G(ξ) p(ξ) dξ = ∫_0^∞ { 1 − [1 − exp(−ξ²/(σ²N))]^M } (2ξ/(σ²N)) e^{−(ξ² + N²A²)/(σ²N)} I₀(2NAξ/(σ²N)) dξ.    (7.47)
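The event behind (7.47), that some noise-only DFT bin exceeds the signal bin, can also be estimated by simulation (a sketch; the parameter values and the splitting of σ² equally between the real and imaginary noise parts are our assumptions):

```python
import cmath
import random

def detection_error_rate(N, A, sigma, k0, trials, seed=2):
    """Fraction of trials where a noise-only DFT bin exceeds the signal bin."""
    rng = random.Random(seed)
    errors = 0
    for _ in range(trials):
        x = [A * cmath.exp(2j * cmath.pi * k0 * n / N)
             + rng.gauss(0, sigma / 2 ** 0.5) + 1j * rng.gauss(0, sigma / 2 ** 0.5)
             for n in range(N)]
        X = [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
             for k in range(N)]
        if max(range(N), key=lambda k: abs(X[k])) != k0:
            errors += 1
    return errors / trials

print(detection_error_rate(16, 1.0, 0.1, 3, 200))   # low noise: errors ~ never
print(detection_error_rate(16, 1.0, 4.0, 3, 200))   # heavy noise: frequent errors
```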
Consider a random signal x(n) at the input of a linear time-invariant system with impulse response h(n),

y(n) = Σ_{k=−∞}^{∞} h(k) x(n − k) = h(n) *_n x(n).    (7.48)

The mean of the output signal is

μ_y(n) = Σ_{k=−∞}^{∞} h(k) E{x(n − k)} = h(n) *_n μ_x(n).    (7.49)

For a stationary input signal, μ_x(n) = μ_x, it reduces to

μ_y = μ_x Σ_{k=−∞}^{∞} h(k),    (7.50)

μ_y = μ_x H(e^{j0}).    (7.51)
The cross-correlation between the output and input signals is

r_yx(n, m) = E{y(n) x(m)} = Σ_{k=−∞}^{∞} E{x(k) x(m)} h(n − k) = Σ_{k=−∞}^{∞} r_xx(k, m) h(n − k).    (7.52)

For stationary signals, with l = n − m,

r_yx(l) = Σ_{p=−∞}^{∞} r_xx(p) h(l − p) = r_xx(l) *_l h(l).

Similarly,

r_xy(n, m) = E{x(n) y(m)} = Σ_{k=−∞}^{∞} E{x(n) x(k)} h(m − k) = Σ_{k=−∞}^{∞} r_xx(n, k) h(m − k),    (7.53)

which in the stationary case reduces to

r_xy(l) = Σ_{p=−∞}^{∞} r_xx(p) h(p − l).

The z-transform of this relation, for a real-valued impulse response h(n), is

R_xy(z) = Σ_{l=−∞}^{∞} r_xy(l) z^{−l} = Σ_{l=−∞}^{∞} Σ_{p=−∞}^{∞} r_xx(p) h(p − l) z^{−l} = Σ_{k=−∞}^{∞} Σ_{p=−∞}^{∞} r_xx(p) h(k) z^{−p} (z^{−1})^{−k},

that is,

R_xy(z) = R_xx(z) H(1/z).    (7.54)
In a vector notation,

G(n) = U(n) + Σ_{k=1} b_k(n) G(n − k).

8.9
KALMAN FILTER
Figure 8.31 Identification of an unknown system (from Example 8.26) using the adaptive recursive system: coefficients a₀(n), a₁(n), b₁(n), and b₂(n) as functions of the time index n.
At each time instant the adaptive system coefficients are changed following the rule

H(n + 1) = H(n) + μ(n) e(n) X(n).    (8.22)
Replacing the scalar step μ(n) by a gain derived from the weight error covariance is the first step towards Kalman filters, and (8.22) now becomes

H(n + 1) = H(n) + G(n) e(n) X(n) = H(n) + g(n) e(n).    (8.23)

Note that it is assumed that the unknown system is deterministic and nonstationary. The weight error vector H̃(n) = H − H(n) can be related to the system output error e(n) as

e(n) = X^T(n)H + ε(n) − X^T(n)H(n) = X^T(n)H̃(n) + ε(n),    (8.26)

so a relation between J_MSE and J_MSD can be found, indicating that the minimization of MSD also corresponds to the minimization of MSE. For simplicity of the derivation we will assume that X(n) is deterministic, which is a common assumption in the Kalman filtering literature, although it is usually treated as a zero-mean process with autocorrelation matrix R in the context of adaptive systems.
If we introduce the weight error covariance matrix P(n) = E{H̃(n)H̃^T(n)}, then, in order to perform the minimization of J_MSD, starting from (8.23) a recursive relation for the matrix P(n) is established. From

H̃(n + 1) = H̃(n) − g(n)e(n) = H̃(n) − g(n)X^T(n)H̃(n) − g(n)ε(n)

it follows that

H̃(n + 1)H̃^T(n + 1) = [H̃(n) − g(n)X^T(n)H̃(n) − g(n)ε(n)] [H̃(n) − g(n)X^T(n)H̃(n) − g(n)ε(n)]^T,

and, after taking the expectation (with ε(n) being zero-mean and independent of H̃(n)),

P(n + 1) = P(n) − [P(n)X(n)g^T(n) + g(n)X^T(n)P(n)] + g(n)g^T(n) [X^T(n)P(n)X(n) + σ_ε²(n)].
Minimization of the MSD, that is, of the trace of P(n + 1), with respect to the gain vector produces

g(n) = P(n)X(n) / (X^T(n)P(n)X(n) + σ_ε²),    (8.27)
which is known as the Kalman gain. Besides the calculation of (8.27), the Kalman filter which estimates the optimal time-invariant and deterministic coefficients at each time instant also includes the coefficient adjustment

H(n + 1) = H(n) + g(n)(d(n) − X^T(n)H(n)),

as well as the weight error covariance matrix update

P(n + 1) = P(n) − g(n)X^T(n)P(n).    (8.28)

Note that the previous algorithm steps for σ_ε² = 1 can be related to the RLS algorithm equations.
A generalization of the previous approach assumes a time-varying and stochastic weight vector H(n),

H(n + 1) = F(n)H(n) + q(n),    (8.29)

d(n) = X^T(n)H(n) + ε(n),    (8.30)

where F(n) is a known state transition matrix and q(n) is a zero-mean noise, independent of ε(n), with the covariance matrix

Q = E{q(n)q^T(n)}.    (8.31)
The estimate of the weight vector based on the observations up to the instant n is denoted by H(n|n),

H(n|n) = H(n|n − 1) + g(n)(d(n) − X^T(n)H(n|n − 1)).    (8.32)

Note that the same definition of the weight error vector, H̃(n|n) = H(n) − H(n|n), holds, as well as for the weight error covariance matrix,

P(n|n) = E{H̃(n|n)H̃^T(n|n)}.
The weight error covariance matrix is updated in the same manner as for the time-invariant deterministic case,

P(n|n) = P(n|n − 1) − g(n)X^T(n)P(n|n − 1),    (8.33)

with respect to the new index notation. The general Kalman filter also includes the prediction step for the weight error covariance matrix, which easily follows from its definition,

P(n + 1|n) = E{H̃(n + 1|n)H̃^T(n + 1|n)} = F(n)P(n|n)F^T(n) + Q.    (8.34)

The Kalman gain is now

g(n) = P(n|n − 1)X(n) / (X^T(n)P(n|n − 1)X(n) + σ_ε²).    (8.35)
Example 8.27. Identify an unknown time-invariant deterministic system with two coefficients h₀ = 3 and h₁ = 4, using the standard LMS algorithm and the Kalman filter (for stationary system identification), with N = 2. The input signal is the colored noise x(n) = 5w(n) + 3.4w(n − 1) + w(n − 2), where w(n) is a zero-mean white noise with variance σ_w² = 1. The step μ = 0.0005 is used for the LMS algorithm. Show the convergence paths on the MSE contour plot. After how many iterations does the Kalman filter approach the optimal solution?

The convergence paths on the MSE contour plot are shown in Fig. 8.32. The numbers on the Kalman filter path indicate that the optimal solution is obtained after only two iterations.
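The Kalman recursions (8.27), (8.28) with the coefficient update can be sketched for this identification problem (a Python sketch; the initialization P(0) = 100 I and the small σ_ε² term are our assumptions, since the measurement here is noise-free):

```python
import random

random.seed(3)
h_true = [3.0, 4.0]                        # the unknown system coefficients
sigma2 = 0.01                              # assumed noise-variance term in the gain

# colored input x(n) = 5 w(n) + 3.4 w(n-1) + w(n-2)
w = [random.gauss(0, 1) for _ in range(203)]
x = [5 * w[n] + 3.4 * w[n - 1] + w[n - 2] for n in range(2, 203)]

H = [0.0, 0.0]
P = [[100.0, 0.0], [0.0, 100.0]]           # large initial uncertainty
for n in range(1, len(x)):
    X = [x[n], x[n - 1]]
    d = h_true[0] * X[0] + h_true[1] * X[1]            # observed system output
    PX = [P[0][0] * X[0] + P[0][1] * X[1], P[1][0] * X[0] + P[1][1] * X[1]]
    g = [v / (X[0] * PX[0] + X[1] * PX[1] + sigma2) for v in PX]   # gain (8.27)
    e = d - (H[0] * X[0] + H[1] * X[1])
    H = [H[0] + g[0] * e, H[1] + g[1] * e]             # coefficient update
    P = [[P[i][j] - g[i] * PX[j] for j in range(2)] for i in range(2)]  # (8.28)
print(H)                                   # converges to [3.0, 4.0] very quickly
```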
Figure 8.32 Convergence paths of the LMS algorithm and the Kalman filter in the problem of identification of an unknown time-invariant deterministic system, shown on the MSE contour plot over the coefficient plane (h₀, h₁), with the optimal solution (h₀*, h₁*). Contour lines are the projections of the MSE surface on the coefficient plane.
8.10
NEURAL NETWORKS
Figure 8.33 A neural network connected to the environment through its inputs and outputs.
The learning process is not limited to the training period; it continues through the whole functioning of the neural network.

A neural network can be defined as an artificial cell system capable of accepting, memorizing, and applying empirical knowledge. Knowledge here means that the neural network can respond to an input from the environment in an appropriate way. A neural network is connected to the environment in two ways: through the inputs, where the environment influences the network, and through the outputs, where the network responds to the environment, as illustrated in Figure 8.33.

The basic element of a neural network is the neuron. It is the elementary unit for distributed signal processing in a neural network. The full functionality of neural networks is achieved using a large number of interconnected neurons. Connections among neurons are one-directional (the outputs from one neuron can be used as inputs to other neurons). They are called synapses, in analogy with biological systems.

Possible applications of neural networks include almost all aspects of modern life; text and speech recognition, optimization of a communication channel, financial forecasts, and detection of fraudulent credit card usage are just a few examples.

Of course, there are many situations when the use of neural networks is not justified. In many cases our knowledge about the system that we want to control or observe is sufficient and complete, so the problem can be solved using classical algorithms, with sequential processing on common computers.

An ideal system for the realization of neural networks would use an independent hardware unit for each neuron. Then the distributed processing would be most efficient. In the case of single-processor computers, high efficiency is achieved by using very fast sequential data processing. Typical examples are computer programs for the recognition of scanned text.
Figure 8.34 Neuron schematic symbol (a) and the model based on the network and activation functions (b): the inputs x₁(n), x₂(n), ..., x_N(n) are combined by the network function into u(n), and the activation function produces the output y(n).
8.10.1 Neuron
The first step in a neuron design is to define its inputs and outputs. In biological systems the input and output signals of a neuron are electric potentials that can be modeled by real numbers. The same principle is used in artificial neurons. An illustration of a neuron is given in Figure 8.34(a) for the case when it has N inputs (x₁(n), x₂(n), . . . , x_N(n)) and one output y(n). The index n may be a time index, but it can also be understood as a cardinal number that identifies the input and output data of the neuron.

A neuron represents an algorithm that transforms N input data into one output signal. It is common to split this algorithm into two parts: 1) a combinatorial process that transforms the N input data into one value u(n), and 2) the process that produces the output signal y(n) based on the value of u(n). This two-phase model of a neuron is presented in Figure 8.34(b). The algorithm/rule that produces u(n) is called the network function, while the second part, which determines the output value, is the activation function.

Neuron knowledge is accumulated and contained in the way the input data are combined, i.e., in the network function.
8.10.2 Network Function
The basic task of the network function is to combine the input data. The simplest way of combining N input signals is their linear weighted combination with coefficients w_i, i = 1, 2, ..., N. This is a linear network function. Because of its simplicity, this type of function is commonly used in neurons.
Name | Network function
Linear form | u(n) = Σ_{i=1}^{N} w_i x_i(n) + w₀
Quadratic form | u(n) = Σ_{i=1}^{N} Σ_{j=1}^{N} w_ij x_i(n) x_j(n)
Product form | u(n) = Π_{i=1}^{N} x_i^{w_i}(n)

The activation function transforms the value produced by the network function into an acceptable output value. A common requirement is that the output values have a limited range. Thus, most of the activation functions have a bounded interval of real numbers as their codomain, for example [0, 1] or [−1, 1] or a set of binary digits. The forms of commonly used activation functions are presented in the table below. The most important functions from this set are the unipolar threshold function and the unipolar sigmoid. Some of the activation functions are presented in Figure 8.35 as well.
Figure 8.35 Commonly used activation functions: linear with a limiter, unipolar threshold function, unipolar sigmoid, bipolar sigmoid, and Gaussian function.
Function | Formula
Linear | f(u) = u
Linear with a limiter | f(u) = 1 for u > 1; f(u) = u for −1 ≤ u ≤ 1; f(u) = −1 for u < −1
Threshold function (unipolar) | f(u) = 1 for u > 0; f(u) = 0 for u < 0
Threshold function (bipolar) | f(u) = 1 for u > 0; f(u) = −1 for u < 0
Sigmoid (unipolar) | f(u) = 1/(1 + exp(−u))
Sigmoid (bipolar) | f(u) = 2/(1 + exp(−2u)) − 1
Inverse tangent function | f(u) = (2/π) arctan(u)
Gauss function | f(u) = exp(−(u − m)²/σ²)
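The activation functions from the table can be implemented directly (a sketch; function names are ours):

```python
from math import atan, exp, pi

def limiter(u):            return max(-1.0, min(1.0, u))
def threshold_unipolar(u): return 1.0 if u > 0 else 0.0
def threshold_bipolar(u):  return 1.0 if u > 0 else -1.0
def sigmoid_unipolar(u):   return 1.0 / (1.0 + exp(-u))
def sigmoid_bipolar(u):    return 2.0 / (1.0 + exp(-2.0 * u)) - 1.0
def inv_tangent(u):        return (2.0 / pi) * atan(u)
def gauss(u, m=0.0, s=1.0): return exp(-((u - m) ** 2) / s ** 2)

print(sigmoid_unipolar(0.0))   # 0.5
print(sigmoid_bipolar(1.0))    # equals tanh(1), about 0.7616
```

Note that the bipolar sigmoid is simply tanh(u), which is why its derivative takes the convenient form 1 − f²(u).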
Figure 8.36 Single-neuron networks with two inputs x₁(n), x₂(n) and output y(n): (a) and (b).
Figure 8.37 A neural network with two inputs x₁(n), x₂(n), an input layer, and an output layer with three outputs y₁(n), y₂(n), y₃(n).
Figure 8.38 A neural network with an input layer, three hidden layers, and an output layer.
The linear network function of a neuron is

u(n) = Σ_{k=1}^{N} w_k x_k(n),    (8.36)

where it has been assumed that the neuron has N input data. The weighting coefficients w_k represent the knowledge that the network should acquire through the training procedure. This knowledge will then be used in real situations.
The vector notation is

X(n) = [x₁(n), x₂(n), ..., x_N(n)]^T, W = [w₁, w₂, ..., w_N]^T,

both being column vectors of dimension N × 1.
The network function value is u(n) = W^T X(n) = X^T(n)W, and the neuron output is

y(n) = f(u(n)) = f(W^T X(n)).
For a layer of M neurons with a common input vector X(n), the outputs form the vector

Y(n) = [y₁(n), y₂(n), ..., y_M(n)]^T

of dimension M × 1, and the weighting coefficients form the matrix

W = [ w₁₁ w₂₁ ... w_M1
      w₁₂ w₂₂ ... w_M2
      ...
      w₁N w₂N ... w_MN ]

of dimension N × M.
where ||X(n)||₂² is the squared 2-norm of the vector X(n) (the sum of its squared elements). The value of W^T X(n) is increased by μ||X(n)||₂², which was the aim. If d(n) = 0 and y(n) = 1, then W_new = W_old − μX(n) holds, meaning that W^T X(n) is reduced by μ||X(n)||₂².

The coefficient μ is the learning coefficient. It is positive. The choice of the parameter μ is of great importance for the rate of convergence and the learning process of the network. Larger values of μ may reduce the learning period, but may also influence the convergence of the training process.
Example 8.28. Consider a one-neuron neural network. Assume that the activation function of the neuron is the unipolar threshold function and that the neuron is biased. The training data are

X = [ 1 1 0 0 0 1
      1 0 1 1 0 0
      0 1 1 0 1 0 ]

where the matrix X contains the input data (one training pair per column) and the vector D consists of the desired outputs from the neural network for the considered input data values (in particular, d(1) = 1 and d(2) = 1). Train the neural network with μ = 0.5.
Since the neuron is biased, one more input is introduced, with its value always equal to 1. After this modification the matrix of input data is

X = [ 1 1 1 1 1 1
      1 1 0 0 0 1
      1 0 1 1 0 0
      0 1 1 0 1 0 ].

The initial values of the weighting coefficients are random, for example,

W = [w₀, w₁, w₂, w₃]^T = [−1, 1, 1, 0]^T.

Now we can start the first epoch of the training process. We will use all input-output data pairs and calculate the output y(n) from the neural network. The output y(n) will be compared with the desired value d(n), and the coefficients W will be appropriately modified for each pair of data.
For the first pair of data we have

y(1) = f(W^T X(1)) = f([−1 1 1 0][1 1 1 0]^T) = f(1) = 1.

Since d(1) = 1, the error d(1) − y(1) is 0 and the coefficients are not modified. For the second pair of data,

y(2) = f(W^T X(2)) = 0.

The desired value is d(2) = 1. Since the error is not zero, the coefficients should be modified as

W_new = W_old + μ(d(2) − y(2))X(2) = [−1, 1, 1, 0]^T + 0.5 [1, 1, 0, 1]^T = [−0.5, 1.5, 1, 0.5]^T.
The next pair of input-output data is then used. After all data pairs have been used, the first epoch of training is finished. A nonzero error appeared in three out of six data pairs. The final value of the coefficients after the first training epoch is

W_epoch 1 = [… 1.5 … 0.5]^T.

With this initial value, the second epoch of training is completed, using the same input-output pairs of data. After the second epoch a nonzero error appeared two times. The final values of the coefficients after the second epoch are

W_epoch 2 = [1.5 1 1 0]^T.

The process is continued in the third epoch. In the fifth epoch we come to the situation that the neural network makes no error. It means that the training is completed and that more epochs are not needed. The final values of the coefficients are

W = [1.5 1.5 0.5 0.5]^T.
When the activation function is differentiable, the coefficients can be adapted by minimizing the squared output error

ε(n) = ( d(n) − f( Σ_{k=1}^{N} w_k x_k(n) ) )².    (8.37)

The coefficients are modified in the direction opposite to the gradient,

w_k,new = w_k,old − μ ∂ε(n)/∂w_k, or W_new = W_old − μ ∂ε(n)/∂W,

where ∂ε(n)/∂W is the gradient of the error function. The derivatives can be calculated from (8.37) as

∂ε(n)/∂w_k = −2(d(n) − y(n)) ∂y(n)/∂w_k = −2(d(n) − y(n)) f'( Σ_{k=1}^{N} w_k x_k(n) ) x_k(n).

For the unipolar sigmoid,

f'(u) = d/du [1/(1 + e^{−u})] = e^{−u}/(1 + e^{−u})² = (1/(1 + e^{−u})) (1 − 1/(1 + e^{−u})) = f(u)(1 − f(u)).

Therefore,

∂ε(n)/∂w_k = −2(d(n) − y(n)) y(n) (1 − y(n)) x_k(n),

since f'( Σ_{k=1}^{N} w_k x_k(n) ) = y(n)(1 − y(n)) and ∂u(n)/∂w_k = x_k(n).
Had the bipolar sigmoid

f(u) = 2/(1 + e^{−2u}) − 1 = (1 − e^{−2u})/(1 + e^{−2u})

been used, we would have f'(u) = 1 − f²(u).

Example 8.29. Consider a neuron with two input signals and a sigmoid activation function. The input values are random numbers from the interval [0, 1]. Available are K = 30 input-output pairs of data. Training of the neural network should be done in 30 epochs with μ = 2.

The data for the network training are obtained as a set of 30 input values of x₁ and x₂, assumed to be random numbers from the interval from 0 to 1 with a uniform probability density function. For each training pair of random
numbers x₁ and x₂ the desired output data is calculated using the formula

d = 1/2 + (x₁ − 2x₂)/(3 + x₁² + 3x₂²).

Find the total squared error after the first, second, fifth, and thirtieth epoch. What are the coefficient values at the end of the training process? If the input values x₁ = 0.1 and x₂ = 0.8 are applied to the network after the training process is completed, find the output value y and compare it with the desired result d calculated using the formula. For these inputs,

d = 1/2 + (x₁ − 2x₂)/(3 + x₁² + 3x₂²) = 0.1957.

The error is small. The task for the neural network in this example was to find a complex, nonlinear relation between the input and output data.
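A minimal training sketch for this one-neuron example (Python; the random data, the zero initialization, and the inclusion of a bias input are our assumptions):

```python
import random
from math import exp

random.seed(4)

def f(u):
    return 1.0 / (1.0 + exp(-u))

def target(x1, x2):
    return 0.5 + (x1 - 2 * x2) / (3 + x1 ** 2 + 3 * x2 ** 2)

data = [(random.random(), random.random()) for _ in range(30)]
w = [0.0, 0.0, 0.0]                        # bias, w1, w2
mu = 2.0
errs = []
for epoch in range(30):
    sse = 0.0
    for x1, x2 in data:
        y = f(w[0] + w[1] * x1 + w[2] * x2)
        e = target(x1, x2) - y
        grad = e * y * (1.0 - y)           # sigmoid gradient term
        w = [w[0] + mu * grad, w[1] + mu * grad * x1, w[2] + mu * grad * x2]
        sse += e * e
    errs.append(sse)
y_test = f(w[0] + 0.1 * w[1] + 0.8 * w[2])
print(errs[0], errs[-1], y_test, target(0.1, 0.8))
```

The total squared error per epoch decreases as training progresses, and the trained output for x₁ = 0.1, x₂ = 0.8 approaches d = 0.1957.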
For a two-layer network with hidden-layer weight matrix W (columns W_k), output-layer weight vector V, and output y(n) = f(V^T f(W^T X(n))), the gradient with respect to a hidden-layer coefficient w_pk is

∂ε(n)/∂w_pk = −(d(n) − y(n)) f'(V^T f(W^T X(n))) v_k f'(W_k^T X(n)) x_p(n).

The pth element of the vector X(n) is denoted by x_p(n), while the kth element of the vector V is v_k. Taking into account that u_k(n) = f(W_k^T X(n)), we get

∂ε(n)/∂w_pk = −(d(n) − y(n)) y(n)(1 − y(n)) v_k [u_k(n)(1 − u_k(n))] x_p(n).

The coefficients are modified as w_pk,new = w_pk,old + e₂(n) v_k [u_k(n)(1 − u_k(n))] x_p(n), where e₂(n) = μ(d(n) − y(n)) y(n)(1 − y(n)) denotes the backpropagated error term for the considered layer of neurons. In vector form we can write

W_k,(new) = W_k,(old) + e₂(n) v_k [u_k(n)(1 − u_k(n))] X(n).

This is the modification formula for all coefficients of one neuron in the hidden layer. The modification can be generalized to all neurons in the hidden layer,

W_(new) = W_(old) + e₂(n) X(n) [V .* U(n) .* (1 − U(n))]^T,

where .* denotes the element-by-element multiplication, while 1 is the vector of the same dimension as U(n) whose elements are all equal to 1.

The described procedure can be generalized to neural networks with more than two layers. The basic principle is that, based on the error in one layer, the coefficients are modified in that layer and then in all layers before the considered layer. It means that the influence of the output error is transferred in an inverse way (backpropagated) to the correction of the coefficients of the preceding layers of neurons.
Example 8.30. Consider a two-layer neural network with two neurons in the hidden layer and one neuron in the output layer. The activation function for all neurons is the unipolar sigmoid. The task of this neural network is to find the unknown relation between the input and output data. The step μ = 5 is used in the training process. The data for the training are formed as in Example 8.29, i.e., as a set of K = 30 input data x₁ and x₂ that are uniformly distributed random numbers from the intervals 0 ≤ x₁ ≤ 1, 0 ≤ x₂ ≤ 1. For each training input value of x₁ and x₂ the desired signal is calculated as

d = 1/2 + (x₁ − 2x₂)/(3 + x₁² + 3x₂²).

Find the total squared error after the 10th, 100th, and 300th epoch. What are the coefficients of the neurons after the training process? If the values x₁ = 0.1 and x₂ = 0.8 are input to the trained neural network, find the output y and compare it with the desired result

d = 1/2 + (x₁ − 2x₂)/(3 + x₁² + 3x₂²) = 0.1957.

The error is very small. As expected, this result is better than in the case of the one-layer neural network (Example 8.29). However, the calculation process is significantly more demanding.
After the training process, we may expect that each of the neurons recognizes one category (belonging to one group) of the input signals. If an uncategorized input signal appears, it means that the estimate of the number of neurons is not good. It should be increased and the training process should be continued. When two neurons adjust to the same category, they produce the same result and one of them can be eliminated. In this way, we may avoid the assumption that the number of categories (groups) or neurons N is known in advance.
Example 8.31. Consider a neural network with two input values and 3 neurons. The task of the neural network is to classify the input data into one of three categories. Each neuron corresponds to one category. The classification decision is made by choosing the neuron with the highest output. The activation function is a bipolar sigmoid.
Simulate the neural network in the case when the input data belong to one of three categories with equal probability. Data from the first category are pairs of Gaussian random variables with means m_x1 = 0 and m_x2 = 4 and variances σ²_x1 = 4 and σ²_x2 = 0.25. For the data from the second category the mean values and variances of the Gaussian variables are m_x1 = 4, m_x2 = 2, σ²_x1 = 1 and σ²_x2 = 4. In the third category are the input data with m_x1 = −4 and m_x2 = −2, σ²_x1 = 1 and σ²_x2 = 1. During the training process the step μ = 0.5 is used.
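A simulation of this classification task can be sketched as follows. The cluster parameters follow the example, but the signs of the third-category means, the one-neuron-per-category delta-rule training with ±1 targets, and all names are our assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

def bipolar_sigmoid(z):
    return 2.0 / (1.0 + np.exp(-z)) - 1.0

# Cluster parameters as in the example; the signs of the third-category
# means are assumed here (the categories must be distinct).
means = np.array([[0.0, 4.0], [4.0, 2.0], [-4.0, -2.0]])
stds  = np.array([[2.0, 0.5], [1.0, 2.0], [1.0, 1.0]])

def make_data(n_per_class):
    X = np.vstack([means[c] + stds[c] * rng.standard_normal((n_per_class, 2))
                   for c in range(3)])
    labels = np.repeat(np.arange(3), n_per_class)
    return X, labels

Xtr, ytr = make_data(100)
# One neuron per category with a bias input; delta-rule training with
# targets +1 (own category) and -1 (other categories).
W = 0.01 * rng.standard_normal((3, 3))       # rows: [bias, w1, w2]
targets = 2.0 * np.eye(3) - 1.0
mu = 0.5
for _ in range(200):
    for xv, c in zip(Xtr, ytr):
        xa = np.array([1.0, xv[0], xv[1]])
        out = bipolar_sigmoid(W @ xa)
        err = targets[c] - out
        W += mu * np.outer(err * 0.5 * (1.0 - out**2), xa)  # f'(z)=(1-f^2)/2

Xte, yte = make_data(200)
pred = np.argmax(bipolar_sigmoid(Xte @ W[:, 1:].T + W[:, 0]), axis=1)
accuracy = float(np.mean(pred == yte))
```

Choosing the neuron with the largest output, as the example prescribes, is what `np.argmax` performs here.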
8.10.10 Voting Machines
Chapter 9
Time-Frequency Analysis
The Fourier transform provides a unique mapping of a signal from the time domain to the frequency domain. The frequency domain representation provides the signal's spectral content. Although the phase characteristic of the Fourier transform contains information about the time distribution of the spectral content, it is very difficult to use this information. Therefore, one may say that the Fourier transform is practically useless for this purpose, i.e., that the Fourier transform does not provide a time distribution of the spectral components.

Depending on the problems encountered in practice, various representations have been proposed for the analysis of non-stationary signals, in order to provide a time-varying spectral description. The field of time-frequency signal analysis deals with these representations of non-stationary signals and their properties. Time-frequency representations may roughly be classified as linear, quadratic, and higher-order representations.

Linear time-frequency representations exhibit linearity, i.e., the representation of a linear combination of signals equals the linear combination of the individual representations. From this class, the most important one is the short-time Fourier transform (STFT) and its variations. The energetic version of the STFT is called the spectrogram. It is the most frequently used tool in time-frequency signal analysis.
The second class of time-frequency representations are the quadratic
ones. The most interesting representations of this class are those which provide a distribution of signal energy in the time-frequency plane. They will be
referred to as distributions. The concept of a distribution is borrowed from
_________________________________________________
Authors: Ljubiša Stanković, Miloš Daković, Thayaparan Thayananthan
Figure 9.1 The signal x(t), the sliding window w(τ), and the localized signal x(t + τ)w(τ).
the probability theory, although there is a fundamental difference. For example, in time-frequency analysis, distributions may take negative values.
Other possible domains for quadratic signal representations are the ambiguity domain, the time-lag domain and the frequency-Doppler frequency
domain. In order to improve the time-frequency representation, various higher-order distributions have been defined as well.
9.1 SHORT-TIME FOURIER TRANSFORM
The idea behind the short-time Fourier transform (STFT) is to apply the
Fourier transform to a portion of the original signal, obtained by introducing a sliding window function w(t) to localize the analyzed signal x (t). The
Fourier transform is calculated for the localized part of the signal. It produces the spectral content of the portion of the analyzed signal within the
time interval defined by the width of the window function. The STFT (a
time-frequency representation of the signal) is then obtained by sliding the
window along the signal. Illustration of the STFT calculation is presented in
Fig.9.1.
Analytic formulation of the STFT is
$$\mathrm{STFT}(t,\Omega) = \int_{-\infty}^{\infty} x(t+\tau)\,w(\tau)\,e^{-j\Omega\tau}\,d\tau.\tag{9.1}$$
From (9.1) it is apparent that the STFT actually represents the Fourier transform of the signal x(t), truncated by the window w(τ) centered at
instant t (see Fig. 9.1). From the definition, it is clear that the STFT satisfies
properties inherited from the Fourier transform (e.g., linearity).
By denoting x_t(τ) = x(t + τ), we can conclude that the STFT is the Fourier transform of the signal x_t(τ)w(τ),
$$\mathrm{STFT}(t,\Omega) = \mathrm{FT}\{x_t(\tau)w(\tau)\}.$$
Another common form of the STFT, with the window centered at τ = t, is
$$\mathrm{STFT}_{II}(t,\Omega) = \int_{-\infty}^{\infty} x(\tau)\,w(\tau - t)\,e^{-j\Omega\tau}\,d\tau.\tag{9.2}$$
As an example, consider the signal consisting of two delta pulses and two complex sinusoids,
$$x(t) = \delta(t - t_1) + \delta(t - t_2) + e^{j\Omega_1 t} + e^{j\Omega_2 t}.\tag{9.3}$$
Its STFT is
$$\mathrm{STFT}(t,\Omega) = w(t_1 - t)e^{-j\Omega(t_1 - t)} + w(t_2 - t)e^{-j\Omega(t_2 - t)} + W(\Omega - \Omega_1)e^{j\Omega_1 t} + W(\Omega - \Omega_2)e^{j\Omega_2 t},\tag{9.4}$$
where W(Ω) is the Fourier transform of the used window. The STFT is depicted in Fig. 9.2 for various window lengths, along with the ideal representation. A wide window w(t) in the time domain is characterized by a narrow Fourier transform W(Ω) and vice versa. The influence of the window on the results will be studied later.

The STFT of a linear frequency modulated signal x(t) = exp(jat²),
$$\mathrm{STFT}(t,\Omega) = \int_{-\infty}^{\infty} e^{ja(t+\tau)^2}\,w(\tau)\,e^{-j\Omega\tau}\,d\tau,\tag{9.5}$$
can be approximately calculated, for a large a, by using the method of stationary phase. Let us find its form and the relation for the optimal window width, assuming that the window w(τ) is nonzero for |τ| < T.
524
Time-Frequency Analysis
Figure 9.2 Time-frequency representation of the sum of two delta pulses and two sinusoids obtained by using (a) a wide window, (b) a narrow window, (c) a medium-width window, and (d) the ideal time-frequency representation.
The stationary phase method gives
$$\mathrm{STFT}(t,\Omega) = \int_{-\infty}^{\infty} e^{ja(t+\tau)^2}\,w(\tau)\,e^{-j\Omega\tau}\,d\tau \cong \sqrt{\frac{\pi}{a}}\,e^{j\pi/4}\,w\!\left(\frac{\Omega - 2at}{2a}\right)e^{j(\Omega t - \Omega^2/(4a))},\tag{9.6}$$
since the stationary phase point τ₀ follows from 2a(t + τ₀) = Ω. The amplitude of the STFT is therefore
$$|\mathrm{STFT}(t,\Omega)| \cong \sqrt{\frac{\pi}{a}}\,\left|w\!\left(\frac{\Omega - 2at}{2a}\right)\right|.\tag{9.7}$$
In this case, the width of |STFT(t,Ω)| along frequency does not decrease with an increase of the window w(τ) width. The width of |STFT(t,Ω)| around the central frequency Ω = 2at is
$$D = 4aT,$$
where 2T is the window width in the time domain. Note that this relation holds for a wide window w(τ), such that the stationary phase method may be applied. If the window is narrow with respect to the phase variations of the signal, the STFT width is defined by the width of the Fourier transform of the window, which is proportional to 1/T. Thus, the overall STFT width can be approximated by the sum of the width caused by the frequency variation and the width of the window's Fourier transform, that is,
$$D_o = 4aT + \frac{2c}{T},\tag{9.8}$$
where c is a constant defined by the window shape (by using the main lobe as the window width, it will be shown later that c = 2π for a rectangular window or c = 4π for a Hann(ing) window). This relation corresponds to the STFT calculated as a convolution of an appropriately scaled time-domain window, whose width is |Ω| < 2aT, and the frequency-domain form of the window, W(Ω). The approximation is checked against the exact STFT calculated by definition. The agreement is almost complete, Fig. 9.3. Therefore, there is a window width T producing the narrowest possible STFT for this signal. It is obtained by equating the derivative of the overall width to zero,
$$4a - \frac{2c}{T^2} = 0,$$
which results in
$$T_o = \sqrt{\frac{c}{2a}}.\tag{9.9}$$
As expected, for a sinusoid, a → 0 and To → ∞. This is just an approximation of the optimal window width, since for narrow windows we may not apply the stationary phase method (the term 4aT is then much smaller than 2c/T and may be neglected anyway).
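The trade-off in (9.8) and the optimal width (9.9) are easy to check numerically. A minimal sketch, with the chirp rate a and the window constant c = 2π (rectangular window) as assumed values:

```python
import numpy as np

# Overall STFT width for the LFM signal exp(j*a*t^2), from (9.8):
# Do(T) = 4aT + 2c/T. The values of a and c are assumptions here;
# c = 2*pi corresponds to a rectangular window.
a = 2.0
c = 2.0 * np.pi

T = np.linspace(0.05, 10.0, 200001)
Do = 4.0 * a * T + 2.0 * c / T

T_num = T[np.argmin(Do)]           # numerical minimizer of Do(T)
T_opt = np.sqrt(c / (2.0 * a))     # closed-form optimum (9.9)
```

The grid minimizer and the closed form agree, illustrating that the optimal window balances the two width terms.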
Note that for a = 1/2, when the instantaneous frequency is a symmetry line for the time and the frequency axes, the optimal width follows from
$$2 - \frac{2c}{T_o^2} = 0, \quad \text{or} \quad 2T_o = \frac{2c}{T_o},$$
meaning that the optimal window should have equal widths in the time domain, 2T, and in the frequency domain, 2c/T (the main lobe width).
Figure 9.3 Exact absolute STFT value of a linear FM signal at t = 0 for various window widths T = 2, 4, 8, 16, ..., 1024 (left) and its approximation calculated as an appropriately scaled convolution of the time and frequency domain forms of the window w(τ) (right).
Expressing x(t + τ) in terms of the signal's Fourier transform X(θ), the STFT can be written in the frequency domain as
$$\mathrm{STFT}(t,\Omega) = \frac{1}{2\pi}\iint X(\theta)\,e^{j\theta(t+\tau)}\,w(\tau)\,e^{-j\Omega\tau}\,d\tau\,d\theta = \frac{1}{2\pi}\int X(\theta)\,W(\Omega - \theta)\,e^{j\theta t}\,d\theta = \frac{1}{2\pi}\left[X(\Omega)e^{j\Omega t}\right] *_\Omega W(\Omega).\tag{9.10}$$
The spectrogram, as the energetic version of the STFT, is
$$\mathrm{SPEC}(t,\Omega) = |\mathrm{STFT}(t,\Omega)|^2 = \left|\int x(\tau)\,w^*(\tau - t)\,e^{-j\Omega\tau}\,d\tau\right|^2 = \left|\int x(t+\tau)\,w^*(\tau)\,e^{-j\Omega\tau}\,d\tau\right|^2.$$
Figure 9.4 Two different signals x1(t) ≠ x2(t) (top) with the same amplitudes of their Fourier transforms, i.e., |X1(Ω)| = |X2(Ω)| (bottom).
Example 9.3. For illustration consider two different signals x1(t) and x2(t), produced such that
$$x_2(t) = x_1(255 - t),\tag{9.11}$$
i.e., x2(t) is a reversed version of x1(t). Since time reversal does not change the amplitude of the Fourier transform, |X1(Ω)| = |X2(Ω)|, while the spectrograms of these two signals are quite different, Fig. 9.5.

The signal, localized by the window, can be recovered from the STFT as
$$x(t_0 + \tau)\,w(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty} \mathrm{STFT}(t_0,\Omega)\,e^{j\Omega\tau}\,d\Omega.$$
Figure 9.5 Spectrograms SPEC1(t,Ω) and SPEC2(t,Ω) of the signals x1(t) and x2(t) from Fig. 9.4.
This relation can theoretically be used for the signal within the region where w(τ) ≠ 0. In practice, it is used within the region of significant window w(τ) values.

If the window is shifted by R for each next STFT calculation, then a set of values
$$x(t_0 + iR + \tau)\,w(\tau) = \frac{1}{2\pi}\int_{-\infty}^{\infty}\mathrm{STFT}(t_0 + iR,\Omega)\,e^{j\Omega\tau}\,d\Omega$$
is obtained. If the value of the step R is smaller than the window duration, then the same signal value is used within two (or several) windows. Using the change of variables iR + τ = ν and summing over all overlapping windows, we get
$$x(t_0 + \nu)\sum_i w(\nu - iR) = \sum_i \frac{1}{2\pi}\int_{-\infty}^{\infty}\mathrm{STFT}(t_0 + iR,\Omega)\,e^{j\Omega(\nu - iR)}\,d\Omega.$$
The values of i in the summation are such that, for a given ν and R, the value ν − iR is within the window w(τ).

If the sum of the shifted versions of the windows is constant (without loss of generality, assume equal to 1), Σ_i w(ν − iR) = 1, then
$$x(t_0 + \nu) = \sum_i \frac{1}{2\pi}\int_{-\infty}^{\infty}\mathrm{STFT}(t_0 + iR,\Omega)\,e^{j\Omega(\nu - iR)}\,d\Omega.$$
9.2 WINDOWS
The window function plays a crucial role in the localization of the signal
in the time-frequency plane. The most commonly used windows will be
presented next.
9.2.1 Rectangular Window
The simplest window is the rectangular one, defined by
$$w(\tau) = \begin{cases} 1 & \text{for } |\tau| < T \\ 0 & \text{elsewhere.}\end{cases}\tag{9.12}$$
Its Fourier transform is
$$W_R(\Omega) = \int_{-T}^{T} e^{-j\Omega\tau}\,d\tau = \frac{2\sin(\Omega T)}{\Omega}.\tag{9.13}$$
In the discrete-time domain, the Fourier transform of the rectangular window of N samples is
$$W(e^{j\omega}) = \sum_{n=-N/2}^{N/2-1} e^{-j\omega n} = \frac{\sin(N\omega/2)}{\sin(\omega/2)}\,e^{j\omega/2}.$$
9.2.2 Triangular (Bartlett) Window

The triangular (Bartlett) window is defined by
$$w(\tau) = \begin{cases} 1 - |\tau/T| & \text{for } |\tau| < T \\ 0 & \text{elsewhere.}\end{cases}\tag{9.14}$$
Its Fourier transform is
$$W_T(\Omega) = \frac{4\sin^2(\Omega T/2)}{T\,\Omega^2}.\tag{9.15}$$
In the discrete-time domain, the Fourier transform of the triangular window is
$$\sum_{n=-N/2}^{N/2-1}\left(1 - \frac{2|n|}{N}\right)e^{-j\omega n} = \frac{2}{N}\,\frac{\sin^2(N\omega/4)}{\sin^2(\omega/2)}.$$

9.2.3 Hann(ing) Window

The Hann(ing) window is defined by
$$w(\tau) = \begin{cases} 0.5\,\big(1 + \cos(\pi\tau/T)\big) & \text{for } |\tau| < T \\ 0 & \text{elsewhere.}\end{cases}\tag{9.16}$$
Since cos(πτ/T) = [exp(jπτ/T) + exp(−jπτ/T)]/2, the Fourier transform of this window is related to the Fourier transform of the rectangular window of the same width as
$$W_H(\Omega) = \frac{1}{2}W_R(\Omega) + \frac{1}{4}W_R(\Omega - \pi/T) + \frac{1}{4}W_R(\Omega + \pi/T) = \frac{\pi^2\sin(\Omega T)}{\Omega(\pi^2 - \Omega^2 T^2)}.\tag{9.17}$$
In the discrete-time domain, for a Hann(ing) window of N samples, the DFT is
$$W(k) = \frac{N}{2}\delta(k) + \frac{N}{4}\delta(k + 1) + \frac{N}{4}\delta(k - 1).$$
Example 9.4. Find the window that will correspond to the frequency smoothing (X(k+1) + X(k) + X(k−1))/3, i.e., to
$$\mathrm{DFT}\{x(n)w(n)\} = \frac{1}{N}\,\mathrm{DFT}\{x(n)\} *_k \mathrm{DFT}\{w(n)\} = \frac{1}{3}X(k+1) + \frac{1}{3}X(k) + \frac{1}{3}X(k-1).$$
The DFT of the required window is
$$\mathrm{DFT}\{w(n)\} = \frac{N}{3}\delta(k) + \frac{N}{3}\delta(k+1) + \frac{N}{3}\delta(k-1).$$
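This correspondence can be verified numerically. The closed-form window w(n) = (1 + 2cos(2πn/N))/3, obtained by taking the inverse DFT of the three impulses, is our derivation rather than a formula stated explicitly in the example:

```python
import numpy as np

N = 16
n = np.arange(N)
rng = np.random.default_rng(0)
x = rng.standard_normal(N)

# Window whose DFT is (N/3)(delta(k) + delta(k-1) + delta(k+1)).
# Its closed form, w(n) = (1 + 2*cos(2*pi*n/N))/3, follows from the
# inverse DFT of the three impulses (our step, not stated in the text).
w = (1.0 + 2.0 * np.cos(2.0 * np.pi * n / N)) / 3.0

X = np.fft.fft(x)
smoothed = (np.roll(X, 1) + X + np.roll(X, -1)) / 3.0  # (X(k-1)+X(k)+X(k+1))/3
windowed = np.fft.fft(x * w)
```

Multiplication by w(n) in time is exactly the three-point circular smoothing of the DFT, as the duality in the example states.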
Example 9.5. Find the formula to calculate the STFT with a Hann(ing) window if the STFT calculated with a rectangular window is known.
Using the frequency-domain form of the STFT,
$$\mathrm{STFT}(t,\Omega) = \frac{1}{2\pi}\int X(\theta)\,W(\Omega - \theta)\,e^{j\theta t}\,d\theta,\tag{9.18}$$
and the relation (9.17) between the Hann(ing) and the rectangular window transforms, it follows that
$$\mathrm{STFT}_H(n,k) = \frac{1}{2}\mathrm{STFT}_R(n,k) + \frac{1}{4}\mathrm{STFT}_R(n,k-1) + \frac{1}{4}\mathrm{STFT}_R(n,k+1).\tag{9.19}$$
9.2.4 Hamming Window

The Hamming window is defined by
$$w(\tau) = \begin{cases} 0.54 + 0.46\cos(\pi\tau/T) & \text{for } |\tau| < T \\ 0 & \text{elsewhere.}\end{cases}\tag{9.20}$$
For the general form w(τ) = a + (1 − a)cos(πτ/T), the Fourier transform is
$$W(\Omega) = \frac{2a\sin(\Omega T)}{\Omega} + (1-a)\left[\frac{\sin((\Omega - \pi/T)T)}{\Omega - \pi/T} + \frac{\sin((\Omega + \pi/T)T)}{\Omega + \pi/T}\right].$$
The coefficient a is chosen so that the sidelobe at Ω = 2.5π/T (the maximum of the second sidelobe) is canceled,
$$\frac{2a}{2.5\pi/T} - (1-a)\left[\frac{1}{1.5\pi/T} + \frac{1}{3.5\pi/T}\right] = 0,$$
resulting in
$$a = 25/46 \cong 0.54.\tag{9.21}$$
This window has several sidelobes, next to the mainlobe, that are lower than those of the previous two windows. However, since it is not continuous at τ = ±T, its decay in frequency, as Ω → ∞, is not fast. Note that we let the mainlobe be twice as wide as in the rectangular window case, so we cancel out not the first but the second sidelobe, at its maximum.
The discrete-time domain form is
$$w(n) = \left[0.54 + 0.46\cos\!\left(\frac{2\pi n}{N}\right)\right]\left[u(n + N/2) - u(n - N/2)\right]$$
with
$$W(k) = 0.54N\delta(k) + 0.23N\delta(k + 1) + 0.23N\delta(k - 1).$$
9.2.5 Blackman and Kaiser Windows

In some applications it is crucial that the sidelobes are suppressed as much as possible. This is achieved by using windows of more complicated forms, like the Blackman window, defined by
$$w(\tau) = \begin{cases} 0.42 + 0.5\cos(\pi\tau/T) + 0.08\cos(2\pi\tau/T) & \text{for } |\tau| < T \\ 0 & \text{elsewhere.}\end{cases}\tag{9.22}$$
Its discrete-time form is
$$w(n) = \left[0.42 + 0.5\cos\!\left(\frac{2\pi n}{N}\right) + 0.08\cos\!\left(\frac{4\pi n}{N}\right)\right]\left[u(n + N/2) - u(n - N/2)\right].$$
The STFT at t = 0 is shown in Fig. 9.7. The resolution of the close components in x1(t) is better with the Hann(ing) window than with the Blackman window, since the main lobe of the Blackman window is wider. The small signal component in x2(t) is visible in the STFT with the Blackman window, since its sidelobes are much lower than those of the Hamming window.

Discretizing the STFT integral gives
$$\mathrm{STFT}(t,\Omega) = \int x(t+\tau)\,w(\tau)\,e^{-j\Omega\tau}\,d\tau \cong \sum_{m=-\infty}^{\infty} w(m\Delta t)\,x(t + m\Delta t)\,e^{-j\Omega m\Delta t}\,\Delta t.$$
Figure 9.6 Windows in the time and frequency domains, w(τ), W(Ω), and 10 log|W(Ω)|: rectangular window (first row), triangular (Bartlett) window (second row), Hann(ing) window (third row), Hamming window (fourth row), and Blackman window (fifth row).
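The sidelobe behavior summarized in Fig. 9.6 can be quantified by the peak sidelobe level of each window's zero-padded DFT. A small sketch; the mainlobe-edge search (first local minimum next to the peak) and the function name are our implementation choices:

```python
import numpy as np

def peak_sidelobe_db(w, nfft=8192):
    """Peak sidelobe level in dB relative to the mainlobe maximum."""
    W = np.abs(np.fft.fft(w, nfft))
    W /= W.max()
    i = 0                    # walk to the first local minimum (mainlobe edge)
    while i + 1 < nfft // 2 and W[i + 1] < W[i]:
        i += 1
    return 20.0 * np.log10(W[i:nfft // 2].max())

N = 64
n = np.arange(-N // 2, N // 2)
windows = {
    "rectangular": np.ones(N),
    "Hann":        0.5 + 0.5 * np.cos(2 * np.pi * n / N),
    "Blackman":    0.42 + 0.5 * np.cos(2 * np.pi * n / N)
                   + 0.08 * np.cos(4 * np.pi * n / N),
}
psl = {name: peak_sidelobe_db(w) for name, w in windows.items()}
```

The computed levels reproduce the ordering shown in the figure: the rectangular window has the highest sidelobes (about −13 dB) and the Blackman window the lowest.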
Figure 9.7 The STFT at n = 0 calculated using the Hamming window (left) and the Blackman window (right) of the signals x1(n) (top) and x2(n) (bottom).
By denoting x(n) = x(nΔt)Δt and normalizing the frequency by Δt, ω = ΩΔt, we get the time-discrete form of the STFT as
$$\mathrm{STFT}(n,\omega) = \sum_{m=-\infty}^{\infty} w(m)\,x(n+m)\,e^{-j\omega m}.\tag{9.23}$$
We will use the same notation for continuous-time and discrete-time signals, x(t) and x(n). However, we hope that this will not cause any confusion, since we will use different sets of variables, for example t and τ for continuous time and n and m for discrete time. Also, we hope that the context will always be clear, so that there is no doubt about what kind of signal is considered.
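A direct implementation of (9.23) with a finite window can be sketched as follows; the helper name `stft` and the bin ordering of `np.fft.fft` are implementation choices, not notation from the text:

```python
import numpy as np

def stft(x, w):
    """Discrete STFT per (9.23): for each n, the DFT of w(m)x(n+m),
    m = -N/2, ..., N/2-1, with np.fft.fft's bin ordering."""
    N = len(w)
    half = N // 2
    rows = []
    for n in range(half, len(x) - half + 1):
        seg = w * x[n - half:n + half]
        # ifftshift aligns the lag m = 0 with DFT index 0
        rows.append(np.fft.fft(np.fft.ifftshift(seg)))
    return np.array(rows)

# a complex sinusoid at DFT bin k0 concentrates at that bin
N, k0 = 16, 3
n = np.arange(128)
x = np.exp(1j * 2 * np.pi * k0 * n / N)
S = stft(x, np.ones(N))
```

With a rectangular window of exactly one period, the sinusoid produces a single nonzero STFT bin of magnitude N at every time instant.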
In the DFT-based calculation, with a window of N samples and the discrete frequencies ω = 2πk/N, the STFT is
$$\mathrm{STFT}(n,k) = \sum_{m=-N/2}^{N/2-1} w(m)\,x(n+m)\,e^{-j2\pi mk/N}.\tag{9.24}$$
For example, for N = 4, the STFT values at an instant n can be written in a matrix form,
$$\mathbf{STFT}(n) = \mathbf{W}_4\,\mathbf{x}(n),$$
with STFT(n) = [STFT(n,−2), STFT(n,−1), STFT(n,0), STFT(n,1)]^T, x(n) = [x(n−2), x(n−1), x(n), x(n+1)]^T, where W4 is the DFT matrix of order four with elements W4^{mk} = exp(−j2πmk/N). Here a rectangular window is assumed. Including the window function, the previous relation can be written as
$$\mathbf{STFT}(n) = \mathbf{W}_4\mathbf{H}_4\,\mathbf{x}(n),$$
with
$$\mathbf{H}_4 = \begin{bmatrix} w(-2) & 0 & 0 & 0 \\ 0 & w(-1) & 0 & 0 \\ 0 & 0 & w(0) & 0 \\ 0 & 0 & 0 & w(1) \end{bmatrix}$$
being a diagonal matrix whose elements are the window values w(m), H4 = diag(w(m)), m = −2, −1, 0, 1, and
$$\mathbf{W}_4\mathbf{H}_4 = \begin{bmatrix} w(-2)W_4^{4} & w(-1)W_4^{2} & w(0) & w(1)W_4^{-2} \\ w(-2)W_4^{2} & w(-1)W_4^{1} & w(0) & w(1)W_4^{-1} \\ w(-2) & w(-1) & w(0) & w(1) \\ w(-2)W_4^{-2} & w(-1)W_4^{-1} & w(0) & w(1)W_4^{1} \end{bmatrix}.$$
For a signal with 16 samples, the four nonoverlapping STFTs can be calculated in a single matrix relation,
$$\mathbf{STFT} = \mathbf{W}_4\mathbf{H}_4\begin{bmatrix} x(0) & x(4) & x(8) & x(12) \\ x(1) & x(5) & x(9) & x(13) \\ x(2) & x(6) & x(10) & x(14) \\ x(3) & x(7) & x(11) & x(15) \end{bmatrix} = \mathbf{W}_4\mathbf{H}_4\,\mathbf{X}_{4,4}.$$
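The matrix formulation can be sketched directly. The window values used here are hypothetical, chosen only to make H4 non-trivial:

```python
import numpy as np

N = 4
m = np.arange(-N // 2, N // 2)                 # m = -2, -1, 0, 1
k = np.arange(-N // 2, N // 2)                 # k = -2, -1, 0, 1
W4 = np.exp(-2j * np.pi * np.outer(k, m) / N)  # DFT matrix, rows k, columns m
w = np.array([0.25, 0.75, 1.0, 0.5])           # hypothetical window values w(m)
H4 = np.diag(w)

rng = np.random.default_rng(0)
x = rng.standard_normal(16)
X44 = x.reshape(N, N, order="F")   # columns [x(0..3)], [x(4..7)], ...
S = W4 @ H4 @ X44                  # all nonoverlapping STFTs at once

# check one entry against STFT(n0,k) = sum_m w(m) x(n0+m) e^{-j2*pi*mk/N}
n0, kk = 2, 1                      # the first block is centered at n0 = 2
direct = sum(w[mm + 2] * x[n0 + mm] * np.exp(-2j * np.pi * mm * kk / N)
             for mm in range(-2, 2))
```

Each column of `S` is one windowed DFT block, so the whole nonoverlapping STFT reduces to one matrix product.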
With the maximally overlapped calculation (time step R = 1), the STFT of the same signal is
$$\mathbf{STFT} = \mathbf{W}_4\mathbf{H}_4\begin{bmatrix} x(0) & x(1) & x(2) & \cdots & x(10) & x(11) & x(12) \\ x(1) & x(2) & x(3) & \cdots & x(11) & x(12) & x(13) \\ x(2) & x(3) & x(4) & \cdots & x(12) & x(13) & x(14) \\ x(3) & x(4) & x(5) & \cdots & x(13) & x(14) & x(15) \end{bmatrix}.$$
Assuming that the values of the signal with amplitudes below 1/e⁴ can be neglected, find the sampling rate for the STFT-based analysis of this signal. Write the approximate spectrogram expression for the Hann(ing) window of N = 32 samples in the analysis. What signal will be presented in the time-frequency plane, within the basic frequency period, if the signal is sampled at Δt = 1/128?

The time interval with significant signal content for the first signal component is −2 ≤ t ≤ 2, with the frequency content within −56π ≤ Ω ≤ −8π, since the instantaneous frequency is Ω(t) = (12t − 32)π. For the second component these intervals are 0 ≤ t ≤ 2 and 160π ≤ Ω ≤ 224π. The maximal frequency in the signal is Ωm = 224π. Here we have to take into account the possible spreading of the spectrum caused by the lag window. Its width in the time domain is d_t = 2T = NΔt = 32Δt. The width of the main lobe in the frequency domain, d_Ω, is defined by 32 d_Ω Δt = 4π, or d_Ω = π/(8Δt). Thus, taking the sampling interval Δt = 1/256, we will satisfy the sampling theorem condition in the worst instant case, since π/(Ωm + d_Ω) = 1/256.

In the case of the Hann(ing) window with N = 32 and Δt = 1/256, the lag interval is NΔt = 1/8. We will assume that the amplitude variations within the window are small, that is, w(τ)e^{−(t+τ)²} ≅ w(τ)e^{−t²} for −1/16 < τ ≤ 1/16. Then, according to the stationary phase method, we can write the STFT approximation, up to constant factors in each component, as
$$|\mathrm{STFT}(t,\Omega)| \cong e^{-t^2}\,w\!\left(\frac{\Omega - (12t - 32)\pi}{12\pi}\right) + e^{-8(t-1)^2}\,w\!\left(\frac{\Omega - (32t + 160)\pi}{32\pi}\right).$$
If the signal is sampled at Δt = 1/128, the sampling theorem is not satisfied for the second component, and it is aliased into the basic frequency period, so that
$$|\mathrm{STFT}(t,\Omega)| \cong e^{-t^2}\,w\!\left(\frac{\Omega - (12t - 32)\pi}{12\pi}\right) + e^{-8(t-1)^2}\,w\!\left(\frac{\Omega - (32t - 96)\pi}{32\pi}\right),\tag{9.25}$$
with t = n/128 and Ω = 128ω within −π ≤ ω < π, or −128π ≤ Ω < 128π.
For a rectangular window, the STFT can be calculated recursively,
$$\mathrm{STFT}_R(n,k) = \left[\mathrm{STFT}_R(n-1,k) + (-1)^k\big(x(n + N/2 - 1) - x(n - N/2 - 1)\big)\right]e^{j2\pi k/N}.$$
This recursive formula follows easily from the STFT definition (9.24).
For other window forms, the STFT can be obtained from the STFT calculated using the rectangular window. For example, according to (9.18), the STFT with a Hann(ing) window, STFT_H(n,k), is related to the STFT with a rectangular window, STFT_R(n,k), as
$$\mathrm{STFT}_H(n,k) = \frac{1}{2}\mathrm{STFT}_R(n,k) + \frac{1}{4}\mathrm{STFT}_R(n,k-1) + \frac{1}{4}\mathrm{STFT}_R(n,k+1).$$
This recursive calculation is important for hardware implementations of the STFT and other related time-frequency representations (e.g., the implementations of higher-order representations based on the STFT).
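Both the recursive rectangular-window update and the Hann(ing) combination can be sketched and checked against direct computation; the function names are ours, and the recursion implemented is the one stated above:

```python
import numpy as np

def stft_rect_direct(x, n, N):
    """STFT with a rectangular window, directly from the definition (9.24)."""
    m = np.arange(-N // 2, N // 2)
    return np.array([np.sum(x[n + m] * np.exp(-2j * np.pi * m * k / N))
                     for k in range(N)])

def stft_rect_recursive(x, n_start, n_end, N):
    """Slide the STFT one sample at a time using the recursion
    STFT_R(n,k) = [STFT_R(n-1,k) + (-1)^k (x(n+N/2-1)-x(n-N/2-1))] e^{j2pik/N}."""
    k = np.arange(N)
    rot = np.exp(2j * np.pi * k / N)
    sign = (-1.0) ** k
    S = stft_rect_direct(x, n_start, N)
    rows = [S]
    for n in range(n_start + 1, n_end):
        S = (S + sign * (x[n + N // 2 - 1] - x[n - N // 2 - 1])) * rot
        rows.append(S)
    return np.array(rows)

def hann_from_rect(S_rect):
    """STFT_H(n,k) = 0.5 S_R(n,k) + 0.25 S_R(n,k-1) + 0.25 S_R(n,k+1)."""
    return (0.5 * S_rect
            + 0.25 * np.roll(S_rect, 1, axis=1)
            + 0.25 * np.roll(S_rect, -1, axis=1))
```

Each new time instant costs only one complex update per frequency bin, instead of a full DFT, which is the point of the recursive hardware realization.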
Figure 9.8 Recursive implementation of the STFT for the rectangular and other windows.
Figure 9.9 Filter-bank implementation of the STFT: the signal x(n) is passed through the filters w(n), w(n)e^{j2πn/N}, ..., w(n)e^{j2πn(N−1)/N}, producing STFT(n,0), STFT(n,1), ..., STFT(n,N−1).
$$\mathrm{STFT}(t,\Omega) = \int x(t+\tau)\,w(\tau)\,e^{-j\Omega\tau}\,d\tau = \int x(t-\tau)\,w(-\tau)\,e^{j\Omega\tau}\,d\tau = x(t) *_t \left[w(-t)\,e^{j\Omega t}\right],$$
so that, for an even window w(−t) = w(t), each STFT frequency can be obtained by filtering the signal with the corresponding band-pass filter, as illustrated in Fig. 9.9. The next STFT can be calculated with a time step RΔt, meaning downsampling in time by a factor 1 ≤ R ≤ N. Two special cases are: no downsampling, R = 1, and the nonoverlapping calculation, R = N. The influence of R on the signal reconstruction will be discussed later.
9.2.8.1 Overlapping Windows
Nonoverlapping cases are important and easy to analyze. They also keep the number of STFT coefficients equal to the number of signal samples. However, the STFT is commonly calculated using overlapping windows, for several reasons. Rectangular windows have poor localization in the frequency domain; the localization is improved by other window forms. In the case of nonrectangular windows, some of the signal samples are weighted in such a way that their contribution to the final representation is small. Then we want to use an additional STFT with a window positioned so that these samples contribute more to the STFT calculation. Also, in parameter estimation and detection, the task is to achieve the best possible estimation or detection at each time instant, instead of using interpolations for the skipped instants when the STFT is calculated with a big step (equal to the window width). Commonly, the overlapped STFTs are calculated using, for example, a rectangular, Hann(ing), Hamming, Bartlett, Kaiser, or Blackman window of a constant width N, with steps N/2, N/4, N/8, ... in time. The computational cost is increased in the overlapped STFTs, since more STFTs are calculated. A way of composing STFTs calculated with a rectangular window into an STFT with, for example, the Hann(ing), Hamming, or Blackman window is presented in Fig. 9.8.

If a signal x(n) is of duration M, then in some cases, in addition to the overlapping in time, an interpolation in frequency is done, for example up to the DFT grid with M samples. The overlapped and interpolated STFT of this signal is calculated, using a window w(m) whose width is N ≤ M, as
$$\mathrm{STFT}_N(n,k) = \sum_{m=-N/2}^{N/2-1} w(m)\,x(n+m)\,e^{-j2\pi mk/M}.$$
Figure 9.10 The STFT of a linear FM signal x(n) calculated using a rectangular window of the width N = 8.
window case) are presented in Fig. 9.13. Three window widths are used here. The same procedure is repeated with the windows zero-padded up to the widest used window (interpolation in frequency). The results are presented in Fig. 9.14. Note that, regarding the amount of information, all these figures do not differ from the basic time-frequency representation presented in Fig. 9.10.
Figure 9.11 The STFT of a linear FM signal calculated using a rectangular window (from the previous figure), along with its frequency-shifted versions STFTR(n, k − 1) and STFTR(n, k + 1). Their weighted sum, 0.54 STFTR(n, k) + 0.23 STFTR(n, k − 1) + 0.23 STFTR(n, k + 1), produces the STFT of the same signal with a Hamming window, STFTH(n, k).
Next we write a matrix form for the signal inversion using a four-sample STFT of a signal with M = 16 samples, calculated with the rectangular and the Hann(ing) window: (a) without overlapping; (b) with a time step in the STFT calculation of R = 2.
(a) For the nonoverlapping case the STFT calculation is done according to
$$\mathbf{STFT} = \mathbf{W}_4\mathbf{H}_4\begin{bmatrix} x(0) & x(4) & x(8) & x(12) \\ x(1) & x(5) & x(9) & x(13) \\ x(2) & x(6) & x(10) & x(14) \\ x(3) & x(7) & x(11) & x(15) \end{bmatrix}.$$
Figure 9.12 The STFT of a linear FM signal x(n) calculated using the Hamming window with N = 8. The calculation is illustrated in the previous figure.
The inversion, producing the signal values, is
$$\begin{bmatrix} x(0) & x(4) & x(8) & x(12) \\ x(1) & x(5) & x(9) & x(13) \\ x(2) & x(6) & x(10) & x(14) \\ x(3) & x(7) & x(11) & x(15) \end{bmatrix} = \mathbf{H}_4^{-1}\mathbf{W}_4^{-1}\,\mathbf{STFT},$$
where the elements of the diagonal matrix H4^{−1} are proportional to 1/w(m), H4^{−1} = diag(1/w(−2), 1/w(−1), 1/w(0), 1/w(1)). If a rectangular window is used in the STFT calculation, then H4^{−1} = I4 is the identity matrix and this kind of inversion is straightforward.
Figure 9.13 Time-frequency analysis of a linear frequency modulated signal with overlapping
windows of various widths. Time step in the STFT calculation is R = 1.
Figure 9.14 Time-frequency analysis of a linear frequency modulated signal with overlapping
windows of various widths. Time step in the STFT calculation is R = 1. For each window
width the frequency axis is interpolated (signal in time is zero padded) up to the total number
of available signal samples M = 64.
Figure 9.15 The nonoverlapping four-sample STFTs, STFT(2,k), STFT(6,k), STFT(10,k), and STFT(14,k), and their inversions x(n0 + m)w(m) = IDFT{STFT(n0, k)}, m = −2, −1, 0, 1, for the nonoverlapping windows.
(b) For the calculation with the time step R = 2 and the Hann(ing) window, the STFT is
$$\mathbf{STFT} = \mathbf{W}_4\mathbf{H}_4\begin{bmatrix} 0 & x(0) & x(2) & \cdots & x(12) & x(14) \\ 0 & x(1) & x(3) & \cdots & x(13) & x(15) \\ x(0) & x(2) & x(4) & \cdots & x(14) & 0 \\ x(1) & x(3) & x(5) & \cdots & x(15) & 0 \end{bmatrix}.$$
The inversion is
$$\mathbf{W}_4^{-1}\,\mathbf{STFT} = \mathbf{H}_4\mathbf{X} = \begin{bmatrix} 0 & x(0)w(-2) & x(2)w(-2) & \cdots & x(14)w(-2) \\ 0 & x(1)w(-1) & x(3)w(-1) & \cdots & x(15)w(-1) \\ x(0)w(0) & x(2)w(0) & x(4)w(0) & \cdots & 0 \\ x(1)w(1) & x(3)w(1) & x(5)w(1) & \cdots & 0 \end{bmatrix},$$
where X is the matrix with the signal elements. The window matrix is left on the right-hand side since, in general, it may not be invertible. By calculating W4^{−1} STFT we can then recombine the signal values. For example, the element producing x(0)w(0) in the first column is combined with the element producing x(0)w(−2) in the second column to get x(0)w(0) + x(0)w(−2) = x(0), since for the Hann(ing) window of width N, w(n) + w(n − N/2) = 1 holds. The same is done for the other signal values in the matrix obtained after the inversion,
$$x(0)w(0) + x(0)w(-2) = x(0)$$
$$x(1)w(1) + x(1)w(-1) = x(1)$$
$$x(2)w(0) + x(2)w(-2) = x(2)$$
$$\dots$$
$$x(15)w(1) + x(15)w(-1) = x(15).$$
Note that the same relation would hold for a triangular window, while for a Hamming window a similar relation holds, with w(n) + w(n − N/2) = 1.08. The results should be corrected in that case by the constant factor 1.08.
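The overlap-add recombination with R = N/2 can be sketched for a Hann(ing) window; the position bookkeeping and names are ours:

```python
import numpy as np

N, R = 8, 4                                  # window width, step R = N/2
m = np.arange(-N // 2, N // 2)
w = 0.5 + 0.5 * np.cos(2 * np.pi * m / N)    # Hann: w(n) + w(n - N/2) = 1

rng = np.random.default_rng(0)
x = rng.standard_normal(32)

y = np.zeros_like(x)                         # overlap-added inverse STFTs
ws = np.zeros_like(x)                        # sum of the shifted windows
for n0 in range(N // 2, len(x) - N // 2 + 1, R):
    S = np.fft.fft(w * x[n0 - N // 2:n0 + N // 2])     # STFT(n0, k)
    y[n0 - N // 2:n0 + N // 2] += np.fft.ifft(S).real  # x(n0+m)w(m)
    ws[n0 - N // 2:n0 + N // 2] += w
```

Away from the signal edges, the shifted windows sum exactly to 1, so adding the inverse STFTs reproduces the original samples without any division by window values.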
An illustration of the STFT calculation for an arbitrary window width N at n = n0 is presented in Fig. 9.16. Its inversion produces x(n0 + m)w(m) = IDFT{STFT_N(n0, k)}. Consider the previous STFT value in the case of nonoverlapping windows. It would be STFT_N(n0 − N, k), with the inverse
$$\mathrm{IDFT}\{\mathrm{STFT}_N(n_0 - N, k)\} = x(n_0 - N + m)\,w(m).$$
If we summed these two inverse transforms, we would get a signal with very low values around n = n0 − N/2. If one more STFT is calculated at n = n0 − N/2 and its inverse is combined with the previous two, it will improve the signal presentation within the overlapping region n0 − N ≤ n < n0. In addition, for most of the common windows, w(m − N) + w(m − N/2) + w(m) = 1 (or a constant) within 0 ≤ m < N, meaning that the sum of the overlapped inverse STFTs, as in Fig. 9.16, will give the original signal within n0 − N ≤ n < n0.
When the STFT is calculated with a step R, at the instants
$$n_0 + iR, \quad i = \dots, -2, -1, 0, 1, 2, \dots,\tag{9.26}$$
its inversion produces the windowed signal values
$$x(n_0 + iR + m)\,w(m) = \frac{1}{N}\sum_{k=-N/2}^{N/2-1}\mathrm{STFT}(n_0 + iR, k)\,e^{j2\pi mk/N}.\tag{9.27}$$
Since R < N, we will get the same signal value within different STFTs, for different i. For example, for N = 8, R = 2, and n0 = 0, we will get the value x(0) for m = 0 and i = 0, but also for m = −2 and i = 1, or m = 2 and i = −1, and so on. Then, in the reconstruction, we should use all these values to get the most reliable reconstruction.

Let us reindex the reconstructed signal values (9.27) by the substitution m = l − iR,
$$w(l - iR)\,x(n_0 + l) = \frac{1}{N}\sum_{k=-N/2}^{N/2-1}\mathrm{STFT}(n_0 + iR, k)\,e^{j2\pi lk/N}\,e^{-j2\pi iRk/N},$$
for −N/2 ≤ l − iR ≤ N/2 − 1.
Figure 9.16 Illustration of the STFT calculation with windows overlapping in order to produce inverse STFTs whose sum will give the original signal within n0 − N ≤ n < n0.
The reconstructed values for i = 0, ±1, ±2, ... are
$$w(l)\,x(n_0 + l) = \frac{1}{N}\sum_{k=-N/2}^{N/2-1}\mathrm{STFT}(n_0, k)\,e^{j2\pi lk/N},$$
$$w(l \mp R)\,x(n_0 + l) = \frac{1}{N}\sum_{k=-N/2}^{N/2-1}\mathrm{STFT}(n_0 \pm R, k)\,e^{j2\pi lk/N}\,e^{\mp j2\pi Rk/N},$$
$$w(l \mp 2R)\,x(n_0 + l) = \frac{1}{N}\sum_{k=-N/2}^{N/2-1}\mathrm{STFT}(n_0 \pm 2R, k)\,e^{j2\pi lk/N}\,e^{\mp j4\pi Rk/N},$$
and so on, as far as w(l − iR), for i = 0, ±1, ±2, ..., is within the considered time interval. Summing them, we get
$$x(n_0 + l)\sum_i w(l - iR) = \sum_i \frac{1}{N}\sum_{k=-N/2}^{N/2-1}\mathrm{STFT}(n_0 + iR, k)\,e^{j2\pi(l - iR)k/N},$$
so the signal x(n0 + l) can be reconstructed for any n0 and l if
$$\sum_i w(l - iR) = \mathrm{const} = C.\tag{9.28}$$
Note that Σ_i w(l − iR) is a periodic extension of w(l) with a period R. If W(e^{jω}) is the Fourier transform of w(l), then the Fourier transform of its periodic extension is equal to the samples of W(e^{jω}) at ω = 2πk/R. The condition (9.28) is equivalent to
$$W(e^{j2\pi k/R}) = CN\delta(k) \quad \text{for } k = 0, 1, \dots, R - 1.$$
Special cases:
Figure 9.17 Signal reconstruction from the STFT for the case N = 8, when the STFT is calculated with step R = N/2 = 4 and the window satisfies w(m) + w(m − N/2) = 1. This is the case for the rectangular, Hann(ing), Blackman, and triangular windows. The same holds for the Hamming window, up to a constant scaling factor of 1.08.
1. For R = N (nonoverlapping), relation (9.28) is satisfied for the rectangular window only.
2. For a half of the window width, R = N/2, condition (9.28) is met for the rectangular, Hann(ing), Hamming, and triangular windows. The realization in this case, for N = 8 and R = N/2 = 4, is presented in Fig. 9.17. Signal values with a delay of N/2 = 4 samples are obtained at the output. The STFT calculation process is repeated after every 4 samples, producing blocks of 4 signal samples at the output.
3. The same holds for R = N/2, N/4, N/8, ..., if these values of R are integers.
4. For R = 1 (the STFT calculation at each available time instant), any window satisfies the inversion relation. In this case we may also use a
single STFT for the reconstruction of one signal sample, since
$$\frac{1}{N}\sum_{k=-N/2}^{N/2-1}\mathrm{STFT}(n,k) = w(0)\,x(n).$$
Very efficient realizations for this case are the recursive ones, Fig. 9.8, instead of the direct DFT calculation.
In the analysis of non-stationary signals, our primary interest is not in the signal reconstruction with the fewest number of calculation points. Rather, we are interested in tracking the signal's non-stationary parameters, like, for example, the instantaneous frequency. These parameters may significantly vary between neighboring time instants n and n + 1. Quasi-stationarity of the signal within R samples (implicitly assumed when downsampling by a factor of R is done) is in this case not a good starting point for the analysis. Here, we have to use the time-frequency analysis of the signal at each instant n, without any downsampling.
9.2.10 Time-Varying Windows
In general, varying window widths could be used for different timefrequency points. When Ni changes with ni we have the case of a timevarying window. Assuming a rectangular window we can write,
STFTNi (ni , k ) =
Ni /21
m= Ni /2
x ( ni + m ) e
j 2
N mk
i
(9.29)
Notation STFTNi (n, k ) means that the STFT is calculated using signal samples within the window [ni Ni /2, ni + Ni /2 1] for Ni /2 k Ni /2 1,
corresponding to an even number of Ni discrete frequencies from to .
For an odd Ni , the summation limits are ( Ni 1)/2. Let us restate that
a wide window includes signal samples over a wide time interval, losing
the possibility to detect fast changes in time, but achieving high frequency
resolution. A narrow window in the STFT will track time changes, but with
a low resolution in frequency. Two extreme cases are Ni = 1 when
STFT1 (n, k ) = x (n)
and Ni = M when
STFTM (n, k ) = X (k ),
Figure 9.18 Signal reconstruction from the STFT calculated at each instant (R = 1), for N = 8: the sum of STFT(n − 3, k) over all k, scaled by 1/(N w(0)), recovers the signal sample x(n − 3).
where m is the column index and k is the row index of the matrix. The STFT value STFT_{Ni}(ni, k) is presented as a block in the time-frequency plane of width Ni in the time direction, covering all time instants [ni − Ni/2, ni + Ni/2 − 1] used in its calculation. The frequency axis can be labeled with the DFT indices p = −M/2, ..., M/2 − 1, corresponding to the DFT frequencies 2πp/M (dots in Fig. 9.19). With respect to this axis labeling, the block STFT_{Ni}(ni, k) will be positioned at the frequency 2πk/Ni = 2π(kM/Ni)/M, i.e., at p = kM/Ni. The block width in frequency is M/Ni DFT samples. Therefore, the block area in time and DFT frequency is always equal to the number of all available signal samples M, as shown in Fig. 9.19, where M = 16.
Example 9.11. Consider a signal x (n) with M = 16 samples. Write the expression
for calculation of the STFT value STFT4 (2, 1) with a rectangular window.
Indicate graphically the region of time instants used in the calculation and
the frequency range in terms of the DFT frequency values included in the
calculation of STFT4 (2, 1)?
⋆The STFT value is

\mathrm{STFT}_4(2,1) = \sum_{m=-2}^{1} x(2+m)\, e^{-j\frac{2\pi}{4}m},

so the signal samples used in the calculation are those with -2 <= m <= 1, that is,

0 <= n <= 3.

The frequency term is exp(-j2\pi m/4). For the DFT of a signal with M = 16,

X(k) = \sum_{n=0}^{15} x(n) e^{-j\frac{2\pi}{16}nk}, \quad k = -8, -7, ..., -1, 0, 1, ..., 6, 7,

the index k = 1 corresponds to p = kM/N_i = 4, so the block covers the DFT frequencies p = 4, 5, 6, 7.
Figure 9.19 The nonoverlapping STFTs with: (a) constant window of the width N = 4, (b)
constant window of the width N = 2, (c)-(d) time-varying windows. Time index is presented
on the horizontal axis, while the DFT frequency index is shown on the vertical axis (the STFT
is denoted by S for notation simplicity).
For a time-varying nonoverlapping STFT, the vector of all STFT values is

\mathbf{STFT} = \begin{bmatrix} \mathbf{W}_{N_0} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{W}_{N_1} & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{W}_{N_K} \end{bmatrix} \mathbf{x} = \tilde{\mathbf{W}}\mathbf{x} = \tilde{\mathbf{W}}\mathbf{W}_M^{-1}\mathbf{X},  (9.30)
where STFT is a column vector containing all STFT vectors STFT_{N_i}(n_i),
i = 0, 1, ..., K, X = W_M x is the DFT of the whole signal x(n), while \tilde{W} is an (M x M) block
matrix formed from the smaller DFT matrices W_{N_0}, W_{N_1}, ..., W_{N_K},
as in (9.29). Since the time-varying nonoverlapping STFT corresponds to a
decimation-in-time DFT scheme, its calculation is more efficient than the
DFT calculation of the whole signal. Illustration of time-varying window
STFTs is shown in Fig.9.19(c) and (d). For a signal with M samples, there
is a large number of possible nonoverlapping STFTs with a time-varying
window, N_i \in \{1, 2, 3, ..., M\}. The exact number will be derived later.
Example 9.12. Consider a signal x (n) with M = 16 samples, whose values are
x = [0.5, 0.5, 0.25, j0.25, 0.25, j0.25, 0.25, 0.25, 0.25, 0.25, 0.5, 0.5,
j0.5, j0.5, 0, 1]. Some of its nonoverlapping STFTs are calculated according
to (9.29) and shown in Fig.9.19. Different representations can be compared
based on the concentration measures, for example,
\mu[\mathrm{STFT}_N(n,k)] = \sum_{n}\sum_{k} |\mathrm{STFT}_N(n,k)| = \|\mathrm{STFT}\|_1.
The best STFT representation, in this sense, is the one with the smallest \mu[\mathrm{STFT}_N(n,k)]. For the considered signal and its four representations
shown in Fig.9.19, the best representation according to this criterion is the
one shown in Fig.9.19(b).
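A sketch of such a comparison for constant-width tilings. It assumes a unitary 1/sqrt(N) scaling of the blockwise DFT so that the measure is not biased by the window width; the function name is illustrative:

```python
import numpy as np

def l1_measure(x, N):
    # Nonoverlapping STFT with a constant rectangular window of width N:
    # blockwise DFT with unitary scaling, then the l1 concentration measure.
    blocks = x.reshape(-1, N)
    S = np.fft.fft(blocks, axis=1) / np.sqrt(N)
    return np.abs(S).sum()

x = np.ones(16)                      # constant signal: wide windows concentrate it
assert l1_measure(x, 16) < l1_measure(x, 4) < l1_measure(x, 1)
d = np.zeros(16); d[5] = 1.0         # impulse: narrow windows concentrate it
assert l1_measure(d, 1) < l1_measure(d, 4) < l1_measure(d, 16)
```

The smaller the measure, the more concentrated (and hence preferable) the representation, in agreement with the criterion above.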
Example 9.13. Consider a signal x(n) with M = 8 samples. Its values are x(0) = 0,
x(1) = 1, x(2) = 1/2, x(3) = -1/2, x(4) = 1/4, x(5) = -j/4, x(6) = -1/4,
and x(7) = j/4.
(a) Calculate the STFTs of this signal with rectangular windows of the
widths N = 1, N = 2, N = 4. Use the following STFT definition

\mathrm{STFT}_N(n,k) = \sum_{m=-N/2}^{N/2-1} x(n+m)\,e^{-j2\pi mk/N}.

For an odd N, the summation limits are \pm(N-1)/2. Calculate STFT_1(n,k)
for n = 0, 1, 2, 3, 4, 5, 6, 7, then STFT_2(n,k) for n = 1, 3, 5, 7, then STFT_4(n,k)
for n = 2, 6, and STFT_8(n,k) for n = 4. For the frequency axis use the notation
k = 0, 1, 2, 3, 4, 5, 6, 7.
Ljubia Stankovic
561
(b) Assuming that the time-varying approach is used in the nonoverlapping STFT calculation, find the total number of possible representations.
(c) Calculate the concentration measure for each of the cases in (b)
and find the representation (nonoverlapping combination of the previous STFTs)
in which the signal is represented with the smallest number of coefficients. Does
it correspond to the minimum of \mu[\mathrm{STFT}(n,k)]?
\mathrm{STFT}_2(3,0) = 0, \quad \mathrm{STFT}_2(5,0) = (1-j)/4, \quad \mathrm{STFT}_2(3,1) = -1,

\mathrm{STFT}_4(6,0) = 0,

\mathrm{STFT}_4(n,1) = -x(n-2) + jx(n-1) + x(n) - jx(n+1),
\mathrm{STFT}_4(2,1) = (1+3j)/2, \quad \mathrm{STFT}_4(6,1) = 0,

\mathrm{STFT}_4(n,2) = x(n-2) - x(n-1) + x(n) - x(n+1),
\mathrm{STFT}_4(2,2) = 0, \quad \mathrm{STFT}_4(6,2) = 0,

\mathrm{STFT}_4(n,3) = -x(n-2) - jx(n-1) + x(n) + jx(n+1),
\mathrm{STFT}_4(2,3) = (1-3j)/2, \quad \mathrm{STFT}_4(6,3) = -1.
Figure 9.20 Time-frequency representations in various lattices (grid-lines are shown), with the concentration measure M = \mu[\mathrm{SPEC}(n,k)] given for each panel (the values range from M = 3.00, which is optimal, to M = 5.89). The optimal representation, with respect to this measure, is presented with thicker grid-lines. The time axis is n = 0, 1, 2, 3, 4, 5, 6, 7 and the frequency axis is k = 0, 1, 2, 3, 4, 5, 6, 7.
Figure 9.21 [Time-frequency plane; the frequency axis is labeled \pi/4, \pi/2, 3\pi/4.]
Figure 9.22 [Nonoverlapping time-frequency regions of the blocks STFT_1(2,0), STFT_2(1,0), STFT_2(1,1), STFT_3(4,0), STFT_3(4,1), and STFT_3(4,2); the frequency axis is labeled \pi/4, \pi/2, 3\pi/4.]
STFT2 (1, 1) = 0
STFT1 (2, 0) = 1,
STFT3 (4, 0) = 0
STFT3 (4, 1) = 3,
STFT3 (4, 2) = 3
(a) The denoted areas are presented in Fig. 9.22. (b) The STFT values are
obtained using

\mathrm{STFT}_N(n,k) = \sum_{m=-(N-1)/2}^{(N-1)/2} x(n+m)e^{-j2\pi mk/N} \quad \text{(odd } N\text{)}

or

\mathrm{STFT}_N(n,k) = \sum_{m=-N/2}^{N/2-1} x(n+m)e^{-j2\pi mk/N} \quad \text{(even } N\text{)},
\mathrm{STFT}_3(4,1) = \frac{-1+j\sqrt{3}}{2}x(3) + x(4) + \frac{-1-j\sqrt{3}}{2}x(5)

\mathrm{STFT}_3(4,2) = \frac{-1-j\sqrt{3}}{2}x(3) + x(4) + \frac{-1+j\sqrt{3}}{2}x(5).
The transformation matrix T (relating the signal vector x to the STFT
column vector S) is formed row by row from these STFT expressions,

\mathbf{S} = \mathbf{T}\mathbf{x}, \qquad
\mathbf{T} = \begin{bmatrix}
1 & 1 & 0 & 0 & 0 & 0 \\
-1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & \frac{-1+j\sqrt{3}}{2} & 1 & \frac{-1-j\sqrt{3}}{2} \\
0 & 0 & 0 & \frac{-1-j\sqrt{3}}{2} & 1 & \frac{-1+j\sqrt{3}}{2}
\end{bmatrix},

where the rows correspond to STFT_2(1,0), STFT_2(1,1), STFT_1(2,0), STFT_3(4,0),
STFT_3(4,1), and STFT_3(4,2), respectively. Inverting this relation, x = T^{-1}S,
recovers the signal values y(0), y(1), ..., y(5).
Example 9.15. A discrete signal x(n) is considered for 0 <= n < M. Find the number
F(M) of possible nonoverlapping STFT representations when (a) an arbitrary
window width may be used at each position, and (b) only the widths
N_i \in \{1, 2, 4, ..., M\} are allowed, with M being a power of two.

⋆(a) If the first window is a one-sample window, then the number of the STFTs
covering the remaining samples is F(M-1). When the first window is a two-sample window, the total
number of the STFTs is F(M-2), and so on, until the first window is the M-sample window, when F(M-M) = 1. Thus, the total number of the STFTs
for all cases is

F(M) = F(M-1) + F(M-2) + \ldots + F(1) + 1 = 1 + \sum_{k=1}^{M-1} F(M-k).

Similarly,

F(M-1) = 1 + \sum_{k=1}^{M-2} F(M-1-k) = 1 + \sum_{k=2}^{M-1} F(M-k),

and

F(M) - F(M-1) = \sum_{k=1}^{M-1} F(M-k) - \sum_{k=2}^{M-1} F(M-k) = F(M-1),

F(M) = 2F(M-1),

resulting in F(M) = 2^{M-1}.
(b) In a similar way, following the previous analysis, we can write

F(M) = F(M-2^0) + F(M-2^1) + F(M-2^2) + \ldots + F(M-2^m) + \ldots = \sum_{m=0}^{\log_2 M} F(M-2^m),

where the last term, for 2^m = M, equals F(0) = 1. The resulting numbers of representations are:
  M   F(M)  |   M   F(M)
  2     2   |   9     98
  3     3   |  10    174
  4     6   |  11    306
  5    10   |  12    542
  6    18   |  13    956
  7    31   |  14   1690
  8    56   |  15   2983
            |  16   5272
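Both counts can be verified numerically. A minimal sketch, using the convention F(0) = F(1) = 1 (the function names are illustrative):

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def F_any(M):
    # Arbitrary window widths: F(M) = 1 + sum_{k=1}^{M-1} F(M-k) = 2^(M-1)
    if M <= 1:
        return 1
    return 1 + sum(F_any(M - k) for k in range(1, M))

@lru_cache(maxsize=None)
def F_dyadic(M):
    # Window widths restricted to powers of two: F(M) = sum_m F(M - 2^m)
    if M <= 1:
        return 1
    total, p = 0, 1
    while p <= M:
        total += F_dyadic(M - p)
        p *= 2
    return total

assert F_any(6) == 2 ** 5                                  # 32, as derived in (a)
assert [F_dyadic(M) for M in (2, 8, 16)] == [2, 56, 5272]  # matches the table
```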
where [\cdot] denotes the integer part of the argument, holds with a relative error smaller
than 0.4% for 1 <= M <= 1024. For example, for M = 16 there are 5272 different
ways to split the time-frequency plane into nonoverlapping time-frequency
regions.
The windowed form of (9.29) is

\mathrm{STFT}_{N_i}(n_i,k) = \sum_{m=-N_i/2}^{N_i/2-1} w(m)\,x(n_i+m)\,e^{-j\frac{2\pi}{N_i}mk},

for example, \mathrm{STFT}_4(2,-1) = \sum_{m=-2}^{1} x(2+m)\,e^{-j2\pi m(-1)/4}. In terms of the DFT of the whole signal, the STFT can be written as

\mathrm{STFT}(n,k) = \frac{1}{M}\sum_{i=0}^{M-1} P(i)\,X(k+i)\,e^{j2\pi i n/M},  (9.31)

or, in matrix notation,

\mathbf{STFT}_M(k) = \mathbf{W}_M^{-1}\mathbf{P}_M\mathbf{X}(k).
Figure 9.23 [Four panels of nonoverlapping STFT representations combining blocks S_2, S_4, S_8, and S_16 in the time-frequency plane; the time axis is n = 0, ..., 15 and the DFT frequency axis is p = -8, ..., 7.]
\mathbf{STFT} = \begin{bmatrix} \mathbf{W}_{N_0}^{-1} & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{W}_{N_1}^{-1} & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{W}_{N_K}^{-1} \end{bmatrix} \mathbf{X}.  (9.32)
\mathrm{STFT}_{N_{(i,l)}}(n_i,k_l) = \sum_{m=-N_{(i,l)}/2}^{N_{(i,l)}/2-1} x(n_i+m)\,e^{-j\frac{2\pi}{N_{(i,l)}}mk_l}.  (9.33)
For a graphical representation of the STFT with varying windows, the corresponding STFT value should be assigned to each instant n = 0, 1, ..., M-1
and each DFT frequency p = -M/2, -M/2 + 1, ..., M/2 - 1 within a block.
In the case of a hybrid time-frequency-varying window, the matrix form is
obtained from the definition of each STFT value. For example, for the STFT
calculated as in Fig.9.24, an expression based on (9.33) should be written
for each STFT value. Then the resulting matrix STFT can be formed.

There are several methods in the literature that adapt windows or
basis functions to the signal form for each time instant, or even for every
considered point in the time-frequency plane. Selection
of the most appropriate form of the basis functions (windows) for each time-frequency point includes a criterion for selecting the optimal window width
(basis function scale) for each point.
9.3
WAVELET TRANSFORM
The first functions having the basic property of wavelets were used
by Haar at the beginning of the twentieth century. At the beginning of the
1980s, Morlet introduced a form of basis functions for the analysis of seismic
signals, naming them wavelets. The theory of wavelets was linked to
image processing by Mallat in the following years. In the late 1980s, Daubechies
presented a whole new class of wavelets that can be implemented in a
simple way, using digital filtering ideas. The most important applications
of wavelets are found in image processing and compression, pattern
recognition, and signal denoising. Here, we will only link the basics of the
wavelet transform to time-frequency analysis.
The common STFT is characterized by a constant window and constant
time and frequency resolutions at both low and high frequencies. The basic idea behind the wavelet transform, as it was originally introduced by
Morlet, was to vary the resolution with scale (which is related to frequency)
Figure 9.24 [Hybrid time-frequency-varying window STFT: the time-frequency plane is covered by blocks such as STFT_16(8,0), STFT_16(8,1), STFT_8(4,\pm 1), STFT_8(12,1), STFT_8(12,2), STFT_8(12,3), and several STFT_4 blocks; the time axis is 0-15.]
in such a way that a high frequency resolution is obtained for signal components at low frequencies, whereas a high time resolution is obtained for
signal components at high frequencies. This kind of resolution change can
be relevant for some practical applications, for example, seismic signals.
It is achieved by introducing a frequency-variable window width: the window
width is decreased as the frequency increases.
The basis functions in the STFT are of the form h(\tau-t) = w(\tau-t)e^{j\Omega_0(\tau-t)}, since

\mathrm{STFT}_{II}(t,\Omega_0) = \int_{-\infty}^{\infty} x(\tau)\,w^*(\tau-t)\,e^{-j\Omega_0\tau}\,d\tau = \left\langle x(\tau),\, w(\tau-t)e^{j\Omega_0\tau}\right\rangle,

that is, up to a phase factor,

\mathrm{STFT}_{II}(t,\Omega_0) = \left\langle x(\tau),\, h(\tau-t)\right\rangle = \int_{-\infty}^{\infty} x(\tau)\,h^*(\tau-t)\,d\tau.
When the above idea about wavelet transform is translated into the
mathematical form and related to the STFT, one gets the definition of a
continuous wavelet transform
WT(t,a) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} x(\tau)\, h^*\!\left(\frac{\tau-t}{a}\right) d\tau,  (9.34)
where h(t) is a band-pass signal and the parameter a is the scale. This
transform produces a time-scale, rather than a time-frequency, signal representation. For the Morlet wavelet the relation between the scale and the
frequency is a = \Omega_0/\Omega. In order to establish a strong formal relationship
between the wavelet transform and the STFT, we will choose the basic Morlet wavelet h(t) in the form

h(t) = w(t)e^{j\Omega_0 t}.  (9.35)
"
x ( )w (
t j0 t
a d.
)e
a
(9.36)
From the filter theory point of view, the wavelet transform at a given
scale a can be considered as the output of a system with impulse response
h^*(-t/a)/\sqrt{|a|}.
[Panels: (a)-(b) a = 2, \Omega = \Omega_0/2; (c)-(d) a = 1, \Omega = \Omega_0; (e)-(f) a = 1/2, \Omega = 2\Omega_0.]
Figure 9.25 Expansion functions for the wavelet transform (left) and the short-time Fourier
transform (right). Top row presents high scale (low frequency), middle row is for medium scale
(medium frequency) and bottom row is for low scale (high frequency).
Comparing these two band-pass filters from the bandwidth point of view, we can see
that, in the case of the STFT, the filtering is done by a system whose impulse
response w^*(-t)e^{j\Omega t} has a constant bandwidth, equal to the width of
the Fourier transform of w(t).
Constant Q-Factor Transform: The quality factor Q of a band-pass filter,
as a measure of the filter selectivity, is defined as

Q = \frac{\text{Central Frequency}}{\text{Bandwidth}}.

In the STFT the bandwidth is constant, equal to the window Fourier transform width B_w. Thus, the factor Q is proportional to the considered frequency,

Q = \frac{\Omega}{B_w}.
Figure 9.26 Illustration of the wavelet transform (a) of a sum of two delta pulses and two sinusoids, compared with the STFT (b).

In the wavelet transform, the filter at scale a has the central frequency \Omega_0/a and the bandwidth B_0/a, so that

Q = \frac{\Omega_0/a}{B_0/a} = \frac{\Omega_0}{B_0} = \mathrm{const},  (9.37)
where w(t) is a real-valued function. The transform (9.38) has nonzero values
in the region depicted in Fig. 9.26(a). The scalogram is the squared magnitude of the wavelet transform,

\mathrm{SCAL}(t,a) = |WT(t,a)|^2.  (9.39)
The scalogram obviously loses the linearity property, and fits into the category of quadratic transforms.
9.3.1 Filter Bank and Discrete Wavelet
This analysis will start by splitting the signal's spectral content into its high-frequency and low-frequency parts. Within the STFT framework, this can be
achieved by a two-sample rectangular window

w(n) = \delta(n) + \delta(n+1),
with N = 2. The two-sample window STFT values are

\mathrm{STFT}(n,0) = \frac{1}{\sqrt{2}}\sum_{m=0}^{1} x(n+m) = \frac{1}{\sqrt{2}}\left(x(n)+x(n+1)\right) = x_L(n),  (9.40)

\mathrm{STFT}(n,1) = \frac{1}{\sqrt{2}}\sum_{m=0}^{1} x(n+m)e^{-j\pi m} = \frac{1}{\sqrt{2}}\left(x(n)-x(n+1)\right) = x_H(n).  (9.41)
Note that here the definition

\mathrm{STFT}(n,k) = \sum_{m=0}^{N-1} x(n+m)e^{-j2\pi km/N}

is used, instead of \mathrm{STFT}(n,k) = \sum_{m=-N/2}^{N/2-1} x(n+m)e^{-j2\pi km/N}, in order to
remain within the common wavelet literature notation. For the same reason
the STFT is scaled by \sqrt{N} (a form when the DFT and IDFT have the same
factor 1/\sqrt{N}).
This kind of signal analysis leads to the Haar (wavelet) transform. In
the Haar wavelet transform the high-frequency part x_H(n) is not processed
any further. It is kept with this (high) two-sample resolution in time. The resolution in time of x_H(n) is just slightly (two times) lower than the original
For a signal with eight samples, all Haar wavelet coefficients can be written in the matrix form

\begin{bmatrix} \sqrt{2}W_1(0,H) \\ \sqrt{2}W_1(2,H) \\ \sqrt{2}W_1(4,H) \\ \sqrt{2}W_1(6,H) \\ 2W_2(0,H) \\ 2W_2(4,H) \\ 2\sqrt{2}W_4(0,H) \\ 2\sqrt{2}W_4(0,L) \end{bmatrix} =
\begin{bmatrix}
1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & -1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & -1 \\
1 & 1 & -1 & -1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & -1 & -1 \\
1 & 1 & 1 & 1 & -1 & -1 & -1 & -1 \\
1 & 1 & 1 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}
\begin{bmatrix} x(0) \\ x(1) \\ x(2) \\ x(3) \\ x(4) \\ x(5) \\ x(6) \\ x(7) \end{bmatrix}.  (9.42)
This kind of signal transformation was introduced by Haar more than a century
ago. In this notation, the scale a = 1 values of the wavelet coefficients W_1(2n,H)
are equal to the highpass part of the signal calculated using two samples,
W_1(2n,H) = x_H(2n). The scale a = 2 wavelet coefficients are W_2(4n,H) =
x_{LH}(4n). In scale a = 4 there is only one highpass and one lowpass coefficient, at n = 0: W_4(8n,H) = x_{LLH}(8n) and W_4(8n,L) = x_{LLL}(8n). In this
way any signal of length N = 2^m can be decomposed into Haar wavelet
coefficients.

The Haar wavelet transform has the property that its highpass coefficients are equal to zero if the analyzed signal is constant within the analyzed
time interval, for the considered scale. If the signal has a large number of constant
576
Time-Frequency Analysis
value samples within the analyzed time intervals, then many Haar wavelet
transform coefficients are zero-valued. They can be omitted in signal storage
or transmission. In recovery, their values are assumed to be zero and the original signal is obtained. The same can be done in the case of noisy signals,
when all coefficients below an assumed level of noise can be set to zero,
improving the signal-to-noise ratio in the reconstructed signal.
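The matrix form (9.42) can be checked numerically. A minimal sketch, normalizing each row of the matrix so that it becomes orthonormal:

```python
import numpy as np

# Rows of (9.42) before scaling; normalizing yields an orthonormal Haar matrix.
T = np.array([
    [1, -1,  0,  0,  0,  0,  0,  0],
    [0,  0,  1, -1,  0,  0,  0,  0],
    [0,  0,  0,  0,  1, -1,  0,  0],
    [0,  0,  0,  0,  0,  0,  1, -1],
    [1,  1, -1, -1,  0,  0,  0,  0],
    [0,  0,  0,  0,  1,  1, -1, -1],
    [1,  1,  1,  1, -1, -1, -1, -1],
    [1,  1,  1,  1,  1,  1,  1,  1],
], dtype=float)
H = T / np.sqrt((T ** 2).sum(axis=1, keepdims=True))
assert np.allclose(H @ H.T, np.eye(8))          # orthonormal rows

x = np.array([3., 3., 3., 3., 1., 1., 1., 1.])  # piecewise-constant signal
w = H @ x
assert np.allclose(w[:6], 0)                    # highpass coefficients at a = 1, 2 vanish
assert np.allclose(H.T @ w, x)                  # perfect reconstruction
```

Only the two coarsest coefficients are nonzero here, illustrating why piecewise-constant signals compress well in the Haar basis.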
9.3.1.1
Although the presented Haar wavelet analysis is quite simple, we will use
it as an example to introduce the filter bank framework of the wavelet
transform. Obvious results from the Haar wavelet will be used to introduce
other wavelet forms. For the Haar wavelet calculation, two signals x_L(n) and
x_H(n) are formed according to (9.40) and (9.41), based on the input signal
x(n). The transfer functions of the discrete-time systems producing these two
signals are

H_L(z) = \frac{1}{\sqrt{2}}(1+z)

H_H(z) = \frac{1}{\sqrt{2}}(1-z),  (9.43)
with amplitude characteristics |H_L(e^{j\omega})| = \sqrt{2}\,|\cos(\omega/2)| and |H_H(e^{j\omega})| = \sqrt{2}\,|\sin(\omega/2)|. After the values x_L(n) = [x(n)+x(n+1)]/\sqrt{2}
and x_H(n) = [x(n)-x(n+1)]/\sqrt{2} are obtained, the next values of these signals
are calculated after one time instant is
skipped. Therefore, the output signal is downsampled by a factor of two.
Figure 9.27 Amplitudes of the Fourier transforms of the basic Haar wavelet and scale function, |H_H(e^{j\omega})| = |DFT\{\psi_1(n)\}| and |H_L(e^{j\omega})| = |DFT\{\varphi_1(n)\}|, satisfying |H_L(e^{j\omega})|^2 + |H_H(e^{j\omega})|^2 = 2.
The downsampled signals are

s_L(n) = x_L(2n), \qquad s_H(n) = x_H(2n).  (9.44)

The z-transform of a downsampled signal x(2n) is

\mathcal{Z}\{x(2n)\} = \sum_{n=-\infty}^{\infty} x(2n)z^{-n} = \frac{1}{2}\sum_{n=-\infty}^{\infty} x(n)\left[(z^{1/2})^{-n} + (-z^{1/2})^{-n}\right] = \frac{1}{2}X(z^{1/2}) + \frac{1}{2}X(-z^{1/2}).  (9.45)
For the signals s L (n) = x L (2n) and s H (n) = x H (2n) the system implementation is presented in Fig.9.28.
Figure 9.28 Signal filtering by a lowpass filter H_L(z) and a highpass filter H_H(z), followed by downsampling by 2.
For the signals s_L(n) and s_H(n), obtained by passing x(n) through the lowpass and
highpass filters H_L(z) and H_H(z) and then downsampling, the relations

S_L(z) = \frac{1}{2}H_L(z^{1/2})X(z^{1/2}) + \frac{1}{2}H_L(-z^{1/2})X(-z^{1/2})

S_H(z) = \frac{1}{2}H_H(z^{1/2})X(z^{1/2}) + \frac{1}{2}H_H(-z^{1/2})X(-z^{1/2})

hold.
9.3.1.2
Upsampling
Let us assume that we are not going to transform the signals s L (n) and
s H (n) any more. The only goal is to reconstruct the signal x (n) based on its
downsampled lowpass and highpass part signals s L (n) and s H (n). The first
step in the signal reconstruction is to restore the original sampling interval
of the discrete-time signal. It is done by upsampling the signals s L (n) and
s H ( n ).
Upsampling of a signal x (n) is described by
y(n) = [\ldots, x(-2), 0, x(-1), 0, x(0), 0, x(1), 0, x(2), 0, \ldots].

Its z-transform domain form is

Y(z) = X(z^2),  (9.46)
since

X(z^2) = \sum_{n=-\infty}^{\infty} x(n)z^{-2n} = \sum_{n=-\infty}^{\infty} y(n)z^{-n}

with

y(n) = \begin{cases} x(n/2) & \text{for even } n \\ 0 & \text{for odd } n \end{cases}, \qquad y(n) = \mathcal{Z}^{-1}\{X(z^2)\}.
If a signal x(n) is downsampled first and then upsampled, the resulting signal transform is

Y(z) = \frac{1}{2}X\left((z^{1/2})^2\right) + \frac{1}{2}X\left((-z^{1/2})^2\right)... that is,

Y(z) = \frac{1}{2}X(z) + \frac{1}{2}X(-z).  (9.47)

In the Fourier domain this means Y(e^{j\omega}) = \frac{1}{2}\left(X(e^{j\omega}) + X(e^{j(\omega+\pi)})\right). This form
indicates that an aliasing component X(e^{j(\omega+\pi)}) appeared in this process.
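The aliasing relation (9.47) can be illustrated numerically. In the DFT domain the shift \omega \to \omega + \pi corresponds to rotating the spectrum by M/2 bins (a sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(8)
y = np.zeros(8)
y[::2] = x[::2]            # downsample by 2, then upsample by zero insertion
X, Y = np.fft.fft(x), np.fft.fft(y)
# Y(e^{jw}) = ( X(e^{jw}) + X(e^{j(w+pi)}) ) / 2; np.roll shifts by M/2 = 4 bins
assert np.allclose(Y, 0.5 * (X + np.roll(X, -4)))
```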
9.3.1.3
Reconstruction Condition
Figure 9.29 One stage of the filter bank with reconstruction, corresponding to the one stage
of the wavelet transform realization.
The perfect reconstruction conditions are

H_L(z)G_L(z) + H_H(z)G_H(z) = 2  (9.48)

H_L(-z)G_L(z) + H_H(-z)G_H(z) = 0.  (9.49)
Replacing z by -z in (9.48) and (9.49) gives

H_L(-z)G_L(-z) + H_H(-z)G_H(-z) = 2  (9.50)

H_L(z)G_L(-z) + H_H(z)G_H(-z) = 0.  (9.51)

Since the expression within the brackets is equal to 2 (the reconstruction condition (9.48) with z being replaced by -z), it follows that the lowpass and highpass product terms satisfy

H_L(z)G_L(z) + H_L(-z)G_L(-z) = 2  (9.52)

H_H(z)G_H(z) + H_H(-z)G_H(-z) = 2.  (9.53)
9.3.1.4 Orthogonality Conditions
The orthogonality condition for the lowpass impulse response is

\sum_{m} h_L(m)h_L(m-2n) = \delta(n).  (9.54)

In the frequency domain it corresponds to

\frac{1}{2}\left|H_L(e^{j\omega/2})\right|^2 + \frac{1}{2}\left|H_L(e^{j(\omega/2+\pi)})\right|^2 = 1,

that is,

\left|H_L(e^{j\omega})\right|^2 + \left|H_L(e^{j(\omega+\pi)})\right|^2 = 2.
If the impulse response h_L(n) is orthogonal, as in (9.54), then the last relation
is satisfied for

g_L(n) = h_L(-n).

In the z-domain it holds that

G_L(z) = H_L(z^{-1}),

and we may write (9.48) in the form

G_L(z)G_L(z^{-1}) + G_L(-z)G_L(-z^{-1}) = 2  (9.55)

or P(z) + P(-z) = 2 with P(z) = G_L(z)G_L(z^{-1}). Relation (9.48) may also be
written for H_L(z) as well,

H_L(z)H_L(z^{-1}) + H_L(-z)H_L(-z^{-1}) = 2.
9.3.1.5

With K coefficients, the anticausal analysis filter and the causal synthesis filter can be written as

h_L(n) = \sum_{k=0}^{K-1} h_k\,\delta(n+k), \qquad g_L(n) = \sum_{k=0}^{K-1} h_k\,\delta(n-k).
For the highpass reconstruction filter, with g_H(n) = (-1)^n g_L(K-n) and an odd K,

G_H(e^{j\omega}) = \sum_{n} g_H(n)e^{-j\omega n} = \sum_{n} (-1)^n g_L(K-n)e^{-j\omega n} = \sum_{m} g_L(m)\,e^{j\pi(K-m)}e^{-j\omega(K-m)} = e^{-j(\omega-\pi)K}G_L(e^{j(\pi-\omega)}),

or

G_H(e^{j\omega}) = -e^{-j\omega K}G_L(-e^{-j\omega}) = -e^{-j\omega K}H_L(-e^{j\omega}).

Similarly, with h_H(n) = g_H(-n) = (-1)^n h_L(-n-K),

H_H(e^{j\omega}) = \sum_{n=-K}^{0} h_H(n)e^{-j\omega n} = \sum_{n=-K}^{0} (-1)^n h_L(-n-K)e^{-j\omega n},

resulting in

H_H(e^{j\omega}) = -e^{j\omega K}G_L(-e^{j\omega}).  (9.56)
The orthogonality of the lowpass and highpass impulse responses,

\sum_{m} h_L(m)h_H(m-2n) = 0, \qquad \sum_{m} g_L(m)g_H(m-2n) = 0,  (9.57)

is also satisfied with these forms of transfer functions for any n.
The condition that the reconstruction filter G_L(z) has zero value at z = e^{j\pi} = -1 means that its form is G_L(z) = a(1+z^{-1}). This form, without additional
requirements, would produce a = 1/\sqrt{2} from the reconstruction relation
G_L(z)G_L(z^{-1}) + G_L(-z)G_L(-z^{-1}) = 2. The time-domain filter form is

g_L(n) = \frac{1}{\sqrt{2}}\left[\delta(n) + \delta(n-1)\right].

It corresponds to the Haar wavelet. All other filter functions can be defined
using g_L(n) or G_L(e^{j\omega}).
The same result would be obtained starting from the filter transfer
functions for the Haar wavelet, already introduced as

H_L(z) = \frac{1}{\sqrt{2}}(1+z)

H_H(z) = \frac{1}{\sqrt{2}}(1-z).

The reconstruction filters are obtained from (9.48)-(9.49),

\frac{1}{\sqrt{2}}(1+z)G_L(z) + \frac{1}{\sqrt{2}}(1-z)G_H(z) = 2

\frac{1}{\sqrt{2}}(1-z)G_L(z) + \frac{1}{\sqrt{2}}(1+z)G_H(z) = 0,

as

G_L(z) = \frac{1}{\sqrt{2}}\left(1+z^{-1}\right), \qquad G_H(z) = \frac{1}{\sqrt{2}}\left(1-z^{-1}\right),  (9.58)

with

g_L(n) = \frac{1}{\sqrt{2}}\delta(n) + \frac{1}{\sqrt{2}}\delta(n-1), \qquad g_H(n) = \frac{1}{\sqrt{2}}\delta(n) - \frac{1}{\sqrt{2}}\delta(n-1).  (9.59)

The impulse response values are

  n               -1    0    1
  \sqrt{2} h_L(n)   1    1    0
  \sqrt{2} h_H(n)  -1    1    0
  \sqrt{2} g_L(n)   0    1    1
  \sqrt{2} g_H(n)   0    1   -1
A detailed time-domain filter bank implementation of the reconstruction process in the Haar wavelet case is described next. The reconstruction is
implemented in two steps:
1) The signals s_L(n) and s_H(n) from (9.44) are upsampled, according
to (9.46), as

r_L(n) = [s_L(0)\ 0\ s_L(1)\ 0\ s_L(2)\ 0\ \ldots\ s_L(N-1)\ 0]

r_H(n) = [s_H(0)\ 0\ s_H(1)\ 0\ s_H(2)\ 0\ \ldots\ s_H(N-1)\ 0].
2) These signals are then passed through the reconstruction filters. The sum of the
outputs from these filters is

y(n) = r_L(n)*g_L(n) + r_H(n)*g_H(n)
= \frac{1}{\sqrt{2}}r_L(n) + \frac{1}{\sqrt{2}}r_L(n-1) + \frac{1}{\sqrt{2}}r_H(n) - \frac{1}{\sqrt{2}}r_H(n-1)
= \frac{1}{\sqrt{2}}[x_L(0)\ 0\ x_L(2)\ 0\ x_L(4)\ \ldots\ 0\ x_L(2N-2)\ 0]
+ \frac{1}{\sqrt{2}}[0\ x_L(0)\ 0\ x_L(2)\ \ldots\ 0\ x_L(2N-2)]
+ \frac{1}{\sqrt{2}}[x_H(0)\ 0\ x_H(2)\ 0\ x_H(4)\ \ldots\ 0\ x_H(2N-2)\ 0]
- \frac{1}{\sqrt{2}}[0\ x_H(0)\ 0\ x_H(2)\ \ldots\ 0\ x_H(2N-2)],

where s_L(n) = x_L(2n) and s_H(n) = x_H(2n). From the previous relation it
follows that

y(0) = \frac{1}{\sqrt{2}}\left[x_L(0) + x_H(0)\right] = x(0)

y(1) = \frac{1}{\sqrt{2}}\left[x_L(0) - x_H(0)\right] = x(1)

\ldots

y(2n) = \frac{1}{\sqrt{2}}\left[x_L(2n) + x_H(2n)\right] = x(2n)

y(2n+1) = \frac{1}{\sqrt{2}}\left[x_L(2n) - x_H(2n)\right] = x(2n+1).
A system for implementation of the Haar wavelet transform of a signal
with eight samples is presented in Fig.9.30. It corresponds to the matrix form
realization (9.42).
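The two-step reconstruction can be verified for one stage of the Haar filter bank. A minimal sketch using circular shifts for the filtering:

```python
import numpy as np

x = np.arange(8, dtype=float)
s2 = np.sqrt(2)
xL = (x + np.roll(x, -1)) / s2      # x_L(n) = (x(n) + x(n+1)) / sqrt(2)
xH = (x - np.roll(x, -1)) / s2      # x_H(n) = (x(n) - x(n+1)) / sqrt(2)
sL, sH = xL[::2], xH[::2]           # downsampling by 2
rL = np.zeros(8); rL[::2] = sL      # upsampling by zero insertion
rH = np.zeros(8); rH[::2] = sH
# filtering with g_L(n) = [1, 1]/sqrt(2) and g_H(n) = [1, -1]/sqrt(2), then summing
y = (rL + np.roll(rL, 1) + rH - np.roll(rH, 1)) / s2
assert np.allclose(y, x)            # perfect reconstruction
```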
Example 9.17. For a signal x (n) = [1, 1, 2, 0, 2, 2, 0, 0, 2, 2, 2, 2, 0, 0, 0, 0] calculate
the Haar wavelet transform coefficients, with their appropriate placement in
the time-frequency plane corresponding to a signal with M = 16 samples.
Figure 9.30 [Three-stage filter bank implementing the Haar wavelet transform of an eight-sample signal x(n): the first stage (H_H(z), downsampling by 2) produces W_1(0,H), W_1(2,H), W_1(4,H), W_1(6,H) at scale a = 1; the second stage produces W_2(0,H), W_2(4,H) at scale a = 2; the third stage produces W_4(0,H) and W_4(0,L).]
9.3.1.7

The Haar wavelet has an impulse response duration equal to two. In
one stage, it corresponds to a two-sample STFT calculated using a rectangular window. Its Fourier transform, presented in Fig.9.27, is a quite rough
approximation of a lowpass and a highpass filter. In order to improve the filter
performance, the number of filter coefficients should be increased.
A fourth-order FIR system will be considered. The impulse response of the anticausal fourth-order FIR filter is h_L(n) = [h_L(0), h_L(-1), h_L(-2), h_L(-3)] = [h_0, h_1, h_2, h_3].
Figure 9.31 Wavelet transform of a signal with M = 16 samples at the output of stages 1, 2, 3
and 4, respectively. Notation Wa (n, H ) is used for the highpass value of coefficient after stage
(scale of) a at an instant n. Notation Wa (n, L) is used for the lowpass value of coefficient after
stage (scale of) a at an instant n.
such that

  n         0     1     2     3
  g_L(n)    h_0   h_1   h_2   h_3
  g_H(n)    h_3  -h_2   h_1  -h_0
  h_L(-n)   h_0   h_1   h_2   h_3
  h_H(-n)   h_3  -h_2   h_1  -h_0   (9.60)
and

H_L(-z)G_L(z) + H_H(-z)G_H(z)
= \left(h_0 - h_1 z + h_2 z^2 - h_3 z^3\right)\left(h_0 + h_1 z^{-1} + h_2 z^{-2} + h_3 z^{-3}\right)
+ \left(h_3 + h_2 z + h_1 z^2 + h_0 z^3\right)\left(h_3 - h_2 z^{-1} + h_1 z^{-2} - h_0 z^{-3}\right) = 0.
!
The coefficients follow from the conditions:

h_0^2 + h_1^2 + h_2^2 + h_3^2 = 1 \quad \text{(reconstruction condition)}

h_0 + h_1 + h_2 + h_3 = \sqrt{2} \quad \text{from } H_L(e^{j0}) = \sqrt{2}

h_0 - h_1 + h_2 - h_3 = 0 \quad \text{from } H_L(e^{j\pi}) = 0

-h_1 + 2h_2 - 3h_3 = 0 \quad \text{from } \left.\frac{dH_L(e^{j\omega})}{d\omega}\right|_{\omega=\pi} = 0.
Its solution produces the fourth-order Daubechies wavelet coefficients (D4),

  n         0                           1                           2                           3
  g_L(n)    \frac{1+\sqrt{3}}{4\sqrt{2}}    \frac{3+\sqrt{3}}{4\sqrt{2}}    \frac{3-\sqrt{3}}{4\sqrt{2}}    \frac{1-\sqrt{3}}{4\sqrt{2}}
  g_H(n)    \frac{1-\sqrt{3}}{4\sqrt{2}}   -\frac{3-\sqrt{3}}{4\sqrt{2}}    \frac{3+\sqrt{3}}{4\sqrt{2}}   -\frac{1+\sqrt{3}}{4\sqrt{2}}

with h_L(-n) = g_L(n) and h_H(-n) = g_H(n).
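The D4 coefficients can be checked against all of the design conditions stated above (a minimal sketch):

```python
import numpy as np

s3, s2 = np.sqrt(3), np.sqrt(2)
h0, h1, h2, h3 = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * s2)
assert np.isclose(h0 + h1 + h2 + h3, s2)              # H_L(e^{j0}) = sqrt(2)
assert np.isclose(h0**2 + h1**2 + h2**2 + h3**2, 1)   # reconstruction condition
assert np.isclose(h0 - h1 + h2 - h3, 0)               # H_L(e^{j pi}) = 0
assert np.isclose(-h1 + 2*h2 - 3*h3, 0)               # derivative condition at pi
assert np.isclose(h0*h2 + h1*h3, 0)                   # orthogonality to a shift by 2
```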
Note that this is just one of the possible symmetric solutions of the previous
system of equations, Fig.9.32.
The reconstruction conditions for the fourth-order FIR filter

H_L(e^{j\omega}) = h_0 + h_1 e^{j\omega} + h_2 e^{j2\omega} + h_3 e^{j3\omega}

with the Daubechies wavelet coefficients (D4) can also be checked in a graphical
way, by calculating

\left|H_L(e^{j\omega})\right|^2 + \left|H_L(e^{j(\omega+\pi)})\right|^2 = 2

H_L(e^{j\omega})H_H^*(e^{j\omega}) + H_L(e^{j(\omega+\pi)})H_H^*(e^{j(\omega+\pi)}) = 0.
From Fig.9.33 we can see that this is a much better approximation of the lowpass and
highpass filters than in the Haar wavelet case, Fig.9.27.
Another way to derive the Daubechies wavelet coefficients (D4) is to use
relation (9.55),

P(z) + P(-z) = 2,

with

P(z) = G_L(z)H_L(z) = G_L(z)G_L(z^{-1}).
Figure 9.32 [Impulse responses g_L(n), g_H(n), h_L(n), and h_H(n) of the Daubechies D4 filters.]

Figure 9.33 [Amplitude characteristics |H_L(e^{j\omega})| = |DFT\{\varphi_1(n)\}| and |H_H(e^{j\omega})| = |DFT\{\psi_1(n)\}| of the D4 filters, satisfying |H_L(e^{j\omega})|^2 + |H_H(e^{j\omega})|^2 = 2.]
"
#2
Then GL (z) must contain a factor of the form 1 + z1 . Since the filter
#2
"
order must be even (K must be odd), taking into account that 1 + z1
would produce a FIR system with 3 nonzero coefficients, then we have to
add at least one factor of the form a(1 + z1 z1 ) to GL (z). Thus, the lowest
order FIR filter with an even number of (nonzero) impulse response values
is
$
%2
G L ( z ) = 1 + z 1 a (1 + z 1 z 1 )
with
$
%2 $
%2
P ( z ) = 1 + z 1
1 + z1 R ( z )
where
Using
&
'&
'
R(z) = a(1 + z1 z1 ) a(1 + z1 z1 ) = z0 z1 + b + z0 z.
P(z) + P(z) = 2
only the terms with even exponents of z will remain in P(z) + P(z)
producing
%
1 $
1 3
a = 1 + 3 and z1 =
1+ 3
4 2
and
R(z) =
4 2
)2 $
1+
$
%$
$
%
% %
3 + 1 3 z 1 1 + 3 + 1 3 z 1 .
"
1
G L ( z ) = (1 + z 1 )2 1 + 3 + 1 3 z 1
4 2
with
!
"
"
1 !
g L ( n ) = [ 1 + 3 ( n ) + 3 + 3 ( n 1)
4 2
!
!
"
"
+ 3 3 (n 2) + 1 3 (n 3)].
All other impulse responses follow from this one (as in the presented table).
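The factorization can be expanded numerically and compared with the closed-form coefficients (a minimal sketch):

```python
import numpy as np

s3, s2 = np.sqrt(3), np.sqrt(2)
a = (1 + s3) / (4 * s2)
z1 = (1 - s3) / (1 + s3)
# G_L(z) = a (1 + z^{-1})^2 (1 + z1 z^{-1}); polynomial coefficients in z^{-1}
gL = a * np.convolve(np.convolve([1, 1], [1, 1]), [1, z1])
expected = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * s2)
assert np.allclose(gL, expected)
```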
Example 9.18. Consider a signal that is a linear function of time,

x(n) = an + b.

Show that the condition

-h_L(1) + 2h_L(2) - 3h_L(3) = 0, \quad \text{following from} \quad \left.\frac{dH_L(e^{j\omega})}{d\omega}\right|_{\omega=\pi} = 0,

makes the highpass (wavelet) coefficients of this signal zero-valued.

⋆The highpass coefficients after the first stage, W_1(2n,H), are obtained
by downsampling W_1(n,H), whose form is

W_1(n,H) = x(n) * h_H(n),

while the lowpass part remains a linear function of time, a_1 n + b_1,

where a_1 = \sqrt{2}a and b_1 = \sqrt{2}b + 0.8966a.
Note that, besides h_0 + h_1 + h_2 + h_3 = \sqrt{2} from H_L(e^{j0}) = \sqrt{2} and

h_0 - h_1 + h_2 - h_3 = 0 from H_L(e^{j\pi}) = 0,

the reconstruction condition

h_0^2 + h_1^2 + h_2^2 + h_3^2 = 1

is equivalent to the orthogonality property of the impulse response and its
version shifted by step 2,

  h_0   h_1   h_2   h_3   0     0
  0     0     h_0   h_1   h_2   h_3

given by

h_2 h_0 + h_3 h_1 = 0.
The matrix for the D4 wavelet transform calculation in the first stage
is of the form

\begin{bmatrix} W_1(0,L) \\ W_1(0,H) \\ W_1(2,L) \\ W_1(2,H) \\ W_1(4,L) \\ W_1(4,H) \\ W_1(6,L) \\ W_1(6,H) \end{bmatrix} =
\begin{bmatrix}
h_0 & h_1 & h_2 & h_3 & 0 & 0 & 0 & 0 \\
h_3 & -h_2 & h_1 & -h_0 & 0 & 0 & 0 & 0 \\
0 & 0 & h_0 & h_1 & h_2 & h_3 & 0 & 0 \\
0 & 0 & h_3 & -h_2 & h_1 & -h_0 & 0 & 0 \\
0 & 0 & 0 & 0 & h_0 & h_1 & h_2 & h_3 \\
0 & 0 & 0 & 0 & h_3 & -h_2 & h_1 & -h_0 \\
h_2 & h_3 & 0 & 0 & 0 & 0 & h_0 & h_1 \\
h_1 & -h_0 & 0 & 0 & 0 & 0 & h_3 & -h_2
\end{bmatrix}
\begin{bmatrix} x(0) \\ x(1) \\ x(2) \\ x(3) \\ x(4) \\ x(5) \\ x(6) \\ x(7) \end{bmatrix}.  (9.61)

In the first row of the transformation matrix the coefficients correspond to
h_L(n), while the second row corresponds to h_H(n). The first row produces the
D4 scaling function, while the second row produces the D4 wavelet function.
The coefficients are shifted by 2 in the next rows. As has been described
in the Hann(ing) window reconstruction case, the calculation should be
performed in a circular manner, assuming signal periodicity. That is why
the coefficients are circularly shifted in the last two rows.
[h_3, h_2, h_1, h_0] = \left[\frac{1-\sqrt{3}}{4\sqrt{2}},\ \frac{3-\sqrt{3}}{4\sqrt{2}},\ \frac{3+\sqrt{3}}{4\sqrt{2}},\ \frac{1+\sqrt{3}}{4\sqrt{2}}\right].
Figure 9.34 The Daubechies D4 wavelet transform (absolute value) of the signal x(n) = \delta(n-7), using N = 16 signal samples, 0 <= n <= N-1 (left), and of the signal x(n) = 2\cos(2\pi 8n/N) + 1, 0 <= n <= N-1, with N = 16 (right).
The inverse matrix for the D4 wavelet transform for a signal with
N = 8 samples would be calculated from the lowest level, in this case
from a = 2, with the coefficients W_2(0,L), W_2(0,H), W_2(4,L), and W_2(4,H). The
lowpass part of the signal at level a = 1 would be reconstructed using

\begin{bmatrix} W_1(0,L) \\ W_1(2,L) \\ W_1(4,L) \\ W_1(6,L) \end{bmatrix} =
\begin{bmatrix}
h_0 & h_3 & h_2 & h_1 \\
h_1 & -h_2 & h_3 & -h_0 \\
h_2 & h_1 & h_0 & h_3 \\
h_3 & -h_0 & h_1 & -h_2
\end{bmatrix}
\begin{bmatrix} W_2(0,L) \\ W_2(0,H) \\ W_2(4,L) \\ W_2(4,H) \end{bmatrix}.

After the lowpass part W_1(0,L), W_1(2,L), W_1(4,L), and W_1(6,L) is reconstructed, it is used with the wavelet coefficients from this stage, W_1(0,H),
W_1(2,H), W_1(4,H), and W_1(6,H), to reconstruct the signal as
\begin{bmatrix} x(0) \\ x(1) \\ x(2) \\ x(3) \\ x(4) \\ x(5) \\ x(6) \\ x(7) \end{bmatrix} =
\begin{bmatrix}
h_0 & h_3 & 0 & 0 & 0 & 0 & h_2 & h_1 \\
h_1 & -h_2 & 0 & 0 & 0 & 0 & h_3 & -h_0 \\
h_2 & h_1 & h_0 & h_3 & 0 & 0 & 0 & 0 \\
h_3 & -h_0 & h_1 & -h_2 & 0 & 0 & 0 & 0 \\
0 & 0 & h_2 & h_1 & h_0 & h_3 & 0 & 0 \\
0 & 0 & h_3 & -h_0 & h_1 & -h_2 & 0 & 0 \\
0 & 0 & 0 & 0 & h_2 & h_1 & h_0 & h_3 \\
0 & 0 & 0 & 0 & h_3 & -h_0 & h_1 & -h_2
\end{bmatrix}
\begin{bmatrix} W_1(0,L) \\ W_1(0,H) \\ W_1(2,L) \\ W_1(2,H) \\ W_1(4,L) \\ W_1(4,H) \\ W_1(6,L) \\ W_1(6,H) \end{bmatrix}.  (9.62)
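That the synthesis matrix in (9.62) is the transpose of the analysis matrix in (9.61), and that the pair gives perfect reconstruction, can be verified numerically (a minimal sketch):

```python
import numpy as np

s3, s2 = np.sqrt(3), np.sqrt(2)
h0, h1, h2, h3 = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * s2)
lo, hi = np.array([h0, h1, h2, h3]), np.array([h3, -h2, h1, -h0])
T = np.zeros((8, 8))
for r in range(4):                       # rows shifted by 2, wrapped circularly
    idx = (np.arange(4) + 2 * r) % 8
    T[2 * r, idx] = lo                   # scaling (lowpass) rows of (9.61)
    T[2 * r + 1, idx] = hi               # wavelet (highpass) rows of (9.61)
assert np.allclose(T @ T.T, np.eye(8))   # orthonormal: the inverse is the transpose
x = np.random.default_rng(2).standard_normal(8)
assert np.allclose(T.T @ (T @ x), x)     # (9.62) reconstructs the signal exactly
```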
This procedure can be continued for a signal of length N = 16 with one more
stage. An additional stage would be added for N = 32, and so on.
Example 9.22. For the wavelet transform from the previous example find its
inverse (reconstruct the signal).

⋆The lowpass part at stage a = 2 is reconstructed first, from the deepest-stage coefficients,

\begin{bmatrix} W_2(0,L) \\ W_2(4,L) \\ W_2(8,L) \\ W_2(12,L) \end{bmatrix} =
\begin{bmatrix}
h_0 & h_3 & h_2 & h_1 \\
h_1 & -h_2 & h_3 & -h_0 \\
h_2 & h_1 & h_0 & h_3 \\
h_3 & -h_0 & h_1 & -h_2
\end{bmatrix}
\begin{bmatrix} W_3(0,L) \\ W_3(0,H) \\ W_3(8,L) \\ W_3(8,H) \end{bmatrix},

with the numerical values 0.1373, 0.4668, 0.1251, 0.6373, 0.1132, and 0.4226
appearing in this calculation for the signal from the previous example. The lowpass part at stage a = 1 then follows using the 8 x 8 matrix of the same structure,

\begin{bmatrix} W_1(0,L) \\ W_1(2,L) \\ W_1(4,L) \\ W_1(6,L) \\ W_1(8,L) \\ W_1(10,L) \\ W_1(12,L) \\ W_1(14,L) \end{bmatrix} =
\begin{bmatrix}
h_0 & h_3 & 0 & 0 & 0 & 0 & h_2 & h_1 \\
h_1 & -h_2 & 0 & 0 & 0 & 0 & h_3 & -h_0 \\
h_2 & h_1 & h_0 & h_3 & 0 & 0 & 0 & 0 \\
h_3 & -h_0 & h_1 & -h_2 & 0 & 0 & 0 & 0 \\
0 & 0 & h_2 & h_1 & h_0 & h_3 & 0 & 0 \\
0 & 0 & h_3 & -h_0 & h_1 & -h_2 & 0 & 0 \\
0 & 0 & 0 & 0 & h_2 & h_1 & h_0 & h_3 \\
0 & 0 & 0 & 0 & h_3 & -h_0 & h_1 & -h_2
\end{bmatrix}
\begin{bmatrix} W_2(0,L) \\ W_2(0,H) \\ W_2(4,L) \\ W_2(4,H) \\ W_2(8,L) \\ W_2(8,H) \\ W_2(12,L) \\ W_2(12,H) \end{bmatrix}.
The obtained values W_1(n,L), together with the wavelet coefficients W_1(n,H), are used
to reconstruct the original signal x(n). The transformation matrix in this case
is of order 16 x 16 and is formed using the same structure as the previous
transformation matrix.

9.3.1.8
Although the wavelet realization can be performed using the same basic function presented in the previous section, here we will consider the equivalent wavelet function h_H(n) and the equivalent scale function h_L(n) at different scales. To this aim we will analyze the reconstruction part of the system. Assume that in the wavelet analysis of a signal only one coefficient is nonzero. Also assume that this nonzero coefficient is at the exit of the all-lowpass filter structure. It means that the signal is equal to the basic scale function in
Figure 9.35 The system of reconstruction filters generating the scale functions φ_a(n), with φ_0(n) = h_L(n): a delta pulse δ(n) passes through stages of upsampling by 2 followed by the lowpass reconstruction filter G_L(z), while the highpass filters G_H(z) correspond to the wavelet-coefficient inputs.
the wavelet analysis. The scale function can be found in an inverse way, by reconstructing the signal corresponding to this delta-pulse-like transform. The system of reconstruction filters is shown in Fig.9.35. Note that this case in the Haar transform would correspond to the coefficient W4(0,L) = 1 in (9.42) or in Fig.9.30. The reconstruction process consists of signal upsampling and passing it through the reconstruction stages. For example, the output of the third reconstruction stage has the z-transform
$$\Phi_2(z) = G_L(z)\, G_L(z^2)\, G_L(z^4).$$
In the time domain the reconstruction is performed as
$$\varphi_0(n) = \delta(n) * g_L(n) = g_L(n)$$
$$\varphi_1(n) = [\varphi_0(0),\, 0,\, \varphi_0(1),\, 0,\, \varphi_0(2),\, 0,\, \varphi_0(3)] * g_L(n)$$
$$\varphi_2(n) = [\varphi_1(0),\, 0,\, \varphi_1(1),\, 0,\, \dots,\, \varphi_1(8),\, 0,\, \varphi_1(9)] * g_L(n)$$
$$\dots$$
where g_L(n) is the four-sample impulse response (the Daubechies D4 coefficients). The duration of the scale function φ_1(n) is (4 + 3) + 4 − 1 = 10 samples, while the duration of φ_2(n) is 19 + 4 − 1 = 22 samples. The wavelet functions ψ_a(n) are obtained in the same way, with the highpass reconstruction filter G_H(z) in the first stage.
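The recursion above can be executed directly. In the sketch below (assuming the standard Daubechies D4 values for g_L(n), which are not restated here), the durations 4, 10, and 22 of φ_0(n), φ_1(n), and φ_2(n) are reproduced:

```python
import numpy as np

# Assumed standard Daubechies D4 lowpass impulse response g_L(n)
gL = np.array([1 + np.sqrt(3), 3 + np.sqrt(3),
               3 - np.sqrt(3), 1 - np.sqrt(3)]) / (4 * np.sqrt(2))

def upsample(x):
    """Insert a zero between successive samples (no trailing zero)."""
    y = np.zeros(2 * len(x) - 1)
    y[::2] = x
    return y

phi = gL.copy()          # phi_0(n) = g_L(n)
lengths = [len(phi)]
for a in range(2):       # compute phi_1(n) and phi_2(n)
    phi = np.convolve(upsample(phi), gL)
    lengths.append(len(phi))

print(lengths)  # [4, 10, 22], matching the durations computed above
```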
Figure 9.36 The system of reconstruction filters generating the wavelet functions ψ_a(n), with ψ_0(n) = h_H(n): the first stage contains the highpass reconstruction filter G_H(z), and all subsequent stages contain upsampling by 2 followed by the lowpass filter G_L(z).
$$\Psi_2(z) = G_H(z)\, G_L(z^2)\, G_L(z^4).$$
In the Haar transform (9.42) and Fig.9.30 this case would correspond to
W4 (0, H ) = 1.
The wavelet functions at different scales are orthogonal,
$$\langle \psi_0(n-2m), \psi_1(n)\rangle = 0,$$
since
$$\langle \psi_0(n-2m), \psi_1(n)\rangle = \sum_{p} g_H(p) \sum_{n} g_H(n-2m)\, g_L(n-2p) = 0,$$
because of the orthogonality of g_L(n) and g_H(n) with even shifts.
Figure 9.37 The Daubechies D4 wavelet scale function and wavelet calculated using the filter-bank relation at different scales: a = 0 (first row), a = 1 (second row), a = 2 (third row), a = 3 (fourth row), and a = 10 (fifth row, an approximation of the continuous domain). The amplitudes are scaled by 2^{(a+1)/2} to keep them within the same range; the values φ_a(n)2^{(a+1)/2} and ψ_a(n)2^{(a+1)/2} are presented.
Chapter 10
Sparse Signal Processing
A discrete-time signal can be transformed into other domains using
different signal transformations. Some signals that cover the whole considered interval in one domain could be sparse in a transformation domain, i.e.,
could be located within a few nonzero coefficients. Compressive sensing is
a field dealing with a model for data acquisition including the problem of
sparse signal recovery from a reduced set of observations. A reduced set
of observations can be a result of a desire to sense a sparse signal with
the lowest possible number of measurements/observations (compressive
sensing). It can also be a result of a physical or measurement unavailability
to take a complete set of observations. Since the signal samples are linear combinations of the signal transformation coefficients, they can be considered as the observations of a signal that is sparse in the transformation domain. In applications it can also happen that some arbitrarily positioned samples of a signal are so heavily corrupted by disturbances that it is better to omit them, consider them as unavailable in the analysis, and try to reconstruct the signal from the reduced set of samples. Although in the first case the reduced set of observations/samples is a result of the user's strategy to compress the information, while in the other two cases it is not a result of the user's intention, all of them can be considered within the same unified framework. Under some conditions, a full reconstruction of a sparse signal can be performed from a reduced set of observations/samples, as if the complete set of samples/observations were available. A
priori information about the nature of the analyzed signal, i.e., its sparsity
in a known transformation domain, must be used in this analysis. Sparsity
_________________________________________________
Authors: Ljubia Stankovic, Milo Dakovic, Srdjan Stankovic, Irena Orovic
ILLUSTRATIVE EXAMPLES
Before we start the analysis, we will describe a few widely known examples that can be interpreted and solved within the context of sparse signal processing and compressive sensing.
Consider a large set of real numbers X(0), X(1), ..., X(N − 1). Assume that only one of them is nonzero (or different from a common and known expected value). We do not know either its position or its value. The aim is to find the position and the value of this number. This case can easily be related to many real-life examples in which we have to find one sample that differs from the other N − 1 samples. The nonzero value (or the difference from the expected value) will be denoted by X(i). A direct way to find the position of the nonzero (different) sample would be to perform up to N measurements and compare each of them with zero (the expected value). However, if N is very large and there is only one nonzero (different than expected) sample, we can get the result in just a few observations/measurements. A procedure with a reduced number of observations/measurements is described next.
Take random numbers as weighting coefficients a_i, one for each sample. Measure the total value of all N weighted samples, with weights a_i, from the set. Since only one sample is different from the common and known expected value m (or from zero), we will get the total measured value
$$M = a_1 m + a_2 m + \dots + a_i (m + X(i)) + \dots + a_N m.$$
From this measured value M subtract the expected value M_T = (a_1 + a_2 + ... + a_N)m. The obtained value of this observation/measurement, denoted by y(0), is
$$y(0) = M - M_T = \sum_{k=0}^{N-1} a_k X(k) = a_i X(i),$$
since only X(i) is different from zero.
Figure 10.1 There are N bags with coins. One of them, at an unknown position i, contains false coins. The false coins differ from the true ones in mass by an unknown X(i). The mass of the true coins is m. The set of coins for the measurement is formed using a_1 coins from the first bag, a_2 coins from the second bag, and so on. The total measured value is M = a_1 m + ... + a_i (m + X(i)) + ... + a_N m. The difference of this value from the case when all coins are true is M − M_T. The cases with one and with two bags of false coins are presented (left and right).
This is the problem of N bags with coins, one of which contains false coins that differ from the expected weight m of the true coins. The goal is to find the position and the difference in the weight of the false coins. From each of the N bags we take a_i, i = 1, 2, ..., N, coins. The number of coins from the ith bag is denoted by a_i. The total measured weight of all coins from the N bags is M, Fig.10.1.
After the expected value is subtracted, the observation/measurement y(0) is obtained as
$$y(0)=\sum_{k=0}^{N-1}X(k)\varphi_k(0), \quad (10.1)$$
with the weighting coefficients φ_k(0) ≠ 0, k = 0, 1, 2, ..., N − 1. As expected, from one measurement we are not able to solve the problem and to find the position and the value of the nonzero sample: each position k is possible, with the value X(k) = y(0)/φ_k(0).
If we perform one more measurement y(1), with another set of weighting coefficients φ_k(1), k = 0, 1, ..., N − 1, and get the measured value y(1) = X(i)φ_i(1), the result will be a hyperplane
$$y(1)=\sum_{k=0}^{N-1}X(k)\varphi_k(1).$$
This measurement will produce a new set of possible solutions for each X(k),
$$X(k) = y(1)/\varphi_k(1), \quad k = 0, 1, 2, \dots, N-1.$$
If these two sets of solutions (hyperplanes) produce only one common value,
$$X(i) = y(0)/\varphi_i(0) = y(1)/\varphi_i(1),$$
then it is the solution of our problem.
As an example, consider N = 5 sets of coins. The common weight of the true coins is 2. In the first measurement we use φ_i(0) = a_i = i coins from each set. The total weight of the coins is M = 31. It is obtained by measuring (1 + 2 + 3 + 4 + 5)·2 + iX(i) = M, where X(i) is the unknown weight difference of the false coins. It means that iX(i) = 1, since the true coins alone would produce (1 + 2 + 3 + 4 + 5)·2 = 30. If the false coins were in the first set, the weight difference would be X(1) = 1/1 = 1; if they were in the second set then X(2) = 1/2; and so on, X(3) = 1/3, X(4) = 1/4, X(5) = 1/5. The false coins can be in any of the five sets. Perform one more measurement with φ_i(1) = a_i = i² coins from each set. The total measured weight is now M = 113. It is obtained as M = 2(1² + 2² + 3² + 4² + 5²) + i²X(i) = 113. Obviously i²X(i) = 3. Again, if the false coins were in the first set then X(1) = 3/1; the second set would produce X(2) = 3/2² = 3/4; and so on, X(3) = 3/3² = 1/3, X(4) = 3/16, X(5) = 3/25. The solution satisfying both equations is X(3) = 1/3. Thus, the false coins are in the third set. Their weight is 2 + 1/3 = 7/3. Note that we would not be able to solve the problem with two measurements if we got two values X(i) and X(k), for i ≠ k, satisfying both equations.
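The two weighings can be verified numerically. The sketch below uses exact rational arithmetic and reproduces the conclusion that the false coins are in the third set, with the weight difference X(3) = 1/3:

```python
from fractions import Fraction

N, m = 5, 2
M1, M2 = 31, 113  # measured totals with a_i = i and a_i = i^2 coins

d1 = M1 - m * sum(i for i in range(1, N + 1))      # i   * X(i) = 1
d2 = M2 - m * sum(i * i for i in range(1, N + 1))  # i^2 * X(i) = 3

# Candidate weight differences implied by each measurement separately
c1 = {i: Fraction(d1, i) for i in range(1, N + 1)}
c2 = {i: Fraction(d2, i * i) for i in range(1, N + 1)}

solutions = [i for i in range(1, N + 1) if c1[i] == c2[i]]
print(solutions, c1[solutions[0]])  # [3] 1/3: false coins in set 3
```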
In a matrix form the measurements are y = AX, i.e.,
$$\begin{bmatrix} y(0)\\ y(1)\end{bmatrix}=\begin{bmatrix} \varphi_0(0) & \varphi_1(0) & \dots & \varphi_{N-1}(0)\\ \varphi_0(1) & \varphi_1(1) & \dots & \varphi_{N-1}(1)\end{bmatrix}\begin{bmatrix} X(0)\\ X(1)\\ \vdots\\ X(N-1)\end{bmatrix}.$$
Assume that two different single-nonzero solutions, X(i) and X(k), could explain the same measurements,
$$\varphi_k(0)X(k) = y(0), \quad \varphi_k(1)X(k) = y(1).$$
Then
$$\varphi_i(0)X(i) = \varphi_k(0)X(k)$$
and
$$\varphi_i(1)X(i) = \varphi_k(1)X(k).$$
If we divide these two equations we get
$$\varphi_i(0)/\varphi_i(1) = \varphi_k(0)/\varphi_k(1),$$
or φ_i(0)φ_k(1) − φ_i(1)φ_k(0) = 0. This is contrary to the assumption that
$$\varphi_i(0)\varphi_k(1) - \varphi_i(1)\varphi_k(0) \neq 0.$$
The same conclusion can be made considering the matrix form of the relations for X(i) and X(k). If both of them may satisfy the same two measurements, then
$$\begin{bmatrix} y(0)\\ y(1)\end{bmatrix}=\begin{bmatrix} \varphi_i(0) & \varphi_k(0)\\ \varphi_i(1) & \varphi_k(1)\end{bmatrix}\begin{bmatrix} X(i)\\ 0\end{bmatrix}=\begin{bmatrix} \varphi_i(0) & \varphi_k(0)\\ \varphi_i(1) & \varphi_k(1)\end{bmatrix}\begin{bmatrix} 0\\ X(k)\end{bmatrix}. \quad (10.2)$$
Subtracting these two forms we get
$$\begin{bmatrix} \varphi_i(0) & \varphi_k(0)\\ \varphi_i(1) & \varphi_k(1)\end{bmatrix}\begin{bmatrix} X(i)\\ -X(k)\end{bmatrix}=0,$$
which can have a nonzero solution for X(i) and X(k) only if
$$\det\begin{bmatrix} \varphi_i(0) & \varphi_k(0)\\ \varphi_i(1) & \varphi_k(1)\end{bmatrix}=0.$$
Therefore the solution is unique if
$$\det\begin{bmatrix} \varphi_i(0) & \varphi_k(0)\\ \varphi_i(1) & \varphi_k(1)\end{bmatrix}\neq 0$$
for any i ≠ k. It also means that rank(A₂) = 2 for any A₂ being a 2 × 2 submatrix of the matrix of coefficients (measurement matrix) A. For additional illustration of this simple problem see Section 10.5.2.
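For a randomly chosen measurement matrix this condition holds with probability 1. A small check, with a hypothetical 2 × N Gaussian matrix:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
N = 8
A = rng.standard_normal((2, N))  # two measurements, N unknowns

# Uniqueness for 1-sparse solutions: every 2x2 submatrix must be nonsingular.
ok = all(abs(np.linalg.det(A[:, [i, k]])) > 1e-12
         for i, k in combinations(range(N), 2))
print(ok)  # True (with probability 1 for a Gaussian random matrix)
```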
In numerical and practical applications we would not be satisfied if, for example, φ_i(0)φ_k(1) − φ_i(1)φ_k(0) ≠ 0 but φ_i(0)φ_k(1) − φ_i(1)φ_k(0) is close to zero. In this case the theoretical condition for a unique solution would be satisfied; however, the analysis and possible inversion would be highly sensitive to any kind of noise, including quantization noise. Thus, a practical requirement is that the determinant is not just different from zero, but that it differs from zero sufficiently, so that inversion stability and robustness to noise are achieved. The inversion stability for a matrix B is commonly described by the condition number of the matrix,
$$\mathrm{cond}\{\mathbf{B}\}=\frac{\lambda_{\max}}{\lambda_{\min}},$$
where λ_max and λ_min are the largest and the smallest eigenvalues of the matrix B (when BᴴB = BBᴴ).¹ The inversion stability worsens as λ_min approaches zero (when λ_min is small as compared to λ_max). For stable and robust
1 The value of the determinant of a matrix B is equal to the product of its eigenvalues, det{B} = λ₁λ₂···λ_N, where N is the order of the square matrix B. Note that the condition number can be interpreted as a ratio of the norms-two (square roots of the energies) of the noise and the signal before and after the inversion, y + Δy = B⁻¹(x + ε). This number is always greater than or equal to 1. The best value of this ratio is achieved when λ_min is close to λ_max.
calculations a requirement
$$\frac{\lambda_{\max}}{\lambda_{\min}}\le 1+\rho$$
is imposed, with a nonnegative constant ρ being sufficiently small. In our example this condition should hold for any submatrix A₂ = B.
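A brief numerical illustration (the two matrices are hypothetical): both determinants are nonzero, but only the first matrix gives a stable inversion:

```python
import numpy as np

def cond_number(B):
    """Condition number as the ratio of extreme singular values
    (equal to lambda_max/lambda_min for the symmetric matrices used here)."""
    s = np.linalg.svd(B, compute_uv=False)
    return s.max() / s.min()

well = np.array([[1.0, 0.0], [0.0, 0.9]])
ill = np.array([[1.0, 1.0], [1.0, 1.0001]])

print(cond_number(well))  # about 1.11: a stable inversion
print(cond_number(ill))   # about 4e4: nonzero determinant, fragile inversion
```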
The previous experiment can be repeated assuming two nonzero values, X(i) and X(k), Fig.10.1(right). In the case of two nonzero coefficients, two measurements
$$y(0)=\sum_{l=0}^{N-1}X(l)\varphi_l(0)=X(i)\varphi_i(0)+X(k)\varphi_k(0)$$
$$y(1)=\sum_{l=0}^{N-1}X(l)\varphi_l(1)=X(i)\varphi_i(1)+X(k)\varphi_k(1) \quad (10.3)$$
will result in values X(i) and X(k) for any assumed pair of positions i and k. They are the solution of a system of two equations with two unknowns. Therefore, with two measurements we cannot solve the problem and find the positions and the values of the nonzero coefficients. If two more measurements are performed, then an additional system of two equations,
$$y(2)=X(i)\varphi_i(2)+X(k)\varphi_k(2)$$
$$y(3)=X(i)\varphi_i(3)+X(k)\varphi_k(3), \quad (10.4)$$
is formed. The two systems of two equations, (10.3) and (10.4), could be solved for X(i) and X(k) for each combination of i and k. If these two systems produce only one common solution pair X(i) and X(k), then this pair is the solution of our problem. As in the case of one nonzero coefficient, we may show that the sufficient condition for a unique solution is
$$\det\begin{bmatrix} \varphi_{k_1}(0) & \varphi_{k_2}(0) & \varphi_{k_3}(0) & \varphi_{k_4}(0)\\ \varphi_{k_1}(1) & \varphi_{k_2}(1) & \varphi_{k_3}(1) & \varphi_{k_4}(1)\\ \varphi_{k_1}(2) & \varphi_{k_2}(2) & \varphi_{k_3}(2) & \varphi_{k_4}(2)\\ \varphi_{k_1}(3) & \varphi_{k_2}(3) & \varphi_{k_3}(3) & \varphi_{k_4}(3)\end{bmatrix}\neq 0 \quad (10.5)$$
for any combination of four indices k₁, k₂, k₃, k₄.
and
$$\begin{bmatrix} y(0)\\ y(1)\\ y(2)\\ y(3)\end{bmatrix}=\begin{bmatrix} \varphi_{k_1}(0) & \varphi_{k_2}(0) & \varphi_{k_3}(0) & \varphi_{k_4}(0)\\ \varphi_{k_1}(1) & \varphi_{k_2}(1) & \varphi_{k_3}(1) & \varphi_{k_4}(1)\\ \varphi_{k_1}(2) & \varphi_{k_2}(2) & \varphi_{k_3}(2) & \varphi_{k_4}(2)\\ \varphi_{k_1}(3) & \varphi_{k_2}(3) & \varphi_{k_3}(3) & \varphi_{k_4}(3)\end{bmatrix}\begin{bmatrix} X(k_1)\\ X(k_2)\\ 0\\ 0\end{bmatrix}=\begin{bmatrix} \varphi_{k_1}(0) & \varphi_{k_2}(0) & \varphi_{k_3}(0) & \varphi_{k_4}(0)\\ \varphi_{k_1}(1) & \varphi_{k_2}(1) & \varphi_{k_3}(1) & \varphi_{k_4}(1)\\ \varphi_{k_1}(2) & \varphi_{k_2}(2) & \varphi_{k_3}(2) & \varphi_{k_4}(2)\\ \varphi_{k_1}(3) & \varphi_{k_2}(3) & \varphi_{k_3}(3) & \varphi_{k_4}(3)\end{bmatrix}\begin{bmatrix} 0\\ 0\\ X(k_3)\\ X(k_4)\end{bmatrix}.$$
Subtracting these two forms we get
$$0=\begin{bmatrix} \varphi_{k_1}(0) & \varphi_{k_2}(0) & \varphi_{k_3}(0) & \varphi_{k_4}(0)\\ \varphi_{k_1}(1) & \varphi_{k_2}(1) & \varphi_{k_3}(1) & \varphi_{k_4}(1)\\ \varphi_{k_1}(2) & \varphi_{k_2}(2) & \varphi_{k_3}(2) & \varphi_{k_4}(2)\\ \varphi_{k_1}(3) & \varphi_{k_2}(3) & \varphi_{k_3}(3) & \varphi_{k_4}(3)\end{bmatrix}\begin{bmatrix} X(k_1)\\ X(k_2)\\ -X(k_3)\\ -X(k_4)\end{bmatrix}.$$
Since (10.5) holds, it follows that X(k₁) = X(k₂) = X(k₃) = X(k₄) = 0, meaning that the assumption of two independent pairs of solutions with two nonzero coefficients is not possible.
This approach to solving the problem (and to checking the solution uniqueness) is illustrative; however, it is not computationally feasible. For example, for a simple case with N = 1024 and just two nonzero coefficients, in order to find a solution we have to solve the two systems of equations, (10.3) and (10.4), for each possible combination of i and k and to compare their solutions. The total number of combinations of two indices out of the total number of N indices is
$$\binom{N}{2}\approx 0.5\cdot 10^{6}.$$
Figure 10.19 Minimization using the ℓ₁-norm and the illustration of the solution for the case when the measurement line corresponds to noisy data.
10.6
Then
$$X(1) = X(0) - b_1$$
$$X(2) = X(0) - b_2$$
$$\dots$$
$$X(N-1) = X(0) - b_{N-1} \quad (10.71)$$
and the function to be minimized is
$$z=\left\Vert \mathbf{X}\right\Vert _{1}=|X(0)|+|X(0)-b_{1}|+\dots+|X(0)-b_{N-1}|. \quad (10.72)$$
If the signal sparsity is K < N/2, then there will exist more than N/2 values b_i = b such that |X(0) − b_i| = 0 for X(0) = b. The solution of the minimization problem then will not depend on the other b_k ≠ b and will be unique
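This median property is easy to check numerically. In the sketch below the values b_i are hypothetical, with the value b = 0.1 appearing more than N/2 times, as in the sparsity argument above:

```python
import numpy as np

# Hypothetical values b_i; b = 0.1 appears 4 times out of N = 7 (more than N/2)
b = np.array([0.1, 0.1, 0.1, 0.1, -0.8, 0.3, 0.9])

grid = np.linspace(-1, 1, 20001)
z = np.abs(grid[:, None] - b[None, :]).sum(axis=1)  # z(x) = sum_i |x - b_i|
x_min = grid[np.argmin(z)]

print(x_min, np.median(b))  # both equal 0.1 (up to the grid step):
                            # the l1 minimizer is the median
```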
Figure 10.20 The function z = |x − x₁| + ... + |x − x₇| for N = 7: the slope dz/dx changes by 2 when x crosses each value x_i, taking values −N = −7, −5, −3, −1, 1, 3, 5, up to dz/dx = N = 7, so that arg{min{z}} = median{x₁, x₂, x₃, x₄, x₅, x₆, x₇}.
$$\sum_{i=P}^{N-1}|a_{i}|>\sum_{i=0}^{P-1}|a_{i}|,$$
with the coefficients sorted so that |a₀| ≥ |a₁| ≥ ... ≥ |a_{N−1}|. For P = 1 we get
$$\frac{|a_{0}|+|a_{1}|+|a_{2}|-\max\{|a_{0}|,|a_{1}|,|a_{2}|\}}{\max\{|a_{0}|,|a_{1}|,|a_{2}|\}}>1.$$
This relation corresponds to the thick line for the ℓ₁-norm in Fig.10.16(top), with |a₀| = |p_{X(0)}| = max{|a₀|, |a₁|, |a₂|} and
$$\frac{\left|p_{X(1)}\right|+\left|p_{X(2)}\right|}{\left|p_{X(0)}\right|}>1.$$
Consider next the case when two degrees of freedom exist (with M = N − 2 measurements). All coefficients X(k) can be expressed as functions of, for example, X(0) and X(1) as
$$X(2) = a_{2,0}X(0) + a_{2,1}X(1) - b_2,$$
$$\dots$$
$$X(N-1) = a_{N-1,0}X(0) + a_{N-1,1}X(1) - b_{N-1}.$$
Then
$$z=\left\Vert \mathbf{X}\right\Vert _{1}=|X(0)|+|X(1)|+|a_{2,0}X(0)+a_{2,1}X(1)-b_{2}|+\dots+|a_{N-1,0}X(0)+a_{N-1,1}X(1)-b_{N-1}|.$$
The solution of the minimization problem is a two-dimensional median. It is the point in the (X(0), X(1)) plane such that the sum of absolute distances from the
lines
$$X(0) = 0$$
$$X(1) = 0$$
$$a_{2,0}X(0) + a_{2,1}X(1) - b_2 = 0$$
$$\dots$$
$$a_{N-1,0}X(0) + a_{N-1,1}X(1) - b_{N-1} = 0$$
is minimal.³ The median here is not as simple as in the one-dimensional case. Various algorithms have been proposed for the multidimensional (multivariate or spatial) median forms. Note that by crossing a line a_{i,0}X(0) + a_{i,1}X(1) − b_i = 0 we will always either increase or reduce the rate of change of the function z, as in the one-dimensional case.
An illustration of a signal with N = 6 is presented in Fig.10.21. The value of z is presented, along with the measurement lines, for the case of two degrees of freedom (a two-dimensional variable space). From this figure we can see that the number of measurements is M = 4 and the sparsity of the signal is K = 2, since the distance of the function z minimum point from four planes is 0. There are two nonzero distances (to the thick black lines), meaning that there are two nonzero coefficients X(k). It is interesting that in this case the marginal median (minimization along the axes X(0) and X(1) independently) would produce the same result, since one of the zero values is on the axis.
For any value of the variables at least two X(k) will be equal to zero, since at least two of the terms in z are zero. It means that, in general, the solution will be of sparsity K = N − 2 at least.
In the case of M measurements the system
$$\mathbf{AX}=\mathbf{y}$$
contains M equations with N unknowns. It means that there are N − M free variables, while the remaining M unknowns can be calculated based on the free variables. Let us
3 With the line coefficients normalized so that a² + b² = 1, the distance of a point (x₀, y₀) from the line ax + by + c = 0 is |ax₀ + by₀ + c|/√(a² + b²) = |ax₀ + by₀ + c|.
Figure 10.21 The function z(x, y) for N = 6 and M = 4 measurement lines; the position of the minimum is the two-dimensional median, median{z(x, y)} = (1, 0).
write the available samples as linear combinations of the transformation coefficients, with the coefficients split into two groups:
$$\begin{bmatrix} x(n_1)\\ x(n_2)\\ \vdots\\ x(n_M)\end{bmatrix}=\begin{bmatrix} \varphi_0(n_1) & \varphi_1(n_1) & \dots & \varphi_{N-M-1}(n_1)\\ \varphi_0(n_2) & \varphi_1(n_2) & \dots & \varphi_{N-M-1}(n_2)\\ \vdots & & & \vdots\\ \varphi_0(n_M) & \varphi_1(n_M) & \dots & \varphi_{N-M-1}(n_M)\end{bmatrix}\begin{bmatrix} X(0)\\ X(1)\\ \vdots\\ X(N-M-1)\end{bmatrix}+\begin{bmatrix} \varphi_{N-M}(n_1) & \varphi_{N-M+1}(n_1) & \dots & \varphi_{N-1}(n_1)\\ \varphi_{N-M}(n_2) & \varphi_{N-M+1}(n_2) & \dots & \varphi_{N-1}(n_2)\\ \vdots & & & \vdots\\ \varphi_{N-M}(n_M) & \varphi_{N-M+1}(n_M) & \dots & \varphi_{N-1}(n_M)\end{bmatrix}\begin{bmatrix} X(N-M)\\ X(N-M+1)\\ \vdots\\ X(N-1)\end{bmatrix}$$
or, in a compact notation,
$$\mathbf{y}=\mathbf{B}_{N-M}\mathbf{X}_{0,N-M-1}+\mathbf{C}_{M}\mathbf{X}_{N-M,N-1}.$$
The dependent variables then follow as
$$\mathbf{X}_{N-M,N-1}=\mathbf{C}_{M}^{-1}\mathbf{y}-\mathbf{C}_{M}^{-1}\mathbf{B}_{N-M}\mathbf{X}_{0,N-M-1},$$
so that the minimization of ‖X‖₁ reduces to the minimization of
$$\left\Vert \mathbf{X}_{0,N-M-1}\right\Vert _{1}+\left\Vert \mathbf{C}_{M}^{-1}\mathbf{y}-\mathbf{C}_{M}^{-1}\mathbf{B}_{N-M}\mathbf{X}_{0,N-M-1}\right\Vert _{1}$$
over the N − M free variables.
10.7
The reconstruction of a sparse signal can be formulated as the minimization of ‖X‖₁ subject to y = AX, or as the minimization of the error
$$\left\Vert \mathbf{y}-\mathbf{AX}\right\Vert _{2}^{2}=\left\Vert \mathbf{y}\right\Vert _{2}^{2}-\mathbf{X}^{T}\mathbf{A}^{T}\mathbf{y}-\mathbf{y}^{T}\mathbf{AX}+\mathbf{X}^{T}\mathbf{A}^{T}\mathbf{AX}$$
subject to ‖X‖₁ < ε or, in the classical ridge regression, subject to the minimal energy of X, i.e., subject to ‖X‖₂². The minimization of the ridge constraint problem can be reformulated in a Lagrangian form, using a parameter λ, as
$$\hat{\mathbf{X}}=\arg\min_{\mathbf{X}}\left\{ \left\Vert \mathbf{y}-\mathbf{AX}\right\Vert _{2}^{2}+\lambda\left\Vert \mathbf{X}\right\Vert _{2}^{2}\right\}.$$
The minimization of
$$F(\mathbf{X})=\left\Vert \mathbf{y}\right\Vert _{2}^{2}-\mathbf{X}^{T}\mathbf{A}^{T}\mathbf{y}-\mathbf{y}^{T}\mathbf{AX}+\mathbf{X}^{T}\mathbf{A}^{T}\mathbf{AX}+\lambda\mathbf{X}^{T}\mathbf{X}$$
can be obtained in a closed form using the symbolic derivative operator,
$$\frac{\partial F(\mathbf{X})}{\partial\mathbf{X}^{T}}=-2\mathbf{A}^{T}\mathbf{y}+2\mathbf{A}^{T}\mathbf{AX}+2\lambda\mathbf{X}=0,$$
as
$$\mathbf{X}_{\mathrm{ridge}}=\left(\mathbf{A}^{T}\mathbf{A}+\lambda\mathbf{I}\right)^{-1}\mathbf{A}^{T}\mathbf{y}.$$
The parameter λ balances the error and the constraint. Its inclusion makes the inversion nonsingular even if AᵀA is singular. A real-valued matrix A is assumed; otherwise the Hermitian conjugate and transpose Aᴴ would be used.
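A minimal sketch of the closed-form ridge solution (the matrix A and the sparse signal are hypothetical, chosen to match the example used later in the text). Here AᵀA is singular, a 60 × 60 product of rank at most 40, yet the regularized inversion is well defined:

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 40, 60
A = rng.standard_normal((M, N)) / np.sqrt(M)  # hypothetical measurement matrix

X_true = np.zeros(N)
X_true[[5, 12, 31, 45]] = [1.0, 0.5, 0.9, -0.75]  # K = 4 sparse signal
y = A @ X_true

lam = 0.1
# Closed-form ridge solution X_ridge = (A^T A + lam*I)^{-1} A^T y.
X_ridge = np.linalg.solve(A.T @ A + lam * np.eye(N), A.T @ y)

# The energy is minimized, not the sparsity: the solution is not sparse.
print(int((np.abs(X_ridge) > 1e-6).sum()))  # far more than 4 nonzero values
```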
The standard ridge regression minimizes the energy of the solution X(k) and not its sparsity, Fig.10.22. That is the reason why the ℓ₁-norm constraint is introduced in the cost function
$$F(\mathbf{X})=\left\Vert \mathbf{y}-\mathbf{AX}\right\Vert _{2}^{2}+\lambda\left\Vert \mathbf{X}\right\Vert _{1}.$$
Figure 10.22 Minimization with a constraint: the ℓ₂-norm |X(0)|² + |X(1)|² in ridge regression (left), the ℓ₁-norm |X(0)| + |X(1)| in LASSO regression (middle), and the ℓ_{1/4}-norm |X(0)|^{1/4} + |X(1)|^{1/4}, a function closer to the ℓ₀-norm (right).
The function ‖X‖₁ promotes sparsity. It produces the same results (under certain conditions) as if ‖X‖_p, with p close to 0, were used, Fig.10.22.
10.7.1.1 Iterative Calculation
The minimization problem with the ℓ₁-norm constraint does not have a closed-form solution. It is solved in iterative ways. In order to define an iterative procedure, we will add a nonnegative term, having zero value at the solution X_s of the problem,
$$G(\mathbf{X})=(\mathbf{X}-\mathbf{X}_{s})^{T}(\alpha\mathbf{I}-\mathbf{A}^{T}\mathbf{A})(\mathbf{X}-\mathbf{X}_{s}),$$
to the function F(X). This term will not change the minimization solution. The new function is
$$H(\mathbf{X})=F(\mathbf{X})+(\mathbf{X}-\mathbf{X}_{s})^{T}(\alpha\mathbf{I}-\mathbf{A}^{T}\mathbf{A})(\mathbf{X}-\mathbf{X}_{s}),$$
where α is such that the added term is always nonnegative. It means α > λ_max, where λ_max is the largest eigenvalue of AᵀA. The gradient of H(X) is
$$\nabla H(\mathbf{X})=\frac{\partial H(\mathbf{X})}{\partial\mathbf{X}^{T}}=-2\mathbf{A}^{T}\mathbf{y}+2\mathbf{A}^{T}\mathbf{AX}+\lambda\,\mathrm{sign}\{\mathbf{X}\}+2(\alpha\mathbf{I}-\mathbf{A}^{T}\mathbf{A})(\mathbf{X}-\mathbf{X}_{s}).$$
The solution of ∇H(X) = 0 satisfies
$$-\mathbf{A}^{T}\mathbf{y}+\frac{\lambda}{2}\,\mathrm{sign}\{\mathbf{X}\}-(\alpha\mathbf{I}-\mathbf{A}^{T}\mathbf{A})\mathbf{X}_{s}+\alpha\mathbf{X}=0,$$
or
$$\mathbf{X}+\frac{\lambda}{2\alpha}\,\mathrm{sign}\{\mathbf{X}\}=\frac{1}{\alpha}\mathbf{A}^{T}(\mathbf{y}-\mathbf{AX}_{s})+\mathbf{X}_{s}.$$
The iterative procedure follows with the next iteration X_{s+1} on the left-hand side,
$$\mathbf{X}_{s+1}+\frac{\lambda}{2\alpha}\,\mathrm{sign}\{\mathbf{X}_{s+1}\}=\frac{1}{\alpha}\mathbf{A}^{T}(\mathbf{y}-\mathbf{AX}_{s})+\mathbf{X}_{s}.$$
The solution of a scalar equation of the form x + T sign{x} = y is the soft-thresholding rule
$$x=\mathrm{soft}(y,T)=\begin{cases} y+T & \text{for } y<-T\\ 0 & \text{for } |y|\le T\\ y-T & \text{for } y>T, \end{cases}$$
so that the iteration can be written as
$$\mathbf{X}_{s+1}=\mathrm{soft}\left(\frac{1}{\alpha}\mathbf{A}^{T}(\mathbf{y}-\mathbf{AX}_{s})+\mathbf{X}_{s},\ \frac{\lambda}{2\alpha}\right). \quad (10.73)$$
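The iteration (10.73) can be implemented in a few lines. In this sketch the measurement matrix is a hypothetical Gaussian matrix, normalized as in the example below, and the support is detected by thresholding the result; it typically recovers the four nonzero positions:

```python
import numpy as np

rng = np.random.default_rng(2)
M, N = 40, 60
A = rng.standard_normal((M, N)) / np.sqrt(M)  # hypothetical N(0, 1/40) matrix

X_true = np.zeros(N)
X_true[[5, 12, 31, 45]] = [1.0, 0.5, 0.9, -0.75]
y = A @ X_true

def soft(u, T):
    """Soft-thresholding applied elementwise."""
    return np.sign(u) * np.maximum(np.abs(u) - T, 0.0)

lam = 0.01
alpha = 2 * np.max(np.linalg.eigvalsh(A.T @ A))  # alpha > lambda_max of A^T A

X = np.zeros(N)
for _ in range(1000):  # iteration (10.73)
    X = soft(A.T @ (y - A @ X) / alpha + X, lam / (2 * alpha))

print(np.flatnonzero(np.abs(X) > 0.25))  # the recovered support
```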
of the size 40 × 60. Since there are 40 measurements, the random variable N(0, σ²) with σ² = 1/40 is used. The original sparse signal, of total length N = 60, is X(k) = δ(k − 5) + 0.5δ(k − 12) + 0.9δ(k − 31) − 0.75δ(k − 45) in the transformation domain. It is measured with the matrix A, and the 40 measurements are stored in the vector y. All 60 signal values are reconstructed using these 40 measurements y and the matrix A, in 1000 iterations. In
Figure 10.23 A sparse signal with N = 60 and K = 4 reconstructed using a reduced set of M = 40 observations and the LASSO iterative algorithm. The results for λ = 0.01 and λ = 0.0001 are presented.
the initial iteration X₀ = 0 is used. Then, for each next s, the new values of X are calculated using (10.73), given the data y and the matrix A. The value α = 2 max{eig{AᵀA}} is used. The results for λ = 0.01 and λ = 0.0001 are presented in Fig.10.23. For the very small λ = 0.0001 the result is not sparse, since the constraint is too weak.
In the adaptive gradient-based approach the missing sample values, denoted by y_c, are changed iteratively in the direction opposite to the gradient of the sparsity measure,
$$\mathbf{y}_{c}^{(m+1)}=\mathbf{y}_{c}^{(m)}-\mu\left.\frac{\partial\left\Vert \mathbf{X}_{a}\right\Vert _{1}}{\partial\mathbf{y}_{c}}\right|_{\mathbf{y}_{c}=\mathbf{y}_{c}^{(m)}}, \quad (10.74)$$
starting from zero initial values of the missing samples, x_a^{(0)}(n) = 0 for n ∈ N_Q and x_a^{(0)}(n) = x(n) otherwise, where N_Q is the set of missing sample positions. The available samples are considered as constants, while the missing samples are changed through the iterations, with X_a(k) = DFT[x_a(n)]. Since the task is to find the position of the minimum of the function z = ‖X_a‖₁ through an iterative procedure, the relation for
Figure 10.24 The sparsity measure ‖X_a‖₁ in the case of two unavailable signal samples y_c = (x(n_{N−1}), x(n_N)), with the corresponding gradient. The available samples are y = (x(n₁), x(n₂), ..., x(n_{N−2})).
the gradient estimate is needed. It is calculated as the finite difference of the sparsity measure,
$$g=\frac{\left\Vert \mathbf{X}_{a}^{+}\right\Vert _{1}-\left\Vert \mathbf{X}_{a}^{-}\right\Vert _{1}}{2N\Delta},$$
where
$$X_{a}^{+}(k)=\mathrm{T}\{x_{a}^{+}(n)\}, \quad X_{a}^{-}(k)=\mathrm{T}\{x_{a}^{-}(n)\},$$
and
$$x_{a}^{+}(n)=x_{a}^{(m)}(n)+\Delta\delta(n-n_{i})$$
$$x_{a}^{-}(n)=x_{a}^{(m)}(n)-\Delta\delta(n-n_{i}).$$
For n_i ∈ M there are no changes of the signal values, g(n_i) = 0. The parameter for the finite difference calculation is denoted by Δ. All g(n) values form a vector denoted by G_m, with elements G_m(n). The minimum of the sparsity measure is obtained when all unavailable samples are equal to the original signal values, i.e., when the signal is reconstructed (assuming that the recovery conditions are satisfied).
10.7.2.1 Finite Difference Step
Before presenting the algorithm, the basic idea and the parameters in (10.74) will be discussed. Assume first a simple case when a single signal sample at n₀ ∈ N_Q is not available, with card{M} = N − 1. This sample is considered as a variable. It may assume an arbitrary signal value x_a(n₀) = x(n₀) + z(n₀), where z(n₀) is a variable representing the shift from the true signal value at n₀. In order to estimate the finite difference of the sparsity measure
$$\left\Vert \mathbf{X}_{a}\right\Vert _{1}=\sum_{k=0}^{N-1}\left|X_{a}(k)\right|,$$
the signals
$$x_{a}^{\pm}(n)=x(n)+(z(n_{0})\pm\Delta)\,\delta(n-n_{0})$$
are formed, where Δ is a parameter. The finite difference of the sparsity measure is
$$g(n_{0})=\frac{\left\Vert \mathbf{X}_{a}^{+}\right\Vert _{1}-\left\Vert \mathbf{X}_{a}^{-}\right\Vert _{1}}{2N\Delta}.$$
The pulses (z(n₀) ± Δ)δ(n − n₀) are uniformly spread over all frequencies in the DFT domain. Then
$$X_{a}^{+}(k)=X(k)+(z(n_{0})+\Delta)e^{-j2\pi n_{0}k/N}$$
$$X_{a}^{-}(k)=X(k)+(z(n_{0})-\Delta)e^{-j2\pi n_{0}k/N}$$
holds. Since the signal is sparse (K ≪ N), in a rough analysis we may neglect the changes in the few nonzero values of X(k). We may approximately write
$$\left\Vert \mathbf{X}_{a}^{+}\right\Vert _{1}=\sum_{k=0}^{N-1}\left|X_{a}^{+}(k)\right|\cong C+\left|z(n_{0})+\Delta\right|N$$
$$\left\Vert \mathbf{X}_{a}^{-}\right\Vert _{1}=\sum_{k=0}^{N-1}\left|X_{a}^{-}(k)\right|\cong C+\left|z(n_{0})-\Delta\right|N,$$
where C = ‖X‖₁ is the sparsity measure of the original signal x(n). Therefore the gradient approximation of the sparsity measure ‖X_a‖₁ along the direction of the variable z(n₀) is
$$g(n_{0})=\frac{\left\Vert \mathbf{X}_{a}^{+}\right\Vert _{1}-\left\Vert \mathbf{X}_{a}^{-}\right\Vert _{1}}{2N\Delta}=\frac{\left|z(n_{0})+\Delta\right|-\left|z(n_{0})-\Delta\right|}{2\Delta}.$$
For deviations from the true signal value smaller than the step, |z(n₀)| < Δ, we get
$$g(n_{0})=\frac{2z(n_{0})}{2\Delta}=\frac{z(n_{0})}{\Delta}. \quad (10.75)$$
It means that the gradient value can be used as an indicator of the signal value deviation from the correct value (this property will later be used for the detection of impulsive noise in signal samples as well). For a large deviation, |z(n₀)| > Δ,
$$g(n_{0})=\mathrm{sign}(z(n_{0})). \quad (10.76)$$
In that case the gradient assumes the correct direction toward the minimum position, with a deviation-independent intensity.
In order to analyze the influence of Δ on the solution precision when z(n₀) is very small, assume that we have obtained the exact solution and that the change of the sparsity measure is tested on the change of the sample x(n₀) by ±Δ. Then, for a signal x(n) = Σ_{i=1}^{K} A_i e^{j2πnk_i/N} of sparsity K, the DFTs of
x_a⁺(n) = x(n) + Δδ(n − n₀) and x_a⁻(n) = x(n) − Δδ(n − n₀) are such that
$$\left\Vert \mathbf{X}_{a}^{+}\right\Vert _{1}=\sum_{i=1}^{K}\left|A_{i}+\Delta e^{-j2\pi n_{0}k_{i}/N}\right|+\Delta(N-K)$$
$$\left\Vert \mathbf{X}_{a}^{-}\right\Vert _{1}=\sum_{i=1}^{K}\left|A_{i}-\Delta e^{-j2\pi n_{0}k_{i}/N}\right|+\Delta(N-K).$$
For the worst-case analysis, assume that the A_i are in phase with Δe^{−j2πn₀k_i/N} and that |A_i| ≫ Δ. Then
$$\left\Vert \mathbf{X}_{a}^{+}\right\Vert _{1}=\sum_{i=1}^{K}\left|A_{i}\right|+K\Delta+\Delta(N-K)=C+N\Delta$$
$$\left\Vert \mathbf{X}_{a}^{-}\right\Vert _{1}=\sum_{i=1}^{K}\left|A_{i}\right|-K\Delta+\Delta(N-K)=C+(N-2K)\Delta.$$
The finite difference of the measure is then zero at a biased stationary position b, satisfying
$$N(\Delta-b)=(N-2K)(\Delta+b), \quad (10.77)$$
resulting in the bias
$$b=\frac{K\Delta}{N-K}\cong\frac{K}{N}\Delta \text{ for } K\ll N. \quad (10.78)$$
The bias upper limit can be reduced by using a very small Δ. However, the calculation with a small Δ would be time consuming (with many iterations). An efficient implementation can be achieved by using Δ of the order of the signal amplitude in the initial iteration. When the algorithm reaches a stationary point with a given Δ, the value of the mean squared error assumes an almost constant value. The error will then only change the gradient direction around the correct point, by almost π. This fact may be used as an indicator to reduce the step Δ, in order to approach the true signal value with a given precision. For example, if the signal amplitudes are of order 1 and K/N = 0.1, taking Δ = 1 in the first iteration will produce the solution with a precision better than −20 dB. Then the step should be reduced, for example to Δ = 0.1. A precision better than −40 dB would be obtained, and so on.
Through a simulation study it has been concluded that an appropriate step parameter value μ in (10.74) is related to the finite difference step Δ as μ = 2Δ.
10.7.2.2 Algorithm
The presented analysis is used as a basic idea for the algorithm summarized
as follows:
Step 0: Set m = 0 and form the initial signal estimate x_a^{(0)}(n), defined for n ∈ N, as
$$x_{a}^{(0)}(n)=\begin{cases} 0 & \text{for missing samples, } n\in \mathbf{N}_{Q}\\ x(n) & \text{for available samples, } n\in \mathbf{M}, \end{cases} \quad (10.79)$$
where N = {0, 1, ..., N − 1} and N_Q = N\M is the complement of M with respect to N. The initial value of the algorithm parameter Δ is estimated as
$$\Delta=\max_{n\in \mathbf{M}}|x(n)|. \quad (10.80)$$
Step 1: Set x_p(n) = x_a^{(m)}(n). This signal is used in Step 3 in order to estimate the reconstruction precision.
Step 2.1: Set m = m + 1. For each missing sample at n_i ∈ N_Q form the signals x_a⁺(n) and x_a⁻(n):
$$x_{a}^{+}(n)=x_{a}^{(m)}(n)+\Delta\delta(n-n_{i}) \quad (10.81)$$
$$x_{a}^{-}(n)=x_{a}^{(m)}(n)-\Delta\delta(n-n_{i}).$$
Step 2.2: Estimate the finite difference of the sparsity measure,
$$g(n_{i})=\frac{\left\Vert \mathbf{X}_{a}^{+}\right\Vert _{1}-\left\Vert \mathbf{X}_{a}^{-}\right\Vert _{1}}{2N\Delta}, \quad (10.82)$$
where X_a⁺(k) = T{x_a⁺(n)} and X_a⁻(k) = T{x_a⁻(n)} are the transforms of x_a⁺(n) and x_a⁻(n).
Step 2.3: Form a gradient vector G_m with the same length as the signal. At the positions of the available samples, n ∈ M, this vector has values G_m(n) = 0. At the positions of the missing samples, n ∈ N_Q, its values are G_m(n) = g(n), calculated by (10.82).
Step 2.4: Correct the values of the estimated signal x_a(n) iteratively by
$$x_{a}^{(m)}(n)=x_{a}^{(m-1)}(n)-\mu G_{m}(n), \quad (10.83)$$
where μ is the algorithm step, and calculate the angle between the successive gradient estimates,
$$\beta_{m}=\arccos\left(\frac{\sum_{k=0}^{N-1}G_{m-1}(k)G_{m}(k)}{\sqrt{\sum_{k=0}^{N-1}G_{m-1}^{2}(k)\sum_{k=0}^{N-1}G_{m}^{2}(k)}}\right).$$
If the angle β_m is lower than 170° and the maximal allowed number of iterations is not reached (m < m_max), go to Step 2.1.
Step 3: If the maximal allowed number of iterations is reached, stop the algorithm. Otherwise calculate
$$T_{r}=10\log_{10}\frac{\sum_{n\in \mathbf{N}_{Q}}|x_{a}^{(m)}(n)-x_{p}(n)|^{2}}{\sum_{n\in \mathbf{N}_{Q}}|x_{a}^{(m)}(n)|^{2}}.$$
The value of T_r is an estimate of the reconstruction-error to signal ratio, calculated for the missing samples only. If T_r is above the required precision threshold (for example, if T_r > −100 dB), the calculation procedure should be repeated with a smaller Δ; for example, set the new value to Δ/√10 or Δ/10 and go to Step 1.
Step 4: The reconstruction with the required precision is obtained in m iterations, or the maximal allowed number of iterations is reached. The reconstructed signal is x_R(n) = x_a^{(m)}(n).
By performing the presented iterative procedure, the missing values will converge to the true signal values, producing the minimal concentration measure in the transformation domain.
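A compact sketch of the algorithm (with illustrative assumptions: the DFT is used as the transform T, the test signal is the N = 32 example given below, and the gradient-angle test is replaced by a fixed number of iterations per Δ):

```python
import numpy as np

N = 32
n = np.arange(N)
x = (3 * np.sin(20 * np.pi * n / N) + 2 * np.cos(60 * np.pi * n / N)
     + 0.5 * np.sin(46 * np.pi * n / N))

NQ = np.array([2, 4, 5, 7, 9, 13, 17, 19, 24, 26, 28, 31])  # missing positions

def measure(xa):
    return np.abs(np.fft.fft(xa)).sum()  # sparsity measure ||X_a||_1

xa = x.copy()
xa[NQ] = 0.0                            # Step 0: missing samples set to zero
delta = np.abs(np.delete(x, NQ)).max()  # initial Delta, as in (10.80)

for _ in range(5):                      # Delta, Delta/10, ... schedule
    for _ in range(50):                 # gradient iterations at this Delta
        g = np.zeros(N)
        for ni in NQ:                   # finite difference, as in (10.82)
            xp = xa.copy(); xp[ni] += delta
            xm = xa.copy(); xm[ni] -= delta
            g[ni] = (measure(xp) - measure(xm)) / (2 * N * delta)
        xa -= 2 * delta * g             # Step 2.4 with step mu = 2*Delta
    delta /= 10

print(np.max(np.abs(xa - x)))           # small reconstruction error
```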
- The inputs to the algorithm are the signal length N, the set of available samples M, the available signal values x(n_i), n_i ∈ M, the required precision T_max, and the maximal allowed number of iterations.
- Instead of calculating the signals (10.81) and their transforms for each n_i ∈ N_Q, we can calculate
$$\left|X_{a}^{+}(k)\right|=\left|X_{a}^{(m)}(k)+\Delta D_{n_{i}}(k)\right|$$
$$\left|X_{a}^{-}(k)\right|=\left|X_{a}^{(m)}(k)-\Delta D_{n_{i}}(k)\right|,$$
where D_{n_i}(k) = DFT[δ(n − n_i)] = e^{−j2πn_i k/N} and X_a^{(m)}(k) is calculated only once for all n_i in a given iteration.
Figure 10.25 The contours of the sparsity measure as a function of two missing sample values, with the values of the missing samples through the iterations indicated by dots.
The initial values of the missing samples, x(1) = 0 and x(6) = 0, are used. The values of the missing samples in the first 20 iterations are presented by dots (connected by a line) in Fig.10.25. After about 6 iterations the algorithm with Δ = 1 does not significantly change the missing sample values (the zoomed changes are shown in the lower subplot within the figure). Close to the stationary point obtained for Δ = 1, the gradient coordinates are almost zero-valued (with direction changes of almost π), since the measures lie on a contour with almost the same measure value (circles). After the step is reduced to Δ = 0.1 in the 20th iteration, the algorithm resumes its fast approach toward the exact values, until a new stationary state. With a new change of Δ to Δ = 0.01 the approach is again continued.
The stationary-state bias for Δ = 1 is lower than (K/N)Δ = 1/4 (it corresponds to a bias-caused MSE lower than −15.5 dB). With each reduction of Δ to Δ/10, the bias-caused MSE will be lower by 20 dB. The reconstruction result and the MSE for the estimated missing values x(1) and x(6) are presented in Fig.10.26.
The calculation is repeated with the signal
$$x(n)=3\sin\left(20\pi\frac{n}{N}\right)+2\cos\left(60\pi\frac{n}{N}\right)+0.5\sin\left(46\pi\frac{n}{N}\right)$$
and N = 32. The missing samples are n ∈ N_Q = {2, 4, 5, 7, 9, 13, 17, 19, 24, 26, 28, 31}. The result for this case is shown in Fig.10.27.
Figure 10.26 The available samples, the signal reconstructed in 60 iterations, and the reconstruction MSE in [dB] as a function of the iteration number.
Figure 10.27 The available samples, the reconstructed signal, and the reconstruction MSE in [dB] as a function of the iteration number, for the signal with N = 32.
The statistical analysis is performed on realizations of sparse signals of the form
$$x(t)=\sum_{i=1}^{K/2}A_{i}\cos(2\pi tk_{i}/T+\varphi_{i}), \quad (10.84)$$
with the performance measured by the signal-to-reconstruction-error ratio (SRR),
$$\mathrm{SRR}=10\log_{10}\frac{\sum_{n=0}^{N-1}|x(n)|^{2}}{\sum_{n=0}^{N-1}|x(n)-x_{R}(n)|^{2}}. \quad (10.85)$$
Bright colors indicate the region where the algorithm fully recovered the missing samples in all realizations, while dark colors indicate the region where the algorithm could not recover the missing samples in any realization. In the transition region, for M slightly greater than 2K, there are both cases in which the signal recovery is not achieved and cases of full signal recovery. The
simulations are done for N = 128 and for N = 64, Fig.10.28(a),(b). A stopping criterion for an accuracy of 120 dB is used. It corresponds to the precision of input samples acquired by a 20-bit A/D converter. The case with N = 64 is repeated with an additive input Gaussian noise such that the input signal-to-noise ratio is 20 dB in each realization, Fig.10.28(c). The reconstruction error in this case is limited by the input signal-to-noise value. The number of iterations needed to achieve the required precision is presented in Fig.10.28(d). We can see that the number of iterations is well below 100 for the most important region, where the reconstruction was achieved in all realizations (high values of M and small values of K, M ≫ K). The number of iterations is quite small in the region where the reconstruction can be achieved.
An illustration of the algorithm performance with respect to the SRR and the gradient angle β_m in one realization, with K = 6, is presented in Fig.10.29. The algorithm reached the 120 dB accuracy in 47 iterations. From the gradient angle graph we see that the algorithm step is reduced to Δ/√10 about every 4 iterations. According to (10.77), the expected MSE improvement with each reduction of Δ is 20 log(√10) = 10 dB.
Figure 10.28 Signal-to-reconstruction-error (SRR) averaged over 100 realizations for various
sparsity K and number of available samples M: (a) The total number of samples is N = 128. (b)
The total number of samples is N = 64. (c) With a Gaussian noise in the input signal, SNR = 20
[dB] and N = 64. (d) Number of iterations to reach the solution with the defined precision.
10.8
Figure 10.29 Angle between successive gradient estimates β_m and the signal-to-reconstruction-error ratio (SRR) as a function of the number of iterations of the algorithm, for
one signal realization with 6 nonzero DFT coefficients and M = 64.
In the adaptive gradient-based method the missing samples (measurements)
are considered as the minimization variables. The available sample values
are known and fixed. The number of variables in the minimization process is
equal to the number of missing samples/measurements in the observation
domain. This approach is possible when the common signal transforms are the
domains of signal sparsity. Then the missing and available
samples/measurements form a complete set of samples/measurements.
The DFT will be considered here as the signal sparsity domain. The
solution uniqueness is defined in the sense that a variation of the missing
sample values cannot produce another signal of the same sparsity. In the
case when the signal is already reconstructed, the uniqueness is checked
in the sense that there is no other signal of the same or lower sparsity with
the same set of available samples.
Consider a signal x(n) with n ∈ N = {0, 1, 2, ..., N − 1}. Assume that
Q of its samples at the positions q_m ∈ N_Q = {q_1, q_2, ..., q_Q} are
missing/omitted. The signal is sparse in the DFT domain, with sparsity K.
The reconstruction goal is to get x(n), for all n ∈ N, using the available
samples at positions n ∈ N \ N_Q.
$$X_a(k) = \sum_{i=1}^{K} N A_i \delta(k - k_{0i}) + \sum_{m=1}^{Q} z(q_m) e^{-j2\pi k q_m/N}.$$
Positions of the nonzero values in X(k) are k_{0i} ∈ K = {k_{01}, k_{02}, ..., k_{0K}}, with
amplitudes X(k_{0i}) = N A_i. The values of the missing samples of x_a(n) = x(n) +
z(n) for n ∈ N_Q are considered as variables. The goal of the reconstruction
process is to get x_a(n) = x(n), or z(n) = 0, for all n ∈ N. This goal should be
achieved by minimizing a sparsity measure of the signal transform X_a(k).
The existence of a unique solution of this problem depends on the number of
missing samples, their positions, and the signal form.
If a signal with the transform X (k ) of sparsity K is obtained using a reconstruction method, with a set of missing samples, then the reconstruction
X (k ) is unique if there is no other signal of the same or lower sparsity that
satisfies the same set of available samples (using the same set of missing
samples as variables).
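For very small N this uniqueness definition can be checked by brute force: enumerate every candidate DFT support of size up to K, least-squares fit the available samples, and require that all exact fits reproduce the same spectrum. The function below is an illustrative sketch (its name and interface are hypothetical, and the approach is feasible only for tiny N):

```python
import numpy as np
from itertools import combinations

def dft_solution_is_unique(avail_idx, avail_val, N, K, tol=1e-8):
    """Brute-force check: does exactly one spectrum of sparsity <= K
    match the available samples?"""
    k = np.arange(N)
    F = np.exp(2j * np.pi * np.outer(np.arange(N), k) / N) / N  # IDFT matrix
    A_full = F[avail_idx, :]       # maps DFT coefficients to available samples
    b = np.asarray(avail_val, dtype=complex)
    spectra = []
    for s in range(1, K + 1):
        for supp in combinations(range(N), s):
            A = A_full[:, list(supp)]
            X, *_ = np.linalg.lstsq(A, b, rcond=None)
            if np.linalg.norm(A @ X - b) < tol:   # support fits the data exactly
                full = np.zeros(N, dtype=complex)
                full[list(supp)] = X
                spectra.append(full)
    return len(spectra) > 0 and all(np.allclose(sp, spectra[0]) for sp in spectra)

# A prime N gives a full-spark partial DFT matrix, so this K = 2
# reconstruction from 5 available samples is unique.
N = 7
n = np.arange(N)
x = np.cos(2 * np.pi * 2 * n / N)     # nonzero DFT values at bins 2 and 5
missing = [1, 4]
avail = np.array([i for i in range(N) if i not in missing])
unique = dft_solution_is_unique(avail, x[avail], N, K=2)
```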
Example 10.25. Consider the simplest case of one missing sample, at position q. With the missing sample value considered as a variable z, the DFT of the signal is

$$X_a(k) = \sum_{i=1}^{K} N A_i \delta(k - k_{0i}) + z e^{-j2\pi kq/N}.$$
The number of nonzero transform coefficients is

$$\mathrm{card}\{X_a\} = \|X_a\|_0 = \sum_{i=1}^{K} \left| N A_i + z e^{-j2\pi k_{0i} q/N} \right|_0 + \sum_{i=K+1}^{N} |z|_0$$

$$= \begin{cases} N, & \text{for } |z| \neq 0 \text{ and } z \neq -N A_i e^{j2\pi k_{0i} q/N} \text{ for any } i, \\ N - K, & \text{for } |z| \neq 0 \text{ and } z = -N A_i e^{j2\pi k_{0i} q/N} \text{ for all } i, \\ K, & \text{for } |z| = 0. \end{cases} \qquad (10.86)$$
With just one missing value and an arbitrary signal, the minimum of ‖X_a‖_0 is
achieved at |z| = 0 only if the signal sparsity is lower than the lowest possible
sparsity with |z| ≠ 0,
K < N − K.
It means K < N/2. For K = N/2 the last two rows of (10.86) produce the
same result, N − K = N/2 and K = N/2. In that case the minimum of ‖X_a‖_0
is not unique. Note that this is possible only if the considered signal x(n) has a
very specific form,

$$A_1 e^{j2\pi k_{01} q/N} = A_2 e^{j2\pi k_{02} q/N} = A_3 e^{j2\pi k_{03} q/N} = \dots = A_K e^{j2\pi k_{0K} q/N} = C. \qquad (10.87)$$
In reality, the case that all components have equal amplitudes |A_1| = |A_2| =
|A_3| = ... = |A_K| and that the missing sample position q is such that

$$\arg\{A_1\} + 2\pi k_{01} q/N = \arg\{A_2\} + 2\pi k_{02} q/N = \dots = \arg\{A_K\} + 2\pi k_{0K} q/N \qquad (10.88)$$

is a zero-probability event.
It is interesting to note that if the last two conditions are satisfied by a
signal x(n), then the DFT coefficients from (10.87) are the frequency domain
samples of a harmonic signal B e^{−j2πkq/N} at k ∈ {k_{01}, k_{02}, ..., k_{0K}}. Its IDFT
is a delta pulse, with the group delay at the position of the missing sample,

$$\mathrm{IDFT}\{B e^{-j2\pi kq/N}\} = B\,\delta(n - q).$$
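The one-missing-sample count in (10.86) can be verified numerically: for a K-sparse signal with one sample replaced by a variable z, the l0 measure equals K at the true value and N for a generic z. A minimal check (a complex-valued test signal is used here for convenience):

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, q = 32, 4, 7                         # one missing sample, at position q
X = np.zeros(N, dtype=complex)
bins = rng.choice(N, size=K, replace=False)
X[bins] = rng.standard_normal(K) + 1j * rng.standard_normal(K)
x = np.fft.ifft(X)                         # K-sparse in the DFT domain

def l0_with(z):
    xa = x.copy()
    xa[q] = z                              # missing sample treated as variable z
    return int(np.sum(np.abs(np.fft.fft(xa)) > 1e-9))

k_true = l0_with(x[q])                     # z at the true value -> sparsity K
k_generic = l0_with(x[q] + 1.0)            # generic z -> all N coefficients nonzero
print(k_true, k_generic)                   # 4 32
```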
Example 10.26. Consider a signal x(n) with N = 32 and two missing samples at
q_m ∈ N_Q = {3, 19}.
The signal sparsity is K. In order to simplify the notation, assume that one DFT
value of the reconstructed signal is X(5) = 2.
(a) Show that the limit for the sparsity K (when we can claim that the
reconstructed sparse signal is unique, assuming that all signal amplitudes
may take arbitrary values) is K < 8.
(b) What properties must a signal satisfy in the limit case
K = 8 so that the solution is not unique?
(c) What is the sparsity limit if the missing samples are at q_m ∈ N_Q =
{5, 9}?
Figure 10.38 Sparsity limit probability distribution for the worst possible case of a signal with
Q = 72 out of N = 128 samples missing, in 100,000 random realizations.
10.11 IMAGE RECONSTRUCTION
The gradient-based algorithm is applied to an image x(n, m). As the
transformation domain, the two-dimensional DCT (in symmetric form) will
be used,

$$C(k, l) = v_k v_l \sum_{m=0}^{N-1} \sum_{n=0}^{N-1} x(m, n) \cos\!\left(\frac{2\pi(2m+1)k}{4N}\right) \cos\!\left(\frac{2\pi(2n+1)l}{4N}\right),$$
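The symmetric 2D DCT above is separable, so it can be computed with a single one-dimensional transform matrix applied to the rows and then to the columns. A minimal sketch (the orthonormal scaling v_0 = √(1/N), v_k = √(2/N) is an assumption here, chosen so that the transform is exactly invertible):

```python
import numpy as np

def dct_matrix(N):
    """T[k, m] = v_k cos(2*pi*(2m+1)*k/(4N)), with orthonormal v_k (assumed)."""
    v = np.full(N, np.sqrt(2.0 / N)); v[0] = np.sqrt(1.0 / N)
    k = np.arange(N)[:, None]
    m = np.arange(N)[None, :]
    return v[:, None] * np.cos(2 * np.pi * (2 * m + 1) * k / (4 * N))

def dct2_sym(x, T):
    return T @ x @ T.T        # separable: transform rows, then columns

def idct2_sym(C, T):
    return T.T @ C @ T        # T is orthonormal, so its transpose inverts it

rng = np.random.default_rng(1)
x = rng.standard_normal((8, 8))
T = dct_matrix(8)
C = dct2_sym(x, T)            # C(k, l) as in the formula above
x_back = idct2_sym(C, T)      # perfect reconstruction
```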
Note that for the missing pixels any value within the possible image
value range can be assumed in the initial step. The algorithm will reconstruct
the true image values at these positions. For the graphical representation
of missing pixels, the value 255, corresponding to a white pixel, or 0 will be
used. The corrupted pixels are then black or white pixels, Fig. 10.39.
For each missing pixel, the signals x_a^+(m, n) and x_a^−(m, n) are formed:

$$x_a^{+}(m, n) = x_a^{(p)}(m, n) + \Delta\,\delta(m - m_i, n - n_i) \qquad (10.106)$$
$$x_a^{-}(m, n) = x_a^{(p)}(m, n) - \Delta\,\delta(m - m_i, n - n_i). \qquad (10.107)$$

The missing pixel values are updated as

$$x_a^{(p)}(m, n) = x_a^{(p-1)}(m, n) - 2\Delta G_{m,n}. \qquad (10.108)$$
The change of the step and the stopping criterion are the same as in the one-dimensional case. The results after 50 iterations are shown in Fig. 10.39, where the reconstructed image after 1, 3, and 50 iterations is presented.
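The 2D iteration of (10.106)-(10.108) can be sketched on a small image that is sparse in the 2D DCT. This is a simplified illustration, not the book's exact algorithm: the gradient estimate and the step scaling 2Δ/N² are heuristic choices of this sketch, and the step Δ is reduced on a fixed schedule rather than via the angle criterion:

```python
import numpy as np

def dct_matrix(N):
    """Orthonormal symmetric DCT matrix (v_k scaling assumed)."""
    v = np.full(N, np.sqrt(2.0 / N)); v[0] = np.sqrt(1.0 / N)
    k = np.arange(N)[:, None]; m = np.arange(N)[None, :]
    return v[:, None] * np.cos(2 * np.pi * (2 * m + 1) * k / (4 * N))

def measure(x, T):
    return np.sum(np.abs(T @ x @ T.T))        # l1 norm of the 2D DCT

def grad_step(x, missing, delta, T):
    """One pass of (10.106)-(10.108) over the missing pixels (sketch)."""
    G = np.zeros_like(x)
    for (i, j) in missing:
        xp = x.copy(); xp[i, j] += delta      # x_a^+, Eq. (10.106)
        xm = x.copy(); xm[i, j] -= delta      # x_a^-, Eq. (10.107)
        G[i, j] = (measure(xp, T) - measure(xm, T)) / (2 * delta)
    return x - (2 * delta / x.size) * G       # update as in (10.108)

T = dct_matrix(8)
C = np.zeros((8, 8)); C[0, 0] = 3.0; C[1, 2] = 5.0
x_true = T.T @ C @ T                          # image sparse in the 2D DCT
missing = [(2, 3), (5, 1), (6, 6)]            # "corrupted" pixel positions
x = x_true.copy()
for (i, j) in missing:
    x[i, j] = 0.0                             # any initial value may be assumed
delta = np.max(np.abs(x_true))
for it in range(120):
    x = grad_step(x, missing, delta, T)
    if (it + 1) % 30 == 0:
        delta /= np.sqrt(10)                  # scheduled step reduction
err = max(abs(x[i, j] - x_true[i, j]) for (i, j) in missing)
```

Only the missing pixels are updated, since the gradient estimate is zero at the available positions.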
Figure 10.39 Noisy image and the reconstructed image after 1, 3, and 50 iterations.
Index
Adaptive reconstruction, 765
Adaptive systems, 417
Allpass system, 239
Ambiguity function, 623
Analog signals, 10
Analytic part, 34
Antenna array, 467
Anticausal systems, 180
Attenuation, 50
Auto-regressive (AR), 179
Autocorrelation function, 330
Autocovariance function, 330
Backward difference, 220
Bandpass filter, 238
Bilinear transform, 224
Binomial random variable, 336
Blackman window, 528
Block LMS algorithm, 477
Born-Jordan distribution, 651
Butterworth filter, 47, 234
discrete-time, 231
Capon's method, 608
local polynomial Fourier transform (LPFT),
614
short-time Fourier transform (STFT),
612
Cascade realization, 270
Causal system, 28, 61
Causal systems, 175
Central limit theorem, 338
Characteristic polynomial, 180
Choi-Williams distribution, 657
Derivative
complex function, 20
Difference equation, 177, 180
Differential equation, 45
Differentiator, 67
Digital signals, 10
Direct realization I, 262
Direct realization II, 262
Dirichlet conditions, 18
Discrete Cosine transform (DCT), 134
Discrete Fourier transform (DFT), 101, 118
Discrete Hartley transform (DHT), 159
Discrete pseudo Wigner distribution, 633
Discrete Sine transform (DST), 137
Discrete system, 58
Discrete-time signals (discrete signals), 53
Displacement, 131
Downsampling, 570
Duality property, 30
Eigenvalues, 432
Eigenvectors, 432
Energy, 18, 57
Equiangular Tight Frame, 680
Ergodicity, 331
Interpolation, 114
Inverse system, 241
Isometry, 677
ISTA algorithm, 763
Kalman filter, 489
Kronecker delta function, 54
L-statistics, 325
Lagrangian, 760
Laplace transform, 42
LASSO minimization, 761
Leakage effect, 129
Linear adaptive adder, 421
Linear phase systems, 279
Linear system, 27
LMS algorithm, 451
antenna systems, 467
block, 477
complex, 483
convergence, 453
echo cancellation, 473
identification, 454
noise cancellation, 458
prediction, 464
sign, 475
sinusoidal disturbance, 462
variable step, 479
Local polynomial Fourier transform, 604
moments, 605
relation to fractional Fourier transform,
607
Lowpass filter, 230
Magnitude, 18
Marginal properties, 647
Matched filter, 363
Matched z-transform method, 217
Measurement Matrix, 673
Bernoulli Random, 676
Gaussian Random, 675
Indirect, 674
Partial DFT, 675
Structured Random, 676, 787
Median, 323, 350
Minimum phase system, 241
Moments
LPFT, 606
Morlet wavelet, 565
Moving average (MA), 179
Quantization, 370
Random signals, 313
Rank of matrix, 698
Rayleigh distribution
uniform, 343
Reconstruction uniqueness, 777, 781
Rectangular window, 64
Recursive systems
adaptive, 488
Reduced interference distributions
discrete form, 651
Region of convergence, 164
Resolution, 527
Restricted isometry, 677, 681
constant, 677, 682
eigenvalues, 685
uniqueness, 695
Ridge regression, 761
RLS algorithm
variable step, 483
S-method, 641
S-transform (the Stockwell transform), 601
Sampling
nonuniform, 786
Sampling theorem, 95
for periodic signals, 124
in the frequency domain, 35
in the time domain, 71
Schwarz's inequality, 680
Second-order statistics, 330
Sensitivity of system, 265
Short-time Fourier transform (STFT), 516
discrete, 532
discrete-time, 529
filter bank, 536
frequency-varying, 561
hybrid, 563
inversion, 523, 540
optimal window, 519
optimisation, 554
overlapping, 538
recursive, 535
time-varying, 550
Sign LMS algorithm, 475
Sinc distribution
discrete form, 651
Soft-thresholding, 763