TRANSFORMS THEORY
AND APPLICATIONS
Edited by Juuso Olkkonen
Published by InTech
Janeza Trdine 9, 51000 Rijeka, Croatia
Copyright 2011 InTech
All chapters are Open Access articles distributed under the Creative Commons
Attribution-NonCommercial-ShareAlike 3.0 license, which permits users to copy,
distribute, transmit, and adapt the work in any medium, so long as the original
work is properly cited. After this work has been published by InTech, authors
have the right to republish it, in whole or in part, in any publication of which they
are the author, and to make other personal use of the work. Any republication,
referencing or personal use of the work must explicitly identify the original source.
Statements and opinions expressed in the chapters are those of the individual contributors
and not necessarily those of the editors or publisher. No responsibility is accepted
for the accuracy of information contained in the published articles. The publisher
assumes no responsibility for any damage or injury to persons or property arising out
of the use of any materials, instructions, methods or ideas contained in the book.
Publishing Process Manager Ivana Lorkovic
Technical Editor Teodora Smiljanic
Cover Designer Martina Sirotic
Image Copyright Arvind Balaraman, 2010. Used under license from Shutterstock.com
First published March, 2011
Printed in India
A free online edition of this book is available at www.intechopen.com
Additional hard copies can be obtained from orders@intechweb.org
Contents

Preface IX

Part 1 Non-stationary Signals

Chapter 1
Chapter 2
Chapter 3

Part 2 61

Chapter 4 A MAP-MRF Approach for Wavelet-Based Image Denoising 63
Alexandre L. M. Levada, Nelson D. A. Mascarenhas and Alberto Tannús

Chapter 5
Chapter 6
Chapter 7

Part 3 Biomedical Applications 141

Chapter 8 143
Chapter 9
Chapter 10

Part 4 213

Chapter 11 215
Chapter 12
Preface

Discrete wavelet transform (DWT) algorithms have become standard tools for processing signals and images in several areas of research and industry. The first DWT structures were based on the compactly supported conjugate quadrature filters (CQFs). However, a drawback of CQFs is related to nonlinear phase effects such as image blurring and spatial dislocations in multi-scale analyses. On the contrary, in the biorthogonal discrete wavelet transform (BDWT) the scaling and wavelet filters are symmetric and linear phase. BDWT algorithms are commonly constructed by a ladder-type network called the lifting scheme. The procedure consists of sequential down- and uplifting steps, and the reconstruction of the signal is made by running the lifting network in reverse order. Efficient lifting BDWT structures have been developed for VLSI and microprocessor applications. The analysis and synthesis filters can be implemented by integer arithmetic using only register shifts and summations. Many BDWT-based data and image processing tools have outperformed conventional discrete cosine transform (DCT) based approaches. For example, in the JPEG2000 standard the DCT has been replaced by the lifting BDWT.

As the DWT provides both octave-scale frequency and spatial timing of the analyzed signal, it is constantly used to solve and treat more and more advanced problems. One of the main difficulties in multi-scale analysis is the dependency of the total energy of the wavelet coefficients in different scales on the fractional shifts of the analysed signal. If we have a discrete signal x[n] and the corresponding time-shifted signal x[n−τ], where τ ∈ [0, 1], there may exist a significant difference in the energy of the wavelet coefficients as a function of the time shift. In shift-invariant methods the real and imaginary parts of the complex wavelet coefficients are approximately a Hilbert transform pair. The energy of the wavelet coefficients equals the envelope, which provides smoothness and approximate shift-invariance. Using two parallel DWT banks, which are constructed so that the impulse responses of the scaling filters are half-sample delayed versions of each other, the corresponding wavelets are a Hilbert transform pair. The dual-tree CQF wavelet filters do not have coefficient symmetry, and the nonlinearity interferes with the spatial timing in different scales and prevents accurate statistical correlations. Therefore the current developments in the theory and applications of wavelets are concentrated on dual-tree BDWT structures.

This book reviews the recent progress in the theory and applications of wavelet transform algorithms. The book is intended to cover a wide range of methods (e.g. lifting DWT, shift invariance, 2D image enhancement) for constructing DWTs and to illustrate the utilization of DWTs in several non-stationary problems and in biomedical as well as industrial applications. It is organized into four major parts. Part I focuses on non-stationary signals.
Part 1

Non-stationary Signals
Discrete Wavelet Analyses for Time Series

José S. Murguía and Haret C. Rosu
UASLP, IPICYT
México

1. Introduction
One frequent way of collecting experimental data by scientists and engineers is as sequences of values at regularly spaced intervals in time. These sequences are called time-series. The fundamental problem with the data in the form of time-series is how to process them in order to extract meaningful and correct information, i.e., the possible signals embedded in them. If a time-series is stationary one can think that it can have harmonic components that can be detected by means of Fourier analysis, i.e., Fourier transforms (FT). However, in recent times, it became evident that many time-series are not stationary in the sense that their mean properties change in time. The harmonic waves of infinite support that form the harmonic components are not adequate in the latter case, in which one needs waves localized not only in frequency but in time as well. They have been called wavelets and allow a time-scale decomposition of a signal. Significant progress in understanding the wavelet processing of non-stationary signals has been achieved over the last two decades. However, to get the dynamics that produces a non-stationary signal it is crucial that in the corresponding time-series a correct separation of the fluctuations from the average behavior, or trend, is performed. Therefore, people had to invent novel statistical methods of detrending the data that should be combined with the wavelet analysis. A number of such techniques have been developed lately for the important class of non-stationary time series that display multi-scaling behavior of the multi-fractal type. Our goal in this chapter is to present our experience with the wavelet processing, based mainly on the discrete wavelet transform (DWT), of non-stationary fractal time-series of elementary cellular automata and the non-stationary chaotic time-series produced by a three-state non-linear electronic circuit.
Let us consider the space of square integrable functions L²(ℝ). The function ψ(t) is said to be a wavelet if and only if its FT Ψ(ω) satisfies

C_ψ = ∫₀^∞ |Ψ(ω)|² / |ω| dω < ∞.   (1)
The relation (1) is called the admissibility condition (Daubechies, 1992; Mallat, 1999; Strang, 1996; Qian, 2002), which implies that the wavelet must have a zero average

∫ ψ(t) dt = Ψ(0) = 0,   (2)
and therefore it must be oscillatory. In other words, ψ must be a sort of wave (Daubechies, 1992; Mallat, 1999).
Let us now define the dilated-translated wavelets ψ_{a,b} as the following functions

ψ_{a,b}(t) = (1/√a) ψ((t − b)/a),   (3)
Fig. 1. The Haar wavelet function for several values of the scale parameter a and translation
parameter b. If a < 1, the wavelet function is contracted, and if a > 1, the wavelet is
expanded.
The continuous wavelet transform (CWT) of x(t) ∈ L²(ℝ) is defined as

W_x(a, b) = ⟨x, ψ_{a,b}⟩ = (1/√a) ∫ x(t) ψ*((t − b)/a) dt,   (4)

where ⟨·, ·⟩ is the scalar product in L²(ℝ) defined as ⟨f, g⟩ := ∫ f(t) g*(t) dt, and the symbol * denotes complex conjugation. The CWT (4) measures the variation of x in a neighborhood of the point b, whose size is proportional to a.
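As a quick numerical illustration of Eq. (4), the sketch below approximates the CWT integral by a Riemann sum, using the real-valued Haar wavelet of Fig. 1 as the analyzing function; the integration limits and grid step are arbitrary illustrative choices.

```python
import math

def haar(t):
    """Haar analyzing wavelet (an illustrative choice of psi)."""
    if 0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0

def cwt(x, a, b, t0=-5.0, t1=5.0, n=10000):
    """Riemann-sum approximation of Eq. (4) for a real-valued wavelet:
    W_x(a, b) = (1/sqrt(a)) * integral of x(t) * psi((t - b)/a) dt."""
    dt = (t1 - t0) / n
    return sum(x(t0 + i * dt) * haar((t0 + i * dt - b) / a)
               for i in range(n)) * dt / math.sqrt(a)

# By the zero-average property (2), a constant signal has a vanishing
# wavelet transform at every (a, b).
print(abs(cwt(lambda t: 1.0, a=0.5, b=0.0)) < 1e-9)  # True
```

A signal that varies inside the support of ψ_{a,b} (e.g. a step at t = 0.25) instead produces a non-zero coefficient, in line with the remark that the CWT measures the variation of x near the point b at size proportional to a.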
If we are interested in reconstructing x from its wavelet transform (4), we make use of the reconstruction formula, also called the resolution of the identity (Daubechies, 1992; Mallat, 1999)

x(t) = (1/C_ψ) ∫∫ W_x(a, b) ψ_{a,b}(t) (da db)/a².   (5)

In the discrete case, the signal x(t) is expanded in an orthonormal wavelet basis as

x(t) = Σ_m Σ_n d_{mn} ψ_{nm}(t),   (6)

with the wavelet coefficients given by

d_{mn} = ∫ x(t) ψ_{nm}(t) dt.   (7)
The orthonormal basis functions are all dilations and translations of a function referred to as the analyzing wavelet ψ(t), and they can be expressed in the form

ψ_{nm}(t) = 2^{m/2} ψ(2^m t − n),   (8)
with m and n denoting the dilation and translation indices, respectively. The contribution of
the signal at a particular wavelet level m is given by
d_m(t) = Σ_n d_{mn} ψ_{nm}(t),   (9)
which provides information on the time behavior of the signal within different scale bands.
Additionally, it provides knowledge of their contribution to the total signal energy.
In this context, Mallat (1999) developed a computationally efficient method to calculate (6) and
(7). This method is known as multiresolution analysis (MRA). The MRA approach provides
a general method for constructing orthogonal wavelet bases and leads to the implementation
of the fast wavelet transform (FWT). This algorithm connects, in an elegant way, wavelets
and filter banks. A multiresolution signal decomposition of a signal X is based on successive
decomposition into a series of approximations and details, which become increasingly coarse.
Associated with the wavelet function ψ(t) is a corresponding scaling function, φ(t), and scaling coefficients, a^m_n (Mallat, 1999). The scaling and wavelet coefficients at scale m can be computed from the scaling coefficients at the next finer scale m + 1 using

a^m_n = Σ_l h[l − 2n] a^{m+1}_l,   (10)

d^m_n = Σ_l g[l − 2n] a^{m+1}_l,   (11)

where h[n] and g[n] are typically called the lowpass and highpass filters of the associated filter bank. Equations (10) and (11) represent the fast wavelet transform (FWT) for computing (7). In fact, the signals a^m_n and d^m_n are the convolutions of a^{m+1}_n with the filters h[n] and g[n] followed by a downsampling of factor 2 (Mallat, 1999).
Conversely, a reconstruction of the original scaling coefficients a^{m+1}_n can be made from a combination of the scaling and wavelet coefficients at the coarser scale,

a^{m+1}_n = Σ_l h[n − 2l] a^m_l + Σ_l g[n − 2l] d^m_l.   (12)

Equation (12) represents the inverse FWT for computing (6), and it corresponds to the synthesis filter bank. This part can be viewed as the discrete convolutions between the upsampled signals a^m_l and d^m_l and the filters h[n] and g[n]; that is, following an upsampling of factor 2 one calculates the convolutions between the upsampled signals and the filters h[n] and g[n]. The number of levels in the
multiresolution algorithm depends on the length of the signal. A signal with 2^k values can be decomposed into k + 1 levels. To initialize the FWT, one considers a discrete time signal X = {x[1], x[2], . . . , x[N]} of length N = 2^M. The first application of (10) and (11), beginning with a^{m+1}_n = x[n], defines the first level of the FWT of X. The process goes on, always adopting the m + 1 scaling coefficients to calculate the m scaling and wavelet coefficients. Iterating (10) and (11) M times, the transformed signal consists of M sets of wavelet coefficients at scales m = 1, . . . , M, and a single set of scaling coefficients at scale M. There are exactly 2^{k−m} wavelet coefficients d^m_n at each scale m, and 2^{k−M} scaling coefficients a^M_n. The maximum number of iterations M_max is k. This property of the MRA is generally the key factor to identify crucial information in the respective frequency bands. A three-level decomposition process of the FWT is shown in Fig. 2.
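The analysis and synthesis steps (10)-(12) can be sketched as follows; the orthonormal Haar filter pair stands in here for a generic (h, g), which is an illustrative assumption (the chapter itself later uses Daubechies db-4 and db-8 filters):

```python
import math

# Haar analysis filters (an illustrative orthonormal pair h, g).
H = [1 / math.sqrt(2), 1 / math.sqrt(2)]
G = [1 / math.sqrt(2), -1 / math.sqrt(2)]

def analysis_step(a):
    """One application of Eqs. (10)-(11): filter and downsample by 2."""
    approx = [H[0] * a[2 * n] + H[1] * a[2 * n + 1] for n in range(len(a) // 2)]
    detail = [G[0] * a[2 * n] + G[1] * a[2 * n + 1] for n in range(len(a) // 2)]
    return approx, detail

def synthesis_step(approx, detail):
    """Eq. (12): upsample and filter to recover the finer-scale coefficients."""
    a = []
    for n in range(len(approx)):
        a.append(H[0] * approx[n] + G[0] * detail[n])
        a.append(H[1] * approx[n] + G[1] * detail[n])
    return a

def fwt(x):
    """Full FWT of a length-2**M signal: M detail sets plus one scaling set."""
    details, a = [], list(x)
    while len(a) > 1:
        a, d = analysis_step(a)
        details.append(d)
    return a, details

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
a, details = fwt(x)
# Parseval's relation (13): the energy is preserved across the decomposition.
energy = a[0] ** 2 + sum(c ** 2 for d in details for c in d)
print(abs(energy - sum(v ** 2 for v in x)) < 1e-9)  # True
```

One `analysis_step` halves the number of scaling coefficients, so a signal of length 2^M supports M iterations, in agreement with the level count stated above.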
In terms of the wavelet coefficients, the energy of the signal satisfies

Σ_{n=1}^{N} x[n]² = Σ_n (a^M_n)² + Σ_{m=1}^{M} Σ_n (d^m_n)².   (13)

This can be identified as Parseval's relation in terms of wavelets, where the signal energy can be calculated in terms of the different resolution levels of the corresponding wavelet-transformed signal. A more detailed treatment of this subject can be found in (Mallat, 1999).
The updating rule of an elementary cellular automaton (ECA) gives the new state of each cell as a function of the present states of its neighborhood; for instance,

x_i^{t+1} = x_{i−1}^t ⊕ x_{i+1}^t,   (14)

where ⊕ denotes addition modulo 2, which corresponds to the rule 90. Table 1 is the lookup table of this ECA rule, which specifies the evolution from the neighborhood configuration (first row) to the next state (second row); that is, the next state of the i-th cell depends on the present states of its left and right neighbors.
Neighborhood 111 110 101 100 011 010 001 000
Rule result    0   1   0   1   1   0   1   0

Table 1. Elementary rule 90. The second row shows the future state of the cell if it and its neighbors are in the arrangement shown above in the first row.
In fact, a rule is numbered by the unsigned decimal equivalent of the binary expression in the second row. When the same rule is applied to update all cells of an ECA, such ECA are called uniform ECA; otherwise the ECA are called non-uniform or hybrid. It is important to observe that the evolution of an ECA is determined by two main factors: the rule and the initial conditions.
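As a small illustration, the lookup table of Table 1 drives the following uniform-ECA sketch (periodic boundaries are an assumption for the finite lattice); the row sums it returns are the kind of signal analyzed later in this section:

```python
# Rule 90 as a lookup table: each 3-cell neighborhood (first row of
# Table 1) maps to the next state of the center cell (second row).
RULE_90 = {
    (1, 1, 1): 0, (1, 1, 0): 1, (1, 0, 1): 0, (1, 0, 0): 1,
    (0, 1, 1): 1, (0, 1, 0): 0, (0, 0, 1): 1, (0, 0, 0): 0,
}

def step(cells):
    """Synchronous update of a uniform ECA with periodic boundaries."""
    n = len(cells)
    return [RULE_90[(cells[i - 1], cells[i], cells[(i + 1) % n])]
            for i in range(n)]

def row_sums(width, steps):
    """The 'row sum' signal analyzed in the text: the number of ones per
    row, starting from the impulsive initial condition (one centered 1)."""
    cells = [0] * width
    cells[width // 2] = 1
    sums = [sum(cells)]
    for _ in range(steps):
        cells = step(cells)
        sums.append(sum(cells))
    return sums

print(row_sums(65, 7))  # [1, 2, 2, 4, 2, 4, 4, 8]
```

For the impulsive initial condition the rule 90 pattern is Pascal's triangle modulo 2, so the row sums are powers of two, as the printed sequence shows.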
3.2 WMF-DFA algorithm
To reveal the MF properties (Halsey et al., 1986) of ECA, we follow a variant of the MF-DFA
with the discrete wavelet method proposed in (Manimaran et al., 2005). This algorithm will
separate the trends from fluctuations, in the ECA time series, using the fact that the low-pass
version resembles the original data in an averaged manner in different resolutions. Instead
of a polynomial fit, we consider the different versions of the low-pass coefficients to calculate
the local trend. This method involves the following steps.
Let x(t_k) be a time series type of data, where t_k = kΔt and k = 1, 2, . . . , N.

1. Determine the profile Y(k) = Σ_{i=1}^{k} (x(t_i) − ⟨x⟩) of the time series, which is the cumulative sum of the series from which the series mean value ⟨x⟩ is subtracted.
2. Compute the fast wavelet transform (FWT), i.e., the multilevel wavelet decomposition of the profile. For each level m, we get the fluctuations of Y(k) by subtracting the local trend of the Y data, i.e., ΔY(k; m) = Y(k) − Ỹ(k; m), where Ỹ(k; m) is the reconstructed profile after removal of successive detail coefficients at each level m. These fluctuations at level m are subdivided into windows, i.e., into M_s = int(N/s) non-overlapping segments of length s. This division is performed starting from both the beginning and the end of the fluctuations series (i.e., one has 2M_s segments). Next, one calculates the local variances associated to each window ν,

F²(ν, s; m) = var[ΔY((ν − 1)s + j; m)],  j = 1, . . . , s,  ν = 1, . . . , 2M_s,  M_s = int(N/s).   (15)
3. Calculate a q-th order fluctuation function defined as

F_q(s; m) = { (1/(2M_s)) Σ_{ν=1}^{2M_s} [F²(ν, s; m)]^{q/2} }^{1/q},   (16)

where q ∈ ℤ with q ≠ 0. Because of the diverging exponent when q → 0, we employed in this limit a logarithmic averaging, F₀(s; m) = exp{ (1/(4M_s)) Σ_{ν=1}^{2M_s} ln F²(ν, s; m) }, as in (Kantelhardt et al., 2002; Telesca et al., 2004).
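A minimal sketch of steps 1-3 follows. For brevity the local trend is a centered moving average rather than the wavelet low-pass reconstruction Ỹ(k; m) used in the text, so the function is only a structural stand-in for the WMF-DFA detrending step:

```python
import math
import random

def profile(x):
    """Step 1: cumulative sum of the mean-subtracted series, Y(k)."""
    mean = sum(x) / len(x)
    y, s = [], 0.0
    for v in x:
        s += v - mean
        y.append(s)
    return y

def fluctuation_function(x, s, q, window):
    """Steps 2-3, Eqs. (15)-(16). The local trend here is a centered
    moving average of half-width `window` (an illustrative substitute
    for the wavelet reconstruction of the text)."""
    y = profile(x)
    n = len(y)
    trend = []
    for k in range(n):
        lo, hi = max(0, k - window), min(n, k + window + 1)
        trend.append(sum(y[lo:hi]) / (hi - lo))
    dy = [y[k] - trend[k] for k in range(n)]
    ms = n // s
    # 2*Ms segments: from the beginning and from the end of the series.
    segs = [dy[v * s:(v + 1) * s] for v in range(ms)]
    segs += [dy[n - (v + 1) * s:n - v * s] for v in range(ms)]

    def var(seg):
        m = sum(seg) / len(seg)
        return sum((u - m) ** 2 for u in seg) / len(seg)

    f2 = [var(seg) for seg in segs]
    if q == 0:  # logarithmic averaging for the diverging exponent
        return math.exp(sum(math.log(v) for v in f2) / (4 * ms))
    return (sum(v ** (q / 2) for v in f2) / (2 * ms)) ** (1 / q)

# Example: second-order fluctuation function of a Gaussian noise series.
random.seed(1)
noise = [random.gauss(0, 1) for _ in range(1024)]
print(fluctuation_function(noise, 8, 2, 64) > 0)  # True
```

Plotting F_q(s; m) against s on log-log axes and fitting the slope then gives the exponent h(q) of Eq. (17) below.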
To determine if the analyzed time series have a fractal scaling behavior, the fluctuation function F_q(s; m) should reveal a power law scaling

F_q(s; m) ∼ s^{h(q)},   (17)

where h(q) is called the generalized Hurst exponent (Telesca et al., 2004) since it can depend on q, while the original Hurst exponent is h(2). If h is constant for all q then the time series is monofractal, otherwise it has a MF behavior. In the latter case, one can calculate various other MF scaling exponents, such as τ(q) = qh(q) − 1 and f(α) (Halsey et al., 1986). A linear behavior of τ(q) indicates monofractality whereas a non-linear behavior
indicates a multifractal signal. A fundamental result in the multifractal formalism states that the singularity spectrum f(α) is the Legendre transform of τ(q), i.e.,

α = dτ(q)/dq,  and  f(α) = qα − τ(q).
To illustrate the efficiency of the wavelet multifractal procedure, we first carry out the analysis
of the binomial multifractal model (Feder, 1998; Kantelhardt et al., 2002).
For the multifractal time series generated through the binomial multifractal model, a series of N = 2^{n_max} numbers x_k, with k = 1, . . . , N, is defined by

x_k = a^{n(k−1)} (1 − a)^{n_max − n(k−1)},   (18)

where 0.5 < a < 1 is a parameter and n(k) is the number of digits equal to 1 in the binary representation of the index k. The scaling exponents h(q) and τ(q) can be calculated exactly in this model. These exponents have the closed form

h(q) = 1/q − ln[a^q + (1 − a)^q] / (q ln 2),  τ(q) = −ln[a^q + (1 − a)^q] / ln 2.   (19)
In Table 2 and Fig. 3, we present the comparison, for a = 2/3, of the theoretical values of the generalized Hurst exponent (h_T(q)) with the numerical results obtained through wavelet analysis (h_W(q)). Notice that the numerical values have a slight downward translation. Adding a vertical offset δ = h_T(1) − h_W(1) to h_W(q), we can notice that the theoretical and numerical values are very close.
  q    h_T(q)   h_W(q)   h_W(q) + δ
-10    1.4851   1.4601   1.4851
 -9    1.4742   1.4498   1.4749
 -8    1.4607   1.4373   1.4623
 -7    1.4437   1.4217   1.4467
 -6    1.4220   1.4018   1.4269
 -5    1.3938   1.3761   1.4012
 -4    1.3568   1.3422   1.3673
 -3    1.3083   1.2971   1.3221
 -2    1.2459   1.2376   1.2627
 -1    1.1699   1.1626   1.1876
  0    0.0000   1.0742   1.0992
  1    1.0000   0.9809   1.0059
  2    0.9240   0.8961   0.9212
  3    0.8617   0.8286   0.8537
  4    0.8131   0.7780   0.8031
  5    0.7761   0.7401   0.7652
  6    0.7479   0.7112   0.7362
  7    0.7262   0.6887   0.7137
  8    0.7093   0.6711   0.6961
  9    0.6958   0.6570   0.6821
 10    0.6848   0.6457   0.6707
Table 2. The values of the generalized Hurst exponent h for the binomial multifractal model
with a = 2/3, which were computed analytically and with the wavelet approach.
In a similar way, we analyze the time series of the so-called row sum ECA signals, i.e., the sum
of ones in sequences of rows, employing the db-4 wavelet function, another wavelet function
that belongs to the Daubechies family (Daubechies, 1992; Mallat, 1999). We have found that
Fig. 3. The generalized Hurst exponent h for the binomial multifractal model with a = 2/3.
The theoretical values of h(q ) with the WMF-DFA calculations are shown for comparison.
a better matching of the results given by the WMF-DFA method with those of other methods
is provided with this wavelet function. Figure 4 illustrates the results for the rule 90, when
the first row is all 0s with a 1 in the center, i.e., the impulsive initial condition. The fact that
the generalized Hurst exponent is not a constant horizontal line is indicative of a multifractal
behavior in this ECA time series. In addition, if the τ(q) index is not of a single slope, it can be considered as another clear feature of multifractality.
For the impulsive initial condition in ECA rule 90, the most frequent singularity of the analyzed time series occurs at α = 0.568, and Δα = 1.0132 (0.9998) when the WMF-DFA (MF-DFA) is employed. Reference (Murguía et al., 2009) presents the results for different initial center pulses for rules 90, 105, and 150, where the width of rule 90 is shifted to the right with respect to those of 105 and 150. In addition, the strongest singularity, α_min, of all these time series corresponds to the rule 90, and the weakest singularity, α_max, to the rule 150.
With the aim of computing pseudo-random sequences of N bits, an algorithm based on the backward evolution of the CA rule 90 has been proposed in Reference (Mejía & Urías, 2001). A modification of the generator producing pseudo-random sequences has been recently considered in (Murguía et al., 2010). The latter proposal is implemented and studied in terms of the sequence matrix H_N, which was used to generate recursively the pseudo-random sequences.
This matrix has dimensions (2N + 1) × (2N + 1). Since the evolution of the sequence matrix H_N is based on the evolution of the ECA rule 90, the structure of the patterns of bits of the latter must be directly reflected in the structure of the entries of H_N.
Fig. 4. (a) Time series of the row signal of the cellular automata rule 90. Only the first 2^8 points are shown of the whole set of 2^14 data points. (b) Profile Y of the row signal. (d) Generalized Hurst exponent h(q). (e) The exponent τ(q) = qh(q) − 1. (f) The singularity spectrum f(α).
Here, in the same spirit as in Ref. (Murguía et al., 2009), we also analyze the sum of ones in the sequences of the rows of the matrix H_N with the db-4 wavelet function. The results for the row sums of H_2047 are illustrated in Fig. 5, through which we confirm the multifractality of this time series. The width is Δα_{H2047} = 1.12 − 0.145 = 0.975, and the most frequent singularity occurs at α_mf = 0.638. Although the profile is different, the results are similar to those obtained for the rule 90 with a slight shifting, see Fig. 4. A more complete analysis of this matrix is carried out in (Murguía et al., 2010).
The electronic circuit of Fig. 6(a) has been employed to study chaos synchronization (Rulkov, 1996; Rulkov & Sushchik, 1997). This circuit, despite its simplicity, exhibits complex chaotic dynamics, and it has received wide coverage in different areas of mathematics, physics, engineering and others (Campos-Cantón et al., 2008; Rulkov, 1996; Rulkov & Sushchik, 1997). It consists of a linear feedback and a nonlinear converter, which is the block labeled N. The linear feedback is composed of a low-pass filter RC and a resonator circuit rLC.
The dynamics of this chaotic circuit is very well modeled by the following set of differential equations:
ẋ = y,
ẏ = z − x − δy,   (20)
ż = γ[k f(x) − z] − σy,

where x(t) and z(t) are the voltages across the capacitors C and C′, respectively, and y(t) = J(t)(L/C)^{1/2} is the current through the inductor L. The unit of time is given by √(LC). The parameters γ, δ, and σ have the following dependence on the physical values of the circuit elements: γ = √(LC)/(RC′), δ = r√(C/L) and σ = C/C′. The main characteristic of the nonlinear converter N in Fig. 6 is to transform the input voltage x(t) into an output voltage with nonlinear dependence F(x) = k f(x) on the input. The parameter k corresponds to the gain of the converter at x = 0. The detailed circuit structure of N is shown in Fig. 6(b).
It is worth mentioning that depending on the component values of the linear feedback and the
parameter k, the behavior of the chaotic circuit can be in regimes of either periodic or chaotic
oscillations. Due to the characteristics of the inductor in the linear feedback, it turns out to
be hard to scale to arbitrary frequencies and analyze it because of its frequency-dependent
resistive losses. Therefore, the parameter k has been considered to analyze this chaotic circuit, since it appeared to be a very useful bifurcation parameter in both the numerical and experimental cases (Campos-Cantón et al., 2008). Two different attractors, projected on the plane (x, y), generated by this electronic circuit, are shown in Fig. 7. These attractors have
Fig. 5. (a) Time series of the row signal of H_2047. Only the first 256 points are shown of the whole set of 2^11 − 1 data points. (b) Profile of the row signal of H_2047. (c) Generalized Hurst exponent h(q), (d) the τ(q) exponent, and (e) the singularity spectrum f(α).
Fig. 6. (a) The circuit diagram of a nonlinear chaotic oscillator. The component values employed are C = 100.2 nF, C′ = 200.1 nF, L = 63.8 mH, r = 138.9 Ω, and R = 1018 Ω. (b) Schematic diagram of the nonlinear converter N. The electronic component values are R1 = 2.7 kΩ, R2 = R4 = 7.5 kΩ, R3 = 50 Ω, R5 = 177 kΩ, R6 = 20 kΩ. The diodes D1 and D2 are 1N4148, the operational amplifiers A1 and A2 are both TL082, and the operational amplifier A3 is LF356N.
a shape similar to a Rössler oscillator (Fig. 7(a)), and to a double scroll oscillator (Fig. 7(b)). They can be easily obtained by just fixing the bifurcation parameter k to be equal to 0.4010 and 0.3964, respectively.
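A sketch of how system (20) can be integrated numerically is given below, using a classical fourth-order Runge-Kutta step. The saturating stand-in for f(x) and the parameter values γ, σ, δ are illustrative assumptions only (the true converter characteristic is set by the circuit of Fig. 6(b)), so this trajectory is not expected to reproduce the experimental attractors of Fig. 7:

```python
import math

def f_nl(x):
    """Hypothetical saturating stand-in for the converter characteristic
    f(x); the true shape is determined by the circuit of Fig. 6(b)."""
    return math.tanh(x)

def rhs(state, k, gamma=0.2, sigma=0.5, delta=0.43):
    """Right-hand side of system (20); gamma, sigma, delta here are
    illustrative values, not the ones realized by the experimental circuit."""
    x, y, z = state
    return (y,
            z - x - delta * y,
            gamma * (k * f_nl(x) - z) - sigma * y)

def rk4_step(state, k, dt):
    """One classical 4th-order Runge-Kutta step."""
    k1 = rhs(state, k)
    k2 = rhs(tuple(s + 0.5 * dt * d for s, d in zip(state, k1)), k)
    k3 = rhs(tuple(s + 0.5 * dt * d for s, d in zip(state, k2)), k)
    k4 = rhs(tuple(s + dt * d for s, d in zip(state, k3)), k)
    return tuple(s + dt / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

state = (0.1, 0.0, 0.0)
trajectory = []
for _ in range(20000):
    state = rk4_step(state, k=0.4010, dt=0.01)
    trajectory.append(state)
```

The x component of such a simulated trajectory plays the same role as the experimental x-state time series analyzed with the wavelet variance in the next section.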
4.2 Wavelet variance

In the wavelet approach the fractal character of a certain signal can be inferred from the behavior of its power spectrum P(ω), which is the Fourier transform of the autocorrelation function; in differential form, P(ω)dω represents the contribution to the variance of the part of the signal contained between the frequencies ω and ω + dω. Indeed, it is known that for self-similar random processes the spectral behavior of the power spectrum is given by

P(ω) ∼ |ω|^{−γ},   (21)

where γ is the spectral parameter of the signal. In addition, the variance of the wavelet coefficients var{d^m_n} is related to the level m through a power law of the type (Wornell & Oppenheim, 1992)
Fig. 7. Attractors of the electronic chaotic circuit projected on the plane x–y obtained experimentally for two different values of the bifurcation parameter k: (a) 0.4010, and (b) 0.3964.
var{d^m_n} ∼ (2^m)^{−γ}.   (22)
This wavelet variance has been used to find dominant levels associated with the signal, for example, in the study of numerical and experimental chaotic time series (Campos-Cantón et al., 2008; Murguía & Campos-Cantón, 2006; Staszewski & Worden, 1999). In order to estimate γ we used a least squares fit of the linear model

log₂(var{d^m_n}) = −γm + (K + v_m),   (23)
where K and v_m are constants related to the linear fitting procedure. Equation (22) is
certainly suitable for studying discrete chaotic time series, because their variance plot has a
well-defined form as pointed out in (Murguía & Campos-Cantón, 2006; Staszewski & Worden,
1999). If the variance plot shows a maximum at a particular scale, or a bump over a group
of scales, which means a high energy concentration, it will often correspond to a coherent structure. In general, the gradient of a noisy time series turns out to be zero in the variance plot; therefore it does not show any energy concentration at a specific wavelet level. In certain cases the gradient of some chaotic time series has a similar appearance to Gaussian noise at lower scales, which implies that these chaotic time series do not present a fundamental carrier frequency at any scale.
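A variance-plot sketch in the spirit of Eqs. (22)-(23) follows, with Haar details standing in for the db-8 wavelet of the text (an illustrative substitution). With the level index growing toward coarse scales, the fitted slope for a random walk (spectral parameter γ ≈ 2) comes out positive, so the sign in (23) depends on the indexing convention:

```python
import math
import random

def haar_details(x):
    """Detail coefficients of a length-2**M signal, one list per level,
    finest level first (Haar filters stand in for db-8 here)."""
    levels, a = [], list(x)
    r = 1 / math.sqrt(2)
    while len(a) > 1:
        d = [r * (a[2 * n] - a[2 * n + 1]) for n in range(len(a) // 2)]
        a = [r * (a[2 * n] + a[2 * n + 1]) for n in range(len(a) // 2)]
        levels.append(d)
    return levels

def variance_plot(x, min_coeffs=4):
    """log2 of the per-level detail variance, the quantity of Eq. (23);
    levels with too few coefficients for a variance estimate are skipped."""
    out = []
    for d in haar_details(x):
        if len(d) < min_coeffs:
            continue
        m = sum(d) / len(d)
        out.append(math.log2(sum((v - m) ** 2 for v in d) / len(d)))
    return out

def slope(ys):
    """Least-squares slope of ys against the level index."""
    xs = list(range(len(ys)))
    n = len(ys)
    mx, my = sum(xs) / n, sum(ys) / n
    return (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
            / sum((a - mx) ** 2 for a in xs))

# A random walk: its detail variance grows toward coarse levels.
random.seed(7)
walk, s = [], 0.0
for _ in range(4096):
    s += random.gauss(0, 1)
    walk.append(s)
vp = variance_plot(walk)
print(len(vp))  # 10
```

A pronounced maximum of `vp` at one level (rather than a straight line) would signal a dominant level, i.e., the kind of coherent structure discussed above.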
For our illustrative analysis and comparison with the experiments, we study the time series
of the x states of the attractors displayed in Fig. 7(a)-(b), because they are of very different
type and we want to emphasize the versatility of the wavelet approach. The acquisition of
the experimental data was carried out with a DAQ with a sampling frequency of 180 kHz, i.e.
we collected the experimental data for a total time of 182 ms for both signals. In the analysis
of these time series we employed the db-8 wavelet, a wavelet function that belongs to the
Daubechies family (Daubechies, 1992; Mallat, 1999).
Case k = 0.4010.
The first time series to consider corresponds to the x state of the experimental attractor
of Fig. 7 (a). The first 12 ms of this time series are shown in Fig. 8 (a), whereas Fig. 8 (b)
shows a semi-logarithmic plot of the wavelet coefficient variance as a function of level m,
which is called the variance plot of the wavelet coefficients. One can notice that the whole series is dominated by the 12th wavelet level, i.e., this wavelet level has the major energy concentration, and it is plotted in isolation in Fig. 8(c). The energy rate of the reconstructed signal with respect to the original signal was E_{x12}/E_x = 0.9565, which means an energy close to 96% of the total in this case. Since this does not properly show the structure of the chaotic time series, we considered and added together the three neighboring wavelet levels, m = 11–13, achieving an energy concentration of 99% of the total. In this case, the reconstruction of the signal at these wavelet levels is shown in Fig. 8(d), where the structure of the original signal can be noticed. Both reconstructed time series present a slight downward translation, because of the DC component of this chaotic time series.
Case k = 0.3964.
For this value of k, the behaviour of the chaotic electronic circuit is similar to that of a double scroll oscillator, with the shape of the attractor displayed in Fig. 7(b). The experimental time series corresponding to the x state of this attractor is shown in Fig. 9(a), while the variance plot is given in Fig. 9(b), where the gradient is close to zero, which means that no significant energy concentration can be seen. We have found that when summing over the wavelet levels m = 6–12 the energy concentration is close to 99% of the total, but without any pronounced peak. Thus, this case does not present a fundamental carrier frequency, and therefore this attractor has a Gaussian noisy behavior. The reconstructed time series with the mentioned wavelet levels is displayed in Fig. 9(c).
5. Conclusion
The DWT is currently a standard tool to study time-series produced by all sorts of
non-stationary dynamical systems. In this chapter, we first reviewed the main properties of
DWT and the basic concepts related to the corresponding mathematical formalism. Next, we
presented the way the DWT characterizes the type of dynamics embedded in the time-series.
In general, the DWT reveals with high accuracy the dynamical features obeying power-like
scaling properties of the processed signals and has been already successfully incorporated in
the multifractal formalism. The interesting case of the time-series of the elementary cellular
Fig. 8. The case k = 0.4010: (a) experimental time series of the x state, (b) wavelet coefficient variance, (c) time series of the 12th wavelet level, and (d) the time series of the sum from the 11th to the 13th wavelet levels.
18
(a) (a)
x(V)x(V)
2
2
0
t(ms)
t(ms)
0
10
10 15
15 20
20 25
25 30
30 35
m
log |log
var
|d
(var
( m) |
)d
|
n
n
0
40
(b) (b)
35 40
10 10
12 12
14 14
2
10
10
12
12
14
(V) (V)
xm =x612
m = 612
14
16
16
(c) (c)
t(ms)
t(ms)
0
10
10 15
15 20
20 25
25 30
30 35
35 40
40
Fig. 9.Fig.
The9.case
Thekcase
= 0.3964:
k = 0.3964:
(a) experimental
(a) experimental
time series
time series
of theofx the
state,
x state,
(b) wavelet
(b) wavelet
coefficient
coefficient
variance,
variance,
(c) time
(c) series
time series
of theofsum
the from
sum 6th
fromto6th
theto12th
thewavelet
12th wavelet
levels.levels.
19
automata has been presented in the case of rule 90 and the concentration of energy by means of
the concept of wavelet variance for the chaotic time-series of a three-state non-linear electronic
circuit was also briefly discussed.
6. References
Campos-Cantón, E.; Murguía, J. S. & Rosu, H. C. (2008). Chaotic dynamics of a nonlinear
electronic converter, International Journal of Bifurcation and Chaos, 18(10), October 2008
(2981-3000), ISSN 0218-1274.
Daubechies, I. (1992). Ten lectures on Wavelets, SIAM, ISBN 10: 0-89871-274-2, Philadelphia, PA.
Feder, J. (1998). Fractals, Plenum Press, ISBN 3-0642-851-2, New York, 1998 (Appendix B).
Halsey T.C.; Jensen M. H.; Kadanoff L. P.; Procaccia I. & Shraiman B. I. (1986). Fractal measures
and their singularities: The characterization of strange sets, Physical Review A, 33(2),
February 1986 (1141-1151), ISSN 1050-2947.
Kantelhardt, J. W.; Zschiegner, S. A.; Koscielny-Bunde, E.; Havlin, S.; Bunde, A. & Stanley,
H. E. (2002). Multifractal detrended fluctuation analysis of nonstationary time series,
Physica A, 316(1-4), December 2002 (87-114), ISSN 0378-4371.
Mallat, S. (1999).A Wavelet Tour of Signal Processing, 2nd. Edition, Academic Press,
ISBN-13:978-0-12-466606-1, San Diego, California, USA.
Manimaran P.; Panigrahi P. K. & Parikh J. C. (2005). Wavelet analysis and scaling properties of
time series, Physical Review E, 72(4) October 2005 (046120, 5 pages), ISSN 1539-3755.
Mejía, M. & Urías, J. (2001). An asymptotically perfect pseudorandom generator, Discrete and
Continuous Dynamical Systems, 7(1), January 2001 (115-126), ISSN 1078-0947.
Murguía, J. S. & Campos-Cantón, E. (2006). Wavelet analysis of chaotic time series, Revista
Mexicana de Física, 52(2), April 2006 (155-162), ISSN 0035-001X.
Murguía, J. S.; Pérez-Terrazas, J. E. & Rosu, H. C. (2009). Multifractal properties of elementary
cellular automata in a discrete wavelet approach of MF-DFA, Europhysics Letters,
87(2), July 2009 (28003, 5 pages), ISSN 0295-5075.
Murguía, J. S.; Mejía-Carlos, M.; Rosu, H. C. & Flores-Eraña, G. (2010). Improvement and
analysis of a pseudo random bit generator by means of CA, International Journal of
Modern Physics C, 21(6), June 2010 (741-756), ISSN 0129-1831.
Nagler J. & Claussen J. C. (2005). 1/ f spectra in elementary cellular automata and fractal
signals, Physical Review E, 71(6) June 2005 (067103, 4 pages), ISSN 1539-3755.
Percival, D. B. & Walden, A. T. (2000) Wavelet Methods for Time Series Analysis, Cambridge
University Press, ISBN 0-52164-068-7, Cambridge.
Rulkov, N. F. (1996). Images of synchronized chaos: Experiments with circuits, CHAOS, 6(3),
September 1996 (262-279), ISSN 1054-1500.
Rulkov, N. F. & Sushchik, M. M.(1997). Robustness of Synchronized Chaotic Oscillations,
International Journal of Bifurcation and Chaos 7(3), March 1997(625-643), ISSN
0218-1274.
Rulkov, N. F.; Afraimovich, V. S.; Lewis, C. T.; Chazottes, J. R. & Cordonet, A. (2001).
Multivalued mappings in generalized chaos synchronization. Physical Review E 64(1),
July 2001(016217 1-11), ISSN 1539-3755.
Sanchez J. R. (2003). Multifractal characteristics of linear one-dimensional cellular automata,
International Journal of Modern Physics C, 14(4), May 2003 (491-499), ISSN 0129-1831.
Staszewski, W. J. & Worden, K. (1999). Wavelet analysis of time series: Coherent structures,
chaos and noise, International Journal of Bifurcation and Chaos, 9(3), September 1999
(455-471), ISSN 0218-1274.
20
Strang, G. & Nguyen, T. (1996). Wavelets and Filter Banks, Wellesley Cambridge Press, ISBN
0-96140-887-1, Wellesley, MA, USA.
Telesca L., Colangelo G., Lapenna V. & Macchiato M. (2004). Fluctuation dynamics in
geoelectrical data: an investigation by using multifractal detrended fluctuation
analysis, Physics Letters A, 332(5-6), November 2004 (398-404), ISSN 0375-9601.
Qian, S. (2002). Introduction to Time-Frequency and Wavelet Transforms, Prentice Hall PTR, ISBN
0-13030-360-7.
Wornell, G. W. & Oppenheim, A. V. (1992). Wavelet-based representations for a class of
self-similar signals with application to fractal modulation, IEEE Transactions on
Information Theory, 38(2), 1992 (785-800), ISSN 0018-9448.
2
Discrete Wavelet Transform for
Nonstationary Signal Processing
Yansong Wang, Weiwei Wu, Qiang Zhu and Gongqi Shen
1. Introduction
In engineering, digital signal processing techniques need to be carefully selected according
to the characteristics of the signals of interest. The frequency-based and time-frequency
techniques have been frequently mentioned in the literature (Cohen, 1995). The frequency-based techniques (FBTs) have been widely used for stationary signal analysis. For nonstationary signals, the time-frequency techniques (TFTs) in common use, such as the short-time Fourier transform (STFT), wavelet transform (WT), ambiguity function (AF) and Wigner-Ville distribution (WVD), are usually employed for extracting transient features of the signals. These techniques use different algorithms to produce a time-frequency representation of a signal.
The STFT uses a standard Fourier transform over several types of windows. Wavelet-based techniques apply a mother wavelet with either discrete or continuous scales to a waveform to resolve the fixed time-frequency resolution issues inherent in the STFT. In
applications, the fast version of the wavelet transform, built on a pair of mirror filters with variable sampling rates, is usually used to reduce the number of calculations and thereby save computing time. AF and WVD are quadratic time-frequency representations that use advanced techniques to combat these resolution difficulties. They have better resolution than the STFT but suffer from cross-term interference and produce results with coarser granularity than wavelet techniques do. Of the wavelet-based techniques, the discrete wavelet transform (DWT), especially its fast version, is usually used for encoding and decoding signals, while wavelet packet analysis (WPA) is successful in signal recognition and characteristic extraction. AF and WVD
with excessive transformation durations are obviously unacceptable in the development
of real-time monitoring systems.
In applications, the FBTs have typically been used in noise and vibration engineering (Brigham, 1988). They provide the time-averaged energy information of a signal segment in the frequency domain, but retain nothing in the time domain. For nonstationary signals such as
vehicle noises, some implementation examples are the STFT (Hodges & Power, 1985), WVD,
smoothed pseudo-WVD (Baydar & Ball, 2001) and WT (Chen, 1998). In particular, the WT, acting as a mathematical microscope in engineering, allows the changing spectral composition of a nonstationary signal to be measured and presented in the form of a time-frequency map, and was thus suggested as an effective tool for nonstationary signal analysis.
This chapter includes three sections. We first briefly introduce the theoretical background of the wavelet-based techniques, such as the CWT, DWT and WPA, as well as the Mallat filtering scheme and algorithm for DWT-based calculation. Secondly, we discuss the advantages
and drawbacks of the DWT-based methods in nonstationary signal processing by comparing
the DWT with other TFTs. Some successful examples of the DWT used for nonstationary
vibration and sound signals in the vehicle engineering will be given in the third section.
2. Theory background
2.1 Continuous wavelet transform
For a function or signal $x(t) \in L^2(\mathbb{R})$, if a prototype or mother wavelet is given as $\psi(t)$, then the wavelet transform can be expressed as:

$$\mathrm{CWT}_x(a,b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} x(t)\,\psi^*\!\Big(\frac{t-b}{a}\Big)\,dt = \langle x(t), \psi_{ab}(t)\rangle \qquad (1)$$
Here a and b change continuously, hence the name continuous wavelet transform (CWT). A family of wavelets $\psi_{ab}(t)$, each of which can be seen as a filter, is defined in (1) by dilating and translating $\psi(t)$. Obviously, b shifts along the time axis, so its role is simple and clear. The variable a acts as a scale factor: changing it alters not only the spectrum of the wavelet function but also the size of its time-frequency window. The local information in the time and frequency domains, which reflects different characteristics of the signal, is extracted by the CWT.
Fig. 1. A mother wavelet $\psi(t)$ and its dilated versions, shown in the time and frequency domains.

If $c_\psi = \int_0^{\infty} |\Psi(\omega)|^2/\omega \,d\omega < \infty$, where $\Psi(\omega)$ is the Fourier transform of $\psi(t)$, then $\psi(t)$ is called an admissible wavelet. In this condition, the original signal x(t) can be recovered from its CWT by:
$$x(t) = \frac{1}{c_\psi} \int_0^{\infty}\!\!\int_{-\infty}^{\infty} \mathrm{CWT}_x(a,b)\,\psi_{ab}(t)\,\frac{da\,db}{a^2} \qquad (2)$$
In the case where $\psi$ is also in $L^1(\mathbb{R})$, the admissibility condition implies that $\Psi(0)=0$: $\psi$ has mean value 0, is oscillating, and decays to zero at infinity; these properties explain the qualification of this function $\psi(t)$ as a wavelet. From the viewpoint of signal processing, $\psi(t)$ acts as a band-pass filter.
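As a numerical illustration, Eq. (1) can be approximated at a single (a, b) point by a Riemann sum. The Mexican hat wavelet below is one admissible, zero-mean choice, and the sample signal and parameters are made-up examples, not values from the text:

```python
import math

def mexican_hat(t):
    """A zero-mean, admissible band-pass wavelet (second derivative of a Gaussian)."""
    return (1.0 - t * t) * math.exp(-t * t / 2.0)

def cwt_point(x, dt, a, b):
    """Riemann-sum approximation of Eq. (1) at a single scale a and shift b;
    x is a sampled signal with sample interval dt (illustrative choices)."""
    total = 0.0
    for n, xn in enumerate(x):
        t = n * dt
        total += xn * mexican_hat((t - b) / a) * dt
    return total / math.sqrt(a)

dt = 0.01
x = [math.sin(2 * math.pi * 2.0 * n * dt) for n in range(400)]  # a 2 Hz tone
print(cwt_point(x, dt, a=0.1, b=2.0))
```

Sweeping a and b over grids of values turns this single-point evaluation into the full scalogram discussed later in the chapter.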
Simply letting $a = a_0^j$, where $a_0 > 0$ and $j \in \mathbb{Z}$, we can discretize $a$. Generally we take $a_0 = 2$; then $\mathrm{WT}_x(j,b) = \frac{1}{\sqrt{2^j}}\int_{-\infty}^{\infty} x(t)\,\psi^*\big(\frac{t-b}{2^j}\big)\,dt$ is a dyadic wavelet transform. The relationship between $\psi(t)$ and its dual wavelet $\tilde\psi(t)$ is

$$x(t) = \sum_{j=-\infty}^{\infty} 2^{-3j/2} \int_{-\infty}^{\infty} \mathrm{WT}_x(j,b)\,\tilde\psi\Big(\frac{t-b}{2^j}\Big)\,db \qquad (3)$$

which holds provided $A \le \sum_{j=-\infty}^{\infty} |\Psi(2^j\omega)|^2 \le B$ for some constants $0 < A \le B < \infty$; this is the stability condition. Obviously, the dual wavelet of a stable function is also stable.
To go a step further, we sample the time axis by taking $b = k b_0$, where $b_0$ should be chosen to ensure the recovery of $x(t)$. When $a$ is changed from $a_0^{j-1}$ to $a_0^j$, the central frequency and the bandwidth of the wavelet are both decreased by a factor of $a_0$, so the sample interval can be increased by the same factor. In this case, the discretized wavelet function is

$$\psi_{jk}(t) = \frac{1}{\sqrt{a_0^j}}\,\psi\Big(\frac{t - k a_0^j b_0}{a_0^j}\Big),$$

and its wavelet transform

$$\mathrm{WT}_x(j,k) = \frac{1}{\sqrt{a_0^j}} \int_{-\infty}^{\infty} x(t)\,\psi^*\Big(\frac{t - k a_0^j b_0}{a_0^j}\Big)\,dt$$

is called the discrete wavelet transform (DWT). From this formula, while time $t$ is still continuous,
we only compute the wavelet transform on a grid in the time-frequency plane, as depicted in
Fig. 2.
Given $d_j(k) = \mathrm{WT}_x(j,k)$, we hope to recover x(t) from a formula like

$$x(t) = \sum_{j}\sum_{k=-\infty}^{\infty} d_j(k)\,\tilde\psi_{jk}(t) \qquad (4)$$
This formula is called a wavelet series, in which $d_j(k)$ are the wavelet coefficients and $\tilde\psi_{jk}(t)$ is the dual wavelet. To recover x(t) using (4), many questions must be answered, such as: are the $\psi_{jk}(t)$ complete enough to describe an arbitrary signal $x(t) \in L^2(\mathbb{R})$; is there information redundancy in the decomposition; and how should the sample intervals of a and b be determined? Daubechies studied these questions thoroughly, and her wavelet frame theory answered them (Daubechies, 1992).
A sequence $\{\phi_n\}$ is a frame of $L^2(\mathbb{R})$ if there exist constants $0 < A \le B < \infty$ such that, for any $x \in L^2(\mathbb{R})$,

$$A\,\|x\|^2 \le \sum_n |\langle x, \phi_n\rangle|^2 \le B\,\|x\|^2 ,$$

and when $A = B$ the frame is said to be tight. A frame defines a complete and stable signal representation, which may also be redundant. When the frame vectors are normalized, the redundancy is measured by the frame bounds A and B. The frame is an orthogonal basis if and only if $A = B = 1$; if $A > 1$ the frame is redundant, and A can be interpreted as a minimum redundancy factor.

If a frame operator S is defined as $Sx = \sum_n \langle x, \phi_n\rangle \phi_n$, then $x = \sum_n \langle x, S^{-1}\phi_n\rangle \phi_n = \sum_n \langle x, \phi_n\rangle S^{-1}\phi_n$, where $\{S^{-1}\phi_n\}$ is the dual frame with bounds $B^{-1}$ and $A^{-1}$. If $A = B$, we have $S^{-1}\phi_n = \frac{1}{A}\phi_n$, so the recovery process in (4) is well founded. In many cases where precise reconstruction is not pursued, we can take $\tilde\psi_{jk}(t) \approx \frac{2}{A+B}\,\psi_{jk}(t)$, giving

$$x(t) = \frac{2}{A+B} \sum_{j,k} \langle x(t), \psi_{jk}(t)\rangle\,\psi_{jk}(t) + e(t),$$

where $e(t)$ is the error and $\|e(t)\| \le \frac{B-A}{B+A}\,\|f\|$.
The only remaining problem is how to construct a wavelet frame. Obviously, the smaller $b_0$ and $a_0$ are, the greater the information redundancy and the easier the reconstruction. On the contrary, $\{\psi_{jk}\}$ will be incomplete when $b_0$ and $a_0$ are large enough, which makes precise recovery of x(t) impossible. For this problem there are two theorems (Daubechies, 1992):

(1) If $\psi_{jk}(t) = a_0^{-j/2}\,\psi(a_0^{-j}t - k b_0)$ is a frame of $L^2(\mathbb{R})$, then the frame bounds satisfy

$$A \le \frac{1}{b_0 \ln a_0}\int_0^{\infty} \frac{|\Psi(\omega)|^2}{\omega}\,d\omega \le B .$$

(2) Define $\beta(\xi) = \sup_{1\le|\omega|\le a_0}\sum_{j=-\infty}^{\infty}|\Psi(a_0^j\omega)|\,|\Psi(a_0^j\omega+\xi)|$ and $\Delta = \sum_{k\ne 0}\big[\beta(2\pi k/b_0)\,\beta(-2\pi k/b_0)\big]^{1/2}$. If $b_0$ and $a_0$ are such that

$$A_0 = \frac{1}{b_0}\Big(\inf_{1\le|\omega|\le a_0}\sum_{j=-\infty}^{\infty}|\Psi(a_0^j\omega)|^2 - \Delta\Big) > 0 \quad\text{and}\quad B_0 = \frac{1}{b_0}\Big(\sup_{1\le|\omega|\le a_0}\sum_{j=-\infty}^{\infty}|\Psi(a_0^j\omega)|^2 + \Delta\Big) < \infty ,$$

then $\{\psi_{jk}(t)\}$ is a frame of $L^2(\mathbb{R})$. These two theorems give the necessary and the sufficient conditions, respectively, for constructing a wavelet frame.
In some cases, the wavelet frame $\{\psi_{jk}(t)\}$ is orthogonal or independent; the more correlated the functions are, the smaller the subspace spanned by the frame is, which is useful in noise reduction. When $b_0$ and $a_0$ are close to 0 and 1, respectively, the functions of the frame are strongly correlated and behave like a continuous wavelet. In other cases, redundancy or dependency is avoided as much as possible, so $\psi$, $b_0$ and $a_0$ are chosen to compose an orthogonal basis.
2.3 Multiresolution analysis and Mallat algorithm
Multiresolution analysis (MRA) provides an elegant way to construct wavelets with different properties. A sequence $\{V_j\}_{j\in\mathbb{Z}}$ of closed subspaces of $L^2(\mathbb{R})$ is an MRA if the following six properties are satisfied:
1. $\forall (j,k) \in \mathbb{Z}^2,\ f(t) \in V_j \Leftrightarrow f(t - 2^j k) \in V_j$;
2. $\forall j \in \mathbb{Z},\ V_{j+1} \subset V_j$;
3. $\forall j \in \mathbb{Z},\ f(t) \in V_j \Leftrightarrow f(t/2) \in V_{j+1}$;
4. $\lim_{j\to\infty} V_j = \bigcap_{j=-\infty}^{\infty} V_j = \{0\}$;
5. $\lim_{j\to-\infty} V_j = \mathrm{Closure}\big(\bigcup_{j=-\infty}^{\infty} V_j\big) = L^2(\mathbb{R})$;
6. there exists a function $\theta(t)$ such that $\{\theta(t-n)\}_{n\in\mathbb{Z}}$ is a Riesz basis of $V_0$.

Fig. 3. The nested subspaces $V_0 \supset V_1 \supset V_2 \supset V_3$ and the detail spaces $W_0$, $W_1$, $W_2$.
basis of $V_j$ and $V_{j+1}$ differs only by a factor of 2 in scale. We only discuss how to construct an orthogonal wavelet basis here, so a series of spaces $W_j$ satisfying $V_j \oplus W_j = V_{j-1}$ is introduced. By this idea, $W_j \perp V_j$, and the function space can be decomposed as $L^2(\mathbb{R}) = \bigoplus_{m=-\infty}^{\infty} W_m$ and $V_0 = W_1 \oplus W_2 \oplus \cdots$.

If $\theta(t)$, whose Fourier transform is $\hat\theta(\omega)$, does not generate an orthogonal basis of $V_0$, from the above theorem we can compute

$$\hat\varphi(\omega) = \frac{\hat\theta(\omega)}{\big(\sum_{k\in\mathbb{Z}} |\hat\theta(\omega + 2k\pi)|^2\big)^{1/2}},$$

and $\{\varphi(t-n)\}_{n\in\mathbb{Z}}$ must then be orthogonal. We call $\varphi(t)$ the scale function; it satisfies the two-scale relation

$$\varphi\Big(\frac{t}{2}\Big) = \sqrt{2}\,\sum_{k=-\infty}^{\infty} h(k)\,\varphi(t-k),$$

or, in the frequency domain, $\sqrt{2}\,\hat\varphi(2\omega) = H(\omega)\,\hat\varphi(\omega)$, where $H(\omega) = \sum_{k=-\infty}^{\infty} h(k)\,e^{-ik\omega}$. If we take $\{\psi(\frac{t}{2}-n)\}_{n\in\mathbb{Z}}$ as an orthogonal basis of $W_1$, then, since $V_0 = V_1 \oplus W_1$, the wavelet satisfies

$$\psi\Big(\frac{t}{2}\Big) = \sqrt{2}\,\sum_{k=-\infty}^{\infty} g(k)\,\varphi(t-k),$$

with $\int \psi(t)\,dt = 0$. It follows that $H(0) = \sqrt{2}$, $\sum_k g(k) = 0$ and $G(0) = 0$; hence H is a low-pass filter and G a band-pass filter. From the orthogonality relation $\sum_{k\in\mathbb{Z}} |\hat\varphi(\omega + 2k\pi)|^2 = 1$, the filters must satisfy

$$|H(\omega)|^2 + |H(\omega+\pi)|^2 = 2 \qquad (5)$$

$$|G(\omega)|^2 + |G(\omega+\pi)|^2 = 2 \qquad (6)$$

$$H(\omega)G^*(\omega) + H(\omega+\pi)G^*(\omega+\pi) = 0 \qquad (7)$$
Based on the filters h and g, the Mallat algorithm computes the DWT recursively:

$$x_j(k) = \sum_{n=-\infty}^{\infty} x_{j-1}(n)\,h(n-2k) \qquad (8)$$

$$d_j(k) = \sum_{n=-\infty}^{\infty} x_{j-1}(n)\,g(n-2k) \qquad (9)$$

$$x_{j-1}(k) = \sum_{n=-\infty}^{\infty} x_j(n)\,h(k-2n) + \sum_{n=-\infty}^{\infty} d_j(n)\,g(k-2n) \qquad (10)$$

Among them, (8) and (9) are for decomposition and (10) is for reconstruction. By decomposing recursively, as in Fig. 4(a), the approximation signal $x_j(k)$ and the detail signal $d_j(k)$ are computed successively.
Fig. 4. The Mallat algorithm: (a) decomposition, in which xj-1(k) is filtered by h(k) and g(k) and downsampled to give xj(k) and dj(k); (b) reconstruction, in which xj(k) and dj(k) are upsampled, filtered by h(k) and g(k), and summed to recover xj-1(k).
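Equations (8)-(10) can be illustrated with the Haar filter pair; the sketch below is a minimal one-level version, and the filters and test signal are illustrative choices, not the chapter's settings:

```python
import math

# Haar filter pair: h is the low-pass and g the band-pass filter.
h = [1 / math.sqrt(2), 1 / math.sqrt(2)]
g = [1 / math.sqrt(2), -1 / math.sqrt(2)]

def decompose(x):
    """Eqs. (8)-(9): approximation and detail coefficients, downsampled by two."""
    a = [sum(x[2*k + i] * h[i] for i in range(2)) for k in range(len(x) // 2)]
    d = [sum(x[2*k + i] * g[i] for i in range(2)) for k in range(len(x) // 2)]
    return a, d

def reconstruct(a, d):
    """Eq. (10): upsample, filter with h and g, and sum."""
    x = [0.0] * (2 * len(a))
    for n in range(len(a)):
        for i in range(2):                      # Haar filter support
            x[2*n + i] += a[n] * h[i] + d[n] * g[i]
    return x

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
a, d = decompose(x)
xr = reconstruct(a, d)
print(all(abs(u - v) < 1e-12 for u, v in zip(x, xr)))  # perfect reconstruction
```

Feeding the approximation output back into `decompose` gives the recursive cascade of Fig. 4(a).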
The relationship between the time domain and the frequency domain is important in signal analysis. The Fourier transform and its inverse connect the frequency-domain features with the time-domain features. Their definitions are as below:
$$X(f) = \int_{-\infty}^{\infty} x(t)\,e^{-j2\pi f t}\,dt \qquad (11)$$

$$x(t) = \int_{-\infty}^{\infty} X(f)\,e^{j2\pi f t}\,df \qquad (12)$$
In stationary signal analysis, one may use the Fourier transform and its inverse to establish the mapping between the time and frequency domains. However, in practical applications the Fourier transform is not the best tool for signal analysis, because most engineering signals, such as engine vibration and noise signals, are nonstationary and time-varying. For these signals, although the frequency elements can be observed from the frequency spectrum, the times at which frequencies occur and the way frequency changes over time cannot be acquired. For further research on these signals, time-frequency descriptions are introduced. Fig. 5 shows three descriptions of a linear frequency modulation signal generated with the Matlab toolbox: (a) is the frequency-domain description, which loses the time information; (c) is the time-domain description, which loses the frequency information; (b) is the time-frequency description, which clearly shows the evolution of frequency over time.
Fig. 5. (a) Frequency-domain, (b) time-frequency and (c) time-domain descriptions of a linear frequency modulation signal.
The basic idea of time-frequency analysis is to develop a joint function combining the time and frequency factors. Time-frequency analysis, which can describe the signal traits on a time-frequency plane, has become an important research field. Many time-frequency methods have been presented, which can be divided into three types: linear, quadratic and nonlinear. The STFT and WT belong to the linear type, while the Wigner-Ville distribution (WVD) and pseudo Wigner-Ville distribution (PWVD) belong to the quadratic type. This section compares the STFT, WVD, PWVD and WT to show the advantage of the WT.
The basic idea of the STFT, presented by Gabor in 1946, is to cut out the signal with a window function, inside which the signal can be regarded as stationary, and to identify the frequency elements within the window by the Fourier transform; the window is then moved along the time axis to obtain the evolution of frequency over time. This is the time-frequency analysis process of the STFT, and the STFT of the signal x(t) can be described as:

$$\mathrm{STFT}_x(t,f) = \int_{-\infty}^{\infty} x(t')\,g^*(t'-t)\,e^{-j2\pi f t'}\,dt' \qquad (13)$$
The WVD, which was presented by Wigner in research on quantum mechanics in 1932 and applied to signal processing by Ville later, satisfies many mathematical properties expected of a time-frequency analysis. The WVD of the signal x(t) can be described as:

$$\mathrm{WD}_x(t,f) = \int_{-\infty}^{\infty} x(t+\tau/2)\,x^*(t-\tau/2)\,e^{-j2\pi f\tau}\,d\tau \qquad (14)$$

To suppress the disturbance of cross terms in the WVD, the PWVD, which is equivalent to smoothing the WVD, is introduced. The PWVD of the signal x(t) can be described as:

$$\mathrm{PWVD}_x(t,f) = \int_{-\infty}^{\infty} h(\tau)\,x(t+\tau/2)\,x^*(t-\tau/2)\,e^{-j2\pi f\tau}\,d\tau \qquad (15)$$
Fig. 6. Time-frequency representations of the test signal computed by (a) the STFT, (b) the WVD, (c) the PWVD and (d) the WT.
$$[M]\{\ddot z\} + [C]\{\dot z\} + [K]\{z\} = [P]\{I(t)\} \qquad (16)$$

where $[M]$, $[C]$, $[K]$ are the mass, damping and stiffness matrices, respectively; $\{I(t)\}$ is the road roughness vector; $[P]$ is the transfer matrix from the road roughness vector to the force excitation; and $\{z\}$ is the system response vector.
$$V(t) = \begin{cases} a_1 t, & 0 \le t \le v_m/a_1 \\ v_m - a_2\,(t - v_m/a_1), & v_m/a_1 < t < v_m/a_1 + v_m/a_2 \end{cases} \qquad (17)$$

where $v_m$ is the maximum speed, $a_1$ the acceleration and $a_2$ the deceleration.
The above process was called AAB process. Using the Runge-Kutta Method, the time
series of road roughness and vehicle response were calculated by Eqs. (16) and (17).
Fig. 9. 2D and 3D scalograms resulting from the CWT during the AAB process: (a) the vertical vibration of the driver seat; (b), (c) and (d) the vertical, pitch and roll vibrations of the vehicle body; (e) the vertical vibration of the front axle; (f) the road roughness of the right-rear wheel.
The CWT and DWT are performed using the Mallat algorithm in the Matlab toolbox. The selected calculation parameters are: the Daubechies wavelet with a filter length of seven, and the scaling factor a = 1-350, i.e., the frequency range 0.404-138.5 Hz. Fig. 9 (a)-(f) shows the acceleration scalograms, obtained from the CWT, of the seat, vehicle body, axle and road roughness during the AAB process, respectively. As seen from Fig. 9, the worst ride performance of the vehicle occurred at 8 s during the AAB, and there was a small time delay in the vibration transfer from the road to the vehicle system. In the accelerating process, the vibration energies of the vehicle grow and move to the higher-frequency region, and their frequency bands become broader; the reverse occurs in the braking process. As a rule, these phenomena of energy flow are transmitted to the other levels through the suspension system.
In view of the vehicle design, the ride comfort of the passenger seat is the most important.
Comparing Fig. 9 (a)-(f), the energy of the road excitation has been greatly restrained by the suspension system of the vehicle. However, similar time-frequency traits can be seen in (a), (b) and (c), and the ride comfort of the seat deteriorates suddenly at a certain running speed.
speed. That means that the vertical and the pitching movement of the vehicle body have
more effect on the vibration of seats than the rolling movement, and that the vibration
energy of the vehicle body flowed into the resonance frequency region of the seat vibration
system during the AAB process.
From the above findings, the WT can provide the time-frequency map of transient energy
flow of the examined points of interest in the vehicle vibration system. Thus, the WT may
be used in vehicle vibration system design, especially for the transient working cases.
4.2 DWT-based denoising for nonstationary sound signals
In sound quality evaluation (SQE) engineering, distortion of the measured sounds by certain additive noises occurs inevitably; these noises come from both the ambient background and the hardware of the measurement system, so the signal needs to be denoised. In former research, we found that the unwanted noises are mainly white random noises, distributed over a wide frequency band but with small amplitudes. Some techniques for white noise suppression in common use, such as the least squares, spectral subtraction and matching pursuit methods, and the wavelet threshold method, have been used successfully in various applications. The wavelet threshold method in particular has proved very powerful in the denoising of nonstationary signals. Here a DWT-based shrinkage denoising technique was applied for the SQE of vehicle interior noise.
Sample vehicle interior noises were prepared using the binaural recording technique, with the following data acquisition parameters: signal length, 10 s; sampling rate, 22 050 Hz. The measured sounds had been distorted by white random noises, so the wavelet threshold method was applied. This technique is performed in three steps: (a) decomposition of the signal, (b) determination of the threshold and nonlinear shrinking of the coefficients, and (c) reconstruction of the signal. Mathematically, the soft-threshold signal is sign(x)(|x|-t) if |x|>t, and 0 otherwise, where t denotes the threshold. The selected parameters were: Daubechies wavelet db3, 7 levels, and a soft universal threshold equal to the square root of 2 log(length(f)). As an example, a denoised interior signal and the corresponding spectrum are shown in Fig. 10. It can be seen that the harmonic and white-noise components of the sample interior noise are well controlled. The wavelet shrinkage denoising technique is effective and sufficient for denoising vehicle noises.
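The three denoising steps can be sketched as follows; this toy version uses a one-level Haar transform and a synthetic signal in place of the db3, 7-level setting and the measured sounds described above:

```python
import math, random

def soft(c, t):
    """Soft thresholding: sign(c)(|c| - t) if |c| > t, else 0."""
    return math.copysign(abs(c) - t, c) if abs(c) > t else 0.0

def denoise(y):
    """Step (a) one-level Haar decompose, (b) shrink details, (c) reconstruct."""
    a = [(y[2*i] + y[2*i+1]) / math.sqrt(2) for i in range(len(y) // 2)]
    d = [(y[2*i] - y[2*i+1]) / math.sqrt(2) for i in range(len(y) // 2)]
    t = math.sqrt(2 * math.log(len(y)))           # universal threshold
    d = [soft(c, t) for c in d]
    out = []
    for ai, di in zip(a, d):
        out += [(ai + di) / math.sqrt(2), (ai - di) / math.sqrt(2)]
    return out

random.seed(0)
clean = [5 * math.sin(2 * math.pi * n / 64) for n in range(256)]
noisy = [c + random.gauss(0, 0.5) for c in clean]
den = denoise(noisy)
err_noisy = sum((u - v) ** 2 for u, v in zip(clean, noisy))
err_den = sum((u - v) ** 2 for u, v in zip(clean, den))
print(err_den < err_noisy)  # shrinkage removes most of the noise energy
```

The universal threshold here assumes unit noise variance, as the formula quoted in the text does; in practice the detail coefficients are usually rescaled by a noise estimate first.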
Based on the denoised signals, the SQE for vehicle interior noise was performed by the wavelet-based neural-network (WT-NN) model, which will be described in detail in the next section; the overall schematic of the WT-NN model is shown in Fig. 11. After the model was well trained, the signals were fed to the trained WT-NN model and to the Zwicker loudness model, which served as a reference. It can be seen that the predicted specific loudness and sharpness in Fig. 12 are consistent with those from the Zwicker models. The wavelet threshold method can effectively suppress the white noises in a nonstationary sound signal.
Fig. 10. Comparison of the interior noises (left panel) and their spectra (right panel) before
and after the wavelet denoising model.
4.3 DWT for nonstationary sound feature extraction
In the above section, we mentioned a new model called WT-NN used for the SQE of vehicle interior noise, shown in Fig. 11. A wavelet-based, 21-point model was used as the pre-processor of the WT-NN SQE model for extracting the features of the nonstationary vehicle interior noise. To explain this newly proposed model in detail, here we extend it to another kind of noise: passing vehicle noise.
Fig. 11. Schematic presentation of the data inputs and outputs to the neural network
Sample passing vehicle noises were prepared identically to the above vehicle interior noises. The measured signals were denoised using the wavelet threshold method mentioned before. Based on the pass-by vehicle noises, the 21-point feature extraction model for pass-by noises was designed by combining a five-level DWT and a four-level WPA, as shown in Fig. 13. It was used to extract features of the pass-by noises. The results are shown in Fig. 14.
Fig. 12. Comparisons of specific loudness (left panel) and sharpness (right panel) between
(a) the Zwicker model (upper), and (b) the WT-NN model (lower)
Fig. 13. Twenty-one-point wavelet-based feature extraction model for pass-by noise analysis
Fig. 14. Feature of the pass-by noise in time-frequency map extracted by the 21-point model
As the inputs of the WT-NN models, the above wavelet analysis results provide the time-frequency features of the signals. The SQM (sound quality metrics) of the pass-by noise, taken as the outputs, come from the psychoacoustical model. Loudness was adopted, as it is related to the SQE of vehicle pass-by noises. The output SQM is expressed as

SQM = [TL SL]^T

where the vectors TL and SL denote the total and specific values of loudness, respectively. After training the WT-NN model, the signals were fed into the Zwicker loudness model and the trained WT-NN model. It can be seen that the predicted specific loudness in Fig. 15 coincides well with that from the Zwicker model; thus, as the pre-processor of the WT-NN model, the newly proposed wavelet-based, 21-point model can extract the features of a nonstationary signal precisely.
Fig. 15. Specific loudness comparison between (left panel) the Zwicker model, and (right
panel) the WT-NN model.
$$L_i = 10\,\log\Big(\frac{1}{m_i}\sum_{j=1}^{m_i} p_{ij}^2 \Big/ p_{\mathrm{ref}}^2\Big) \qquad (18)$$

where $L_i$ is the $i$th band SPL, $m_i$ is the total number of points of the $i$th band signal, $p_{ij}$ is the $i$th band sound pressure at the $j$th point, and $p_{\mathrm{ref}} = 20\times10^{-6}$ Pa is the reference sound pressure.
Compared with the measured results, the errors of the band SPLs in Fig. 21 are within [-0.3, +0.2] dB, much less than the band error limit of 1 dB defined in the IEC 651 standard. The total SPL is also computed by Eq. (19):

$$L_T = 10\,\log\Big(\sum_i 10^{L_i/10}\Big) \qquad (19)$$

It is exactly the same as the measured value, 83.7 dB. In view of the A-weighted total SPL, the
measured value is 66.1 dB(A) and the calculated value is 66.2 dB(A). Furthermore, to demonstrate the transient characteristics of the DWT-OBA algorithm, the time-varying A-weighted total SPLs of the interior vehicle noise are computed using the DWT-OBA and MF-OBA algorithms, respectively. MF-OBA is a self-designed multi-filter octave-band analysis method also used for SQE, adopted here as a reference. The selected calculation parameters are: time frame length, 200 ms; frame count, 25; and A-weightings, [-56.7 -39.4 -26.2 -16.1 -8.6 -3.2 0 1.2 1.0 -1.1] dB, for octave band numbers one to 10. The results shown in Fig. 17 imply a very good transient characteristic of the DWT-OBA.
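Equations (18) and (19) can be sketched directly. The band pressure samples below are hypothetical, not the measured vehicle data, and the reference pressure is assumed to enter squared, as in the standard SPL definition:

```python
import math

P_REF = 20e-6  # reference sound pressure, Pa

def band_spl(p):
    """Eq. (18): SPL of one band from its m sampled pressures p_j (Pa)."""
    return 10 * math.log10(sum(pj * pj for pj in p) / len(p) / P_REF ** 2)

def total_spl(levels):
    """Eq. (19): total SPL from the individual band SPLs L_i."""
    return 10 * math.log10(sum(10 ** (L / 10) for L in levels))

# Hypothetical band pressure samples (not the measured vehicle data).
bands = [[0.02 * math.sin(2 * math.pi * f * n / 1000) for n in range(1000)]
         for f in (31.5, 63.0, 125.0)]
spls = [band_spl(p) for p in bands]
print([round(L, 1) for L in spls], round(total_spl(spls), 1))
```

Note that the total in Eq. (19) sums powers, not decibels, which is why the total SPL always exceeds the largest single band SPL.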
To examine the effectiveness of the presented DWT-OBA algorithm for more practical uses, we applied it and the self-designed filtering algorithm to the measured exterior vehicle noise. The exterior noise signal was pre-processed following the DWT denoising procedure. The A-weighted band SPLs of the exterior vehicle noise calculated from the filtering and DWT algorithms, as well as the measurement results, are shown in Figs. 18 and 19, and the calculated results are summarized in Table 1.
Fig. 16. The calculation flowchart for DWT octave-band analysis of a sound signal: sound signal, resampling by CoolEdit, DWT decomposition, DWT reconstruction, octave spectra, total SPL.
Fig. 17. Calculated time-varying A-weighted total SPLs of the interior vehicle noise by using
the newly proposed DWT and filtering algorithms.
Fig. 18. Linear SPL comparison of the octave-band analysis of the interior vehicle noise: (a)
the measured result, (b) SPL values calculated by the db35 filter bank, and (c) the band SPL
errors
Fig. 19. A-weighted octave-band SPLs of the exterior vehicle noise from (a) the
measurement, (b) self-designed filtering algorithm, and (c) the DWT algorithm.
Octave band number              1       2      3      4      5      6      7      8      9      10
Measured band SPLs (dB)        -13.2   21.0   56.8   55.9   56.7   55.9   58.8   35.4   25.5   20.2
Filtering band SPL errors (dB) -0.09    0.48   0.05   0.37   0.19   0.12  -0.07   0.27   0.17   0.09
DWT band SPL errors (dB)       -0.008   0.25  -0.08   0.24   0.18   0.04   0.02   0.03  -0.01
A-weighted total SPLs (dB)      64.0 (measured), 64.1476 (filtering), 64.0953 (DWT)
Error percentage of total SPLs  0.2306% (filtering), 0.1489% (DWT)
Table 1. Summary of the calculated A-weighted SPLs of the exterior vehicle noise from different methods
It can be seen that, for the exterior vehicle noise, the A-weighted SPLs from the different methods have almost the same octave patterns in the frequency domain. From Table 1, the maximum errors of the filtering and DWT band SPLs are 0.48 and 0.25 dB, respectively, both of which occur in the octave band with a center frequency of 32 Hz. These errors make very small contributions to the total SPL values, due to the special frequency characteristics of the vehicle noises. The octave-band SPLs from the presented methods satisfy the error limit of 1 dB published in the IEC 651 standard. The error percentages of the A-weighted total SPLs are 0.2306% and 0.1489% for the filtering and DWT algorithms, respectively. The above comparisons indicate that the presented DWT-OBA algorithm is effective and feasible for the sound quality estimation of vehicle noises.
4.5 DWT pattern identification for engine fault diagnosis
In Sections 4.2 and 4.3 we proposed a new model called WT-NN, in which the wavelet-based, 21-point feature extraction model was designed as the pre-processor. Here we apply this model to engine fault diagnosis (EFD): the so-called EFD WT-NN model.
To establish the EFD WT-NN model, a database including the engine fault phenomena and their corresponding sound intensity signals first needs to be built. Based on the 2VQS type of EFI engine mounted on the GW-II engine test bed, the sound intensities under different failure conditions were measured using the two-microphone recording technique recommended by the standard ISO 9614. The experimental equipment is arranged as shown in Fig. 20. The measured signals were denoised using the wavelet threshold method.
Fig. 21. The 21-point time-frequency feature of the engine fault state that the ECU does not
receive the knock signals (meshing point no.2)
Engine working state                                    Target output
Normal idling state of the engine (S0)                  [0 0 0 0 0 0 0 0 0]
The nozzle in the first cylinder doesn't work (S1)      [0 1 0 0 0 0 0 0 0]
The second and third cylinders do not work (S2)         [0 0 1 0 0 0 0 0 0]
The electric motor doesn't work (S3)                    [0 0 0 1 0 0 0 0 0]
ECU does not receive the Hall sensor signals (S4)       [0 0 0 0 1 0 0 0 0]
The throttle orientation potentiometer is broken (S5)   [0 0 0 0 0 1 0 0 0]
ECU does not receive the knock signals (S6)             [0 0 0 0 0 0 1 0 0]
The 5-voltage power of the Hall sensor is broken (S7)   [0 0 0 0 0 0 0 1 0]
ECU does not receive the oxygen sensor signal (S8)      [0 0 0 0 0 0 0 0 1]
$$S_{fs} = \begin{cases} 0, & 0 \le S_v < 0.45 \\ \text{uncertain}, & 0.45 \le S_v \le 0.55 \\ 1, & 0.55 < S_v \le 1.0 \end{cases}$$
where $S_{fs}$ denotes the fault state of the engine and $S_v$ denotes the calculated output values of the WT-NN model. It can be seen that the diagnosis results in Table 4 are exactly as expected.
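The thresholding of the network outputs can be sketched as follows; the 0.45/0.55 limits follow the rule above, and the example vector is the S6 row of Table 4:

```python
def fault_flag(sv):
    """Map one network output Sv to a fault flag using the 0.45/0.55 limits."""
    if sv < 0.45:
        return 0
    if sv <= 0.55:
        return "uncertain"
    return 1

def diagnose(outputs):
    """Names of the engine states whose output is flagged as a fault."""
    return [f"S{i}" for i, sv in enumerate(outputs) if fault_flag(sv) == 1]

# The S6 row of Table 4: only the sixth state exceeds the 0.55 limit.
sv = [0, 0, 0.009, 0.002, 0, 0.002, 0.979, 0, 0.030]
print(diagnose(sv))  # -> ['S6']
```

Outputs falling in the 0.45-0.55 band are reported as uncertain rather than forced into a yes/no decision, which matches the three-way rule stated above.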
We obtained similar comparison results from simulations using engine noise signals at other measuring points. We found that, for the sample signals used in the NN learning, the outputs of the BP network are in general conformity with the desired results; when the input data deviate from the samples within a certain range, the NN output tends to approach the sample failure characteristics. For a real failure diagnosis, one may select measurement points under the guidance of the designer of the NN diagnosis system. According to the above findings, the wavelet-based model may be used to diagnose engine failures in vehicle EFD engineering.
State   Model output                                                           Result
S0      0      0      0      0.001  0      0.009  0      0      0             S0
S1      0.164  0.610  0.022  0      0.016  0      0      0      0             S1
S2      0      0      0.989  0      0      0      0      0      0             S2
S3      0      0      0      0.987  0.034  0.013  0      0.008  0             S3
S4      0.027  0.023  0      0.001  0.970  0      0      0      0.023         S4
S5      0      0      0      0      0.002  0.995  0.001  0      0.004         S5
S6      0      0      0.009  0.002  0      0.002  0.979  0      0.030         S6
S7      0      0      0      0.002  0      0.011  0      0.885  0             S7
S8      0      0      0      0      0.085  0      0.088  0      0.976         S8
Table 4. The outputs of the WT-NN model and diagnosis results at point P1
5. Acknowledgments
This work was supported by the NSFC (grant no. 51045007), and partly supported by the
Program for Professor of Special Appointment (Eastern Scholar) at Shanghai Institutions of
Higher Learning, China.
3
Transient Analysis and Motor Fault Detection
using the Wavelet Transform
Jordi Cusidó i Roura and José Luis Romeral Martínez
1. Introduction
Induction motors are the most common means of converting electrical power to mechanical
power in the industry. Induction machines were typically considered robust machines;
however, this perception began to change toward the end of the last decade as low-cost motors
became available on the market. Nowadays the most widely used induction motor in the
industry is a machine which works at the limits of its mechanical and physical properties. A
good diagnosis system is mandatory in order to ensure proper behavior in operation.
The history of fault diagnosis and protection is as old as the machines themselves.
Initially, manufacturers and users of electrical machines used to rely on simple protection
against, for instance, overcurrent, overvoltage and earth faults to ensure safe and reliable
operation of the motor. However, as the tasks performed by these machines became more
complex, improvements were also sought in the field of fault diagnosis. It has now become
essential to diagnose faults at their very inception, as unscheduled machine downtime can
upset deadlines and cause significant financial losses.
The major faults of electrical machines can be broadly classified as follows:
Electrical faults (Singh et al., 2003):
1. Stator faults resulting in the opening or shorting of one or more stator windings;
2. Abnormal connection of the stator windings;
Mechanical faults:
3. Broken rotor bars or rotor end-rings;
4. Static and/or dynamic air-gap irregularities;
5. Bent shaft (similar to dynamic eccentricity), which can result in friction between the
rotor and the stator, causing serious damage to the stator core and the windings;
6. Bearing and gearbox failures.
However, as is introduced in the basic bibliography by Devaney (Devaney et al., 2004), the
effect of bearing faults is, in most cases, similar to eccentricities and has the same effects on
the motor.
The operation during faults generates at least one of the following symptoms:
1. Unbalanced air-gap voltages and line currents
2. Increased torque pulsations
3. Decreased average torque
4. Increase in losses and decrease in efficiency
5. Excessive heating
6. Appearance of vibrations
Many diagnostic methods have been developed so far for the purpose of detecting such
fault-related signals. These methods come from different types and areas of science and
technology, and can be summarized as follows (Jardine et al., 2006) (Meador, 2003):
1. Electromagnetic field monitoring by means of search coils, and coils placed around
motor shafts (axial flux-related detection). This is associated with the capacity for
capturing the presence of magnetic fields around an IM. Field evaluation must provide
information about motor-operation states.
2. Temperature measurements: Temperature is a typical second-order effect of operation
conditions. Induction motors typically have an operational temperature range, defined
on the motor nameplate, which is associated with the tests performed. Any fault-operation
condition shows a temperature increment. By performing a temperature
analysis, a first approach to fault conditions can be made.
3. Infrared recognition: This is used to evaluate the state of the material, especially for
bearings. This cannot be performed in an online system.
4. Radio frequency (RF) emissions monitoring: Radio frequency is a second-order effect of
fault conditions which is currently used for gearbox diagnosis.
5. Vibration monitoring: This is the typical method for fault diagnosis in industrial
applications; it achieves good results for bearing analysis although it presents some
deficiencies with electrical faults and rotor faults.
6. Chemical analysis: This is used to analyze bearing grease; it is only used with big
motors and not with the typical small ones.
7. Acoustic noise measurement: This is a new trend in the field of gearbox failure (Tahori
et al., 2007).
8. Motor current signature analysis (MCSA), which is explained further below.
9. Model-based artificial intelligence and neural-network-based techniques: These are new
approaches which combine multi-modal data acquisition with advanced signal-processing techniques.
Motor current signature analysis (MCSA) is one of the most widely used techniques for fault
detection analysis in induction machines. It is based on the Fast Fourier transform (FFT),
which is currently considered the standard.
Finally, other pieces of work introduce all the motor faults (Benbouzid et al., 2000) (Thomson
et al., 2003) at the same time, typifying the different harmonic effects of every fault.
The classic MCSA method works well under constant load torque and with high-power
motors, but difficulties emerge when it is applied to pulsating load torques, in applications
such as mills, freight elevators and reciprocating compressors. On the other hand, the results
of the common signal processing method (typically the FFT) may vary according to the
application, especially during transient states. In the cases described
above, the FFT algorithm is likely to cause errors due to the averaging of spectral
amplitudes during sampling time.
The need to find other signal processing techniques for non-stationary signals becomes,
therefore, essential. Time-frequency transforms such as the short-time Fourier transform or
the wavelet analysis (Ukil et al., 2006) (Valsan et al., 2008) have been successfully used with
electrical systems in order to evaluate faults during transient states. The detection of
induction motor faults using the wavelet transform has also been introduced (Kar et al.,
2006), especially in the case of noise or vibration signals. Interesting approaches have been
presented recently (Calis et al., 2007) (Bacha et al., 2008) which introduce the analysis and
monitoring of fluctuations of motor current zero-crossing instants and the use of artificial
intelligence solutions such as neural networks. A recent publication (Niu et al., 2008)
presents an interesting approach in which the DWT is applied to the evaluation of different
statistical feature extraction techniques. In that paper, different statistical methodologies
are applied over wavelet decomposition details, showing interesting results for specific
details. However, the feature extraction was done without taking the motor fault behavior
into consideration.
This chapter proposes a different approach that begins with a detailed analysis of motor
current decomposition for the further application of DWT at specific faulty bands. An
energy estimation of the analyzed bands is proposed to define fault factors.
PSD (power spectral density) (Ayhan et al., 2003) describes the distribution of power along
frequencies. A similar concept applied to the wavelet transform could be useful for
diagnosing a motor under variable load torque. The energy estimation of specific details
improves the diagnosis, as it introduces a specific fault factor.
This chapter starts with a description of the theoretical basis of MCSA and the proposed
signal processing techniques, followed by a presentation of experimental results. The use
of the wavelet transform improves fault detection, and the energy estimation provides the
fault factor needed to implement an online monitoring system. Conclusions are presented in
the last section.
2. Basic theory
2.1 Motor current signature analysis (MCSA)
This method focuses its efforts on the spectral analysis of the stator current and has been
successfully introduced for the detection of broken rotor bars (Deleroi, 1984), bearing
damage and dynamic eccentricities (Devaney et al., 2004) caused by a variable air gap due to
a bent shaft or a thermal bow. The procedure consists in evaluating the relative amplitudes
of the different current harmonics which appear as a result of the fault.
The frequencies related to the different faults in the induction machine, such as air-gap
eccentricity, broken rotor bars (Figure 1), and the effect of bearing damage, are expressed by
equations (1), (2) and (3), respectively (Tahori et al., 2007)
f_ecc = f_1 [ 1 ± m (1 − s)/(p/2) ]                      (1)

f_brb = f_1 [ m (1 − s)/(p/2) ± s ]                      (2)

f_{i,o} = (n/2) f_r [ 1 ∓ (bd/pd) cos β ]                (3)

where f_r is the rotational frequency of the rotor, f_1 is the supply frequency, m is the
harmonic order, s is the slip and p is the number of poles. In the bearing fault equation,
n is the number of balls, and bd, pd and β are the constructive bearing parameters (ball
diameter, pitch diameter and contact angle, Figure 2).
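As a quick numeric sketch of equations (1)-(3), the characteristic frequencies can be computed for assumed operating values (f1 = 50 Hz, s = 0.06, p = 4, m = 1); the bearing geometry below is purely illustrative, not taken from the test motor.

```python
import math

# Assumed operating point; not measured values from the chapter's test rig.
f1, s, p, m = 50.0, 0.06, 4, 1

# Eq (1): eccentricity sidebands, and eq (2): broken-rotor-bar components.
f_ecc = [f1 * (1 + sg * m * (1 - s) / (p / 2)) for sg in (+1, -1)]
f_brb = [f1 * (m * (1 - s) / (p / 2) + sg * s) for sg in (+1, -1)]

# Eq (3): bearing race frequencies, with illustrative geometry (n balls,
# ball diameter bd, pitch diameter pd, contact angle beta).
fr = f1 * (1 - s) / (p / 2)  # rotor rotational frequency
n, bd, pd, beta = 8, 7.9e-3, 34.5e-3, 0.0
f_io = [n / 2 * fr * (1 + sg * bd / pd * math.cos(beta)) for sg in (+1, -1)]

print(f_ecc)  # [73.5, 26.5]
print(f_brb)  # [26.5, 20.5]
```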
Fig. 1. Stator current spectrum for an induction motor with broken bars. Base frequency of
50 Hz
i_R(t) = √2 I_R cos(2π f_s t) + Σ_{n=0}^{N} √2 I_{Rn} cos(2π f_n t − φ_{Rn})                    (4)

i_S(t) = √2 I_S cos(2π f_s t − 2π/3) + Σ_{n=0}^{N} √2 I_{Sn} cos(2π f_n t − φ_{Sn} − 2π/3)      (5)

i_T(t) = √2 I_T cos(2π f_s t + 2π/3) + Σ_{n=0}^{N} √2 I_{Tn} cos(2π f_n t − φ_{Tn} + 2π/3)      (6)

where I_R = I_S = I_T = I are the RMS values of the fundamental component of the line currents,
I_{Rn}, I_{Sn}, I_{Tn} are the RMS values of the fault components and φ_{Rn}, φ_{Sn}, φ_{Tn} are
the angular displacements of the fault components.
The space vector referred to the stator reference frame is obtained by applying the
transformation of the symmetrical components:

i(t) = (2/3) [ i_R + i_S e^{j2π/3} + i_T e^{j4π/3} ]
     = √3 [ I e^{j2π f_s t} + I_1 e^{j(2π f_1 t − φ_1)} + ... + I_n e^{j(2π f_n t − φ_n)} ]     (7)
It then becomes necessary to apply another signal processing technique, such as the wavelet
transform, that can reveal aspects that a simple Fourier analysis misses.
2.3 Continuous wavelet transform (CWT)
The Fourier analysis consists in breaking up a signal into sine waves with different
frequencies. Similarly, a wavelet analysis is the breaking-up of a signal into shifted and
scaled versions of a function called the mother wavelet.
The continuous wavelet transform is the sum over time of the signal multiplied by scaled
and shifted versions of the wavelet. This process produces wavelet coefficients that are a
function of scale and position.
The integral wavelet transform of a function f(t) ∈ L² with respect to an analyzing wavelet
ψ is defined as

W_f(b, a) = ∫_{−∞}^{∞} f(t) ψ_{b,a}(t) dt                         (8)

where

ψ_{b,a}(t) = (1/√a) ψ((t − b)/a),   a > 0                         (9)

Parameters b and a are called the translation and dilation parameters, respectively. The
normalization factor 1/√a is included so that ‖ψ_{b,a}‖ = ‖ψ‖. The signal can be recovered
from its coefficients by the inverse transform

f(t) = (1/C_ψ) ∫∫ (1/a²) W_f(b, a) ψ_{b,a}(t) da db               (10)

provided the admissibility condition holds:

C_ψ = ∫_{−∞}^{∞} (|Ψ(ω)|² / |ω|) dω < ∞                           (11)

where Ψ(ω) is the Fourier transform of ψ(t).
The coefficients constitute the results of a regression of the original signal performed on the
wavelets. A plot can be generated with the x-axis representing position along the signal
(time), the y-axis representing scale, and the color at the x-y point representing the
magnitude of wavelet coefficient C. These coefficient plots are generated with graphical
tools.
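A minimal discretized version of equation (8) can be sketched by correlating the signal with scaled and shifted copies of a mother wavelet. The real Morlet-style wavelet and the scale values below are illustrative choices, not a configuration used in the chapter.

```python
import numpy as np

def morlet(t, w0=5.0):
    """Real Morlet-style mother wavelet (illustrative choice)."""
    return np.cos(w0 * t) * np.exp(-t ** 2 / 2)

def cwt(x, scales, dt=1.0):
    """Discretized eq (8): correlate x with 1/sqrt(a)-normalized wavelets."""
    coeffs = np.empty((len(scales), len(x)))
    for i, a in enumerate(scales):
        t = np.arange(-4 * a, 4 * a + dt, dt)   # wavelet support ~ 8a
        psi = morlet(t / a) / np.sqrt(a)        # eq (9) normalization
        full = np.convolve(x, psi[::-1]) * dt
        start = (len(psi) - 1) // 2             # keep the part centered on x
        coeffs[i] = full[start:start + len(x)]
    return coeffs

fs = 100.0
t = np.arange(0, 2, 1 / fs)
x = np.sin(2 * np.pi * 5 * t)
C = cwt(x, scales=[2, 4, 8], dt=1 / fs)
print(C.shape)  # (3, 200): one row of coefficients per scale
```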
2.4 Discrete wavelet transform (DWT)
The discrete version of the wavelet transform, the DWT, consists in sampling the scale and
shift parameters, but neither the signal nor the transform. This leads to high frequency
resolution at low frequencies and high time resolution at high frequencies, in contrast to
the STFT, which has the same time and frequency resolution for all frequencies.
A discrete signal x[n] can be decomposed as (Mallat, 1998):
x[n] = Σ_k a_{j0,k} φ_{j0,k}[n] + Σ_{j=j0}^{J−1} Σ_k d_{j,k} ψ_{j,k}[n]        (12)

where φ_{j0,k}[n] = 2^{j0/2} φ(2^{j0} n − k) are the scaling functions,
ψ_{j,k}[n] = 2^{j/2} ψ(2^j n − k) are the wavelet functions, and a_{j0,k} and d_{j,k} are
the approximation and detail coefficients, respectively.
[Figure: three-level analysis filter bank. At each level the signal x[n] passes through the
high-pass filter g[n] and the low-pass filter h[n], each followed by downsampling by 2; the
high-pass branches yield the level 1, 2 and 3 detail coefficients.]
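One analysis stage of such a filter bank can be sketched with Haar filters: low-pass h[n] and high-pass g[n], each followed by downsampling by 2. This is a hand-rolled illustration, not a full DWT implementation.

```python
import numpy as np

h = np.array([1.0, 1.0]) / np.sqrt(2)   # h[n], low-pass (Haar)
g = np.array([1.0, -1.0]) / np.sqrt(2)  # g[n], high-pass (Haar)

def analysis_stage(x):
    """Filter with h[n]/g[n], then keep every second sample (downsample by 2)."""
    approx = np.convolve(x, h)[1::2]  # approximation coefficients a_{j,k}
    detail = np.convolve(x, g)[1::2]  # detail coefficients d_{j,k}
    return approx, detail

x = np.array([4.0, 6.0, 10.0, 12.0])
a, d = analysis_stage(x)
print(a)  # scaled pairwise sums: [10, 22] / sqrt(2)
print(d)  # scaled pairwise differences: [2, 2] / sqrt(2)
```

Iterating `analysis_stage` on the approximation output reproduces the multilevel scheme.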
[Figure: frequency bands of the three-level decomposition. Detail 1 covers fs/4 to fs/2,
detail 2 covers fs/8 to fs/4, detail 3 covers fs/16 to fs/8, and the level 3 approximation
covers 0 to fs/16.]
Harmonic content due to the 100 Hz superimposed frequency appears on details 2 and 3, when
it should only appear on detail 3, corresponding to the analysis band between 62.5 and
125 Hz. A higher-order Daubechies mother wavelet is needed to prevent this drawback, which
is due to the filter associated with db3 not being selective enough to reject the 100 Hz
harmonic content on detail 2.
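The selectivity issue can be illustrated generically: a longer half-band filter has a sharper transition band and therefore leaks less out-of-band content into a detail. The windowed-sinc designs and the 500 Hz stage rate below are illustrative stand-ins, not the actual dbN filter responses.

```python
import numpy as np

def halfband_highpass(ntaps):
    """Windowed-sinc half-band high-pass (cutoff fs/4); illustrative only."""
    n = np.arange(ntaps) - (ntaps - 1) / 2
    h = np.sinc(n / 2) * np.hamming(ntaps)
    h /= h.sum()                          # unity-gain low-pass prototype
    return h * (-1) ** np.arange(ntaps)   # modulate to high-pass

def magnitude_at(h, f, fs):
    """Evaluate |H(e^{jw})| at frequency f for sampling rate fs."""
    w = 2 * np.pi * f / fs
    return abs(np.sum(h * np.exp(-1j * w * np.arange(len(h)))))

fs = 500.0  # assumed input rate of the stage producing detail 2
leak_short = magnitude_at(halfband_highpass(6), 75.0, fs)   # short filter
leak_long = magnitude_at(halfband_highpass(40), 75.0, fs)   # long filter
print(leak_short > leak_long)  # True: the longer filter rejects 75 Hz better
```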
The mean power of a signal x(t) is defined as

P = lim_{T→∞} (1/2T) ∫_{−T}^{T} |x(t)|² dt                                      (13)

which, using Parseval's theorem, can be written in the frequency domain as

P = (1/2π) ∫_{−∞}^{∞} lim_{T→∞} |X_T(ω)|²/(2T) dω = (1/2π) ∫_{−∞}^{∞} S(ω) dω    (14)

where

S(ω) = lim_{T→∞} |X_T(ω)|²/(2T)                                                 (15)

S(ω) is the spectral density of the signal x(t), and represents the distribution or density
of power as a function of ω, with X_T(ω) the Fourier transform of x(t) restricted to the
interval [−T, T].
The energy of a discrete signal can be calculated by averaging the square of all the signal
components inside the unity window:

Power = (1/T) ∫_0^T (i_R(t) ψ(t))² dt                                           (16)
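The discrete counterpart of equation (16) is simply the mean squared value of the analyzed (detail) signal over the observation window. The sketch below uses a synthetic stand-in signal; the 40 Hz component and its amplitude are illustrative only.

```python
import numpy as np

def mean_power(x):
    """Average of the squared samples: a discrete form of eq (16)."""
    return float(np.mean(np.asarray(x, dtype=float) ** 2))

fs = 1000
t = np.arange(0, 1.0, 1 / fs)
detail = 0.2 * np.sin(2 * np.pi * 40 * t)  # stand-in for a wavelet detail
print(mean_power(detail))  # ~0.02, i.e. amplitude^2 / 2
```

Comparing this quantity between healthy and faulty records is what turns the detail energy into a fault factor.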
3. Experimental results
3.1 Experimental setup
A three-phase, 1.1 kW, 380 V and 2.6 A, 50 Hz, 1410 rpm, four-pole induction motor was
used in this study. Firstly, its healthy performance was analyzed and, afterwards, a sixth
of the rotor bars was damaged, as shown in Figure 8.
The motor nameplate is shown as follows:

Induction motor    Value
Rated power        1.1 kW
Number of poles    4
Nominal speed      1410 rev/min
cos φ              0.81
Load control has been implemented by using a PMSM and an inverter where variable load
torque was introduced. The variable load torque follows an implemented increasing ramp
as a torque control reference. Figure 10 depicts the evolution of the acquired currents.
Fig. 10. Current supply to the motor
3.2 Signal acquisition requirements
When carrying out experimental analyses one of the key elements to obtain good results is
to choose appropriate acquisition parameters: sampling frequency and number of samples.
There are three different constraints: analysis signal bandwidth, frequency resolution for the
FFT analysis and wavelet decomposition spectral bands.
For an IM, the most significant information in the stator current signal is concentrated in
the 0-400 Hz band (Devaney et al., 2004), (Benbouzid et al., 2000) & (Thomson et al., 2003).
The application of Nyquist's theorem results in a minimum sampling frequency (fs) of 800 Hz.
Furthermore, in the case of an FFT analysis, it is necessary to get the right resolution.
With an inverter supply, several harmonics could be mixed up if too low a resolution of the
sidebands were chosen. The minimum resolution needed in order to obtain good results is
0.5 Hz. Equation (17) defines the number of samples needed to achieve the required resolution.
Ns = fs / R                                   (17)

where Ns is the number of samples, fs the sampling frequency and R the required frequency
resolution.

Detail level    Frequency band (Hz)
Detail 1        3000-1500
Detail 2        1500-750
Detail 3        750-375
Detail 4        375-187.5
Detail 5        187.5-93.75
Detail 6        93.75-46.88
Detail 7        46.88-23.44
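The detail bands follow the halving rule fs/2^(k+1) to fs/2^k for detail level k; a 6 kHz sampling frequency is inferred here from the 3000-1500 Hz detail 1 band. A quick check:

```python
# Band edges for each detail level, assuming fs = 6 kHz (an inference from
# the detail 1 band, not a value stated explicitly in the text).
fs = 6000.0
bands = {k: (fs / 2 ** (k + 1), fs / 2 ** k) for k in range(1, 8)}
for k, (lo, hi) in sorted(bands.items()):
    print("Detail %d: %.2f-%.2f Hz" % (k, hi, lo))
```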
Detail levels of high frequency bands provide virtually no information about the original
signal. Detail 6 corresponds to the frequency band of the main harmonic and detail 7
corresponds to the frequency band where the fault harmonic is located in the test.
Comparing Figure 13 to Figure 14, we can clearly see the increase of the coefficient values as
a result of the fault condition on the depicted scalograms (cfs). Also, the increase of the
signal content is clearly appreciated on details 4, 5 and 7.
Promising results are also obtained using wavelet transforms and evaluating the signal
evolution during the acquisition time. Figure 14 shows the advantage of the use of wavelets
under variable load torque. Comparing the FFT decomposition and the DWT shows how the
Fourier decomposition (Figure 11) reveals a low amplitude for the spectrum in the 40 Hz
band, lower than 3 mA. However, the wavelet time-amplitude decomposition (Figure 14) shows
that the amplitude value follows the change of the amplitude of the fault harmonic over
time, eventually reaching a value higher than 0.15 A when maximum torque is applied. The
maximum torque value is the same as that applied in the constant torque test. The result of
the analysis using the wavelet decomposition under a variable load torque matches the
results obtained using an FFT analysis in the constant load torque test (Figure 1).
To perform the diagnosis, we also need to determine the fault factor, which is defined as the
estimation of the energy content of any decomposed detail. Energy is estimated applying
equation (16).
Table III illustrates the energy increment under a fault condition for the approximation and
detail decompositions at level 7. This energy has been calculated according to equation (16).
Power [W]                               D1     D2     D3     D5    D6     D7
Healthy motor (Phase A)                 0.00   0.00   0.11   9.9   929.2  35.75
Motor with broken rotor bars (Phase A)  0.00   0.00   1.1    13    887.7  88.11
4. Conclusions
This chapter has introduced the problems of fault detection under a variable load torque.
The classical computation of MCSA using the FFT introduces average errors in the
amplitude harmonic evaluation, hampering fault detection. To ensure proper results, a
time-frequency analysis is required.
As a time-frequency analysis technique, the proposed alternative is the discrete wavelet
transform (DWT). The DWT has different resolutions in time and frequency depending on the different
frequency bands defined. The use of DWT ensures good time-frequency analysis. DWT has
been used to analyze motors with eccentricity and broken rotor bars under fault conditions,
achieving good results.
Moving toward an autonomous diagnosis sensor, a fault condition parameter has been studied,
and the power spectral density concept has been used as a power detail density with
wavelets, ensuring proper results.
To sum up, we can say that:
1. Wavelet decomposition is a proper technique for isolating time components of
non-stationary signals, with low computational costs.
2. Analyzing the energy of some wavelet decompositions is a sound way to detect rotor
faults in industrial motor applications with non-constant load torque.
3. The evolution of wavelet coefficients gives good results in terms of fault detection.
5. References
B. Ayhan, M. Y. Chow, H. J. Trussell, M. H. Song, E. S. Kang, H. J. Woe: Statistical Analysis on
a Case Study of Load Effect on PSD Technique for Induction Motor Broken Rotor
Bar Fault Detection, Symposium on Diagnostics for Electric Machines, Power
Electronics and Drives, SDEMPED 2003, Atlanta GA, USA 24-26 August 2003.
Khmais Bacha, Humberto Henao, Moncef Gossa, Gérard-André Capolino; Induction
machine fault detection using stray flux EMF measurement and neural-network-based
decision; Electric Power Systems Research, Volume 78, Issue 7, July 2008,
Pages 1247-1255.
Mohamed El Hachemi Benbouzid: A Review of Induction Motor Signature Analysis as a
Medium for Faults Detection, IEEE Transactions on Industrial Electronics, Vol. 47, n
5, Oct 2000, pp. 984-993.
Hakan Çalış and Abdülkadir Çakır, Rotor bar fault diagnosis in three phase induction
motors by monitoring fluctuations of motor current zero crossing instants; Electric
Power Systems Research, Volume 77, Issues 5-6, April 2007, Pages 385-392.
J. R. Cameron, W. T. Thomson, and A. B. Dow: Vibration and current monitoring for
detecting airgap eccentricity in large induction motors, IEE Proceedings, pp. 155-163,
Vol. 133, Pt. B, No. 3, May 1986.
W. Deleroi, Broken bars in squirrel cage rotor of an induction motor- Part 1: Description by
superimposed fault currents (in German) Arch. Elektrotech, vol. 67, pp. 91-99, 1984.
Michael J. Devaney, Levent Eren; Detecting Motor Bearing Faults IEEE Transactions on
Instrumentation and Measurement Magazine, pp 30-50, December 2004.
Andrew K.S. Jardine, Daming Lin, Dragan Banjevic, A review on machinery diagnostics and
prognostics implementing condition-based maintenance, Mechanical Systems and
Signal Processing 20 (2006), 1483-1510
Chinmaya Kar, A.R. Mohanty, Monitoring gear vibrations through motor current signature
analysis and wavelet transform, Mechanical Systems and Signal Processing 20 (2006)
158-187.
S. G. Mallat, A Theory for Multiresolution Signal Decomposition: The Wavelet
Representation, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 11,
No. 7, July 1989.
S. G. Mallat, A Wavelet tour of signal Processing Academic Press 1998 Second Edition
Dick Meador; Tools for O&M, from Building Controls to Thermal Imaging O&M Workshop
for Government Facility Managers, June 19, 2003, US Department of Energy.
Gang Niu, Achmad Widodo, Jong-Duk Son, Bo-Suk Yang, Don-Ha Hwang, Dong-Sik Kang;
Decision-level fusion based on wavelet decomposition for induction motor fault
diagnosis using transient current signal; Expert Systems with Applications, Volume
35, Issue 3, October 2008, Pages 918-928.
G. K. Singh, Saad Ahmed Saleh Al Kazzaz; Induction machine drive condition monitoring
and diagnostic researcha survey, Electric Power Systems Research, Volume 64,
Issue 2, February 2003, Pages 145-158.
Easa Tahori Oskouel, Alan James Roddis: A Condition Monitoring Device using Acoustic
Emission Sensors and Data Storage Devices, UK Patent Application GB 2340034 A,
date of publication 03/14/2007.
W. T. Thomson, and M. Fenger: Case histories of current signature analysis to detect faults
in induction motor drives, IEEE International Conference on Electric Machines and
Drives, IEMDC'03, Vol. 3, pp. 1459-1465, June 2003.
Abhisek Ukil and Rastko Živanović, Abrupt change detection in power system fault
analysis using adaptive whitening filter and wavelet transform; Electric Power
Systems Research, Volume 76, Issues 9-10, June 2006, Pages 815-823.
Simi P. Valsan, K.S. Swarup; Wavelet based transformer protection using high frequency
power directional signals; Electric Power Systems Research, Volume 78, Issue 4,
April 2008, Pages 547-558.
Part 2
Image Processing and Analysis
4
A MAP-MRF Approach for Wavelet-Based Image Denoising
Alexandre L. M. Levada, Nelson D. A. Mascarenhas and Alberto Tannús
University of São Carlos (UFSCar)
University of São Paulo (USP)
Brazil
1. Introduction
To test and evaluate our method, we built a series of experiments using both real Nuclear
Magnetic Resonance (NMR) images and simulated data, considering several wavelet bases.
The obtained results show the effectiveness of GSAShrink, indicating a clear improvement
in the wavelet denoising performance in comparison to the traditional approaches. As in
this chapter we are using a sub-optimal combinatorial optimization algorithm to approximate
the optimal MAP solution, GSAShrink converges to a local maximum, making our method
sensitive to different initializations. What at first could look like a disadvantage
actually turned out to be an interesting and promising feature, mostly because we can
incorporate other non-linear filtering techniques in a really straightforward way, by
simply using them to generate better initial conditions for the algorithm. Results obtained
by combining Bilateral Filtering and GSAShrink show that the MAP-MRF method under
investigation is capable of suppressing the noise while preserving the most relevant image
details, avoiding the appearance of visible artifacts.
The remainder of the chapter is organized as follows. Section 2 describes the Discrete
Wavelet Transform (DWT) in the context of digital signal processing, showing that, in
practice, this transform can be implemented by a Perfect Reconstruction Filter Bank (PRFB),
being completely characterized by a pair of Quadrature Mirror Filters (QMF): h0[n], a
low-pass filter, and g1[n], a high-pass filter. Section 3 briefly introduces the
wavelet-based denoising problem, describing the proposed MAP-MRF solution, as well as the
statistical modeling and threshold estimation, a crucial step in this kind of application.
In Section 4 we briefly discuss the MRF Maximum Pseudo-Likelihood parameter estimation.
The experimental setup and the obtained results are described in Section 5. Finally,
Section 6 brings our conclusions and final remarks.
2. The discrete wavelet transform
This section describes the Discrete Wavelet Transform from a digital signal processing
perspective, by characterizing its underlying mathematical model by means of the
Z-transform. For an excellent review of wavelet theory and the mathematical aspects of
filter banks the reader is referred to Jensen & Cour-Harbo (2001) and Strang & Nguyen
(1997), from where most results described in this section were taken. A two-channel perfect
reconstruction filter bank (PRFB) consists of two parts: an analysis filter bank,
responsible for the decomposition of the signal into wavelet sub-bands (DWT), and a
synthesis filter bank, which reconstructs the signal by synthesizing these wavelet
sub-bands Ji & Fermüller (2009). Figure 1 shows the block diagram of a two-channel PRFB,
where H0(z) and G1(z) are the Z-transforms of the pair of analysis filters, r0[n] and r1[n]
are the resulting signals after low-pass and high-pass filtering, respectively, y0[n] and
y1[n] are the downsampled signals, and t0[n] and t1[n] are the upsampled signals on the
synthesis side.
The perfect reconstruction condition states that the output is a delayed copy of the input:

x̂[n] = x[n − l]                                                           (1)

which means that the entire system can be replaced by a single transfer function.
Equivalently, in the Z-domain we have:

X̂(z) = z^(−l) X(z)                                                        (2)

As the filter bank defines a linear time-invariant (LTI) system, using the convolution
theorem we have:

R0(z) = H0(z) X(z)                                                        (3)

R1(z) = G1(z) X(z)                                                        (4)

Applying the downsampling property of the Z-transform to r0[n] and r1[n] yields:

Y0(z) = (1/2) [R0(z^(1/2)) + R0(−z^(1/2))]                                (5)

Y1(z) = (1/2) [R1(z^(1/2)) + R1(−z^(1/2))]                                (6)

so that

Y0(z) = (1/2) [H0(z^(1/2)) X(z^(1/2)) + H0(−z^(1/2)) X(−z^(1/2))]         (7)

Y1(z) = (1/2) [G1(z^(1/2)) X(z^(1/2)) + G1(−z^(1/2)) X(−z^(1/2))]         (8)
Since H0(z) and G1(z) are not ideal half-band filters, downsampling can introduce aliasing,
since we cannot reduce the interval between samples by half without sampling below the
Nyquist rate. To overcome this problem, conditions for alias cancellation must be enforced.
According to the perfect reconstruction condition:

V0(z) + V1(z) = z^(−l) X(z)                                               (9)
Using the upsampling property of the Z-transform, we have the following expressions for
V0(z) and V1(z):

V0(z) = F0(z) T0(z) = F0(z) Y0(z²)                                        (10)

V1(z) = K1(z) T1(z) = K1(z) Y1(z²)                                        (11)

Substituting equations (7) and (8) evaluated at z² gives:

V0(z) = (1/2) F0(z) [H0(z) X(z) + H0(−z) X(−z)]                           (12)

V1(z) = (1/2) K1(z) [G1(z) X(z) + G1(−z) X(−z)]                           (13)
Thus, grouping similar terms and enforcing the perfect reconstruction condition, we have
the following equation relating the input, the analysis filters, the synthesis filters and
the output of the LTI system:

(1/2) [F0(z) H0(z) + K1(z) G1(z)] X(z) +
(1/2) [F0(z) H0(−z) + K1(z) G1(−z)] X(−z) = z^(−l) X(z)                   (14)
Therefore, a perfect reconstruction filter bank must satisfy the following two conditions:
1. Alias cancellation:

F0(z) H0(−z) + K1(z) G1(−z) = 0                                           (15)

2. No distortion:

F0(z) H0(z) + K1(z) G1(z) = 2 z^(−l)                                      (16)
The first condition is trivially satisfied by defining the synthesis filters as:

F0(z) = G1(−z)                                                            (17)

K1(z) = −H0(−z)                                                           (18)

Expanding these definitions in terms of the filter coefficients gives:

F0(z) = G1(−z) = Σ_n g1[n] (−z)^(−n) = Σ_n (−1)^n g1[n] z^(−n)            (19)

and

K1(z) = −H0(−z) = −Σ_n h0[n] (−z)^(−n) = Σ_n (−1)^(n+1) h0[n] z^(−n)      (20)

so that the synthesis filter coefficients are obtained directly from the analysis filters
by a simple alternating-signs rule:

f0[n] = (−1)^n g1[n],    k1[n] = (−1)^(n+1) h0[n]                         (21)
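The alternating-signs rule and the two perfect reconstruction conditions can be checked numerically. The sketch below uses the Haar pair as a didactic example; np.polymul multiplies the coefficient sequences, i.e. polynomials in z^(-1).

```python
import numpy as np

a = 1 / np.sqrt(2)
h0 = np.array([a, a])    # analysis low-pass (Haar)
g1 = np.array([a, -a])   # analysis high-pass (Haar)

n = np.arange(len(h0))
f0 = (-1) ** n * g1        # eq (21): f0[n] = (-1)^n g1[n]
k1 = (-1) ** (n + 1) * h0  # eq (21): k1[n] = (-1)^(n+1) h0[n]

flip = lambda h: h * (-1) ** np.arange(len(h))  # coefficients of H(-z)

alias = np.polymul(f0, flip(h0)) + np.polymul(k1, flip(g1))  # eq (15)
gain = np.polymul(f0, h0) + np.polymul(k1, g1)               # eq (16)
print(np.allclose(alias, 0))  # True: aliasing cancels
print(np.round(gain, 10))     # [0. 2. 0.], i.e. 2 z^(-1) with l = 1 (odd)
```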
Defining P0(z) = F0(z) H0(z) and using equations (17) and (18) in (16) leads to:

P0(z) − P0(−z) = 2 z^(−l)                                                 (22)

where l must be odd, since the left-hand side of (22) is an odd function (all even terms
cancel each other). Let P(z) = z^l P0(z). Then P(−z) = −z^l P0(−z), since l is odd.
Rewriting equation (22) we finally have:

P(z) + P(−z) = 2                                                          (23)
showing that for perfect reconstruction the low-pass filter P (z) requires all even powers to be
zero, except the constant term. The design process starts with the specification of P (z) and
then the factorization of P0 (z) into F0 (z) H0 (z). Finally, the alias cancellation condition is used
to define G1(z) and K1(z). It has been shown that the flattest P(z) leads to the widely
recognized Daubechies wavelet filter Daubechies (1988).
In this chapter, we consider the traditional 2-D separable DWT, also known as Square Wavelet
Transform, that is based on consecutive one dimensional operations on columns and rows of
the pixel matrix. The method first performs one step of the 1-D DWT on all rows, yielding
a matrix where the left side contains down-sampled low-pass (h filter) coefficients of each
row, and the right contains the high-pass (g filter) coefficients. Next, we apply one step to all
columns, resulting in four wavelet sub-bands: LL (known as the approximation signal),
LH, HL and HH. A multilevel decomposition scheme can be generated in a straightforward
way, by always expanding the approximation signal.
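One level of this separable scheme can be sketched with Haar filters: a pairwise 1-D step on the rows followed by the same step on the columns, yielding the four sub-bands. This is an illustrative hand-rolled version (sub-band naming follows the row/column order described above), not the chapter's implementation.

```python
import numpy as np

def haar_rows(M):
    """Haar low/high-pass on each row, with downsampling by 2."""
    lo = (M[:, 0::2] + M[:, 1::2]) / np.sqrt(2)
    hi = (M[:, 0::2] - M[:, 1::2]) / np.sqrt(2)
    return lo, hi

def haar_cols(M):
    """Same 1-D step applied to the columns."""
    lo = (M[0::2, :] + M[1::2, :]) / np.sqrt(2)
    hi = (M[0::2, :] - M[1::2, :]) / np.sqrt(2)
    return lo, hi

def dwt2_haar(img):
    L, H = haar_rows(img)   # low-pass / high-pass halves of the rows
    LL, LH = haar_cols(L)   # LL: approximation sub-band
    HL, HH = haar_cols(H)
    return LL, LH, HL, HH

img = np.ones((4, 4))
LL, LH, HL, HH = dwt2_haar(img)
print(LL)  # ~2.0 everywhere: a flat image has only approximation energy
```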
The analysis of the wavelet coefficients of a signal or image suggests that small
coefficients are dominated by noise, while coefficients with a large absolute value carry
more signal information. Thus, suppressing or smoothing the smallest, noisy coefficients
and applying the Inverse Wavelet Transform (IDWT) leads to a reconstruction with the
essential signal or image characteristics, removing the noise. More precisely, this idea is
motivated by three assumptions Jansen (2001):
The decorrelating property of a DWT creates a sparse signal, where most coefficients are
zero or close to zero.
Noise is spread out equally over all coefficients and the important signal singularities
are still distinguishable from the noise coefficients.
The noise level is not too high, so that we can recognize the signal wavelet coefficients.
2.2 Wavelet-based denoising
Basically, the problem of wavelet denoising by thresholding can be stated as follows. Let
g = {g_{i,j}; i, j = 1, 2, ..., M} denote the M × M observed image corrupted by additive
Gaussian noise:

g_{i,j} = f_{i,j} + n_{i,j}                                               (24)

where f_{i,j} is the noise-free pixel, n_{i,j} has a N(0, σ²) distribution and σ² is the
noise variance.
Then, considering the linearity of the DWT:

y_{j,k} = x_{j,k} + z_{j,k}    (25)

with y_{j,k}, x_{j,k} and z_{j,k} denoting the k-th wavelet coefficient from the j-th decomposition level of the observed image, original image and noise image, respectively. The goal is to recover the unknown wavelet coefficients x_{j,k} from the observed noisy coefficients y_{j,k}. One way to estimate x_{j,k} is through Bayesian inference, by adopting a MAP approach. In this chapter, we introduce a MAP-MRF iterative method based on the combinatorial optimization algorithm Game Strategy Approach (GSA) Yu & Berthod (1995a), an alternative to the deterministic and widely known Besag's Iterated Conditional Modes (ICM) Besag (1986a). By iterative method we mean that an initial solution x^(0) is given and the algorithm successively improves it, using the output from one iteration as the input to the next. Thus, the algorithm updates the current wavelet coefficients, given the previous estimate, according to the following MAP criterion:

x_{j,k}^{(p+1)} = argmax_{x_{j,k}} p(x_{j,k} | x_{j,k}^{(p)}, y_{j,k}, θ)    (26)
where p(x_{j,k} | x_{j,k}^{(p)}, y_{j,k}, θ) represents the a posteriori probability obtained by adopting a Generalized Gaussian distribution as likelihood (model for observations) and a Generalized Isotropic Multi-Level Logistic (GIMLL) MRF model as a priori knowledge (for contextual modeling), x_{j,k}^{(p)} denotes the wavelet coefficient at the p-th iteration and θ is the model parameter vector. This vector contains the parameters that control the behavior of the probability laws.
More details on the statistical modeling and how these parameters are estimated are shown
in Sections 3 and 4. In the following, we will derive an algorithm for approximating the MAP
estimator by iteratively updating the wavelet coefficients.
It has been shown that the distribution of the wavelet coefficients within a sub-band can be modeled by a Generalized Gaussian (GG) with zero mean Mallat (1989), Westerink et al. (1991). The zero-mean GG distribution has the probability density function:

p(w | α, β) = [β / (2α Γ(1/β))] exp{ −(|w|/α)^β }    (27)

where β > 0 controls the shape of the distribution and α the spread. Two special cases of the GG distribution are the Gaussian and the Laplace distributions. When β = 2 and α = √2 σ, it becomes a standard Gaussian distribution. The Laplace distribution is obtained by setting β = 1 and α = 1/λ. According to Sharifi & Leon-Garcia (1995), the parameters α and β can be empirically determined by directly computing the sample moments m₁ = E[|w|] and m₂ = E[w²] (method of moments), because of this useful relationship:

m₁² / m₂ = Γ²(2/β) / [ Γ(1/β) Γ(3/β) ]    (28)

so we can use a look-up table with different values of β and determine its value from the ratio m₁²/m₂. Afterwards, it is possible to obtain α by:

α = [ m₂ Γ(1/β) / Γ(3/β) ]^{1/2}    (29)
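The moment-matching procedure of equations (28)-(29) can be sketched as follows; the dense grid search below stands in for the look-up table mentioned in the text, and the grid range is an assumption.

```python
import numpy as np
from math import gamma

def ggd_ratio(beta):
    """F(beta) = Gamma(2/beta)^2 / (Gamma(1/beta) Gamma(3/beta)); equals m1^2/m2 for a GGD."""
    return gamma(2.0 / beta) ** 2 / (gamma(1.0 / beta) * gamma(3.0 / beta))

def fit_ggd(w, betas=np.linspace(0.3, 4.0, 2000)):
    """Method-of-moments fit: pick the beta whose ratio matches the sample, then recover alpha."""
    m1 = np.mean(np.abs(w))
    m2 = np.mean(w ** 2)
    table = np.array([ggd_ratio(b) for b in betas])        # the look-up table of the text
    beta = betas[np.argmin(np.abs(table - m1 ** 2 / m2))]  # invert eq. (28)
    alpha = np.sqrt(m2 * gamma(1.0 / beta) / gamma(3.0 / beta))  # eq. (29)
    return alpha, beta

rng = np.random.default_rng(0)
w = rng.normal(0.0, 1.0, 200000)  # Gaussian data should recover beta close to 2
alpha, beta = fit_ggd(w)
print(round(beta, 1))
```

For standard Gaussian data the ratio equals 2/π, so the fit should return β ≈ 2 and α ≈ √2, matching the special case stated in the text.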
Basically, MRF models represent how individual elements are influenced by the behavior
of other individuals in their vicinity (neighborhood system). MRF models have proved
to be powerful mathematical tools for contextual modeling in several image processing
applications. In this chapter, we adopt a model originally proposed in Li (2009) that
generalizes both Potts and standard isotropic Multi-Level Logistic (MLL) MRF models for
continuous random variables. According to the Hammersley-Clifford theorem any MRF
can be equivalently defined by a joint Gibbs distribution (global model) or by a set of
local conditional density functions (LCDFs). From now on, we will refer to this model
as the Generalized Isotropic MLL (GIMLL) MRF model. For our purposes, and also for mathematical tractability, we define the following LCDF to characterize this model, assuming the wavelet coefficients are quantized into M̃ levels:

p(x_s | η_s, β) = exp{ −D_s(x_s) } / Σ_{y∈G} exp{ −D_s(y) }    (30)

where D_s(y) = β Σ_{k∈η_s} [ 1 − 2 exp(−(y − x_k)²) ], x_s is the s-th element of the field, η_s is the neighborhood of x_s, x_k is an element belonging to the neighborhood of x_s, β is a parameter that controls the spatial dependency between neighboring elements, and G is the set of all possible values of x_s, given by G = { g | m ≤ g ≤ M }, where m and M are, respectively, the minimum and maximum sub-band coefficients, with |G| = M̃ (the cardinality of the set). This model assigns a probability to a given coefficient depending on the similarity between its value and the neighboring coefficient values. According to Li (2009), the motivation for this model is that it is more meaningful in texture representation and easier to process than the isotropic MLL model, since it incorporates similarity in a softer way.
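A minimal sketch of the LCDF of equation (30), assuming the level set G and the energy D_s as reconstructed above (the sign convention inside the exponentials is an assumption recovered from the surrounding text):

```python
import numpy as np

def gimll_lcdf(levels, neighbors, beta):
    """Local conditional density p(x_s | eta_s, beta) of the GIMLL model over the
    quantized set G, with D_s(y) = beta * sum_k [1 - 2*exp(-(y - x_k)^2)]."""
    G = np.asarray(levels, dtype=float)
    nb = np.asarray(neighbors, dtype=float)
    # Energy of every candidate level against all neighbors, then normalize.
    D = beta * np.sum(1.0 - 2.0 * np.exp(-(G[:, None] - nb[None, :]) ** 2), axis=1)
    p = np.exp(-D)
    return p / p.sum()

# Coefficients quantized to a few levels; the neighbors cluster around 0.5,
# so the most probable value for x_s should also be 0.5.
G = np.linspace(-1.0, 1.0, 9)
p = gimll_lcdf(G, neighbors=[0.4, 0.5, 0.6, 0.5], beta=1.0)
print(G[np.argmax(p)])  # -> 0.5
```

The similarity-driven behavior described in the text is visible directly: levels close to the neighborhood values get the highest conditional probability.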
For GIMLL MRF model parameter estimation we adopt a Maximum Pseudo-Likelihood (MPL) framework that uses the observed Fisher information to approximate the asymptotic variance of this estimator, which provides a mathematically meaningful way to set this regularization parameter based on the observations. Besides, the MPL framework is useful in assessing the accuracy of MRF model parameter estimation.
In an n-person game, I = {1, 2, ..., n} denotes the set of all players. Each player i has a set of pure strategies S_i. The game proceeds by each player choosing, at a given instant, a strategy s_i ∈ S_i. Hence, a situation (or play) s = (s₁, s₂, ..., s_n) is produced, and a payoff H_i(s) is assigned to each player. In the approach proposed by Yu & Berthod (1995a), the payoff H_i(s) of a player is defined in such a way that it depends only on his own strategy and on the set of strategies of the neighboring players.
In non-cooperative game theory each player tries to maximize his payoff by choosing his own strategy independently. In other words, it is the problem of maximizing the global payoff through local and independent decisions, similar to what happens in MAP-MRF applications with the conditional independence assumption.
A mixed strategy for a player is a probability distribution defined over the set of pure
strategies. In GSA, it is supposed that each player knows all possible strategies, as well as the
payoff given by each one of them. Additionally, the solutions for a non-cooperative n-person
game are given by the set of points satisfying the Nash Equilibrium condition (or Nash points).
It has been shown that Nash Equilibrium points always exist in non-cooperative n-person games Nash (1950). A play t* = (t₁*, t₂*, ..., t_n*) satisfies the Nash Equilibrium condition if none of the players can improve his payoff by changing his strategy unilaterally, or in mathematical terms:

∀i : H_i(t*) = max_{t_i ∈ S_i} H_i(t* || t_i)    (31)

where t* || t_i denotes the play obtained from t* by replacing the strategy of player i with t_i.
Game Theory               | MAP-MRF wavelet denoising
n-person game structure   | sub-band lattice
players                   | sub-band elements
pure strategies           | wavelet coefficients
a play or situation       | an entire sub-band at the p-th iteration
payoff                    | posterior distribution
mixed strategies          | local conditional densities
Nash equilibrium points   | local maximum points (MAP)
Table 1. Correspondence between concepts of game theory and the MAP-MRF wavelet denoising approach.
3.3 GSAShrink for wavelet denoising
Given the observed data y (noisy image wavelet coefficients), and the estimated parameters θ_r (the GG parameters α_r, β_r and the MRF parameter) for all the sub-bands, r = 1, ..., S, where S is the total number of sub-bands in the decomposition, our purpose is to recover the optimal wavelet coefficient field x using a Bayesian approach. As the number of possible candidates for x is huge, to make the problem computationally feasible we adopt an iterative approach, where the wavelet coefficient field at the previous iteration, say x^(p), is assumed to be known. Hence, the new wavelet coefficient x_{j,k}^{(p+1)} is obtained by maximizing the posterior probability:

x_{j,k}^{(p+1)} = argmax_{x_{j,k}} p(x_{j,k} | x^{(p)}, y_{j,k}, θ_j)    (32)
Basically, GSAShrink consists in taking an initial solution and improving it iteratively, scanning all wavelet coefficients sequentially until the algorithm converges or a maximum number of iterations is reached. In this manuscript, we set the initial condition to the noisy image wavelet sub-band itself, that is, x^(0) = y, although some kind of preprocessing may provide better initializations. Considering the statistical modeling previously described, we can define the following approximation:
log p(x_{j,k} | x^{(p)}, y_{j,k}, θ_j) ≈ log[ β_j / (2 α_j Γ(1/β_j)) ] − (|x_{j,k}| / α_j)^{β_j} − β Σ_{m∈η_{j,k}} [ 1 − 2 exp(−(x_{j,k} − x_{j,m}^{(p)})²) ]    (33)

Therefore, we can define the following rule for updating the wavelet coefficient x_{j,k}, based on minimizing the negative of each player's payoff, denoted by H_{j,k}(x, y, θ_j), considering x^(0) = y:

x_{j,k}^{(p+1)} = argmin_{x_{j,k}} H_{j,k}(x, y, θ_j)    (34)

where

H_{j,k}(x, y, θ_j) = (|x_{j,k}| / α_j)^{β_j} + β Σ_{m∈η_{j,k}} [ 1 − 2 exp(−(x_{j,k} − x_{j,m}^{(p)})²) ]    (35)
The analysis of the above functional (the payoff of each player) reveals that while the first term favors low-valued strategies (coefficients near zero), since the mean value of the wavelet coefficients in a sub-band is zero, the MRF term favors strategies that are similar to those of the neighborhood (coefficients close to the neighboring ones), defining a tradeoff between suppression and smoothing, or hard and soft thresholding. In this scenario, the MRF model parameter plays the role of a regularization parameter, since it controls the compromise between these two extreme behaviors. Thus, our method can be considered a hybrid adaptive approach, since identical wavelet coefficients belonging to different regions of a given sub-band are modified by completely different rules. In other words, coefficients belonging to smooth regions tend to be more attenuated than those belonging to coarser regions. In the following, we present the GSAShrink algorithm for wavelet-based image denoising.
while p < MAX and not converged do
  for each sub-band j do
    for each coefficient k do
      x̃_{j,k} = argmin_{x_{j,k}} H_{j,k}(x, y, θ_j)
      if H(x̃_{j,k}) < H(x_{j,k}^{(p)}) then
        if |x_{j,k}^{(p)}| ≥ T or max_{m∈η_{j,k}} |x_{j,m}^{(p)}| ≥ T then
          x_{j,k}^{(p+1)} = x_{j,k}^{(p)} (1 + δ)
        else
          x_{j,k}^{(p+1)} = x̃_{j,k}  w. p. ε;
          otherwise, x_{j,k}^{(p+1)} = x_{j,k}^{(p)} (1 − γ)  w. p. (1 − ε)
        end if
      end if
    end for
  end for
end while
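One coefficient update of the scheme above can be sketched as follows. The discretized candidate set, the neighborhood passed in explicitly, and the payoff of equation (35) as reconstructed here are simplifying assumptions; the parameter names eps, gamma, delta correspond to the text's ε, γ, δ.

```python
import numpy as np

def payoff_H(x, neighbors, alpha, beta):
    gg = (abs(x) / alpha) ** beta                                   # GG term: favors x near 0
    mrf = beta * np.sum(1.0 - 2.0 * np.exp(-(x - neighbors) ** 2))  # MRF term: favors x near neighbors
    return gg + mrf

def gsashrink_step(x_p, y, neighbors, alpha, beta, T, rng,
                   eps=0.9, gamma=0.1, delta=0.05):
    candidates = np.linspace(y - 3 * alpha, y + 3 * alpha, 101)     # discretized strategies
    x_new = min(candidates, key=lambda c: payoff_H(c, neighbors, alpha, beta))
    if payoff_H(x_new, neighbors, alpha, beta) >= payoff_H(x_p, neighbors, alpha, beta):
        return x_p                                                  # no improvement: keep the strategy
    if abs(x_p) >= T or np.max(np.abs(neighbors)) >= T:
        return x_p * (1.0 + delta)                                  # enhance strong (edge) coefficients
    if rng.random() < eps:
        return x_new                                                # accept the smoothing move
    return x_p * (1.0 - gamma)                                      # otherwise attenuate (likely noise)

rng = np.random.default_rng(1)
r = gsashrink_step(x_p=2.5, y=2.5, neighbors=np.array([2.1, 1.9, 2.2]),
                   alpha=1.0, beta=1.0, T=1.5, rng=rng)
print(round(r, 3))  # -> 2.625, i.e. 2.5 amplified by (1 + delta)
```

In the example the coefficient and its neighbors exceed the threshold T, so the deterministic enhancement branch fires, illustrating the edge-preserving behavior discussed below.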
It is interesting to note that an observation can be put forward to explain why there are a large number of "small" coefficients but relatively few "large" coefficients, as the GGD suggests: the small ones correspond to smooth regions in an image and the large ones to edges, details or textures Chang et al. (2000). Therefore, the application of the derived MAP-MRF rule in all sub-bands of the wavelet decomposition removes noise in an adaptive manner by smoothing the wavelet coefficients in a selective way.
Basically, the GSAShrink algorithm works as follows: for each wavelet coefficient, the value that maximizes the payoff is chosen and the new payoff is calculated. If this new payoff is less than the original one, then nothing is done (since in the Nash equilibrium none of the players can improve its payoff by unilaterally changing its strategy). Otherwise, if the absolute value of the current wavelet coefficient x_{j,k} or any of its neighbors is above the threshold T, which means that we are probably dealing with relevant image information such as edges or fine details, then x_{j,k} is amplified by a factor of (1 + δ). The goal of this procedure is to perform some image enhancement during noise removal. However, if its magnitude is less than the threshold, then the new coefficient x̃_{j,k} is accepted with probability ε, which is a way to smooth the wavelet coefficients, since we are employing the MAP-MRF functional given by equation (35). The level of suppression/shrinkage depends basically on two main issues: the contextual information and the MRF model parameter, which controls the tradeoff between suppression and smoothing. On the other hand, with probability (1 − ε) the coefficient is attenuated by a constant factor of (1 − γ), since we are probably facing a noise coefficient. It is worthwhile to note that the only parameter originally existing in the traditional GSA algorithm for image labeling problems is ε, which controls the probability of acceptance of new strategies. Both the δ and γ parameters have been included to better represent the nature of our problem. Also, in all experiments throughout this chapter, we have adopted the following parameter values: ε = 0.9, γ = 0.1, δ = 0.05 and MAX = 5.
3.4 Wavelet thresholds
As we have seen, a critical issue in the method is the choice of the threshold value. Several works in the wavelet literature discuss threshold estimation Chang et al. (2000); Jansen & Bultheel (1999). In the experiments throughout this chapter we adopted four different wavelet thresholds: the Universal Donoho (1995); Donoho et al. (1995), SURE Jansen (2001), Bayes and Oracle thresholds Chang et al. (2000).
3.4.1 Universal threshold
Despite its simplicity, it has been shown that the Universal Threshold has some optimal asymptotic properties Donoho (1995); Donoho & Johnstone (1994). The Universal Threshold is given by:

λ_UNIV = σ √(2 log N)    (36)

where N is the number of data points and σ² denotes the noise variance. Thus, the Universal Threshold does not depend directly on the observed input signal, but only on simple statistics derived from it.
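A minimal sketch of Universal-threshold soft shrinkage on a vector of coefficients; the sparse test signal below is synthetic and only illustrative.

```python
import numpy as np

def universal_threshold(sigma, n):
    """lambda_UNIV = sigma * sqrt(2 log n): depends only on the noise level and data size."""
    return sigma * np.sqrt(2.0 * np.log(n))

def soft(w, t):
    """Soft thresholding: shrink towards zero by t, kill anything below t in magnitude."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

rng = np.random.default_rng(0)
x = np.zeros(1024); x[::64] = 5.0          # sparse "signal" coefficients
y = x + rng.normal(0.0, 1.0, x.size)       # noisy coefficients, sigma = 1
t = universal_threshold(1.0, y.size)       # about sqrt(2 ln 1024), roughly 3.72
xhat = soft(y, t)
print(np.count_nonzero(xhat))              # only the large (signal) coefficients survive
```

With threshold ~3.72 almost every pure-noise coefficient is zeroed while most of the large signal coefficients survive, which is exactly the sparsity assumption exploited by the method.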
3.4.2 SURE threshold
The SURE (Stein's Unbiased Risk Estimator) threshold is obtained by minimizing a risk function R(·), assuming the coefficients are normally distributed Hudson (1978); Stein (1981). In this chapter, we use the approximation for R(·) derived in Jansen (2001), given by:

R(λ) = (1/N) ||w_λ − w||² − σ² + 2σ² (N − N₀)/N    (37)

where N is the number of wavelet coefficients, σ² is the noise variance, w and w_λ denote the wavelet coefficients before and after thresholding, respectively, and N₀ is the number of null wavelet coefficients after thresholding. The SURE threshold λ_SURE is defined as the one that minimizes R(λ), that is:

λ_SURE = argmin_λ { R(λ) }    (38)
Analyzing the expression we can see that this method for threshold estimation seeks a tradeoff
between data fidelity and noise removal.
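The risk of equation (37) can be evaluated for soft thresholding and minimized over a grid of candidate thresholds. The sketch below assumes the noise variance is known and uses a synthetic noise-plus-signal mixture.

```python
import numpy as np

def sure_risk(w, lam, sigma):
    """Eq. (37) for soft thresholding: R = (1/N)||w_lam - w||^2 - sigma^2 + 2 sigma^2 (N - N0)/N,
    where N0 is the number of coefficients set to zero."""
    n = w.size
    n0 = np.count_nonzero(np.abs(w) <= lam)
    # For soft thresholding, |soft(w, lam) - w| = min(|w|, lam) elementwise.
    fidelity = np.sum(np.minimum(w ** 2, lam ** 2)) / n
    return fidelity - sigma ** 2 + 2.0 * sigma ** 2 * (n - n0) / n

def sure_threshold(w, sigma, grid=None):
    grid = np.linspace(0.0, np.max(np.abs(w)), 200) if grid is None else grid
    risks = np.array([sure_risk(w, t, sigma) for t in grid])
    return grid[np.argmin(risks)]

rng = np.random.default_rng(2)
w = np.concatenate([rng.normal(0, 1, 900), rng.normal(0, 5, 100)])  # mostly noise + some signal
t = sure_threshold(w, sigma=1.0)
print(round(t, 2))  # a moderate threshold balancing fidelity and noise removal
```

At λ = 0 the risk reduces to σ² (no noise removed), and for very large λ it approaches the raw coefficient energy (everything suppressed); the minimizer sits between the two extremes, which is the tradeoff described above.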
3.4.3 Bayes threshold
The Bayes threshold is defined as:

λ_BAYES = σ̂² / σ̂_x    (39)

where the standard deviation of the noise-free coefficients is estimated by:

σ̂_x = √( max(σ̂_y² − σ̂², 0) )    (40)

with the sub-band sample variance given by:

σ̂_y² = (1/N²) Σ_i y_i²    (41)

and the noise standard deviation estimated by the robust median estimator (usually computed on the finest diagonal sub-band):

σ̂ = Median(|y_i|) / 0.6745    (42)

It is worth mentioning that in case σ̂² > σ̂_y², σ̂_x is taken to be zero, implying that λ_BAYES = ∞, which means, in practice, that all coefficients within the sub-band are suppressed.
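A sketch of the Bayes threshold computation per equations (39)-(42). The use of the finest diagonal (HH1) sub-band for the median estimator is the usual BayesShrink convention and an assumption here, and the synthetic sub-bands are only illustrative.

```python
import numpy as np

def bayes_threshold(sub_band, hh1):
    """BayesShrink-style threshold: sigma from the HH1 median estimator, eq. (42);
    sigma_x from the sub-band's sample variance, eqs. (40)-(41); lam = sigma^2 / sigma_x."""
    sigma = np.median(np.abs(hh1)) / 0.6745           # robust noise estimate
    var_y = np.mean(sub_band ** 2)                    # sigma_y^2 (zero-mean coefficients)
    sigma_x = np.sqrt(max(var_y - sigma ** 2, 0.0))   # signal std, clipped at zero
    return np.inf if sigma_x == 0.0 else sigma ** 2 / sigma_x

rng = np.random.default_rng(3)
hh1 = rng.normal(0.0, 1.0, 4096)                                 # noise-dominated finest sub-band
band = rng.normal(0.0, 3.0, 4096) + rng.normal(0.0, 1.0, 4096)   # signal + noise
print(round(bayes_threshold(band, hh1), 2))  # close to sigma^2 / sigma_x = 1/3
```

Note how the infinite-threshold corner case of the text falls out naturally: when the sub-band carries no energy beyond the noise, σ̂_x = 0 and every coefficient is suppressed.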
3.4.4 Oracle thresholds
The Oracle Thresholds are the theoretically optimal sub-band adaptive thresholds in an MSE sense, assuming the original image is known, a condition that obviously is possible only in simulations. The OracleShrink threshold is defined as:

λ_S = argmin_λ Σ_{k=1}^{N} ( η_λ(y_k) − x_k )²    (43)

where N is the number of wavelet coefficients in the sub-band, η_λ denotes the soft thresholding operator and x_k is the k-th coefficient of the original image. Similarly, the OracleThresh threshold is given by:

λ_H = argmin_λ Σ_{k=1}^{N} ( η̄_λ(y_k) − x_k )²    (44)

where η̄_λ denotes the hard thresholding operator.
This section briefly describes the MPL estimation of the GIMLL model parameter β, given by equation (30). Basically, our motivations for this approach are:
- MPL estimation is a computationally feasible method.
- From a statistical perspective, MPL estimators have a series of desirable properties, such as consistency and asymptotic normality Jensen & Künsch (1994), Winkler (2006).
In recent works in the MRF literature, analytical pseudo-likelihood equations for the Potts MRF model on higher-order neighborhood systems have been derived Levada et al. (2008c), showing the importance of assessing MRF parameter estimation. In the experiments along this chapter, the proposed methodology is based on the approximation for the asymptotic variance of the Potts MRF model reported in Levada et al. (2008b) and Levada et al. (2008a).
4.1.1 Pseudo-likelihood equation
PL(X; β) = Π_{s=1}^{N} [ exp{ −D_s(x_s) } / Σ_{y∈G} exp{ −D_s(y) } ]    (45)

The pseudo-likelihood equation is then obtained by taking the derivative of log PL(X; β) with respect to β and setting it to zero.
In the experiments, the solution is obtained by finding the zero of the resulting equation. We chose Brent's method Brent (1973), a numerical algorithm that does not require the computation (or even the existence) of derivatives. It combines the bisection, secant, and inverse quadratic interpolation methods, leading to a very robust approach with a superlinear convergence rate.
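The root-finding step can be illustrated with SciPy's brentq, an implementation of Brent's method. The 1-D toy field, its two-neighbor system, and the score function below are illustrative assumptions and not the chapter's exact setup; they only show the shape of the computation: evaluate the derivative of the log pseudo-likelihood and bracket its zero.

```python
import numpy as np
from scipy.optimize import brentq

# With D_s(y) = beta * E_s(y), E_s(y) = sum_k [1 - 2 exp(-(y - x_k)^2)], the condition
# d/dbeta log PL(X; beta) = 0 reduces to sum_s [ E[E_s | beta] - E_s(x_s) ] = 0.
def local_energies(x, G):
    """E_s(y) for every site s (1-D circular lattice, two neighbors) and every level y in G."""
    nb = np.stack([np.roll(x, 1), np.roll(x, -1)])
    return np.sum(1.0 - 2.0 * np.exp(-(G[None, :, None] - nb[:, None, :]) ** 2), axis=0)

def pl_score(beta, x, G):
    E = np.asarray(local_energies(x, G))              # shape (len(G), len(x))
    p = np.exp(-beta * E)
    p /= p.sum(axis=0)                                # LCDF per site
    expected = np.sum(p * E, axis=0)                  # E[E_s | beta]
    observed = E[np.searchsorted(G, x), np.arange(x.size)]  # E_s(x_s)
    return np.sum(expected - observed)

G = np.linspace(-1.0, 1.0, 5)
rng = np.random.default_rng(4)
x = G[np.repeat(rng.integers(0, 5, 50), 8)]           # smooth, quantized field
beta_hat = brentq(lambda b: pl_score(b, x, G), 1e-3, 10.0)
print(round(beta_hat, 3))
```

Because the toy field is piecewise constant (spatially smooth), the score is positive for small β and negative for large β, so brentq has a sign change to bracket and returns a positive spatial-dependency estimate.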
4.2 Bilateral filtering
Bilateral Filtering (BF) is a noniterative, local, non-linear spatial-domain filtering technique that was originally proposed as an intuitive tool Tomasi & Manduchi (1998) but was later shown to be closely related to classical partial differential equation based methods, more precisely anisotropic diffusion Barash (2002); Dong & Acton (2007); Elad (2002). The basic idea of bilateral filtering is to use a weighted average of degraded pixels to recover the original pixel by combining a low-pass function (h_D) and an edge-stopping function (h_P) according to the following relationship:

f̂[i, j] = Σ_{(k,n)∈η_{i,j}} h_D[k, n] h_P[k, n] g[k, n]  /  Σ_{(k,n)∈η_{i,j}} h_D[k, n] h_P[k, n]    (47)

h_D[k, n] = exp( −[(k − i)² + (n − j)²] / (2σ_D²) )    (48)

h_P[k, n] = exp( −(g[k, n] − g[i, j])² / (2σ_P²) )    (49)
where the parameters σ_D and σ_P control the effect of the spatial and radiometric weight factors. The first weight, h_D, measures the geometric distance between the central pixel and each one of its neighbors, so that the nearest samples have more influence on the final result than the distant ones. The second weight, h_P, penalizes the neighboring pixels that vary greatly in intensity from the central pixel, so that the larger the difference, the smaller the pixel's contribution during the smoothing. In all experiments along this chapter, we set N = 2 (5 × 5 window), σ_D² = 1 and σ_P² = 0.1.
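A direct NumPy sketch of equations (47)-(49), using the text's parameter values (5 × 5 window, σ_D² = 1, σ_P² = 0.1); the tiny test image is synthetic.

```python
import numpy as np

def bilateral(g, half=2, sigma_d2=1.0, sigma_p2=0.1):
    """Eqs. (47)-(49): weighted average combining a spatial kernel h_D and a
    photometric (edge-stopping) kernel h_P. half=2 gives the 5x5 window of the text."""
    H, W = g.shape
    out = np.zeros_like(g)
    for i in range(H):
        for j in range(W):
            k0, k1 = max(i - half, 0), min(i + half + 1, H)
            n0, n1 = max(j - half, 0), min(j + half + 1, W)
            kk, nn = np.mgrid[k0:k1, n0:n1]
            hd = np.exp(-((kk - i) ** 2 + (nn - j) ** 2) / (2.0 * sigma_d2))
            hp = np.exp(-(g[k0:k1, n0:n1] - g[i, j]) ** 2 / (2.0 * sigma_p2))
            w = hd * hp
            out[i, j] = np.sum(w * g[k0:k1, n0:n1]) / np.sum(w)
    return out

# A step edge survives while small fluctuations are smoothed away.
img = np.zeros((8, 8)); img[:, 4:] = 1.0
img[2, 2] += 0.05                      # small perturbation on the flat side
f = bilateral(img)
print(f[0, 3] < 0.1, f[0, 5] > 0.9)
```

Pixels across the step contribute almost nothing (h_P is tiny for a unit intensity jump with σ_P² = 0.1), so the edge stays sharp, which is the edge-preserving behavior the text describes.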
Basis     Metric   Soft      Hard      GSAShrink
HAAR      ISNR     -0.8484   0.4388    0.5823
          PSNR     25.613    27.032    27.777
          SSIM     0.8012    0.8625    0.8903
DB4       ISNR     0.4864    1.6952    2.2662
          PSNR     27.067    28.705    29.365
          SSIM     0.8598    0.9017    0.9108
SYM4      ISNR     0.6580    1.8093    2.3455
          PSNR     27.257    28.662    29.266
          SSIM     0.8639    0.9001    0.9113
BIOR6.8   ISNR     0.8336    1.9868    2.587
          PSNR     27.549    28.856    29.829
          SSIM     0.8655    0.8981    0.9176
Table 2. Results for wavelet denoising of the Lena image using the sub-band adaptive Universal threshold.
Fig. 2. HL2 and HH1 wavelet sub-bands for the Lena image: (a) a more homogeneous
situation ( = 1.1754) and (b) a more heterogeneous case ( = 0.9397), defined by statistically
different MRF parameter values.
We also included the Biorthogonal 6.8 wavelet, a wavelet transform whose filters have symmetrical impulse responses, that is, linear-phase filters. The motivation for including Biorthogonal wavelets is that it has been reported that, in image processing applications, filters with non-linear phase often introduce visually annoying artifacts in the denoised images.
To perform a quantitative analysis of the obtained results, we compared several metrics for image quality assessment. In this manuscript, we selected three different metrics: Improvement in Signal-to-Noise Ratio (ISNR), Peak Signal-to-Noise Ratio (PSNR) and the Structural Similarity Index (SSIM), since MSE-based metrics have proved to be inconsistent with human visual perception Wang & Bovik (2009).
Sub-band   β̂_MPL    σ_n²(β̂_MPL)
LH2        1.1441   3.1884 × 10⁻⁶
HL2        1.1754   9.1622 × 10⁻⁶
HH2        1.0533   1.8808 × 10⁻⁵
LH1        0.9822   6.2161 × 10⁻⁶
HL1        0.9991   7.3409 × 10⁻⁶
HH1        0.9397   4.5530 × 10⁻⁶
Table 3. MPL estimates of β and asymptotic variances for the Lena image wavelet sub-bands.
Table 2 shows the results for GSAShrink denoising on the Lena image, corrupted by additive
Gaussian noise (PSNR = 26.949 dB). Table 3 shows the estimated regularization MRF
parameters and their respective asymptotic variances for each one of the details sub-bands.
Figure 2 shows the HL2 and HH1 sub-bands of the wavelet decomposition. Note that the coarser a sub-band, the smaller the regularization parameter, indicating that suppression is favored over smoothing, forcing a more intense noise removal.
Analyzing the results, we see that GSAShrink had superior performance in all cases.
Furthermore, the best result was obtained by using GSAShrink with Biorthogonal 6.8 wavelets.
To illustrate these numerical results, Figure 3 shows some visual results for the best
performances.
Fig. 3. Visual results for wavelet denoising using Biorthogonal6.8 wavelets with sub-band
adaptive Universal threshold (Table 2): (a) Noisy Lena; (b) Soft-Threshold; (c)
Hard-Threshold; (d) GSAShrink.
The same experiment was repeated considering other threshold estimation methods. The use of the SURE and Bayes thresholds improved the denoising performance, as indicated in Table 4. As the use of Biorthogonal 6.8 wavelets resulted in uniformly superior performance, from now on we omit the other wavelet filters. Figure 4 shows the visual results for the best case (SURE).
As GSAShrink iteratively converges to local maximum solutions, we performed an experiment to illustrate the effect of using different initializations on the final result by combining spatial-domain (Bilateral Filtering) and wavelet-domain (GSAShrink) non-linear filtering. If, instead of using the observed noisy image directly as input to our algorithm, we use the result of Bilateral Filtering, the performance can be further improved. Table 5 shows a comparison between simple Bilateral Filtering and the combined approach. Figure 5 shows that the use of
Metric   Soft     Hard     GSAShrink
ISNR     2.0235   2.8836   3.3458
PSNR     28.702   29.641   30.441
SSIM     0.8918   0.8991   0.9270
ISNR     2.8511   1.2721   3.2280
PSNR     29.433   28.270   29.880
SSIM     0.8942   0.8306   0.9157
ISNR     3.3713   2.7318   3.6411
PSNR     30.045   29.586   30.609
SSIM     0.9103   0.8964   0.9277
Table 4. Results for wavelet denoising of the Lena image using Biorthogonal 6.8 wavelets with sub-band adaptive thresholds.
Fig. 4. Visual results for wavelet denoising using Biorthogonal6.8 wavelets with sub-band
adaptive SURE threshold (Table 4): (a) Noisy Lena; (b) Soft-Threshold; (c) Hard-Threshold;
(d) GSAShrink.
Bilateral Filtering in the generation of initial conditions to the GSAShrink algorithm prevents
the appearance of visible artifacts that are usually found in wavelet-based methods.
Table 5. Results of using Bilateral Filtering to generate better initial conditions to our
MAP-MRF approach.
Sub-band   β̂_MPL    σ_n²(β̂_MPL)
LH2        0.8066   2.4572 × 10⁻⁵
HL2        0.8898   3.7826 × 10⁻⁵
HH2        0.7338   1.0153 × 10⁻⁵
LH1        0.7245   3.9822 × 10⁻⁵
HL1        0.7674   5.8675 × 10⁻⁵
HH1        0.6195   4.4578 × 10⁻⁵
Table 6. MPL estimates of β and asymptotic variances for the NMR image wavelet sub-bands.
5.1 Results on real image data
6. Conclusion
In this chapter, we investigated a novel MAP-MRF iterative algorithm for wavelet-based image denoising (GSAShrink). Basically, it uses the Bayesian approach and game-theoretic concepts to build a flexible and general framework for wavelet shrinkage. Despite its simplicity, GSAShrink has proved efficient in edge-preserving image filtering. The Generalized Gaussian distribution and a GIMLL MRF model were combined to derive a payoff function which provides a rule for iteratively updating the current value of a wavelet coefficient. This was, to the best of our knowledge, the first time these two models were combined for this purpose. Also, we have shown that in this scenario the MRF model parameter plays the role of a regularization parameter, since it controls the tradeoff between suppression and attenuation, defining a hybrid approach.
Experiments with both simulated and real NMR image data provided good results that were validated by several quantitative image quality assessment metrics. The obtained results
Fig. 5. Results for wavelet denoising using combination of Bilateral Filtering and our
MAP-MRF approach (Table 5): (a) Original Lena; (b) Bilateral Filtering (BF); (c) GSAShrink;
(d) Bilateral Filtering + GSAShrink.
indicated a significant improvement in denoising performance, showing the effectiveness of the proposed method.
Future work may include the use and investigation of more wavelet decomposition levels and other kinds of wavelet transforms, such as wavelet packets and undecimated or stationary transforms, as well as the filtering of other kinds of noise, such as multiplicative speckle and signal-dependent Poisson noise (by using the Anscombe Transform). Finally, we intend to propose and study the viability of other combinatorial optimization shrinkage methods, such as ICMShrink and MPMShrink, based on modified versions of the ICM and MPM algorithms, respectively. Regarding the influence of the initial conditions on the final result, we believe that the use of multiple initializations instead of a single one, together with information fusion techniques, can further improve the denoising performance, particularly in multiframe image filtering/restoration or video denoising, where several frames from the same scene are available and only the noise changes from one frame to another.
Fig. 6. Results for wavelet denoising on real NMR marmoset brain image data: (a) Noisy
NMR image; (b) Soft-Threshold; (c) Hard-Threshold; (d) Bilateral Filtering + GSAShrink.
Fig. 7. HL2 and HH1 wavelet sub-bands for the NMR image: (a) a more homogeneous
situation ( = 0.8898) and (b) a more heterogeneous case ( = 0.6195), defined by statistically
different MRF parameter values.
7. References
Barash, D. (2002). A fundamental relationship between bilateral filtering, adaptive smoothing and the nonlinear diffusion equation, IEEE Transactions on Pattern Analysis and Machine Intelligence 24(6): 844–847.
Berthod, M., Kato, Z. & Zerubia, J. (1995). DPA: A deterministic approach to the MAP problem, IEEE Transactions on Image Processing 4(9): 1312–1314.
Besag, J. (1974). Spatial interaction and the statistical analysis of lattice systems, Journal of the Royal Statistical Society - Series B 36: 192–236.
Besag, J. (1986a). On the statistical analysis of dirty pictures, Journal of the Royal Statistical Society - Series B 48(3): 192–236.
Besag, J. (1986b). On the statistical analysis of dirty pictures, Journal of the Royal Statistical Society - Series B 48(3): 259–302.
Blake, A. & Zisserman, A. (1987). Visual Reconstruction, MIT Press.
Brent, R. (1973). Algorithms for Minimization without Derivatives, Prentice Hall.
Chang, S. G., Yu, B. & Vetterli, M. (2000). Adaptive wavelet thresholding for image denoising and compression, IEEE Trans. on Image Processing 9(9): 1532–1546.
Chou, P. B. & Brown, C. M. (1990). The theory and practice of Bayesian image labeling, International Journal of Computer Vision 4: 185–210.
Daubechies, I. (1988). Orthonormal bases of compactly supported wavelets, Communications on Pure and Applied Mathematics 41(7): 909–996.
Dong, G. & Acton, S. T. (2007). On the convergence of the bilateral filter for edge-preserving image smoothing, IEEE Signal Processing Letters 14(9): 617–620.
Donoho, D. L. (1995). De-noising by soft-thresholding, IEEE Trans. on Information Theory 41(3): 613–627.
Donoho, D. L. & Johnstone, I. M. (1994). Ideal spatial adaptation via wavelet shrinkage, Biometrika 81: 425–455.
Donoho, D. L., Johnstone, I. M., Kerkyacharian, G. & Picard, D. (1995). Wavelet shrinkage: Asymptopia?, Journal of the Royal Statistical Society - Series B 57(2): 301–369.
Elad, M. (2002). On the origin of the bilateral filter and ways to improve it, IEEE Transactions on Image Processing 11(10): 1141–1151.
Geman, S. & Geman, D. (1984). Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images, IEEE Trans. on Pattern Analysis and Machine Intelligence 6(6): 721–741.
Yu, H., Zhao, L. & Wang, H. (2009). Image denoising using trivariate shrinkage filter in the wavelet domain and joint bilateral filter in the spatial domain, IEEE Transactions on Image Processing 18(10): 2364–2369.
Hammersley, J. & Clifford, P. (1971). Markov fields on finite graphs and lattices. Unpublished.
Hudson, H. M. (1978). A natural identity for exponential families with applications in multiparameter estimation, Annals of Statistics 6(3): 473–484.
Jansen, A. & Bultheel, A. (1999). Multiple wavelet threshold estimation by generalized cross-validation for images with correlated noise, IEEE Transactions on Image Processing 8(7): 947–953.
Jansen, M. (2001). Noise Reduction by Wavelet Thresholding, Springer-Verlag.
Jensen, A. & la Cour-Harbo, A. (2001). Ripples in Mathematics, Springer-Verlag Berlin.
Jensen, J. & Künsch, H. (1994). On asymptotic normality of pseudo likelihood estimates for pairwise interaction processes, Annals of the Institute of Statistical Mathematics 46(3): 475–486.
Winkler, G. (2006). Image Analysis, Random Fields and Markov Chain Monte Carlo Methods: A Mathematical Introduction, Springer.
Wolff, U. (1989). Collective Monte Carlo updating for spin systems, Physical Review Letters 62: 361–364.
Wu, J. & Chung, A. C. S. (2007). A segmentation model using compound Markov random fields based on a boundary model, IEEE Transactions on Image Processing 16(1): 241–252.
Yoon, B. J. & Vaidyanathan, P. P. (2004). Wavelet-based denoising by customized thresholding, Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, Vol. 2, pp. 925–928.
Yu, S. & Berthod, M. (1995a). A game strategy approach for image labeling, Computer Vision and Image Understanding 61(1): 32–35.
Yu, S. & Berthod, M. (1995b). A game strategy approach for image labeling, Computer Vision and Image Understanding 61(1): 32–37.
Zhang, B. & Allebach, J. P. (2008). Adaptive bilateral filter for sharpness enhancement and noise removal, IEEE Transactions on Image Processing 17(5): 664–678.
Zhang, M. & Gunturk, B. K. (2008). Multiresolution bilateral filtering for image denoising, IEEE Transactions on Image Processing 17(12): 2324–2333.
5
Image Equalization Using
Singular Value Decomposition
and Discrete Wavelet Transform
Cagri Ozcinar1, Hasan Demirel2 and Gholamreza Anbarjafari3
1Department
1. Introduction
Contrast enhancement is frequently referred to as one of the most important issues in image processing. Contrast is created by the difference in luminance reflected from two adjacent surfaces. In other words, contrast is the difference in visual properties that makes an object distinguishable from other objects and the background. In visual perception, contrast is determined by the difference in the color and brightness of the object with respect to other objects. Our visual system is more sensitive to contrast than to absolute luminance; therefore, we perceive the world similarly regardless of considerable changes in illumination conditions.
If the contrast of an image is highly concentrated in a specific range, e.g. when an image is very dark, information may be lost in those areas which are excessively and uniformly concentrated. The problem is to optimize the contrast of an image in order to represent all the information in the input image. Several techniques have been proposed to overcome this issue (Shadeed et al., 2003; Gonzales and Woods, 2007; Kim et al., 1998; Chitwong et al., 2002). One of the most frequently used techniques is general histogram equalization (GHE). After the introduction of GHE, researchers came up with better techniques such as local histogram equalization (LHE). However, the contrast issue is yet to be fully solved, and even these days many researchers are proposing new techniques for image equalization. In this work, we compare our results with two state-of-the-art techniques, namely, dynamic histogram equalization (DHE) (Abdullah Al Wadud et al., 2007) and our previously introduced singular value equalization (SVE) (Demirel et al. ISCIS 2008).
As mentioned before, in many image-processing applications the GHE technique is one of the
simplest and most effective primitives for contrast enhancement (Kim and Yang, 2006),
which attempts to produce an output histogram that is uniform (Weeks et al., 1999). One of
the disadvantages of GHE is that the information laid on the histogram or probability
distribution function (PDF) of the image will be lost. Demirel and Anbarjafari showed that
the PDF of face images can be used for face recognition (Demirel and Anbarjafari, IEEE
Signal Processing Letter, 2008); hence preserving the shape of the PDF of the image is of vital
importance. Techniques such as DHE and SVE preserve the general pattern of the PDF
of an image. DHE is obtained from dynamic histogram specification (Sun et al., 2005), which
generates the specified histogram dynamically from the input image. The DHE algorithm works
in the following way (Abdullah Al Wadud et al., 2007). Firstly, the locations of the local
minima of the histogram are found, and the histogram is divided into several sub-histograms
based on those local minima. Then the mean, μ, and the standard deviation, σ, of each
sub-histogram are calculated. If the number of gray levels (GLs) having frequencies within
(μ − σ) to (μ + σ) is more than a specific value, e.g. 68.3% of the total number of GLs of a
sub-histogram, then that sub-histogram can be considered a normal distribution of
frequencies with no dominating portion. If it is less than that threshold, the
sub-histogram is split again. Then a weight is computed for each sub-histogram from its GL
span and total frequency, and the GL range of the ith sub-histogram is allocated by the
following equations (Abdullah Al Wadud et al., 2007):
weight_i = span_i × (log10 f_i)^x    (1)

range_i = ( weight_i / Σ_{k=1..n} weight_k ) × (L − 1)    (2)

where span_i and f_i are the GL span and total frequency of the ith sub-histogram, x is a
tunable parameter, n is the number of sub-histograms, and L is the total number of available
GLs. Finally, GHE is applied within each sub-histogram.
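The GHE primitive referred to throughout this section can be sketched in a few lines of NumPy. This is a minimal illustration of the idea, not the code used by any of the cited works; the function and variable names are ours. The gray-level transfer function is simply the normalized cumulative histogram:

```python
import numpy as np

def ghe(image, levels=256):
    """General histogram equalization: map each gray level through the
    normalized cumulative histogram so the output PDF is roughly uniform."""
    hist = np.bincount(image.ravel(), minlength=levels)
    cdf = np.cumsum(hist) / image.size              # cumulative distribution in [0, 1]
    lut = np.round(cdf * (levels - 1)).astype(np.uint8)
    return lut[image]

# A dark image concentrated in [40, 80) is stretched over the full range.
rng = np.random.default_rng(0)
dark = rng.integers(40, 80, size=(64, 64), dtype=np.uint8)
equalized = ghe(dark)
print(dark.min(), dark.max(), equalized.max())
```

Because the mapping depends only on the cumulative histogram, the output histogram flattens while the original PDF shape, as noted above, is not preserved.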
The SVE technique (Demirel et al., ISCIS 2008; Demirel and Anbarjafari, IEEE Signal
Processing Letter, 2008) is based on equalizing the singular value matrix obtained by singular
value decomposition (SVD). The SVD of an image, which can be interpreted as a matrix, is
written as follows:

A = U_A Σ_A V_A^T    (3)

where U_A and V_A are orthogonal square matrices known as the hanger and aligner,
respectively, and the Σ_A matrix contains the sorted singular values on its main diagonal.
The idea of using SVD for image equalization comes from the fact that Σ_A contains the
intensity information of the given image (Tian et al., 2003). The objective of SVE, proposed
by Demirel et al. (ISCIS 2008), is to equalize a low contrast image in such a way that its mean
moves towards the neighbourhood of the 8-bit mean gray value 128 while the general pattern
of the PDF of the image is preserved.
In our earlier work (Demirel and Anbarjafari, IEEE Signal Processing Letter, 2008), where
we introduced PDF based face recognition, singular value decomposition was used to deal
with the illumination problem. The method uses the ratio of the largest singular value of the
generated normalized matrix over that of the normalized image, which can be calculated
according to equation (4):

ξ = max( Σ_{N(μ=0.5, var=1)} ) / max( Σ_A )    (4)

where Σ_{N(μ=0.5, var=1)} is the singular value matrix of the synthetic intensity matrix. This
coefficient can be used to regenerate an equalized image using equation (5):

equalized_A = U_A (ξ Σ_A) V_A^T    (5)

where equalized_A represents the equalized image A. This procedure eliminates the
illumination problem.
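As a concrete illustration, eqs. (3)-(5) can be sketched with NumPy's SVD. This is a minimal sketch under the assumption that intensities are normalized to [0, 1]; the function and variable names are ours:

```python
import numpy as np

def sve(image, rng=np.random.default_rng(0)):
    """Singular value equalization, eqs. (3)-(5): scale the singular values of
    the normalized intensity image by the ratio xi of eq. (4)."""
    a = image.astype(float) / 255.0
    u, s, vt = np.linalg.svd(a, full_matrices=False)     # A = U Sigma V^T
    synthetic = rng.normal(0.5, 1.0, a.shape)            # synthetic N(mu=0.5, var=1) matrix
    xi = np.linalg.svd(synthetic, compute_uv=False).max() / s.max()
    return u @ np.diag(xi * s) @ vt                      # U (xi Sigma) V^T, eq. (5)

dark = np.full((32, 32), 30, dtype=np.uint8)             # uniformly dark image
out = sve(dark)
print(round(dark.mean() / 255, 3), round(float(out.mean()), 3))
```

Note that scaling every singular value by the same ξ brightens the image while leaving the relative structure, and hence the shape of the PDF, unchanged.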
Nowadays, wavelets are used quite frequently in image processing. They have been used
for feature extraction (Wang and Chen, 2006), denoising (Starck et al., 2002), compression
(Lamard et al., 2005), image equalization and enhancement (Demirel et al., IEEE Geoscience
and Remote Sensing Letter, 2010), and face recognition (Liu et al., 2007). The decomposition of
images into different frequency ranges permits the isolation of the frequency components
introduced by intrinsic deformations or extrinsic factors into certain subbands (Dai and
Yan, 2007). This process isolates small changes in an image, mainly in the high
frequency subband images. Hence the discrete wavelet transform (DWT) is a suitable tool
for designing a pose invariant face recognition system. The two-dimensional wavelet
decomposition of an image is performed by applying the one-dimensional DWT along the
rows of the image first, and then decomposing the results along the columns. This
operation results in four decomposed subband images referred to as Low-Low (LL), Low-High
(LH), High-Low (HL), and High-High (HH). The frequency components of these subband
images cover the full frequency spectrum of the original image.
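The rows-then-columns decomposition just described can be illustrated with the Haar wavelet, a much simpler filter than the db9/7 wavelet used later in this chapter; subband naming conventions also vary between texts, so this is a sketch rather than a reference implementation:

```python
import numpy as np

def dwt2_haar(img):
    """One level of the 2-D DWT: 1-D Haar transform along the rows first,
    then along the columns, giving the LL, LH, HL and HH subband images."""
    x = img.astype(float)
    lo = (x[:, 0::2] + x[:, 1::2]) / np.sqrt(2)   # row low-pass (pair averages)
    hi = (x[:, 0::2] - x[:, 1::2]) / np.sqrt(2)   # row high-pass (pair differences)
    ll = (lo[0::2] + lo[1::2]) / np.sqrt(2)
    lh = (lo[0::2] - lo[1::2]) / np.sqrt(2)
    hl = (hi[0::2] + hi[1::2]) / np.sqrt(2)
    hh = (hi[0::2] - hi[1::2]) / np.sqrt(2)
    return ll, lh, hl, hh

img = np.arange(64, dtype=float).reshape(8, 8)
ll, lh, hl, hh = dwt2_haar(img)
print(ll.shape)   # each subband is a quarter-size image: (4, 4)
```

Because the Haar filters are orthonormal, the total energy of the four subbands equals that of the original image, which is what allows the subbands to be processed independently and recombined.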
In this work, we propose a new method for image equalization which is an extension
of SVE and is based on the SVD of the LL subband image obtained by DWT. DWT is used to
separate the input image into different frequency subbands, where the LL subband concentrates
the illumination information. Therefore only the LL subband goes through the SVE process,
which preserves the high frequency components (i.e. edges). Hence, after the IDWT, the
resultant image will be sharper. In this chapter, the proposed method is compared with the
conventional GHE technique as well as LHE and state-of-the-art techniques such as DHE
and SVE. The results indicate the superiority of the proposed method over the
aforementioned methods.
Applying the illumination enhancement in the LL subband only protects the edge
information from possible degradation (Demirel et al., IEEE Geoscience and Remote Sensing
Letter, 2010). After reconstructing the final image by using IDWT, the resultant image will
not only be enhanced with respect to illumination but will also be sharper.
The general procedure of the proposed technique is as follows. The input image, A, is first
processed by using GHE to generate Â. Then both of these images are transformed by
DWT into four subband images. The correction coefficient for the singular value matrix is
calculated by using the following equation:

ξ = max( Σ_{LL_Â} ) / max( Σ_{LL_A} )    (6)

where Σ_{LL_A} is the singular value matrix of the LL subband of the input image and
Σ_{LL_Â} is the singular value matrix of the LL subband of the output of the GHE. The new
LL subband image is composed by:

LL'_A = U_{LL_A} (ξ Σ_{LL_A}) V_{LL_A}^T    (7)

Now the LL'_A and the LH_A, HL_A, and HH_A subband images of the original image are
recombined by applying IDWT to generate the resultant equalized image Ā:

Ā = IDWT( LL'_A, LH_A, HL_A, HH_A )    (8)

In this chapter we have used the db9/7 wavelet function as the mother function of the DWT. In
the following section the experimental results and the comparison with the aforementioned
conventional and state-of-the-art techniques are discussed. Fig. 1 illustrates all steps of the
proposed image equalization technique.
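The steps above can be sketched end to end. This is a minimal sketch, not the authors' implementation: we substitute a Haar wavelet for the db9/7 filter, use a simple GHE, and all function names are ours:

```python
import numpy as np

def ghe(img):                                   # general histogram equalization
    lut = np.round(np.cumsum(np.bincount(img.ravel(), minlength=256)) / img.size * 255)
    return lut[img]

def dwt2(x):                                    # one-level Haar analysis
    lo = (x[:, ::2] + x[:, 1::2]) / np.sqrt(2)
    hi = (x[:, ::2] - x[:, 1::2]) / np.sqrt(2)
    return ((lo[::2] + lo[1::2]) / np.sqrt(2), (lo[::2] - lo[1::2]) / np.sqrt(2),
            (hi[::2] + hi[1::2]) / np.sqrt(2), (hi[::2] - hi[1::2]) / np.sqrt(2))

def idwt2(ll, lh, hl, hh):                      # matching Haar synthesis
    lo = np.empty((2 * ll.shape[0], ll.shape[1]))
    hi = np.empty_like(lo)
    lo[::2], lo[1::2] = (ll + lh) / np.sqrt(2), (ll - lh) / np.sqrt(2)
    hi[::2], hi[1::2] = (hl + hh) / np.sqrt(2), (hl - hh) / np.sqrt(2)
    x = np.empty((lo.shape[0], 2 * lo.shape[1]))
    x[:, ::2], x[:, 1::2] = (lo + hi) / np.sqrt(2), (lo - hi) / np.sqrt(2)
    return x

def equalize(a):
    """Proposed method: scale only the LL singular values by xi of eq. (6)."""
    ll_a, lh, hl, hh = dwt2(a.astype(float))
    ll_g = dwt2(ghe(a).astype(float))[0]        # LL subband of the GHE output
    u, s, vt = np.linalg.svd(ll_a, full_matrices=False)
    xi = np.linalg.svd(ll_g, compute_uv=False).max() / s.max()   # eq. (6)
    return idwt2(u @ np.diag(xi * s) @ vt, lh, hl, hh)           # eqs. (7)-(8)

rng = np.random.default_rng(1)
dark = rng.integers(20, 60, size=(16, 16), dtype=np.uint8)
out = equalize(dark)
print(out.shape, out.mean() > dark.mean())
```

The LH, HL, and HH subbands pass through untouched, which is exactly why the edges survive the brightness correction.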
Image Equalization Using Singular Value Decomposition and Discrete Wavelet Transform
[Block diagram: the low contrast input image and its GHE-equalized version are each decomposed by DWT into LL, LH, HL, and HH subbands; ξ is calculated from the two LL subbands using eq. (6); the corrected LL subband is recombined with LH, HL, and HH by IDWT to produce the equalized image.]
Fig. 1. The detailed steps of the proposed equalization technique.
Fig. 2. (a) Low contrast image; equalized image using: (b) GHE, (c) SVE, (d) DHE, (e) LHE,
and (f) the proposed technique.
Fig. 3. (a) Low contrast image; equalized image using: (b) GHE, (c) SVE, (d) LHE, and (e)
the proposed technique.
Fig. 4. (a) Low contrast image; equalized image using: (b) GHE, (c) SVE, (d) LHE, and (e)
the proposed technique.
4. Conclusions
In this work, a new image equalization technique based on SVD and DWT was proposed.
The proposed technique converts the image from the spatial domain into the DWT domain
and, after equalizing the singular value matrix of the LL subband image, reconstructs the
image in the spatial domain by using IDWT. The technique was compared with the GHE,
LHE, DHE, and SVE techniques. The experimental results showed the superiority of
the proposed method over the conventional and state-of-the-art techniques.
5. Acknowledgements
The authors would like to thank Haidi Ibrahim and Nicholas Sia Pik Kong from the School of
Electrical and Electronic Engineering, Universiti Sains Malaysia for providing the equalized
output images of DHE technique.
6. References
A. R. Weeks, L. J. Sartor, and H. R. Myler, Histogram specification of 24-bit color images in
the color difference (C-Y) color space, Proc. SPIE, 1999, 3646, pp. 319–329.
C. C. Liu, D. Q. Dai, and H. Yan, Local discriminant wavelet packet coordinates for face
recognition, Journal of Machine Learning Research, 2007, Vol. 8, pp: 1165-1195.
C. C. Sun, S. J. Ruan, M. C. Shie, and T. W. Pai, Dynamic contrast enhancement based on
histogram specification, IEEE Transactions on Consumer Electronics, Vol. 51, No.
4, 2005, pp. 1300–1305.
D. Q. Dai and H. Yan, Wavelet and face recognition, Face recognition, Chapter 4, K. Delac
and M. Grgic (Eds), ISBN: 978-3-902613-03-5, Austria, pp: 59-74, 2007.
H. Demirel, G. Anbarjafari, and M. N. S. Jahromi, "Image equalization based on singular
value decomposition", 23rd IEEE International Symposium on Computer and
Information Sciences, Turkey, Oct 2008, pp. 1-5.
H. Demirel and G. Anbarjafari, Pose invariant face recognition using probability
distribution function in different color channels", IEEE Signal Processing Letter,
Vol. 15, 2008, pp. 537 - 540.
H. Demirel, C. Ozcinar, and G. Anbarjafari, "Satellite Image Contrast Enhancement Using
Discrete Wavelet Transform and Singular Value Decomposition", IEEE Geoscience
and Remote Sensing Letter, April 2010, Vol. 7, No. 2, pp. 334-338.
J. L. Starck, E. J. Candes, and D. L. Donoho, The curvelet transform for image denoising,
IEEE Transactions on Image Processing, 2002, Vol. 11, pp: 670-684.
J. W. Wang and W. Y. Chen, Eye detection based on head contour geometry and wavelet
subband projection, Optical Engineering, 2006, Vol. 45, No. 5.
M. Abdullah-Al-Wadud, H. Kabir, M. A. A.Dewan, C. Oksam, A dynamic histogram
equalization for image contrast enhancement, IEEE Transaction on Consumer
Electronics, Vol. 53, No 2, May 2007, pp. 593-600.
M. Lamard, W. Daccache, G. Cazuguel, C. Roux, and B. Cochener, "Use of a JPEG-2000
wavelet compression scheme for content-based ophthalmologic retinal images
retrieval", 27th Annual International Conference of the Engineering in Medicine and
Biology Society, IEEE-EMBS 2005, pp. 4010–4013.
R. C. Gonzalez, and R. E. Woods, Digital Image Processing, Prentice Hall, ISBN 013168728X,
2007.
S. Chitwong, T. Boonmee, and F. Cheevasuvit, Enhancement of colour image obtained from
PCA-FCM technique using local area histogram equalization, Proc. SPIE, 2002,
4787, pp. 98–106.
T. K. Kim, J. K. Paik, and B. S. Kang, Contrast enhancement system using spatially adaptive
histogram equalization with temporal filtering, IEEE Transactions on Consumer
Electronics, Vol. 44, No. 1, 1998, pp. 82–86.
T. Kim, H. S. Yang, A Multidimensional Histogram Equalization by Fitting an Isotropic
Gaussian Mixture to a Uniform Distribution, IEEE International Conference on
Image Processing, 8-11 Oct. 2006, pp. 2865–2868.
W. G. Shadeed, D. I. Abu-Al-Nadi, and M. J. Mismar, Road traffic sign detection in color
images, 10th International Conference on Environmental and Computer Science,
Vol. 2, Dec 2003, pp. 890-893.
Y. Tian, T. Tan, Y. Wang, and Y. Fang, Do singular values contain adequate information for
face recognition?, Pattern Recognition, Vol. 36, 2003, pp. 649–655.
6
Probability Distribution Functions
Based Face Recognition System
Using Discrete Wavelet Subbands
Hasan Demirel1 and Gholamreza Anbarjafari2
1Department
1. Introduction
Face recognition has recently been the centre of attention of many researchers (Jain et al. 2004).
The earliest work in digital face recognition was reported by Bledsoe in 1964. Statistical face
recognition systems, such as the principal component analysis (PCA) based eigenfaces
introduced by Turk and Pentland in 1991, attracted a lot of attention. The Fisherfaces method,
based on linear discriminant analysis, was introduced later by Belhumeur et al. (1997).
Many of these methods are based on grey scale images; however, colour images are
increasingly being used, since they add additional biometric information for face recognition
(Marcel and Bengio, 381). PDFs obtained from different colour channels of a face image can
be considered as the signature of the face, which can be used to represent the face image in a
low dimensional space (Demirel and Anbarjafari, VISSAP 2008). Images with small changes
in translation, rotation and illumination still possess high correlation in their corresponding
PDFs. The PDF of an image is a normalized version of its histogram, which has been
used in many image processing applications such as object detection (Laptev, 2006) and face
recognition (Yoo and Oh, 1999; Rodriguez and Marcel, 2006; Demirel and Anbarjafari, IEEE
Signal Processing Letter, 2008).
Nowadays, wavelets are used quite frequently in image processing. They have been used
for feature extraction (Wang and Chen, 2006), denoising (Starck et al., 2002), compression
(Lamard et al., 2005), and face recognition (Liu et al., 2007; Demirel et al., 2008). The
decomposition of images into different frequency ranges permits the isolation of the
frequency components introduced by intrinsic deformations or extrinsic factors into
certain subbands (Dai and Yan, 2007). This process isolates small changes in an
image, mainly in the high frequency subband images. Hence the discrete wavelet transform
(DWT) is a suitable tool for designing a pose invariant face recognition system.
Another important issue in face recognition systems is face localization. There are several
methods for this task, such as skin tone based face localization for face segmentation. Skin is
a widely used feature in human image processing, with a range of applications (Yang et al.,
2002; Demirel et al., EECS 2008). Many methods have been proposed to use skin colour
pixels for face localization. Chai and Ngan (1999) modelled the skin colour in the YCbCr colour
space. One of the recent methods for face localization, proposed by Nilsson (2007), uses
the local Successive Mean Quantization Transform (SMQT) technique. Local SMQT has
been claimed to be robust to illumination changes, and the receiver operating
characteristics of the method are reported to be very successful for the segmentation of faces
(Nilsson et al., ICASSP 2007). In order to enhance the robustness of the system under
changing illumination conditions, a reliable image equalization technique such as dynamic
histogram equalization (Abdullah et al., 2007) or singular value decomposition based image
equalization (Demirel and Anbarjafari, IEEE Signal Processing Letter, 2008; Sabet et al.,
ISCIS 2008) can be applied in the pre-processing stage.
In this chapter, after the face localization, 2-norm based image equalization technique has
been employed to enhance the robustness of the system under changing illumination. Then
the PDFs of the equalized and segmented faces in different subbands obtained from discrete
wavelet transform (DWT) are calculated. These PDFs are used as statistical feature vectors
for the recognition of faces by minimizing the Kullback-Leibler Divergence (KLD) between
the PDF of a given face and the PDFs of faces in the database. The effect of well-known
decision fusion techniques such as sum rule, median rule, max rule, product rule, majority
voting (MV), and feature vector fusion (FVF), for combining feature vectors in HSI and
YCbCr colour spaces of Low-Low, Low-High, High-Low, and High-High subbands, has
been studied in order to achieve higher recognition performance.
The Head Pose (HP) face database (Gourier et al., 2004) and a subset of the FERET
database (Phillips et al., 2000) with faces containing varying poses, changing from −90° to +90°
of rotation around the vertical axis passing through the neck, were used to test the proposed
system. Both databases include face images with varying poses, and the face images have little
illumination variation. The results are compared with principal component analysis (PCA)
and three state-of-the-art face recognition systems: adaptive local binary pattern (LBP) PDF
based face recognition (Rodriguez and Marcel, 2006), nonnegative matrix factorization
(NMF) introduced by Lee et al. (1999, 2001), and supervised incremental NMF (INMF)
introduced and described by Wen-Sheng et al. (2008).
The SVD of an intensity image A can be written as:

A = U_A Σ_A V_A^T,   A ∈ {R, G, B}    (1)

where U_A and V_A are orthogonal square matrices (the hanger and aligner matrices) and the Σ_A
matrix contains the sorted singular values on its main diagonal (the stretcher matrix). As
reported in (Tian et al., 2003), Σ_A represents the intensity information of a given image. If an
image has low contrast, this problem can be corrected by replacing the Σ_A of the image
with another singular value matrix obtained from an image with no contrast problem. A
normalized intensity image matrix with no illumination problem can be considered to be
one whose PDF is a Gaussian distribution with a mean of 0.5 and a variance of 1. Such a
synthetic intensity matrix, of the same size as the original image, can easily be obtained by
generating random pixel values with a Gaussian distribution with a mean of 0.5 and a variance
of 1. Then the ratio of the largest singular value of the generated normalized matrix over that
of the normalized image can be calculated according to equation (2):

ξ_A = max( Σ_{g(μ=0.5, v=1)} ) / max( Σ_A ),   A ∈ {R, G, B}    (2)

where Σ_{g(μ=0.5, v=1)} is the singular value matrix of the synthetic intensity matrix. This
coefficient can be used to regenerate a new singular value matrix, which is actually an
equalized intensity matrix of the image, generated by equation (3):

equalized_A = U_A (ξ_A Σ_A) V_A^T,   A ∈ {R, G, B}    (3)
Computing the full SVD of a large image is expensive, but the largest singular value can be
obtained more cheaply. Writing the SVD of A as:

A = U Σ V^T    (4)

it follows that:

A^T A = V Σ^2 V^T    (5)

so the eigenvalues of A^T A are the squares of the elements of the main diagonal of Σ, and the
eigenvectors of A^T A are the columns of V. Because Σ is of the form:

Σ = diag(σ_1, σ_2, ..., σ_k),   σ_1 ≥ σ_2 ≥ ... ≥ σ_k ≥ 0,   k = min(m, n)    (6)

for an m × n matrix A, we have:

‖A‖_2 = σ_1    (7)

that is, the 2-norm of a matrix is equal to its largest singular value. Therefore ξ_A can be
easily obtained from:
ξ_A = ‖g(μ=0.5, v=1)‖_2 / ‖A‖_2,   A ∈ {R, G, B}    (8)

where g(μ=0.5, v=1) is a random matrix with a mean of 0.5 and a variance of 1, and A is the
intensity image in the R, G, or B channel. Hence the equalized image can be obtained by:

equalized_A = ξ_A A = ( ‖g(μ=0.5, v=1)‖_2 / ‖A‖_2 ) A,   A ∈ {R, G, B}    (9)
which shows that there is no need to use the singular value decomposition of the intensity
matrices. This reduces the complexity of the equalization procedure. This task, which actually
equalizes the images, eliminates the illumination problem. The SVE technique
has been tested on the Oulu face database (Marszalec et al., 2000) as well as the FERET and
HP face databases. Fig. 1 shows the general steps of the pre-processing phase of
the proposed system.
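Equation (9) reduces the whole equalization to a single scalar multiplication. A minimal sketch, assuming intensities normalized to [0, 1] (function and variable names are ours):

```python
import numpy as np

def norm2_equalize(image, rng=np.random.default_rng(0)):
    """2-norm based equalization, eq. (9): no SVD of the image is needed,
    since the matrix 2-norm already equals the largest singular value."""
    a = image.astype(float) / 255.0
    g = rng.normal(0.5, 1.0, a.shape)              # synthetic g(mu=0.5, var=1) matrix
    xi = np.linalg.norm(g, 2) / np.linalg.norm(a, 2)
    return xi * a                                   # equalized_A = xi * A

dark = np.full((32, 32), 51, dtype=np.uint8)        # flat image at intensity 0.2
out = norm2_equalize(dark)
print(round(float(out.mean()), 2))
```

NumPy's `np.linalg.norm(a, 2)` on a 2-D array returns exactly the largest singular value, so this sketch matches eq. (8) without ever forming U, Σ, or V.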
After applying SVE, the equalized images are used as input to the face detector
prepared by Mike Nilsson (MathWorks, 2008) in order to localize and then crop the face
region and eliminate the undesired background. The segmented face images are used as
inputs of the DWT for the generation of the PDFs of the different subband images in the H, S,
I, Y, Cb, and Cr colour channels. If there is no face in the image, the face detector produces no
output; hence the probability of random noise that has the same colour distribution as a face
but a different shape being accepted is zero, which makes the proposed method reliable.
[Block diagram: input images → SVE equalization → cropping by using the local SMQT method → segmented output images.]
Fig. 1. The pre-processing algorithm, illustrated with a sample image under different
illumination from the Oulu face database: a segmented face is obtained from the input face
image.
The two-dimensional wavelet decomposition of an image is performed by applying the
one-dimensional DWT along the rows of the image first, and then decomposing the results
along the columns (MATLAB 2009). This operation results in four decomposed subband
images referred to as Low-Low (LL), Low-High (LH), High-Low (HL), and High-High (HH). The
frequency components of these subband images cover the full frequency spectrum of the
original image.
The PCA based eigenfaces method works as follows. Each training face image is represented
as a vector Γ_i, and the mean face of the M training images is:

Ψ = (1/M) Σ_{n=1..M} Γ_n    (10)

Each face differs from the mean face by a difference vector:

Φ_i = Γ_i − Ψ    (11)

The covariance matrix of the training set is:

C = (1/M) Σ_{n=1..M} Φ_n Φ_n^T = Φ Φ^T,   Φ = [Φ_1 Φ_2 ... Φ_M]    (12)

Since there are M images in the database, the covariance matrix C has only M − 1 meaningful
eigenvectors. Those eigenvectors, u_l, can be obtained by multiplying the eigenvectors, v_l, of
the matrix L = Φ^T Φ (of size M × M) with the difference vectors in the matrix Φ:

u_l = Σ_{k=1..M} v_lk Φ_k    (13)

The eigenvectors u_l are called the eigenfaces. Eigenfaces with higher eigenvalues contribute
more to the representation of a face image. The face subspace projection vector for every image
is defined by:

Ω^T = [ω_1 ω_2 ... ω_M],   ω_k = u_k^T (Γ − Ψ),   k = 1, 2, ..., M    (14)
The projection vectors are indispensable in face recognition tasks due to their uniqueness.
The projection vector, which represents a given face image in the eigenspace, can be used for
the recognition of faces. The Euclidean distance, ε, between the projection vectors of two
different images (Ω_1 and Ω_2) is used to determine whether a face is recognized correctly or
not:

ε = ‖Ω_1 − Ω_2‖ = ( Σ_{i=1..M} (ω_1i − ω_2i)^2 )^{1/2}    (15)
The PCA face recognition system has been applied to the different colour channels (H, S, I, Y,
Cb and Cr) and, as will be shown in section 7, the recognition rate of the PCA based face
recognition system is increased by fusing the decisions of the different colour channels
using MV.
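The eigenface construction of eqs. (10)-(15) can be sketched as follows; synthetic random "faces" stand in for a real training set, and all data and names here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
faces = rng.random((10, 64))                # M = 10 synthetic "faces" of 64 pixels
psi = faces.mean(axis=0)                    # mean face, eq. (10)
phi = (faces - psi).T                       # difference vectors, eq. (11)

# Eigenvectors of the small M x M matrix L = Phi^T Phi give the eigenfaces
# via eq. (13); only M - 1 of them are meaningful.
vals, vecs = np.linalg.eigh(phi.T @ phi)
order = np.argsort(vals)[::-1]              # larger eigenvalues contribute more
eigenfaces = (phi @ vecs[:, order])[:, :9]
eigenfaces /= np.linalg.norm(eigenfaces, axis=0)

train_proj = eigenfaces.T @ phi             # projection vectors, eq. (14)
query = faces[3] + 0.01 * rng.standard_normal(64)   # noisy copy of face 3
w = eigenfaces.T @ (query - psi)
dists = np.linalg.norm(train_proj - w[:, None], axis=0)  # Euclidean distance, eq. (15)
print(int(np.argmin(dists)))                # nearest projection identifies the face
```

Working with the M × M matrix L instead of the full pixel-space covariance is what makes the method tractable when the number of pixels far exceeds the number of training images.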
3.2 Local binary pattern (LBP) based face recognition
The local binary pattern (LBP) is a non-parametric operator which describes the local spatial
structure of an image. Ojala et al. introduced this operator and showed its high
discriminative power for texture classification (Ojala et al., 1996). At a given pixel position
(x,y), LBP is defined as an ordered set of binary comparisons of pixel intensities between the
centre pixel and its eight neighbour pixels, as shown in Fig 2.
Intensity            Binary
 83  90 225          0 0 1
 98  97 200    →     1 . 1
 45  69 199          0 0 1
Comparison with the centre pixel (97) gives the binary pattern 00111001, i.e. decimal 57.
LBP(x, y) = Σ_{n=0..7} s( i_n − i(x, y) ) 2^n    (16)

where i(x, y) corresponds to the grey value of the centre pixel (x, y), i_n to the grey values of the
8 neighbouring pixels, and the function s(x) is defined as:

s(x) = 1 if x ≥ 0,   s(x) = 0 if x < 0    (17)
Two important properties of the LBP operator are its low computational
complexity and its texture discriminative property. LBP has been used in many image
processing applications such as motion detection, visual inspection, image retrieval, face
detection, and face recognition.
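A direct transcription of eqs. (16)-(17) for a single 3×3 patch can look as follows; the bit ordering is chosen to reproduce the example of Fig. 2, and orderings differ between implementations:

```python
import numpy as np

def lbp_code(patch):
    """LBP code of the centre pixel of a 3x3 patch, eqs. (16)-(17)."""
    c = patch[1, 1]
    # neighbours read clockwise starting from the top-left corner
    n = [patch[0, 0], patch[0, 1], patch[0, 2], patch[1, 2],
         patch[2, 2], patch[2, 1], patch[2, 0], patch[1, 0]]
    # s(x) = 1 for x >= 0: a neighbour >= centre contributes its bit
    return sum(int(v >= c) << (7 - k) for k, v in enumerate(n))

patch = np.array([[83, 90, 225],
                  [98, 97, 200],
                  [45, 69, 199]])
print(lbp_code(patch))   # binary 00111001 -> 57
```

In practice the operator is applied at every interior pixel and the resulting codes are accumulated into the histograms described next.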
In most of the aforementioned applications, a face image is usually divided into small regions. For
each region, a cumulative histogram of LBP code computed at each pixel location within the
region was used as a feature vector.
Ahonen et al. used the LBP operator for face recognition (Ahonen et al., 2004). Their face
recognition system can be explained as follows. A histogram of the labelled image f_1(x, y) can
be defined as:

H_i = Σ_{x,y} I{ f_1(x, y) = i },   i = 0, ..., n − 1    (18)

where n is the number of different labels produced by the LBP operator and

I{A} = 1 if A is true,   I{A} = 0 if A is false    (19)
This histogram contains information about the distribution of the local micropatterns, such
as edges, spots, and flat areas, over the whole image. For efficient face representation,
retaining the spatial information is required; hence the image is divided into regions R0, R1,
, Rm-1, as shown in Fig 3.
H_{i,j} = Σ_{x,y} I{ f_1(x, y) = i } I{ (x, y) ∈ R_j },   i = 0, ..., n − 1,   j = 0, ..., m − 1    (20)

where m is the number of regions and n is the number of LBP bins. In this histogram, a description of the
face on three different levels of locality exists: the labels for the histogram contain
information about the patterns on a pixel level, the labels are summed over a small region to
produce information on a regional level, and the regional histograms are concatenated to
build a global description of the face.
Although Ahonen et al. mention several dissimilarity measures, such as histogram
intersections and log-likelihood statistics, they used a nearest neighbour classifier with the
Chi-square dissimilarity measure in their work (Ahonen et al., 2004).
When the image has been divided into several regions, it can be expected that some of the
regions, such as those containing the eyes, carry more useful information than others for
distinguishing between people. In order to exploit such information, a weight can be set for
each region based on the level of information it contains.
In the proposed system, each face is represented by the PDFs of its colour channels. The PDF
of an image is obtained by normalizing its histogram:

p_i = η_i / N,   i = 0, 1, ..., 255    (21)

where N is the total number of pixels in the image and η_i is the number of pixels having
intensity i.
Given two PDFs, the divergence between them can be calculated by using the Kullback-Leibler
Divergence (KLD). The KLD value, δ, between two given PDFs, q^C and p^C, can be calculated
as follows:

δ(q^C, p^C) = Σ_{i=0..β−1} q_i^C log( q_i^C / p_i^C )    (22)

where β is the number of bins and C is (H, S, I, Y, Cb, or Cr)_{LL,LH,HL,HH}. Strictly speaking,
KLD is not a distance measure, but it quantifies the dissimilarity of the two PDFs: the
smaller the KLD value, the more similar the PDFs.
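Equations (21)-(22) amount to only a few lines of code. In this sketch the eps guard for empty bins is our addition; the chapter does not state how zero-count bins are handled:

```python
import numpy as np

def pdf(channel, bins=256):
    """Normalized histogram of an image channel, eq. (21)."""
    return np.bincount(channel.ravel(), minlength=bins) / channel.size

def kld(q, p, eps=1e-12):
    """Kullback-Leibler divergence of eq. (22); eps avoids log(0)."""
    q, p = q + eps, p + eps
    return float(np.sum(q * np.log(q / p)))

rng = np.random.default_rng(0)
face = rng.integers(0, 256, (32, 32), dtype=np.uint8)
shifted = np.roll(face, 5, axis=1)          # translated copy: identical PDF
other = rng.integers(100, 200, (32, 32), dtype=np.uint8)
print(kld(pdf(face), pdf(shifted)), kld(pdf(face), pdf(other)) > 0.1)
```

The translated copy has exactly the same histogram, so its KLD from the original is zero; this is the invariance to small pose changes that the chapter relies on.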
δ^C = min_{j} δ( q^C, p_j^C ),   C ∈ { (H, S, I, Y, Cb, Cr)_LL, (H, S, I, Y, Cb, Cr)_LH,
(H, S, I, Y, Cb, Cr)_HL, (H, S, I, Y, Cb, Cr)_HH },   j = 1, 2, ..., M    (23)

Here, δ^C is the minimum KLD, reflecting the similarity between the closest image of the
training set in the C subband colour channel and the query face, and M is the number of image
samples. The colour PDFs used in the proposed system are generated only from the segmented
face; hence the effect of background regions is eliminated. Fig. 4 shows two subjects with two
different poses and their segmented faces from the FERET face database.
Fig. 4. Two subjects from the FERET database with two different poses (a), their segmented
faces (b), and their PDFs in the H (c), S (d), I (e), Y (f), Cb (g), and Cr (h) colour channels in
the LL subband, respectively.
[Block diagram: the input face image is detected and cropped using the local SMQT method; each of the H, S, I, Y, Cb, and Cr channels is decomposed by DWT into the LL, LH, HL, and HH subbands; the resulting 24 PDFs feed the KLD based decision making and decision fusion stage.]
λ^C = [δ_1 δ_2 ... δ_nM]^C / Σ_{i=1..nM} δ_i^C ,   P^C = max( 1 − λ^C )    (24)

where λ^C is the vector of normalized KLD values, δ_i indicates the KLD value of the query
image from the ith image in the training set, n is the number of face samples in each class, and
M is the number of classes. The highest similarity between two vectors occurs when the
minimum KLD value is zero. This represents a perfect match, i.e. a selection probability of 1;
since a zero KLD value corresponds to a probability of 1, λ^C is subtracted from 1, and the
maximum of 1 − λ^C is the probability of the selected class. The sum rule is applied by adding
all the probabilities of a class in the different colour
channels of different subbands, followed by declaring the class with the highest accumulated
probability to be the selected class. The maximum rule, as its name implies, simply takes the
maximum among the probabilities of a class in different colour channels of different subbands,
followed by declaring the class with the highest probability to be the selected class. The
median rule similarly takes the median among the sorted probabilities of a class in different
channels. The product rule is obtained from the product of all the probabilities of a class in
the different colour channels of the different subbands. The product rule is very sensitive, as a
single low probability (close to 0) removes any chance of that class being selected.
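The four fixed combination rules can be compared on a toy example; the probability table below is invented purely for illustration:

```python
import numpy as np

# Hypothetical per-classifier probabilities for 3 classes from 4 of the 24
# colour-channel/subband classifiers (rows: classifiers, columns: classes).
probs = np.array([[0.70, 0.20, 0.10],
                  [0.60, 0.30, 0.10],
                  [0.10, 0.80, 0.10],
                  [0.55, 0.25, 0.20]])

rules = {
    "sum":     probs.sum(axis=0),
    "max":     probs.max(axis=0),
    "median":  np.median(probs, axis=0),
    "product": probs.prod(axis=0),
}
for name, scores in rules.items():
    print(name, int(np.argmax(scores)))   # class chosen by each rule
```

Here the sum, median, and product rules all choose class 0, while the max rule is swayed by the single confident vote for class 1, illustrating why the max rule can be less stable than the sum rule.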
MV is one of the most frequently used decision fusion techniques. The main idea behind MV
is to achieve an increased recognition rate by combining the decisions of the PDF based face
recognition procedures of the different colour spaces and subbands. By considering the H, S, I,
Y, Cb and Cr PDFs in the different wavelet subbands separately and combining their results by
using MV, the performance of the classification process is increased. The MV procedure can be
explained as follows. Consider {p_1, p_2, ..., p_M}^C to be the set of PDFs of the training
face images in the wavelet subband colour channels (C = (H, S, I, Y, Cb, or Cr)_{LL,LH,HL,HH});
then, given a query face image, its colour PDFs q^C can be used to calculate the KLD between
q^C and the PDFs of the images in the training samples by equation (23). The image with the
minimum distance in a channel, C, is declared to be the vector representing the recognized
subject. Given the decisions of each classifier in each colour space, the voted class E can be
chosen as follows:
E = mode{ E_C },   C = (H, S, I, Y, Cb, or Cr)_{LL,LH,HL,HH}    (25)

i.e. the class returned most often by the individual classifiers is selected. In FVF, instead of
fusing decisions, the colour PDFs of all the subbands are concatenated into a single feature
vector:

fvf_q = [ q^{H_LL} q^{H_LH} q^{H_HL} q^{H_HH} ... ]    (26)

where only the H colour channel components are shown in equation (26). fvf_q is a vector of
size 1 × 6144, where 6144 is the multiplication of the bin size (256) by the number of colour
channels (6) and the number of subbands (4). This combined PDF can be used to calculate the
KLD between fvf_q and the fvf_p_j of the images in the training samples as follows:
δ_j = δ( fvf_q , fvf_p_j ),   j = 1, ..., M    (27)

where M is the number of images in the training set and fvf_p_j is the combined PDF of the jth
image in the training set. The similarity of the jth image in the training set to the query face is
reflected by δ_j, and the image with the lowest KLD value is declared to be the vector
representing the recognized subject.
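The FVF concatenation and the resulting vector length can be checked directly; here random matrices stand in for the 24 subband channel images of one face:

```python
import numpy as np

def pdf(channel, bins=256):
    return np.bincount(channel.ravel(), minlength=bins) / channel.size

rng = np.random.default_rng(0)
# stand-ins for the 4 subbands of each of the 6 colour channels of one face
subband_channels = [rng.integers(0, 256, (16, 16), dtype=np.uint8) for _ in range(24)]

fvf = np.concatenate([pdf(c) for c in subband_channels])   # feature vector fusion
print(fvf.shape)   # (6144,) = 256 bins x 6 channels x 4 subbands
```

Since each constituent PDF sums to 1, the concatenated vector sums to 24; it is this single long vector, rather than 24 separate decisions, that is compared by KLD in eq. (27).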
# of training images     H      S      I      Y      Cb     Cr
1                       36.89  48.67  49.11  47.33  49.78  49.11
2                       41.50  54.75  53.50  52.50  58.25  57.75
3                       52.86  62.86  56.29  56.57  67.71  64.00
4                       58.00  69.00  64.67  66.00  73.67  70.33
5                       62.40  74.80  69.60  72.80  77.60  74.80
Table 1. Performance of the PCA based system in H, S, I, Y, Cb and Cr colour channels of the
FERET face database.
Table 2 shows the performance of the LBP based face recognition system on the FERET face
database in the HSI and YCbCr colour spaces.
# of face images in
the training set         H      S      I      Y      Cb     Cr
1                       59.33  50.67  48.22  48.44  61.11  62.67
2                       64.50  56.75  54.50  54.75  66.50  66.50
3                       74.57  66.00  64.29  64.86  76.86  76.57
4                       85.67  79.00  77.00  77.67  87.67  86.67
5                       87.60  80.40  78.00  78.40  89.00  89.60
Table 2. Performance of the LBP based system in H, S, I, Y, Cb and Cr colour channels of the
FERET face database.
The correct recognition rates in percent of the PDF based face recognition of LL, LH, HL,
and HH subband images in different colour channels of HSI and YCbCr for the FERET face
database are included in Table 3. Each result is an average of 100 runs, where we have
randomly shuffled the faces in each class.
                             Number of training images
Channel  Subband      1      2      3      4      5
H        LL         67.16  80.43  86.43  89.83  92.96
H        LH         62.89  74.28  76.86  82.43  84.44
H        HL         68.84  79.73  84.00  85.13  88.88
H        HH         68.40  78.48  83.54  86.07  89.08
S        LL         48.27  60.23  64.20  71.00  74.72
S        LH         37.27  48.10  49.14  55.20  57.56
S        HL         42.76  51.63  55.46  59.63  64.76
S        HH         42.49  52.60  55.83  60.17  65.84
I        LL         44.02  54.93  59.74  66.30  70.16
I        LH         37.31  47.88  50.43  55.83  60.44
I        HL         41.96  47.90  52.34  57.27  62.36
I        HH         40.20  47.78  52.37  57.87  60.68
Y        LL         53.93  64.55  70.14  76.33  81.24
Y        LH         36.47  48.13  50.54  57.90  60.80
Y        HL         40.18  47.78  52.31  59.27  63.00
Y        HH         41.27  49.53  54.34  60.60  63.48
Cb       LL         56.09  67.90  74.43  79.23  83.36
Cb       LH         29.49  33.23  35.89  40.60  42.84
Cb       HL         42.93  50.78  55.37  56.03  58.24
Cb       HH         29.84  33.33  35.14  37.23  38.96
Cr       LL         14.65  15.89  17.31  18.33  20.24
Cr       LH         26.24  32.95  36.94  38.60  40.24
Cr       HL         30.84  39.35  39.94  40.63  43.68
Cr       HH         22.04  37.35  52.60  59.47  62.20
Table 3. Performance of the proposed PDF based face recognition system of the DWT
subbands of colour images in H, S, I, Y, Cb and Cr colour channels separately for the FERET
face database.
The performance of the proposed system using decision fusion techniques such as the sum rule,
median rule, max rule, product rule, MV, and FVF between all 24 decisions (an image with
its 4 subband images in 6 colour channels) for the FERET face database is shown in Table 4.
The performance of the conventional PCA, PCA-MV, and LDA systems and of the
state-of-the-art face recognition systems (LBP, LBP-MV, PDF based face recognition using FVF,
NMF, and INMF) for the FERET face database is also included in Table 4.
                             Number of training images
Method                    1      2      3      4      5
MV (all 24 decisions)   82.22  90.45  93.69  95.60  96.88
FVF, H (DWT subbands)   80.44  85.50  95.43  96.33  96.40
FVF, S (DWT subbands)   60.89  68.25  81.71  88.00  90.00
FVF, I (DWT subbands)   63.11  68.50  84.57  91.67  93.60
FVF, Y (DWT subbands)   80.67  85.75  95.71  96.67  96.80
FVF, Cb (DWT subbands)  66.89  70.75  84.00  86.33  89.60
FVF, Cr (DWT subbands)  63.33  61.50  74.29  79.67  83.20
FVF, all subbands       82.89  87.00  96.57  98.80  99.33
SUM RULE                94.53  97.03  98.08  98.49  98.84
MEDIAN RULE             93.82  96.23  97.80  97.98  98.39
MAX RULE                81.71  87.78  90.37  91.83  92.87
PRODUCT RULE            16.58   0.67   0.67   0.67   0.67
PCA                     44.00  52.00  58.29  66.17  68.80
LDA                     61.98  70.33  77.78  81.43  85.00
PCA-MV                  57.11  62.50  65.71  74.00  77.60
LBP                     50.89  56.25  74.57  77.67  79.60
LBP-MV                  54.44  58.75  69.14  81.00  83.20
PDF based (FVF)         80.44  83.75  94.00  97.67  98.00
NMF                     61.33  64.67  69.89  77.35  80.37
INMF                    63.65  67.87  75.83  80.07  83.20
Table 4. Performance of the proposed face recognition system using MV, FVF, PCA, LDA,
LBP, PDF based face recognition, NMF, and INMF based face recognition system for the
FERET database.
The performances of the proposed system using aforementioned data fusion techniques
between all decisions for the HP face database are shown in Table 5.
Recognition Rate (%)
Number of training images     1      2      3      4      5
FVF                         77.93  91.00  92.67  95.11  97.53
                            68.89  79.17  85.71  88.89  88.00
                            64.44  77.50  87.62  92.22  93.33
                            54.07  64.17  77.14  87.78  84.00
                            69.63  80.00  86.67  90.00  89.33
                            59.17  60.00  67.62  75.56  77.33
                            28.15  34.17  41.90  40.00  48.00
MV                          88.15  93.33  97.14  98.67  98.89
                            83.85  96.42  96.76  96.67  97.33
                            84.74  97.00  96.19  97.00  98.53
                            74.74  88.17  90.95  91.47  91.67
                            84.22  97.33  96.86  97.11  96.27
Table 5. Performance of the proposed face recognition system using the MV and FVF based face recognition methods for the HP face database.
6.2 Discussions
The combination of feature vectors, with 5 samples per subject in the training set, achieves 99.33% and 96.88% recognition rates using the FVF and MV methods, respectively, for the FERET face database. The MV and FVF results are 98.89% and 97.53%, respectively, for the HP face database when 5 samples per subject are available in the training set. The results obtained by the proposed system using FVF for the FERET database show improvements of 30.53%, 21.73%, 14.33%, 19.73%, 16.13%, 1.33%, 18.96%, and 16.13% over PCA, PCA-MV, LDA, LBP, LBP-MV, the PDF based face recognition system using FVF, NMF, and INMF, respectively. In all cases, both the FVF and MV approaches outperform the conventional methods in the literature. As expected, the sum rule, median rule, and max rule improve the recognition rate, but as Tables 4 and 5 show, FVF outperforms the other fusion techniques.
7. Conclusion
In this chapter, a new high performance face recognition system using the PDFs obtained from DWT subbands in different colour channels, followed by data fusion, has been proposed. The PDFs of the equalized and segmented face images in different subbands of different colour channels were used as feature vectors for the recognition of faces by minimizing the KLD between the PDF of a given face and the PDFs of the faces in the database. Several fusion techniques, including the sum rule, median rule, max rule, product rule, MV, and FVF, have been employed in order to improve the recognition performance. The system was tested on the FERET and HP face databases. The results have been compared with conventional PCA, PCA improved by MV, LDA, and state-of-the-art face recognition techniques, including LBP, LBP improved by MV, the previously introduced PDF based face recognition using FVF, NMF, and INMF. The performance of the proposed face recognition system has clearly shown its superiority over the conventional and state-of-the-art techniques.
8. References
A. K. Jain, R. Ross, and S. Prabhakar, An introduction to biometric recognition, IEEE
Transaction on Circuits and Systems for Video Technology, 2004, Vol. 14, No. 1, pp:
84-92.
C. C. Liu, D. Q. Dai, and H. Yan, Local discriminant wavelet packet coordinates for face
recognition, Journal of Machine Learning Research, 2007, Vol. 8, pp: 1165-1195.
C. Wen-Sheng, P. Binbin, F. Bin, L. Ming, and T. Jianliang, Incremental nonnegative matrix
factorization for face recognition, Mathematical Problems in Engineering, June
2008, Vol. 2008.
D. Chai and K. N. Ngan, Face segmentation using skin-color map in videophone
applications, IEEE Transactions on Circuits and Systems for Video Technology,
1999, Vol. 9, No. 4, pp: 551-564.
D. D. Lee, and H. S. Seung, Learning the parts of objects by nonnegative matrix
factorization, Nature, 1999, Vol. 401, No. 6755, pp: 788-791.
D. D. Lee, and H. S. Seung, Algorithms for nonnegative matrix factorization, In Proc. of
Advances in Neural Information Processing Systems (NIPS 01), 2001, Vol. 13, pp:
556-562.
D. Q. Dai and H. Yan, Wavelet and face recognition, Face recognition, Chapter 4, K. Delac
and M. Grgic (Eds), ISBN: 978-3-902613-03-5, Austria, pp: 59-74, 2007.
E. Marszalec, B. Martinkauppi, M. Soriano, and M. Pietikäinen, A physics-based face
database for colour research, Journal of Electronic Imaging, 2000, Vol. 9, No. 1, pp:
32-38.
Face detector software written by M. Nilsson, provided in MathWorks. Retrieved in January
2008, http://www.mathworks.com/matlabcentral/fileexchange/13701
H. Demirel, and G. Anbarjafari, "Pose invariant face recognition using image histograms",
The 3rd International Conference On Computer Vision Theory and Applications
(VISAPP 2008), Portugal, Vol. 2, pp: 282-285.
H. Demirel, and G. Anbarjafari, "Pose invariant face recognition using probability
distribution functions in different color channels", IEEE Signal Processing Letters,
2008, Vol. 15, pp: 537-540.
H. Demirel, A. Eleyan, and H. Özkaramanli, Complex wavelet transform based face
recognition, EURASIP Journal on Advances in Signal Processing, 2008, Vol. 2008, doi:
10.1155/2008/185281
H. Demirel, G. Anbarjafari, and M. N. Sabet Jahromi, "Skin detection in HSI colour space",
5th International Conference on Electrical and Computer Systems (EECS08),
November 27-28, 2008, Lefke, North Cyprus.
I. Laptev, Improvements of object detection using boosted histograms, British Machine
Vision Conference (BMVC), 2006, pp: III-949-958.
J. L. Starck, E. J. Candes, and D. L. Donoho, The curvelet transform for image denoising,
IEEE Transactions on Image Processing, 2002, Vol. 11, pp: 670-684.
J. W. Wang and W. Y. Chen, Eye detection based on head contour geometry and wavelet
subband projection, Optical Engineering, 2006, Vol. 45, No. 5.
Y. Rodriguez and S. Marcel, Face authentication using adapted local binary pattern PDFs.
Proceedings of the 9th European Conference on Computer Vision (ECCV), Graz,
Austria, May 7-13 2006, pp: 321-332.
Y. Tian, T. Tan, Y. Wang, and Y. Fang, Do singular values contain adequate information for
face recognition?, Pattern Recognition, 2003, Vol. 36, pp: 649-655.
7

An Improved Low Complexity Algorithm for 2-D Integer Lifting-Based Discrete Wavelet Transform Using Symmetric Mask-Based Scheme

Chih-Hsien Hsia1, Jing-Ming Guo1 and Jen-Shiun Chiang2
1Department of Electrical Engineering, National Taiwan University of Science and Technology
2Department of Electrical Engineering, Tamkang University
Taipei, Taiwan
1. Introduction
Communication and multimedia technologies have developed rapidly in recent years. Digital media and services found in daily life include digital cameras, VCD (Video Compact Disc), DVD (Digital Video Disc), HDTV (High-Definition TeleVision), and video conferencing. Several well-known compression schemes, such as the Differential Pulse Code Modulation (DPCM)-based method (Habibi & Hershel, 1974), DCT-based methods (Feig et al., 1995)(Kondo & Oishi, 2000), and wavelet-based methods (Mallat, 1989), have been well developed in recent years. The lifting-based scheme has recently provided a lower-complexity solution for image/video applications, e.g., JPEG2000, Motion-JPEG2000, MPEG-4 still image coding, and MC-EZBC (Motion Compensation-Embedded Zero Block Coding). However, real-time 2-D DWT (software-based) is still difficult to achieve. Hence, an efficient transformation scheme for large multimedia files is in high demand.
Filter banks for subband image/video coding were introduced in the 1990s, and wavelet coding has been studied extensively and applied successfully since then. The most significant applications include subband coding for audio, image, and video, as well as signal analysis and representation using wavelets. In the past few years, the DWT (Mallat, 1989) has been adopted in a wide range of applications, including speech analysis, numerical analysis, signal analysis, image coding, video compression, pattern recognition, computer vision, and biometrics. The DWT can be viewed as a multi-resolution decomposition, which decomposes a signal into several components in different wavelet frequency bands. Moreover, the 2-D DWT is a modern tool for signal processing applications such as JPEG2000 still image compression, denoising, region of interest (ROI) coding, and watermarking. By factoring the classical wavelet filter into lifting steps, the computational complexity of the corresponding DWT can be reduced by up to 50% (Daubechies & Sweldens, 1998). The lifting steps can be easily implemented, which is different from the direct finite impulse response (FIR) filtering approach.
H(z) = h_0 + h_1 z^{-1} + h_2 z^{-2} + h_3 z^{-3},   (1)

G(z) = g_0 + g_1 z^{-1} + g_2 z^{-2} + g_3 z^{-3}.   (2)
The downsampling operation is then applied to the filtered results. A pair of filters is applied to the signal to decompose the image into the low-low (LL), low-high (LH), high-low (HL), and high-high (HH) wavelet frequency bands. Consider an image of size N×N. Each band is subsampled by a factor of two, so that each wavelet frequency band contains N/2×N/2 samples. The four bands can be integrated to generate an output image with the same number of samples as the original.
In most image compression applications, the above 2-D wavelet decomposition can be applied again to the LL sub-image, forming four new subband images, and so on, to achieve a compact energy in the lower frequency bands.
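The rows-then-columns decomposition described above can be sketched in a few lines. The snippet below is illustrative only: the unnormalized Haar pair (sum/difference) stands in for the low/high filter pair of Eqs. 1-2, and the function name is an assumption, not the chapter's implementation.

```python
import numpy as np

def dwt2_level(img):
    """One separable 2-D DWT level: filter + downsample along rows, then columns.

    Uses the unnormalized Haar pair purely for illustration; any complementary
    low-pass/high-pass pair such as H(z), G(z) of Eqs. 1-2 fits the same scheme.
    """
    lo = lambda a: (a[..., 0::2] + a[..., 1::2]) / 2   # low-pass + subsample
    hi = lambda a: (a[..., 0::2] - a[..., 1::2]) / 2   # high-pass + subsample
    L, H = lo(img), hi(img)                            # first 1-D pass (rows)
    LL, LH = lo(L.T).T, hi(L.T).T                      # second 1-D pass (columns)
    HL, HH = lo(H.T).T, hi(H.T).T
    return LL, LH, HL, HH
```

For an N×N input, each of the four subbands is N/2×N/2, and applying `dwt2_level` again to LL yields the next decomposition level.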
h(z) = h_e(z^2) + z^{-1} h_o(z^2),  g(z) = g_e(z^2) + z^{-1} g_o(z^2),   (3)

       | h_e(z)  g_e(z) |
P(z) = |                |.   (4)
       | h_o(z)  g_o(z) |
The Euclidean algorithm recursively finds the greatest common divisors of the even and
odd parts of the original filters. Since h(z) and g(z) form a complementary filter pair, P(z) can
be factorized into Eq. 5.
        m   | 1  s_i(z) | | 1       0 | | k    0  |
P(z) =  ∏   |           | |           | |         |,   (5)
       i=1  | 0     1   | | t_i(z)  1 | | 0  1/k  |
where si(z) and ti(z) are Laurent polynomials corresponding to the prediction and update
steps, respectively, and k is a nonzero constant. Therefore, the filter bank can be factorized
into three lifting steps. As illustrated in Fig. 2, a lifting-based scheme has the following four
stages:
1) Split phase: The original signal is divided into two disjoint subsets. Specifically, the variable Xe denotes the set of even samples and Xo denotes the set of odd samples. This phase is called the lazy wavelet transform because it does not decorrelate the data, but only subsamples the signal into even and odd samples.
2) Predict phase: The predicting operator P is applied to the subset Xo to obtain the wavelet
coefficients d[n] as in Eq. 6.
d[n]=Xo[n]+P(Xe[n]).
(6)
3) Update phase: Xe[n] and d[n] are combined to obtain the scaling coefficients s[n] after an
update operator U as in Eq. 7.
s[n]=Xe[n]+U(d[n]).
(7)
4) Scaling: In the final step, the normalization factor is applied to s[n] and d[n] to obtain the wavelet coefficients. Equations 8 and 9 describe the implementation of the 5/3 integer lifting analysis DWT and are used to calculate the odd coefficients (high-pass) and even coefficients (low-pass), respectively:

d[n] = X(2n+1) - ⌊(X(2n) + X(2n+2))/2⌋,   (8)

s[n] = X(2n) + ⌊(d[n-1] + d[n] + 2)/4⌋.   (9)
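The four stages can be sketched directly in software. The snippet below is a minimal illustration (not the chapter's hardware design) of one 1-D 5/3 integer lifting level; it assumes the standard JPEG2000 convention, in which the predictor enters with a negative sign, and simple index clamping at the borders:

```python
def lift53_analysis(x):
    """One level of the 1-D 5/3 integer lifting analysis DWT.

    Split / predict / update per the four stages above; borders are handled
    by index clamping (an assumption of this sketch). Returns (s, d).
    """
    assert len(x) % 2 == 0, "even-length signal assumed"
    xe, xo = x[0::2], x[1::2]                      # 1) split (lazy wavelet)
    n = len(xo)
    # 2) predict: d[n] = X(2n+1) - floor((X(2n) + X(2n+2)) / 2)
    d = [xo[i] - (xe[i] + xe[min(i + 1, n - 1)]) // 2 for i in range(n)]
    # 3) update:  s[n] = X(2n) + floor((d[n-1] + d[n] + 2) / 4)
    s = [xe[i] + (d[max(i - 1, 0)] + d[i] + 2) // 4 for i in range(n)]
    return s, d                                    # 4) scaling factor is 1 here


def lift53_synthesis(s, d):
    """Inverse transform: undo the update, then the prediction, then merge."""
    n = len(d)
    xe = [s[i] - (d[max(i - 1, 0)] + d[i] + 2) // 4 for i in range(n)]
    xo = [d[i] + (xe[i] + xe[min(i + 1, n - 1)]) // 2 for i in range(n)]
    x = [0] * (2 * n)
    x[0::2], x[1::2] = xe, xo
    return x
```

Because the synthesis steps exactly mirror the analysis steps, integer samples are reconstructed losslessly regardless of the border convention chosen.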
Although the lifting-based scheme has lower complexity, its long and irregular data paths are a major limitation for efficient hardware implementation. Additionally, the increasing number of pipelined registers increases the transpose memory size of the 2-D DWT architecture.
The transpose memory requirement dominates the hardware cost and complexity of architectures for the 2-D DWT. The 2-D transform operation is shown in Fig. 3.
The 2-D transform operation is shown in Fig. 3.
(a)
(b)
Fig. 3. 2-D LDWT operation. (a) The flow of a traditional 2-D DWT. (b) Detailed processing
flow.
H_i = [d_i - (s_i + s_{i+1})·(1/2)]·K_0,   (10)

L_i = [(H_i + H_{i-1})·(1/4) + s_i]·K_1,   (11)

K_0 = K_1 = 1.   (12)
The proposed scheme has the advantages of fast computational speed, low complexity, reduced latency, and regular data flow.
For speed and simplicity, four masks, 3×3, 5×3, 3×5, and 5×5, are generally used to perform spatial filtering tasks. Moreover, the four-subband processing can be further optimized to speed up the computation and reduce the transpose memory for the DWT coefficients. The four matrix processors consist of four mask filters, each derived from the 2-D DWT of the 5/3 integer lifting-based coefficients. In an LDWT implementation, the 1-D DWT needs massive computation, so the computation unit dominates the hardware cost (Chiang & Hsia, 2005)(Andra et al., 2002). A 2-D DWT is composed of two 1-D DWTs and a block of transpose memory of the same size as the processed image. The transpose memory is the main overhead of the computation unit in the 2-D DWT. Figure 3 shows the block diagram of a traditional 2-D DWT. Without loss of generality, the 5/3 lifting-based 2-D DWT is adopted for comparison. Assuming the image is of size N×N, a large amount of transpose memory (order of N^2) is needed during the transformation to store the temporary data after the first-stage 1-D DWT decomposition. The second-stage 1-D DWT is then applied to the stored data to obtain the four-subband (HH, HL, LH, and LL) results of the 2-D DWT. Because the memory requirement of size N^2 is huge and the processing latency is long, this work proposes a new approach, called the 2-D SMDWT, to reduce the transpose computing latency and the critical path. Figure 5(a) shows the concept of the proposed SMDWT architecture, which consists of an input arrangement unit, a processing element, a memory unit, and a control unit, as shown in Fig. 5(b). The outputs are the 2-D DWT four-subband coefficients HH, HL, LH, and LL. Significant transpose memory can be saved using the proposed approach. This architecture is described in detail in the following subsections, and is illustrated in Figs. 5, 7(c), 8(c), 11(c), and 14(c). This study focuses on the complexity reduction of the 5/3 lifting-based 2-D DWT.
(a)
(b)
Fig. 5. The system block diagram of the proposed 2-D DWT. (a) 2-D SMDWT. (b) Block
diagram of the proposed system architecture.
Without loss of generality, a 6×6-pixel image is employed to demonstrate the 5/3 LDWT operations, as shown in Fig. 6. In Fig. 6, the variable x(i,j) denotes the original image. The upper part of Fig. 6 shows the first-stage 1-D LDWT operations, and the lower part shows the second-stage 1-D LDWT operations for evaluating the four-subband coefficients HH, HL, LH, and LL. In the first stage of the 1-D LDWT, three pixels are used to evaluate a 1-D high-frequency coefficient. For example, x(0,0), x(0,1), and x(0,2) are used to calculate the high-frequency wavelet coefficient b(0,0), where
b(0,0)=[x(0,0)+x(0,2)]/2+x(0,1). The pixels x(0,2), x(0,3), and x(0,4) are used to calculate the next high-frequency wavelet coefficient b(0,1). Here x(0,2) is used to calculate both b(0,0) and b(0,1), and is called the overlapped pixel. The low-frequency wavelet coefficient is calculated using two consecutive high-frequency wavelet coefficients and the overlapped pixel. For example, b(0,0) and b(0,1) are combined with x(0,2) to find the low-frequency wavelet coefficient c(0,1), where c(0,1)=[b(0,0)+b(0,1)]/4+x(0,2). The calculated high-frequency wavelet coefficients b(i,j) and the low-frequency wavelet coefficients c(i,j) are then used in the second-stage 1-D LDWT to calculate the four-subband coefficients HH, HL, LH, and LL. The general form of the mask coefficients is derived first, and the complexity is further reduced by employing the symmetric feature of the mask.
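The running example can be checked numerically. The two helpers below merely transcribe the stated formulas for b(i,j) and c(i,j), using the text's sign convention; the function names and the sample row are illustrative only:

```python
def high_coeff(x, i, j):
    """b(i,j) = [x(i,2j) + x(i,2j+2)]/2 + x(i,2j+1), per the text's convention."""
    return (x[i][2 * j] + x[i][2 * j + 2]) / 2 + x[i][2 * j + 1]

def low_coeff(x, b, i, j):
    """c(i,j) = [b(i,j-1) + b(i,j)]/4 + x(i,2j); x(i,2j) is the overlapped pixel."""
    return (b[i][j - 1] + b[i][j]) / 4 + x[i][2 * j]

row = [[0, 1, 2, 3, 4, 5]]                        # one sample image row
b = [[high_coeff(row, 0, j) for j in range(2)]]   # b(0,0) and b(0,1)
c01 = low_coeff(row, b, 0, 1)                     # c(0,1) reuses overlapped x(0,2)
```

Note how x(0,2) (`row[0][2]`) feeds both high-frequency coefficients and the low-frequency one, which is exactly the overlap the mask-based scheme later exploits.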
3.2 Simplified 2-D SMDWT using symmetric features
1. High-High (HH) band mask coefficients reduction for 2-D SMDWT
According to the 2-D 5/3 LDWT, the HH band coefficients of the SMDWT can be derived as
follows:
HH(i,j) = x(2i+1,2j+1) + (1/4)·Σ_{u=0}^{1} Σ_{v=0}^{1} x(2i+2u, 2j+2v) + (-1/2)·Σ_{u=-1}^{2} x(2i+|u|, 2j+|1-u|).   (13)

The mask shown in Fig. 7(a) can be obtained from Eq. 13, where the variables are α=-1/2, β=1/4, and γ=1. Figure 7(b) shows the DSP architecture and Fig. 7(c) shows the hardware architecture.
The transpose memory requirement is a very important issue in multimedia IC design. Therefore, to make the SMDWT architecture suitable for VLSI implementation, the processing element must be designed to be as simple and modular as possible. However, the product of cost and computation time is always the most important consideration from a VLSI solution point of view, since standardization provides economies of scale. Therefore, speed is sometimes sacrificed to obtain lower-cost hardware while still satisfying the performance requirement. In other words, the SMDWT architecture can be decomposed so as to adjust the cost and computation time product. Its hardware cost and computation time tradeoffs must be carefully considered to find the optimal design for VLSI implementation. A simple SMDWT method for cost and computation time savings is introduced below.
Figure 7(c) shows the concept of the proposed HH-band architecture for the SMDWT. The proposed HH-band architecture consists of shifters (α, β, and γ) and one adder tree with propagation registers, as shown in Fig. 7(c). The architecture design can be divided as follows:
- Input arrangement unit: Three pixels in a column are input to a processing element by the address generator circuits in each cycle. Simultaneously, the input arrangement unit assigns the original input signals to a multiplexer (MUX) that fetches three pixels per cycle, switching among group 1, group 2, and group 3 operations, respectively.
- Coefficient shifter unit: The coefficient shifter values are α=-1/2, β=1/4, and γ=1. Shifters replace multipliers to achieve a high-efficiency architecture by reducing the computational time, critical path, area cost, and power consumption (Tan & Arslan, 2003).
- Adder tree unit: An adder tree architecture is adopted to avoid long signal paths, signal skewing, and hazards caused by signal dependency. Each adder tree level can be viewed as a parallel pipeline stage. This architecture is suitable for realization in hardware design.
(a)
(b)
(c)
Fig. 7. HH band mask coefficients and the corresponding DSP architecture. (a) Coefficients.
(b) DSP architecture. (c) Hardware architecture design.
- Propagation register unit: The current pixels needed for the subband coefficient computation of each group are stored, and the data required by the next horizontal- or vertical-scan computation are kept in propagation registers for reuse. This approach reduces the next access time and the number of computations. The pipeline design is the best method to improve the system throughput.
Based on this structure, the overlapped coefficient part can be reused, as shown in Fig. 7(c).
The complexity of the mask-based method is further reduced by employing the symmetric
feature of the mask. First, the initial horizontal scan is expressed by:
HH(0,0)=βx(0,0)+αx(0,1)+βx(0,2)+αx(1,0)+γx(1,1)
+αx(1,2)+βx(2,0)+αx(2,1)+βx(2,2),
(14)
(15)
where the variable XMH denotes the repeated part after the horizontal third coefficient,
where X denotes group of pixels x, M denotes the mask, and H denotes horizontal
orientation. The general form can be derived as:
XMH=x(i,2j+2)+x(i+1,2j+2)+x(i+2,2j+2).
(16)
(18)
where the variable XMV denotes the repeated part after the vertical third coefficient, where
V denotes vertical orientation. The general form can be derived as:
XMV=x(2i+2,j)+x(2i+2,j+1)+x(2i+2,j+2).
(19)
(21)
where the variable XMD denotes the repeated part after the vertical fifth coefficient, where D
denotes diagonal orientation. The general form can be expressed as:
XMD=x(2i+2,2j+2)+x(2i+2,2j+3)+x(2i+2,2j+4)+x(2i+3,2j+2)+x(2i+4,2j+2). (22)
Since γ=1, the general form can be expressed as:
HH(i+1,j+1)=βx(2i+4,2j+4)+α(x(2i+3,2j+4)+x(2i+4,2j+3))+x(2i+3,2j+3)+XMD, (23)
where i=0~N-1, j=0~N-2.
The repeated part only needs to be calculated once throughout the whole image, which greatly reduces the complexity of the SMDWT.
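The saving can be made concrete by counting additions. The two helper functions below are illustrative only; they compare evaluating a mask of `mask_terms` pixels at `n_positions` positions with and without caching a shared partial sum of `shared_terms` pixels:

```python
def adds_without_reuse(mask_terms, n_positions):
    """Each mask evaluation sums mask_terms pixels: mask_terms - 1 adds apiece."""
    return n_positions * (mask_terms - 1)

def adds_with_reuse(mask_terms, shared_terms, n_positions):
    """The shared part is summed once (shared_terms - 1 adds) and then cached;
    each position combines its unique terms plus the cached partial sum."""
    unique = mask_terms - shared_terms
    return (shared_terms - 1) + n_positions * unique
```

For the 9-term HH mask with a 3-term repeated column, scanning 100 positions drops from 800 additions to 602 in this simple model, before any of the chapter's further hardware optimizations.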
2. High-Low (HL) and Low-High (LH) band mask coefficients reduction for 2-D SMDWT
According to the 2-D 5/3 lifting-based DWT, the HL-band coefficients of the mask-based
DWT can be expressed as follows:
HL(i,j) = (3/4)·x(2i+1,2j) + (1/16)·Σ_{u=0}^{1} Σ_{v=0}^{1} x(2i+2u, 2j-2+4v) + (-1/8)·Σ_{u=0}^{1} x(2i+1, 2j-2+4u) +
+ (-1/8)·Σ_{u=0}^{1} Σ_{v=0}^{1} x(2i+2u, 2j-1+2v) + (1/4)·Σ_{u=0}^{1} x(2i+1, 2j-1+2u) + (-3/8)·Σ_{u=0}^{1} x(2i+2u, 2j).   (24)

The mask shown in Fig. 8(a) can be obtained via Eq. 24, where the mask coefficients are -1/8, 1/16, 1/4, -3/8, and 3/4. The DSP and hardware architectures are also depicted in Figs. 8(b) and (c). The complexity of the SMDWT is further reduced by employing the symmetric feature of the mask.
The initial horizontal scan is expressed by:
(a)
(b)
(c)
Fig. 8. HL band mask coefficients and the corresponding DSP architecture. (a) Coefficients.
(b) DSP architecture. (c) Hardware architecture design.
HL(0,0)=x(0,0)+x(0,1)+x(0,2)+x(0,3)+x(0,4)+x(1,0)+x(1,1)+x(1,2)+
+x(1,3)+x(1,4)+x(2,0)+x(2,1)+x(2,2)+x(2,3)+x(2,4)
=x(0,0)+x(0,1)+x(0,2)+x(0,4)+x(1,0)+x(1,1)+x(1,2)+
+x(1,4)+x(2,0)+x(2,1)+x(2,2)+x(2,4)+XMH+1,
(25)
where the variable XMH+1 denotes the repeated part after the first horizontal coefficient. The
next coefficient can be calculated as:
HL(0,1)=x(0,2)+x(0,3)+x(0,4)+x(0,5)+x(0,6)+x(1,2)+x(1,3)+x(1,4)+
+x(1,5)+x(1,6)+x(2,2)+x(2,3)+x(2,4)+x(2,5)+x(2,6)
=x(0,2)+x(0,4)+x(0,5)+x(0,6)+x(1,2)+x(1,4)+x(1,5)+
+x(1,6)+x(2,2)+x(2,4)+x(2,5)+x(2,6)+XMH+1,
(26)
The general form of the first horizontal step can be derived as:
HL(i,1)=x(i,j+2)+x(i,j+4)+x(i,j+5)+x(i,j+6)+x(i+1,j+2)+x(i+1,j+4)+
+x(i+1,j+5)+x(i+1,j+6)+x(i+2,j+2)+x(i+2,j+4)+x(i+2,j+5)+x(i+2,j+6)+XMH+1, (27)
where i=0~N-1, and
XMH+1=x(i,3)+x(i+1,3)+x(i+2,3).
(28)
(29)
where the variable XMH+n denotes the repeated part after the second horizontal coefficient.
From Eq. 29, the general form can be expressed as:
HL(i,j+2)=x(i,2j+6)+x(i,2j+7)+x(i,2j+8)+x(i+1,2j+6)+x(i+1,2j+7)+x(i+1,2j+8)+
+x(i+2,2j+6)+x(i+2,2j+7)+x(i+2,2j+8)+XMH+n,
(30)
+x(3,2)+x(3,3)+x(3,4)+x(4,0)+x(4,1)+x(4,2)+x(4,3)+x(4,4)
=x(3,0)+x(3,1)+x(3,2)+x(3,3)+x(3,4)+x(4,0)+x(4,1)+
+x(4,2)+x(4,3)+x(4,4)+XMV,
(32)
where the variable XMV denotes the repeated part after the vertical fifth coefficient. The
general form can be expressed as:
HL(i+1,j)=x(2i+3,j)+x(2i+3,j+1)+x(2i+3,j+2)+x(2i+3,j+3)+x(2i+3,j+4)+
+x(2i+4,j)+x(2i+4,j+1)+x(2i+4,j+2)
+x(2i+4,j+3)+x(2i+4,j+4)+XMV,
where i=0~N-1, j=0~N-1, and
(33)
XMV=x(2i+2,j)+x(2i+2,j+1)+x(2i+2,j+2)+x(2i+2,j+3)+x(2i+2,j+4).
(34)
(35)
where the variable XMD+1 denotes the repeated part as shown in the gray part of Fig. 9 after
the first diagonal scan. Next, the HL(2,2) is calculated as:
HL(2,2)=x(4,4)+x(4,5)+x(4,6)+x(4,7)+x(4,8)+x(5,4)+x(5,5)+
+x(5,6)+x(5,7)+x(5,8)+x(6,4)+x(6,5)+x(6,6)+x(6,7)+x(6,8)
=x(5,6)+x(5,7)+x(5,8)+x(6,6)+x(6,7)+x(6,8)+XMD+n,
(36)

x(2,2) x(2,3) x(2,4) x(2,5) x(2,6)
x(3,2) x(3,3) x(3,4) x(3,5) x(3,6)
x(4,2) x(4,3) x(4,4) x(4,5) x(4,6)
Fig. 9. Repeat part (in gray) of the diagonal scanned position HL(1,1).
where the variable XMD+n denotes the repeated part, as shown in the gray part of Fig. 10, after the second diagonal scan. The general form of XMD+n can be expressed as:
XMD+n=x(2i+4,2i+4)+x(2i+4,2i+5)+x(2i+4,2i+6)+x(2i+4,2i+7)+
x(2i+4,2i+8)+x(2i+5,2i+4)+x(2i+5,2i+5)+x(2i+5,2i+6)+
+x(2i+5,2i+7)+x(2i+5,2i+8)+x(2i+6,2i+4)+x(2i+6,2i+5)+
+x(2i+6,2i+6)+ x(2i+6,2i+7)+x(2i+6,2i+8),
(37)
x(4,4) x(4,5) x(4,6) x(4,7) x(4,8)
x(5,4) x(5,5) x(5,6) x(5,7) x(5,8)
x(6,4) x(6,5) x(6,6) x(6,7) x(6,8)
Fig. 10. Repeat part (in gray) of the diagonal scanned position HL(2,2).
The general form of the rest part can be expressed as:
HL(i+1,j+1)=x(2i+6,2j+8)+(x(2i+5,2j+8)+x(2i+6,2j+7))+x(2i+5,2j+7)+
+x(2i+6,2j+6)+x(2i+5,2j+6)+XMD+n,
(38)
(39)
The mask as shown in Fig. 11(a) can be obtained via Eq. 39, where the mask coefficients are -1/8, 1/16, 1/4, -3/8, and 3/4. The DSP and hardware architectures are depicted in Figs. 11(b) and (c). The complexity of the SMDWT is further reduced by employing the symmetric feature of the mask. First, the initial horizontal scan is calculated by a method similar to that of the HL SMDWT, where the variable XMH denotes the repeated part after the horizontal fifth coefficient. The general form can be expressed as:
LH(i,j+1)=x(i,2j+3)+x(i,2j+4)+x(i+1,2j+3)+x(i+1,2j+4)+x(i+2,2j+3)+
+x(i+2,2j+4)+x(i+3,2j+3)+
x(i+3,2j+4)+x(i+4,2j+3)+x(i+4,2j+4)+XMH,
(40)
(41)
Next, the initial vertical scan is calculated by a method similar to that of the HL mask-based DWT, where the variable XMV+1 denotes the repeated part after the vertical first coefficient. The general form of the first vertical step can be expressed as:
LH(1,j)=x(i+2,j)+x(i+2,j+1)+x(i+2,j+2)+x(i+4,j)+x(i+4,j+1)+x(i+4,j+2)+
+x(i+5,j)+x(i+5,j+1)+x(i+5,j+2)+
+x(i+6,j)+x(i+6,j+1)+x(i+6,j+2)+XMV+1,
(42)
(43)
Next, the second vertical scan is calculated by a method similar to that of the HL SMDWT.
LH(i+2,j)=x(2i+6,j)+x(2i+6,j+1)+x(2i+6,j+2)+x(2i+7,j)+x(2i+7,j+1)+
+x(2i+7,j+2)+x(2i+8,j)+x(2i+8,j+1)+x(2i+8,j+2)+XMV+n,
(44)
(a)
(b)
(c)
Fig. 11. LH band mask coefficients and the corresponding DSP architecture. (a) Coefficients.
(b) DSP architecture. (c) Hardware architecture design.
where the variable XMD+1 denotes the repeated part as shown in the gray part of Fig. 12 after
the first diagonal scan.
Next the LH(2,2) is calculated as:
LH(2,2)=x(5,4)+x(6,5)+x(6,6)+x(7,5)+x(7,6)+x(8,4)+x(8,5)+XMD+n, (47)
where the variable XMD+n denotes the repeated part as shown in the gray part of Fig. 13 after
the first diagonal scan. The general form of XMD+n can be expressed as:
XMD+n=x(2i+4,2i+4)+x(2i+4,2i+5)+x(2i+4,2i+6)+x(2i+5,2i+4)+x(2i+5,2i+5)+
+x(2i+5,2i+6)+x(2i+6,2i+4)+x(2i+7,2i+4)+x(2i+8,2i+4).
(48)

x(2,2) x(2,3) x(2,4)
x(3,2) x(3,3) x(3,4)
x(4,2) x(4,3) x(4,4)
x(5,2) x(5,3) x(5,4)
x(6,2) x(6,3) x(6,4)
Fig. 12. Repeat part (in gray) of the diagonal scanned position LH(1,1).
x(4,4) x(4,5) x(4,6)
x(5,4) x(5,5) x(5,6)
x(6,4) x(6,5) x(6,6)
x(7,4) x(7,5) x(7,6)
x(8,4) x(8,5) x(8,6)
Fig. 13. Repeat part (in gray) of the diagonal scanned position LH(2,2).
The general form of the rest part can be expressed as:
LH(i+1,j+1)=x(2i+8,2j+6)+(x(2i+7,2j+6)+x(2i+8,2j+5))+x(2i+7,2j+5)+
+x(2i+6,2j+6)+x(2i+5,2j+6)+XMD+n.
(49)
According to the 2-D 5/3 LDWT, the LL band coefficients of the SMDWT can be derived as follows:

LL(i,j) = (9/16)·x(2i,2j) + (1/64)·Σ_{u=0}^{1} Σ_{v=0}^{1} x(2i-2+4u, 2j-2+4v) + (1/16)·Σ_{u=0}^{1} Σ_{v=0}^{1} x(2i-1+2u, 2j-1+2v) +
+ (-1/32)·Σ_{u=0}^{1} Σ_{v=0}^{1} [x(2i-2+4u, 2j-1+2v) + x(2i-1+2u, 2j-2+4v)] + (3/16)·Σ_{u=0}^{1} [x(2i-1+2u, 2j) + x(2i, 2j-1+2u)] +
+ (-3/32)·Σ_{u=0}^{1} [x(2i-2+4u, 2j) + x(2i, 2j-2+4u)].   (50)
The mask as shown in Fig. 14(a) can be obtained via Eq. 50, where the mask coefficients are -1/32, 1/64, 1/16, -3/32, 3/16, and 9/16. The DSP and hardware architectures are depicted in Figs. 14(b) and (c). The complexity of the SMDWT is further reduced by employing the symmetric feature of the mask. First, the initial horizontal scan is LL(0,0). The next coefficient can be calculated as LL(0,1), where the variable XMH+1 denotes the repeated part after the first horizontal coefficient. The general form of the first horizontal step can be expressed as:
LL(i,1)=x(i,j+2)+x(i,j+4)+x(i,j+5)+x(i,j+6)+x(i+1,j+2)+x(i+1,j+4)+
+x(i+1,j+5)+x(i+1,j+6)+x(i+2,j+2)+x(i+2,j+4)+x(i+2,j+5)+x(i+2,j+6)+
+x(i+3,j+2)+x(i+3,j+4)+x(i+3,j+5)+x(i+3,j+6)+x(i+4,j+2)+
+x(i+4,j+4)+x(i+4,j+5)+x(i+4,j+6)+XMH+1,
(51)
XMH+1=x(i,3)+x(i+1,3)+x(i+2,3)+x(i+3,3)+x(i+4,3).
(52)
The next coefficient can be calculated as LL(0,2), where the variable XMH+n denotes the repeated part after the second horizontal coefficient. From LL(0,2), the general form can be expressed as:
LL(i,j+2)=x(i,2j+6)+x(i,2j+7)+x(i,2j+8)+x(i+1,2j+6)+x(i+1,2j+7)+
+x(i+1,2j+8)+x(i+2,2j+6)+x(i+2,2j+7)+x(i+2,2j+8)+
+x(i+3,2j+6)+x(i+3,2j+7)+x(i+3,2j+8)+x(i+4,2j+6)+
+x(i+4,2j+7)+x(i+4,2j+8)+XMH+n,
(53)
(54)
(a)
(b)
(c)
Fig. 14. LL band mask coefficients and the corresponding DSP architecture. (a) Coefficients. (b) DSP architecture. (c) Hardware architecture design.
The vertical scan can be done in the same way, where LL(0,0) is the same as in the horizontal case. The next coefficient can be calculated as LL(1,0). Next, the initial vertical scan is calculated by a method similar to that of the LH SMDWT, where the variable XMV+1 denotes the repeated part after the vertical first coefficient. The general form of the first vertical step can be expressed as:
LL(1,j)=x(2i,j)+x(2i,j+1)+x(2i,j+2)+x(2i,j+3)+x(2i,j+4)+x(2i+4,j)+
+x(2i+4,j+1)+x(2i+4,j+2)+x(2i+4,j+3)+x(2i+4,j+4)+x(2i+5,j)+
+x(2i+5,j+1)+x(2i+5,j+2)+x(2i+5,j+3)+x(2i+5,j+4)+
+x(2i+6,j)+x(2i+6,j+1)+x(2i+6,j+2)+x(2i+6,j+3)+x(2i+6,j+4)+XMV+1,
(55)
(56)
Next, the second vertical scan is calculated by a method similar to that of the LH SMDWT.
LL(i+2,j)=x(2i+6,j)+x(2i+6,j+1)+x(2i+6,j+2)+x(2i+6,j+3)+x(2i+6,j+4)+
+x(2i+7,j)+x(2i+7,j+1)+x(2i+7,j+2)+x(2i+7,j+3)+x(2i+7,j+4)+x(2i+8,j)+
+x(2i+8,j+1)+x(2i+8,j+2)+x(2i+8,j+3)+x(2i+8,j+4)+XMV+n,
(57)
(58)
(59)
where the variable XMD+1 denotes the repeated part as shown in the gray part of Fig. 15 after
the first diagonal scan.
Next, the LL(2,2) is calculated as:
LL(2,2)=x(6,5)+x(6,6)+x(6,7)+x(7,5)+x(7,6)+x(7,7)+x(7,8)+
+x(8,5)+x(8,6)+x(8,7)+x(8,8)+XMD+n,
(60)
where the variable XMD+n denotes the repeated part, as shown in the gray part of Fig. 16, after the second diagonal scan; the repeated part after the third diagonal scan is shown in the gray part of Fig. 17. The general form of XMD+n can be expressed as:
x(2,2) x(2,3) x(2,4) x(2,5) x(2,6)
x(3,2) x(3,3) x(3,4) x(3,5) x(3,6)
x(4,2) x(4,3) x(4,4) x(4,5) x(4,6)
x(5,2) x(5,3) x(5,4) x(5,5) x(5,6)
x(6,2) x(6,3) x(6,4) x(6,5) x(6,6)
Fig. 15. Repeat part (in gray) of the diagonal scanned position LL(1,1).
x(4,4) x(4,5) x(4,6) x(4,7) x(4,8)
x(5,4) x(5,5) x(5,6) x(5,7) x(5,8)
x(6,4) x(6,5) x(6,6) x(6,7) x(6,8)
x(7,4) x(7,5) x(7,6) x(7,7) x(7,8)
x(8,4) x(8,5) x(8,6) x(8,7) x(8,8)
Fig. 16. Repeat part (in gray) of the diagonal scanned position LL(2,2).
x(6,6)  x(6,7)  x(6,8)  x(6,9)  x(6,10)
x(7,6)  x(7,7)  x(7,8)  x(7,9)  x(7,10)
x(8,6)  x(8,7)  x(8,8)  x(8,9)  x(8,10)
x(9,6)  x(9,7)  x(9,8)  x(9,9)  x(9,10)
x(10,6) x(10,7) x(10,8) x(10,9) x(10,10)
Fig. 17. Repeat part (in gray) of the diagonal scanned position LL(3,3).
XMD+n=x(2i+6,2i+6)+x(2i+6,2i+7)+x(2i+6,2i+8)+x(2i+6,2i+9)+
+x(2i+6,2i+10)+x(2i+7,2i+6)+x(2i+7,2i+7)+x(2i+7,2i+8)+
+x(2i+7,2i+9)+x(2i+7,2i+10)+x(2i+8,2i+6)+x(2i+8,2i+7)+
+x(2i+8,2i+10)+x(2i+9,2i+6)+x(2i+9,2i+7)+x(2i+10,2i+6)+x(2i+10,2i+7). (66)
The general form of the rest part can be expressed as:
LL(i+1,j+1)=x(2i+8,2i+8)+x(2i+8,2i+9)+x(2i+9,2i+8)+x(2i+9,2i+9)+
+x(2i+9,2i+10)+x(2i+10,2i+8)+x(2i+10,2i+9)+x(2i+10,2i+10)+XMD+n,
(67)
XMH+1 of HL(i,1): x(i,3)+x(i+1,3)+x(i+2,3).
Complexity reduction: original SMDWT, 14 adders and 15 multipliers; simplified SMDWT, 12 adders and 0 multipliers.
XMH+n of HL(i,j+2): x(i,2j+4)+x(i,2j+5)+x(i+1,2j+4)+x(i+1,2j+5)+x(i+2,2j+4)+x(i+2,2j+5).
Complexity reduction: original SMDWT, 14 adders and 15 multipliers; simplified SMDWT, 9 adders and 0 multipliers.
XMV of HL(i+1,j): x(2i+2,j)+x(2i+2,j+1)+x(2i+2,j+2)+x(2i+2,j+3)+x(2i+2,j+4).
Complexity reduction: original SMDWT, 14 adders and 15 multipliers; simplified SMDWT, 10 adders and 0 multipliers.
XMH of LH(i,j+1): x(i,2j+2)+x(i+1,2j+2)+x(i+2,2j+2)+x(i+3,2j+2)+x(i+4,2j+2).
Complexity reduction: original SMDWT, 14 adders and 15 multipliers; simplified SMDWT, 10 adders and 0 multipliers.
XMV+1 of LH(1,j): x(2i+3,0)+x(2i+3,1)+x(2i+3,2).
Complexity reduction: original SMDWT, 14 adders and 15 multipliers; simplified SMDWT, 12 adders and 0 multipliers.
XMV+n of LH(i+2,j): x(2i+4,j)+x(2i+4,j+1)+x(2i+4,j+2)+x(2i+5,j)+x(2i+5,j+1)+x(2i+5,j+2).
Complexity reduction: original SMDWT, 14 adders and 15 multipliers; simplified SMDWT, 9 adders and 0 multipliers.
XMH+1 of LL(i,1): x(i,3)+x(i+1,3)+x(i+2,3)+x(i+3,3)+x(i+4,3).
Complexity reduction: original SMDWT, 24 adders and 25 multipliers; simplified SMDWT, 20 adders and 0 multipliers.
XMH+n of LL(i,j+2): x(i,2j+4)+x(i,2j+5)+x(i+1,2j+4)+x(i+1,2j+5)+x(i+2,2j+4)+x(i+2,2j+5)+x(i+3,2j+4)+x(i+3,2j+5)+x(i+4,2j+4)+x(i+4,2j+5).
Complexity reduction: original SMDWT, 24 adders and 25 multipliers; simplified SMDWT, 15 adders and 0 multipliers.
XMV+1 of LL(1,j): x(3,j)+x(3,j+1)+x(3,j+2)+x(3,j+3)+x(3,j+4).
Complexity reduction: original SMDWT, 24 adders and 25 multipliers; simplified SMDWT, 20 adders and 0 multipliers.
XMV+n of LL(i+2,j): x(2i+4,j)+x(2i+4,j+1)+x(2i+4,j+2)+x(2i+4,j+3)+x(2i+4,j+4)+x(2i+5,j)+x(2i+5,j+1)+x(2i+5,j+2)+x(2i+5,j+3)+x(2i+5,j+4).
Complexity reduction: original SMDWT, 24 adders and 25 multipliers; simplified SMDWT, 15 adders and 0 multipliers.
136
60
40
2-D Mask scheme DWT
2-D Lifting-based DWT
20
0
0
0.5
1.5
2.5
3.5
Rate(bpp)
Fig. 19. PSNR (dB) versus Rate (bpp) comparison between 2-D LDWT and the proposed 2-D
SMDWT.
The architecture of the 2-D SMDWT has many advantages over the 2-D LDWT. For example, the critical path of the 2-D LDWT is potentially longer than that of the SMDWT. Moreover, the 2-D LDWT is frame-based, with the implementation bottleneck being the huge transpose memory size. This work uses the symmetric feature of the masks in the SMDWT to improve the design. Experimental results, as shown in Table 7, indicate that the proposed algorithm is superior to most previous works. The proposed algorithm offers efficient solutions for reducing the critical path (defined as the longest time-weighted sequence of events from the start of the program to its termination, with examples shown in Figs. 7(c), 8(c), 11(c), and 14(c)), the latency (the time between the arrival of a new signal and the first signal output becoming available in the system), and the hardware cost, as shown in Figs. 7, 8, 11, 14, and 20, and Table 6. The SMDWT approach requires a transpose memory of size (N/2)+26, where (N/2) is the on-chip memory size and 26 is the number of registers. The proposed 2-D DWT adopts parallel and pipeline schemes to reduce the transpose memory and increase the operating speed. Shifters and adders replace multipliers in the computation to increase hardware utilization and reduce hardware cost. An N×N 2-D lifting-based DWT is designed at the RTL (Register Transfer Level) and simulated with Verilog HDL in this work.
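The replacement of multipliers by shifters and adders can be illustrated in software. The following Python sketch (an illustration only, not the hardware design itself; periodic boundary extension is assumed here for brevity) computes one 1-D level of the integer 5/3 lifting DWT using nothing but additions and binary shifts, together with its exact inverse:

```python
def lift_53_analysis(x):
    # One 1-D level of the integer 5/3 DWT. The constant factors 1/2 and 1/4
    # become right shifts, so no multiplier is needed.
    n = len(x)  # n must be even
    d = []      # highpass (detail) coefficients
    for i in range(1, n, 2):
        left, right = x[i - 1], x[(i + 1) % n]        # periodic extension
        d.append(x[i] - ((left + right) >> 1))        # predict step: >>1 is /2
    a = []      # lowpass (approximation) coefficients
    for i in range(0, n, 2):
        dl, dr = d[(i // 2 - 1) % len(d)], d[i // 2]
        a.append(x[i] + ((dl + dr + 2) >> 2))         # update step: >>2 is /4
    return a, d

def lift_53_synthesis(a, d):
    # Exact inverse: undo the update step, then the predict step.
    n = 2 * len(a)
    x = [0] * n
    for i in range(0, n, 2):
        dl, dr = d[(i // 2 - 1) % len(d)], d[i // 2]
        x[i] = a[i // 2] - ((dl + dr + 2) >> 2)
    for i in range(1, n, 2):
        x[i] = d[i // 2] + ((x[i - 1] + x[(i + 1) % n]) >> 1)
    return x
```

Because every operation is an integer add or shift, the lifting structure reconstructs the input exactly, which is the property exploited by the multiplierless hardware.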
Fig. 20. 2-D LDWT critical path. (a) HH band. (b) HL band. (c) LH band. (d) LL band.
Table 7. Comparison of 2-D DWT architectures.4

Method                  2-D DWT   Wavelet   Transpose memory1   Latency2   Computing time3   Complexity
ISO/IEC, 2000           LDWT      Integer   N2                  N/A        N/A               Complex
Varshney et al., 2007   LDWT      Integer   3N                  13         N/A               Medium
Chen, 2002              LDWT      Integer   3N                  N/A        (N2/2)+N+5        Medium
Proposed                SMDWT     Integer   (N/2)+26            N/A        N2/4+3            Simple

1 Transpose memory is used to store the frequency coefficients in the 1-L 2-D DWT.
2 In a system, latency denotes any delay or waiting time that increases the real or perceived response time beyond the desired response time; for the 2-D DWT it is measured from the input of the original image to the first subband output.
3 Computing time represents the time used to compute an image of size N×N.
4 The image is assumed to be of size N×N.
The SMDWT also has the advantages of regular signal coding, a short critical path, reduced latency, and independent subband processing. Moreover, the SMDWT easily reduces the transpose memory access time and overlaps accesses to the original signal, so the power consumption of the 2-D LDWT can also be improved by the SMDWT.
5. Conclusions
This work proposes a novel 2-D SMDWT fast algorithm, which is superior to the 5/3 LDWT. The algorithm solves the latency problem that previous schemes incur through the multiple-layer transpose decomposition operation. Moreover, it meets real-time requirements and can be further applied to 3-D wavelet video coding [30].
The proposed 2-D SMDWT algorithm has the advantages of fast computation, low complexity, reduced latency, low transpose memory, and regular data flow, and is suitable for VLSI implementation. Possible future works are described below:
1. The dual-mode 2-D SMDWT for JPEG2000: a dual-mode 2-D SMDWT can be developed to support 5/3 (lossless) or 9/7 (lossy) lifting with a similar hardware architecture, since the 5/3 and 9/7 filters are structurally similar and both have low complexity.
2. A high-performance JPEG2000 codec: since part of the JPEG2000 encoder is symmetric to the decoder, the complexity of both the encoder and the decoder can be reduced.
3. The independent four-subband masks can be used in other visual coding fields (e.g., visual processing, visual compression, and visual recognition).
6. References
Andra, K.; Chakrabarti, C. & Acharya, T. (2000), A VLSI architecture for lifting-based
wavelet transform, IEEE Workshop on Signal Processing Systems, (October 2000), pp.
70-79.
Andra, K.; Chakrabarti, C. & Acharya, T. (2002), A VLSI architecture for lifting-based
forward and inverse wavelet transform, IEEE Transactions on Signal Processing, vol.
50, no.4, (April 2002), pp. 966-977.
Chen, S.-C. & Wu, C.-C. (2002). An architecture of 2-D 3-level lifting-based discrete wavelet
transform, Proceedings of the VLSI Design/CAD Symposium, (August 2002), pp. 351-354.
Chen, P.-Y. (2002). VLSI implementation of discrete wavelet transform using the 5/3 filter,
IEICE Transactions on Information and Systems, vol. E85-D, no.12, (December 2002),
pp. 1893-1897.
Chen, P. & Woods, J. W. (2004). Bidirectional MC-EZBC with lifting implementation, IEEE
Transactions on Circuits and Systems for Video Technology, vol. 14, no. 10, (October
2004), pp. 1183-1194.
Chiang, J.-S.; Hsia, C.-H. & Chen, H.-J. (2005). 2-D discrete wavelet transform with efficient
parallel scheme, International Conference on Imaging Science, Systems, and Technology:
Computer Graphics, (June 2005), pp. 193-197.
Chiang, J.-S.; Hsia, C.-H.; Chen, H.-J. & Lo, T.-J. (2005). VLSI architecture of low memory
and high speed 2-D lifting-based discrete wavelet transform for JPEG2000
applications, IEEE International Symposium on Circuits and Systems, (May 2005), pp.
4554-4557.
Chiang, J.-S. & Hsia, C.-H. (2005). An efficient VLSI architecture for 2-D DWT using lifting
scheme, IEEE International Conference on Systems and Signals, (April 2005), pp. 528-531.
Daubechies, I. & Sweldens, W. (1998). Factoring wavelet transforms into lifting steps, The
Journal of Fourier Analysis and Applications, vol. 4, no.3, (1998), pp. 247-269.
Diou, C.; Torres, L. & Robert, M. (2001). An embedded core for the 2-D wavelet transform,
IEEE on Emerging Technologies and Factory Automation Proceedings, vol. 2, (2001), pp.
179-186.
Feig, E.; Peterson, H. & Ratnakar, V. (1995). Image compression using spatial prediction,
IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 4, (May
1995), pp. 2339-2342.
Habibi, A. & Hershel, R. S. (1974). A unified representation of differential pulse code
modulation (DPCM) and transform coding systems, IEEE Transactions on
Communications, vol. 22, no. 5, (May 1974), pp. 692-696.
Huang, C.-T.; Tseng, P.-C. & Chen, L.-G. (2005). Analysis and VLSI architecture for 1-D and
2-D discrete wavelet transform, IEEE Transactions on Signal Processing, vol. 53, no.
4., (April 2005), pp. 1575-1586.
ISO/IEC JTC1/SC29 WG1. (2000). JPEG 2000 Part 1 Final Committee Draft Version 1.0,
Information Technology.
ISO/IEC JTC1/SC29/WG1 Wgln 1684. (2000). JPEG 2000 Verification Model 9.0.
ISO/IEC JTC1/SC29 WG11. (2001). Coding of Moving Pictures and Audio, Information
Technology.
ISO/IEC ISO/IEC 15444-3. (2002). Motion JPEG2000, Information Technology.
Kondo, H. & Oishi, Y. (2000). Digital image compression using directional sub-block DCT,
International Conference on Communications Technology, vol. 1, (August 2000), pp. 21-25.
Lan, X.; Zheng, N. & Liu, Y. (2005). Low-power and high-speed VLSI architecture for lifting-based forward and inverse wavelet transform, IEEE Transactions on Consumer
Electronics, vol. 51, no.2, (May 2005), pp. 379-385.
Lee, W.-T.; Chen, H.-Y.; Hsiao, P.-Y. & Tsai, C.-C. (2003). An efficient lifting based
architecture for 2-D DWT used in JPEG2000, Proceeding of the VLSI Design/ CAD
Symposium, (August 2003), pp. 577-580.
Lian, C.-J.; Chen, K.-F.; Chen, H.-H. & Chen, L.-G. (2001). Lifting based discrete wavelet
transform architecture for JPEG2000, IEEE International Symposium on Circuits and
Systems, vol. 2, (May 2001), pp. 445-448.
Liao, H.; Mandal, M. Kr. & Cockburn, B. F. (2004). Efficient architecture for 1-D and 2-D
lifting-based wavelet transforms, IEEE Transactions on Signal Processing, vol. 52, no.
5, (May 2004), pp. 1315-1326.
Mallat, S. G. (1989). A theory for multi-resolution signal decomposition: The wavelet
representation, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11,
no. 7, (July 1989), pp. 674-693.
Ohm, J.-R. (2005). Advances in scalable video coding, Proceedings of the IEEE, Invited Paper,
vol. 93, no.1, (January 2005), pp. 42-56.
Srinivasan, K. S. S. (2002). VLSI implementation of 2-D DWT/IDWT cores using 9/7-tap
filter banks based on the non-expansive symmetric extension scheme, IEEE
International Conference on VLSI Design, (January 2002), pp. 435-440.
Part 3
Biomedical Applications
8
ECG Signal Compression Using
Discrete Wavelet Transform
Prof. Mohammed Abo-Zahhad
2. ElectroCardioGraphy
ECG signal is a recording of the electrical activity of the heart over time produced by an
electrocardiograph and is a well-established diagnostic tool for cardiac diseases. ECG signal
is monitored by placing sensors at defined positions on chest and limb extremities of the
subject. Each heart beat is caused by a section of the heart generating an electrical signal
which then conducts through specialized pathway to all parts of the heart. These electrical
signals also get transmitted through the chest to the skin where they can be recorded. The
following four steps in the generation of ECG signal can be monitored:
1. The S-A node (natural pacemaker) creates an electrical signal.
2. The electrical signal follows natural electrical pathways through both atria. The
movement of electricity causes the atria to contract, which helps push blood into the
ventricles.
3. The electrical signal reaches the A-V node (electrical bridge). There, the signal pauses to
give the ventricles time to fill with blood.
4. The electrical signal spreads through the His-Purkinje system. The movement of
electricity causes the ventricles to contract and push blood out to lungs and body.
ECG signal is obtained from a machine known as an Electrocardiograph, which captures the
signal through an array of electrode sensors placed at standard locations on the skin of the
human body. Modern electrocardiographs record ECG signals by digitizing and then
storing the signal in magnetic or optical discs. An automated diagnostic system is required
to speed up the diagnostic process and assist the cardiologists in examining patients using
non-invasive techniques. Electrical impulses in the heart originate in the sinoatrial node and
travel through the heart muscle where they impart electrical initiation of systole or
contraction of the heart. The electrical waves can be measured at selectively placed
electrodes (electrical contacts) on the skin. Electrodes on different sides of the heart measure
the activity of different parts of the heart muscle. An ECG displays the voltage between
pairs of these electrodes, and the muscle activity that they measure, from different
directions, also understood as vectors. The ECG signal is composed of five waves labeled with five capital letters: P, Q, R, S, and T. The width of a wave on the horizontal axis represents a measure of time, while the height and depth of a wave represent a measure of voltage. An upward deflection of a wave is called a positive deflection, and a downward deflection is called a negative deflection. A typical representation of the ECG waves is presented in Figure (1) (Moody, 1992).
The electrocardiogram essentially reads the electrical impulses that stimulate the heart to
contract. It is probably the most useful tool to determine whether the heart has been injured
or how it is functioning. The ECG signal is made up of a number of segments or waves of
different durations, amplitudes, and forms: slow, low-frequency P and T waves and short
and high-frequency Q, R, and S waves, forming the QRS complex. The P wave, the QRS complex, and the T wave are the diagnostically critical waves. The P wave represents the atrial depolarization, during which blood is squeezed from the atria into the ventricles. The QRS complex occurs when the ventricles depolarize and squeeze blood from the left ventricle into the aorta. The T wave represents the period when the ventricles repolarize (get ready for the next heartbeat). Most of the ECG signal energy is concentrated in the QRS complex, but there are diagnostically important changes in the low-amplitude PQ and ST intervals and in the P and T waves.
W(s,\tau) = \int_{-\infty}^{\infty} f(x)\,\psi_{s,\tau}(x)\,dx    (1)

where

\psi_{s,\tau}(x) = \frac{1}{\sqrt{s}}\,\psi\!\left(\frac{x-\tau}{s}\right)    (2)
Here s and \tau are called the scale and translation parameters, respectively, W(s,\tau) denotes the wavelet transform coefficients, and \psi is the fundamental mother wavelet. If W(s,\tau) is given, f(x) can be obtained using the inverse continuous wavelet transform (ICWT), described by:
f(x) = \frac{1}{C_{\psi}} \int\!\!\int W(s,\tau)\,\frac{\psi_{s,\tau}(x)}{s^{2}}\,d\tau\,ds    (3)

where

C_{\psi} = \int \frac{|\Psi(u)|^{2}}{|u|}\,du    (4)
The discrete wavelet transform can be written in the same form as Equation (1), which emphasizes the close relationship between the CWT and the DWT. The most obvious difference is that the DWT uses scale and position values based on powers of two: s = 2^{j}, \tau = k \cdot 2^{j} with (j,k) \in Z^{2}, as shown in Equation (5).

\psi_{j,k}(x) = \frac{1}{\sqrt{s_{0}^{j}}}\,\psi\!\left(\frac{x - k\,\tau_{0}\,s_{0}^{j}}{s_{0}^{j}}\right)    (5)
The key issues in the DWT and the inverse DWT are signal decomposition and reconstruction, respectively. The basic idea behind decomposition and reconstruction is low-pass and high-pass filtering combined with downsampling and upsampling, respectively. The result of wavelet decomposition is a hierarchically organized set of decompositions, and one can choose the level of decomposition j based on a desired cutoff frequency. Figure (3-a) shows an implementation of a three-level forward DWT based on a two-channel recursive filter bank, where h_0(n) and h_1(n) are the low-pass and high-pass analysis filters, respectively, and the block \downarrow 2 represents the downsampling operator by a factor of 2. The input signal x(n) is recursively decomposed into a total of four subband signals: a coarse signal C_3(n) and three detail signals, D_3(n), D_2(n), and D_1(n), of three resolutions. Figure (3-b) shows an implementation of a three-level inverse DWT based on a two-channel recursive filter bank, where \tilde{h}_0(n) and \tilde{h}_1(n) are the low-pass and high-pass synthesis filters, respectively, and the block \uparrow 2 represents the upsampling operator by a factor of 2. The four subband signals C_3(n), D_3(n), D_2(n), and D_1(n) are recursively combined to reconstruct the output signal \tilde{x}(n). The four finite impulse response filters satisfy the following relationships:

h_1(n) = (-1)^{n} h_0(n)    (6)

\tilde{h}_0(n) = h_0(1-n)    (7)

\tilde{h}_1(n) = h_1(1-n)    (8)

so that the output of the inverse DWT is identical to the input of the forward DWT.
Fig. 3. A three-level two-channel iterative filter bank (a) forward DWT (b) inverse DWT
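As a minimal software sketch of the two-channel iterative filter bank of Fig. 3 (using the orthonormal Haar filter pair rather than the chapter's 5/3 filters, purely to keep the example short), three analysis levels followed by three synthesis levels reconstruct the input:

```python
import math

S = 1 / math.sqrt(2)

def analysis(x):
    # Low-pass h0 = [S, S] and high-pass h1 = [S, -S], each followed by
    # downsampling by 2 (filter, then keep every second sample).
    c = [S * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    d = [S * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return c, d

def synthesis(c, d):
    # Upsampling by 2 followed by the synthesis filters; for Haar this
    # interleaves S*(c+d) and S*(c-d).
    x = []
    for ci, di in zip(c, d):
        x.extend((S * (ci + di), S * (ci - di)))
    return x

def forward_dwt3(x):
    # Three-level recursion of Fig. 3(a): x -> C1 -> C2 -> C3, keeping details.
    c1, d1 = analysis(x)
    c2, d2 = analysis(c1)
    c3, d3 = analysis(c2)
    return c3, d3, d2, d1

def inverse_dwt3(c3, d3, d2, d1):
    # Three-level recursion of Fig. 3(b).
    return synthesis(synthesis(synthesis(c3, d3), d2), d1)
```

With an orthonormal pair the synthesis bank undoes the analysis bank exactly, which is the perfect-reconstruction property stated above.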
The compression ratio (CR) is defined as the ratio of the number of bits representing the
original signal to the number required for representing the compressed signal. So, it can be
calculated from:
CR = \frac{N\,b_{c}}{(N_{S}+M)(b_{S}+1)}    (9)

where b_c is the number of bits representing each original ECG sample. One of the most
difficult problems in ECG compression applications and reconstruction is defining the error
criterion. Several techniques exist for evaluating the quality of compression algorithms. In
some literature, the root mean square error (RMS) is used as an error estimate. The RMS is
defined as
RMS = \sqrt{\frac{\sum_{n=1}^{N}\left(x(n)-\tilde{x}(n)\right)^{2}}{N}}    (10)

where x(n) is the original signal, \tilde{x}(n) is the reconstructed signal, and N is the length of the window over which the RMS is calculated (Zou & Tewfik, 1993). This is a purely mathematical error estimate without any diagnostic considerations.
The distortion resulting from ECG processing is frequently measured by the percentage root-mean-square difference (PRD) (Ahmed et al., 2000). In previous work, the focus has been on how much compression a specific algorithm can achieve without losing too much diagnostic information. Most ECG compression algorithms employ the PRD measure; other error measures, such as variants of the PRD with different normalizations and the signal-to-noise ratio (SNR), are used as well (Javaid et al., 2008). The distortion of the reconstructed signal is desired to be as low as possible so that clinical acceptability is preserved. To enable comparison between signals with different amplitudes, a modification of the RMS error estimate has been devised. The PRD is defined as:
PRD = \sqrt{\frac{\sum_{n=1}^{N}\left(x(n)-\tilde{x}(n)\right)^{2}}{\sum_{n=1}^{N} x^{2}(n)}} \times 100\%    (11)
This error estimate is the one most commonly used in all scientific literature concerned with
ECG compression techniques. The main drawbacks are the inability to cope with baseline
fluctuations and the inability to discriminate between the diagnostic portions of an ECG
curve. However, its simplicity and relative accuracy make it a popular error estimate among
researchers (Benzid et al., 2003; Blanco-Velasco et al., 2004).
As the PRD is heavily dependent on the mean value, it is more appropriate to use the modified criterion:

PRD1 = \sqrt{\frac{\sum_{n=1}^{N}\left(x(n)-\tilde{x}(n)\right)^{2}}{\sum_{n=1}^{N}\left(x(n)-\bar{x}\right)^{2}}} \times 100\%    (12)
where \bar{x} is the mean value of the signal. Furthermore, it is established in (Zigel et al., 2000) that if the PRD1 value is between 0 and 9%, the quality of the reconstructed signal is either very good or good, whereas if the value is greater than 9% its quality group cannot be determined. As we are strictly interested in very good and good reconstructions, it is taken that the PRD value, as measured with (11), must be less than 9%.
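Equations (11) and (12) are straightforward to evaluate; the sketch below (using a hypothetical toy signal) also shows why PRD1 is preferred when a baseline offset is present:

```python
import math

def prd(x, x_rec):
    # Eq. (11): error energy relative to the raw signal energy, in percent.
    num = sum((a - b) ** 2 for a, b in zip(x, x_rec))
    return 100 * math.sqrt(num / sum(a * a for a in x))

def prd1(x, x_rec):
    # Eq. (12): mean-removed denominator, so a DC baseline cannot mask distortion.
    m = sum(x) / len(x)
    num = sum((a - b) ** 2 for a, b in zip(x, x_rec))
    return 100 * math.sqrt(num / sum((a - m) ** 2 for a in x))
```

For a signal riding on a large baseline, the same reconstruction error yields a deceptively small PRD but a much larger PRD1, which is exactly the mean-dependence discussed above.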
In (Zigel et al., 2000), a new error measure for ECG compression techniques, called the
weighted diagnostic distortion measure (WDD), was presented. It can be described as a
combination of mathematical and diagnostic subjective measures. The estimate is based on
comparing the PQRST complex features of the original and reconstructed ECG signals. The
WDD measures the relative preservation of the diagnostic information in the reconstructed
signal. The features investigated include the location, duration, amplitudes and shapes of
the waves and complexes that exist in every heartbeat. Although, the WDD is believed to be
a diagnostically accurate error estimate, it has been designed for surface ECG recordings.
More recently (Al-Fahoum, 2006), quality assessment of ECG compression techniques using
a wavelet-based diagnostic measure has been developed. This approach is based on
decomposing the segment of interest into frequency bands where a weighted score is given
to the band depending on its dynamic range and its diagnostic significance.
g[n] = (-1)^{n}\,h[L-1-n], \quad n = 0, 1, \ldots, L-1    (13)
where L is the filter length. To adapt the mother wavelet to the signals for the purpose of compression, it is necessary to define a family of wavelets that depends on a set of parameters and a quality criterion for wavelet selection (i.e., wavelet parameter optimization). These concepts have been adopted to derive a new approach for ECG signal compression based on dyadic discrete orthogonal wavelet bases, with the mother wavelet selected to minimize the reconstruction error. An orthogonal wavelet transform decomposes a signal into dilated and translated versions of the wavelet function \psi(t). The wavelet function \psi(t) is based on a scaling function \phi(t), and both can be represented by dilated and translated versions of this scaling function:
\phi(t) = \sqrt{2}\sum_{n=0}^{L-1} h(n)\,\phi(2t-n) \quad \text{and} \quad \psi(t) = \sqrt{2}\sum_{n=0}^{L-1} g(n)\,\phi(2t-n)    (14)
With these coefficients h(n) and g(n), the transfer functions of the filter bank that is used to implement the discrete orthogonal wavelet transform can be formulated:

H(z) = \sum_{n=0}^{L-1} h(n)\,z^{-n} \quad \text{and} \quad G(z) = \sum_{n=0}^{L-1} g(n)\,z^{-n}    (15)
For a finite impulse response (FIR) filter of length L, there are L/2 + 1 sufficient conditions to ensure the existence and orthogonality of the scaling function and wavelets (Donoho & Johnstone, 1998). Thus L/2 - 1 degrees of freedom (free parameters) remain to design the filter h.
R(\theta_i) = \begin{pmatrix} \cos\theta_i & -\sin\theta_i \\ \sin\theta_i & \cos\theta_i \end{pmatrix}    (16)

\begin{pmatrix} H_e(z^2) & H_o(z^2) \\ G_e(z^2) & G_o(z^2) \end{pmatrix} = \prod_{i=1}^{L/2} \begin{pmatrix} \cos\theta_i & -\sin\theta_i \\ \sin\theta_i & \cos\theta_i \end{pmatrix} \begin{pmatrix} 1 & 0 \\ 0 & z^{-1} \end{pmatrix}    (17)

where

H(z) = H_e(z^2) + z^{-1} H_o(z^2)    (18a)

and

G(z) = G_e(z^2) + z^{-1} G_o(z^2)    (18b)
To obtain the expressions for the coefficients of H(z) in terms of the rotational angles, it is
necessary to multiply out the above matrix product. In order to parameterize all orthogonal
wavelet transforms leading to a simple implementation, the following facts should be
considered.
1. Orthogonality is structurally imposed by using lattice filters consisting of orthogonal
rotations.
2. The sufficient condition for constructing a wavelet transform, namely one vanishing moment of the wavelet, is guaranteed by ensuring that the sum of all rotation angles of the filters is exactly 45°.
A suitable architecture for the implementation of the orthogonal wavelet transforms are
lattice filters. However, the wavelet function should be of zero mean, which is equivalent to
the wavelet having at least one vanishing moment and the transfer functions H(z) and G(z)
have at least one zero at z =-1 and z=1 respectively. These conditions are fulfilled if the sum
of all rotation angles is 45o (Xie & Morris, 1994), i.e.,
\sum_{i=1}^{L/2} \theta_i = 45^{\circ}    (19)
Therefore, a lattice filter whose rotation angles sum to 45° performs an orthogonal WT independent of the angles of each rotation. For a lattice filter of length L, the L/2 orthogonal rotation angles can be parameterized as:

\alpha_1 = 45^{\circ} - \theta_1, \quad \alpha_i = (-1)^{i}\left(\theta_{i-1} + \theta_i\right) \ \text{for } i = 2, 3, \ldots, L/2-1, \quad \alpha_{L/2} = (-1)^{L/2}\,\theta_{L/2-1}    (20)
At the end of the decomposition process, a set of vectors representing the wavelet
coefficients is obtained
C = \left\{ d_1, d_2, d_3, \ldots, d_j, \ldots, d_m, a_m \right\}    (21)
where, m is the number of decomposition levels of the DWT. This set of approximation and
detail vectors represents the DWT coefficients of the original signal. Vectors d j contain the
detail coefficients of the signal in each scale j. As j varies from 1 to m, a finer or coarser detail
coefficients vector is obtained. On the other hand, the vector am contains the approximation
wavelet coefficients of the signal at scale m. It should be noted that this recursive procedure
can be iterated at most (m \le \log_2 N) times. Depending on the choice of m, a different set of
coefficients can be obtained. The inverse transform can be performed using a similar
recursive approach. Thus, the process of decomposing the signal x can be reversed, that is
given the approximation and detail information it is possible to reconstruct x. This process
can be realized as up-sampling (by a factor of 2) followed by filtering the resulting signals
and adding the filter outputs. The impulse responses \tilde{h} and \tilde{g} can be derived from h and g. However, to generate an orthogonal wavelet, h must satisfy some constraints. The basic condition is \sum_{n} h(n) = \sqrt{2}, to ensure the existence of \phi. Moreover, for orthogonality, h must be of norm one, \|h\| = 1, and must satisfy the quadratic condition

\sum_{n=1}^{L} h(n)\,h(n-2k) = 0, \quad \text{for } k = 1, \ldots, L/2-1    (22)
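These conditions can be checked numerically; the sketch below verifies the sum, norm, and quadratic conditions for the well-known Daubechies length-4 (D4) low-pass filter:

```python
import math

# Daubechies length-4 (D4) low-pass filter, normalized so that sum(h) = sqrt(2).
s3 = math.sqrt(3)
h = [(1 + s3) / (4 * math.sqrt(2)),
     (3 + s3) / (4 * math.sqrt(2)),
     (3 - s3) / (4 * math.sqrt(2)),
     (1 - s3) / (4 * math.sqrt(2))]

sum_h = sum(h)                                    # existence: equals sqrt(2)
norm_h = math.sqrt(sum(v * v for v in h))         # orthogonality: equals 1
quad = sum(h[n] * h[n - 2] for n in range(2, 4))  # Eq. (22) with k = 1
```

Since L = 4, the quadratic condition is needed only for k = 1, and the D4 filter satisfies all three constraints to machine precision.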
h(i) = \frac{\left[1 + (-1)^{i}\cos\theta_1 + \sin\theta_1\right]\left[1 - (-1)^{i}\cos\theta_2 - \sin\theta_2\right] + (-1)^{i}\,2\sin\theta_2\cos\theta_1}{4\sqrt{2}}, \quad i = 0, 1

h(i) = \frac{1}{\sqrt{2}} - h(i-4) - h(i-2), \quad i = 4, 5    (23)
For other values of L, expressions for h are given in (Maitrot et al., 2005). With this wavelet parameterization there are infinitely many available wavelets, depending on the design parameter vector \theta, to represent the ECG signal at hand. Different values of \theta may lead to different quality in the reconstructed signal. In order to choose the optimal values of \theta, and thus the optimal wavelet, a blind criterion of performance is needed. Figure (6) illustrates the block diagram of the proposed compression algorithm. In order to establish an efficient solution scheme, the following precise problem formulation is developed. For this purpose, consider the one-dimensional vector x(i), i = 1, 2, 3, ..., N, representing the frame of the ECG signal to be compressed, where N is the number of its samples. The initial threshold values are computed separately for each subband by finding the mean (\mu) and standard deviation (\sigma) of the magnitude of the non-zero wavelet coefficients in the corresponding subband. If \sigma is greater than \mu, then the threshold value in that subband is set to (2\sigma); otherwise, it is set to (\mu - \sigma). Also, define the targeted performance measures PRD_target and CR_target, and start with an initial wavelet design parameter vector \theta^0 = [\theta_1^0, \theta_2^0, \ldots, \theta_{L/2-1}^0] to construct the wavelet filters H(z) and G(z). Figure (7) illustrates the compression algorithm for satisfying a predefined PRD (PRD1) with the minimum bit-rate representation of the signal. The same algorithm with small modifications is used for satisfying a predefined bit rate with minimum signal distortion measured by the PRD (PRD1); this is case 2. In that case, the two shaded blocks are replaced by "CR calculation" and "predefined CR is reached?", respectively.
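The threshold-versus-PRD iteration can be sketched as follows (a simplified illustration: an orthonormal transform is assumed so that the PRD can be evaluated directly in the wavelet domain, and the threshold is simply halved each iteration instead of re-optimizing the wavelet):

```python
import math

def prd_after_threshold(coeffs, T):
    # With an orthonormal transform, the reconstruction error energy equals the
    # energy of the discarded (sub-threshold) coefficients, so no inverse
    # transform is needed to evaluate the PRD.
    err = sum(c * c for c in coeffs if abs(c) < T)
    return 100 * math.sqrt(err / sum(c * c for c in coeffs))

def threshold_for_target_prd(coeffs, prd_target):
    # Start from the coarsest (cheapest) threshold and lower it until the
    # predefined PRD is satisfied: larger T keeps fewer coefficients
    # (fewer bits) at the price of a larger PRD.
    T = max(abs(c) for c in coeffs)
    while prd_after_threshold(coeffs, T) > prd_target:
        T /= 2
    return T
```

The returned threshold is the largest of the swept values that still meets the PRD target, which corresponds to the minimum-bit-rate exit of the loop.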
process and when the target is met, the encoding simply stops. Similarly, given a bit stream,
the decoder can cease decoding at any point and can produce reconstruction corresponding to
all lower-rate encodings. EZW, introduced in (Shapiro, 1993), is a very effective and computationally simple embedded coding algorithm for image compression based on the discrete wavelet transform. The SPIHT algorithm, introduced for image compression in (Said & Pearlman, 1996), is a refinement of EZW and uses its principles of operation.
Fig. 7. Compression Algorithm for Satisfying Predefined PRD with Minimum Bit Rate.
These principles are: partial ordering of the transform coefficients by magnitude with a set-partitioning sorting algorithm, ordered bit-plane transmission, and exploitation of self-similarity across different scales of an image wavelet transform. The partial ordering is done by comparing the transform coefficient magnitudes with a set of octavely decreasing thresholds. In this algorithm, a transmission priority is assigned to each coefficient to be transmitted; using these rules, the encoder always transmits the most significant bits to the decoder first. In (Lu et al., 2000), the SPIHT algorithm was modified for 1-D signals and used for ECG compression. For ECG compression, the SPIHT algorithm can be described as follows:
1. The ECG signal is divided into contiguous non-overlapping frames, each of N samples, and each frame is encoded separately.
2. The DWT is applied to the ECG frames up to L decomposition levels.
3. Each wavelet coefficient is represented by a fixed-point binary format, so it can be
treated as an integer.
4. SPIHT algorithm is applied to these integers (produced from wavelet coefficients) for
encoding them.
5. The termination of the encoding algorithm is specified by a threshold value determined in advance; changing this threshold gives different compression ratios.
6. The output of the algorithm is a bit stream (0s and 1s), which is used for reconstructing the signal after compression. From it, by going through the inverse of the SPIHT algorithm, a vector of N wavelet coefficients is computed, and the inverse wavelet transform then yields the reconstructed N-sample frame of the ECG signal.
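The embedded property of steps 5-6, namely that any prefix of the bit stream yields a valid coarser reconstruction, can be illustrated with a stripped-down bit-plane coder (magnitudes only; SPIHT adds sign coding and the set-partitioning sorting on top of this idea):

```python
def encode(mags, planes):
    # Transmit magnitudes bit-plane by bit-plane, most significant plane first.
    T = 1
    while 2 * T <= max(mags):
        T *= 2
    bits, thresholds = [], []
    for _ in range(planes):
        thresholds.append(T)
        bits.extend(1 if m & T else 0 for m in mags)
        T >>= 1
        if T == 0:
            break
    return bits, thresholds

def decode(bits, n, thresholds):
    # Any prefix of whole planes gives a valid (coarser) reconstruction.
    mags = [0] * n
    for p, T in enumerate(thresholds):
        for i, b in enumerate(bits[p * n:(p + 1) * n]):
            if b:
                mags[i] += T
    return mags
```

Truncating the stream after any whole plane simply stops refining the magnitudes, which is how a SPIHT decoder "can cease decoding at any point".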
In (Pooyan et al., 2005), the above algorithm is tested with N = 1024 samples, L = 6 levels, and the biorthogonal 9/7 DWT (with symmetric filters h(n) of length 9 and g(n) of length 7). The filters' coefficients are given in Table (1).
n    h(n)         g(n)
0    0.852699     0.788485
1    0.377403     0.418092
2    -0.11062     -0.04069
3    -0.023849    -0.064539
4    0.037829     -

Table 1. Coefficients of the biorthogonal 9/7 filters (the filters are symmetric, so only the coefficients for n >= 0 are listed).

In (Tai et al., 2005), a 2-D ECG compression approach that exploits the redundancy between adjacent heartbeats has been presented. The QRS complex in each
heartbeat is detected for slicing and aligning a 1-D ECG signal to a 2-D data array, and then
2-D wavelet transform is applied to the constructed 2-D data array. Consequently, a
modified SPIHT algorithm is applied to the resulting wavelet coefficients for further
compression. The way that the 2-D ECG algorithm presented in (Tai et al., 2005) differs from
other 2-D algorithms, (Reza et al., 2001; Ali et al., 2003), is that it not only utilizes the
interbeat correlation but also employs the correlation among coefficients in relative
subbands. More recently (Wang & Meng, 2008), a new 2-D wavelet-based ECG data
compression algorithm has been presented. In this algorithm a 1-D ECG data is first
segmented and aligned to a 2-D data array, thus the two kinds of correlation of heartbeat
signals can be fully utilized. Then a 2-D wavelet transform is applied to the constructed 2-D ECG data array, and the resulting wavelet coefficients are quantized using a modified vector quantization (VQ). This modified VQ algorithm constructs a new tree vector that well utilizes the characteristics of the wavelet coefficients. Experimental results show that this
method is suitable for various morphologies of ECG data, and that it achieves higher
compression ratio with the characteristic features well preserved.
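The slicing-and-alignment step that these 2-D approaches share can be sketched as follows (the R-peak positions are assumed to be already detected; the fixed row width and zero padding are illustrative choices, not the exact scheme of the cited papers):

```python
def beats_to_matrix(ecg, r_peaks, width):
    # Cut the 1-D ECG at successive R peaks and zero-pad each beat to a fixed
    # width, so rows are aligned heartbeats: intrabeat correlation runs along
    # the rows and interbeat correlation down the columns.
    rows = []
    bounds = list(r_peaks) + [len(ecg)]
    for start, stop in zip(bounds, bounds[1:]):
        beat = list(ecg[start:stop])[:width]
        rows.append(beat + [0] * (width - len(beat)))
    return rows
```

A 2-D wavelet transform of the resulting array can then exploit both kinds of correlation at once.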
6.4 Hybrid ECG signal compression methods
Hybrid ECG signal compression methods are constructed from more than one time- and/or frequency-domain technique (Ahmed et al., 2007). These include the Modified Discrete Cosine Transform (MDCT) combined with the DWT, and linear prediction coding combined with the DWT. By studying the ECG
waveforms, it can be concluded that the ECG signals generally show two types of
correlation, namely correlation between adjacent samples within each ECG cycles (intrabeat
correlation) and correlation between adjacent heartbeats (interbeat correlation) (Xingyuan &
Juan, 2009). However, most existing ECG compression techniques did not utilize such
correlation between adjacent heartbeats. Hybrid compression methods of ECG signals, which fully utilize the interbeat correlation and thus can further improve the compression efficiency, are discussed in this section.
6.4.1 ECG signal compression based on combined MDCT and DWT
In (Ahmed et al., 2008), a hybrid two-stage electrocardiogram (ECG) signals compression
method based on the MDCT and DWT has been proposed. The ECG signal is partitioned
into blocks and the MDCT is applied to each block to decorrelate the spectral information.
Then, the DWT is applied to the resulting MDCT coefficients. The resulting wavelet
coefficients are then threshold and compressed using energy packing and binary-significant
map coding technique for storage space saving. MDCT is a linear orthogonal lapped
transform, based on the idea of time domain aliasing cancellation (TDAC). It is designed to
be performed on consecutive blocks of a larger dataset, where subsequent blocks are
overlapped so that the last half of one block coincides with the first half of the next block.
This overlapping, in addition to the energy-compaction qualities of the DCT, makes the
MDCT especially attractive for signal compression applications. Thus, it helps to avoid
artifacts stemming from the block boundaries (Britanak & Rao, 2002; Nikolajevic & Fettweis,
The MDCT is critically sampled: although it is 50% overlapped, a data sequence after the MDCT has the same number of coefficients as there were samples before the transform (after overlap-and-add). This means that a single block of IMDCT data does not correspond to the original block on which the MDCT was performed. When subsequent
blocks of inverse-transformed data are added, the errors introduced by the transform cancel out (this is the TDAC property). The MDCT is defined as (Nikolajevic & Fettweis, 2003):
X_C(k) = \sum_{n=0}^{N-1} x(n)\cos\!\left[\frac{\pi}{M}\left(n + \frac{M+1}{2}\right)\left(k + \frac{1}{2}\right)\right], \quad k = 0, 1, \ldots, M-1    (24)
where x(n), n = 0, 1, 2, ..., N-1, is the sequence to be transformed, N = 2M is the window length, and M is the number of transform coefficients. The computational burden can be reduced if the transform coefficients given by equation (24) are rewritten in a recursive form, equations (25) and (26), in which intermediate values V_m are generated recursively from x(n), with the angle

\omega_k = \frac{(2k+1)\pi}{2M}    (27)
The MDCT computation for a data sequence x(n) can be summarized as follows:
1. Partition the data sequence in Nb consecutive blocks, each one with N=64 samples.
2. Recursively generate the Vm from the input sequence x(n) according to (26) and (27).
3. Calculate the MDCT coefficients for each block by evaluating the k-th MDCT coefficient
using (25) at the N-th step.
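The three steps above can be cross-checked numerically. The sketch below assumes the standard MDCT definition of equation (24) with window length N = 2M and a Goertzel-type recursion; it is an illustration of the technique, not the authors' implementation.

```python
import numpy as np

def mdct_direct(x, M):
    """Direct evaluation of equation (24):
    XC(k) = sum_n x(n) cos[(pi/M)(n + (M+1)/2)(k + 1/2)]."""
    N = 2 * M
    n = np.arange(N)
    k = np.arange(M)
    C = np.cos(np.pi / M * np.outer(n + (M + 1) / 2.0, k + 0.5))
    return x @ C

def mdct_recursive(x, M):
    """Goertzel-type recursion: for each k run a second-order recursion
    over the samples and combine the last two states (steps 2-3)."""
    N = 2 * M
    X = np.empty(M)
    for k in range(M):
        theta = np.pi * (2 * k + 1) / (2 * M)   # phase angle of equation (27)
        v1 = v2 = 0.0                           # initial states V_N = V_{N+1} = 0
        for n in range(N - 1, -1, -1):          # recursion of equation (26)
            v0 = x[n] + 2.0 * np.cos(theta) * v1 - v2
            v2, v1 = v1, v0
        # after N steps v1 = V_0 and v2 = V_1; combine as in equation (25)
        X[k] = v1 * np.cos((M + 1) * theta / 2.0) - v2 * np.cos((M - 1) * theta / 2.0)
    return X
```

For any real block the two routines agree to machine precision, while the recursion avoids storing the N-by-M cosine table.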
In the decompression stage the inverse MDCT (IMDCT) is adopted. Because there are
different numbers of inputs and outputs, at first glance it might seem that the MDCT
should not be invertible. However, perfect invertibility is achieved by adding the
overlapped IMDCTs of subsequent overlapping blocks, causing the errors to cancel and the
original data to be retrieved. The IMDCT transforms the M real coefficients XC(0), XC(1),
..., XC(M-1) into N = 2M real numbers x(0), x(1), ..., x(N-1), according to the formula

x(n) = Σ_{k=0}^{M-1} XC(k) cos[ (π/M) (n + (M+1)/2) (k + 1/2) ],   n = 0, 1, ..., N-1   (28)
Again, the computational burden of x(n) can be reduced considerably if equation (28) is
rewritten in the following recursive form

x(n) = XC(0) cos(θn/2) + V1 cos(3θn/2) - V2 cos(θn/2)   (29)

where Vm = XC(m) + 2 cos(θn) Vm+1 - Vm+2 for m = M-1, M-2, ..., 1 (with VM = VM+1 = 0) and

θn = π (2n + M + 1) / (2M)   (30)
6.4.2 ECG signal compression based on the linear prediction of DWT coefficients
In (Abo-Zahhad et al., 2000; Ahmed & Abo-Zahhad, 2001), a new hybrid algorithm for ECG
compression based on the compression of the linearly predicted residuals of the wavelet
coefficients is presented. The main goal of the algorithm is to reduce the bit rate while
keeping the reconstructed signal distortion at a clinically acceptable level. In this algorithm,
the input signal is divided into blocks and each block goes through a discrete wavelet
transform; then the resulting wavelet coefficients are linearly predicted. In this way, a set of
uncorrelated transform domain signals is obtained. These signals are compressed using
various coding methods, including modified run-length and Huffman coding techniques.
The error corresponding to the difference between the wavelet coefficients and the predicted
coefficients is minimized in order to get the best predictor.
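As a hedged illustration of the prediction step (the predictor order and fitting method here are assumptions, since the original specification is not reproduced in this section), a least-squares linear predictor over the wavelet coefficients and its residual might look like:

```python
import numpy as np

def lp_residual(c, order=2):
    """Predict each wavelet coefficient from the previous `order` ones with
    least-squares prediction coefficients; return residuals and the predictor.
    The first `order` coefficients are passed through unpredicted."""
    # design matrix: column j holds the coefficients delayed by j+1 samples
    X = np.column_stack([c[order - j - 1: len(c) - j - 1] for j in range(order)])
    y = c[order:]
    a, *_ = np.linalg.lstsq(X, y, rcond=None)   # minimize ||y - X a||^2
    resid = np.concatenate([c[:order], y - X @ a])
    return resid, a
```

On smoothly varying coefficient sequences the residuals cluster near zero, which is what makes the subsequent run-length/Huffman stage effective.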
A drawback of the conventional run-length algorithm is the need for two words to represent
each group of repeated samples: one for the repeated value and the other for the number of
repetitions. In this section a more efficient coding algorithm, a modified run-length algorithm, is presented for
dealing with this situation. The algorithm is based on representing each significant
coefficient by bS+1 bits. The insignificant coefficients (of value zero) are manipulated in a
different manner. First, the repeated groups of zeros are counted and the resulting count is
represented by bS+1 bits. Then the train of coefficients representing the ECG signal is
transformed to another train of numbers. Some of these numbers represent the significant
coefficients, and the rest are the numbers representing the repeated groups of zeros (K1, K2,
..., KM). Here, M denotes the number of these groups. The problem is how to
differentiate between the coefficients and the numbers representing the groups of zeros. For
example, the number 18 may be found twice in the new train of numbers, where the first 18
may be a significant coefficient and the second one may indicate 18 repeated zeros. To
overcome this problem, the first bit in the representation of each number is used as a control
bit: for a significant coefficient this bit is set to one, and for a group of repeated zeros it is
set to zero.
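A minimal sketch of this modified run-length scheme follows; the word layout (one control bit followed by bS payload bits) is taken from the description above, and sign handling of coefficients is omitted for brevity.

```python
def mrle_encode(coeffs, bS):
    """Encode a coefficient train as (bS+1)-bit words: control bit 1 marks a
    significant value, control bit 0 marks a count of repeated zeros.
    Assumes non-negative coefficient magnitudes that fit in bS bits."""
    words = []
    i = 0
    while i < len(coeffs):
        if coeffs[i] != 0:
            words.append('1' + format(coeffs[i], '0%db' % bS))
            i += 1
        else:
            run = 0
            while i < len(coeffs) and coeffs[i] == 0:
                run += 1
                i += 1
            words.append('0' + format(run, '0%db' % bS))
    return words

def mrle_decode(words):
    """Invert mrle_encode: expand zero-run words back into zeros."""
    out = []
    for w in words:
        val = int(w[1:], 2)
        if w[0] == '1':
            out.append(val)
        else:
            out.extend([0] * val)
    return out
```

The control bit resolves exactly the ambiguity described above: the word '10010010' is the coefficient 18, while '00010010' means 18 repeated zeros.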
distortion of the reconstructed signal. This has been performed by thresholding the wavelet
coefficients of the approximation and details subbands with different threshold levels.
As can be deduced from the above discussion, the approximation band is the smallest
band in size and it includes high-amplitude approximation coefficients. The wavelet
coefficients other than those included in the approximation band, i.e. the detail coefficients,
have small magnitudes. Most of the energy is captured by the coefficients of the lowest
resolution band. This can be seen from the decomposition of a 4096-sample ECG signal up to
the fifth level. The total energy of the signal is 94393.5. About 99.73% of this energy is
concentrated in the 136 approximation coefficients and only 0.27% is concentrated in the
remaining 3960 detail coefficients. Here, threshold levels are defined according to the
energy packing efficiency (EPE) of the signal for all subbands. The EPE for a set of
coefficients in the ith subband is defined as the ratio of the energy captured by the subband
coefficients to the energy captured by the whole set of coefficients:
EPEi = [ Σ_{n=1}^{Li} (c(n))^2 / Σ_{n=1}^{L} (c(n))^2 ] × 100   (31)
where Li and L are the number of coefficients in the ith subband and the total number of
coefficients, respectively. A large threshold attains high data reduction but poor signal
fidelity, and a small threshold produces low data reduction but high signal fidelity. To
explore the effect of the threshold level (λ) selection and the coefficient representation on the
compression ratio and PRD, the following thresholding rule is set:
Keep all the wavelet coefficients in the approximation subband without thresholding, and calculate the
threshold value for each detail subband separately by preserving the higher-amplitude wavelet
coefficients in the ith detail subband that contribute a prescribed percentage of the energy in that subband.
One important feature of this rule is that the integer part of the wavelet coefficients in each
subband is represented by a different number of bits.
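The EPE of equation (31) and the per-subband rule can be sketched as follows; `subband_threshold` is a hypothetical helper name, and the exact tie-breaking used by the authors may differ.

```python
import numpy as np

def epe(all_coeffs, subband_coeffs):
    """Energy packing efficiency, equation (31), in percent."""
    return 100.0 * np.sum(subband_coeffs ** 2) / np.sum(all_coeffs ** 2)

def subband_threshold(detail, target_pct):
    """Smallest magnitude kept so that the retained coefficients carry at
    least target_pct % of the subband energy."""
    mags = np.sort(np.abs(detail))[::-1]          # largest magnitudes first
    cum = np.cumsum(mags ** 2)                    # cumulative subband energy
    k = np.searchsorted(cum, target_pct / 100.0 * cum[-1])
    return mags[min(k, len(mags) - 1)]
```

Raising the target percentage lowers the threshold (better fidelity, less compression), which is exactly the trade-off discussed above.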
8.2 Binary significant map coding algorithm
The coding algorithm adopted here is based on grouping the significant coefficients in one
vector and the locations of the insignificant coefficients in another vector. The significant
coefficients are arranged from high-scale coefficients to low-scale coefficients. Each
significant coefficient is decomposed into an integer part and a fractional part, where M bits are
assigned to represent the integer part (signed representation) and N bits represent the
fractional part; i.e. each coefficient is represented by N+M bits. A binary significant map is
used to flag whether each coefficient is significant or not. This binary stream is
compressed further as shown in the following steps:
1. Threshold the wavelet coefficients, c(n), to produce the thresholded coefficients c′(n).
The threshold level (λ) is determined by using the above-mentioned rule such that the
distortion in the reconstructed signal is acceptable. The distortion is measured using
the PRD and/or visual inspection. The optimal non-orthogonal wavelet transform
developed in (Ahmed et al., 2000) may be used to minimize the PRD in the least-mean-square
sense. Here, the threshold is determined such that the PRD is less than or equal to a
prescribed acceptable value defined by a cardiologist.
2. Search the vector c′(n) to isolate the significant coefficients in another vector CS(m).
3. Use a finite word-length representation for the integer and fractional parts of the
coefficients CS(m). The number of bits used to represent these coefficients is
determined as follows:
3.1 Search the vector CS(m) to find the maximum coefficient (in absolute value) and
determine the number of bits that represents this coefficient. This can be done by
finding k = Int[max|CS(m)|], where Int[·] denotes the integer part. Then convert
k to a binary number and count the number of bits, M.
3.2 Similarly, find the number of bits, N, that represent the minimum value of the
fractional part of each significant coefficient in such a way as to keep the distortion
within acceptable limits.
4. Generate a binary stream, b(n), of 1s and 0s that encodes the zero-locations in c′(n).
This is done by coding each significant coefficient in c′(n) by a binary 1. The length of
the binary stream equals n1, where n1 designates the index of the last significant
coefficient in c′(n). Hence, there is no need to encode the zeros for n > n1. The value of
n1 need not be stored, because it can be determined as the length of the vector b(n) in the
decoding process.
5. Compress the binary stream using run-length encoding of 0s and 1s as follows:
5.1 Set i = 1, run type = b(i), and set the run length Z to 1. If b(i) ≠ b(i+1), increment
i by Z; else, while b(i+1) = b(i), increment i by 1 and Z by 1.
5.2 From Table 2, find the inequality that Z satisfies. Then output the symbol that
specifies the run type followed by the number Z, i.e., code = [code ∥ Z], where ∥
designates the concatenation operator.
5.3 If i < n1, set Z = 1 and go to step 5.1.
6. Represent the obtained run-length code in binary format. There are 16 different symbols
that can be generated from step 5: the digits 0-9 and the letters A-F. Hence, 4
bits can be used to represent each symbol.
Table 2. Run-length coding symbols.

Symbol   Run Type   Range
A        0          100 ≤ Z ≤ 999
B        0          10 ≤ Z ≤ 99
C        0          2 ≤ Z ≤ 9
D        1          100 ≤ Z ≤ 999
E        1          10 ≤ Z ≤ 99
F        1          2 ≤ Z ≤ 9
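Steps 5 and 6 with the symbols of Table 2 can be sketched as below. This is an illustrative reading of the scheme: runs longer than 999 are not handled (matching the table's ranges), and run lengths of 1 are assumed to be emitted as the bit itself.

```python
def rle_binary_map(bits):
    """Run-length code a 0/1 significance map: a run of length Z >= 2 becomes
    a Table 2 symbol (chosen by run type and the number of digits of Z)
    followed by Z itself; an isolated bit is emitted directly."""
    symbols = {(0, 1): 'C', (0, 2): 'B', (0, 3): 'A',    # runs of zeros
               (1, 1): 'F', (1, 2): 'E', (1, 3): 'D'}    # runs of ones
    code = []
    i = 0
    while i < len(bits):
        z = 1
        while i + z < len(bits) and bits[i + z] == bits[i]:
            z += 1                                       # measure the run
        if z >= 2:
            code.append(symbols[(bits[i], len(str(z)))] + str(z))
        else:
            code.append(str(bits[i]))
        i += z
    return ''.join(code)
```

Since every emitted character is one of the 16 symbols 0-9 and A-F, each can indeed be packed into 4 bits as stated in step 6.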
9. Conclusion
In the literature, numerous ECG compression methods have been developed. They may be
classified as reversible methods (offering low compression ratios but guaranteeing an
exact or near-lossless signal reconstruction), irreversible methods (designed for higher
compression ratios at the cost of a quality loss that must be controlled and characterized), or
scalable methods (fully adapted to data transmission purposes and enabling lossy
reconstruction). The choice of method depends mainly on the intended use of the ECG signal.
When a first diagnosis must be supported, reversible compression is the most suitable.
10. References
Jalaleddine S. M. S., Hutchens C. G., Strattan R. D., & Coberly W. A. (1990). ECG data
compression techniques - a unified approach. IEEE Trans. Biomed. Eng., vol. 37, 329-343.
Addison P. S. (2005). Wavelet transforms and the ECG: a review. Physiological
Measurement, vol. 26, R155-R199.
Padma T., Latha M. M., and Ahmed A. (2009). ECG compression and LabVIEW
implementation. J. Biomedical Science and Engineering, vol. 3, 177-183.
Moody G. (1992). MIT-BIH Arrhythmia Database Overview. Massachusetts Institute of
Technology, Cambridge, 1992.
Dipersio D. A. and Barr R. C. (1985). Evaluation of the fan method of adaptive sampling on
human electrocardiograms. Med. Biol. Eng. Comput., 401-410.
Abenstein J. and Tompkins W. (1982). New data-reduction algorithm for real-time ECG
analysis. IEEE Trans. Biomed. Eng., vol. BME-29, 43-48, Jan. 1982.
Cox J., Nolle F., Fozzard H., and Oliver G. (1968). AZTEC: a preprocessing program for
real-time ECG rhythm analysis. IEEE Trans. Biomed. Eng., vol. BME-15, 128-129, Apr. 1968.
Mueller W. (1978). Arrhythmia detection program for an ambulatory ECG monitor.
Biomed. Sci. Instrument., vol. 14, 81-85, 1978.
Moody G. B., Mark R. G., and Goldberger A. L. (1989). Evaluation of the TRIM ECG data
compressor. Proc. of Comput. Cardiol., 167-170, 1989.
Haugland D., Heber J., and Husøy J. (1997). Optimization algorithms for ECG data
compression. Med. Biol. Eng. Comput., vol. 35, 420-424, July 1997.
Olmos S., Millán M., García J., and Laguna P. (1996). ECG data compression with the
Karhunen-Loève transform. Proc. Comput. Cardiol., Indianapolis, 253-256, Sept. 1996.
Reddy B. R. S. and Murthy I. S. N. (1986). ECG data compression using Fourier
descriptors. IEEE Trans. Biomed. Eng., vol. 33, 428-434, Apr. 1986.
Ahmed N., Milne P. J., and Harris S. G. (1975). Electrocardiographic data compression via
orthogonal transforms. IEEE Trans. Biomed. Eng., vol. BME-22, 484-487, Nov. 1975.
Husøy J. H. and Gjerde T. (1996). Computationally efficient subband coding of ECG
signals. Med. Eng. Phys., vol. 18, 132-142, Mar. 1996.
Mammen C. P. and Ramamurthi B. (1990). Vector quantization for compression of
multichannel ECG. IEEE Trans. Biomed. Eng., vol. 37, 821-825, Sept. 1990.
Chen J., Itoh S., and Hashimoto T. (1993). ECG data compression by using wavelet
transform. IEICE Trans. Inform. Syst., vol. E76-D, 1454-1461, Dec. 1993.
Miaou S. G., Yen H. L. and Lin C. L. (2002). Wavelet-based ECG compression using dynamic
vector quantization with tree codevectors in single codebook. IEEE Trans. Biomed.
Eng., vol. 49, 671-680, 2002.
Nygaard R., Melnikov G., and Katsaggelos A. K. (1999). Rate distortion optimal ECG signal
compression. Proc. Int. Conf. Image Processing, 348-351, Oct. 1999, Kobe, Japan.
Zigel Y., Cohen A., and Katz A. (2000). The weighted diagnostic distortion (WDD)
measure for ECG signal compression. IEEE Trans. Biomed. Eng., vol. 47, 1422-1430, 2000.
Zou H. and Tewfik A. H. (1993). Parameterization of compactly supported orthonormal
wavelets. IEEE Trans. Signal Processing, vol. 41, no. 3, 1428-1431, March 1993.
Ahmed S. M., Al-Shrouf A. and Abo-Zahhad M. (2000). ECG data compression using
optimal non-orthogonal wavelet transform. Med. Eng. Phys., vol. 22, 39-46, 2000.
Javaid R., Besar R., Abas F. S. (2008). Performance evaluation of percent root mean square
difference for ECG signals compression. Signal Processing: An International
Journal (SPIJ): 19, April 2008.
Benzid R., Marir F., Boussaad A., Benyoucef M., and Arar D. (2003). Fixed percentage of
wavelet coefficients to be zeroed for ECG compression. Electronics Letters, vol. 39,
830-831, 2003.
Blanco-Velasco M., Cruz-Roldan F., Godino-Llorente J. I., and Barner K. E. (2004). ECG
compression with retrieved quality guaranteed. Electronics Letters, vol. 40, no. 23,
900-901, 2004.
Al-Fahoum A. S. (2006). Quality assessment of ECG compression techniques using a
wavelet-based diagnostic measure. IEEE Trans. Inf. Technol. Biomed., vol. 10, 182-191, 2006.
Thakor N. V., Sun Y. C., Rix H. and Caminal P. (1993). Multiwave: a wavelet-based ECG
data compression algorithm. IEICE Trans. Inf. Syst., vol. E76-D, 1462-1469, 1993.
Chen J. and Itoh S. (1998). A wavelet transform-based ECG compression method
guaranteeing desired signal quality. IEEE Trans. Biomed. Eng., vol. 45, 1414-1419, 1998.
Miaou S. G. and Lin H. L. (2000). Quality driven gold washing adaptive vector quantization
and its application to ECG data compression. IEEE Trans. Biomed. Eng., vol. 47,
209-218, 2000.
Miaou S. G. and Lin C. L. (2002). A quality-on-demand algorithm for wavelet-based
compression of electrocardiogram signals. IEEE Trans. Biomed. Eng., vol. 49, 233-239, 2002.
Bradie B. (1996). Wavelet packet-based compression of single lead ECG. IEEE Trans.
Biomed. Eng., vol. 43, 493-501, 1996.
Ramakrishnan A. G. and Saha S. (1997). ECG coding by wavelet-based linear prediction.
IEEE Trans. Biomed. Eng., vol. 44, 1253-1261, 1997.
Lu Z., Kim D. Y. and Pearlman W. A. (2000). Wavelet compression of ECG signals by the set
partitioning in hierarchical trees algorithm. IEEE Trans. Biomed. Eng., vol. 47, 849-855, 2000.
Daubechies I. (1988). Orthonormal bases of compactly supported wavelets. Communications
on Pure and Applied Mathematics, vol. 41, no. 7, 909-996, Nov. 1988.
Donoho D. L. and Johnstone I. M. (1998). Minimax estimation via wavelet shrinkage. Ann.
Statist., vol. 26, 879-921, 1998.
Vaidyanathan P. P. (1990). Multirate digital filters, filter banks, polyphase networks and
applications: a tutorial. Proceedings of the IEEE, vol. 78, no. 1, 56-93, Jan. 1990.
Xie H. and Morris J. M. (1994). Design of orthonormal wavelets with better time-frequency
resolution. Proc. of SPIE, Wavelet Applications, 878-997, Orlando, Florida, 1994.
Maitrot A., Lucas M. F., Doncarli C., and Farina D. (2005). Signal-dependent wavelet for
electromyogram classification. Med. Biol. Eng. Comput., vol. 43, 487-492, 2005.
Shapiro J. M. (1993). Embedded image coding using zerotrees of wavelet coefficients.
IEEE Trans. Signal Processing, vol. 41, no. 12, 3445-3462, Dec. 1993.
Said A. and Pearlman W. A. (1996). A new, fast and efficient image codec based on set
partitioning in hierarchical trees. IEEE Trans. Circuits and Systems for Video
Technology, vol. 6, 243-250, 1996.
Lu Z., Kim D. Y., Pearlman W. A. (2000). Wavelet compression of ECG signals by the set
partitioning in hierarchical trees algorithm. IEEE Trans. Biomed. Eng., vol. 47, no.
7, 849-856, July 2000.
Pooyan M., Taheri A., Moazami-Goudarzi M., and Saboori I. (2005). Wavelet compression
of ECG signals using SPIHT algorithm. World Academy of Science, Engineering
and Technology, vol. 2, 212-215, 2005.
Reza A., Moghaddam A., and Nayebi K. (2001). A two dimensional wavelet packet
approach for ECG compression. Proc. Int. Symp. Signal Processing Applications,
226-229, Aug. 2001.
Ali B., Marcellin M. W., and Altbach M. I. (2003). Compression of electrocardiogram signals
using JPEG2000. IEEE Trans. Consumer Electronics, vol. 49, no. 4, Nov. 2003.
Tai S. C., Sun C. C., and Yan W. C. (2005). A 2-D ECG compression method based on
wavelet transform and modified SPIHT. IEEE Trans. Biomed. Eng., vol. 52, 999-1008, 2005.
Wang X. and Meng J. (2008). A 2-D ECG compression algorithm based on wavelet
transform and vector quantization. Digital Signal Processing, vol. 18, 179-188, 2008.
Ahmed S. M., Al-Zoubi Q., and Abo-Zahhad M. (2007). A hybrid ECG compression
algorithm based on singular value decomposition and discrete wavelet transform.
J. Med. Eng. Technology, vol. 31, 54-61, 2007.
Xingyuan W. and Juan M. (2009). Wavelet-based hybrid ECG compression technique.
Analog Integrated Circuits Signal Processing, vol. 59, 301-308, 2009.
Ahmed S. M., Al-Ajlouni A. F., Abo-Zahhad M., and Harb B. (2008). ECG signal
compression using combined modified discrete-cosine and discrete-wavelet
transforms. 2008.
Britanak V. and Rao K. R. (2002). A new fast algorithm for the unified forward and inverse
MDCT/MDST computation. Signal Processing, vol. 82, 433-459, 2002.
Nikolajevic V. and Fettweis G. (2003). A new recursive algorithm for the unified forward
and inverse MDCT/MDST. Journal of VLSI Signal Processing, vol. 9, 203-208, 2003.
Abo-Zahhad M., Ahmed S. M., and Al-Shrouf A. (2000). Electrocardiogram data
compression algorithm based on the linear prediction of the wavelet coefficients.
Proc. of 7th IEEE Int. Conf. on Electronics, Circuits and Systems, Lebanon, 599-603,
Dec. 2000.
Ahmed S. M. and Abo-Zahhad M. (2001). A new hybrid algorithm for ECG signal
compression based on the wavelet transformation. Medical Engineering and
Physics, vol. 24, no. 3, 50-66, 2001.
Abo-Zahhad M. and Rajoub B. A. (2001). ECG compression algorithm based on coding and
energy compaction of the wavelet coefficients. The 8th IEEE International Conf. on
Electronics, Circuits and Systems, Malta, 441-444, Sept. 2001.
Abo-Zahhad M. and Rajoub B. A. (2002). An effective coding technique for the compression
of one-dimensional signals using wavelets. Med. Eng. and Phy., vol. 24, 185-199,
2002.
9
Shift Invariant Biorthogonal Discrete Wavelet Transform for EEG Signal Analysis
Juuso T. Olkkonen and Hannu Olkkonen
2. Theoretical considerations
2.1 Two-channel BDWT filter bank
The two-channel BDWT analysis filters are of the general form (Olkkonen et al. 2005)

H0(z) = (1 + z^-1)^K P(z)
H1(z) = (1 - z^-1)^K Q(z)   (1)

where H0(z) is the Nth-order low-pass scaling filter polynomial having a Kth-order zero
at ω = π, and P(z) is a polynomial in z^-1. H1(z) is the corresponding Mth-order high-pass
wavelet filter having a Kth-order zero at ω = 0, and Q(z) is a polynomial in z^-1. For a two-channel
filter bank, the well-known perfect reconstruction (PR) condition is

H0(z) G0(z) + H1(z) G1(z) = 2 z^-k
H0(-z) G0(z) + H1(-z) G1(z) = 0   (2)

where G0(z) and G1(z) are the low-pass and high-pass reconstruction filters defined as

G0(z) = -H1(-z)
G1(z) = H0(-z)   (3)
A typical set of the scaling and wavelet filter coefficients is given in (Olkkonen et al. 2005).
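The PR condition (2) is easy to verify numerically for a concrete pair. The sketch below uses the LeGall 5/3 filters purely as an example (K = 2); the coefficient set tabulated in (Olkkonen et al. 2005) may differ, and the sign convention of the reconstruction filters is one of the two equivalent choices.

```python
import numpy as np

h0 = np.array([-1, 2, 6, 2, -1]) / 8.0   # low-pass scaling filter H0(z)
h1 = np.array([-1, 2, -1]) / 2.0         # high-pass wavelet filter H1(z)

def alternate(h):
    """Coefficients of H(-z): negate every odd-indexed tap."""
    return h * (-1.0) ** np.arange(len(h))

g0 = -alternate(h1)   # G0(z) = -H1(-z)
g1 = alternate(h0)    # G1(z) =  H0(-z)

# distortion term: H0(z)G0(z) + H1(z)G1(z) reduces to a pure delay 2 z^-3
t = np.convolve(h0, g0) + np.convolve(h1, g1)
assert np.allclose(t, [0, 0, 0, 2, 0, 0, 0])

# alias term: H0(-z)G0(z) + H1(-z)G1(z) vanishes identically
a = np.convolve(alternate(h0), g0) + np.convolve(alternate(h1), g1)
assert np.allclose(a, 0.0)
```

Polynomial products are computed here as coefficient convolutions, so the two assertions are exactly the two lines of (2).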
In this work we apply the following essential result concerning the PR condition (2).
Lemma 1: If H0(z) and H1(z) are the scaling and wavelet filters, the following modified
analysis and synthesis filters obey the PR condition:

H0′(z) = P(z) H0(z)
H1′(z) = P^-1(z) H1(z)
G0′(z) = P^-1(z) G0(z)
G1′(z) = P(z) G1(z)   (4)

where P(z) is any polynomial in z^-1 and P^-1(z) is its inverse. Proof: the result follows
by direct insertion of (4) into (2).
2.2 Fractional delay B-spline filter
The ideal FD operator has the z-transform

D(τ, z) = z^-τ   (5)

where τ ∈ [0, 1]. In (Olkkonen & Olkkonen, 2007) we have described an FD filter design
procedure based on the B-spline interpolation and decimation procedure for the
construction of the fractional delays τ = N/M (N, M ∈ N, N = 0, 1, ..., M-1). The FD filter
has the representation

D(N, M, z) = βp^-1(z) z^-N βp(z) F(z)   (6)

where βp(z) is the discrete B-spline filter (Appendix I) and

F(z) = [ (1/M) Σ_{k=0}^{M-1} z^-k ]^(p+1)   (7)

The product βp(z)F(z) has the polyphase decomposition

βp(z) F(z) = Σ_{k=0}^{M-1} Pk(z^M) z^-k   (8)

which yields the FD B-spline filter (9). The corresponding FD BDWT synthesis filter is

G1(N, M, z) = D(N, M, z) G1(z)   (10)
The FD B-spline filter (9) is readily suited to the implementation of the FD BDWT bank (10).
For example, to construct four parallel filter banks we select M = 4 and
N = 0, 1, 2 and 3. For M = 4 the wavelet filter H1(0, 4, z) equals the original H1(z), which is
FIR. However, the filters H1(1, 4, z), H1(2, 4, z) and H1(3, 4, z) are IIR-type. In the following
we present a novel modification of the FD BDWT filter bank (10), in which all FD wavelet
filters are FIR-type.
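The polyphase components Pk(z^M) in (8) are simply the interleaved subsequences of a filter's impulse response. A minimal sketch, using the wavelet filter coefficients of Fig. 2 only as example data:

```python
import numpy as np

def polyphase_split(h, M):
    """Type-1 polyphase components: h(z) = sum_k P_k(z^M) z^-k."""
    return [h[k::M] for k in range(M)]

# wavelet filter of Fig. 2, used here only as example data
h = np.array([1, -1, -8, -8, 62, -62, 8, 8, 1, -1]) / 128.0
parts = polyphase_split(h, 4)

# interleaving the M components recovers the original filter
rec = np.zeros_like(h)
for k, p in enumerate(parts):
    rec[k::4] = p
assert np.allclose(rec, h)
```

Because each component is just a subsampled FIR tap set, filtering with a single Pk(z) is cheap, which is what makes the FIR modification below attractive.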
H0(0, M, z) = βp^-1(z) H0(z)
H1(0, M, z) = βp(z) H1(z)   (11)

which obey the PR condition. Since the discrete B-spline filter βp(z) contains no zeros at
z = -1, the regularity degree (the number of zeros at z = -1) of the scaling filter is not
affected. The corresponding fractionally delayed wavelet filters are

H1(N, M, z) = PN(z) H1(z),   N = 1, 2, ..., M-1   (12)

Now, for N = 0, 1, ..., M-1 all the wavelet filters are FIR-type and they are fractionally
delayed versions of each other. The polyphase components PN(z) in (12) have high-pass
filter characteristics. Hence, the frequency response of the modified wavelet filters is only
slightly altered. Fig. 2 shows the impulse responses of the BDWT wavelet filter (Olkkonen et
al. 2005) and the corresponding fractionally delayed wavelet filters for M = 4 and N = 0, 1, 2
and 3. The energy (absolute value) of the impulse response is a smooth function, which
warrants the shift invariance. The corresponding impulse responses of the fractionally
delayed Daubechies 7/9 wavelet filters (Unser & Blu, 2003) are given in Fig. 3 and the
fractionally delayed Legall 3/5 wavelet filters (Unser & Blu, 2003) in Fig. 4.
Fig. 2. The FD impulse responses of the BDWT wavelet filter (M=4 and N=0,1,2 and 3).
h1[n] = [1 -1 -8 -8 62 -62 8 8 1 -1]/128. The dashed line denotes the energy (absolute value) of
the wavelet filter coefficients.
Fig. 3. The FD impulse responses of the Daubechies 7/9 BDWT wavelet filters (M=4 and
N=0,1,2 and 3). The energy of the wavelet filter coefficients is denoted by the dashed line.
Fig. 4. The FD impulse responses of the Legall 3/5 BDWT wavelet filters (M=4 and N=0,1,2
and 3). The dashed line denotes the energy of the wavelet filter coefficients.
3. Experimental
The usefulness of the FD B-spline method was tested on EEG signal waveforms. For
comparison, the EEG signals were analysed using the well-established Hilbert transform
assisted complex wavelet transform (Olkkonen et al. 2006). The EEG recording method is
described in detail in our previous work (Olkkonen et al. 2006). The EEG signals were
treated using the BDWT bank given in (Olkkonen et al. 2005). The FD wavelet coefficients
were calculated via (12) using M=4 and N=0,1,2 and 3. Fig. 5A shows the nondelayed
wavelet coefficients. Fig. 5B shows the energy (absolute value) of the wavelet coefficients
and Fig. 5C the energy of the wavelet coefficients computed via the Hilbert transform
method (Olkkonen et al. 2006).
4. Discussion
This book chapter presents an original idea for the construction of a shift invariant BDWT
bank. Based on the FD B-spline filter (9) we obtain the FD BDWT filter bank (12), which
yields the wavelet sequences by FIR filters. The integer-valued polyphase components
(Table I) enable efficient implementation in VLSI and microprocessor circuits. The present
work serves as a framework, since the FD B-spline filter implementation can be adapted to
any existing BDWT bank, such as the lifting DWT (Olkkonen et al. 2005) or the Daubechies
7/9 and Legall 3/5 wavelet filters (Unser & Blu, 2003).
The present idea draws heavily on the work of Selesnick (2002). He observed that if the
impulse responses of the two scaling filters are related as h0[n] and h0[n - 0.5], then the
corresponding wavelets form a Hilbert transform pair. We may treat the two parallel
wavelets as a complex sequence

wc[n] = w[n] + j w[n - 0.5]   (13)
Fig. 5. The FD BDWT analysis of the neuroelectric signal waveform recorded from the
frontal cortex at a 300 Hz sampling rate. The nondelayed wavelet coefficients (A). The
energy of the FD wavelet coefficients (M=4, N = 0,1,2 and 3) (B). The Hilbert transform
assisted energy (envelope) of the wavelet coefficients (C).
The energy (absolute value) of the complex wavelet corresponds to the envelope, which is a
smooth function. Hence, the energy of the complex wavelet sequence is nearly shift
invariant to fractional delays of the signal.
Gopinath (2003) has studied the effect of M parallel CQF wavelets on the shift
invariance. According to the theoretical treatment, the shift invariance improves most in
going from M = 1 to M = 2. For M = 3, 4, ... the shift invariance improves further, but only
gradually. Hence, M = 4 is usually a good compromise between computational cost and
data redundancy. If we consider the case M = 4, the corresponding hypercomplex (hc)
wavelet sequence is

whc[n] = w[n] + i w[n - 0.25] + j w[n - 0.5] + k w[n - 0.75]   (14)
where i, j and k are the unit vectors in the hc space. It is evident that the energy of the hc
wavelet coefficients is more shift invariant to a fractional delay of the signal compared
with the dual-tree complex wavelets (13). According to our experience, values M > 4 do
not produce any additional advantage in the treatment of EEG data.
The FD BDWT bank offers an effective tool for EEG data compression and denoising
applications. Instead of considering the wavelet coefficients we may threshold the energy of
the hc wavelet coefficients as
if |whc[n]| < ε then w[n] = 0   (15)
where ε is a small number. Due to the smooth behaviour of the energy function, ε can be
made relatively high compared with the conventional wavelet denoising methods. In tree
structured BDWT applications only the nondelayed scaling sequence is fed to the next scale.
Usually the scaling sequence is not thresholded, but only the wavelet coefficients. The FD
BDWT bank does not increase the memory requirement (redundancy) compared with the
original nondelayed BDWT bank, since the reconstruction of the data can be performed by
knowing only the nondelayed scaling and wavelet sequences. The FD BDWT bank can be
considered as a subsampling device, which improves the quality of the critically sampled
wavelet sequence. As an example we consider the multi-scale analysis of the neuroelectric
signal (Fig. 5). The energy of the signal in different scales can be estimated with the aid of
the Hilbert transform (Olkkonen et al. 2006). Applying the result of this work, the energy of
the wavelet sequence whc[n] (14) closely approaches the energy (envelope) of the signal.
However, the delayed wavelet sequences are produced merely by the polyphase filters
PN(z), N = 1, 2, ..., M-1, in (12), while the Hilbert transform requires FFT-based signal
processing (Olkkonen et al. 2006). In the EEG signal recorded from the frontal cortex, the
spindle waves have concentrated energy, which is clearly revealed both by the FD BDWT
and the Hilbert transform analysis (Fig. 5). The energy content of the EEG signal yielded by
the two different methods is remarkably similar.
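The hypercomplex energy of (14) and the thresholding rule (15) can be sketched as follows; the four inputs are assumed to be the FD wavelet sequences for N = 0, 1, 2, 3 and ε is the threshold.

```python
import numpy as np

def hc_energy(w0, w1, w2, w3):
    """Absolute value of the hypercomplex wavelet sequence (14)."""
    return np.sqrt(w0**2 + w1**2 + w2**2 + w3**2)

def hc_threshold(w0, w1, w2, w3, eps):
    """Rule (15): zero the nondelayed coefficients w0 wherever the
    hypercomplex energy falls below eps."""
    return np.where(hc_energy(w0, w1, w2, w3) >= eps, w0, 0.0)
```

Because the energy is a smooth envelope, a coefficient that happens to pass through zero is still retained when its fractionally delayed neighbours carry energy, which is the shift-invariance advantage discussed above.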
The essential difference compared with the half-sample shifted CQF filter bank (Selesnick,
2002) is the linear phase of the BDWT bank and the FD B-spline filters adapted in this work.
The shifted CQF filter bank is constructed with the aid of the all-pass Thiran filters and the
scaling and wavelet coefficients suffer from nonlinear phase distortion effects (Fernandes,
2003). The linear phase warrants that the wavelet sequences in different scales are accurately
time related. The FD wavelet coefficients enable the high resolution computation of the
cross and autocorrelation and other statistical functions.
Appendix I
The discrete B-spline filter
The B-spline βp(t) is defined as the p-fold convolution of a rectangular pulse

β1(t) = 1 for 0 ≤ t ≤ 1, and 0 elsewhere   (16)

The Laplace transform is

L{βp(t)} = [(1 - e^-s)/s]^p = (1/s^p) Σ_{k=0}^{p} C(p,k) (-1)^k e^-ks   (17)

where C(p,k) denotes the binomial coefficient, and the inverse Laplace transform gives the
time domain solution

βp(t) = (1/(p-1)!) Σ_{k=0}^{p} C(p,k) (-1)^k (t - k)+^(p-1)   (18)

The discrete B-spline βp[n] equals the continuous B-spline at integer values of time.
Hence, the Laplace transform (17) and the z-transform of the discrete B-spline have inverse
transforms which coincide at integer values in the time domain. Using the relation

L^-1{1/s^p} = t+^(p-1)/(p-1)!   (19)

the z-transform of the discrete B-spline can be written as

βp(z) = Z{βp[n]} = Z{L^-1{(1 - e^-s)^p/s^p}} = Np(z)(1 - z^-1)^p   (20)

where

Np(z) = Z{L^-1{1/s^p}} = Σ_{n=0}^{∞} [n^(p-1)/(p-1)!] z^-n   (21)
N p + 1 ( z) =
(22)
The inverse discrete B-spline filter can be factored as

βp^-1(z) = c Π_i [1/(1 - bi z^-1)] Π_j [1/(1 - bj z^-1)] = c Π_i Si(z) Π_j Rj(z)   (23)

where c is a constant and the roots obey |bi| < 1 and |bj| > 1. The Si(z) filters in (23) can be
directly implemented. The Rj(z) filters in (23) can be implemented by the following recursive
filtering procedure. First we replace z by z^-1:
Rj(z) = Y(z)/U(z) = 1/(1 - bj z^-1)

Rj(z^-1) = -bj^-1 z^-1/(1 - bj^-1 z^-1) = Y(z^-1)/U(z^-1)   (24)

where U(z) and Y(z) denote the z-transforms of the input u[n] and output y[n] signals
(n = 0, 1, 2, ..., N). U(z^-1) and Y(z^-1) are the z-transforms of the time-reversed input
u[N-n] and output y[N-n]. The Rj(z^-1) filter is stable, having its root bj^-1 inside the unit
circle. The following Matlab program rfilter.m demonstrates the computation procedure:
function y=rfilter(u,b)
u=u(end:-1:1);
y=filter([0 -1/b],[1 -1/b],u);
y=y(end:-1:1);
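An equivalent Python sketch of rfilter.m, spelling out the time-reversal trick of (24) as an explicit difference equation, may clarify the procedure:

```python
def rfilter(u, b):
    """Stable implementation of R_j(z) = 1/(1 - b z^-1) with |b| > 1:
    run the stable filter -b^-1 z^-1 / (1 - b^-1 z^-1) over the
    time-reversed input, then reverse the output back."""
    u = list(u)[::-1]                        # time reversal
    y, prev_in, prev_out = [], 0.0, 0.0
    for x in u:
        out = (prev_out - prev_in) / b       # y[n] = (y[n-1] - u[n-1]) / b
        y.append(out)
        prev_in, prev_out = x, out
    return y[::-1]                           # reverse back
```

The difference equation matches the Matlab call filter([0 -1/b], [1 -1/b], u) applied to the reversed sequence.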
5. References
Daubechies, I. (1988). Orthonormal bases of compactly supported wavelets. Commun. Pure
Appl. Math., Vol. 41, 909-996.
Fernandes, F., Selesnick, I.W., van Spaendonck, R. & Burrus, C. (2003). Complex wavelet
transforms with allpass filters. Signal Processing, Vol. 83, 1689-1706.
Gopinath, R.A. (2003).The phaselet transform - An integral redundancy nearly shift
invariant wavelet transform, IEEE Trans. Signal Process. Vol. 51, No. 7, 1792-1805.
Kingsbury, N.G. (2001). Complex wavelets for shift invariant analysis and filtering of
signals. J. Appl. Comput. Harmonic Analysis. Vol. 10, 234-253.
Olkkonen, H., Pesola, P. & Olkkonen, J.T. (2005). Efficient lifting wavelet transform for
microprocessor and VLSI applications. IEEE Signal Process. Lett. Vol. 12, No. 2, 120-122.
Olkkonen, H., Pesola, P., Olkkonen, J.T. & Zhou, H. (2006). Hilbert transform assisted
complex wavelet transform for neuroelectric signal analysis. J. Neuroscience Meth.
Vol. 151, 106-113.
Olkkonen, J.T. & Olkkonen, H. (2007). Fractional Delay Filter Based on the B-Spline
Transform, IEEE Signal Processing Letters, Vol. 14, No. 2, 97-100.
Selesnick, I.W. (2002). The design of approximate Hilbert transform pairs of wavelet bases.
IEEE Trans. Signal Process. Vol. 50, No. 5, 1144-1152.
Smith, M.J.T. & Barnwell, T.P. (1986). Exact reconstruction for tree-structured subband
coders. IEEE Trans. Acoust. Speech Signal Process. Vol. 34, 434-441.
Sweldens, W. (1998). The lifting scheme: A construction of second generation wavelets.
SIAM J. Math. Anal. Vol. 29, 511-546.
Unser, M. & Blu,T. (2003), Mathematical properties of the JPEG2000 wavelet filters, IEEE
Trans. Image Process., Vol. 12, No. 9, 1080-1090.
Shift-Invariant DWT for Medical Image Classification
1. Introduction
The discrete wavelet transform (DWT) is gaining momentum as a feature extraction and/or
classification tool, because of its ability to localize structures with good resolution in a
computationally effective manner. The result is a unique and discriminatory representation,
where important and interesting structures (edges, details) are quantified efficiently by few
coefficients. These coefficients may be used as features themselves, or features may be
computed from the wavelet domain that describe the anomalies in the data.
As a result of the potential that the DWT possesses for feature extraction and classification
applications, the current work focuses on its utility in a computer-aided diagnosis (CAD)
framework. CAD systems are computer-based methods that offer diagnosis support to
physicians. The images are automatically analyzed and the presence of pathology is identified
using quantitative measures (features) of disease.
With traditional radiology screening techniques, visually analyzing medical images is
laborious, time consuming, expensive (in terms of the radiologist's time) and each individual
scan is prone to interpretation error (the error rate among radiologists is reported to hover
around 30% Lee (2007)). Additionally, visual analysis of radiographic images is subjective; one
rater may choose a particular lesion as a candidate, while another radiologist may find this
lesion insignificant. Consequently, some lesions are missed or misinterpreted. To reduce
the error rates, a secondary opinion may be obtained with a CAD system (automatically
reanalyzing the images after the physician). Such methods are advantageous not only because
they are cost effective, but also because they are designed to objectively quantify pathology in
a robust, reliable and reproducible manner.
There has been a lot of research in CAD-system design for specific modalities or applications
with excellent results, e.g. see Sato et al. (2006) for CT, or Guliato et al. (2007) for
mammography. Although these techniques may render good results for the particular
modality they were built for, the technique is not transferable and has little-to-no utility in other
CAD problems (it cannot be applied to other images or databases). Since CAD systems are being
employed widely, a framework that encompasses a variety of imaging modalities - not just a
single one - would be of value.
To this end, this work concerns the development of a generalized computer-aided diagnosis
system that is based on the DWT. It is considered generalized, since the same framework can
be applied to different images with no modifications. There are three image databases that
are used to test the generalized CAD system: small bowel, mammogram and retinal images.
Although these images are very different from one another, a common attribute is noticed:
pathology is rough and heterogeneous, and healthy (normal) tissue is uniform. These images
are described in Section 2.
To quantify these differences between textures in normal and abnormal images, a texture
analysis scheme based on human texture perception is proposed.
To describe the
elementary units of texture (which are needed for overall texture perception), important
features such as scale, frequency and orientation are used for texture discrimination. The
DWT is a perfect mechanism to highlight these space-localized features, since it offers
a high resolution, scale-invariant representation of nonstationary phenomena (such as
texture). Multiresolutional analysis, the wavelet transform, DWT with its properties and
implementations are discussed in Section 4.
Although the DWT has many beneficial qualities, it is shift-variant. Therefore, any
texture metrics extracted from the wavelet coefficients will also be shift-variant, reducing the
classification performance of our system. To combat this, a shift-invariant DWT (SIDWT)
is utilized to ensure that only translation invariant features are extracted (see Section 5).
To robustly quantify these texture elements (as described by the wavelet coefficients), a
multiscale texture analysis scheme is employed on the shift-invariant coefficients. At various
levels of decomposition, wavelet-domain graylevel cooccurrence matrices are implemented
in a variety of directions over all subbands to capture the orientation of such texture
elements. Texture features are extracted from each of the wavelet subbands to quantify the
randomness of the coefficients and they are classified using a linear classifier. The multiscale
texture analysis scheme and the classification technique are described in Section 6 and Section
7. Section 8 and Section 9 present the results of the proposed generalized CAD framework for
all images and the concluding remarks, respectively. This work is a consolidation of several
research efforts Khademi (2006) Khademi & Krishnan (2007) Khademi & Krishnan (2008).
2. Biomedical imagery
Three imaging modalities are utilized to test the classification system: mammography, retinal
and small bowel images. Each one of these image types is used to diagnose diseases from
a specific anatomical region. Although these images are quite different from one another, the
current work develops a generalized framework for CAD that may be applied directly to each
of the images. The only a priori assumption is a very general one: the texture between normal
tissue and pathology is different.
The first modality, mammography, is an imaging technology which acquires an x-ray image
of the breast Ferreira & Borges (2001). They are currently the most effective method for early
detection of breast cancers Cheng et al. (2006) Wei et al. (1995). A challenging problem in
human-based analysis of mammography is the discrimination between malignant and benign
masses. Incorrectly identifying the lesion type results in negative-to-positive biopsy ratios
as high as 11:1 in some clinics Rangayyan et al. (1997). Normal tissue masks the lesions and
breast parenchyma is much more prominent than the lesion itself Ferreira & Borges (2001).
To test the CAD system with mammography images, a database is used where images contain
either a benign or malignant lesion(s). Examples of benign and malignant masses (along with
the contrast enhanced versions) are shown in Figure 1. Normal regions are also shown for
comparison.
Fig. 1. Mammographic regions (128 × 128). (a)-(c) Normal regions, (d)-(f) circumscribed
benign masses, (g)-(i) spiculated malignant masses. The contrast enhanced versions of these
regions are also included to highlight the textural differences between lesions.
The benign masses have a rounded appearance with a defined boundary, while the inside
of the mass is relatively uniform and radiolucent. This has also been noted by other authors,
see Ferreira & Borges (2001) Rangayyan et al. (1997) Mudigonda et al. (2000). In contrast, the
malignant masses possess ill-defined boundaries, are of higher density (radiopaque) and have
an overall nonuniform appearance in comparison to the benign lesions. Furthermore, spicules
from malignant masses cause disturbances in the homogeneity of tissues in the surrounding
breast parenchyma Rangayyan (2005). Since benign and malignant masses carry different
textural qualities, these textural differences will be exploited in the CAD system.
The second type of images are known as small bowel images. They are acquired by Given
Imaging Ltd.'s capsule endoscopy known as the PillCamTM SB video capsule. The PillCamTM
is a tiny capsule (10 mm × 27 mm Kim et al. (2005)), which is ingested through the mouth. As
natural peristalsis moves the capsule through the gastrointestinal tract it captures video and
wirelessly transmits it to a data recorder the patient is wearing around his or her waist Given
Imaging Ltd. (2006a). This video provides visualization of the 21 foot long small bowel, which
was originally seen as a black box to doctors Given Imaging Ltd. (2006b).
Video is recorded for approximately eight hours and then the capsule is excreted naturally
Fig. 2. Small bowel images captured by the PillCamTM SB, which exhibit textural
characteristics. (a) Healthy small bowel, (b) normal neocecal valve, (c) normal colonic
mucosa, (d) normal small bowel, (e) normal jejunum, (f) small bowel polyp, (g) small bowel
lymphoma, (h) GIST tumor, (i) polypoid mass, (j) small bowel polyp.
with a bowel movement Given Imaging Ltd. (2006a). Clinical results for the PillCamTM show
that it is a superior diagnostic method for diseases of the small intestine Given Imaging Ltd.
(2006c). The downfall of this technology comes from the large amount of data which is
collected while the PillCamTM is in transit - the doctor has to watch and diagnose eight hours of footage!
This is quite a labourious task, which could cause the physicians to miss important clues due
to fatigue, boredom or due to the repetitive nature of the task. To combat missed pathologies,
the proposed CAD system could be used to double check the image data.
To test out the generalized CAD system, a small bowel image database is utilized that contains
both normal (healthy regions) and several abnormal images. As shown in Figure 2(a)-(e), the
normal small bowel images contain smooth, homogeneous texture elements with very little
disruption in uniformity except for folds and crevices.
Many types of pathologies are found in the small bowel image database ("abnormal" image
class), such as polyps, Kaposi's sarcoma, carcinoma, etc. These diseases may
occur in various sizes, shapes, orientations and locations within the gastrointestinal tract.
Abnormalities have some common textural characteristics: the diseased region contains
many different textured areas simultaneously and these diseased areas are composed of
heterogeneous texture components. This may be seen in Figure 2(f)-(j).
The data for each patient is a series of 2D colour images. As the current chapter is focused
on grayscale processing, the colour images are converted to grayscale first. Additionally, each
image has been lossy JPEG compressed, so feature extraction is completed in the compressed
domain. Feature extraction in the compressed domain has become an important topic recently
Chiu et al. (2004) Xiong & Huang (2002) Chang (1995) Armstrong & Jiang (2001) Voulgaris &
Jiang (2001), since the prevalence of images stored in lossy formats far exceeds the number
of images stored in their raw format.
The last set of images are known as retinal images. Ophthalmologists use digital fundus
cameras to acquire and collect retinal images of the human eye Sinthanayothin et al. (2003),
which includes the optic nerve, fovea, surrounding vessels and the retinal layer Goldbaum
(2002). Although screening with retinal imaging reduces the risk of serious eye impairment
(i.e. blindness caused by diabetic retinopathy by 50% Sinthanayothin et al. (2003)), it also
creates a large number of images which the doctors need to interpret Brandon & Hoover
(2003). This is expensive, time consuming and may be prone to human error. The current
automated system can be used to help the doctors with this diagnostic task by offering a
secondary opinion of the images.
The current database contains several normal (healthy) retinal images as well as several
images that contain a variety of pathologies. Examples of normal and abnormal retinal images
are shown in Figure 3. Healthy eyes are easily characterized by their overall homogeneous
appearance, as easily seen in Figure 3(a)-(c).
Eyes which contain disease do not possess uniform texture qualities. Three cases of abnormal
retinal images are shown in Figure 3(d)-(f). Diabetic retinopathy, which is characterized by
exudates or lesions (whitish/yellow patches at random locations Wang et al. (2000)), is shown
in Figure 3(d).
Other clinical signs of diabetic retinopathy are microaneurysms and haemorrhages. Another
disease is macular degeneration, which can cause blindness if it goes untreated. Macular degeneration
may be characterized by drusens, which appear as yellowish, cloudy blobs, which exhibit
no specific size or shape Brandon & Hoover (2003). This is shown in Figure 3(e). These
pathologies disrupt the homogeneity of normal tissues. Other diseases include central retinal
vein and/or artery occlusion shown in Figure 3(f) (an oriented texture pattern which radiates
from the optic nerve).
2.1 Texture for pathology discrimination
As shown in the previous subsection, pathological regions in the images have a heterogeneous
appearance and normal regions are uniform. Moreover, texture elements occur at a variety of
orientations, scales and locations. Thus the CAD system must be robust to all these variances,
but still remain modality- or database-independent (i.e. not tuned specifically for a modality).
Computing devices are becoming an integral part of our daily lives and, many times, these
Fig. 3. Retinal images which exhibit textural characteristics. (a)-(c) Normal, homogeneous
retinal images, (d) background diabetic retinopathy (dense, homogeneous yellow clusters),
(e) macular degeneration (large, radiolucent drusens with heterogeneous texture properties),
(f) central retinal vein occlusion (oriented, radiating texture).
algorithms are designed to mimic human behaviour. In fact, this is the major motivation of
many CAD systems; to understand and analyze medical image content in the same fashion
as humans do. Since texture has been shown to be an important feature for discrimination in
medical images, understanding how humans perceive texture provides important clues into
how a computer vision system should be designed to discriminate pathology.
As shown, these images possess textural characteristics that differentiate between
pathological and healthy (normal) tissues. A common denominator is that the pathological
(cancerous) lesions seem to have heterogeneous, oriented texture characteristics, while the
normal images are relatively homogeneous. These differences are easily spotted by the human
observer and thus we want our system to also differentiate between these two texture types
(homogeneity and heterogeneity) for classification purposes.
To build a system that understands textural properties that is in line with human texture
perception, a human texture analysis model must first be examined. When a surface is viewed,
the human visual system can discriminate between textured regions quite easily. To describe
how the human visual system can differentiate between textures, Julesz defined textons,
which are elementary units of texture Julesz (1981). Textured regions can be decomposed
using these textons, which include elongated blobs, lines, terminators and more. It was found
that the frequency content, scale, orientation and periodicity of these textons can provide
important clues on how humans differentiate between two or more textured areas Julesz
(1981). Therefore, to create a system which mimics human understanding of texture for
pathology discrimination, it is necessary that the analysis system can detect the properties of
the fundamental units of texture (texture markers). In accordance with Julesz's model, textural
events will be detected based on their scale, frequency and orientation.
3. Feature invariance
To describe the textural characteristics of medical images, a feature extraction scheme will
be used. The extracted features are fed into a classifier, which arrives at a decision related
to the diagnosis of the patient. Let X ⊂ R^n represent the signal space which contains all
biomedical images with dimensions n = N × N. Since the images X can be expected
to have a very high dimensionality, using all these samples to arrive at a classification result
would be prohibitive Coifman & Saito (1995). Furthermore, the original image space X is
also redundant, which means that all the image samples are not necessary for classification.
Therefore, to gain a more useful representation, a feature extraction operator f may map the
subspace X into a feature space F

f : X → F,      (1)

where F ⊂ R^k, k ≤ n, and a particular sample in the feature space may be written as a feature
vector: F = {F_1, F_2, F_3, ..., F_k}. If k < n, the feature space mapping would also result in a
dimensionality reduction.
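As a concrete illustration of the mapping in Equation 1, the sketch below reduces an N × N image (n = N·N samples) to a k = 4 dimensional feature vector. The particular statistics chosen here are illustrative assumptions only, not the wavelet-domain features developed later in this chapter:

```python
import numpy as np

def feature_map(img):
    """Map an N x N image (n = N*N samples) to a k-dimensional feature
    vector F = {F_1, ..., F_k} with k << n. The four statistics below are
    purely illustrative stand-ins for the features used in the chapter."""
    img = np.asarray(img, dtype=float)
    return np.array([
        img.mean(),                            # overall brightness
        img.var(),                             # global contrast
        np.abs(np.diff(img, axis=0)).mean(),   # vertical detail strength
        np.abs(np.diff(img, axis=1)).mean(),   # horizontal detail strength
    ])
```

Since k = 4 while n = N², the mapping is also a dimensionality reduction in the sense described above.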
Although it is important to choose features which provide the maximum discrimination
between textures, it is also important that these features are robust. A feature is robust if
it provides consistent results across the entire application domain Umbaugh et al. (1997). To
ensure robustness, the numerical descriptors should be rotation, scale and translation (RST)
invariant. In other words, if the image is rotated, scaled or translated, the extracted features
should be insensitive to these changes, or it should be a rotated, scaled or translated version
of the original features, but not modified Mallat (1998). This would be useful for classifying
unknown image samples since these test images will not have structures that have the same
orientation and size as the images in the training set Leung & Peterson (1992). By ensuring
invariant features, it is possible to account for the natural variations and structures within the
retinal, mammographic and small bowel images.
As will be shown in the next section, such features are extracted from the wavelet domain. If a
feature is extracted from a transform domain, it is also important to investigate the invariance
properties of the transform since any invariance in this domain also translates to an invariance
in the features. For instance, the 1-D Fourier spectrum is a well-known translation-invariant
transform, since any translation in the time domain representation of the signal does not
change the magnitude spectrum in the Fourier domain

f(t − t_0) ↔ F(ω) e^{−jωt_0},      (2)

for all real values of t_0. Similarly, scaling in time results in an easily definable reaction in the
frequency domain

f(αt) ↔ (1/|α|) F(ω/α),      (3)
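The translation invariance of the magnitude spectrum is easy to verify numerically. The following sketch (not part of the original chapter) uses the discrete Fourier transform, where the analogue of a time shift is a circular shift of the samples:

```python
import numpy as np

# Discrete analogue of Eq. (2): a circular shift of the signal leaves the
# DFT magnitude spectrum unchanged; only the phase is modified.
rng = np.random.default_rng(0)
x = rng.standard_normal(64)
x_shifted = np.roll(x, 5)                  # translate (circularly) by 5 samples

mag = np.abs(np.fft.fft(x))
mag_shifted = np.abs(np.fft.fft(x_shifted))
assert np.allclose(mag, mag_shifted)       # magnitudes agree
```

Any feature computed from the magnitude spectrum alone is therefore insensitive to such shifts.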
texture events, as well as the feature extraction framework that is used to extract robust
features (in the RST-invariant sense).
4. Multiresolutional analysis
All signals and images may be categorized into one of two categories: 1) deterministic
or 2) non-deterministic (random). Deterministic signals allow for advanced prediction of
signal quantities, since the signal may be described by a mathematical function. In contrast,
instantaneous values of non-deterministic signals are unpredictable due to their random
nature and must be represented using a probabilistic model Ross (2003). This stochastic model
describes the inherent behaviour of the signal or image in question.
Random signals (both 1D and 2D) may be further classified into two groups: 1) stationary
or 2) nonstationary. A stationary signal (1D) is a signal which has a constant probability
distribution for all time instants. As a consequence, first order statistics such as the mean and
second order statistics such as variance must also remain constant. In contrast, a nonstationary
signal has a time-varying probability distribution which causes quantities computed from the
probability density function (PDF) to also be time-varying. For instance, the mean, variance
and autocorrelation function of a nonstationary signal would change with time. Since the
Fourier transform of the autocorrelation function is equal to the power spectral density (PSD)
of a signal (which is related to the spectral content), the PSD of a nonstationary signal is also
time-varying. Consequently, a nonstationary signal has time-varying spectral content.
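The time-varying statistics of a nonstationary signal can be made concrete with a small numerical sketch (synthetic signals assumed for illustration, not data from this chapter): the windowed variance of a stationary white-noise signal stays roughly constant, while a signal with growing amplitude shows a clearly time-varying windowed variance:

```python
import numpy as np

rng = np.random.default_rng(1)
n = np.arange(4096)
stationary = rng.standard_normal(4096)                       # constant statistics
nonstationary = (1 + n / 1024) * rng.standard_normal(4096)   # growing variance

def windowed_var(x, w=512):
    """Variance over consecutive non-overlapping windows of length w."""
    return np.array([x[i:i + w].var() for i in range(0, len(x) - w + 1, w)])

v_stat = windowed_var(stationary)        # roughly flat across windows
v_nonstat = windowed_var(nonstationary)  # increases from window to window
```

The same windowed-statistics idea underlies short-time spectral analysis: local estimates reveal the time variation that a single global statistic hides.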
The medical images (as with most natural images) are nonstationary since they have
spatially-varying frequency components. Texture is comprised of a variety of frequency
content (and may be found in any location in the image), and therefore texture is also a type of
nonstationary phenomena. Since textured regions provide important clues that discriminate
between pathologies and/or healthy tissue, nonstationary analysis would add extra utility in
the sense that it would quantify or localize these textural elements. As discussed, the theory
of human texture perception is defined in terms of several features for texture discrimination:
the scale, frequency, orientation of textons. Therefore, analyzing the scale, frequency and
orientation properties of textural elements by nonstationary image analysis is in accordance
with the human texture perception model.
The type of nonstationary image analysis tool that will be utilized is part of the
multiresolutional analysis family, and is known as the Discrete Wavelet Transform (DWT).
As will be discussed, wavelet transforms are optimal for texture localization since the wavelet
bases have excellent joint space-frequency resolution Mallat (1998).
The section will begin by presenting the signal decomposition theory needed to understand
the fundamentals of the DWT. Following the introduction, the wavelet transform (with
descriptions of the wavelet and scaling basis functions) is given, with emphasis placed on
signal space definitions. The DWT is then defined using the filter-bank method which was
implemented via the lifting approach for the 5/3 Le Gall wavelet.
4.1 Signal decomposition techniques
Signal decomposition techniques can be used to transform the images into a representation
that highlights features of interest. As such decomposition techniques are used to define the
wavelet transform and its variants, some brief background is given here.
A decomposition technique linearly expands a signal or image using a set of mathematical
functions. For a 1D signal, using a set of real-valued expansion coefficients a_k, and a series
of 1-D mathematical functions φ_k(t) known as an expansion set (φ_k(t) = φ(t − k) for all
integer values of k), a signal f(t) may be expressed as a weighted linear combination of these
functions

f(t) = Σ_k a_k φ_k(t),  k ∈ Z.      (4)
If the members of the expansion set φ_k(t) are orthogonal to one another,

⟨φ_k(t), φ_l(t)⟩ = 0,  k ≠ l,      (5)

then the expansion coefficients can be found via inner products,

a_k = ⟨f(t), φ_k(t)⟩,      (6)

where the inner product of two signals x(t) and y(t) is defined by

⟨x(t), y(t)⟩ = ∫ x(t) y(t) dt.      (7)
The definition of an expansion set depends on various properties. For instance, if there is
a signal f(t) which belongs to a subspace S (f(t) ∈ S), then φ_k(t) will only be called an
expansion set for S if f(t) can be expressed with linear combinations of φ_k(t). The expansion
set forms a basis if the representation it provides is unique Burrus et al. (1998). Similarly, a basis
set may be defined first, and then the space S spans all functions f(t) which can be expressed
by f(t) = Σ_k a_k φ_k(t).
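A discrete analogue of such a basis expansion can be sketched with an orthogonal matrix whose columns play the role of the expansion set {φ_k(t)}. This is an illustrative construction under assumed random data, not a wavelet basis:

```python
import numpy as np

# Columns of an orthogonal matrix serve as a discrete orthonormal
# expansion set; the coefficients a_k are inner products with f.
N = 8
rng = np.random.default_rng(2)
Phi, _ = np.linalg.qr(rng.standard_normal((N, N)))  # orthonormal columns

f = np.arange(N, dtype=float)     # a sample "signal"
a = Phi.T @ f                     # a_k = <f, phi_k>
f_rec = Phi @ a                   # f = sum_k a_k phi_k
assert np.allclose(f, f_rec)      # unique, perfect representation
```

Because the columns are orthonormal, the coefficients are obtained by inner products exactly as in the continuous-time development above, and the representation is unique.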
For images, the basis functions may be dependent on both the horizontal and vertical spatial
variables (x, y). This leads to 2D basis functions φ_{m,n}(x, y), where φ_{m,n}(x, y) = φ(x − n, y − m),
for all (m, n) ∈ Z². Therefore, a 2D function (image) f(x, y), that belongs to the space of the
basis functions, may be rewritten as a linear expansion

f(x, y) = Σ_m Σ_n a_{m,n} φ_{m,n}(x, y),      (8)

with the expansion coefficients again obtained by inner products,

a_{m,n} = ⟨f(x, y), φ_{m,n}(x, y)⟩.      (9)
However, this is not possible, because there is a direct trade-off between time and
frequency resolution of basis functions, as governed by the Heisenberg uncertainty principle
Burrus et al. (1998) Mallat (1998). The Heisenberg uncertainty principle states that the resolution
of the time-frequency functions is lower bounded by

Δt Δω ≥ 1/2.      (10)
The wavelet transform offers solutions to all the problems associated with other basis
functions (such as the STFT) Mallat (1989) Wang & Karayiannis (1998) Vetterli & Herley
(1992) Mallat (1998). It offers a multiresolutional representation (decomposes the image using
various scale-frequency resolutions), which is achieved by dyadically changing the size of the
window. Space-frequency events are localized with good results since the changing window
function is tuned to events which have high frequency components in a small analysis
window (scale) or low frequency events with a large scale Burrus et al. (1998). Therefore,
texture events could be efficiently represented using a set of multiresolutional basis functions.
Additionally, the discrete wavelet transform utilizes critical subsampling along rows and
columns and uses these subsampled subbands as the input to the next decomposition level.
For a 2-D image, this reduces the number of input samples by a factor of four for each level of
decomposition. This representation may be stored back on to the original image for minimum
memory usage and it also permits for an organized, computationally efficient manner to
access these subbands and extract meaningful features.
The wavelet transform utilizes both wavelet basis ψ_{j,k}(t) and scaling basis φ_k(t) functions.
The wavelet functions are used to localize the high frequency content, whereas the scaling
function examines the low frequencies. The scale of the analysis window changes with each
decomposition level, thus achieving a multiresolutional representation. Starting with the
initial scale j = 0, the wavelet transform of any function f(t) which belongs to L²(R) is found
by

f(t) = Σ_{k=−∞}^{∞} c(k) φ_k(t) + Σ_{j=0}^{∞} Σ_{k=−∞}^{∞} d(j, k) ψ_{j,k}(t),      (11)

where c(k) are the scaling or averaging coefficients (low frequency material) defined by

c(k) = c_0(k) = ⟨f(t), φ_k(t)⟩ = ∫ f(t) φ_k(t) dt,      (12)

and d_j(k) are the detail wavelet coefficients (high frequency content) defined by

d_j(k) = d(j, k) = ⟨f(t), ψ_{j,k}(t)⟩ = ∫ f(t) ψ_{j,k}(t) dt.      (13)
In order to achieve a wavelet transform, the functions ψ_{j,k}(t) and φ_k(t) have to meet specific
criteria. These criteria, the properties of the scaling/wavelet functions and the corresponding
signal spaces are described next.
4.2.1 Scaling function subspaces
Consider a set of basis functions {φ_k(t)} which may be created by translating the prototype
scaling function φ(t) Burrus et al. (1998)

φ_k(t) = φ(t − k),  k ∈ Z,      (14)

where φ_k(t) spans the space V_0:

V_0 = Span_k {φ_k(t)}.      (15)
If a set of basis functions spans a signal space V_0, then any function f(t) which also belongs to
that space can be completely represented using those basis functions, as in f(t) = Σ_k a_k φ_k(t)
(for any f(t) ∈ V_0).
For added flexibility, the time and frequency resolution of these scaling functions may be
adjusted by including an additional scale parameter j in the characteristic basis function
expression
φ_{j,k}(t) = 2^{j/2} φ(2^j t − k),  j, k ∈ Z,      (16)
where the scalar multiple 2^{j/2} is included to ensure orthonormality Mallat (1989). Therefore,
an entire series of basis functions can be created by simply dilating (changing the j value) or
translating (changing the k value) the prototype scaling function φ(t). These basis functions
span the subspace V_j

V_j = Span_k {φ_k(2^j t)} = Span_k {φ_{j,k}(t)},      (17)
and any signal f(t) can be expressed using this expansion set, as long as it also belongs to V_j:

f(t) = Σ_k a_k φ(2^j t − k),  f(t) ∈ V_j.      (18)
The introduction of a scale parameter changes the time duration of the scaling functions.
This allows different resolutions to isolate different anomalies in the signals or images. For
instance, if j > 0, φ_{j,k}(t) is narrower and would provide a good representation of finer
detail. For j < 0, the basis functions φ_{j,k}(t) are wider and would be ideal to represent coarse
information Burrus et al. (1998).
4.2.2 Wavelet basis functions
In the same manner, a family of wavelet basis functions may be generated by dilating and
translating the mother wavelet ψ(t):

ψ_{j,k}(t) = 2^{j/2} ψ(2^j t − k),  j, k ∈ Z.      (19)
To find the mother wavelet ψ(t), it is necessary to find the relationship between the mother
wavelet ψ(t) and the generating scaling function φ(t).
Starting with an initial resolution of j = 0, the nested subspaces may be written as

V_0 ⊂ V_1 ⊂ V_2 ⊂ · · · ⊂ L².      (20)
The corresponding spaces spanned by the wavelet basis functions are shown in Figure 4,
which illustrates how each W subspace spans the difference of two subspaces. As shown in
Figure 4, the signal spaces V_1 and V_2 may be expressed as

V_1 = V_0 ⊕ W_0,      (21)

and

V_2 = V_0 ⊕ W_0 ⊕ W_1,      (22)
where ⊕ denotes a direct sum. If V_j is the space spanned by the scaling functions φ_{j,k}(t) and
V_{j+1} is the space spanned by the functions φ_{j+1,k}(t), then W_j is the disjoint difference (the
orthogonal complement) of V_j in V_{j+1}, spanned by the wavelet basis functions ψ_{j,k}(t). This
may be shown by

V_{j+1} = V_j ⊕ W_j,  j ∈ Z.      (23)
Using Equation 21, Equation 22 and Figure 4, a general expression for the L² subspace may be
developed:

L² = V_0 ⊕ W_0 ⊕ W_1 ⊕ W_2 ⊕ · · ·      (24)

   = V_0 ⊕ W_0 ⊕ W_1 ⊕ W_2 ⊕ W_3 ⊕ · · · .      (25)
Since W_j is orthogonal to V_j,
the corresponding basis functions which span these spaces are also orthogonal:

⟨φ_{j,k}(t), ψ_{j,l}(t)⟩ = 0,  j, k, l ∈ Z.      (26)
Furthermore, wavelet spaces at a scale j are a subset of the scaling spaces at the next scale j + 1:

W_j ⊂ V_{j+1}.      (27)
Consequently, wavelets reside in the space spanned by the next narrower scaling function and
can be expressed as a weighted sum of shifted scaling functions φ(2t):

ψ(t) = Σ_n h_1(n) √2 φ(2t − n),  n ∈ Z,      (28)
where h_1(n) are the wavelet coefficients. Equation 28 shows that the generating wavelet
ψ(t) can be produced from the prototype scaling function φ(t) by choosing the appropriate
h_1(n). In order to ensure orthogonality, the scaling and wavelet coefficients must be related
by Burrus et al. (1998)

h_1(n) = (−1)^n h_0(1 − n).      (29)
Therefore, for analysis with orthogonal wavelets, the half-band highpass filter h_1(n) is calculated as the quadrature mirror filter of the lowpass h_0(n). These filters may be used to efficiently implement the wavelet transform for discrete signals (the Discrete Wavelet Transform), which is discussed next.
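As a quick numerical illustration of Equation 29, the quadrature mirror relation can be checked directly. The sketch below (Python with NumPy) uses the Haar lowpass filter as an assumed example; the variable names are illustrative, not from the chapter:

```python
import numpy as np

# Haar lowpass filter, used here as an assumed example h0
h0 = np.array([1.0, 1.0]) / np.sqrt(2)

# Quadrature mirror relation of Eq. 29: h1(n) = (-1)^n * h0(1 - n)
n = np.arange(len(h0))          # filter support n = 0, 1
h1 = (-1.0) ** n * h0[1 - n]    # alternate signs on the reversed filter

# the resulting half-band highpass is orthogonal to the lowpass
assert abs(np.dot(h0, h1)) < 1e-12
```

For longer orthogonal filters (e.g. Daubechies), the same sign-alternating reversal applies over the filter's support.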
4.3 Discrete wavelet transform
In order to perform the wavelet transform for discrete images, implementation of the DWT using filterbanks is a popular choice, since the complexities of the wavelet transform are explained in terms of filtering operations (which is intuitive). The material is first presented
for one dimensional signals and then is expanded to 2D for images.
After performing a series of simplifications and changes of variables Burrus et al. (1998) Mallat (1998) Vetterli & Herley (1992), Equation 28 may be rewritten as
c_j(k) = Σ_m h_0(m − 2k) c_{j+1}(m), (30)
and
d_j(k) = Σ_m h_1(m − 2k) c_{j+1}(m). (31)
This illustrates that c_j(k) and d_j(k) can be found by filtering c_{j+1}(k) with h_0 and h_1, respectively, followed by decimation by a factor of 2. The two filters h_0(n) and h_1(n) are half-band lowpass and highpass filters, respectively. Consequently, the lowpass filter h_0(n) produces lowpassed, or average, coefficients c_j(k), and the highpass filter h_1(n) creates highpassed, or detail, coefficients d_j(k).
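Equations 30 and 31 can be sketched directly as filter-and-decimate operations. The following Python sketch assumes periodic extension at the boundaries and uses the Haar pair as example filters (both are assumptions for illustration):

```python
import numpy as np

def analysis_step(x, h0, h1):
    # one level of Eqs. 30-31 with periodic extension:
    # c_j(k) = sum_m h0(m - 2k) c_{j+1}(m), and likewise d_j with h1
    N = len(x)
    c = np.array([sum(h0[i] * x[(2 * k + i) % N] for i in range(len(h0)))
                  for k in range(N // 2)])
    d = np.array([sum(h1[i] * x[(2 * k + i) % N] for i in range(len(h1)))
                  for k in range(N // 2)])
    return c, d

h0 = np.array([1.0, 1.0]) / np.sqrt(2)    # Haar lowpass (assumed example)
h1 = np.array([1.0, -1.0]) / np.sqrt(2)   # its quadrature mirror highpass
x = np.array([4.0, 6.0, 10.0, 12.0])
c, d = analysis_step(x, h0, h1)   # half-length averages and details
```

With Haar filters, c holds scaled pairwise averages and d scaled pairwise differences, and the total energy of c and d equals that of the input (the filters are orthonormal).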
To compute the DWT coefficients for two levels, examine the two stage analysis filterbank in
Figure 5(a) alongside the signal spaces in Figure 5(b). Note that the initial scale here is j + 1,
and therefore c j+1 would represent the original input signal. After one level of decomposition,
the lowpass coefficients c_j and the highpass details d_j are produced. For a multiresolutional representation, c_j are further decomposed with h_0 and h_1 to produce the coefficients c_{j−1}(k) and d_{j−1}(k) (these describe the next scale of low and high frequency structures). The 2D extension for images is detailed next.
Fig. 5. (a) Computing the 1-D wavelet and scaling coefficients using filtering and decimation
with a 2-stage analysis filterbank, (b) corresponding decomposition tree showing the division
of signal spaces.
4.3.1 2-D extension for images
Instead of having a wavelet or filter which is a function of the two spatial dimensions of an
image, the filter can be separable, which allows a particular 1D filter to be applied to the rows
and columns of an image separately to gain the desired overall 2D response Lawson & Zhu
(2004). A separable filter for two dimensions may be denoted by:
H(z_1, z_2) = H(z_1) H(z_2), (32)
where z1 and z2 relate to the spatial dimensions of an image. Therefore, the filters defined
for the 1D DWT may be applied separably to gain a 2D DWT representation for images. The
2-D DWT filterbank scheme for an N × N image x(m, n) is shown in Figure 6. Initially, the filters H_0(z) and H_1(z) are applied to the rows of image x(m, n), creating two images which respectively contain the low and high frequency content of the image in question. After this, both frequency bands are subsampled by a factor of 2 and are sent to the next set of filters for filtering along the columns. After these bands have been filtered, decimation by a factor of 2 is again performed, but this time along columns. At the output of one level of decomposition, as shown in Figure 6, there are four subband images of size N/2 × N/2, labeled LL, LH, HL and HH. Using the separability concept, at scale j, these subbands may be computed by
LL_j(x, y) = Σ_m Σ_n h_0(m − 2x) h_0(n − 2y) LL_{j+1}(m, n), (33)
HL_j(x, y) = Σ_m Σ_n h_1(m − 2x) h_0(n − 2y) LL_{j+1}(m, n), (34)
LH_j(x, y) = Σ_m Σ_n h_0(m − 2x) h_1(n − 2y) LL_{j+1}(m, n), (35)
HH_j(x, y) = Σ_m Σ_n h_1(m − 2x) h_1(n − 2y) LL_{j+1}(m, n), (36)
where LL_{j+1} is the previous level's approximation subband (the original image at the first level).
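A minimal sketch of this separable row/column scheme, assuming Haar half-band filters for concreteness (the subband naming follows the convention in the text: first letter for the column operation, second for the rows):

```python
import numpy as np

def dwt2_haar(img):
    # separable 2-D DWT: filter and decimate the rows, then the columns;
    # Haar half-band filters are an assumed example choice
    s = 1.0 / np.sqrt(2)
    def step(a):  # lowpass and highpass along axis 1, decimated by 2
        return (a[:, 0::2] + a[:, 1::2]) * s, (a[:, 0::2] - a[:, 1::2]) * s
    L, H = step(img)        # operate on the rows
    LL, HL = step(L.T)      # then on the columns of each half
    LH, HH = step(H.T)
    return LL.T, LH.T, HL.T, HH.T

img = np.arange(16, dtype=float).reshape(4, 4)
LL, LH, HL, HH = dwt2_haar(img)   # four quarter-size subbands
```

Each subband is N/2 × N/2, and since the Haar filters are orthonormal, the four subbands together preserve the image energy.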
The first letter of the subimage name indicates the operation that was performed on the columns (i.e. L for lowpass filtering with H_0(z) and H for highpass filtering with H_1(z)), whereas the last letter indicates which operation was performed on the rows. If more levels of decomposition are required, the LL subband is further decomposed in the same manner.
Fig. 7. Graphical depiction of wavelet coefficient placement for two levels of decomposition.
The corners of the square (the original image) are composed of localized high frequency content, which is captured in the high frequency subbands in the wavelet domain, regardless of the orientation (horizontal, diagonal, vertical). As texture is comprised of such localized high frequency events, such a transform is able to describe the textural events as required. Textural features or events diffuse across subbands, which allows features to be captured not only within subbands, but also across them.
For an example of the localization properties of wavelets in a medical image, as well as
the textural differences between normal and abnormal medical images, see Figure 9. The
normal image's decomposition exhibits an overly homogeneous appearance of the wavelet coefficients in the HH, HL and LH bands (which reflects the uniform nature of the original image). The decomposition of the retinal image with diabetic retinopathy shows that each of the higher frequency subbands localizes the retinopathy, which appears as heterogeneous textured blobs (high-valued wavelet coefficients) in the center of the subband. This illustrates how the DWT can localize the textural differences in medical images, and also how multiscale texture may be used to discriminate between pathological cases. Similar results are obtained with the small bowel and mammographic lesions but are not shown here due to space constraints.
Another benefit of wavelet analysis is that the basis functions are scale-invariant.
Fig. 8. Left: original image. Right: one level of DWT of left image.
Scale-invariant basis functions will give rise to a localized description of the texture elements,
regardless of their size or scale, i.e. coarse texture can be made up of large textons, while fine
texture is comprised of smaller elementary units. Therefore, the DWT can handle both of these
scenarios.
Although the filterbank method is effective, it requires many filtering operations, which is computationally expensive. For a more efficient implementation of the filterbank-based DWT, the lifting-based approach is employed in the current framework and detailed next.
4.4 Lifting-based DWT
To compute the DWT in an efficient manner, the lifting based approach is used Fernández et al. (1996) Sweldens (1995) Sweldens (1996). To increase computation speed, lifting based approaches make optimal use of similarities which exist between the lowpass (H_0(z)) and highpass (H_1(z)) filters. All 1D implementations will later be extended to 2D implementations by lifting both the columns and the rows separately.
The lifting based DWT is an efficient scheme since it aims to implement complicated functions
with simple and invertible stages Zhang & Zeytinoglu (1999). Compared to the filterbank
method, the lifting based DWT method offers a less computationally expensive solution to
compute the DWT Zhang & Zeytinoglu (1999) Sweldens (1996).
The lifting based scheme relies on three operations to achieve the discrete wavelet transform:
Fig. 9. One level of DWT decomposition of retinal images. Left: normal image
decomposition. Right: decomposition of retinal image with diabetic retinopathy. Contrast
enhancement was performed in the higher frequency bands (HH, LH, HL) for visualization
purposes.
1) split, 2) predict and 3) update. These three operations, which comprise the 1-D lifting scheme, are shown in Figure 10, where S is the splitting function, P is the predictor function and U is the update operation. As shown by Figure 10, the scaling and wavelet coefficients (c_j(n) and d_j(n)) are still computed from the previous level's coefficients c_{j+1}(n). Lifting may also be applied separably to the rows and columns of an image to arrive at a 2D DWT.
4.4.1 Splitting
The splitting operation divides the 1-D input sequence into even and odd samples, denoted by c_{j+1}(2n) and c_{j+1}(2n + 1), respectively. In digital signal processing terms, the even samples may be obtained by decimating the original signal by a factor of 2, and the odd samples may be obtained by subsampling a time-shifted (by a single unit of time) version of the original signal by 2. This is often referred to as the Lazy Wavelet Transform Fernández et al. (1996).
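The split step is trivial to express with array slicing; a short Python illustration on an arbitrary example signal:

```python
import numpy as np

x = np.arange(8)
even, odd = x[0::2], x[1::2]   # the split step S of the lifting scheme
# equivalently: decimate by 2, and decimate a one-sample-shifted copy
assert np.array_equal(odd, np.roll(x, -1)[0::2])
```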
4.4.2 Prediction
In the prediction step, the wavelet (detail) coefficients d_j(n) are computed as the difference between the odd samples and a prediction made from the neighbouring even samples:
d_j(n) = c_{j+1}(2n + 1) − P(c_{j+1}(2n)), (37)
where P(·) is the predictor function. As stated earlier, the wavelet coefficients correspond to the high frequency components, which makes this operation equivalent to highpass filtering.
A good predictor function would produce small valued wavelet coefficients (ideally zero),
since the predicted version of the signal would be identical to the original. However, for
nonstationary signals (such as biomedical images) that have properties which change over
time, it is not possible to exactly predict the signal Zhang & Zeytinoglu (1999) and non-zero
wavelet coefficients can be expected. There are many different predictor functions which may
be used Maragos et al. (1984) Haijiang et al. (2004) Denecker et al. (1997); however, in order to implement the forward wavelet transform, the interpolation function is chosen such that it relates to the wavelet ψ(t) Zhang & Zeytinoglu (1999).
4.4.3 Updating
In a lifting based DWT implementation, the scaling coefficients c j (n ) are computed as the sum
of the even-indexed samples (c j+1 (2n )) and an updated version of the wavelet coefficients
d j (n ) as shown below:
c_j(n) = c_{j+1}(2n) + U(d_j(n)), (38)
where U(·) is the update function. This operation isolates the low frequency components
within the original signal. For images, lifting based DWT must be extended to two
dimensions. As shown earlier in the 2D DWT filterbank approach, 1D wavelet transforms
were applied separably to the images in order to gain a 2D DWT representation. This also
applies to lifting based schemes as well. By sequentially applying the lifting operation first to
the rows and then to the columns of an image, the forward transformation is achieved. The
forward operation is depicted in Figure 11.
Fig. 11. Lifting-based implementation of the DWT for two dimensional signals.
The integer wavelet which will be used is part of the Odd-Length Analysis/Synthesis
Filter (OLASF) family, where the number of filter taps in the FIR filters (for the filterbank implementation) is odd Adams & Ward (2003). Additionally, biomedical images are high
resolution images, which results in large image sizes. Consequently, for these large-sized
images, a wavelet with fewer taps is desired so that the overall computational load may be
reduced. The 5/3 Le Gall wavelet will be used, since the filter lengths are small (5 and 3 taps for the analysis lowpass and highpass filters, respectively), which permits an efficient implementation Marcellin et al. (2000) Zhang & Fritts (2004). The 5/3 filter coefficients are listed in Table 1.
i    h_0(±i) (analysis LP)   h_1(±i) (analysis HP)   g_0(±i) (synthesis LP)   g_1(±i) (synthesis HP)
0    6/8                     1                        1                        6/8
1    2/8                     −1/2                     1/2                      −2/8
2    −1/8                    —                        —                        −1/8
Table 1. Analysis and synthesis filter coefficients for the 5/3 wavelet (the filters are symmetric, so only i ≥ 0 is listed).
Using the 5/3 integer wavelet, the highpass details d_j(n) can be computed using a lifting based approach:
d_j(n) = c_{j+1}(2n + 1) − ⌊(c_{j+1}(2n) + c_{j+1}(2n + 2)) / 2⌋, (39)
where ⌊X⌋ is the greatest integer less than or equal to X. The low frequency, average coefficients c_j(n) may be found using an update function:
c_j(n) = c_{j+1}(2n) + ⌊(d_j(n) + d_j(n − 1) + 2) / 4⌋. (40)
For reconstruction, the inverse DWT can be found by reversing the arithmetic operations of the forward transform. This is shown below:
c_{j+1}(2n) = c_j(n) − ⌊(d_j(n) + d_j(n − 1) + 2) / 4⌋, (41)
c_{j+1}(2n + 1) = d_j(n) + ⌊(c_{j+1}(2n) + c_{j+1}(2n + 2)) / 2⌋. (42)
These equations may be applied separably to the images in order to gain a 2-D DWT
representation.
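Equations 39 to 42 can be sketched directly in Python. This is a sketch under assumptions: integer input of even length, and periodic indexing at the boundaries (the chapter does not specify a boundary rule); Python's floor division // plays the role of ⌊·⌋:

```python
import numpy as np

def lift53_forward(x):
    # 5/3 integer lifting, Eqs. 39-40; periodic boundary handling assumed
    x = np.asarray(x, dtype=int)
    M = len(x) // 2
    even, odd = x[0::2].copy(), x[1::2].copy()
    d = np.empty_like(odd)
    c = np.empty_like(even)
    for n in range(M):
        d[n] = odd[n] - ((even[n] + even[(n + 1) % M]) // 2)   # Eq. 39
    for n in range(M):
        c[n] = even[n] + ((d[n] + d[n - 1] + 2) // 4)          # Eq. 40
    return c, d

def lift53_inverse(c, d):
    # reverse the arithmetic of the forward transform, Eqs. 41-42
    M = len(c)
    even, odd = np.empty_like(c), np.empty_like(d)
    for n in range(M):
        even[n] = c[n] - ((d[n] + d[n - 1] + 2) // 4)          # Eq. 41
    for n in range(M):
        odd[n] = d[n] + ((even[n] + even[(n + 1) % M]) // 2)   # Eq. 42
    x = np.empty(2 * M, dtype=int)
    x[0::2], x[1::2] = even, odd
    return x

x = np.array([5, 7, 6, 9, 2, 4, 3, 8])
c, d = lift53_forward(x)
assert np.array_equal(lift53_inverse(c, d), x)   # perfect reconstruction
```

Because every lifting step is individually invertible in integer arithmetic, perfect reconstruction holds exactly, with no rounding error.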
A shift of the input image produces a set of coefficients which differs from the original image's coefficients. For instance, the DWT of an input biomedical image f(x, y) can be shown as
f(x, y) →DWT F(k_1, k_2, j),
where F(k_1, k_2, j) are the 2-D DWT coefficients at scale j. A shift of the image will result in a different set of coefficients:
f(x + Δx, y + Δy) →DWT F′(k_1, k_2, j), with F′ ≠ F in general.
Mean        Variance σ²
−0.050537    97.017
−0.051025   100.42
 0.057861    96.82
 0.058350    98.383
Table 2. Mean and variance σ² of the DWT coefficients of the LH band for four circular translates (Δx, Δy) of Figure 12.
transform (SIDWT) on the input image f(x, y):
f(x, y) →SIDWT F(k_1, k_2, j),
f(x + Δx, y + Δy) →SIDWT F(k_1, k_2, j),
i.e. the same set of coefficients is obtained regardless of the shift.
For different shifts of the input image, it was shown that the DWT can produce one of four
possible representations after one level of decomposition. These four DWT coefficient sets
(cosets) are not translated versions of one another and each coset may be generated as the
DWT response to one of four shifts of the input: (0, 0), (0, 1), (1, 0), (1, 1), where the first
index corresponds to the row shift and the second index is the column shift. All other shifts
of the input (at this decomposition level) will result in coefficients which are shifted versions
of one of these four cosets. Therefore, to account for all possible representations, these four
cosets may be computed for each level of decomposition. This requires the LL band from each
level to be shifted by the four translates {(0, 0), (0, 1), (1, 0), (1, 1)} and each of these new
images to be separately decomposed to account for all representations.
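The coset behaviour is easy to reproduce numerically. The 1-D sketch below (Python, Haar highpass as an assumed example filter; in 1D there are two cosets per level rather than the four of the 2D case) shows that an odd shift of the input lands in a different coset, while an even shift merely translates the coefficients within the same coset:

```python
import numpy as np

def haar_detail(x):
    # one level of half-band highpass filtering plus decimation by 2
    return (x[0::2] - x[1::2]) / np.sqrt(2)

x = np.array([1.0, 3.0, 2.0, 7.0, 5.0, 4.0, 6.0, 0.0])
d0 = haar_detail(x)                 # coset of the unshifted input
d1 = haar_detail(np.roll(x, 1))     # odd shift: a genuinely different coset
d2 = haar_detail(np.roll(x, 2))     # even shift: same coset, translated

assert np.allclose(d2, np.roll(d0, 1))                          # translated
assert not np.allclose(np.sort(np.abs(d0)), np.sort(np.abs(d1)))  # different
```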
To compute the coefficients at the jth decomposition level, for the input shift of (0, 0), the subbands LL_j, LH_j, HL_j, HH_j may be found by filtering the previous level's coefficients LL^{j+1}, as shown below:
LL^j_{(0,0)}(x, y) = Σ_m Σ_n h_0(m − 2x) h_0(n − 2y) LL^{j+1}(m, n), (43)
LH^j_{(0,0)}(x, y) = Σ_m Σ_n h_0(m − 2x) h_1(n − 2y) LL^{j+1}(m, n), (44)
HL^j_{(0,0)}(x, y) = Σ_m Σ_n h_1(m − 2x) h_0(n − 2y) LL^{j+1}(m, n), (45)
HH^j_{(0,0)}(x, y) = Σ_m Σ_n h_1(m − 2x) h_1(n − 2y) LL^{j+1}(m, n). (46)
The subband expressions listed in Equations 43 through 46 contain the coefficients which would appear the same if LL^{j+1} is circularly shifted by {0, 2, 4, 6, …, s} rows and {0, 2, 4, 6, …, s} columns, where s is the number of row and column coefficients in each of the subbands at level j + 1.
The subband coefficients which are the response to a shift of (0, 1) in the previous level's coefficients may be computed by
LL^j_{(0,1)}(x, y) = Σ_m Σ_n h_0(m − 2x) h_0(n − 2y) LL^{j+1}(m, n + 1), (47)
LH^j_{(0,1)}(x, y) = Σ_m Σ_n h_0(m − 2x) h_1(n − 2y) LL^{j+1}(m, n + 1), (48)
HL^j_{(0,1)}(x, y) = Σ_m Σ_n h_1(m − 2x) h_0(n − 2y) LL^{j+1}(m, n + 1), (49)
HH^j_{(0,1)}(x, y) = Σ_m Σ_n h_1(m − 2x) h_1(n − 2y) LL^{j+1}(m, n + 1), (50)
which contain all the coefficients for {0, 2, 4, 6, …, s} row shifts and {1, 3, 5, 7, …, s − 1} column shifts of LL^{j+1}. Similarly, for a shift of (1, 0) in the input, the DWT coefficients may be found by
LL^j_{(1,0)}(x, y) = Σ_m Σ_n h_0(m − 2x) h_0(n − 2y) LL^{j+1}(m + 1, n), (51)
LH^j_{(1,0)}(x, y) = Σ_m Σ_n h_0(m − 2x) h_1(n − 2y) LL^{j+1}(m + 1, n), (52)
HL^j_{(1,0)}(x, y) = Σ_m Σ_n h_1(m − 2x) h_0(n − 2y) LL^{j+1}(m + 1, n), (53)
HH^j_{(1,0)}(x, y) = Σ_m Σ_n h_1(m − 2x) h_1(n − 2y) LL^{j+1}(m + 1, n), (54)
which contain all the coefficients if the previous level's coefficients LL^{j+1} are shifted by {1, 3, 5, 7, …, s − 1} rows and {0, 2, 4, 6, …, s} columns. For an input shift of (1, 1), the subbands may be computed by
LL^j_{(1,1)}(x, y) = Σ_m Σ_n h_0(m − 2x) h_0(n − 2y) LL^{j+1}(m + 1, n + 1), (55)
LH^j_{(1,1)}(x, y) = Σ_m Σ_n h_0(m − 2x) h_1(n − 2y) LL^{j+1}(m + 1, n + 1), (56)
HL^j_{(1,1)}(x, y) = Σ_m Σ_n h_1(m − 2x) h_0(n − 2y) LL^{j+1}(m + 1, n + 1), (57)
HH^j_{(1,1)}(x, y) = Σ_m Σ_n h_1(m − 2x) h_1(n − 2y) LL^{j+1}(m + 1, n + 1). (58)
Similarly, these subband coefficients account for all DWT representations which correspond to {1, 3, 5, 7, …, s − 1} row shifts and {1, 3, 5, 7, …, s − 1} column shifts of the input subband LL^{j+1}.
Performing a full decomposition will result in a tree which contains the DWT coefficients for all N² circular translates of an N × N image. At each level of decomposition, the LL band is shifted four times, and for each shift (0, 0), (0, 1), (1, 0), (1, 1), four new sets of subbands are generated. The decomposition tree is shown in Figure 13, and each circular node corresponds to only three subband images: HH, LH and HL, since at each level the LL band is shifted and then further decomposed. The number of coefficients in each node (per decomposition level) remains constant at 3N², and a complete decomposition tree will have N²(3 log₂ N + 1) elements Liang & Parks (1994). Computing the DWT for all N² translates of the image costs O(N² log₂ N), due to the periodicity of the rate change operators Liang & Parks (1998).
To achieve shift-invariance, a subset of the wavelet coefficients in the tree of Figure 13 must be chosen in a consistent manner. To do this, metrics can be computed from the tree. This requires an organized way to address each of the coefficients. A proper addressing scheme helps to find the wavelet transform for a particular translate (m, n), where m is the row shift and n is the column shift of the input image.
For a path in the tree which originates from the root, terminates at a leaf node and corresponds to the translate (m, n), an expression may be developed which represents all row shifts and all column shifts as binary vectors, where each vector entry can be either 0 or 1. The binary expansions may be written as
m = Σ_{i=1}^{log₂ N} a_i 2^{i−1}, (59)
n = Σ_{i=1}^{log₂ N} b_i 2^{i−1}, (60)
Fig. 13. Shift-invariant DWT decomposition tree for three decomposition levels.
where a_i and b_i are the binary symbols which represent the row and column shift at decomposition level i, respectively. In order to find the three subimages (HL, HH and LH) which correspond to the translate (m, n) at the Kth decomposition level in the tree, it is necessary to find the Sth node which corresponds to this shift, as shown below:
S = 2 Σ_{i=1}^{K} a_i 4^{K−i} + Σ_{i=1}^{K} b_i 4^{K−i}. (61)
After the three subimages are located within the tree, to ensure that they correspond to the translate of the input by (m, n), these three images (HH, LH, HL) must be shifted by (xShift, yShift):
xShift = Σ_{i=K+1}^{log₂ N} a_i 2^{i−K−1}, (62)
yShift = Σ_{i=K+1}^{log₂ N} b_i 2^{i−K−1}. (63)
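A small Python sketch of this addressing scheme (Equations 59 to 63). The node-index formula is taken as S = 2 Σ a_i 4^{K−i} + Σ b_i 4^{K−i}; treat the exact indexing as an assumption for illustration, and the names as illustrative:

```python
def bits(v, nbits):
    # binary expansion v = sum_i a_i 2^(i-1), with a_1 the least significant bit
    return [(v >> (i - 1)) & 1 for i in range(1, nbits + 1)]

def node_and_residual(m, n, K, N):
    # for translate (m, n) of an N x N image (N a power of two): locate the
    # node index S at level K (Eq. 61) and the residual shifts (Eqs. 62-63)
    J = N.bit_length() - 1          # number of binary digits, log2(N)
    a, b = bits(m, J), bits(n, J)
    S = sum((2 * a[i - 1] + b[i - 1]) * 4 ** (K - i) for i in range(1, K + 1))
    x_shift = sum(a[i - 1] * 2 ** (i - K - 1) for i in range(K + 1, J + 1))
    y_shift = sum(b[i - 1] * 2 ** (i - K - 1) for i in range(K + 1, J + 1))
    return S, x_shift, y_shift

# the four level-1 children map to the shifts (0,0), (0,1), (1,0), (1,1)
assert [node_and_residual(m, n, 1, 8)[0]
        for (m, n) in [(0, 0), (0, 1), (1, 0), (1, 1)]] == [0, 1, 2, 3]
```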
This scheme allows us to address the wavelet coefficients that correspond to a particular shift of the input. The following section focuses on Coifman and Wickerhauser's best basis selection technique Coifman & Wickerhauser (1992), a method to select a consistent set of wavelet coefficients which is independent of the input translation. Since the same coefficients are selected every time the algorithm is run, regardless of any initial offset, shift-invariance is achieved.
Coifman and Wickerhauser defined a method to choose a set of basis functions based on the minimization of a cost function J Coifman & Wickerhauser (1992). The cost function J is often called an information cost; it evaluates and compares the efficiency of many basis sets Coifman & Saito (1995). Although there are many choices for cost functions, an additive information cost is preferred so that a fast divide-and-conquer tree search algorithm may be used to find the best set of wavelet coefficients Liang & Parks (1994). A cost function J is additive if it maps a sequence {x_i} to R while ensuring that the following properties always hold:
J(0) = 0, (64)
J({x_i}) = Σ_i J(x_i). (65)
To choose a consistent set of wavelet coefficients, an entropy cost function J is used for best basis determination. Entropy gives insight into the uniformity of the coefficient representation (maximum energy compaction), which may be used for texture analysis. Furthermore, entropy is beneficial since it can achieve additivity Coifman & Saito (1995). Shown below is the expression of entropy which is minimized:
h_r(x) = −Σ_i |x_i|^r log |x_i|^r. (66)
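The additivity properties (Equations 64 and 65) and the entropy cost of Equation 66 can be checked numerically; a Python sketch, with r = 2 and the usual convention 0·log 0 = 0 assumed:

```python
import numpy as np

def entropy_cost(x, r=2):
    # h_r(x) = -sum_i |x_i|^r log|x_i|^r (Eq. 66), with 0*log(0) := 0
    p = np.abs(np.asarray(x, dtype=float)) ** r
    nz = p > 0
    return float(-np.sum(p[nz] * np.log(p[nz])))

x = np.array([0.5, -0.25, 0.0, 1.5])
assert entropy_cost(np.array([0.0])) == 0.0                 # Eq. 64
assert np.isclose(entropy_cost(x),
                  sum(entropy_cost([xi]) for xi in x))      # Eq. 65
```

Additivity is what allows a subband's cost to be evaluated as the sum of per-coefficient costs, which in turn enables the divide-and-conquer tree search.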
Fig. 14. Best basis selection corresponding to the minimum cost path.
The search begins at the leaves of the decomposition tree (see Figures 13 and 14) and works upwards. For each parent node, there are four child nodes, each containing the high frequency subbands of a particular translate. The cost A of a particular translate (p, q) ∈ {(0, 0), (0, 1), (1, 0), (1, 1)} at some node is computed by summing the cost of the individual high frequency subbands for that shift:
A_{(p,q)} = J(HH_{(p,q)}) + J(HL_{(p,q)}) + J(LH_{(p,q)}). (67)
To minimize entropy, the node with the minimum cost for each parent is selected at every decomposition level. The path connected from the root of the tree all the way down to the leaves is selected as the minimum cost path, as shown in Figure 14. This path corresponds to the DWT of a particular translate and is chosen as the consistent set of basis functions in order to achieve shift-invariance.
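The per-node choice can be sketched as picking the argmin of the four translate costs, with the cost of a shift taken as the sum of its high-frequency subband entropies (Equation 67). A Python sketch on synthetic subbands; all names and data are illustrative assumptions:

```python
import numpy as np

def entropy_cost(x):
    # additive entropy cost of Eq. 66 with r = 2
    p = np.abs(np.ravel(x)).astype(float) ** 2
    nz = p > 0
    return float(-np.sum(p[nz] * np.log(p[nz])))

def best_translate(subbands):
    # subbands: {(p, q): (HH, HL, LH)} for the candidate shifts; the cost
    # of a shift is the sum of its subband entropies (Eq. 67)
    costs = {pq: sum(entropy_cost(b) for b in bands)
             for pq, bands in subbands.items()}
    return min(costs, key=costs.get)

# synthetic example: one translate concentrates its energy (low entropy)
sparse = np.zeros(16); sparse[3] = 1.0
spread = np.full(16, 0.25)
subbands = {(0, 0): (spread, spread, spread),
            (1, 1): (sparse, sparse, sparse)}
best = best_translate(subbands)   # the compact translate (1, 1) wins
```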
Texture features are extracted from the grey-level co-occurrence matrix (GCM) of each subband. The co-occurrence counts P(l_1, l_2) are first normalized,
p(l_1, l_2) = P(l_1, l_2) / Σ_{l_1=0}^{L−1} Σ_{l_2=0}^{L−1} P(l_1, l_2), (68)
from which the homogeneity h and entropy e features for displacement d and angle θ are computed:
h(θ) = Σ_{l_1=0}^{L−1} Σ_{l_2=0}^{L−1} p²(l_1, l_2, d, θ), (69)
e(θ) = −Σ_{l_1=0}^{L−1} Σ_{l_2=0}^{L−1} p(l_1, l_2, d, θ) log p(l_1, l_2, d, θ). (70)
These features describe the relative uniformity of textured elements in the wavelet domain
(which are localized with good results due to the space-frequency resolution of the bases).
Recall that abnormal and normal cases were shown to have significant differences in terms
of their texture uniformity (normal images contained smooth texture while abnormal images
were heterogeneous). Therefore, such a scheme, which captures textural differences between
images, should be able to arrive at high classification results for CAD (i.e. the classification of
normal and abnormal retinal and small bowel images, and differentiation between malignant
and benign lesions in the mammogram images).
For each decomposition level j, more than one directional feature is generated for the HH
and LL subbands. The features in these subbands are averaged so that the features are not biased to a particular orientation of texture and so that the representation offers some rotational invariance. The features generated in these subbands (HH and LL) are shown below (the quantity in parentheses is the angle at which the GCM was computed):
h̄^j_HH = (1/2) [h^j_HH(45°) + h^j_HH(135°)],
ē^j_HH = (1/2) [e^j_HH(45°) + e^j_HH(135°)],
h̄^j_LL = (1/4) [h^j_LL(0°) + h^j_LL(45°) + h^j_LL(90°) + h^j_LL(135°)],
ē^j_LL = (1/4) [e^j_LL(0°) + e^j_LL(45°) + e^j_LL(90°) + e^j_LL(135°)].
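The directional-averaging step can be sketched with a minimal grey-level co-occurrence matrix and the homogeneity and entropy measures written per the expressions above. The angle-to-offset mapping and the Σp² homogeneity form are assumptions for illustration, and the subband data is synthetic:

```python
import numpy as np

def gcm(img, dx, dy, levels):
    # grey-level co-occurrence counts P(l1, l2) for displacement (dx, dy),
    # normalized to a probability matrix p (Eq. 68)
    P = np.zeros((levels, levels))
    H, W = img.shape
    for y in range(H):
        for x in range(W):
            y2, x2 = y + dy, x + dx
            if 0 <= y2 < H and 0 <= x2 < W:
                P[img[y, x], img[y2, x2]] += 1
    return P / P.sum()

def homogeneity(p):
    # sum of squared probabilities (the form read from Eq. 69)
    return float(np.sum(p ** 2))

def entropy(p):
    # Eq. 70, with 0*log(0) := 0
    nz = p > 0
    return float(-np.sum(p[nz] * np.log(p[nz])))

# assumed angle-to-offset mapping for the two diagonal GCM directions
offsets = {45: (1, -1), 135: (-1, -1)}
band = np.random.default_rng(0).integers(0, 8, size=(16, 16))
h_HH = np.mean([homogeneity(gcm(band, *offsets[a], 8)) for a in (45, 135)])
e_HH = np.mean([entropy(gcm(band, *offsets[a], 8)) for a in (45, 135)])
```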
As a result, for each decomposition level j, two feature sets are generated:
F^j_h = { h^j_HL(0°), h^j_LH(90°), h̄^j_HH, h̄^j_LL }, (71)
F^j_e = { e^j_HL(0°), e^j_LH(90°), ē^j_HH, ē^j_LL }, (72)
where h̄^j_HH, h̄^j_LL, ē^j_HH and ē^j_LL are the averaged texture descriptions from the HH and LL bands previously described, and h^j_HL(0°), e^j_HL(0°), h^j_LH(90°) and e^j_LH(90°) are homogeneity and entropy texture measures extracted from the HL and LH bands.
entropy texture measures extracted from the HL and LH bands. Since directional GCMs are
used to compute the features in each subband, the final feature representation is not biased for
a particular orientation of texture and may provide a semi-rotational invariant representation.
7. Classification
After the multiscale texture features have been extracted, a pattern recognition technique
is needed to classify the features. A large number of test samples is required to evaluate a classifier with low error (misclassification) rates, since a small database will cause the parameters of the classifier to be estimated with low accuracy. This requires the biomedical image database to be large, which may not always be the case, since acquiring the images for specific diseases can take years. If the extracted features are strong (i.e. the features are mapped into nonoverlapping clusters in the feature space), the use of a simple (linear) classification scheme will be sufficient to discriminate between classes. The desire is to test the robustness of the found feature set to the variations found in image databases. This can be easily determined by a linear classifier.
To satisfy the above criteria, linear discriminant analysis (LDA) will be the classification
scheme used in conjunction with the Leave One Out Method (LOOM). In LOOM, one sample is
removed from the whole set and the discriminant functions are derived from the remaining
N − 1 data samples, and the left-out sample is classified. This procedure is completed for all N
samples. LOOM will allow the classifier parameters to be estimated with least bias Fukunaga
& Hayes (1989).
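The LDA-plus-LOOM protocol can be sketched compactly on synthetic two-class data (NumPy only; the data and class geometry are illustrative assumptions, not the chapter's actual feature sets):

```python
import numpy as np

def lda_fit(X, y):
    # two-class LDA: pooled within-class covariance and class means give a
    # linear discriminant w, b; score X @ w + b > 0 assigns class 1
    m0, m1 = X[y == 0].mean(0), X[y == 1].mean(0)
    X0c, X1c = X[y == 0] - m0, X[y == 1] - m1
    Sw = (X0c.T @ X0c + X1c.T @ X1c) / (len(X) - 2)
    w = np.linalg.solve(Sw, m1 - m0)
    b = -0.5 * w @ (m0 + m1)
    return w, b

def loom_accuracy(X, y):
    # Leave One Out Method: train on N-1 samples, classify the one left out
    hits = 0
    for i in range(len(X)):
        mask = np.arange(len(X)) != i
        w, b = lda_fit(X[mask], y[mask])
        hits += int((X[i] @ w + b > 0) == y[i])
    return hits / len(X)

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(4, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
acc = loom_accuracy(X, y)   # well-separated clusters give high accuracy
```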
8. Results
The objective of the proposed system is to automatically classify pathologies based on their
textural characteristics. Such a system examines texture in accordance to the human texture
perception model and is shown in Figure 15.
Fig. 15. System block diagram for the classification of medical images.
The classification performance of the proposed system is evaluated for three types of imagery:
1. Small Bowel Images: 41 normal and 34 abnormal (submucosal masses, lymphomas, jejunal carcinomas, multifocal carcinomas, polypoid masses, Kaposi's sarcomas, etc.),
2. Retinal Images: 38 normal, 48 abnormal (exudates, large drusen, fine drusen, choroidal neovascularization, central vein and artery occlusion, arteriosclerotic retinopathy, histoplasmosis, hemi-central retinal vein occlusion and more),
3. Mammograms: 35 benign and 19 malignant lesions.
The image specifications are shown in Table 3 and example images were shown earlier in
Section 2. Only the luminance plane was utilized for the colour images (retinal and small
bowel), in order to examine the performance of grayscale-based features. Furthermore, in
the mammogram images, only a 128 128 region of interest is analyzed which contains the
candidate lesion (to strictly analyze the textural properties of the lesions). Features were then extracted from each image as described in the previous sections.

Modality        Colour depth        Format          Resolution
Small Bowel     Colour (24 bpp)     Lossy (.jpeg)   256 × 256
Retinal         Colour (24 bpp)     Lossy (.jpeg)   700 × 605
Mammogram       Grayscale (8 bpp)   Raw (.pgm)      1024 × 1024
Table 3. Specifications of the three image databases.
In order to find the optimal sub-feature set, an exhaustive search was performed (i.e. all
possible feature combinations were tested using the proposed classification scheme). For
the small bowel images, the optimal classification performance was achieved by combining
homogeneity features from the first and third decomposition levels with entropy from the first
decomposition level (see Khademi & Krishnan (2006) for more details):
F^1_h = { h^1_HL(0°), h^1_LH(90°), h̄^1_HH, h̄^1_LL }, (73)
F^3_h = { h^3_HL(0°), h^3_LH(90°), h̄^3_HH, h̄^3_LL }, (74)
F^1_e = { e^1_HL(0°), e^1_LH(90°), ē^1_HH, ē^1_LL }. (75)
The optimal feature set for the retinal images was found to be homogeneity features from the fourth decomposition level with entropy from the first, second and fourth decomposition levels (see Khademi & Krishnan (2007) for more details):
F^4_h = { h^4_HL(0°), h^4_LH(90°), h̄^4_HH, h̄^4_LL }, (76)
F^1_e = { e^1_HL(0°), e^1_LH(90°), ē^1_HH, ē^1_LL }, (77)
F^2_e = { e^2_HL(0°), e^2_LH(90°), ē^2_HH, ē^2_LL }, (78)
F^4_e = { e^4_HL(0°), e^4_LH(90°), ē^4_HH, ē^4_LL }. (79)
Lastly, the optimal feature set for the mammographic lesions was found by combining homogeneity features from the second decomposition level with entropy from the fourth decomposition level:
F^2_h = { h^2_HL(0°), h^2_LH(90°), h̄^2_HH, h̄^2_LL }, (80)
F^4_e = { e^4_HL(0°), e^4_LH(90°), ē^4_HH, ē^4_LL }. (81)
Using the above features in conjunction with LOOM and LDA, the classification results for
the small bowel, retinal and mammogram images are shown as a confusion matrix in Table 4,
Table 5 and Table 6, respectively.
Classified as    True Normal    True Abnormal
Normal           35 (85%)       5 (15%)
Abnormal         6 (15%)        29 (85%)
Table 4. Confusion matrix for the small bowel images.

Classified as    True Normal    True Abnormal
Normal           30 (79%)       7 (14.6%)
Abnormal         8 (21%)        41 (85.4%)
Table 5. Confusion matrix for the retinal images.

Classified as    True Benign    True Malignant
Benign           28 (80%)       8 (42%)
Malignant        7 (20%)        11 (58%)
Table 6. Confusion matrix for the mammographic lesions.
9. Conclusions
A total of 75 abnormal and normal bowel images were correctly classified at an average rate of 85%, 86 retinal images had an average classification accuracy of 82.2%, and the 54 mammogram lesions were classified correctly 69% of the time on average. The classification results are quite high, considering that the system wasn't tuned for a specific modality. The system performed well, even though: (1) pathologies came in various orientations, (2) pathologies arose in a variety
of locations in the image, (3) the masses and lesions were of various sizes and shapes and
(4) there was no restriction on the type of pathology for the retinal and small bowel images.
Accounting for all these scenarios in one algorithm was a major challenge while designing
such a unified framework for computer-aided diagnosis.
Although the classification results are high, any misclassification can be attributed to cases where there is a lack of statistical differentiation between the texture uniformity of the
pathologies. Additionally, normal tissue can sometimes assume the properties of abnormal
regions; for example, consider a normal small bowel image which has more than the average
amount of folds. This may be characterized as non-uniform texture and consequently would
be misclassified. In a normal retinal image, if the patient has more than the average number
of vessels in their eye, this may be detected as oriented or heterogeneous texture and could
be misclassified. Moreover, when considering the mammogram lesions, the normal breast
parenchyma is overlapping with the lesions and also assumes some textural properties itself.
In order to improve the performance of the mammogram lesions, a segmentation step could
be applied prior to feature extraction.
Another important consideration arises from the database sizes. As was stated in Section 7, the
number of images used for classification can determine the accuracy of the estimated classifier
parameters. Since only a modest number of images were used, misclassification could result from improper estimation of the classifier's parameters (although the scheme tried to combat this with LOOM). This could especially be the case for the mammogram lesions, since the benign lesions outnumbered the malignant lesions by almost two to one; this could have caused difficulties in classification parameter accuracy. Additionally, finding the right trade-off between the number of features and the database size is an ongoing research topic and has yet to be perfectly defined Fukunaga & Hayes (1989).
The overall success of the system is a result of the design of the algorithm, which aimed to account for all the pathological scenarios previously described. Firstly, the utilization of the DWT was important to gain a space-localized representation of the image's elementary texture units (textons), which is in accordance with human texture perception. Secondly, the choice of wavelet-based statistical texture measures (entropy and homogeneity) was critical in quantifying the localized texture properties of the images (which provided discrimination between normal and other pathological cases). Utilization of the SIDWT allowed for the extraction of consistent (i.e. shift-invariant) features. Furthermore, due to the scale-invariant basis functions of the DWT, pathologies of varying sizes were captured within one transformation (i.e. the features were scale-invariant). By design, the system is relatively robust to pathologies which occur in various orientations, locations and scales.
10. References
Adams, M. D. & Ward, R. K. (2003). Symmetric-extension-compatible reversible integer-to-integer wavelet transforms, IEEE Transactions on Signal Processing 51(10): 2624–2636.
Armstrong, A. & Jiang, J. (2001). An efficient image indexing algorithm in JPEG compressed domain, International Conference on Image Processing, pp. 350–351.
Beylkin, G. (1992). On the representation of operators in bases of compactly supported wavelets, SIAM Journal of Numerical Analysis 29: 1716–1740.
Bradley, A. (2003). Shift-invariance in the discrete wavelet transform, Digital Image Computing: Techniques and Applications, pp. 29–38.
Brandon, L. & Hoover, A. (2003). Drusen detection in a retinal image using multi-level analysis, Vol. 1, MICCAI, pp. 618–625.
Burrus, C., Gopinath, R. & Guo, H. (1998). Introduction to Wavelets and Wavelet Transforms - A Primer, Prentice Hall International, Inc., Houston, Texas.
Chang, S. (1995). Compressed-domain techniques for image/video indexing and manipulation, Vol. 1, International Conference on Image Processing, pp. 314–317.
Cheng, H. D., Shi, X. J., Min, R., Hu, L. M., Cai, X. P. & Du, H. N. (2006). Approaches for automated detection and classification of masses in mammograms, Pattern Recognition 39(4): 646–668.
Chiu, C., Wong, H. & Ip, H. H. S. (2004). Compressed domain feature transformation using evolutionary strategies for image classification, Vol. 1, International Conference on Image Processing, pp. 429–432.
Kim, B., Park, S., Jee, C. & Yoon, S. (2005). An earthworm-like locomotive mechanism for capsule endoscopes, IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 2997–3002.
Lawson, S. & Zhu, J. (2004). Image compression using wavelets and JPEG2000: a tutorial, Electronics & Communication Engineering Journal 14(3): 112–121.
Lee, J. K. T. (2007). Interpretation accuracy and pertinence, American College of Radiology 4: 162–165.
Leung, M. M. & Peterson, A. M. (1992). Scale and rotation invariant texture classification, Vol. 1, Conference Record of The Twenty-Sixth Asilomar Conference on Signals, Systems and Computers, pp. 461–465.
Liang, J. & Parks, T. W. (1994). A two-dimensional translation invariant wavelet representation and its applications, Vol. 1, IEEE International Conference on Image Processing, pp. 66–70.
Liang, J. & Parks, T. W. (1996). Translation invariant wavelet transforms with symmetric extensions, IEEE Digital Signal Processing Workshop, pp. 69–72.
Liang, J. & Parks, T. W. (1998). Image coding using translation invariant wavelet transforms with symmetric extensions, IEEE Transactions on Image Processing 7: 762–769.
Mallat, S. (1989). A theory for multiresolution signal decomposition: The wavelet representation, IEEE Transactions on Pattern Analysis and Machine Intelligence 11(7): 674–693.
Mallat, S. (1998). A Wavelet Tour of Signal Processing, Academic Press, USA.
Mallat, S. G. & Zhang, Z. (1993). Matching pursuits with time-frequency dictionaries, IEEE Transactions on Signal Processing 41: 3397–3415.
Maragos, P., Mersereau, R. & Shafer, R. (1984). Two-dimensional linear prediction and its application to adaptive predictive coding of images, IEEE Transactions on Acoustics, Speech, and Signal Processing 32: 1213–1229.
Marcellin, M. W., Bilgin, A., Gormish, M. J. & Boliek, M. P. (2000). An overview of JPEG-2000, Proceedings of the IEEE Data Compression Conference, IEEE Computer Society, p. 523.
Mudigonda, N., Rangayyan, R. & Desautels, J. (2000). Gradient and texture analysis for the classification of mammographic masses, IEEE Transactions on Medical Imaging 19(10): 1032–1043.
Rangayyan, R. (2005). Biomedical Image Analysis, CRC Press LLC, United States of America.
Rangayyan, R. M., El-Faramawy, N. M., Desautels, J. E. L. & Alim, O. (1997). Measures of acutance and shape for classification of breast tumors, IEEE Transactions on Medical Imaging 16(6): 799–810.
Ross, S. (2003). Introduction to Probability Models, Academic Press, USA, California.
Sato, T., Abe, N., Tanaka, K., Kinoshita, Y. & He, S. (2006). Toward developing multiple organs and diseases diagnosing intellectual system referring to knowledge base and CT images, IEEE Symposium on Computer-Based Medical Systems, pp. 1–6.
Simoncelli, E. P., Freeman, W. T., Adelson, E. H. & Heeger, D. J. (1992). Shiftable multiscale
transforms, IEEE Transactions on Information Theory 38: 587 607.
Sinthanayothin, C., Kongbunkiat, V., Phoojaruenchanachai, S. & Singalavanija, A. (2003).
Automated screening system for diabetic retinopathy, Vol. 2, Proceedings of the 3rd
International Symposium on Image and Signal Processing and Analysis, pp. 915
920.
212
Part 4
Industrial Applications
11
Discrete Wavelet Transforms for Synchronization
of Power Converters Connected to Electrical Grids
Alberto Pigazo and Víctor M. Moreno
University of Cantabria
Spain
1. Introduction
Electronic power converters connected to electrical grids allow industrial processes, traction
applications and home appliances to be improved by controlling the energy flow depending
on the operation conditions of both the electrical load and the grid. This is the case of variable
frequency drives, which can be found in pump drives or ship propulsion systems (Bose, 2009)
maintaining the electrical machine in the required operation state while ensuring a proper
current consumption from the electrical grid. Recent research and development efforts on
grid-connected power converters are due to the integration of renewable energy sources
in electrical grids, which requires the implementation of new functionalities, such as grid
support, while maintaining reduced current distortion levels and an optimal power extraction
from the renewable energy source (Carrasco et al., 2006; Liserre et al., 2010).
In the most general case, a grid-connected power converter consists of power and control
stages which ensure the appropriate energy management (Erickson & Maksimovic, 2001;
Mohan et al., 2003). In the first one, electronic power devices, such as power diodes, thyristors,
insulated gate bipolar transistors (IGBTs) or MOS-controlled thyristors (MCTs), and passive
elements (inductances and capacitors) are found. The switching state of the power devices
allows the voltage or/and current across the passive components to be controlled. Resistive
behaviors must be minimized in order to avoid conduction power losses. The second stage,
in case of controlled semiconductor devices, consists of a signal conditioning system and the
required hardware for implementation of the converter controller (Bose, 2006).
Recent advances in field programmable gate arrays (FPGAs) and digital signal processors
(DSPs) allow the complexity and functionalities of the controllers employed in power
converters to be increased and improved (Bueno et al., 2009). In grid-connected power
converters these functionalities include, in most cases, the synchronization with the electrical
grid, the evaluation of the reference current amplitude at the grid-side and current control
(Kazmierkowski et al., 2002). The amplitude and phase of the grid-side current depends
on the reference current evaluation and the synchronization subsystems while the current
controller ensures that the current waveform matches the reference one. The implementation
of these subsystems depends on the application characteristics. Other functionalities, such as
grid support (Ullah et al., 2009) or detection of the islanding condition (De Mango, Liserre &
D'Aquila, 2006; De Mango, Liserre, D'Aquila & Pigazo, 2006), can be added if required.
These controller functionalities can be implemented by applying diverse approaches, such as
digital signal processing techniques, e.g. Fourier Transforms (McGrath et al., 2005), Kalman
Filters (Moreno et al., 2007) or Discrete Wavelet Transforms (DWTs) (Pigazo et al., 2009).
The frequency and time localization of wavelet analysis allows the performance of controllers in
grid-connected power converters to be improved. This is the case of active power filters, where
the compensation reference current can be evaluated by means of the DWT (Driesen & Belmans,
2002), of modulation techniques in controlled rectifiers (Saleh & Rahman, 2009) and of the
controller design process using averaging models of power converters (Gandelli et al., 2001).
This book chapter proposes to take advantage of the properties of DWTs in order to improve the
synchronization subsystem of controllers in grid-connected power converters. After a review
of the state of the art in wavelet analysis applied to power electronics, the main characteristics
of controllers in grid-connected power converters are presented, as well as the new approach
for synchronization purposes. Results validating the proposal, considering diverse operation
conditions, are shown.
Power system protection can be improved by applying wavelet analysis to activate the
relays in case of power system transients. The time resolution capability of wavelet analysis
is employed in (Chaari et al., 1996) for detection of earth faults in a 20 kV resonant
grounded network. High-impedance fault identification and protection of transformers and
generators by means of wavelets are also shown in (Solanki et al., 2001) and (Eren & Devaney,
2001) respectively. In the latter case, the frequency resolution of wavelets allows changes in
the power signal spectra to be measured in order to detect the degradation of the insulation
and to identify internal and external faults. Wavelets have also been employed for modeling of
electrical machines in wind turbines and detection of turn-to-turn rotor faults (Dinkhauser &
Fuchs, 2008).
The evaluation of electrical power quality (PQ) can take advantage of wavelet analysis
for detection and measurement of interferences, impulses, notches, glitches, interruptions,
harmonics, flicker and other disturbances. In case of harmonic currents/voltages and voltage
flicker, multiresolution analysis (MRA) using wavelet filter banks (Pham & Wong, 1999;
Pham et al., 2000) and continuous wavelet transforms (Zhen et al., 2000) can be applied. The
propagation of power system transients can also be analyzed by means of wavelets (Heydt &
Galli, 1997; Wilkinson & Cox, 1996). The characteristics of partial discharges (short duration,
high frequency and low amplitude) make them difficult to detect. Wavelet analysis allows partial
discharges to be detected due to its time resolution, as shown in (Shim et al., 2000) for
transformer windings and cables.
The efficient management of electrical power systems requires proper forecasting of electrical
loads. The combination of wavelets and neural networks in (Huang & Yang, 2001; Yao et al.,
2000) achieves this by considering the current waveforms as a linear combination of components
at different frequencies. Wavelet analysis can also be applied for measurement of the electrical
active/reactive power and the root mean square (rms) value of line voltages and currents
on a frequency-band basis (Hamid & Kawasaki, 2001).
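The band-wise RMS idea above can be sketched with a tiny orthonormal Haar DWT. The example below is only an illustration of the principle (it is not the method of Hamid & Kawasaki, 2001); because the transform is orthonormal, the per-band RMS values recombine exactly into the overall signal RMS (Parseval's relation).

```python
import math

def haar_step(x):
    # One orthonormal Haar analysis step: approximation and detail outputs.
    a = [(x[2*i] + x[2*i+1]) / math.sqrt(2) for i in range(len(x) // 2)]
    d = [(x[2*i] - x[2*i+1]) / math.sqrt(2) for i in range(len(x) // 2)]
    return a, d

def band_rms(x, levels):
    # RMS contribution of each sub-band, normalised so that the squared
    # band values sum to the squared RMS of the whole signal (Parseval).
    n, a, out = len(x), list(x), []
    for _ in range(levels):
        a, d = haar_step(a)
        out.append(math.sqrt(sum(c*c for c in d) / n))
    out.append(math.sqrt(sum(c*c for c in a) / n))
    return out  # detail bands from highest to lowest, then approximation

# 50 Hz fundamental plus a 7% 5th harmonic, sampled at 6.4 kHz
fs, f0 = 6400.0, 50.0
x = [math.sin(2*math.pi*f0*n/fs) + 0.07*math.sin(2*math.pi*5*f0*n/fs)
     for n in range(1024)]
rms_bands = band_rms(x, 5)
total_rms = math.sqrt(sum(r*r for r in rms_bands))
```

Here `total_rms` matches the plain time-domain RMS of `x`, while `rms_bands` attributes that RMS to frequency bands, which is the essence of a frequency-band-based measurement.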
2.2 Wavelet analysis in controllers for power converters
Wavelets have been recently applied in power converters used in diverse applications. The
covered functionalities include modeling of the power converter, its control and supervision
tasks.
In order to obtain a more flexible model of a dc/dc power converter, wavelets are applied in
(Ponci et al., 2009) for detection of the operation mode of the power converter, consisting of
an extension of conventional analysis techniques based on state-space averaging. A model of
a dc/ac converter based on wavelets is obtained in (Gandelli et al., 2002) in order to perform
a detailed analysis and the optimization of the power converter.
Wavelet-based controllers have also been proposed in the literature in order to improve the
performance of the power converter. This is the case of (Hsu et al., 2008), where a
wavelet-based neural network is employed in order to minimize the impact of input voltage
and load resistance variations on a dc/dc converter. In (Saleh & Rahman, 2009) wavelets allow
a new switching strategy to be developed in order to reduce the harmonic content of the
output voltage in an ac/dc converter while maintaining unity power factor. A three-phase induction
generator (IG) system for stand-alone power systems is controlled by means of one ac/dc plus
one dc/ac converter and applying a recurrent wavelet neural network (RWNN) controller
with improved particle swarm optimization (IPSO) (Teng et al., 2009). The controllers in
dc/ac converters can also be optimized by applying wavelets; this is the case of (Mercorelli et al.,
2004), where they are employed for optimization of the applied model predictive controller.
Wavelet analysis is applied in (González et al., 2008) in order to evaluate the performance of
the employed modulation technique, including the spectrum of the converter output voltage
and its ripple. Controllers in multilevel converters can also take advantage of wavelets, as
shown in (Iwaszkiewicz & Perz, 2007), in order to ensure a better and faster adaptation of
their output voltage waveforms to sine waveforms and to reduce the harmonic distortion of the
output voltage at relatively low switching frequencies. High-level control functionalities, such
as islanding detection or source impedance measurement, required in distributed generation
systems connected to electrical grids, can also benefit from wavelet analysis. The
high frequency bands of voltage and current waveforms are evaluated in (Pigazo et al.,
2009; 2007) in order to detect the islanding condition. The power system impedance can
be measured in real time by injecting a controlled disturbance into the electrical grid; the
wavelet analysis then allows a fast detection of faults (Sumner et al., 2006). Wavelets can
also be applied for characterization of power converter performance. In (Knezevic et al., 2000)
wavelet analysis is applied for measurement of transients caused by ac/dc converters.
switch on and off the controlled power devices, an LCL-filter, employed as a second-order low-pass
filtering stage which allows the high frequency ripple of the full-bridge output voltage to
be filtered out, and a dc-side filtering stage, which can be implemented by means of one shunt
capacitor (first order) or a series inductance plus a shunt capacitor (second order).
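As a side note on the LCL filtering stage, its resonance frequency follows directly from the component values via the standard formula f_res = (1/2π)·√((L1 + L2)/(L1·L2·C)). The sketch below uses hypothetical component values, since the chapter does not list them at this point:

```python
import math

def lcl_resonance_hz(l1, l2, c):
    # Resonance frequency of an LCL filter (inverter-side L1, grid-side L2,
    # shunt capacitor C): f_res = (1 / 2*pi) * sqrt((L1 + L2) / (L1 * L2 * C)).
    return math.sqrt((l1 + l2) / (l1 * l2 * c)) / (2 * math.pi)

# Hypothetical component values (assumptions, not taken from the chapter):
f_res = lcl_resonance_hz(2.5e-3, 0.5e-3, 10e-6)  # Hz, roughly 2.5 kHz here
```

A common design guideline is to place this resonance well above the controller bandwidth and below half the switching frequency, so that the filter attenuates switching ripple without interfering with the control loop.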
[Figure: single-phase grid-connected power converter. Power stage: dc load or bus, dc-side filter (R1, Cdc), IGBT H-bridge with gate circuit, and LCL filter (R2, L1, C3, R3, L2) connected to the grid (vg, ig). Digital controller: signal conditioning and acquisition of vg and ig, reference current evaluation, grid synchronization (sin ωt), current controller (reference i*) and PWM.]
Finally, the control action must be applied to the gate circuitry of the H-bridge, where square
signal waveforms with variable width are required. In order to obtain these variable switching
patterns, diverse approaches can also be found. A detailed description of these techniques is
available in (Holmes & Lipo, 2003).
The Controller block of the software PLL is commonly implemented as a PI controller or, more
generally, as a first-order or second-order low pass filter. However, recent research on
DWTs for control applications suggests that the performance of PI controllers can be improved
by using DWTs (Parvez & Gao, 2005). The proposed software PLL replaces the PI controller
with a DWT implemented using filter banks. The inner structure of the Controller in case of the
proposed software PLL is shown in Fig. 3. As can be seen, it consists of one Buffer, where
2^L samples of the input are buffered to be analyzed, L being the number of decomposition
levels, and the Dyadic Analysis Filter Bank from the Signal Processing Blockset in MatLab/Simulink,
which generates an output vector containing the output at each sub-band. Then, the loop
gains, contained in the Constant Diagonal Matrix block and needed to adjust the response of
the proposed software PLL, are applied. Finally, the Controller output signal is obtained by
adding the current output of the previous stage at each sub-band.
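The Controller structure described above (buffer → dyadic analysis filter bank → per-band loop gains → sum) can be sketched as follows. This is a minimal illustration assuming a Haar filter bank and a buffer of 2^L samples; the gain values are placeholders, not the tuned GA/GD values reported later in the chapter:

```python
import math

def dyadic_analysis(buf, levels):
    # Haar dyadic analysis bank over a buffer of 2**levels samples:
    # returns the newest sample of each detail band plus the approximation.
    a, bands = list(buf), []
    for _ in range(levels):
        det = [(a[2*i] - a[2*i+1]) / math.sqrt(2) for i in range(len(a) // 2)]
        a   = [(a[2*i] + a[2*i+1]) / math.sqrt(2) for i in range(len(a) // 2)]
        bands.append(det[-1])
    bands.append(a[-1])
    return bands

def controller_output(buf, gains):
    # The per-band loop gains (the constant diagonal matrix) are applied to
    # the sub-band outputs, which are then summed into the controller output.
    bands = dyadic_analysis(buf, len(gains) - 1)
    return sum(g * b for g, b in zip(gains, bands))

# Placeholder gains for L = 3: three detail-band gains, then the gain of the
# lowest frequency sub-band (the role played by GA3 in the chapter).
out = controller_output([1.0] * 8, [1.0, 1.0, 1.0, 39.0])
```

For a constant input the detail bands vanish and only the approximation band (scaled by its loop gain) drives the output, which is why the lowest-band gain dominates the steady-state behaviour of the loop.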
5. Simulation results
In order to analyze the performance of the proposed synchronization block, diverse simulation
tests have been carried out. After the selection of the most suitable mother wavelet, considering
diverse decomposition levels and operation conditions, the proposed synchronization system
is employed to control a dc machine by means of a grid-connected controlled
rectifier.
The applied tests include step amplitude variations of the grid voltage from 23√2 V to 230√2 V
and step frequency variations from 47.5 Hz to 52.5 Hz, in both cases including a 7% 5th
voltage harmonic. The employed sampling frequency is 6.4 kHz.
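The test waveform described above can be generated as follows. This is a simplified sketch: the step instants are arbitrary choices, and phase continuity at the frequency step is not enforced, which the original simulation model may handle differently:

```python
import math

FS = 6400.0  # sampling frequency from the text (Hz)

def grid_voltage(n, t_amp_step=1.0, t_freq_step=2.0):
    # Grid-voltage test waveform: amplitude step 23*sqrt(2) -> 230*sqrt(2) V,
    # frequency step 47.5 -> 52.5 Hz, with a 7% 5th harmonic throughout.
    # The step instants (1 s and 2 s) are assumptions for this sketch.
    t = n / FS
    amp = 23 * math.sqrt(2) if t < t_amp_step else 230 * math.sqrt(2)
    f = 47.5 if t < t_freq_step else 52.5
    return amp * (math.sin(2*math.pi*f*t) + 0.07*math.sin(2*math.pi*5*f*t))

v = [grid_voltage(n) for n in range(int(3 * FS))]  # 3 s of samples
```

Feeding such a waveform to the synchronization block exercises both disturbance types (amplitude and frequency steps) under a distorted grid, as done in the tests reported below.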
Fig. 4. L = 3. a) Magnitude of the WPLL average error after the voltage amplitude step, b)
magnitude of the WPLL average error after the fundamental grid frequency step, c) ripple of
the WPLL error after the voltage amplitude step and d) ripple of the WPLL error after the
fundamental grid frequency step.
5.1 Selection of the mother wavelet
The selection of the most suitable mother wavelet has been carried out considering
decomposition levels (L) in the range [3, 6]. At each decomposition level, diverse values of
the WPLL loop gains have been applied under the operation conditions described previously.
The obtained results, the average error magnitude of the WPLL and the ripple of this error,
have been measured 0.5 s after each transient in order to compare the performance of each
mother wavelet. Figs. 4, 5, 6 and 7 show the obtained results for L in [3, 6].
From Fig. 4, the best results at L = 3 are obtained by applying a Daubechies 7 mother wavelet
with a loop gain at the lowest frequency sub-band GA3 = 39. In this case, the cumulative
measured average error of the WPLL falls to 1.1×10⁻³ V after the first transient and reaches
5.2×10⁻³ V after the frequency step. The error ripples measured after the grid voltage transients
are 0.23 V and 0.16 V. The worst results are obtained in case of Daubechies 4 at GA3, reaching
cumulative average errors of 2.0×10⁻² V after both grid voltage transients. In case of the
measured error ripple, it decreases to 5.4×10⁻³ V and 4.6×10⁻² V respectively but, as will
be shown in the following subsection, the phase of the input signal is not tracked accurately
due to the average error.
In case of four decomposition levels (L = 4, in Fig. 5), the most suitable mother wavelet is
Haar, applying GA4 = 43. The obtained cumulative average errors after each transient of the
grid voltage are 1.1×10⁻³ V and 2.0×10⁻³ V respectively. The measured ripples are 0.14 V and
0.16 V respectively. In comparison with the results obtained for L = 3, in this case (L = 4)
the cumulative error after the grid frequency transient is reduced to 38%. The comparison
of the measured ripples using L = 3 and L = 4 shows that, after the first transient, L = 4
with Haar wavelets gives better results. The worst results in case of L = 4 are obtained
Fig. 5. L = 4. a) Magnitude of the WPLL average error after the voltage amplitude step, b)
magnitude of the WPLL average error after the fundamental grid frequency step, c) ripple of
the WPLL error after the voltage amplitude step and d) ripple of the WPLL error after the
fundamental grid frequency step.
by employing Coiflet 5 as mother wavelet with GA4 = 21. The obtained cumulative average
errors are 2.0×10⁻² V and 8.6×10⁻³ V, while the measured ripples reach 1.1×10⁻² V and 1.5
V.
From Fig. 6, again Haar wavelets, in this case with GA5 = 29, result in the best tracking of the
applied grid voltage. The measured cumulative average errors were 8.1×10⁻⁴ V and 2.6×10⁻³
V after the amplitude and frequency steps respectively while, in case of the error ripple, the
measured values were 0.14 V and 0.12 V. Comparing these results to the ones obtained in case
of L = 4, the cumulative average error decreases after the amplitude step of the grid voltage
due to the added fifth decomposition level. The worst results at L = 5 are obtained for symlet
8, where the cumulative average errors after the transients are 2.0×10⁻² V and 1.7×10⁻² V. The
measured error ripples are 2.4×10⁻² V and 0.16 V.
Again, in case of L = 6 (Fig. 7), Haar wavelets with GA6 = 22 allow the best tracking
performance to be reached. The measured cumulative average errors in this case were
6.4×10⁻⁴ V and 9.3×10⁻⁴ V, corresponding to the amplitude and frequency transients respectively,
which improves the results obtained in case of L = 5. The measured error ripples were 0.16 V
and 0.22 V. The worst results were obtained in case of the biorthogonal 4.4 mother wavelet,
with cumulative average errors equal to 2.0×10⁻² V and 1.62 V. The error ripple reached
0.02 V and 0.67 V for each grid voltage transient.
The evolution of the frequency measurement obtained by means of the WPLL in case of L = 3,
Daubechies 7, GA3 = 39 and GD3 = 30 is shown in Fig. 8.a, where the response time of the
WPLL is 305 ms. Response times with the Haar wavelet and four (GA4 = 43, GD4 = 18.5) and
five (GA5 = 29, GD5 = 4.5 and GD4 = 3) decomposition levels are shown in Figs. 8.b and 8.c.
In these cases the measured response times are 64 ms and 150 ms, corresponding to L = 4 and
Fig. 6. L = 5. a) Magnitude of the WPLL average error after the voltage amplitude step, b)
magnitude of the WPLL average error after the fundamental grid frequency step, c) ripple of
the WPLL error after the voltage amplitude step and d) ripple of the WPLL error after the
fundamental grid frequency step.
Fig. 7. L = 6. a) Magnitude of the WPLL average error after the voltage amplitude step, b)
magnitude of the WPLL average error after the fundamental grid frequency step, c) ripple of
the WPLL error after the voltage amplitude step and d) ripple of the WPLL error after the
fundamental grid frequency step.
Fig. 8. Time response of the WPLL. a) Frequency measurement with L = 3, b) WPLL error
with L = 3, c) frequency measurement with L = 4, d) WPLL error with L = 4, e) frequency
measurement with L = 5 and f) WPLL error with L = 5.
L = 5. Despite using fewer decomposition levels, the Daubechies 7 configuration shows the
longest response time due to its filter length. Haar wavelets result in simple filter banks with
low response times. Moreover, from Fig. 8.e, the WPLL performance improves when more
decomposition levels are selected, which results in less frequency ripple.
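The influence of the filter length on the response time can be made concrete: the lowest band of an L-level dyadic analysis bank built from an N-tap filter has an equivalent filter spanning (N − 1)(2^L − 1) + 1 samples, so its delay grows with both N and L. The sketch below compares the Daubechies 7 (14 taps) and Haar (2 taps) configurations at the 6.4 kHz sampling rate; note that this accounts only for the filter-bank delay, not the full loop dynamics behind the 305/64/150 ms figures:

```python
def analysis_delay_ms(taps, levels, fs=6400.0):
    # Delay (ms) of the lowest band of an L-level dyadic analysis bank
    # built from an N-tap filter: (N - 1) * (2**L - 1) samples.
    return (taps - 1) * (2 ** levels - 1) / fs * 1e3

delays = {
    "db7, L=3":  analysis_delay_ms(14, 3),  # Daubechies 7: 14 taps
    "haar, L=4": analysis_delay_ms(2, 4),   # Haar: 2 taps
    "haar, L=5": analysis_delay_ms(2, 5),
}
```

Even with fewer decomposition levels, db7 accumulates roughly 14 ms of filter delay versus about 2.3 ms (L = 4) and 4.8 ms (L = 5) for Haar, consistent with the ordering of the measured response times.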
The WPLL outputs, considering Daubechies 7 (L = 3) and Haar (L = 4 and L = 5), for control
purposes of the grid-connected power converter can be compared by means of Fig. 9, where
Fig. 9. Time response of the WPLL after the frequency step, at 2 s, from 47.5 Hz to 52.5 Hz.
the time response of the PLL after the frequency transient of the grid voltage (at t = 2 s) is
shown.
5.2 Control of a dc motor
The proposed synchronization subsystem has been tested in simulation as a part of the whole
controller in case of a grid-connected power converter feeding a dc motor. The employed
MatLab/Simulink simulation model, including the power stage, the converter controller and
the measurement subsystem, is depicted in Fig. 10.
The power stage includes a pure sinusoidal waveform with 230√2 V amplitude and 50 Hz
frequency as grid voltage. The grid impedance and the inverter-side inductance have been
modeled as a series RL branch with values 0.4 Ω and 2.5 mH. The IGBT+Diode H-bridge is modeled
by means of the Universal Bridge block of the SimPowerSystems Blockset. The dc filtering
stage consists of one 550 µF capacitor and it is connected to the dc motor windings, which
are connected in series. The dc machine is modeled as a separately excited dc machine by
means of the DC Machine block. The measured variables in this model are, at the dc motor
side, the motor speed, the output voltage of the power converter (across the dc capacitor) and
the output current (flowing through the dc motor); at the electrical grid side, the grid voltage
and line current waveforms are also measured.
The inner structure of the employed controller is shown in Fig. 11. The power signals
employed for control purposes (voltage across the dc capacitor, grid voltage and line current)
are low-pass filtered in order to avoid aliasing due to the sampling process. The PLL block
generates a sinusoidal signal, with unitary amplitude, which is employed to evaluate the
reference current (applied to port iGrid* in the current controller). The proportional-integral (PI)
block with Kp = Ki = 0.4, employed in case of the SPLL-based model, evaluates the amplitude
of this reference current in order to maintain the dc bus voltage at the reference value, in this
case 450 V. In order to compare the obtained results, the same reference voltage is employed
in case of the WPLL-based model, where the Haar wavelet with five decomposition levels, that
[Fig. 12: a) dc bus voltage Vdc (V) versus time (0–3.5 s) during the starting transient; b) steady-state line current ig (A) versus time (9.8–10 s) for the SPLL- and WPLL-based controllers.]
starting transient, are 3.72 s. This is due to the fact that the reference current amplitude is obtained,
in both cases, by applying the same PI controller. The line current waveforms, once the dc
motor reaches the steady state in both models (with the conventional SPLL and the WPLL),
are shown in Fig. 12.b. The measured current THDs are 0.67% and 0.55%, corresponding to the
conventional SPLL and the proposed WPLL respectively.
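A current THD figure such as those quoted above can be estimated by correlating the steady-state waveform with sinusoids at harmonic frequencies over an integer number of fundamental periods. The sketch below is a generic estimator (the harmonic count of 20 is an assumption), not the measurement block used in the simulation model:

```python
import math

def thd_percent(x, fs, f0, n_harm=20):
    # THD (%) from harmonic magnitudes obtained by correlating the signal
    # with sin/cos at multiples of f0; x must span an integer number of
    # fundamental periods for the estimates to be leakage-free.
    def mag(f):
        c = sum(v * math.cos(2*math.pi*f*n/fs) for n, v in enumerate(x))
        s = sum(v * math.sin(2*math.pi*f*n/fs) for n, v in enumerate(x))
        return 2 * math.hypot(c, s) / len(x)
    h1 = mag(f0)
    return 100 * math.sqrt(sum(mag(k*f0)**2 for k in range(2, n_harm+1))) / h1

# 10 periods of a 50 Hz current with a 1% 3rd harmonic, sampled at 6.4 kHz
i_g = [math.sin(2*math.pi*50*n/6400.0) + 0.01*math.sin(2*math.pi*150*n/6400.0)
       for n in range(1280)]
thd = thd_percent(i_g, 6400.0, 50.0)  # close to 1.0 (%)
```

Applying such an estimator to a window of the simulated line current, once the motor has reached the steady state, yields the percentage figures used to compare the SPLL- and WPLL-based controllers.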
6. Conclusions
This book chapter presents an application of wavelets in grid-connected power converters.
The proposed approach allows wavelet-based software phase locked loops (SPLLs) to be
developed and implemented, replacing the proportional-integral controller in conventional
SPLLs. The proposed approach results in a more flexible synchronization subsystem whose
characteristics can be adjusted depending on the electrical grid disturbances. Simulation
results, comparing the performance of the conventional SPLL and the proposed wavelet-based
one, are given for a grid-connected power converter feeding a dc motor.
7. References
Bose, B. K. (2006). Power Electronics and Motor Drives: Advances and Trends, Academic Press.
Bose, B. K. (2009). The past, present, and future of power electronics, IEEE Industrial Electronics
Magazine 3(2): 7–11, 14.
Bueno, E. J., Cobreces, S., Rodriguez, F. J., Hernandez, A. & Espinosa, F. (2008). Design of a
back-to-back NPC converter interface for wind turbines with squirrel-cage induction
generator, IEEE Transactions on Energy Conversion 23(3): 932–945.
Bueno, E. J., Hernandez, A., Rodriguez, F. J., Giron, C., Mateos, R. & Cobreces, S. (2009). A
DSP- and FPGA-based industrial control with high-speed communication interfaces for
grid converters applied to distributed power generation systems, IEEE Transactions
on Industrial Electronics 56(3): 654–669.
Carrasco, J. M., Franquelo, L. G., Bialasiewicz, J. T., Galvan, E., Guisado, R. C. P., Prats, M.
A. M., Leon, J. I. & Moreno-Alfonso, N. (2006). Power-electronic systems for the grid
integration of renewable energy sources: A survey, IEEE Transactions on Industrial
Electronics 53(4): 1002–1016.
Castro, R. M. & Diaz, H. N. (2002). An overview of wavelet transform application in power
systems, 14th Power Systems Computation Conference, pp. 6.1–6.9.
Chaari, O., Meunier, M. & Brouaye, F. (1996). Wavelets: a new tool for the resonant
grounded power distribution systems relaying, IEEE Transactions on Power Delivery
11(3): 1301–1308.
Dannehl, J., Fuchs, F. W. & Thøgersen, P. B. (2010). PI state space current control of
grid-connected PWM converters with LCL filters, IEEE Transactions on Power Electronics
25(9): 2320–2330.
De Mango, F., Liserre, M. & D'Aquila, A. (2006). Overview of anti-islanding algorithms for PV
systems. Part II: Active methods, 12th International Power Electronics and Motion Control
Conference, EPE-PEMC 2006, pp. 1884–1889.
De Mango, F., Liserre, M., D'Aquila, A. & Pigazo, A. (2006). Overview of anti-islanding
algorithms for PV systems. Part I: Passive methods, 12th International Power Electronics
and Motion Control Conference, EPE-PEMC 2006, pp. 1878–1883.
Kazmierkowski, M. P., Krishnan, R. & Blaabjerg, F. (eds) (2002). Control in Power Electronics:
Selected Problems, Academic Press.
Knezevic, J., Katic, V. & Graovac, D. (2000). Transient analysis of ac/dc converters input
waveforms using wavelet, Proceedings of the Mediterranean Electrotechnical Conference,
Vol. 3, pp. 1193–1196.
Koizumi, H., Mizuno, T., Kaito, T., Noda, Y., Goshima, N., Kawasaki, M., Nagasaka, K.
& Kurokawa, K. (2006). A novel microcontroller for grid-connected photovoltaic
systems, IEEE Transactions on Industrial Electronics 53(6): 1889–1897.
Kojabadi, H. M., Yu, B., Gadoura, I. A., Chang, L. & Ghribi, M. (2006). A novel DSP-based
current-controlled PWM strategy for single phase grid connected inverters, IEEE
Transactions on Power Electronics 21(4): 985–993.
Liserre, M., Sauter, T. & Hung, J. Y. (2010). Future energy systems: Integrating renewable
energy sources into the smart power grid through industrial electronics, IEEE
Industrial Electronics Magazine 4(1): 18–37.
Liserre, M., Teodorescu, R. & Blaabjerg, F. (2006). Multiple harmonics control for three-phase
grid converter systems with the use of PI-RES current controller in a rotating frame,
IEEE Transactions on Power Electronics 21(3): 836–841.
McGrath, B. P., Holmes, D. G. & Galloway, J. J. H. (2005). Power converter line synchronization
using a discrete Fourier transform (DFT) based on a variable sample rate, IEEE
Transactions on Power Electronics 20(4): 877–884.
Mercorelli, P., Kubasiak, N. & Liu, S. (2004). Model predictive control of an electromagnetic
actuator fed by multilevel PWM inverter, IEEE International Symposium on Industrial
Electronics, Vol. 1, pp. 531–535.
Mohamed, Y. A.-R. & El-Saadany, E. F. (2008). Adaptive discrete-time grid-voltage sensorless
interfacing scheme for grid-connected DG-inverters based on neural-network
identification and deadbeat current regulation, IEEE Transactions on Power Electronics
23(1): 308–321.
Mohan, N., Undeland, T. M. & Robbins, W. P. (2003). Power Electronics: Converters, Applications
and Design, John Wiley & Sons.
Moreno, V. M., Liserre, M., Pigazo, A. & Dell'Aquila, A. (2007). A comparative analysis
of real-time algorithms for power signal decomposition in multiple synchronous
reference frames, IEEE Transactions on Power Electronics 22(4): 1280–1289.
Parvez, S. & Gao, Z. (2005). A wavelet-based multiresolution PID controller, IEEE Transactions
on Industry Applications 41(2): 537–543.
Pham, V. L. & Wong, K. P. (1999). Wavelet-transform-based algorithm for harmonic
analysis of power system waveforms, IEE Proceedings - Generation, Transmission and
Distribution 146(3): 249–254.
Pham, V. L., Wong, K. P. & Arrillaga, J. (2000). Sub-harmonics state estimation in power
system, IEEE Power Engineering Society Winter Meeting, Vol. 2, pp. 1168–1173.
Pigazo, A., Liserre, M., Mastromauro, R. A., Moreno, V. M. & Dell'Aquila, A. (2009).
Wavelet-based islanding detection in grid-connected PV systems, IEEE Transactions
on Industrial Electronics 56(11): 4445–4455.
Pigazo, A., Moreno, V. M., Liserre, M. & Dell'Aquila, A. (2007). Wavelet-based islanding
detection algorithm for single-phase photovoltaic (PV) distributed generation
systems, IEEE International Symposium on Industrial Electronics, pp. 2409–2413.
Ponci, F., Santi, E. & Monti, A. (2009). Discrete-time multi-resolution modeling of switching
power converters using wavelets, Simulation 85(2): 69–88.
service providers – technical and economic issues, IEEE Transactions on Energy
Conversion 24(3): 661–672.
Vainio, O. & Ovaska, S. (1995). Noise reduction in zero crossing detection by predictive digital
filtering, IEEE Transactions on Industrial Electronics 42(1): 58–62.
Valiviita, S. (1999). Zero-crossing detection of distorted line voltages using 1-b measurements,
IEEE Transactions on Industrial Electronics 46(5): 917–922.
Weiss, G., Zhong, Q.-C., Green, T. C. & Liang, J. (2004). H∞ repetitive control of DC-AC converters
in microgrids, IEEE Transactions on Power Electronics 19(1): 219–230.
Wilkinson, W. A. & Cox, M. D. (1996). Discrete wavelet analysis of power system transients,
IEEE Transactions on Power Systems 11(4): 2038–2044.
Yao, S. J., Song, Y. H., Zhan, L. Z. & Cheng, X. Y. (2000). Wavelet transform and neural
networks for short-term electrical load forecasting, Energy Conversion and Management
41(18): 1975–1988.
Zhen, R., Qungu, H., Lin, G. & Weniying, H. (2000). A new method for power system
frequency tracking based on trapezoid wavelet transform, International Conference on
Advances in Power System Control, Operation and Management, Vol. 2, pp. 364–369.
12
Discrete Wavelet Transform Based Wireless
Digital Communication Systems
Prof. Ali A. A., MIEEE, MComSoc
There has been a paradigm shift in mobile communications systems every decade. Now, just
coming into the new century, it is a good time to start discussions on the fourth generation
(4G) systems, which may be in service around 2010. For systems beyond 3G, there may be a
requirement for a new wireless access technology for the terrestrial component [1]. It is
envisaged that these potential new radio interfaces will support up to approximately
100 Mbps for high mobility and up to 1 Gbps for low mobility, such as nomadic use, leading
to the 4th generation system. These data rate figures are targets for research and investigation
of the basic technologies necessary to implement the vision. The future system specification
and design will be based on the results of this research and these investigations.
Due to the high rate requirements, additional spectrum will be needed for the new
capabilities beyond International Mobile Telecommunications-2000 (IMT-2000). In
conjunction with the future development of IMT-2000 and systems beyond IMT-2000, there
will be an increasing relationship between radio access and communication systems such as
wireless Personal Area Networks (PANs), Local Area Networks (LANs), digital broadcast,
and fixed wireless access.
In the discussion about 2G systems in the 1980s, two candidates for the radio access
technique existed: Time Division Multiple Access (TDMA) and Code Division Multiple
Access (CDMA). In the discussion about 3G systems, Orthogonal Frequency Division
Multiplexing (OFDM) appeared in the 1990s, gained a lot of attention, and is a potential
candidate for 4G systems. OFDM is very efficient in spectrum usage and is very effective in
a frequency selective channel. A variation of OFDM which allows multiple access is
Multi-Carrier CDMA (MC-CDMA), which is essentially an OFDM technique where the individual
data symbols are spread using a spreading code in the frequency domain. The inherent
processing gain due to the spreading helps in interference suppression in addition to
providing high data rates. OFDM is already the technique used in Digital Audio and Video
Broadcasting (DAB, DVB) and in WLANs of the 802.11 family, and is believed to be the technique
for future broadband wireless access [2]. The present third generation (3G) systems can provide
a maximum data rate of 2 Mbps in an indoor environment, which is far less than what is needed
for the currently evolving multimedia applications requiring very high bandwidth.
This has led researchers worldwide to the evolution of 4G systems that are expected
to provide a data rate ranging from 20 Mbps to 100 Mbps on the air interface. The reader can
easily understand why OFDM is suited for 4G systems from the following justifications:
- Multicarrier techniques can combat the hostile frequency selective fading encountered in
mobile communications. This robustness against frequency selective fading is very attractive,
especially for high-speed data transmission [3].
- The OFDM scheme has matured well through research and development for high-rate
wireless LANs and terrestrial digital video broadcasting.
- Combining OFDM with CDMA has a synergistic effect, such as enhanced robustness
against frequency selective fading and high scalability in the possible data
transmission rate.
OFDM can provide higher data rates and is therefore a very good choice for service providers
competing with wire-line carriers [3]. The CDMA scheme is robust to frequency selective fading
and has been successfully introduced in commercial cellular mobile communications systems
such as Interim Standard-95 (IS-95) and 3G systems. Combining multi-carrier OFDM transmission
with Code Division Multiple Access (CDMA) allows us to exploit the wideband channel's
inherent frequency diversity by spreading each symbol across multiple carriers.
Although OFDM is robust to frequency selective fading, it has severe disadvantages in
sub-carrier synchronization and is sensitive to errors in frequency offset estimation. Another
drawback is that the presence of a large number of sub-carriers results in a non-constant
signal envelope. The combination of OFDM and CDMA has one major advantage, though:
it lowers the symbol rate on each sub-carrier compared to plain OFDM, and the longer
symbol duration makes synchronization easier. MC-CDMA not only mitigates
Inter-Symbol Interference (ISI) but also exploits the multipath. MC-CDMA suffers only
slightly in the presence of interference, as opposed to Direct Sequence-CDMA (DS-CDMA),
whose performance decreases significantly in the presence of interference [4].
In the second section of this chapter, the theory of the wavelet transform (with special
concentration on the discrete wavelet transform) is presented in a simple and
comprehensive manner, sufficient for the formulation of the following sections, where
wavelet-based wireless digital communication systems are discussed. Performance
comparisons of Fourier- and wavelet-based communication systems over different channel
models are also presented.
2. Wavelet transform
Any general signal can be decomposed into wavelets, i.e., the original function is
synthesized by adding elementary building blocks of constant shape but different size and
amplitude. In this approach, one can design a set of basis functions by choosing a proper
basic wavelet \psi(t) (the mother wavelet) and using delayed and scaled versions of it. The most
important properties of wavelets are the admissibility and the regularity conditions, and
these are the properties which gave wavelets their name. It can be shown [5] that square
integrable functions \psi(t) satisfying the admissibility condition

\int \frac{|\Psi(\omega)|^2}{|\omega|}\, d\omega < +\infty \qquad (1)

can be used to first analyze and then reconstruct a signal without loss of information. In (1),
\Psi(\omega) stands for the Fourier transform of \psi(t). The admissibility condition implies that the
Fourier transform of \psi(t) vanishes at zero frequency, i.e.
|\Psi(\omega)|^2 \big|_{\omega = 0} = 0 \qquad (2)

A zero at the zero frequency also means that the average value of the wavelet in the time
domain must be zero, and therefore it must be oscillatory. In other words, \psi(t) must be a
wave. The reconstruction or inverse transformation is possible whenever \Psi(\omega) is of finite
energy and band-pass (i.e., \psi(t) oscillates in time like a short wave). These are the regularity
conditions: they state that the wavelet function should have some smoothness and
concentration in both the time and frequency domains. For sufficiently regular \psi(t), the
reconstruction condition is:

\int \psi(t)\, dt = 0 \qquad (3)

Summarizing, the admissibility condition gives us the wave; regularity and vanishing
moments give us the fast decay, or the let; and together they give us the wavelet.
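As a quick numerical sketch (not from the chapter), the Haar wavelet illustrates conditions (2) and (3): it takes the value +1 on [0, 0.5) and -1 on [0.5, 1), so its time-domain average is zero and its discrete Fourier transform vanishes at the zero-frequency bin.

```python
import numpy as np

# Haar wavelet psi(t): +1 on [0, 0.5), -1 on [0.5, 1).
t = np.linspace(0.0, 1.0, 1024, endpoint=False)
psi = np.where(t < 0.5, 1.0, -1.0)

mean_value = psi.mean()                 # average value in the time domain, Eq. (3)
spectrum = np.fft.fft(psi)              # discrete Fourier transform of psi
dc_value = abs(spectrum[0]) / len(psi)  # |Psi(0)|, Eq. (2)

print(mean_value)   # 0.0: psi is oscillatory, a "wave"
print(dc_value)     # 0.0: the spectrum vanishes at zero frequency
```

The same check applied to the scaling function (a constant pulse) fails, which is exactly why the scaling function is not itself a wavelet.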
2.1 The discrete wavelet transform
Under the reconstruction condition (3), the continuously labeled basis functions (wavelets)
\psi_{j,k}(t) behave in wavelet analysis and synthesis just like an orthonormal basis. By
appropriately discretizing the time-scale parameters \tau and s, and choosing the right mother
wavelet \psi(t), it is possible to obtain a true orthonormal basis. The natural way is to
discretize the scaling variable s in a logarithmic manner, s = s_0^j, and to use the Nyquist
sampling rule, based on the spectrum of the function x(t), to discretize \tau at any given scale,
\tau = k\, s_0^{-j} T. The resulting wavelet functions are then:

\psi_{j,k}(t) = s_0^{j/2}\, \psi(s_0^{j} t - k \tau_0) \qquad (4)

If s_0 is close enough to one and if T is small enough, then the wavelet functions are
overcomplete and signal reconstruction takes place under non-restrictive conditions on \psi(t).
On the other hand, if the sampling is sparse, e.g., the computation is done octave by octave
(s_0 = 2), a true orthonormal basis is obtained only for very special choices of \psi(t).
Based on the assumption that the wavelet functions are orthonormal:

\int \psi_{j,k}(t)\, \psi_{m,n}(t)\, dt = \begin{cases} 1, & \text{if } j = m \text{ and } k = n \\ 0, & \text{otherwise} \end{cases} \qquad (5)
For discrete-time cases, equation (4) is generally used with s_0 = 2, so that the computation is
done octave by octave. In this case, the basis for a wavelet expansion system is generated from
simple scaling and translation. The generating wavelet or mother wavelet, represented
by \psi(t), results in the following two-dimensional parameterization of \psi_{j,k}(t):

\psi_{j,k}(t) = 2^{j/2}\, \psi(2^{j} t - k) \qquad (6)

The 2^{j/2} factor in equation (6) normalizes each wavelet to maintain a constant norm
independent of the scale j. In this case, the discretizing period in \tau is normalized to one and is
assumed to be the same as the sampling period of the discrete signal, \tau = k\, 2^{-j}. All
useful wavelet systems satisfy the multiresolution conditions. In this case, the lower
resolution coefficients can be calculated from the higher resolution coefficients by a
tree-structured algorithm called a filter bank [6]. In the wavelet transform literature, this
approach is referred to as the discrete wavelet transform (DWT).
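The normalization in equation (6) can be checked numerically. The sketch below (assuming the Haar mother wavelet; any valid mother wavelet would do) verifies that the 2^{j/2} factor keeps every dyadic wavelet \psi_{j,k} at unit L2 norm, independent of the scale j.

```python
import numpy as np

def haar(t):
    """Haar mother wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    return (np.where((t >= 0) & (t < 0.5), 1.0, 0.0)
            - np.where((t >= 0.5) & (t < 1.0), 1.0, 0.0))

t = np.linspace(-2.0, 4.0, 600_001)
dt = t[1] - t[0]

norms = []
for j in (0, 1, 2, 3):
    psi_jk = 2 ** (j / 2) * haar(2 ** j * t - 1)     # Eq. (6) with k = 1
    nm = float(np.sqrt(np.sum(psi_jk ** 2) * dt))    # numerical L2 norm
    norms.append(nm)
    print(j, round(nm, 3))                            # ~1.0 at every scale j
```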
2.1.1 The scaling function
The multiresolution idea is better understood by using a function represented by \varphi(t) and
referred to as the scaling function. A two-dimensional family of functions is generated,
similarly to (6), from the basic scaling function by [7]:

\varphi_{j,k}(t) = 2^{j/2}\, \varphi(2^{j} t - k) \qquad (7)

Any continuous function f(t) can be represented, at a given resolution or scale j_0, by a
sequence of coefficients given by the expansion:

f_{j_0}(t) = \sum_k f_{j_0}[k]\, \varphi_{j_0,k}(t) \qquad (8)

In other words, the sequence f_{j_0}[k] is the set of samples of the continuous function f(t) at
resolution j_0. Higher values of j correspond to higher resolution. Discrete signals are
assumed to be samples of continuous signals at known scales or resolutions; in this case, it is
not possible to obtain information about higher resolution components of that signal. The main
required property is the nesting of the spaces spanned by the scaling functions. In other
words, for any integer j, the functional space spanned by [8]

\{\varphi_{j,k}(t);\ k = 1, 2, \ldots\} \qquad (9)

is contained in the functional space spanned by

\{\varphi_{j+1,k}(t);\ k = 1, 2, \ldots\} \qquad (10)

This nesting requires that the scaling function can be expressed in terms of its own
half-scale translates:

\varphi(t) = \sum_k h(k)\, \sqrt{2}\, \varphi(2t - k) \qquad (11)

The set of coefficients h(k) are the scaling function coefficients, and \sqrt{2} maintains the
norm of the scaling function with a scale of two. \varphi(t) is the scaling function which
satisfies this equation, which is sometimes called the refinement equation, the dilation
equation, or the multiresolution analysis (MRA) equation.
The wavelet function is defined in terms of the scaling function in the same manner:

\psi(t) = \sum_k g(k)\, \sqrt{2}\, \varphi(2t - k) \qquad (12)

The set of coefficients g(k) is called the wavelet function coefficients (or the wavelet filter).
It can be shown that the wavelet coefficients are required by orthogonality to be related to the
scaling function coefficients by [9, 10]:

g(k) = (-1)^{k}\, h(1 - k) \qquad (13)

g(k) = (-1)^{k}\, h(N - 1 - k) \qquad (14)
The function generated by equation (12) gives the prototype or mother wavelet \psi(t) for a
class of expansion functions of the form shown in equation (6). For example, the Haar scaling
function is the simple unit-width, unit-height pulse function \varphi(t) shown in Fig. (1a) [7], and
it is obvious that \varphi(2t) can be used to construct \varphi(t) by:

\varphi(t) = \varphi(2t) + \varphi(2t - 1) \qquad (15)

which means (11) is satisfied for the coefficients h(0) = 1/\sqrt{2}, h(1) = 1/\sqrt{2}.

The Haar wavelet function associated with the scaling function in Fig. (1a) is shown in
Fig. (1b). For the Haar wavelet, the coefficients in equation (14) are g(0) = 1/\sqrt{2},
g(1) = -1/\sqrt{2}, so that

\varphi(t) = \varphi(2t) + \varphi(2t - 1), \qquad \psi(t) = \varphi(2t) - \varphi(2t - 1)

Fig. 1. (a) Haar scaling function, (b) Haar wavelet function.
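The Haar relations above can be verified directly on a sampling grid. This sketch checks the two-scale relation (15), builds the Haar wavelet from the scaling function, and confirms that the rule g(k) = (-1)^k h(N-1-k) of equation (14) reproduces the Haar wavelet filter:

```python
import numpy as np

def phi(t):
    """Haar scaling function: unit-width, unit-height pulse on [0, 1)."""
    return np.where((t >= 0) & (t < 1), 1.0, 0.0)

t = np.linspace(-0.5, 1.5, 4001)
lhs_phi = phi(t)
rhs_phi = phi(2 * t) + phi(2 * t - 1)      # Eq. (15): two-scale relation
psi = phi(2 * t) - phi(2 * t - 1)          # Haar wavelet, Fig. (1b)

print(np.max(np.abs(lhs_phi - rhs_phi)))   # 0.0: Eq. (15) holds on the grid

# Haar filters: h = (1/sqrt(2), 1/sqrt(2)); Eq. (14) with N = 2 gives g.
h = np.array([1.0, 1.0]) / np.sqrt(2)
N = len(h)
g = np.array([(-1) ** k * h[N - 1 - k] for k in range(N)])
print(g * np.sqrt(2))                      # [ 1. -1.]: the Haar wavelet filter
```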
Any function f(t) can be written as a series expansion in terms of the scaling function and
wavelets by [11]:

f(t) = \sum_{k=-\infty}^{\infty} a_{j_0}(k)\, \varphi_{j_0,k}(t) + \sum_{j=j_0}^{\infty} \sum_{k=-\infty}^{\infty} b_j(k)\, \psi_{j,k}(t) \qquad (16)

In this expansion, the first summation gives a function that is a low resolution or coarse
approximation of f(t) at scale j_0. For each increasing j in the second summation, a higher or
finer resolution function is added, which adds increasing detail. The choice of j_0 sets the
coarsest scale, whose space is spanned by \varphi_{j_0,k}(t). The rest of the function is spanned by
the wavelets, providing the high-resolution details of the function. The set of coefficients in
the wavelet expansion of equation (16) is called the discrete wavelet transform
(DWT) of the function f(t).
These wavelet coefficients, under certain conditions, can completely describe the original
function, and in a way similar to Fourier series coefficients, can be used for analysis,
description, approximation, and filtering. If the scaling function is well behaved, then at a
high scale, samples of the signal are very close to the scaling coefficients. As mentioned
before, for well-behaved scaling or wavelet functions, the samples of a discrete signal can
approximate the highest achievable scaling coefficients.
It can be shown that the scaling and wavelet coefficients at scale j are related to the scaling
coefficients at scale (j + 1) by the following two relations:

a_j(k) = \sum_m h(m - 2k)\, a_{j+1}(m) \qquad (17)

b_j(k) = \sum_m g(m - 2k)\, a_{j+1}(m) \qquad (18)
The complex envelope of an OFDM symbol carrying the data symbols a[k] on N sub-carriers
can be written as:

x_a(t) = \sum_{k=0}^{N-1} a[k]\, e^{j 2\pi k f_0 t}, \qquad 0 \le t \le T_u \qquad (19)

Sampling this signal at t = n T_u / N gives:

x_a\!\left(\frac{n}{N} T_u\right) = \sum_{k=0}^{N-1} a[k]\, e^{j 2\pi k f_0 n T_u / N} \qquad (20)

If the orthogonality condition

f_0 = \frac{1}{T_u} \qquad (21)

is satisfied, then the multi-carriers are orthogonal to each other and equation (20) can be
rewritten as:

x_a[n] = \sum_{k=0}^{N-1} a[k]\, e^{j 2\pi n k / N} \qquad (22)
One of the major advantages of OFDM is that the modulation can be performed in the
discrete domain using an Inverse Discrete Fourier Transform (IDFT) or, more
computationally efficiently, an inverse Fast Fourier Transform (IFFT). Since the above
equation is just the IDFT of the input symbol stream {a[k]}, equation (22) can be rewritten
as [14]:

x_a[n] = N \cdot \mathrm{IDFT}\{a[k]\} \qquad (23)
At the receiver, the DFT implementation to recover the estimated symbols \hat{a}[k] can be
written as:

\hat{a}[k] = \mathrm{DFT}\{x_a[n]\} \qquad (24)
       = \frac{1}{N} \sum_{n=0}^{N-1} x_a[n]\, e^{-j 2\pi n k / N}
       = \frac{1}{N} \sum_{n=0}^{N-1} \sum_{m=0}^{N-1} a[m]\, e^{j 2\pi n (m-k)/N}
       = \frac{1}{N} \sum_{m=0}^{N-1} a[m] \sum_{n=0}^{N-1} e^{j 2\pi n (m-k)/N}
       = \frac{1}{N} \sum_{m=0}^{N-1} a[m]\, N\, \delta[m-k]
       = a[k]

Here \delta[m-k] is the delta function, defined as:

\delta[n] = \begin{cases} 1, & \text{if } n = 0 \\ 0, & \text{otherwise} \end{cases}
From the derivation above, two important features of the OFDM technique can be observed:
1. Each sub-carrier has a different center frequency. These frequencies are chosen so that
the following integral over a symbol period is satisfied:

\int_0^{T_u} a_m e^{j \omega_m t} \left( a_l e^{j \omega_l t} \right)^* dt = 0, \qquad m \ne l \qquad (25)

The sub-carrier signals in an OFDM system are therefore mathematically orthogonal to each
other. The sub-carrier pulse used for transmission is chosen to be rectangular, which leads
to a sin(x)/x type of spectrum. The spectrum of three adjacent OFDM sub-carriers is
illustrated in Fig. (5). The spectra of the sub-carriers overlap each other, so an OFDM
communication system has high spectral efficiency. Maintaining the orthogonality of the
sub-carriers is very important in an OFDM system, and requires the transmitter and
receiver to be in perfect synchronization [12].
2. The IDFT and DFT can be exploited to realize OFDM modulation and demodulation
instead of filter banks in the transmitter and receiver, which lowers the system
implementation complexity and cost. This feature is attractive for practical use, since the
IFFT and FFT algorithms can be used to compute the IDFT and DFT efficiently, further
reducing complexity and improving the system running speed.
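A minimal sketch of this discrete modulator/demodulator pair (assumed QPSK symbols and N = 64, with no channel) shows that the IFFT/FFT pair of equations (22)-(24) recovers the transmitted symbols exactly:

```python
import numpy as np

N = 64
rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=(N, 2))
# Map bit pairs to unit-energy QPSK symbols a[k].
a = ((2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1)) / np.sqrt(2)

x = np.fft.ifft(a) * N        # Eq. (23): OFDM modulation, x[n] = N * IDFT{a[k]}
a_hat = np.fft.fft(x) / N     # Eq. (24): demodulation by the DFT

print(np.max(np.abs(a_hat - a)))   # ~0: perfect recovery, a_hat[k] = a[k]
```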
bits. The first diagram represents the serial data stream. After the serial-to-parallel
transformation, each bit lies at one of the inputs of the IFFT unit for the duration
T_u = 5T_b and generates a sub-signal. The frequencies of the individual sub-signals are
integer multiples of f_0 = 1/T_u; they are therefore orthogonal to one another [15].
3.1.2 Guard interval
One of the most important properties of OFDM transmission is its robustness against
multipath delay spread. This is achieved by having a long symbol period, which minimizes
the inter-symbol interference. The level of robustness can be increased even further by the
addition of a guard period between transmitted symbols. The guard period allows time
for multipath signals from the previous symbol to die away before the information from the
current symbol is gathered [16].
As long as the multipath delay echoes stay within the guard period duration, there is strictly
no limitation regarding the signal level of the echoes: they may even exceed the signal level
of the shorter path. The signal energy from all paths simply adds at the input to the receiver,
and since the FFT is energy conservative, the whole available power feeds the decoder. If the
delay spread is longer than the guard interval, the echoes begin to cause inter-symbol
interference. However, provided the echoes are sufficiently small, they do not cause
significant problems; this is true most of the time, as multipath echoes delayed longer than
the guard period will have been reflected off very distant objects. Several types of guard
interval are possible, such as the cyclic prefix (CP), zero padding, and other variations.
3.1.3 Cyclic prefix
The most effective guard period to use is a cyclic extension of the symbol, see Fig. (7). If a
copy of the end of the symbol waveform is put at the start of the symbol as the guard
period, this effectively extends the length of the symbol while maintaining the
orthogonality of the waveform. Using this cyclically extended symbol, the samples required
for performing the FFT can be taken anywhere over the length of the symbol. This provides
multipath immunity as well as symbol time synchronization tolerance.

Fig. 7. Consecutive OFDM symbols in time, each preceded by a guard interval of duration Tg.
eliminate ISI. If the number of padded zeros equals the cyclic prefix length, then ZP-OFDM
and CP-OFDM transmissions have the same spectral efficiency.
Other types of guard interval are possible. One option is to use a cyclic prefix of the
symbol for half the guard period, as in the cyclic prefix type, and zero padding, as above,
for the other half [16].
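The value of the cyclic prefix can be demonstrated numerically. In this sketch (hypothetical parameters: N = 64 sub-carriers, guard length 16, a 3-tap multipath channel), the prefix turns the channel's linear convolution into a circular one over the FFT window, so a single one-tap equalizer per sub-carrier recovers the symbols exactly:

```python
import numpy as np

N, G = 64, 16
rng = np.random.default_rng(1)
a = rng.choice([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j], size=N) / np.sqrt(2)  # QPSK

x = np.fft.ifft(a)
x_cp = np.concatenate([x[-G:], x])        # guard = copy of the symbol's tail
ch = np.array([0.8, 0.0, 0.5 - 0.3j])     # multipath shorter than the guard
y = np.convolve(x_cp, ch)[:N + G]         # channel output for this symbol
y = y[G:]                                 # receiver discards the guard interval
a_hat = np.fft.fft(y) / np.fft.fft(ch, N) # one-tap equalizer per sub-carrier

print(np.max(np.abs(a_hat - a)))          # ~0: ISI fully absorbed by the prefix
```

Removing the prefix (G = 0) makes the convolution linear rather than circular, and the same equalizer no longer recovers the symbols.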
3.2 Synchronization of OFDM systems
Synchronization is a major hurdle in OFDM. It usually consists of three parts, as
follows:
3.2.1 Frame detection
Frame detection is used to determine the symbol boundary so that the correct samples for a
symbol frame can be taken. The sampling starting point T_x at the receiving end must satisfy
the condition \tau_{max} \le T_x \le T_g, where \tau_{max} is the maximum delay spread. Since the previous
symbol only affects samples within [0, \tau_{max}], there is then no ISI [18].
There are many algorithms that can be applied to estimate the start of an OFDM symbol
based on pilots or on the cyclic prefix. A good synchronization method must be fast, have a
reliable indication of the synchronized state and introduce a minimum of redundancy in the
transmitted stream.
Most existing timing algorithms use correlations between repeated OFDM signal portions to
create a timing plateau. Such algorithms are not able to give a precise timing position,
especially when the SNR is low. To improve the robustness of the algorithms, a
differentially coded time-domain PN sequence was used for frame detection in [29]. Because
of its delta-like self-correlation property, the PN sequence makes it possible to find the
precise timing position. The PN sequence is transmitted as part of the OFDM packet
preamble. At the receiver, the received signal samples are correlated with the known
sequence; when the transmitted PN sequence is aligned with the receiver's PN sequence, a
correlation peak is observed, from which the OFDM symbol boundary can be inferred.
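The correlation-peak idea can be sketched as follows (a hypothetical random ±1 sequence stands in for the actual differentially coded preamble of [29]): sliding the known sequence over the noisy received samples produces a sharp peak at the true frame start.

```python
import numpy as np

rng = np.random.default_rng(2)
pn = rng.choice([-1.0, 1.0], size=127)             # known PN preamble
offset = 40                                        # unknown frame start (to detect)
rx = np.concatenate([rng.normal(0, 0.5, offset),   # noise before the frame
                     pn + rng.normal(0, 0.5, 127), # noisy preamble
                     rng.normal(0, 0.5, 60)])      # noise after it

# Sliding correlation of the received samples with the known sequence.
corr = np.array([rx[n:n + 127] @ pn for n in range(len(rx) - 127)])
detected = int(np.argmax(corr))
print(detected)                                    # 40: the frame boundary
```

The peak value is close to 127 (the sequence length), while the sidelobes stay near sqrt(127), which is why the delta-like self-correlation survives low SNR far better than plateau-based metrics.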
3.2.2 Carrier synchronization error
Carrier frequency offset estimation plays an important role in OFDM communication
systems because of their high sensitivity to carrier frequency offsets [19]. Due to the carrier
frequency difference between the transmitter and receiver, each signal sample at time t
contains an unknown phase factor e^{j 2\pi \Delta f t}, where \Delta f is the unknown carrier frequency
offset. This unknown phase factor must be estimated and compensated for each sample before
the FFT at the receiver, since otherwise the orthogonality between sub-carriers is lost.
The impact of a frequency error can be seen as an error in the frequency instants at which the
received signal is sampled during demodulation by the FFT. Fig. (8) depicts this two-fold
effect: the amplitude of the desired sub-carrier is reduced (+), and inter-carrier interference
(ICI) arises from the adjacent sub-carriers (o) [20].
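This two-fold effect is easy to reproduce numerically. In the sketch below (hypothetical values: N = 64, a single active sub-carrier, an offset of 0.1 sub-carrier spacings), each time-domain sample is rotated by the offset phase factor before the FFT; the wanted bin shrinks and energy leaks into the neighbour:

```python
import numpy as np

N = 64
a = np.zeros(N, dtype=complex)
a[10] = 1.0                        # single active sub-carrier
x = np.fft.ifft(a) * N
n = np.arange(N)

results = {}
for df in (0.0, 0.1):              # carrier offset in sub-carrier spacings
    y = x * np.exp(1j * 2 * np.pi * df * n / N)   # uncompensated phase factor
    A = np.fft.fft(y) / N
    results[df] = (abs(A[10]), abs(A[11]))
    # With df = 0 the wanted bin is 1 and the neighbour 0; with df = 0.1
    # the wanted amplitude drops below 1 and ICI appears in bin 11.
    print(df, round(abs(A[10]), 3), round(abs(A[11]), 3))
```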
3.2.3 Sampling error correction
Because of the sampling clock difference between the transmitter and receiver, each signal
sample is off from its correct sampling time by a small amount which increases linearly
with the index of the sample. For example, for a 100 ppm crystal offset, it will be off by 1
The time-variant impulse response of the multipath channel can be written as a sum of N_p
discrete paths:

h(\tau, t) = \sum_{p=0}^{N_p - 1} a_p\, e^{j(2\pi f_{d,p} t + \theta_p)}\, \delta(\tau - \tau_p) \qquad (27)

where

\delta(\tau - \tau_p) = \begin{cases} 1, & \text{if } \tau = \tau_p \\ 0, & \text{otherwise} \end{cases} \qquad (28)

and a_p, f_{d,p}, \theta_p, and \tau_p are the amplitude, the Doppler frequency, the phase, and the
propagation delay, respectively, associated with path p, p = 0, 1, 2, \ldots, N_p - 1. A
channel impulse response with the corresponding channel transfer function is illustrated in
Fig. (10), while Fig. (11) is a block diagram representation of a fading channel with two
paths, i.e., with two rays.
Fig. 10. Time-variant channel impulse response and channel transfer function with
frequency-selective fading
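The tapped-delay-line model of equation (27) can be sketched directly in code. All parameters below (sample rate, two paths, their amplitudes, Doppler frequencies, phases, and delays) are hypothetical illustration values, not taken from the chapter:

```python
import numpy as np

fs = 1.0e6                              # sample rate, Hz (assumed)
a_p   = np.array([1.0, 0.5])            # path amplitudes a_p
f_dp  = np.array([30.0, -20.0])         # path Doppler frequencies f_{d,p}, Hz
th_p  = np.array([0.0, np.pi / 4])      # path phases theta_p
tau_p = np.array([0, 3])                # path delays tau_p, in samples

def channel(x, t0=0.0):
    """Apply the two-ray time-variant channel of Eq. (27) to a sampled signal."""
    t = t0 + np.arange(len(x)) / fs
    y = np.zeros(len(x), dtype=complex)
    for amp, fd, th, d in zip(a_p, f_dp, th_p, tau_p):
        tap = amp * np.exp(1j * (2 * np.pi * fd * t + th))  # time-variant gain
        y[d:] += tap[d:] * x[:len(x) - d]                   # delayed replica
    return y

y = channel(np.ones(8, dtype=complex))
# The first tau_p[1] samples carry only the direct path (magnitude ~1);
# later samples superpose both rays and their magnitude fluctuates.
print(np.round(np.abs(y), 3))
```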
The corresponding time-variant channel transfer function is:

H(f, t) = \sum_{p=0}^{N_p - 1} a_p\, e^{j(2\pi (f_{d,p} t - f \tau_p) + \theta_p)} \qquad (29)
The delays are measured relative to the first detectable path at the receiver. The Doppler
frequency is given by [22]:

f_{d,p} = \frac{v\, f_c \cos(\alpha_p)}{c} \qquad (30)

It is obvious that f_{d,p} depends on the velocity v of the terminal station, the speed of light c,
the carrier frequency f_c, and the angle of incidence \alpha_p of the wave assigned to path p.
The mean delay and the maximum delay \tau_{max} are characteristic parameters of the delay
power density spectrum. The mean delay is:

\bar{\tau} = \frac{\sum_{p=0}^{N_p-1} \tau_p\, |a_p|^2}{\sum_{p=0}^{N_p-1} |a_p|^2} \qquad (31)

where the term |a_p|^2 in equation (31) represents the power of path p. The RMS delay
spread is defined as:

\tau_{RMS} = \sqrt{\frac{\sum_{p=0}^{N_p-1} \tau_p^2\, |a_p|^2}{\sum_{p=0}^{N_p-1} |a_p|^2} - (\bar{\tau})^2} \qquad (32)
Similarly, the Doppler power density spectrum S(f_d) characterizes the time variance of the
mobile radio channel and gives the average power of the channel output as a function of the
Doppler frequency f_D. The frequency dispersive properties of multipath channels are most
commonly quantified by the maximum occurring Doppler frequency f_{Dmax} and the Doppler
spread f_{Dspread}. The Doppler spread is the bandwidth of the Doppler power density
spectrum and can take on values up to two times f_{Dmax} [25], i.e.:

f_{Dspread} \le 2 f_{Dmax} \qquad (33)
When the channel contains no dominant path, the magnitude of the channel transfer function,

a = a(f, t) = |H(f, t)| \qquad (34)

is Rayleigh distributed:

p(a) = \frac{2a}{\Omega}\, e^{-a^2/\Omega} \qquad (35)

where the mean power is

\Omega = E\{a^2\} \qquad (36)
The phase is uniformly distributed in the interval [0, 2\pi]. In the case that the multipath
channel contains a LOS or dominant component in addition to the randomly moving
scatterers, the channel impulse response can no longer be modeled as zero-mean. Under the
assumption of a complex-valued Gaussian process for the channel impulse response, the
magnitude of the channel transfer function has a Rice distribution given by:

p(a) = \frac{2a}{\Omega}\, e^{-(a^2/\Omega + K_{Rice})}\, I_0\!\left(2a \sqrt{\frac{K_{Rice}}{\Omega}}\right) \qquad (37)

The Rice factor K_{Rice} is determined by the ratio of the power of the dominant path to the
power of the scattered paths. I_0 is the zero-order modified Bessel function. The phase is
uniformly distributed in the interval [0, 2\pi].
Fig. 13. Probability density function of the Ricean distribution; K_{Rice} = -\infty dB corresponds to Rayleigh.
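A numerical sanity check of the Rice density (assuming the form of equation (37), with Omega the scattered power): the pdf integrates to one, and setting K_Rice = 0 reduces it to the Rayleigh density of equation (35), since e^{-K} I_0(0) = 1.

```python
import numpy as np

Omega, K = 1.0, 3.0                       # hypothetical scattered power and Rice factor
a = np.linspace(0.0, 8.0, 80_001)

# Rice pdf of Eq. (37); np.i0 is the zero-order modified Bessel function I_0.
pdf = (2 * a / Omega) * np.exp(-(a**2 / Omega + K)) * np.i0(2 * a * np.sqrt(K / Omega))
total = np.trapz(pdf, a)                  # numerical integral over the support
print(total)                              # ~1.0: a proper probability density

rayleigh = (2 * a / Omega) * np.exp(-a**2 / Omega)   # Eq. (35)
pdf_k0 = (2 * a / Omega) * np.exp(-(a**2 / Omega + 0.0)) * np.i0(0.0 * a)
print(np.max(np.abs(pdf_k0 - rayleigh))) # 0.0: K = 0 gives Rayleigh
```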
4.3 Inter-symbol interference (ISI) and inter-channel interference (ICI)
The delay spread can cause inter-symbol interference (ISI) when adjacent data symbols
overlap and interfere with each other due to different delays on different propagation paths.
The number of interfering symbols in a single-carrier modulated system is given by:

n_{ISI,SC} = \lceil \tau_{max} / T_d \rceil \qquad (38)

For high data rate applications with very short symbol duration T_d < \tau_{max}, the effect of ISI,
and with it the receiver complexity, can increase significantly. The effect of ISI can be
counteracted by different measures, such as time- or frequency-domain equalization.
In spread spectrum systems, rake receivers with several arms are used to reduce the effect of
ISI by exploiting the multi-path diversity such that individual arms are adapted to different
propagation paths.
If the duration of the transmitted symbol is significantly larger than the maximum delay,
T_d >> \tau_{max}, the channel produces a negligible amount of ISI. This effect is exploited in
multi-carrier transmission, where the duration of the transmitted symbol increases with the
number of sub-carriers N_c and, hence, the amount of ISI decreases. The number of
interfering symbols in a multi-carrier modulated system is given by:

n_{ISI,MC} = \lceil \tau_{max} / (N_c\, T_d) \rceil \qquad (39)
Residual ISI can be eliminated by the use of a guard interval. The maximum Doppler spread
in mobile radio applications using single-carrier modulation is typically much smaller than
the distance between adjacent channels, so interference on adjacent channels due to
Doppler spread is not a problem for single-carrier modulated systems. For multi-carrier
modulated systems, the sub-channel spacing F_s can become quite small, so that Doppler
effects can cause significant ICI. As long as all sub-carriers are affected by a common
Doppler shift f_d, this Doppler shift can be compensated for in the receiver and ICI can be
avoided. However, if a Doppler spread on the order of several percent of the sub-carrier
spacing occurs, ICI may degrade the system performance significantly. To avoid
performance degradation due to ICI, more complex receivers with ICI equalization should
be used, and the sub-carrier spacing F_s should be chosen such that:

F_S >> f_{Dmax} \qquad (40)
T_c, then the sub-carrier spacing in one system is 1/T_c and in the other is 1/T_b. The former is
called Multicarrier DS-CDMA (MC-DS-CDMA) and the latter Multi-tone CDMA
(MT-CDMA). The performance of these two schemes has been studied for an uplink
channel in [29]. Hara has shown that MC-CDMA outperforms MC-DS-CDMA and
MT-CDMA in terms of downlink BER performance; MC-CDMA is thus an attractive technique
for the downlink [30]. A simple block diagram of an MC-CDMA system is shown in
Figure (14).
6. Simulation results
In this section, the FFT-based and DWT-based systems are simulated in MATLAB
version 7, and the BER performance is evaluated over different channel models: the AWGN
channel and the flat fading channel. Table (1) shows the system parameters used in the
simulations; the bandwidth used was 10 MHz.
6.1 Performance of DWT-MC-CDMA in AWGN and flat fading channel models
Simulation results of the DWT-based system are shown in Figure (19). It is clearly shown that
DWT-MC-CDMA performs much better than FFT-MC-CDMA. This reflects the fact that the
orthogonal bases of the wavelet are more effective than the orthogonal bases used in
FFT-MC-CDMA.
Table 1. Simulation parameters: QPSK modulation, 64 sub-carriers, AWGN and flat fading channel models.
Fig. 20. BER performance of DWT-MC-CDMA in Flat Fading Channel at Max. Doppler
Shift=5
6.2 Performance of STBC-MC-DS-CDMA systems in AWGN and flat fading channel
models
Simulation results of the STBC-MC-DS-CDMA systems in the AWGN channel are shown in
Figure (21). It is clearly shown that STBC-MC-DS-CDMA based on the DWT performs much
better than STBC-MC-DS-CDMA based on the FFT.
In the flat fading channel simulation, a Doppler frequency of 10 Hz is used. From Figure (22)
it can be seen that, for BER = 10^-4, the SNR required for the DWT-based STBC-MC-DS-CDMA
was about 13 dB, while the FFT-based STBC-MC-DS-CDMA required 25 dB; a gain of 12 dB
for the DWT over the FFT is therefore achieved.
Fig. 22. Performance of STBC-MC-DS-CDMA in a flat fading channel, max. Doppler shift = 10 Hz.
7. Conclusions
The improved performance of the MC-DS-CDMA system using STBC schemes and the
DWT has been investigated. Comparisons of the BER performance of the conventional
FFT-based MC-DS-CDMA, STBC MC-DS-CDMA, and DWT-based STBC MC-DS-CDMA in
the different channel models, together with their comparison for the best achievable BER,
have been presented. Simulation results demonstrate that significant gains can be achieved
by introducing such a combined technique with very little decoding complexity. The
DWT-based STBC MC-DS-CDMA is therefore a feasible way to reach the next generation of
wireless communication for high data rates and demanding applications.
8. References
[1] Shinsuke Hara and Ramjee Prasad, Multicarrier Techniques for 4G Mobile
Communications, Artech House, Boston, London, (2003).
[2] Stott J. H., Explaining Some of the Magic of COFDM, Proceedings of the 20th
International Television Symposium, (1997).
[3] Chuang J., and Sollenberger N. Beyond 3G Wideband Wireless Data Access Based on
OFDM and Dynamic Packet Assignment, IEEE Communication Magazine, vol. 38,
no.7, pp. 78-87, July (2000)
[4] Kaiser S., On the Performance of Different Detection Techniques for OFDM-CDMA in
Fading Channels, IEEE ICC '95, pp. 2059-2063, June (1995).
[5] C. Valens, A Really Friendly Guide to Wavelets, 1999.
[6] A. Graps, An Introduction to Wavelets, IEEE Computational Science and Eng., Vol. 2,
No. 2, 1995.
[7] Goswami J. C., Chan A. K., Fundamentals of Wavelets Theory, Algorithms and
Applications, John Wiley & Sons Ltd. 1999.
[8] Mallat S., A Theory for Multiresolution Signal Decomposition: the Wavelet
Representation, IEEE Pattern Anal. And Machine Intel, vol. 11, no. 7, pp. 674-693.
1989.
[9] V. Strela, G. Strang et al, The Application of Multiwavelet Filter Banks to Image
Processing IEEE Transaction on Image Processing, 1993.
[10] V. Strela, Multiwavelets: Theory and Application, Ph.D Thesis, MIT, June 1996.
[11] H. Steendam and M. Moeneclaey The Effect of Carrier Frequency Offsets on Downlink
and Uplink MC-DS-CDMA, IEEE JOURNAL on Select. Areas in Comm., vol. 19,
no. 12, Dec. 2001.
[12] S. Hara and R. Prasad Multi-Carrier Techniques for 4G Mobile Communications, 1st
Edition, Artech House, Boston, 2003
[13] Hanzo L. et al, OFDM and MC-CDMA for Broadband Multi-User Communications,
WLANs and Broadcasting, John Wiley & Sons, (2003):
CS-Books@wiley.co.uk
[14] J. G. Proakis, Digital Communications, Prentice-Hall, 4th edition, 2004.
[15] Minn H., Bhargava V.K., An Investigation into Time Domain Approach for OFDM
Channel Estimation IEEE Transaction on Broadcasting, Vol. 46, Dec 2000.
[16] Y. Zigang et al., Blind Bayesian Multiuser Receiver for Space-time Coded MC-CDMA
System over Frequency-Selective Fading Channel, IEEE Trans. Veh. Techn.,
vol. VT-40, pp. 781-785, May 2001.
[17] M. Alard and R. Lassalle, "Principles of Modulation and Channel Coding for Digital
Broadcasting for Mobile Receiver," Tech. Rep., no. 224, pp.47-69, Aug. 1987.
[18] P. Frederik and L. Geert, Space-Time Block Coding for Single-Carrier Block
Transmission DS-CDMA Downlink, IEEE Journal on Selected Areas in
Communications, vol. 21, no. 3, pp. 350-361, April 2003.
[19] Hui Liu and Hujun Yin, Receiver Design in Multi-carrier Direct-Sequence CDMA
Communications, IEEE Trans. On Comm., vol. 49, no. 8, Aug. 2001.
[20] H. Steendam and M. Moeneclaey The Effect of Carrier Frequency Offsets on Downlink
and Uplink MC-DS-CDMA, IEEE JOURNAL on Select. Areas in Comm., vol. 19,
no. 12, Dec. 2001.
[21] J. D. Gibson, The Communication Handbook 2nd Edition, Southern Methodist
University Dallas, Texas, 2002.
[22] L. Hanzo, C. H. Wong, M. S. Yee, Adaptive Wireless Transceivers: Turbo-Coded,
Turbo-Equalized and Space-Time Coded TDMA, CDMA and OFDM Systems, John
Wiley & Sons Ltd, 2002.
[23] O. M. Mustaf, Performance Evaluation of a Proposed MC-DS-CDMA for Broadband
Wireless Access, PhD Thesis, University of Baghdad, 2006.
[24] I. Barhumi et al, Optimal Training Design for MIMO OFDM Systems in Mobile
Wireless Channels, IEEE Trans. On Signal Processing, Vol.51, no. 6, June. 2003.
[25] Z. Cao et al Efficient Structure-based Carrier Frequency Offset Estimation for
Interleaved OFDMA Uplink, under publication of IEEE.
[26] F. Molisch Wideband Wireless Digital Communications, 2nd Edition, Prentice Hall,
New York, 2002.
[27] N.Yuan, An Equalization Technique for High Rate OFDM Systems M.Sc. Thesis
University of Saskatchewan .Saskatoon, Dec.2003.
[28] Zhi Zliang and Li Guoqing, A Novel Decoding Algorithm of STBC for CDMA Receiver
in Multipath Fading Environments, IEEE Trans. on Comm., vol. 49, pp. 1956-1959,
April 2001
[29] I. Barhumi et al, Time-Varying FIR Equalization for Doubly-Selective Channels, IEEE
Trans. On Wireless Comm., Vol. 4, no. 1, Jan. 2005.
[30] K. Ming and T. Chee, Hybrid OFDM-CDMA: A Comparison of MC/DS-CDMA,
MC-CDMA and OFCDM, Dept. of Electrical & Electronic Engineering, Adelaide
University, SA 5005, Australia, 2002.
[31] J. Tang, and Xi Zhang, Transmit Selection Diversity With Maximal-Ratio Combining
for Multicarrier DS-CDMA Wireless Networks Over Nakagami-m Fading
Channels, IEEE Journal on Selected Areas in Communications, vol. 24, no. 1,
pp. 5710-5713, January 2006.
[32] Y. Jing Space-Time Code Design and Its Applications in Wireless Networks Ph.D.
thesis in California Institute of Technology Pasadena, California, September 7, 2004