Cepstrum

1428 PROCEEDINGS OF THE IEEE, VOL. 65, NO. 10, OCTOBER 1977

The Cepstrum: A Guide to Processing

Abstract-This paper is a pragmatic tutorial review of the cepstrum literature focusing on data processing. The power, complex, and phase cepstra are shown to be easily related to one another. Problems associated with phase unwrapping, linear phase components, spectrum notching, aliasing, oversampling, and extending the data sequence with zeros are discussed. The advantages and disadvantages of windowing the sampled data sequence, the log spectrum, and the complex cepstrum are presented. The influence of noise upon the data processing procedures is discussed throughout the paper, but is not thoroughly analyzed. The effects of various forms of liftering the cepstrum are described. The results obtained by applying whitening and trend removal techniques to the spectrum prior to the calculation of the cepstrum are also discussed.

We have attempted to synthesize the results, procedures, and information peculiar to the many fields that are finding cepstrum analysis useful. In particular we discuss the interpretation and processing of data in such areas as speech, seismology, and hydroacoustics. But we must caution the reader that the paper is heavily influenced by our own experiences; specific procedures that have been found useful in one field should not be considered as totally general to other fields. It is hoped that this review will be of value to those familiar with the field and reduce the time required for those wishing to become so.

Manuscript received November 25, 1975; revised November 19, 1976 and February 7, 1977.
D. G. Childers is with the Department of Electrical Engineering, University of Florida, Gainesville, FL 32611.
D. P. Skinner is with the Naval Coastal Systems Laboratory, Panama City, FL 32401.
R. C. Kemerait is with ENSCO, Satellite Beach, FL 32937.

I. INTRODUCTION

THIS PAPER has two objectives: first, to present a guide to the cepstrum literature which is becoming increasingly diverse; and second, to survey cepstrum signal processing procedures which have found application in the analysis of data. While a review section is included to establish a notational reference, no attempt is made to provide an extensive rederivation of previous results. Instead, key sources are referenced.

We recognize that for some readers our summary of various cepstrum processing procedures which have been and continue to be applied to the analysis of data may be too concise. It is our hope that this weakness is counterbalanced by our discussion of the pitfalls, advantages, and disadvantages of such procedures. While the majority of these results appear elsewhere, sometimes mentioned only briefly, we do include some new findings. A good deal of what follows is based upon our own experiences but we have tried to incorporate and synthesize the results and experiences of the numerous other investigators in this field as well. Perhaps the single most important observation that has been made by us and others is that the results obtained from cepstrum analysis are highly data dependent. Because of this, few generalities, which cut across application areas, can be formulated from empirical results from one application area alone. Nonetheless, investigators appear to be obtaining satisfactory results from the processing of their data by cepstrum techniques. Thus it is our hope that the reader will gain from this paper some new insights, ideas, and perspectives; but, hopefully, these will be tempered by a healthy skepticism.

Fundamentally, cepstrum techniques are suited for the analysis of data that contain echoes (wavelets) or reverberations of a fundamental wavelet (sometimes called a signature) whose shape need not be known a priori. The power cepstrum is usually used to determine the arrival times of the fundamental wavelet and its echoes and their relative amplitudes; the processing of the complex cepstrum can determine the wavelet waveform.

The application areas encompass radar and sonar [6]-[8], [29], [40]-[41], [43], [61], [62] where cepstrum processing can be used to advantage to reduce reflection interference, speech [13], [19], [21], [42], [46]-[51], [53], [55], [57], [61]-[66], [71], where speaker fundamental frequency (pitch) is estimated and spectrum envelopes are calculated, marine and earth seismology, seismic exploration and detection [1], [31], [34]-[38], [45], [58]-[59], [80] where source depth determinations are made and the ocean bottom is mapped, and the electroencephalogram (EEG) or brain waves [4], [9], [11], [12], [22]-[24], [40], [54], [64] where correlates of electrophysiological events are derived. Other areas of interest include the deconvolution of probability density functions [52]. More recent applications are two-dimensional functions [20] and aeroacoustics or noise pollution [5], [44]. Some additional exciting work is the restoration of old recordings [30] and image processing [19], [30], [61], [63]. Throughout the paper we attempt to synthesize the observations from these many fields.

The paper's outline is as follows. First, we provide a brief historical review and the fundamental mathematical formulation for processing discrete data sequences by cepstrum techniques. This is followed by a section on phase problems encountered in the complex cepstrum, e.g., linear phase terms, spectral notching, oversampling, phase unwrapping, and noise. Next comes a section on other problems, e.g., aliasing, oversampling, the addition of zeros. The following section is on windowing the original data sequence, the log spectrum sequence, and the cepstrum sequence. In the next to the last section we discuss data processing for the three areas of speech, seismology, and hydroacoustics. Finally, we conclude with a summary, recapitulation, and some general observations. The paper is organized in this manner in an attempt to localize, as well as possible, the terminology that is special to each of the diverse fields. Thus, the next to the last section on data processing employs many terms peculiar to each field.

It is our hope that this guide will prove useful to those just becoming involved in cepstrum analysis as well as to those with previous experience.

II. THE CEPSTRA

Historically, the cepstrum has its roots in the general problem of the deconvolution of two or more signals. This literature is rich and varied and encompasses linear prediction,
CHILDERS et al.: GUIDE TO PROCESSING 1429
predictive deconvolution, inverse filtering, and deconvolution. In the interest of brevity we have provided only a selected list of references and books in this area [66]-[86].

In what follows it will be seen that the (complex) cepstrum is also concerned with the deconvolution of two signals, namely, a basic or fundamental wavelet and a train of impulses. The phase cepstrum is defined and it is shown how the power, complex, and phase cepstra are related via a form which simplifies computation.

A. The Power Cepstrum

The power cepstrum was first described by Bogert et al. [1] in 1963 as a heuristic technique for finding echo arrival times in a composite signal. Basically, these authors defined the cepstrum (which we term the power cepstrum [9], [12] to avoid confusion with the complex cepstrum) of a function as the power spectrum of the logarithm of the power spectrum of that function.

These authors quickly showed (as we repeat below) that the effect of a delayed echo will manifest itself as a ripple in the log spectrum. The "frequency" of this ripple is easily determined by calculating the spectrum of the log spectrum wherein this "frequency" will appear as a peak. However, the units of "frequency" of this ripple in the log spectrum are in units of time; thus, the independent variable (abscissa) in the spectrum of the log spectrum is time. Other parameters were also observed to undergo similar transformations of units. To avoid confusion, Bogert et al. [1] introduced the following now-classical paraphrased terms according to a syllabic interchange rule:

    frequency ........ quefrency
    spectrum ......... cepstrum
    phase ............ saphe
    amplitude ........ gamnitude
    filtering ........ liftering
    harmonic ......... rahmonic
    period ........... repiod

along with others. Today the two most prevalent terms are cepstrum and quefrency, e.g., filtering in the cepstrum domain is usually called just that and not "liftering" as suggested by Bogert et al. [1], but this can and often does lead to confusion.

In practice the power cepstrum is effective if the wavelet and the impulse train, whose convolution comprises the composite data, occupy different quefrency ranges. In actuality, the power cepstrum does not exist for most signals; it is meaningful only when defined in a sampled data sense (as is the complex cepstrum) although attempts to extend it exist [3]. Thus the following definition is offered: the power cepstrum of a data sequence is the square of the inverse z-transform of the logarithm of the magnitude squared of the z-transform of the data sequence. When this definition is evaluated on the unit circle, the result (except for the normalization factors associated with the power spectrum) is the same as that obtained in [1]. Thus we may write the power cepstrum as

$$x_{pc}(nT) = \left(\mathcal{Z}^{-1}\left(\log |X(z)|^2\right)\right)^2 \tag{1}$$

where $X(z)$ is the z-transform of the data sequence $x(nT)$. Alternately, the definition could be changed to use the forward z-transform and/or the final squaring could be changed to magnitude squared. In actuality the final squaring operation in (1) is unnecessary and is frequently omitted for several reasons, but it has been used here to provide historical continuity with [1].

Therefore, if we have the convolution of two sequences, then

$$x(nT) = f(nT) * g(nT) \tag{2}$$

or

$$|X(z)|^2 = |F(z)|^2 \cdot |G(z)|^2 \tag{3}$$

or

$$\log |X(z)|^2 = \log |F(z)|^2 + \log |G(z)|^2. \tag{4}$$

If we apply (1), then

$$x_{pc}(nT) = f_{pc}(nT) + g_{pc}(nT) + \text{a cross-product term}. \tag{5a}$$

If the power cepstra of $f$ and $g$ occupy different quefrency ranges, then (5a) can be reduced to

$$x_{pc}(nT) = f_{pc}(nT) + g_{pc}(nT). \tag{5b}$$

(This is the result that would be obtained if the final squaring operation in (1) were not included in the definition.) Under this condition the individual contributions of each power cepstrum can be separated by liftering (filtering) in the quefrency domain.

For the case of a composite signal consisting of the basic wavelet and a single echo, then

$$g(nT) = \delta(nT) + a\,\delta(nT - n_0T) \tag{6}$$

where $\delta(nT)$ denotes the unit pulse function in a sampled data sequence.

Equation (3) then becomes

$$|X(z)|^2 = |F(z)|^2\,\left|1 + a z^{-n_0}\right|^2 \tag{7}$$

and if we evaluate (4) on the unit circle ($z = e^{j\omega T}$), then

$$\begin{aligned}
\log |X(e^{j\omega T})|^2 &= \log |F(e^{j\omega T})|^2 + \log\left(1 + a^2 + 2a\cos(\omega n_0 T)\right) \\
&= \log |F(e^{j\omega T})|^2 + \log(1 + a^2) + \log\left(1 + \frac{2a}{1 + a^2}\cos(\omega n_0 T)\right).
\end{aligned} \tag{8}$$

We may now expand the third term on the right of (8) in a power series (except for the point values $a = \pm 1$ and $\cos(\omega n_0 T) = \pm 1$ [6]) to obtain

$$\log\left(1 + a_0\cos(\omega n_0 T)\right) = \sum_{m=1}^{\infty} \frac{(-1)^{m+1}}{m}\, a_0^m \cos^m(\omega n_0 T) \tag{9}$$

where $a_0 = (2a)/(1 + a^2)$.

We see that the logarithm of the magnitude squared of the z-transform of the composite signal will contain cosinusoidal ripples (sometimes referred to as spectral modulation) whose gamnitude (amplitude) and quefrency (i.e., the frequency of the ripples) are related to the echo amplitude ($a$) and delay ($n_0T$), respectively.

Using (9) in (8), we can take the inverse z-transform of (8) to obtain the term within the brackets in (1) which will have
peaks at quefrencies of ($n_0T$) and multiples thereof. (We perform this task in detail later for the complex cepstrum.) These peaks will be detectable provided the $\log |F(e^{j\omega T})|^2$ is approximately quefrency limited to less than ($n_0T$), i.e., the ripples in $\log |F(e^{j\omega T})|^2$ should not have a repiod (period) greater than $(n_0T)^{-1}$. In other words (5b) holds or at least approximately so. It is apparent that the echo arrival time can be estimated by simply noting the time of occurrence of the first peak in the power cepstrum. Further, it is possible using (9) to estimate the echo amplitude. This will be discussed more fully for the complex cepstrum. The presence of multiple echoes can and does confuse the interpretation procedure because of the nonlinearity introduced by the log function. In addition, aliasing causes problems [32]. These points are given further consideration later. It should also be apparent that if (5b) does not hold, then the cross-product term in (5a) will introduce further extraneous peaks.

It should be noted that the peaks in the power cepstrum can be removed by notch liftering (filtering) to yield an estimate of the power cepstrum of the basic wavelet. Further, if the final squaring operation in the calculation of the power cepstrum in (1) is not performed, then the peaks still appear and again can be removed by notch liftering, but now the operations performed to calculate this modified power cepstrum can be reversed to obtain an estimate of the log power spectrum and with exponentiation to yield the power spectrum of the basic wavelet itself. But the waveform of the basic wavelet cannot be recovered by processing the power cepstrum since the phase information is discarded. This latter situation is corrected with the complex cepstrum which we discuss in the next subsection along with the inversion process.

The power cepstrum has been applied to seismic data [1], [36], sonar [43], speech [13], [46]-[49], and the electroencephalogram (EEG) [12], [22], [24]. Its statistical properties have also been examined [2], [7].

It is hopefully beneficial to point out that alternate viewpoints and, thus, subsequent terminologies have arisen since the original paper by Bogert et al. [1]. These viewpoints have led to what might well be considered two lines of investigation: (1) the use of varying degrees of spectral whitening; and (2) the attempts to devise methods for obtaining the phase relations of the wavelet with respect to the reference signal [36].

We have seen that the occurrence of an echo in the time domain signal leads to what amounts to a spectral modulation (or ripple) in the frequency domain.

The spectral whitening approach to echo detection considers the application of the logarithm a severe spectral whitener (rather than a mechanism to transform the product of two functions into the sum of the logarithm of the two functions as Bogert et al. intended). Cohen [36] and his co-workers consider the logarithm too severe a spectral whitener for some applications, e.g., when the signal is narrow bandwidth and the signal-to-noise ratio (SNR) is low, when multiple echoes are present, and when the echo amplitude is large, $|a| > 0.5$. They consider separating the echo and the original wavelet in three ways: 1) the unwhitened version, which calculates the power spectrum of the data where some form of mean removal or lifter in the quefrency domain is always applied to the spectrum before the second transformation; 2) the moderately whitened version, which calculates the power spectrum of the square root of the power spectrum; and 3) the severe whitened version, which calculates the power spectrum of the logarithm of the power spectrum.

Cohen [36], [37] discusses the echo polarity determination problem. In some experimental situations the echo may undergo 180° phase reversals at certain boundary reflection interfaces, i.e., the reflection coefficient, $a$, may be negative. This knowledge can be useful in data interpretation. For example, in (8) if $a$ is negative this leads to spectral nulls at $f = m/(n_0T)$, $m = 0, 1, 2, \ldots$. By measuring the frequency spacing between two successive spectral nulls one can then determine the delay time $n_0T$. The source depth can in turn be estimated if the average signal velocity is known.

Cohen [36] also points out that the cepstrum may contain many peaks which can confuse the analyst. He suggests doing pseudo-autocorrelation (defined as the inverse Fourier transform of the liftered power spectrum [1], [36]) analysis simultaneously. If a negative reflection has occurred, then the cepstrum will have a positive peak while the pseudo-autocorrelation will show a negative peak. The dot product (i.e., keeping track of the phase) of the cepstrum and the pseudo-autocorrelation can even be determined. Here the cepstrum can be any one of those obtained by transforming one of the three whitened spectra discussed above. This procedure can apparently help separate multiple echoes (or events) from a single echo (or event) such as might occur with multipath, i.e., multiple reflections.

B. The Complex Cepstrum and Wavelet Recovery

The complex cepstrum is an outgrowth of homomorphic system theory developed by Oppenheim [14]-[17]. In fact, the power cepstrum is also a specific application of homomorphic system theory. The complex cepstrum has been investigated extensively [9], [12], [13], [19], [21]-[23], [26]-[32], [61]-[64].

Since the complex cepstrum retains the phase information of the composite data, it can be used not only for echo detection but also wavelet recovery; this process is also known as homomorphic deconvolution or homomorphic filtering and has since been applied to seismic data [31], [34]-[38], [45], [57]-[59], speech [19], [21], [30], [50], [51], [53], [57], [62], [63], [65], [66], image processing [19], [30], [61], [63], and EEG analysis [9], [11], [12], [22]-[24].

Formally, we define the complex cepstrum of a data sequence as the inverse z-transform of the complex logarithm of the z-transform of the data sequence [21], [63], i.e.,

$$\hat{x}(nT) = \frac{1}{2\pi j}\oint \log\left(X(z)\right) z^{n-1}\, dz \tag{10}$$

where $\hat{x}(0) = \log[x(0)]$ and $X(z)$ is the z-transform of the data sequence $x(nT)$. Frequently, $\hat{X}(z)$ is used to denote the $\log X(z)$; then $\hat{x}(nT)$, the complex cepstrum, is the inverse z-transform of $\hat{X}(z)$. The contour of integration lies within an annular region in which $\hat{X}(z)$ has been defined as single valued and analytic. If we have the convolution of two sequences, then

$$x(nT) = f(nT) * g(nT) \tag{11}$$

or

$$X(z) = F(z)\,G(z) \tag{12}$$

and

$$\hat{X}(z) = \log X(z) = \log F(z) + \log G(z) \tag{13}$$
or

$$\hat{x}(nT) = \hat{f}(nT) + \hat{g}(nT). \tag{14}$$

Further, if $\hat{f}$ and $\hat{g}$ occupy different quefrency ranges, then the complex cepstrum can be liftered (filtered) to remove one or the other of the convolved sequences. Since the phase information is retained, the complex cepstrum is invertible. Thus if $\hat{g}(nT)$ is rejected from $\hat{x}(nT)$ by liftering, then $\hat{x} = \hat{f}$ and we may then z-transform, exponentiate, and inverse z-transform to obtain the sequence $f(nT)$, i.e., $f$ and $g$ have been deconvolved.

Fig. 1 illustrates an overall wavelet recovery or homomorphic deconvolution (filtering) system which is not only functional for off-line computations but can be implemented in real-time [26]-[28]. Examples of long pass, short pass, and notch lifters (filters) are also shown. These lifters are analogous to high pass, low pass, and notch filters, respectively, in the frequency domain. The notch lifter has been frequently referred to as a comb lifter in the literature.

[Fig. 1 graphic not reproduced. Legible panel labels: LONGPASS, NOTCH.]

Fig. 1. Overall wavelet recovery system, also known as homomorphic deconvolution (filtering) or cepstrum system. The DFT is performed by an FFT algorithm. $x_R(n)$ denotes the recovered wavelet. The input sequence is windowed and then appended with zeros. (a) Simplified block diagram. (b) More detailed block diagram which can be used to process data in real time. (c) Typical lifters for the single echo, minimum phase ($a < 1$) case where peaks occur at $n_0$ and multiples thereof. (The notch lifter is sometimes called a comb lifter.)

1) Phase Unwrapping: The computation of the complex cepstrum is complicated by the fact that the complex logarithm is multivalued. If the imaginary part of the logarithm is computed modulo $2\pi$, i.e., evaluated as its principal value, then discontinuities appear in the phase curve. This is not allowed since $\log(X(z))$ is the z-transform of $\hat{x}(nT)$ and thus must be analytic in some annular region of the z-plane. This problem can be rectified by making the following observations:

  1) The imaginary part of $\log(X(z))$ must be a continuous and periodic (evaluated on the unit circle) function of $\omega$ with period $(2\pi/T)$ since it is the z-transform of $\hat{x}(nT)$.
  2) Since it is required that the complex cepstrum of a real function be real, it follows that the imaginary part of $\log(X(z))$ must be an odd function of $\omega$.
Subject to these conditions we may compute the unwrapped phase curve as follows [21], [22] (provided the phase is sampled at a rate sufficiently great to assure that it never changes by more than $\pi$ between samples [26], [27]): a correction sequence $C(k)$ is added to the modulo $2\pi$ phase sequence $P(k)$ where $C(k)$ is

$$C(0) = 0, \qquad C(k) = \begin{cases} C(k-1) - 2\pi, & \text{if } P(k) - P(k-1) > \pi \\ C(k-1) + 2\pi, & \text{if } P(k-1) - P(k) > \pi \\ C(k-1), & \text{otherwise.} \end{cases} \tag{15}$$

This is illustrated in Fig. 2.

Fig. 2. Phase unwrapping. (a) Phase modulo $2\pi$. (b) $C(k)$, the correction sequence. (c) Unwrapped phase.

Alternately, the phase may be unwrapped by computing the relative phase between adjacent samples of the spectrum. These phases may be added to achieve a cumulative (unwrapped) phase for each point. Both methods have the drawback that the computation must be done sequentially. It is also noted that if the phase never changes by more than $\pi/2$ between samples, the phase modulo $\pi$ could be computed and unwrapped with algorithms similar to the above. This is interesting since it is slightly easier to calculate the phase modulo $\pi$ than the phase modulo $2\pi$ (the arctangent algorithm is simpler) and many signals have this property (though noise generally does not).

Several other phase unwrapping procedures have been discussed, e.g., integrating the phase derivative [21], [31], an adaptive numerical integration procedure [56], and a recursive procedure to remove the linear phase [31].

Phase unwrapping is unnecessary for the class of minimum phase signals, i.e., a sequence whose z-transform has no poles or zeros outside the unit circle, which implies that $\hat{x}(nT) = 0$ for $n < 0$ [21], [61], [63], [64]. The complex cepstrum of such a sequence is zero at negative quefrencies. Further, for $n > 0$ the complex cepstrum is identical to the power cepstrum (except for a factor of 2 and the squaring operation); for $n = 0$ the two cepstra are identical (except for the squaring operation). Analogously, a maximum phase sequence may be defined (the z-transform has no poles or zeros inside the unit circle). The complex cepstrum for such a sequence is zero for positive quefrencies. Maximum and minimum phase sequences are discussed more extensively in [18], [33], [63], [84]. This is a very important topic since the signals of general interest are frequently of mixed phase. It is difficult to properly process such signals especially in the presence of noise. It has recently been shown that a procedure called homomorphic prediction, which is a combination of homomorphic deconvolution and linear prediction, is quite helpful in processing such mixed phase signals [18], [42], [57]. We mention this again later.

We now show that the impulses that appear in the complex cepstrum can be caused by the presence of a single additive echo. These impulses are nonzero on only one side of the origin and are therefore referred to as minimum or maximum phase impulse trains.

In (11) let $g(nT) = \delta(nT) + a\,\delta(nT - n_0T)$, then

$$x(nT) = f(nT) + a f(nT - n_0T). \tag{16}$$

Taking the z-transform and evaluating it on the unit circle, we have

$$X(e^{j\omega T}) = F(e^{j\omega T})\left(1 + a e^{-j\omega n_0 T}\right). \tag{17}$$

Taking the log of both sides, we obtain

$$\hat{X}(e^{j\omega T}) = \log\left(F(e^{j\omega T})\right) + \log\left(1 + a e^{-j\omega n_0 T}\right). \tag{18a}$$

If $a < 1$ (corresponding to a minimum phase sequence), then we may expand the rightmost term in (18) in a power series, then

$$\hat{X}(e^{j\omega T}) = \log\left(F(e^{j\omega T})\right) + a e^{-j\omega n_0 T} - \frac{a^2}{2} e^{-j2\omega n_0 T} + \frac{a^3}{3} e^{-j3\omega n_0 T} - \cdots. \tag{19a}$$

Inverse z-transforming, we have the complex cepstrum

$$\hat{x}(nT) = \hat{f}(nT) + a\,\delta(nT - n_0T) - \frac{a^2}{2}\delta(nT - 2n_0T) + \frac{a^3}{3}\delta(nT - 3n_0T) - \cdots. \tag{20a}$$

Thus the complex cepstrum of the composite signal consists of the complex cepstrum of the basic wavelet $f$ plus a train of $\delta$ functions located at positive quefrencies at the echo delay (and its multiples) whose amplitudes are directly related to the echo amplitude. Notch liftering and interpolation (smoothing) can be performed to remove these $\delta$ functions [9], [11], [12], [19], [21]-[24], [50], [51], [63], [64]. The basic wavelet can then be recovered by inverting the operations used to compute the complex cepstrum (see Fig. 1). If the complex cepstra of the basic wavelet and the impulse train are sufficiently separated in quefrency, then short-pass liftering can be used to recover the basic wavelet. Analogously, the impulse train $g$ can be recovered by using long-pass liftering.

If the echo amplitude is greater than unity, $a \ge 1$ (corresponding to a maximum phase sequence), then (18) can be rewritten as
$$\hat{X}(e^{j\omega T}) = \log\left(F(e^{j\omega T})\right) + \log(a) - j\omega n_0 T + \log\left(1 + \frac{1}{a} e^{j\omega n_0 T}\right) \tag{18b}$$

which may be expanded as

$$\hat{X}(e^{j\omega T}) = \log\left(F(e^{j\omega T})\right) + \log(a) - j\omega n_0 T + \frac{1}{a} e^{j\omega n_0 T} - \frac{1}{2a^2} e^{j2\omega n_0 T} + \cdots. \tag{19b}$$

Upon removal of the linear phase term, $-j\omega n_0 T$, the complex cepstrum becomes

$$\hat{x}(nT) = \hat{f}(nT) + \log(a)\,\delta(nT) + \frac{1}{a}\delta(nT + n_0T) - \frac{1}{2a^2}\delta(nT + 2n_0T) + \cdots. \tag{20b}$$

Thus the complex cepstrum again has peaks at the echo delay (and its multiples), but these peaks now occur at negative rather than positive quefrencies and their gamnitudes (amplitudes) are related to $(1/a)$ rather than $a$. If these peaks are removed by liftering and the wavelet recovery procedure is followed including the reinsertion of the linear phase term, then the echo is recovered rather than the basic wavelet. The effect of liftering on the complex cepstrum is schematized in Fig. 3 for $a > 1$ and $a < 1$ within the context of Fig. 1.

Fig. 3. The superposition of two wavelets to form $x(nT)$; the complex cepstra for $a < 1$ and $a > 1$; and the liftering of $\hat{x}(nT)$ to eliminate the echo pulse train. This process is analogous to what is commonly called notch filtering in the frequency domain.

It will be noted that the peaks in the complex cepstrum due to the impulse train may never have an amplitude greater than unity regardless of the value of $a$. Further, note that multiplying the original composite signal by a scale factor only changes the coefficient of the $\delta(nT)$ term in the complex cepstrum, since the scale factor appears as a shift in the mean of the log spectrum. Therefore, the complex cepstrum does not depend on the composite signal scale factor, but does depend on the SNR as well as on the ratio of the signal-to-noise bandwidths [2], [7], [30].

Another interesting, and perhaps more representative, example is the one with an infinite series of decaying echoes. Here

$$g(nT) = \delta(nT) + a_1^{n_0}\delta(nT - n_0T) + a_1^{2n_0}\delta(nT - 2n_0T) + \cdots \quad \text{where } 0 < a_1 < 1.$$

Or with $a = a_1^{n_0}$, then

$$g(nT) = \delta(nT) + a\,\delta(nT - n_0T) + a^2\delta(nT - 2n_0T) + \cdots = \sum_{m=0}^{\infty} a^m \delta(nT - mn_0T)$$

which when convolved with $f(nT)$ will give us a minimum phase sequence. Then

$$G(z) = 1 + a z^{-n_0} + a^2 z^{-2n_0} + \cdots.$$

Thus the equation corresponding to (20a) is

$$\hat{x}(nT) = \hat{f}(nT) + a\,\delta(nT - n_0T) + \frac{a^2}{2}\delta(nT - 2n_0T) + \cdots. \tag{23}$$

This complex cepstrum is minimum phase and is nearly identical to that in (20a) except that the signs of the train of pulse functions are all positive rather than alternating in sign. The remarks following (20a) apply here as well. When $a$ is near unity this example might be considered more representative of speech data for the situation of a sustained vowel phonation such as /i/.

In the general multiple echo case the delays become "mixed" via the series expansion of the logarithm. This greatly complicates the proper estimation of the true echo delay times [9], [9a], [12], [31], [35], [38]-[40]. The estimation is even further complicated if aliasing is severe [12], [32].

C. The Relationship Between the Complex and Power Cepstra

Clearly the complex and power cepstra are closely related. The simple formal relationship can be obtained from (1) as follows:

$$x_{pc}(nT) = \left(\mathcal{Z}^{-1}\left(\log\left(X(z)\cdot X^*(z)\right)\right)\right)^2 = \left(\mathcal{Z}^{-1}\left(\log X(z) + \log X^*(z)\right)\right)^2. \tag{24}$$

Assuming $x(nT)$ is real and evaluating its z-transform on the unit circle, we find $X^*(z) = X(z^{-1})$, thus we may write

$$x_{pc}(nT) = \left(\frac{1}{2\pi j}\oint \log X(z)\, z^{n-1}\, dz + \frac{1}{2\pi j}\oint \log X(z^{-1})\, z^{n-1}\, dz\right)^2. \tag{25}$$

Letting $z' = z^{-1}$, we obtain

$$x_{pc}(nT) = \left(\frac{1}{2\pi j}\oint \log X(z)\, z^{n-1}\, dz + \frac{1}{2\pi j}\oint \log X(z')\, (z')^{-n-1}\, dz'\right)^2. \tag{26}$$
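The steps (24)-(26) say that, on the unit circle, the inverse transform of $\log |X|^2$ is the sum of the complex cepstrum and its quefrency-reversed copy. A minimal numerical check of that statement is sketched below, assuming NumPy, a DFT approximation of the contour integrals, and `numpy.unwrap` for the phase. The test signal (wavelet `f`, echo amplitude `a`, delay `n0`) and the transform length `n_fft` are hypothetical values picked for illustration.

```python
import numpy as np

# Illustrative single-echo signal (assumed values, not from the paper).
n_fft, n0, a = 1024, 20, 0.5
f = np.array([1.0, -0.6, 0.2])      # short minimum phase wavelet
x = np.zeros(n0 + len(f))
x[:len(f)] += f
x[n0:n0 + len(f)] += a * f          # x(n) = f(n) + a f(n - n0)

X = np.fft.fft(x, n_fft)

# Complex cepstrum: inverse DFT of log magnitude plus j times unwrapped phase.
xhat = np.fft.ifft(np.log(np.abs(X)) + 1j * np.unwrap(np.angle(X))).real

# Term inside the outer parentheses of (24): inverse DFT of log |X|^2.
inner = np.fft.ifft(np.log(np.abs(X) ** 2)).real
x_pc = inner ** 2                   # power cepstrum, final squaring included

# xhat(-n), with indices taken modulo n_fft.
xhat_reversed = np.roll(xhat[::-1], 1)
two_sided = xhat + xhat_reversed

# Largest disagreement between the two routes; close to machine precision.
max_diff = np.max(np.abs(inner - two_sided))
```

Note that `inner` needs no phase at all, whereas `xhat` depends on the unwrapped phase. For this minimum phase example `xhat` is negligible at negative quefrencies, so `inner[n0]` is close to the echo amplitude `a`.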
Then by the definition of the complex cepstrum in (10) we have

$$x_{pc}(nT) = \left(\hat{x}(nT) + \hat{x}(-nT)\right)^2. \tag{27}$$

Thus the power cepstrum is four times the square of the even part of the complex cepstrum. This also follows from the fact that the power cepstrum is the square of the inverse transform of twice the real part of the log spectrum; and, as was noted earlier, the power cepstrum contains no phase information.

Equation (27) is of value since the power cepstrum is often superior to the complex cepstrum for echo arrival time estimation [12], [24], [25]. This is apparently due to the fact that the linear phase contribution (to be discussed below) of the imaginary part of the logarithm tends to mask the echo delay. There are probably other phase unwrapping errors as well as noise errors which contribute to this observation. Complex exponential weighting [25], which we discuss later, appears to be a method which can assist the investigator in the determination of echo delay times from the complex cepstrum.

A wavelet recovery (homomorphic filtering) system can easily compute both the power and complex cepstra as shown in Fig. 1.

Finally, as was noted earlier, if the squaring operation in (27) is not performed, then the system in Fig. 1 can be used to obtain an estimate of the log power spectrum and in turn the power spectrum of the basic wavelet. Note that if this is one's objective (and not wavelet recovery), then the problems associated with phase unwrapping are not encountered.

D. The Phase Cepstrum

The inverse transform of the phase of the complex logarithm yields peaks at multiples of the echo arrival time in much the same way that the inverse transform of the log magnitude does. This can be shown as follows for the single additive echo case:

$$\hat{X}(e^{j\omega T}) = \log\left|F(e^{j\omega T})\right| + j\arg F(e^{j\omega T}) + \frac{1}{2}\log\left(1 + a^2 + 2a\cos\omega n_0 T\right) - j\tan^{-1}\left(\frac{a\sin\omega n_0 T}{1 + a\cos\omega n_0 T}\right). \tag{28}$$

The fourth term on the right produces ripples in the phase, just as the third term produces ripples in the log magnitude. Since $\hat{X}(e^{j\omega T})$ is obtained from the transform of a real sequence, its real part (magnitude of the transform of the real sequence) is an even function of $\omega$, and its imaginary part (phase of the transform of the real sequence) is an odd function of $\omega$. Thus the inverse transform of $\mathrm{Re}(\hat{X}(e^{j\omega T}))$ will yield the even portion of the complex cepstrum and the inverse transform of $j\,\mathrm{Im}(\hat{X}(e^{j\omega T}))$ will produce the odd portion of the complex cepstrum. Since the inverse transform of the term $\log(1 + a e^{-j\omega n_0 T})$ produces peaks on one side of the origin only, the peaks produced by its real and imaginary parts must be equal in magnitude and opposite in sign on one side of the origin but of the same sign on the other side of the origin (depending upon whether the echo amplitude $a$ is less or greater than unity).

From these observations we formally define the phase cepstrum of a data sequence as the square of the inverse z-transform of twice the phase (the imaginary part of the logarithm) of the z-transform of the data sequence. This may be written as

$$x_L(nT) = \left(\mathcal{Z}^{-1}\left(2\log X(z) - 2\log|X(z)|\right)\right)^2 \tag{29}$$

where the factor of 2 has been introduced to eliminate any normalization factors in the relation between the phase and complex cepstra and $x_L(0) = 0$. From (10), (24), and (27) the phase cepstrum can be easily shown to be

$$x_L(nT) = \left(\hat{x}(nT) - \hat{x}(-nT)\right)^2. \tag{30}$$

Thus the phase cepstrum is to the phase as the power cepstrum is to the log magnitude. Once again the final squaring operation could be changed to magnitude squared or eliminated.

Empirically, it has been determined that the phase cepstrum is less useful than the power cepstrum in the determination of echo arrival times [26], [27]. This is apparently due to the phase unwrapping errors produced by additive noise and linear phase terms. The phase cepstrum is as difficult to compute as the complex cepstrum, since both require phase unwrapping. However, the phase cepstrum has proven valuable in evaluating the effects of noise on the signal phase [26]. Significant differences in the appearance of the phase and power cepstra can be indicative of phase unwrapping problems which might otherwise go unnoticed. This has proven to be the case in some of the work by one of the authors (DPS) on echoes generated by chirp signals.

III. PHASE PERPLEXITIES

Many problems arise in the computation of the phase sequence for the complex cepstrum. Here we address several of these problems along with their alleviation.

A. Linear Phase Components

The presence of a linear component in the phase sequence introduces rapidly decaying oscillations in the complex cepstrum, e.g., let the spectrum of such a signal be represented as $X(e^{j\omega T}) = e^{-jr\omega} X'(e^{j\omega T})$ or $X(z) = z^{-r/T} X'(z)$. Then the cepstrum of the linear phase term alone is

$$\hat{x}(nT) = \begin{cases} 0, & n = 0 \\ -\dfrac{r\cos n\pi}{nT} = -(-1)^n \dfrac{r}{nT}, & n \ne 0. \end{cases} \tag{31}$$

This term is added to the cepstrum of the remaining portion of the data being analyzed. Note that it changes sign at each sample and although it does decay, it may be quite large depending upon $r$. Such a term may mask echo peaks in the complex cepstrum, and should be removed by subtraction from the composite signal phase. Several procedures for doing this appear in the literature [9], [31]. Basically, this is just trend removal, which is standard practice for improving spectral estimates. The removed linear phase term can be recorded and then reinserted during the inversion process if necessary.

The presence of a linear phase term may influence the choice of liftering to be applied to the complex cepstrum. If the echo is to be removed and the basic wavelet is to be recovered, then the echo peaks should not be notch liftered (removed) by simply replacing them with the average of their adjacent points, since these adjacent points have contributions from the linear phase component (if it has not been completely removed) which are opposite in sign to the contribution of the echo
point to be removed. Instead, if the echo is located at $n_0$ in the complex cepstrum then this point should be replaced with the average of the $(n_0 + 2)$ and $(n_0 - 2)$ points. This form of liftering results in a smaller mean-square error (MSE) in the recovered wavelet than when the average of the points adjacent to the echo peak is used. This has been found to be the case even when the linear phase component has been completely removed [26]. This liftering procedure is not claimed to be optimum. In fact the liftering procedure is undoubtedly signal and noise dependent and would in general involve averaging more than just two points in the complex cepstrum.

A serious problem in phase unwrapping is encountered when discontinuities in the phase occur in calculating the phase modulo 2π via the arctangent routine. The phase unwrapping algorithm previously described removes these discontinuities provided the phase changes by less than π between samples. Recently, it has been pointed out that a linear phase component with a large slope will cause errors in this unwrapping procedure [26], [31]. If the phase changes between samples are greater than π due to the presence of a linear phase term, then this unwrapping problem can be alleviated by increasing the record length with the addition of zeros [26]. This is equivalent to sampling the z-transform more frequently. If one is unsure whether the phase change between samples is less than π, then one can check such an hypothesis with the above procedure by comparing the unwrapped phase before and after the record length has been appended with zeros. Others suggest that an iterative approach to phase unwrapping is helpful [31].

One example of where the linear phase component gives problems is when $x(nT) = f(nT - n_0 T)$ on $[0, N - 1]$, zero otherwise; then $X(e^{j\omega T}) = e^{-j\omega n_0 T} F(e^{j\omega T})$. As expected the phase of x is the sum of a linear phase component and the phase of f. If ω is sampled at the minimum rate ($\omega = 2\pi n/(NT)$), then $X(e^{jn(2\pi/N)}) = e^{-j2\pi n(n_0/N)} F(e^{jn(2\pi/N)})$. If $n_0 > N/2$ the linear phase component will change by more than π between samples and unless the phase of f counteracts this change, the phase unwrapping algorithm will yield erroneous results. This has been observed in computer experiments when the composite signal is delayed by more than half the record length. As expected this not only reduces the echo detectability in both the phase and complex cepstra, but also severely distorts the recovered wavelet.

B. Spectrum Notching

It should also be noted that zeros near the unit circle in the z-transform of the echo sequence result in notches in the spectrum sequence wherein additive noise may dominate. We have seen earlier that one phase unwrapping algorithm requires that the changes in phase between samples must be less than ±π, i.e., the derivative of the phase with respect to frequency must be less than ±π.

Consider the z-transform of the data sequence evaluated on the unit circle, then

$$X(e^{j\omega T}) = |X(e^{j\omega T})|\, e^{j\angle X(e^{j\omega T})} = X_{Re}(e^{j\omega T}) + jX_{Im}(e^{j\omega T})$$

or

Thus the change in phase is inversely proportional to the magnitude squared of the spectrum. If a notch occurs in the spectrum, then the change in the phase may be quite large, and, therefore, proper phase unwrapping may be difficult to achieve. Further, the phase may change sign rapidly in these spectrum notches. This represents a serious problem even in the absence of noise as the above example illustrates. Therefore, it is quite possible for the unwrapped phase curve to contain discontinuities (jumps or steps) in the vicinity of a spectrum notch.

As was pointed out earlier, spectral nulls can be caused physically by 180° phase reversals in reflections at boundary interfaces [36], [37]. Nulls in the spectrum may be an important aid to data interpretation. The investigator needs to understand the physical situation under which the data are collected and to model it well [37].

C. Other Results

Finally, our results have indicated that when cepstrum analysis is performed on a bandpass function, phase unwrapping outside the signal band is of little value and may actually be detrimental to wavelet recovery since the spectrum outside the signal band is dominated by noise. Similar considerations lead us to avoid oversampling since this leads to large segments of the log spectrum being dominated by noise. (See the next section.)

IV. OTHER PROBLEMS

A. Aliasing [9], [26], [32]

Aliasing of the cepstrum is of course an ever present problem since the nonlinear complex logarithm introduces harmonics into X(z). The appending of zeros to the input data sequence reduces aliasing as will selecting the data record length NT to be as large as possible. This latter choice is subject to the constraints imposed by the investigator on the number of points that can be analyzed and the minimum sampling rate. If the total data record length exceeds the duration of the composite signal contained within the record, then it is questionable if the total data record length should be further extended with still more "data." The reason for this doubt is that the spectral samples will increasingly reflect the effect of the noise rather than the signal as the total data record length surpasses that of the composite signal duration.

B. Oversampling

Oversampling of the data record when noise is present is also a problem. Outside the signal band noise dominates the spectrum. This usually presents no problem in ordinary spectrum analysis since these components frequently contain little power but this may not be the case for the cepstrum. Because of the nonlinear logarithmic operation, the regions of low power in the spectrum may contribute as much or more to the cepstrum as the regions which contain the signal in the spectrum. When this occurs it affects both echo detectability and wavelet recovery. Oversampling also aggravates phase unwrapping and aliasing since it shortens the data record (if the total number of data points or samples is fixed), which in turn
implies that the samples of the log spectrum are spaced farther apart.

C. Appending Zeros

It is well known that appending zeros to a data sequence increases the sampling "rate" of its discrete Fourier transform. This benefits the computation of the cepstrum in two ways. First, the increased sampling "rate" in the frequency domain reduces aliasing of the cepstrum. Second, increasing the fineness with which the phase curve is sampled reduces the number of phase unwrapping errors (which result from jumps greater than π between samples). Our results have indicated that extending the record length with zeros results in a modest improvement in the recovered wavelet even when aliasing and phase unwrapping errors do not appear to be a problem. It should be noted that unless the record length is extended with zeros, then aliasing causes an ambiguity in the determination of the echo epoch (arrival time) and amplitude. This is due to the fact that there is no way to distinguish between an echo of relative amplitude a and delay $n_0$ and one with amplitude 1/a and delay $(N - n_0)$ where N is the total number of samples.

Mathematically, these statements are verified as follows: consider the z-transform of the sequence x(nT) where x(nT) = 0 outside [0, N - 1] denoted as

$$X(z) = \sum_{n=0}^{N-1} x(nT)\, z^{-n} \tag{33a}$$

which when evaluated on the unit circle gives

$$X(e^{j\omega T}) = \sum_{n=0}^{N-1} x(nT)\, e^{-j\omega nT}. \tag{33b}$$

If we sample at uniformly spaced intervals around the unit circle, we obtain

$$X(e^{j(2\pi m/N)}) = \sum_{n=0}^{N-1} x(nT)\, e^{-j(2\pi mn/N)} \tag{34}$$

which is just the DFT of x(nT). It follows that

Since the logarithm (which is a zero memory nonlinearity) of a sampled function is equivalent to sampling the logarithm of the function, then with a little additional effort it follows that the complex cepstrum of the DFT of x(nT) (or the z-transform of x(nT) sampled on the unit circle) is just the periodic extension of the complex cepstrum of the original data sequence. We see that the effect of appending zeros is to increase N. This implies we sample the log spectrum at smaller intervals, since the spacing between these samples is proportional to 1/N. As described above, the errors introduced by a linear phase component or aliasing are reduced by increasing N through the appendage of zeros.

V. WINDOWING

A. The Composite Data

Echo detection and extraction are degraded by applying to the data record a window ordinarily used to reduce leakage, e.g., Hamming, Hanning, Tapering (Tukey window), unless the window is relatively constant (flat) over that portion of the data record containing the composite signal. In speech processing this is not the case. Here the data are highly nonstationary. And the investigator is frequently interested in analyzing the speech signal over one pitch period (or at most three pitch periods). In this case windowing is of considerable benefit.

One can see for the single echo case that windowing the input data record normally prevents the logarithmic operation from fully separating the basic wavelet and the echo series as follows:

$$x(nT) = [f(nT) + a f(nT - n_0 T)]\, w(nT) \tag{36}$$

or

$$X(z) = [F(z)(1 + a z^{-n_0})] \ast W(z). \tag{37}$$

For arbitrary W(z), the contributions of the basic wavelet and the echo cannot generally be separated by taking the logarithm of (37) since the term in brackets is convolved with W(z). Fortunately, as will be discussed more fully below, in practice the cepstrum procedure can still be applied with effectiveness even though there is some error.

Schafer [21] suggested a window which does preserve the separability of the basic wavelet and echo series and which has proven extremely useful in cepstrum analysis. This window

$$w(nT) = \begin{cases} \alpha^{nT}, & 0 \le n \le N - 1,\ 0 < \alpha < 1 \\ 0, & \text{otherwise} \end{cases} \tag{38}$$

was first proposed to reduce the error associated with truncating the echo when it extended beyond the end of the record [21]. Our results and those of others [26], [31], [58] have indicated that this window is quite useful because it reduces the aliasing of the echo impulse train in the complex cepstrum by imposing an $(\alpha^{n_0 T})^n$ weighting on the impulses. This follows directly from a calculation of the z-transform of (36) with (38) used for w(nT), i.e., for this case

$$X(z) = F(\alpha^{-T} z)\,(1 + a\,\alpha^{n_0 T} z^{-n_0}) \tag{39}$$

provided no truncation error is present and that the basic wavelet begins at n = 0.

From (20) we see that when no window is used and a is near unity and the echo delay is a substantial portion of the record length (NT), the higher order peaks may not decrease rapidly enough to avoid aliasing. This problem can be overcome with the window under consideration. Our results indicate that the choice of α is data dependent and α should be chosen as close to unity as possible, consistent with the desired reduction in aliasing. The closer the data sequence is to a maximum phase sequence, the more one can reduce α, e.g., from 0.99 to 0.98 or 0.96. The choice of α is also dependent on the echo delay time which is discussed more fully later.

The exponential window can introduce some distortion into the recovered wavelet even if the data are unweighted by the inverse window in the recovery process [10], [26]. This is primarily due to the distortions introduced into the data that extend beyond the duration of the wavelet of interest.

In summary the exponential window performs nearly as well as the rectangular window when no noise is present but does introduce some distortion as noted above. Further, the echo arrival time can be determined even when wavelet recovery cannot be effected. Also if rectangular windowing is judiciously applied, then the cepstrum can be used to detect
similar but not necessarily identical wavelets. We suspect that if the exponential window is used to make the composite signal minimum phase, then the echo, most probably, will be lost.

Finally, it should be noted that the exponential window may be used to alter the SNR of a data record more effectively than the rectangular window. This can be effected when the composite signal occupies only a portion of the total record. In this case the window may weight the signal more or less heavily than those portions of the record containing the noise. However, caution should be exercised in echo detection and extraction when the signal (wavelet) of interest occurs near the end of the data record and thus will be greatly reduced by the exponential window.

We wish to emphasize that the comments made in the last three paragraphs have been greatly influenced by our analyses of echo type data. It is our opinion (as well as that of others) that such observations are and will be highly data dependent.

Recently it has been proposed that the exponential window be generalized to include complex exponential weighting, i.e., $\alpha^{nT} e^{j\phi nT}$ [25]. It may at first appear that this phase factor will have no significant effect on the complex cepstrum, i.e., it will introduce only a phase shift. However, the procedure can be used to change the phase relation of the echo (multipath reflection) by 180° [25]. This may make it easier to detect a peak in the cepstrum. The complex exponential factor φ can be varied in a prescribed fashion so that it may be used as a hypothesis tester. Thus trial sweeps of the complex weight can be generated to confirm or deny a priori estimates of the echo delay [25]. It appears that this technique may prove to be a powerful investigative tool to assist the researcher in interpreting his data.

B. The Log Spectrum

One might be motivated to window the log spectrum in order to reduce leakage in the complex cepstrum which could be falsely interpreted as peaks due to echoes. Windowing of the log spectrum will, of course, introduce some loss in time resolution in the cepstrum domain. Then, if the echo contributions can be liftered from the complex cepstrum and if the recovered log spectrum can be corrected (by multiplying by the inverse of the windowing series), we should be able to recover the basic wavelet. Our results have, however, indicated that such windowing of the log spectrum raises the echo detection threshold by around 12 dB and severely distorts the recovered wavelet when additive noise is present [26]. This is apparently due to the fact that windowing the log spectrum is equivalent to smoothing the complex cepstrum. Thus, it appears that windowing the log spectrum may smooth out the very peaks one wishes to detect in the complex cepstrum. The distortion introduced into the recovered wavelet is undoubtedly due to this windowing of the log spectrum (or smoothing of the complex cepstrum).

C. The Complex Cepstrum

Since noise is usually interspersed throughout the data record and the composite signal may occupy only a portion of the record, it seems reasonable that the high quefrency components of the complex cepstrum may frequently contain more noise than signal information. Our results have shown that by judiciously zeroing the high quefrency components of the complex cepstrum we may significantly improve the fidelity of the recovered wavelet in a noisy environment [26]. At low SNR the MSE can be reduced by a factor of 2 by a judicious rectangular windowing of the complex cepstrum. This is essentially short pass liftering [9], [12], [19], [21] in which the aim is not to eliminate the echo peaks (which are generally notch filtered prior to the windowing) but rather to eliminate the high quefrency noise dominated sections of the complex cepstrum. This concurs with the results of [12] in which it is reported that a Hanning smoothing of the log spectrum (which is equivalent to Hanning windowing of the complex cepstrum) improves wavelet recovery. It appears that there is little to choose between the rectangular or Hanning window of the complex cepstrum to improve the fidelity (MSE) of the recovered wavelet. We mention once again that these observations are probably data dependent and are influenced by the duration of the window as well.

D. Sequence Truncation

As mentioned above in subsection A, errors may be introduced by truncating the echo if it extends beyond the end of the record. In addition, aliasing of the echo impulse train in the complex cepstrum may occur. These errors can be reduced appreciably by exponentially windowing the sampled data sequence. But we suspect that if the exponential window is too severe, then the echo may be lost.

VI. DATA PROCESSING-SPEECH, SEISMIC, AND HYDROACOUSTIC

Three application areas which appear to be using cepstrum analysis quite frequently are speech research, seismology, and hydroacoustics. In the three subsections that follow we have tried to itemize the major data processing procedures that investigators in each of these areas tend to use. For some situations these lists may be simplistic but we feel that they are nonetheless indicative of the typical steps considered by some if not all investigators in these areas.

A. Speech [18], [19], [21], [30], [42], [46]-[51], [53], [55], [61]-[65], [66], [71]

Within speech research there are at least three problems to which cepstrum analysis is applied. The first is perhaps the most difficult. This problem seeks to achieve the deconvolution of three signals which form the basis for a model of the speech process. This simple model considers voiced sounds to be produced by quasi-periodic pulses of air which in turn cause the vocal cords to vibrate producing glottal pulses which excite the vocal tract to finally produce speech. For nonnasal sounds the vocal tract is modeled as an all pole filter over short time intervals. The glottal source is modeled with zeros in the z-domain again over short time intervals. The vocalized speech signal is, therefore, modeled as the three fold convolution of an impulse train, the glottal impulse response, and the vocal tract impulse response. These three signals are to be deconvolved. This is difficult to achieve without considerable additional information or assumptions. It is less difficult to achieve the deconvolution of the pulse (impulse) train with the composite convolution of the glottal impulse response and vocal tract impulse response since these two time sequences occupy different quefrency ranges in the cepstrum domain.

Another related problem previously mentioned is to estimate the envelope of the speech spectrum. The speech spectrum is generally quite scalloped, i.e., it looks like an undulating picket fence. The scalloping or spectral modulation is
due to the speaker fundamental frequency (pitch) or periodic pulse train. The pulse train can be liftered from the cepstrum by a shortpass lifter. The inverse process shown in Fig. 1 is then followed to the point where the spectrum is obtained. This yields an estimate for the speech spectrum envelope. However, this estimate is not as good as that obtained by linear prediction (predictive deconvolution) [66], [71]. We illustrate this later with an example.
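
The shortpass-liftering route to a spectral envelope can be sketched as follows. This is a minimal illustration and not the authors' program: the helper names (`dft`, `log_power_spectrum`, `cepstral_envelope`), the rectangular lifter, the cutoff of 20 samples, and the synthetic record (a short decaying response `h` excited by six pulses spaced 50 samples apart) are all assumptions chosen so that the result can be checked against the known log spectrum of `h`. A direct DFT from the standard library stands in for the FFT.

```python
import cmath
import math

def dft(x, sign=-1):
    """Direct DFT (sign=-1) or unscaled inverse DFT (sign=+1)."""
    n = len(x)
    w = [cmath.exp(sign * 2j * cmath.pi * i / n) for i in range(n)]
    return [sum(x[k] * w[(k * m) % n] for k in range(n)) for m in range(n)]

def log_power_spectrum(x, nfft):
    """Log of the squared DFT magnitude of x, zero-padded to nfft points."""
    padded = list(x) + [0.0] * (nfft - len(x))
    return [math.log(max(abs(v) ** 2, 1e-300)) for v in dft(padded)]

def cepstral_envelope(log_power, cutoff):
    """Shortpass-lifter the cepstrum of log_power and return to frequency."""
    nfft = len(log_power)
    ceps = [v.real / nfft for v in dft(log_power, sign=+1)]
    # Rectangular shortpass lifter (an assumption): keep |quefrency| <= cutoff.
    liftered = [c if min(q, nfft - q) <= cutoff else 0.0
                for q, c in enumerate(ceps)]
    return [v.real for v in dft(liftered)]

# Hypothetical "voiced" record: response h repeated every 50 samples
# with pulse amplitudes decaying as 0.9**k.
h = [1.0, 0.6, 0.36, 0.2]
x = [0.0] * 256
for k in range(6):
    for i, hv in enumerate(h):
        x[50 * k + i] += (0.9 ** k) * hv

log_power = log_power_spectrum(x, 512)             # scalloped log spectrum
envelope = cepstral_envelope(log_power, cutoff=20)  # smoothed estimate
true_envelope = log_power_spectrum(h, 512)          # log |H|^2 for comparison
```

The cutoff must lie below the quefrency of the first pulse-train peak (50 samples here); otherwise the scalloping leaks back into the estimate.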
The third common problem is to achieve an estimate of the pitch period or the interval between the pulses in the excitation pulse train. This is easily accomplished by longpass liftering the cepstrum and then following the inverse process outlined in Fig. 1. The pitch period can also be measured directly from the cepstrum by measuring the time interval from the origin to the first peak.
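
Measuring the pitch period directly from the cepstrum can be sketched as below. The sketch makes several assumptions that are not in the text: the cepstrum is taken as the inverse DFT of the log power spectrum (without the final squaring), the search is restricted to a hypothetical 2 to 20 ms quefrency range so that the low-quefrency vocal-tract part is skipped, and the test record is synthetic (a pulse every 50 samples at an assumed 10 kHz rate, i.e., a 5 ms period). The function names are illustrative only.

```python
import cmath
import math

def power_cepstrum(x, nfft):
    """Inverse DFT of the log power spectrum of x zero-padded to nfft points."""
    n = nfft
    w = [cmath.exp(-2j * cmath.pi * i / n) for i in range(n)]
    padded = list(x) + [0.0] * (n - len(x))
    spectrum = [sum(padded[k] * w[(k * m) % n] for k in range(n))
                for m in range(n)]
    log_power = [math.log(max(abs(v) ** 2, 1e-300)) for v in spectrum]
    # log_power is real and even, so its inverse DFT reduces to a cosine sum.
    return [sum(log_power[m] * w[(m * q) % n].real for m in range(n)) / n
            for q in range(n)]

def pitch_period(ceps, fs, min_s=0.002, max_s=0.020):
    """Quefrency (in seconds) of the largest cepstral peak in [min_s, max_s]."""
    lo = int(round(min_s * fs))
    hi = int(round(max_s * fs))
    peak = max(range(lo, hi + 1), key=lambda q: ceps[q])
    return peak / fs

# Hypothetical record: short response h excited every 50 samples.
fs = 10_000
h = [1.0, 0.6, 0.36, 0.2]
x = [0.0] * 256
for k in range(6):
    for i, hv in enumerate(h):
        x[50 * k + i] += (0.9 ** k) * hv

period = pitch_period(power_cepstrum(x, 512), fs)
```

On real speech the record would normally be windowed first and the search limits adapted to the expected pitch range of the speaker.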
Very recently it has been suggested that homomorphic filtering in the form of cepstrum analysis be combined with linear prediction to effect a useful pole-zero modeling and inverse filtering procedure for mixed phase signals [18], [42], [57]. This procedure has been applied to speech [42] and seismic data [57]. It is sure to see increased application within the near future.
Fig. 4. (a) Speech signal. (b) The corresponding cepstrum. [Plots not reproduced; horizontal axes: time (ms).]

The specific information and procedures commonly used in processing speech data are the following:
1) The speech record is usually windowed with a Hanning or Hamming window. The total window duration is often on the order of three pitch periods or less, i.e., approximately 24 ms. This will, of course, vary with the speech signal being analyzed and the individual investigator.
2) Zeros may be appended to the windowed speech signal to increase its effective record length; sometimes this is on the order of a factor of 10 but is usually less. A typical sampling rate is 10 kHz.
3) The speech bandpass is typically in the range of 50 Hz to 4 or 5 kHz with resonances appearing in the spectrum at the natural frequencies (formants) of the vocal tract. While the location of these formants varies with the phonation or articulation, a great deal is known about their typical location and bandwidth. This information helps considerably in the analysis of speech data.
4) Aliasing is not a serious problem in the analysis of speech data.
5) The composite convolution of the glottal impulse response and the vocal tract impulse response is generally less than 5 ms in the cepstrum domain. The first peak in the cepstrum due to the pulse train is in the vicinity of 8 ms. This information is used to design the shortpass and longpass lifters.
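
As an illustration of how such quefrency figures translate into lifter weights, the following sketch builds a shortpass lifter that is flat up to a pass quefrency, rolls off with a raised cosine, and is zero beyond a stop quefrency, together with the complementary longpass lifter. The function names, the choice of a complementary longpass, and the values used (1.5 ms and 2.5 ms at a 10 kHz rate, 512 cepstral points) are assumptions for illustration; the weights are mirrored about the origin because the cepstrum of a real record computed with a DFT occupies both ends of the array.

```python
import math

def shortpass_lifter(nfft, fs, pass_ms, stop_ms):
    """Weights: 1 up to pass_ms, raised-cosine roll-off, 0 beyond stop_ms."""
    q_pass = pass_ms * fs / 1000.0   # pass edge in samples
    q_stop = stop_ms * fs / 1000.0   # stop edge in samples
    weights = []
    for n in range(nfft):
        q = min(n, nfft - n)         # distance from the origin (both ends)
        if q <= q_pass:
            w = 1.0
        elif q >= q_stop:
            w = 0.0
        else:
            w = 0.5 * (1.0 + math.cos(math.pi * (q - q_pass) / (q_stop - q_pass)))
        weights.append(w)
    return weights

def longpass_lifter(nfft, fs, pass_ms, stop_ms):
    """Complement of the shortpass lifter (keeps the high quefrencies)."""
    return [1.0 - w for w in shortpass_lifter(nfft, fs, pass_ms, stop_ms)]

# Hypothetical settings: 512-point cepstrum of data sampled at 10 kHz.
short = shortpass_lifter(512, 10_000, pass_ms=1.5, stop_ms=2.5)
long_ = longpass_lifter(512, 10_000, pass_ms=1.5, stop_ms=2.5)
```

Multiplying the cepstrum by `short` point by point retains the vocal-tract contribution, while `long_` retains the pitch peaks; the pass and stop quefrencies would be chosen between the extent of the vocal-tract part and the first pitch peak.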
Example-Speech: In this simple example we compare the speech spectral envelopes obtained by both linear prediction and cepstrum processing.

The steps followed for cepstrum processing are those shown in Fig. 1(a) [or 1(b)]. The resultant envelope is obtained at step (6). Included in the complex logarithm step is the calculation of the squared magnitude of the DFT. All DFT calculations are performed by an FFT algorithm.

The speech record analyzed appears in Fig. 4(a) and was sampled at 10 kHz to give 256 data points with a 512 point FFT being used. The phonation was a sustained /i/. The record was then Hanning windowed (256 points) prior to cepstrum analysis. No zeros were appended for this example.

Fig. 5. (a) Speech spectrum of Fig. 4(a) and envelope derived by liftering the cepstrum in Fig. 4(b) (the superimposition is less than perfect). (b) Speech spectrum of Fig. 4(a) and envelope derived by linear prediction. [Plots not reproduced; horizontal axes: frequency (kHz).]

The "cepstrum" (inverse DFT of $\log|X|^2$) is shown in Fig. 4(b). The peak at 6.7 ms corresponds to the pitch period which can also be estimated from Fig. 4(a). Next, shortpass liftering was applied. The lifter was constant from zero to 1.5 ms and then
had a cosine taper from 1.5 ms to 2.5 ms and was zero beyond 2.5 ms. The final step was a forward FFT to yield the speech spectrum envelope shown in Fig. 5(a) which is superimposed albeit imperfectly upon the power spectrum of windowed speech signal. The lifter selected gave the "best" spectral envelope fit in the opinion of the authors. However, the results are sensitive to the type of lifter used.

The peaks shown correspond to the formants, but it is possible in some cases for the cepstrum procedure described to yield false formants in between the actual formants.

The linear prediction procedure calculates the coefficients for an all pole filter

$$\frac{1}{1 - \sum_{k=1}^{p} a_k z^{-k}}$$

from the windowed speech signal by an autocorrelation technique [71]. For this example the windowed data was 20 ms with p = 16. The envelope of the spectrum was determined by finding the FFT of the sequence $\{1, -a_1, -a_2, \ldots, -a_p\}$ and then calculating the reciprocal of the FFT. Zeros were appended prior to calculating the 512 point FFT. The results appear in Fig. 5(b) superimposed upon the speech power spectrum. It will be noted that linear prediction provides a "better" spectral envelope and yields smaller formant bandwidths than cepstrum processing.

B. Seismic Data Processing [1], [18], [34], [36], [37], [57]-[59], [80]

For more than a decade cepstrum analysis has been applied to seismic data: 1) to determine the focal depth of a seismic event; 2) to remove spectral modulations caused by multipath reflection; and 3) to determine the slapdown phase which results from spallation of the earth's surface near ground zero, and other situations. Knowledge of the depth of the seismic event can be used to help discriminate whether the event is an earthquake or a man-made explosion. The elimination of multipath reflections assists the data interpreter in the determination of source (event) parameters. Similar remarks apply to spalling as well.

Usually, the power cepstrum, or a variation thereof (as discussed earlier), is used to estimate the (P - pP) time difference (see Fig. 6) which is the most realistic indicator of the depth of the event. The complex cepstrum is used to separate (deconvolve) the basic P phase wavelet from the impulse train caused by the echo or echoes. Epoch timing information can on occasion be obtained by longpass liftering the complex cepstrum. But this may require that the event be deep so that the P phase cepstral information and the echo information are adequately separated in quefrency.

Fig. 6. Simple representation of earth and two seismic waves resulting from a seismic event. The P wave is the direct, longitudinal wave. The pP wave is the single reflected wave. Neither passes through the inner or outer cores, but rather propagate through the mantle. [Diagram not reproduced; the source point is labeled EVENT.]

Good success in processing seismic data is, however, apparently dependent upon a good SNR and wide bandwidth data [34].

For many years predictive deconvolution (or linear prediction) has been used in the analysis of seismic data [66]-[86], (see [80] for a review). This procedure is parametric and proposes a model for the basic wavelet. As such it does not work particularly well if the wavelet to be removed is mixed phase. Cepstrum analysis (homomorphic deconvolution) is a more general method for deconvolution, and is effective when the cepstra of the signals to be deconvolved occupy different quefrency ranges. A method, called homomorphic prediction, has just been proposed which combines linear prediction and homomorphic deconvolution to more effectively analyze mixed phase data [18], [42], [57].

The specific information and procedures commonly used in processing seismic data are the following:
1) A short time series window (usually starting about 1 s before the onset of the signal and lasting from 3 to 7 s into the data record) is used. Some form of tapering is almost always applied, e.g., for a 7 s window, a linear taper may be applied for 1 s before the signal onset as well as to the last second of the record with the window being constant for the 5 s in between the two tapered ends. Longer windows are not generally used since these would include more of the coda (i.e., tails of the data) which contain too many multipath reflection signals.
2) A weighting of the time series may also be used, independent of whether a window is applied. This weighting is generally in the form $\alpha^{nT}$, where 0.96 < α < 1.0. This procedure tends to make the P phase more minimum phase.
3) Zeros are appended to the time series to increase its length, sometimes by as much as a factor of 10. A typical sampling rate is 20 samples per s. The windowed data record may then be 5 s to give 100 data samples. This record is then extended with zeros to a duration of 1024 samples.
4) The seismic bandpass for body waves is generally considered to be in the range 0.1 to 2-5 Hz depending on various factors [36]. Echo delay (epoch) times for seismic events are in the range $0.1 \le n_0 T \le$ several seconds. The lower end of this range requires 10 Hz bandwidth which is not always available. At 20 samples per s, the data is oversampled so aliasing is of no concern.
5) Since seismic data has a narrow passband, the spectrum is nonwhite. Anelastic absorption by the earth of teleseismic signals above 2 Hz means that the cepstrum is dominated by the bandpass characteristic of the earth [36]. This effect is aggravated by the frequency response of the sensing instrument, but this can be corrected by inverse filtering [36]. Pseudo-heterodyning, a translation in quefrency (a method analogous to conventional heterodyning), may be helpful for such data [9(a)].
6) Cepstra are also calculated using the coda and then averaged, taking into account the predictable travel time differences. Spectra are also similarly averaged to enhance spectral nulls [36], [37].

Example-Seismic Data: This example may be considered a simulation of seismic data with the echo being negative at the air-earth interface. In Fig. 7 we present four groupings of three graphs each. The first graph in the first group is the
[Figure not reproduced. Title: NOTCH FILTER EXAMPLE, COMPLEX CEPSTRUM NOTCH FILTER. Panel labels: INPUT TIME SERIES, RECOVERED WAVELET, WAVELET CEPSTRUM, INPUT AMPLITUDE SPEC., WAVELET AMP. SPEC., ECHO AMPLITUDE SPEC., INPUT LOG SPECTRUM (with NULLS marked).]

Fig. 7. Simulated seismic data example. The normalized amplitude of the wavelet plus a negative echo is 1000 units. The echo occurs at 0.5 s and is 0.6 that of the wavelet. An exponential window (α = 0.99) was applied to the composite signal prior to calculating the cepstrum. The four groupings of graphs show the time domain data and the corresponding cepstra, spectra and log spectra, respectively.

composite signal, i.e., the wavelet plus the negative echo. The overall normalized peak amplitude of the composite signal is 1000 units. The echo occurs 0.5 s after the onset of the basic wavelet. The echo amplitude is 0.6 that of the original wavelet. The second graph in this group is the wavelet recovered by homomorphic filtering, i.e., notch liftering (rejecting) only the first peak in the cepstrum. The third graph is an estimate of the echo obtained by calculating the difference between the first and second graphs. At the top of this figure are the time

notch liftering. Here it can be seen that the spectral nulls have not been completely removed. And finally we have the log spectrum of the estimate of the echo wavelet.

C. Hydroacoustic Data Processing [18], [31], [35], [38], [45], [57]

The power cepstrum has been used to estimate the source depth of a known explosive charge by analyzing the data received at long ranges. This is accomplished by measuring the time period of the bubble pulse modulation on the spectrum which results in a peak in the power cepstrum [45].

The cepstrum has also been used to investigate multipath conditions in shallow water as well [31], [35], [38]. But the echoes are not all identical in waveshape as commonly assumed. The cepstrum is also apparently affected to a considerable degree by fluctuations in the transmission media, bottom reflections, and surface scattering [38].

As mentioned under the subsections on speech and seismic data processing, it has been proposed that linear prediction (predictive deconvolution) and homomorphic deconvolution be combined to more effectively analyze mixed phase signals [18], [42], [57]. This suggestion has been tested on marine seismic data with apparently good success [57].

The specific information and procedures commonly used in processing hydroacoustic data are the following:
1) The sampling rates used are dependent on specific applications as well as the computational resources available, but typically they are in the range of 100 to 1000 samples per s.
2) The effects of windowing are not considered to be as important as for seismic data. Windows commonly used include the rectangular (boxcar), Hamming, Hanning (cosine), and linear taper.
3) Rather than extend the data record with zeros, a longer time window is frequently used. It is commonly assumed that hydroacoustic data is stationary, thus time averaging can be employed. Typical window durations are 1024 or 2048 samples, or longer depending on the computational capacity available.
4) The spectrum is whitened prior to computing the power cepstrum. The whitening may be achieved with either the square root or the logarithmic operation.
5) The spectrum is often highpass liftered to remove low-quefrency components prior to computing the power cepstrum. This is a form of trend removal to reduce leakage in the cepstrum domain.
6) The expected echo delay (epoch) times are $0.02 \le n_0 T \le 2$ or 3 s. The bandwidth is at least 50 Hz.
scales for both the input data and the cepstrum. The spectrum
frequency scale appears at the bottom of the figure.
An exponentialwindow (a=0.99) was applied to the VII. CONCLUDINGREMARKS
composite signal prior to calculating the cepstrum. This made
the cepstrum of the composite signal nearly minimum phase. A . The Effects of Noise
The second grouping of three graphs presents the cepstra for The effects of noise are discussed in a qualitative way at
the previousthree waveforms respectively. The f m t two various points throughout the paper along with two recom-
points at the extreme left in each cepstra were zeroed. The mendedproceduresforalleviatingtheseeffectsfor wavelet
vertical line at 0.5 s shows the peak that was notch liftered. extraction and echo detection, namely, windowing the com-
Ideally, the second and third cepstra shouldbe identical when plex cepstrum and reducing errors due to aliasing and phase
normalized, but due to the fact that only one cepstrum peak unwrapping by appending zeros to the sampled data sequence.
was notch Liftered, these waveforms are in fact different. Noise analysis is presented in a more quantitative and extensive
The third grouping of three waveforms presents the respec- manner in [2], 171, [91,1121, [26], 1301. We point out in
tive amplitude spectra. Finally, we havein the last grouping particular that it has been recently shown that S.NR alone is an
the respective log spectra. The spectral nulls due to the echo insufficientmeasurefordeterminingcepstrumperformance
can barely be seen due to the narrow bandwidth simulated in and that the relative bandwidths of the signal and noise are
the first graph. The secondgraph is the log spectrum after also needed [ 71.
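As a rough illustration of why appending zeros reduces aliasing in the cepstrum, the following Python sketch (not taken from the paper) computes the power cepstrum of a synthetic wavelet plus a negative echo with and without zero extension, and optionally applies an exponential data window. It assumes NumPy; the damped-sinusoid wavelet, the 200 Hz sampling rate, the FFT length of 2048, and the names `power_cepstrum` and `echo_delay_samples` are hypothetical choices made for illustration. The echo parameters (relative amplitude 0.6, delay 0.5 s, a = 0.99) are borrowed from the Fig. 7 simulation.

```python
import numpy as np


def power_cepstrum(x, n_fft, a=1.0):
    """Square of the inverse FFT of log|X|^2, with optional weighting a**n."""
    n = np.arange(len(x))
    weighted = x * a ** n                      # exponential data window (a = 1: none)
    spectrum = np.fft.rfft(weighted, n=n_fft)  # zero-extends when n_fft > len(x)
    log_power = np.log(np.abs(spectrum) ** 2)
    return np.fft.irfft(log_power, n=n_fft) ** 2


def echo_delay_samples(cepstrum, min_quefrency):
    """Index of the largest peak above min_quefrency (first half only)."""
    half = len(cepstrum) // 2
    return min_quefrency + int(np.argmax(cepstrum[min_quefrency:half]))


fs = 200.0                        # assumed sampling rate, Hz
delay = int(round(0.5 * fs))      # echo delay of 0.5 s -> 100 samples

# Hypothetical wavelet: a damped sinusoid, 256 samples long.
k = np.arange(256)
wavelet = 0.9 ** k * np.sin(0.2 * np.pi * k)

# Composite signal: wavelet plus a negative echo of relative amplitude 0.6.
composite = np.zeros(len(wavelet) + delay)
composite[:len(wavelet)] += wavelet
composite[delay:] -= 0.6 * wavelet

n_pad = 2048
c_raw = power_cepstrum(composite, len(composite))  # no zeros appended
c_pad = power_cepstrum(composite, n_pad)           # zeros appended
c_exp = power_cepstrum(composite, n_pad, a=0.99)   # zeros + exponential window

min_q = int(0.1 * fs)             # ignore the low-quefrency wavelet region
peak_index = echo_delay_samples(c_pad, min_q)
peak_index_exp = echo_delay_samples(c_exp, min_q)
echo_time = peak_index / fs

print("estimated echo delay (s):", echo_time)
print("cepstrum at quefrency index 44, no zeros:", c_raw[44])
print("cepstrum at quefrency index 44, zeros appended:", c_pad[44])
```

In this setup the echo contributes cepstral peaks at multiples of 100 samples. Without appended zeros the transform length is 356, so the peak at 400 samples wraps around to index 44; extending the record to 2048 points moves that alias out of the way. The exponential window lowers the echo peak but should leave its location unchanged.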
B. Summary

We have attempted to provide the reader with a unified approach to the power, complex, and phase cepstra, namely if $\hat{x}(nT)$ denotes the complex cepstrum, then the power cepstrum is $x_{pc}(nT) = (\hat{x}(nT) + \hat{x}(-nT))^2$ and the phase cepstrum is $x_p(nT) = (\hat{x}(nT) - \hat{x}(-nT))^2$.

The problems associated with phase unwrapping, linear phase components, and spectrum notching have been described along with those of aliasing and oversampling. The extension of the sampled data sequence by appending zeros was shown to provide computational benefits, namely, the reduction of aliasing in the cepstrum and the reduction of phase unwrapping errors.

As a recapitulation we offer the following comments relative to the procedures followed by investigators in the various fields at each step in the cepstrum process shown in Fig. 1. First, the data are usually windowed or weighted in some manner. But the type of window used is data dependent. This is true even for echo type data. Here the exponential window may be useful or no window at all (i.e., the rectangular window). For speech data the Hamming or Hanning windows are frequently used. After windowing zeros are usually appended to extend the data record. The first DFT is then performed with an FFT algorithm. The magnitude (or magnitude squared) of the DFT is then usually found, even if the complex cepstrum is to be calculated. At this point a number of procedures may be followed. The square root of the magnitude of the DFT may be calculated in order to whiten the spectrum. The logarithm performs a similar function but may be considered too severe a whitener. Trend removal may also be performed on the spectrum to prevent leakage in the cepstrum. Phase unwrapping is also done during this stage if the complex cepstrum is to be calculated. It appears that windowing the log spectrum should not be performed for the reasons already given. And apparently zeros are rarely appended to the log spectrum. At the next step the forward or inverse FFT may be calculated. The forward FFT is usually calculated if the power cepstrum is desired. The inverse FFT of the liftered (usually long pass) power spectrum gives the pseudo-autocorrelation which contains sign information concerning echo reflections. The inverse FFT of the log magnitude spectrum is also useful for obtaining estimates of the spectrum envelope. If the inverse FFT of the log magnitude spectrum is squared, then the power cepstrum is obtained. The inverse FFT of the whitened spectrum is also frequently found as well. The complex cepstrum is the inverse FFT of the complex logarithmic spectrum, i.e., keeping track of phase. The “cepstrum,” i.e., any of the above forms, is then either shortpass, longpass, or notch (comb) liftered. Included in this liftering operation may be a procedure to simply zero the cepstrum at various points or the cepstrum may be windowed. After liftering the inverse steps are usually followed in a conventional manner. The investigator may terminate the process at any step depending upon his or her goal.

The complex cepstrum is basically a method for deconvolving a train of impulses from a basic wavelet. Thus the form of the wavelet or the echo impulse train or both can be recovered. But this procedure clearly has its limitations. This is why we have included a limited list of references to the literature on linear prediction, predictive deconvolution, inverse filtering, and general deconvolution [66]-[80] including some books on computational seismology [81]-[86]. For those who may wish to learn more about homomorphic systems and the cepstrum including applications we recommend [61]-[65] in addition to the specific papers already cited. Aspects of real-time computation of the complex cepstrum are discussed in [26], [28]. And those interested in homomorphic prediction to process mixed phase signals should consult the recent work in [18], [42], [57].

We wish to point out that the principles of homomorphic deconvolution have recently been applied to effect a transformation of the independent variable t rather than to effect a deconvolution of two signals [60]. This results in a new type of nonlinear filter which because of its signal dependency can filter out superimposed noise on a signal, leaving large peaks of the signal unattenuated. This filter is signal dependent; it is therefore apparently not a true homomorphic technique. The filter can be realized in real time. One of the authors (DGC) has successfully simulated these results on a large computer.

We anticipate that there will be other interesting results in the future and we hope that this paper will both stimulate and assist others to this end.

REFERENCES

Theory-Homomorphic Systems and Cepstra (with Applications)

[1] B. P. Bogert, M. J. Healy, and J. W. Tukey, “The quefrency alanysis of time series for echoes: cepstrum, pseudo-autocovariance, cross-cepstrum, and saphe cracking,” in Time Series Analysis, M. Rosenblatt, Ed. New York: Wiley, 1963, Chap. 15, pp. 209-243.
[2] B. P. Bogert and J. F. Ossanna, “The heuristics of cepstrum analysis of a stationary complex echoed Gaussian signal in stationary Gaussian noise,” IEEE Trans. Inform. Theory, vol. IT-12, pp. 373-380, July 1966.
[3] J. F. Bohme, “The cepstrum as a generalized function,” IEEE Trans. Inform. Theory, vol. IT-20, pp. 650-653, Sept. 1974.
[4] D. Childers, “Composite signal decomposition techniques,” in Int. Conf. on Comm. (ICC), Seattle, WA, June 11-13, 1973, pp. 1-6.
[5] D. E. Dudgeon, “Existence of cepstra for two-dimensional, rational polynomials,” IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-23, pp. 242-243, Apr. 1975.
[6] J. C. Hassab, “On the convergence interval of the power cepstrum,” IEEE Trans. Inform. Theory, vol. IT-20, pp. 111-112, Jan. 1974.
[7] J. C. Hassab and R. Boucher, “A probabilistic analysis of time delay extraction by the cepstrum in stationary Gaussian noise,” IEEE Trans. Inform. Theory, vol. IT-22, pp. 444-454, July 1976.
[8] -, “Analysis of signal extraction, echo detection and removal by complex cepstrum,” J. Sound and Vibration, vol. 40, pp. 321-335, June 1975.
[9] R. C. Kemerait, “Signal detection and extraction by cepstrum techniques,” Ph.D. Dissertation, University of Florida, 1971.
[9a] -, “Pseudo-heterodyning in the cepstral domain,” in 8th Ann. Southeastern Symp. System Theory, Apr. 26-27, 1976, pp. 37-41.
[10] R. C. Kemerait and L. Balceda, “Signal detection and extraction by weighted cepstrum techniques,” in 1976 Southeastcon, Clemson University, IEEE Catalog No. 76 CH1059-5 Reg. 3, pp. 3B-1-3B-3, 1976.
[11] R. Kemerait and D. G. Childers, “Composite signal decomposition by cepstrum techniques,” in IEEE Region 3 Convention Record, pp. K1-1-K1-4, 1972.
[12] -, “Signal detection and extraction by cepstrum techniques,” IEEE Trans. on Inform. Theory, vol. IT-18, pp. 745-759, Nov. 1972.
[13] A. Noll, “The cepstrum and some close relatives,” in Signal Processing, J. W. R. Griffiths, P. L. Stocklin, and C. Van Schooneveld, Eds. London: Academic Press, 1973, pp. 11-22.
[14] A. Oppenheim, “Superposition in a class of nonlinear systems,” M.I.T. Res. Lab. of Electronics, Cambridge, MA, Tech. Rep. 432 (Ph.D. dissertation), Mar. 31, 1965.
[15] -, “Optimum homomorphic filters,” M.I.T. Res. Lab. of Electronics, Quarterly Progress Rep. no. 77, vol. XIII, Statistical Communication Theory, AD 615324, Apr. 15, 1965, pp. 248-260.
[16] -, “Nonlinear filtering of convolved signals,” M.I.T. Res. Lab. of Electronics, Quarterly Progress Rep. no. 80, 1966.
[17] -, “Generalized superposition,” Inform. Contr., vol. 11, pp. 528-536, Nov.-Dec. 1967.
[18] A. V. Oppenheim, G. E. Kopec, and J. M. Tribolet, “Signal analysis by homomorphic prediction,” IEEE Trans. Acoust., Speech, and Signal Processing, vol. ASSP-24, pp. 327-332, Aug. 1976.
[19] A. V. Oppenheim, R. W. Schafer, and T. G. Stockham, Jr., “Nonlinear filtering of multiplied and convolved signals,” Proc. IEEE, vol. 56, pp. 1264-1291, Aug. 1968.
[20] R. Rom, “On the cepstrum of two-dimensional functions,” IEEE Trans. on Inform. Theory, pp. 214-217, Mar. 1975.
[21] R. W. Schafer, “Echo removal by discrete generalized linear filtering,” Ph.D. Dissertation, M.I.T., Cambridge, MA, 1968.
[22] S. Senmoto, “Adaptive decomposition of composite signals in noise,” Ph.D. Dissertation, University of Florida, Gainesville, FL, 1971.
[23] S. Senmoto and D. G. Childers, “Analysis of a composite signal by complex cepstrum and adaptive filter,” Trans. Inst. Elec. Comm. Engrs. (Japan), pt. A, pp. 9-16, 1972.
[24] -, “Adaptive decomposition of a composite signal of identical unknown wavelets in noise,” IEEE Trans. on Syst., Man, and Cybern., vol. SMC-2, pp. 59-66, Jan. 1972.
[25] M. J. Shensa, “Complex exponential weighting applied to homomorphic deconvolution,” Geophys. J. Roy. Astron. Soc., vol. 44, pp. 379-387, 1976.
[26] D. P. Skinner, “Real-time composite signal decomposition,” Ph.D. Dissertation, University of Florida, Gainesville, FL, 1974.
[27] D. P. Skinner and D. G. Childers, “The power, complex, and phase cepstra,” presented at 1975 Proc. Nat. Telecommun. Conf., Dec. 1975.
[28] -, “Real-time composite signal decomposition,” IEEE Trans. on Acoust., Speech, and Signal Processing, vol. ASSP-24, pp. 267-270, June 1976.
[29] R. G. Smith, “Cepstrum discrimination function,” IEEE Trans. Inform. Theory, vol. IT-21, pp. 332-334, May 1975.
[30] T. G. Stockham, Jr., T. M. Cannon, R. B. Ingebretsen, “Blind deconvolution through digital signal processing,” Proc. IEEE, vol. 63, pp. 678-692, Apr. 1975. (Refer to for other related reports and Master's theses.)
[31] P. L. Stoffa, P. Buhl, and G. M. Bryan, “The application of homomorphic deconvolution to shallow-water marine seismology-Part I: Models,” Geophysics, vol. 39, no. 4, pp. 401-416, Aug. 1974.
[32] -, “Cepstrum aliasing and the calculation of the Hilbert transform,” Geophysics, vol. 39, no. 4, pp. 543-544, Aug. 1974.

Theory-Related Material

[33] A. J. Berkhout, “On the minimum phase criterion of sampled signals,” IEEE Trans. Geosci. Electron., vol. GE-11, pp. 186-198, Oct. 1973.

Applications-Homomorphic Filtering and Cepstra (with Theory)

[34] W. H. Bakun and L. R. Johnson, “The deconvolution of teleseismic P waves from explosions Milrow and Cannikin,” Geophys. J. Roy. Astron. Soc., vol. 34, pp. 321-342, 1973.
[35] P. Buhl, P. L. Stoffa, and G. M. Bryan, “The application of homomorphic deconvolution to shallow-water marine seismology-Part II: Real data,” Geophysics, vol. 39, no. 4, pp. 417-426, Aug. 1974.
[36] T. Cohen, “Source depth determinations using spectral, pseudo-autocorrelation and cepstral analysis,” Geophys. J. Roy. Astron. Soc., vol. 20, pp. 223-231, 1970.
[37] T. J. Cohen, “Ps and pP phases from seven Pahute Mesa events,” Bull. Seismological Soc. Amer., vol. 65, no. 4, pp. 1029-1032, Aug. 1975.
[38] P. O. Fjell, “Use of the cepstrum method for arrival times extraction of overlapping signals due to multipath conditions in shallow water,” J. Acoust. Soc. Amer., vol. 59, no. 1, pp. 209-211, Jan. 1976.
[39] O. S. Halpeny, “Epoch detection by cepstrum analysis,” Master's Thesis, University of Florida, Gainesville, FL, 1970.
[40] R. Kemerait and D. G. Childers, “Detection of multiple echoes immersed in noise,” in Proc. 15th Midwest Symp. Circuit Theory, May 4-5, 1972, pp. 1-10.
[41] -, “Decomposition of pulse-type data by cepstrum techniques,” 1973 IEEE Southeast Conf., Apr. 30, May 1-2, 1973, pp. F-4-1-F-4-4.
[42] G. E. Kopec, A. V. Oppenheim, and J. M. Tribolet, “Speech analysis by homomorphic prediction,” submitted to IEEE Trans. Acoust., Speech and Signal Processing. (This paper is apparently an earlier version of [18].)
[43] L. R. LeBlanc, “Narrow-band sampled data techniques for detection via the underwater acoustic communication channel,” IEEE Trans. Commun. Technol., vol. COM-17, pp. 481-488, Aug. 1969.
[44] J. H. Miles, G. H. Stevens, and G. G. Leininger, “Analysis and correction of ground reflection effects in measured narrowband sound spectra using cepstral techniques,” presented at 90th Meeting of Acoust. Soc. Amer., Nov. 4-7, 1975.
[45] S. K. Mitchell and N. R. Bedford, “Long range sensing of explosive source depths using cepstrum,” paper presented at 90th Meeting Acoust. Soc. Amer., Nov. 1975.
[46] A. Noll, “Short-time spectrum and ‘cepstrum’ techniques for vocal-pitch detection,” J. Acoust. Soc. Amer., vol. 36, pp. 296-302, Feb. 1964.
[47] -, “Cepstrum pitch determination,” J. Acoust. Soc. Amer., vol. 41, no. 2, pp. 293-309, Feb. 1967.
[48] -, “Clipstrum pitch determination,” J. Acoust. Soc. Amer., vol. 44, no. 6, pp. 1585-1591, Dec. 1968.
[49] -, “Pitch determination of human speech by the harmonic product spectrum, the harmonic sum spectrum, and a maximum likelihood estimate,” in Computer Processing in Communications, J. Fox, Ed. Brooklyn, NY: Polytechnic, 1970, pp. 779-797.
[50] A. Oppenheim, “Speech analysis-synthesis system based on homomorphic filtering,” J. Acoust. Soc. Amer., vol. 45, no. 2, pp. 458-465, Feb. 1969.
[51] A. Oppenheim and R. W. Schafer, “Homomorphic analysis of speech,” IEEE Trans. Audio Electroacoust., vol. AU-16, pp. 221-226, June 1968.
[52] J. Prabhakar and S. C. Gupta, “Separation of Rayleigh and Poisson density functions through homomorphic filtering,” Nat. Elec. Conf., pp. 605-610, Dec. 1970.
[53] R. Schafer and L. R. Rabiner, “System for automatic formant analysis of voiced speech,” J. Acoust. Soc. Amer., vol. 47, (Pt. 2), pp. 634-648, Feb. 1970.
[54] S. Senmoto and D. G. Childers, “Decomposition of a composite signal of unknown wavelets in noise,” in Int. Conf. Comm. (ICC), 71C 28-COM, Montreal, Canada, 1971, pp. 5-14-5-19.
[55] J. Tenold, D. H. Crowell, R. H. Jones, T. H. Daniel, D. F. McPherson, A. N. Popper, “Cepstral and stationarity analysis of full-term and premature infants' cries,” J. Acoust. Soc. Amer., vol. 56, no. 3, pp. 975-980, Sept. 1974.
[56] J. M. Tribolet, “A new phase unwrapping algorithm,” submitted to IEEE Trans. Acoust., Speech, and Signal Processing.
[57] J. M. Tribolet, A. V. Oppenheim, and G. E. Kopec, “Deconvolution by homomorphic prediction,” submitted to Geophysics.
[58] T. Ulrych, “Application of homomorphic deconvolution to seismology,” Geophysics, vol. 36, no. 4, pp. 650-660, Aug. 1971.
[59] T. Ulrych, O. G. Jensen, R. M. Ellis, and P. G. Sommerville, “Homomorphic deconvolution of some teleseismic events,” Bull. Seismological Soc. Amer., vol. 62, no. 5, pp. 1253-1265, Mar. 1972.

Extensions

[60] D. I. H. Moore and D. J. Parker, “On nonlinear filters involving transformation of the time variable,” IEEE Trans. Inform. Theory, vol. IT-19, pp. 415-422, July 1973.

Books-Homomorphic Systems, Cepstrum Theory, and Applications

[61] B. Gold and C. M. Rader, Digital Processing of Signals. New York: McGraw-Hill, 1969.
[62] L. R. Rabiner and B. Gold, Theory and Application of Digital Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1975.
[63] A. V. Oppenheim and R. W. Schafer, Digital Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1975.
[64] D. G. Childers and A. E. Durling, Digital Filtering and Signal Processing. St. Paul, MN: West Publishing, 1975.
[65] J. L. Flanagan, Speech Analysis, Synthesis, and Perception, 2nd Ed. New York: Springer-Verlag, 1972.

Linear Prediction, Predictive Deconvolution, Inverse Filtering, and Deconvolution

[66] B. S. Atal, “Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification,” J. Acoust. Soc. Amer., vol. 55, pp. 1304-1312, June 1974.
[67] D. G. Childers, R. S. Varga, and N. W. Perry, Jr., “Composite signal decomposition,” IEEE Trans. Audio Electroacoust., vol. AU-18, pp. 471-477, Dec. 1970.
[68] M. P. Ekstrom, “A spectral characterization of the ill-conditioning in numerical deconvolution,” IEEE Trans. Audio Electroacoust., vol. AU-21, pp. 344-348, Aug. 1973.
[69] B. R. Hunt, “A theorem on the difficulty of numerical deconvolution,” IEEE Trans. Audio Electroacoust., vol. AU-20, pp. 94-95, Mar. 1972.
[70] -, “Deconvolution of linear systems by constrained regression and its relationship to the Wiener theory,” IEEE Trans. Automat. Contr., vol. AC-17, pp. 703-705, Oct. 1972.
[71] J. Makhoul, “Linear prediction: a tutorial review,” Proc. IEEE, vol. 63, pp. 561-580, Apr. 1975.
[72] B. Mitchell, M. Landisman, and Z. A. Der, “Predictive deconvolution applied to long range seismic refraction observations,” Pure Appl. Geophysics, vol. 96, pp. 127-133, 1972.
[73] K. L. Peacock and S. Treitel, “Predictive deconvolution: theory and practice,” Geophysics, vol. 34, pp. 155-169, Apr. 1969.
[74] E. A. Robinson, “Predictive decomposition of seismic traces,” Geophysics, vol. 22, pp. 767-778, 1957.
[75] -, “Properties of the Wold decomposition of stationary stochastic processes,” Theory Prob. Appl., vol. 8, pp. 187-194, 1963.
[76] -, “Mathematical development of discrete filters for the detection of nuclear explosions,” J. Geophysics Res., vol. 68, pp. 5559-5567, 1963.
[77] -, “Multichannel z-transforms and minimum delay,” Geophysics, vol. 31, pp. 482-500, June 1966.
[78] -, “Predictive decomposition of time series with application to seismic exploration,” Geophysics, vol. 32, pp. 418-484, June 1967.
[79] J. Schell, “Dereverberation by linear systems techniques,” IEEE Trans. Geosci. Electron., vol. GE-9, no. 1, pp. 28-34, Jan. 1971.
[80] L. C. Wood and S. Treitel, “Seismic signal processing,” Proc. IEEE, vol. 63, pp. 649-661, Apr. 1975.

Books-Computational Seismology

[81] M. Bath, Mathematical Aspects of Seismology. Amsterdam: Elsevier, 1968.
[82] Computational Seismology, V. I. Keilis-Borok, Ed. New York: Consultants Bureau, 1972.
[83] E. A. Robinson, Multichannel Time Series Analysis with Digital Computer Programs. San Francisco: Holden-Day, 1967.
[84] -, Statistical Communication and Detection with Special Reference to Digital Data Processing of Radar and Seismic Signals. New York: Hafner, 1967.
[85] E. A. Robinson and S. Treitel, Robinson-Treitel Reader, 3rd Ed. Tulsa, OK: Seismographic Service Corp., 1973.
[86] Seismic Filtering, R. Van Nostrand, Ed. Tulsa, OK: Society of Exploration Geophysicists, 1971.
FILSYN-A General Purpose Filter Synthesis Program

GEORGE SZENTIRMAI, FELLOW, IEEE

Abstract-A very general computer program is described that can be used for the synthesis of passive LC, active RC, and (infinite impulse response) digital filters. Although it operates in both batch and interactive modes, this discussion deals exclusively with the interactive mode, which is somewhat more general and very easy to use. Apart from offering superior accuracy and flexibility, this program offers many features, including the passive realization of complex quadruplets of transmission zeros, the simultaneous realization of two transmission zero pairs, the active RC leapfrog realization, and many others.

Manuscript received April 30, 1976; revised March 30, 1977. This work was supported in part by The National Science Foundation under Grant GK-38557.
The author is with the Electronics Research Center of Rockwell International, Anaheim, CA.

I. INTRODUCTION

VERY CLOSE to ten years ago the author had reported on a computer program package that was developed for the synthesis of passive LC ladder filters [1]. This paper is about a vastly expanded and completely rewritten program for the interactive synthesis of filters of all kinds: passive LC, active RC, or digital.

One must justify the amount of space devoted to such a project, not to mention the effort expended in developing it, but this is easily done in this instance.

First of all, ten years is more than the average lifetime of computer programs and our previous effort is no exception. Although it has been converted to Fortran IV and is still in use, it has serious shortcomings. Some of these come from the fact that our philosophy of computer usage has changed tremendously during the past ten years. At the time we were fighting the reluctance of design engineers to use the computer and therefore were making a very strenuous effort to make the use of the program as easy and painless as possible. We in fact, were attempting to present a package of foolproof programs that could be used without any knowledge of filter design theory and still yield meaningful results. Great emphasis was hence placed on the computer second guessing the user and supplying answers to questions the designer should have answered in the first place. This philosophy necessarily led to considerable restrictions.

Today we should perhaps try to keep people off the computers (that should at least improve turnaround time) and the author himself came around to the point of view that no matter how sophisticated a computer program is, users must at least have a reasonable idea as to what is going on in the design process to be able to utilize the program to its fullest. Such users would be more knowledgeable, but they would also appreciate the greater flexibility available, coupled with the possibility of correcting wrong choices that is available in an interactive operation. In keeping with this philosophy, the new version to be described below needs more understanding of the theory behind the program and hence a user who can make decisions only a designer can make, but in return it allows him a much greater flexibility in filter types, configurations, and other options. At the same time, to help the inexperienced user to become an experienced one, we retained most of the computer aids and added some new ones to help in this decision making process and sometimes take it over completely.

Another drawback of the old programs was found to be still numerical in the case of very high degree filters. The new version combines the two methods [2], [3] previously suggested as remedies and uses the transformed variable in the product form for most of the computations. Numerical problems still exist, but only where they do no harm.

One may still ask: why worry about passive LC filters? Of course they are still around (and will remain so for some time
