Abstract-This paper is a pragmatic tutorial review of the cepstrum literature focusing on data processing. The power, complex, and phase cepstra are shown to be easily related to one another. Problems associated with phase unwrapping, linear phase components, spectrum notching, aliasing, oversampling, and extending the data sequence with zeros are discussed. The advantages and disadvantages of windowing the sampled data sequence, the log spectrum, and the complex cepstrum are presented. The influence of noise upon the data processing procedures is discussed throughout the paper, but is not thoroughly analyzed. The effects of various forms of liftering the cepstrum are described. The results obtained by applying whitening and trend removal techniques to the spectrum prior to the calculation of the cepstrum are also discussed.

We have attempted to synthesize the results, procedures, and information peculiar to the many fields that are finding cepstrum analysis useful. In particular we discuss the interpretation and processing of data in such areas as speech, seismology, and hydroacoustics. But we must caution the reader that the paper is heavily influenced by our own experiences; specific procedures that have been found useful in one field should not be considered as totally general to other fields. It is hoped that this review will be of value to those familiar with the field and reduce the time required for those wishing to become so.

I. INTRODUCTION

THIS PAPER has two objectives: first, to present a guide to the cepstrum literature which is becoming increasingly diverse; and second, to survey cepstrum signal processing procedures which have found application in the analysis of data. While a review section is included to establish a notational reference, no attempt is made to provide an extensive rederivation of previous results. Instead, key sources are referenced.

We recognize that for some readers our summary of various cepstrum processing procedures which have been and continue to be applied to the analysis of data may be too concise. It is our hope that this weakness is counterbalanced by our discussion of the pitfalls, advantages, and disadvantages of such procedures. While the majority of these results appear elsewhere, sometimes mentioned only briefly, we do include some new findings. A good deal of what follows is based upon our own experiences but we have tried to incorporate and synthesize the results and experiences of the numerous other investigators in this field as well. Perhaps the single most important observation that has been made by us and others is that the results obtained from cepstrum analysis are highly data dependent. Because of this, few generalities, which cut across application areas, can be formulated from empirical results from one application area alone. Nonetheless, investigators appear to be obtaining satisfactory results from the processing of their data by cepstrum techniques. Thus it is our hope that the reader will gain from this paper some new insights, ideas, and perspectives; but, hopefully, these will be tempered by a healthy skepticism.

Fundamentally, cepstrum techniques are suited for the analysis of data that contain echoes (wavelets) or reverberations of a fundamental wavelet (sometimes called a signature) whose shape need not be known a priori. The power cepstrum is usually used to determine the arrival times of the fundamental wavelet and its echoes and their relative amplitudes; the processing of the complex cepstrum can determine the wavelet waveform.

The application areas encompass radar and sonar [6]-[8], [29], [40]-[41], [43], [61], [62] where cepstrum processing can be used to advantage to reduce reflection interference, speech [13], [19], [21], [42], [46]-[51], [53], [55], [57], [61]-[66], [71], where speaker fundamental frequency (pitch) is estimated and spectrum envelopes are calculated, marine and earth seismology, seismic exploration and detection [1], [31], [34]-[38], [45], [58]-[59], [80] where source depth determinations are made and the ocean bottom is mapped, and the electroencephalogram (EEG) or brain waves [4], [9], [11], [12], [22]-[24], [40], [54], [64] where correlates of electrophysiological events are derived. Other areas of interest include the deconvolution of probability density functions [52]. More recent applications are two-dimensional functions [5], [20] and aeroacoustics or noise pollution [44]. Some additional exciting work is the restoration of old recordings [30] and image processing [19], [30], [61], [63]. Throughout the paper we attempt to synthesize the observations from these many fields.

The paper's outline is as follows. First, we provide a brief historical review and the fundamental mathematical formulation for processing discrete data sequences by cepstrum techniques. This is followed by a section on phase problems encountered in the complex cepstrum, e.g., linear phase terms, spectral notching, oversampling, phase unwrapping, and noise. Next comes a section on other problems, e.g., aliasing, oversampling, the addition of zeros. The following section is on windowing the original data sequence, the log spectrum sequence, and the cepstrum sequence. In the next to the last section we discuss data processing for the three areas of speech, seismology, and hydroacoustics. Finally, we conclude with a summary, recapitulation, and some general observations. The paper is organized in this manner in an attempt to localize, as well as possible, the terminology that is special to each of the diverse fields. Thus, the next to the last section on data processing employs many terms peculiar to each field.

It is our hope that this guide will prove useful to those just becoming involved in cepstrum analysis as well as to those with previous experience.
Manuscript received November 25, 1975; revised November 19, 1976 and February 7, 1977.
D. G. Childers is with the Department of Electrical Engineering, University of Florida, Gainesville, FL 32611.
D. P. Skinner is with the Naval Coastal Systems Laboratory, Panama City, FL 32401.
R. C. Kemerait is with ENSCO, Satellite Beach, FL 32937.

II. THE CEPSTRA

Historically, the cepstrum has its roots in the general problem of the deconvolution of two or more signals. This literature is rich and varied and encompasses linear prediction,
CHILDERS et al.: GUIDE TO PROCESSING 1429
predictive deconvolution, inverse filtering, and deconvolution. In the interest of brevity we have provided only a selected list of references and books in this area [66]-[86].

In what follows it will be seen that the (complex) cepstrum is also concerned with the deconvolution of two signals, namely, a basic or fundamental wavelet and a train of impulses. The phase cepstrum is defined and it is shown how the power, complex, and phase cepstra are related via a form which simplifies computation.

A. The Power Cepstrum

The power cepstrum was first described by Bogert et al. [1] in 1963 as a heuristic technique for finding echo arrival times in a composite signal. Basically, these authors defined the cepstrum (which we term the power cepstrum [9], [12] to avoid confusion with the complex cepstrum) of a function as the power spectrum of the logarithm of the power spectrum of that function.

These authors quickly showed (as we repeat below) that the effect of a delayed echo will manifest itself as a ripple in the log spectrum. The "frequency" of this ripple is easily determined by calculating the spectrum of the log spectrum wherein this "frequency" will appear as a peak. However, the units of "frequency" of this ripple in the log spectrum are in units of time; thus, the independent variable (abscissa) in the spectrum of the log spectrum is time. Other parameters were also observed to undergo similar transformations of units. To avoid confusion, Bogert et al. [1] introduced the following now-classical paraphrased terms according to a syllabic interchange rule:

frequency . . . . . . . . quefrency
spectrum . . . . . . . . cepstrum
phase . . . . . . . . . . saphe
amplitude . . . . . . . . gamnitude
filtering . . . . . . . . liftering
harmonic . . . . . . . . rahmonic
period . . . . . . . . . repiod

along with others. Today the two most prevalent terms are cepstrum and quefrency, e.g., filtering in the cepstrum domain is usually called just that and not "liftering" as suggested by Bogert et al. [1], but this can and often does lead to confusion.

In practice the power cepstrum is effective if the wavelet and the impulse train, whose convolution comprises the composite data, occupy different quefrency ranges. In actuality, the power cepstrum does not exist for most signals; it is meaningful only when defined in a sampled data sense (as is the complex cepstrum) although attempts to extend it exist [3]. Thus the following definition is offered: the power cepstrum of a data sequence is the square of the inverse z-transform of the logarithm of the magnitude squared of the z-transform of the data sequence. When this definition is evaluated on the unit circle, the result (except for the normalization factors associated with the power spectrum) is the same as that obtained in [1]. Thus we may write the power cepstrum as

$$x_{pc}(nT) = \left(Z^{-1}\left(\log|X(z)|^2\right)\right)^2 \tag{1}$$

where $X(z)$ is the z-transform of the data sequence $x(nT)$. Alternately, the definition could be changed to use the forward z-transform and/or the final squaring could be changed to magnitude squared. In actuality the final squaring operation in (1) is unnecessary and is frequently omitted for several reasons, but it has been used here to provide historical continuity with [1].

Therefore, if we have the convolution of two sequences, then

$$x(nT) = f(nT) * g(nT) \tag{2}$$

or

$$|X(z)|^2 = |F(z)|^2 \cdot |G(z)|^2 \tag{3}$$

or

$$\log|X(z)|^2 = \log|F(z)|^2 + \log|G(z)|^2. \tag{4}$$

If we apply (1), then

$$x_{pc}(nT) = f_{pc}(nT) + g_{pc}(nT) + \text{a cross-product term}. \tag{5a}$$

If the power cepstra of $f$ and $g$ occupy different quefrency ranges, then (5a) can be reduced to

$$x_{pc}(nT) = f_{pc}(nT) + g_{pc}(nT). \tag{5b}$$

(This is the result that would be obtained if the final squaring operation in (1) were not included in the definition.) Under this condition the individual contributions of each power cepstrum can be separated by liftering (filtering) in the quefrency domain.

For the case of a composite signal consisting of the basic wavelet and a single echo, then

$$g(nT) = \delta(nT) + a\,\delta(nT - n_0T) \tag{6}$$

where $\delta(nT)$ denotes the unit pulse function in a sampled data sequence.

Equation (3) then becomes

$$|X(z)|^2 = |F(z)|^2\,|1 + a z^{-n_0}|^2 \tag{7}$$

and if we evaluate (4) on the unit circle ($z = e^{j\omega T}$), then

$$\begin{aligned}\log|X(e^{j\omega T})|^2 &= \log|F(e^{j\omega T})|^2 + \log\left(1 + a^2 + 2a\cos(\omega n_0 T)\right)\\ &= \log|F(e^{j\omega T})|^2 + \log(1 + a^2) + \log\left(1 + \frac{2a}{1+a^2}\cos(\omega n_0 T)\right).\end{aligned} \tag{8}$$

We may now expand the third term on the right of (8) in a power series (except for the point values $a = \pm 1$ and $\cos(\omega n_0 T) = \pm 1$ [6]) to obtain

$$\log\left(1 + a_0\cos(\omega n_0 T)\right) = \sum_{m=1}^{\infty} (-1)^{m+1}\,\frac{a_0^m}{m}\cos^m(\omega n_0 T) \tag{9}$$

where $a_0 = (2a)/(1 + a^2)$.

We see that the logarithm of the magnitude squared of the z-transform of the composite signal will contain cosinusoidal ripples (sometimes referred to as spectral modulation) whose gamnitude (amplitude) and quefrency (i.e., the frequency of the ripples) are related to the echo amplitude ($a$) and delay ($n_0T$), respectively.

Using (9) in (8), we can take the inverse z-transform of (8) to obtain the term within the brackets in (1) which will have
1430 PROCEEDINGS OF THE IEEE, VOL. 65, NO. 10, OCTOBER 1977
peaks at quefrencies of $n_0T$ and multiples thereof. (We perform this task in detail later for the complex cepstrum.) These peaks will be detectable provided that $\log|F(e^{j\omega T})|^2$ is approximately quefrency limited to less than $n_0T$, i.e., the ripples in $\log|F(e^{j\omega T})|^2$ should not have a repiod (period) greater than $(n_0T)^{-1}$. In other words (5b) holds or at least approximately so. It is apparent that the echo arrival time can be estimated by simply noting the time of occurrence of the first peak in the power cepstrum. Further, it is possible using (9) to estimate the echo amplitude. This will be discussed more fully for the complex cepstrum. The presence of multiple echoes can and does confuse the interpretation procedure because of the nonlinearity introduced by the log function. In addition, aliasing causes problems [32]. These points are given further consideration later. It should also be apparent that if (5b) does not hold, then the cross-product term in (5a) will introduce further extraneous peaks.

It should be noted that the peaks in the power cepstrum can be removed by notch liftering (filtering) to yield an estimate of the power cepstrum of the basic wavelet. Further, if the final squaring operation in the calculation of the power cepstrum in (1) is not performed, then the peaks still appear and again can be removed by notch liftering, but now the operations performed to calculate this modified power cepstrum can be reversed to obtain an estimate of the log power spectrum and, with exponentiation, to yield the power spectrum of the basic wavelet itself. But the waveform of the basic wavelet cannot be recovered by processing the power cepstrum since the phase information is discarded. This latter situation is corrected with the complex cepstrum which we discuss in the next subsection along with the inversion process.

The power cepstrum has been applied to seismic data [1], [36], sonar [43], speech [13], [46]-[49], and the electroencephalogram (EEG) [12], [22], [24]. Its statistical properties have also been examined [2], [7].

It is hopefully beneficial to point out that alternate viewpoints and, thus, subsequent terminologies have arisen since the original paper by Bogert et al. [1]. These viewpoints have led to what might well be considered two lines of investigation: (1) the use of varying degrees of spectral whitening; and (2) the attempts to devise methods for obtaining the phase relations of the wavelet with respect to the reference signal [36].

We have seen that the occurrence of an echo in the time domain signal leads to what amounts to a spectral modulation (or ripple) in the frequency domain.

The spectral whitening approach to echo detection considers the application of the logarithm a severe spectral whitener

Cohen [36], [37] discusses the echo polarity determination problem. In some experimental situations the echo may undergo 180° phase reversals at certain boundary reflection interfaces, i.e., the reflection coefficient, $a$, may be negative. This knowledge can be useful in data interpretation. For example, in (8) if $a$ is negative this leads to spectral nulls at $f = m/(n_0T)$, $m = 0, 1, 2, \cdots$. By measuring the frequency spacing between two successive spectral nulls one can then determine the delay time $n_0T$. The source depth can in turn be estimated if the average signal velocity is known.

Cohen [36] also points out that the cepstrum may contain many peaks which can confuse the analyst. He suggests doing pseudo-autocorrelation (defined as the inverse Fourier transform of the liftered power spectrum [1], [36]) analysis simultaneously. If a negative reflection has occurred, then the cepstrum will have a positive peak while the pseudo-autocorrelation will show a negative peak. The dot product (i.e., keeping track of the phase) of the cepstrum and the pseudo-autocorrelation can even be determined. Here the cepstrum can be any one of those obtained by transforming one of the three whitened spectra discussed above. This procedure can apparently help separate multiple echoes (or events) from a single echo (or event) such as might occur with multipath, i.e., multiple reflections.

B. The Complex Cepstrum and Wavelet Recovery

The complex cepstrum is an outgrowth of homomorphic system theory developed by Oppenheim [14]-[17]. In fact, the power cepstrum is also a specific application of homomorphic system theory. The complex cepstrum has been investigated extensively [9], [12], [13], [19], [21]-[23], [26]-[32], [61]-[64].

Since the complex cepstrum retains the phase information of the composite data, it can be used not only for echo detection but also wavelet recovery; this process is also known as homomorphic deconvolution or homomorphic filtering and has since been applied to seismic data [31], [34]-[38], [45], [57]-[59], speech [19], [21], [30], [50], [51], [53], [57], [62], [63], [65], [66], image processing [19], [30], [61], [63], and EEG analysis [9], [11], [12], [22]-[24].

Formally, we define the complex cepstrum of a data sequence as the inverse z-transform of the complex logarithm of the z-transform of the data sequence [21], [63], i.e.,

$$\hat{x}(nT) = \frac{1}{2\pi j}\oint \log(X(z))\,z^{n-1}\,dz \tag{10}$$
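As a rough numerical illustration of the definition in (10), the sketch below approximates the complex cepstrum by sampling the z-transform on the unit circle with an FFT. The function name `complex_cepstrum`, the DFT length `nfft`, the integer linear-phase removal, and the test signal are all assumptions made for illustration; they are not a procedure prescribed by the paper.

```python
import numpy as np

def complex_cepstrum(x, nfft):
    """Approximate the complex cepstrum of a real sequence (nfft assumed even)."""
    X = np.fft.fft(x, nfft)                   # samples of X(z) on the unit circle
    phase = np.unwrap(np.angle(X))            # continuous phase = Im(log X)
    half = nfft // 2
    # Multiple of pi reached at the half-sampling frequency (integer linear-phase term).
    ndelay = np.round(phase[half] / np.pi)
    phase -= np.pi * ndelay * np.arange(nfft) / half
    log_X = np.log(np.abs(X)) + 1j * phase    # complex logarithm
    return np.fft.ifft(log_X).real, int(ndelay)

# Hypothetical test signal: decaying wavelet plus one echo (a = 0.5, n0 = 40).
f = 0.6 ** np.arange(32)
x = np.zeros(128)
x[:32] += f
x[40:72] += 0.5 * f
cep, ndelay = complex_cepstrum(x, nfft=512)
```

For this minimum-phase example the cepstrum shows peaks near quefrency 40 and its multiples, with the first peak close to the echo amplitude.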
Fig. 1. Overall wavelet recovery system, also known as homomorphic deconvolution (filtering) or cepstrum system. The DFT is performed by an FFT algorithm. $X_R(n)$ denotes the recovered wavelet. The input sequence is windowed and then appended with zeros. (a) Simplified block diagram. (b) More detailed block diagram which can be used to process data in real time. (c) Typical lifters (longpass and notch) for the single echo, minimum phase ($a < 1$) case where peaks occur at $n_0$ and multiples thereof. (The notch lifter is sometimes called a comb lifter.)
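The chain described in the caption (append zeros, DFT, complex log, inverse DFT, lifter, DFT, exponentiate, inverse DFT) can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `recover_wavelet` is hypothetical, the echo delay `n0` is taken as known, no input window is applied, and the signal is assumed to carry no linear phase term so that plain unwrapping suffices.

```python
import numpy as np

def recover_wavelet(x, n0, nfft):
    """Complex cepstrum -> notch (comb) lifter -> inversion, minimum-phase echo (a < 1)."""
    X = np.fft.fft(x, nfft)                           # input appended with zeros, then DFT
    log_X = np.log(np.abs(X)) + 1j * np.unwrap(np.angle(X))
    cep = np.fft.ifft(log_X).real                     # complex cepstrum (real for real x)
    cep[n0::n0] = 0.0                                 # zero the peaks at n0, 2*n0, ...
    return np.fft.ifft(np.exp(np.fft.fft(cep))).real  # undo the log and the transforms

# Hypothetical data: decaying wavelet plus one echo (a = 0.5, n0 = 40).
f = 0.6 ** np.arange(32)
x = np.zeros(128)
x[:32] += f
x[40:72] += 0.5 * f
recovered = recover_wavelet(x, n0=40, nfft=512)
```

Zeroing the cepstrum at the echo quefrencies is the simplest possible notch lifter; it also removes whatever the wavelet itself contributes at those points, which is negligible in this example.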
$$C(k) = \begin{cases} C(k-1) - 2\pi, & \text{if } P(k) - P(k-1) > \pi \\ C(k-1) + 2\pi, & \text{if } P(k-1) - P(k) > \pi \\ C(k-1), & \text{otherwise.} \end{cases} \tag{15}$$

This is illustrated in Fig. 2.
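The recursion in (15) can be sketched directly in code. The text that defines $P(k)$ and $C(k)$ is not present in this section, so the sketch assumes that $P(k)$ is the principal-value phase, that $C(k)$ is a correction sequence starting at $C(0) = 0$, and that the unwrapped phase is $P(k) + C(k)$. The function name `unwrap_phase` is hypothetical.

```python
import numpy as np

def unwrap_phase(P):
    """Apply rule (15): build the correction C(k) and return P(k) + C(k)."""
    P = np.asarray(P, dtype=float)
    C = np.zeros_like(P)                      # assumed starting value C(0) = 0
    for k in range(1, len(P)):
        if P[k] - P[k - 1] > np.pi:           # upward jump in the principal value
            C[k] = C[k - 1] - 2 * np.pi
        elif P[k - 1] - P[k] > np.pi:         # downward jump in the principal value
            C[k] = C[k - 1] + 2 * np.pi
        else:
            C[k] = C[k - 1]
    return P + C

# Hypothetical check: a steadily decreasing phase, wrapped and then unwrapped.
true_phase = np.linspace(0.0, -20.0, 400)
wrapped = np.angle(np.exp(1j * true_phase))
unwrapped = unwrap_phase(wrapped)
```

The rule only works when the true phase changes by less than π between samples, which is why a finer frequency sampling (more appended zeros) helps.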
$$\hat{x}(nT) = \hat{f}(nT) + a\,\delta(nT - n_0T) - \frac{a^2}{2}\,\delta(nT - 2n_0T) + \cdots$$
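The alternating peaks in this expression can be checked numerically. The sketch below takes the wavelet to be a unit pulse, so that $\hat{f}(nT) = 0$ and only the echo terms remain; the values of `a`, `n0`, and `N` are illustrative assumptions.

```python
import numpy as np

a, n0, N = 0.5, 16, 256                  # illustrative echo amplitude, delay, record length
g = np.zeros(N)
g[0], g[n0] = 1.0, a                     # delta(n) + a * delta(n - n0)

# Complex cepstrum via the DFT; the phase never wraps here because |a| < 1.
g_hat = np.fft.ifft(np.log(np.fft.fft(g))).real
peaks = g_hat[[n0, 2 * n0, 3 * n0]]      # approximately a, -a**2 / 2, a**3 / 3
```

The small departures from the series values come from aliasing of the higher-order peaks around the finite record.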
Another interesting, and perhaps more representative, example is the one with an infinite series of decaying echoes. Here

$$g(nT) = \delta(nT) + a_1^{n_0}\,\delta(nT - n_0T) + a_1^{2n_0}\,\delta(nT - 2n_0T) + \cdots, \quad \text{where } 0 < a_1 < 1.$$

Or with $a = a_1^{n_0}$, then

$$g(nT) = \delta(nT) + a\,\delta(nT - n_0T) + a^2\,\delta(nT - 2n_0T) + \cdots = \sum_{m=0}^{\infty} a^m\,\delta(nT - mn_0T)$$
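For this geometric echo train the z-transform sums to $G(z) = 1/(1 - a z^{-n_0})$, so the standard series for $-\log(1 - u)$ gives cepstral peaks of $a^m/m$ at $mn_0$, all of the same sign. This is a textbook series result stated here as an aid, and the short sketch below checks it numerically with illustrative values of `a`, `n0`, and `N`, truncating the train at the end of the record.

```python
import numpy as np

a, n0, N = 0.5, 16, 512                  # illustrative values
m = np.arange(N // n0)                   # echo orders that fit inside the record
g = np.zeros(N)
g[m * n0] = a ** m                       # truncated sum of a**m * delta(n - m*n0)

g_hat = np.fft.ifft(np.log(np.fft.fft(g))).real
peaks = g_hat[[n0, 2 * n0, 3 * n0]]      # approximately a, a**2 / 2, a**3 / 3
```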
Then by the definition of the complex cepstrum in (10) we have

$$x_{pc}(nT) = \left(\hat{x}(nT) + \hat{x}(-nT)\right)^2. \tag{27}$$

Thus the power cepstrum is four times the square of the even part of the complex cepstrum. This also follows from the fact that the power cepstrum is the square of the inverse transform of twice the real part of the log spectrum; and, as was noted earlier, the power cepstrum contains no phase information.

Equation (27) is of value since the power cepstrum is often superior to the complex cepstrum for echo arrival time estimation [12], [24], [25]. This is apparently due to the fact that the linear phase contribution (to be discussed below) of the imaginary part of the logarithm tends to mask the echo delay. There are probably other phase unwrapping errors as well as noise errors which contribute to this observation. Complex exponential weighting [25], which we discuss later, appears to be a method which can assist the investigator in the determination of echo delay times from the complex cepstrum.

A wavelet recovery (homomorphic filtering) system can easily compute both the power and complex cepstra as shown in Fig. 1.

Finally, as was noted earlier, if the squaring operation in (27) is not performed, then the system in Fig. 1 can be used to obtain an estimate of the log power spectrum and in turn the power spectrum of the basic wavelet. Note that if this is one's objective (and not wavelet recovery), then the problems associated with phase unwrapping are not encountered.

D. The Phase Cepstrum

The inverse transform of the phase of the complex logarithm yields peaks at multiples of the echo arrival time in much the same way that the inverse transform of the log magnitude does. This can be shown as follows for the single additive echo case:

$$\hat{X}(e^{j\omega T}) = \log|F(e^{j\omega T})| + j\arg F(e^{j\omega T}) + \tfrac{1}{2}\log\left(1 + a^2 + 2a\cos\omega n_0T\right) - j\tan^{-1}\left(\frac{a\sin\omega n_0T}{1 + a\cos\omega n_0T}\right). \tag{28}$$

The fourth term on the right produces ripples in the phase, just as the third term produces ripples in the log magnitude. Since $\hat{X}(e^{j\omega T})$ is obtained from the transform of a real sequence, its real part (magnitude of the transform of the real sequence) is an even function of $\omega$, and its imaginary part (phase of the transform of the real sequence) is an odd function of $\omega$. Thus the inverse transform of $\mathrm{Re}(\hat{X}(e^{j\omega T}))$ will yield the even portion of the complex cepstrum and the inverse transform of $j\,\mathrm{Im}(\hat{X}(e^{j\omega T}))$ will produce the odd portion of the complex cepstrum. Since the inverse transform of the term $\log(1 + ae^{-j\omega n_0T})$ produces peaks on one side of the origin only, the peaks produced by its real and imaginary parts must be equal in magnitude and opposite in sign on one side of the origin but of the same sign on the other side of the origin (depending upon whether the echo amplitude $a$ is less or greater than unity).

From these observations we formally define the phase cepstrum of a data sequence as the square of the inverse z-transform of twice the phase (the imaginary part of the logarithm) of the z-transform of the data sequence. This may be written as

$$x_L(nT) = \left(Z^{-1}\left(2\log X(z) - 2\log|X(z)|\right)\right)^2 \tag{29}$$

where the factor of 2 has been introduced to eliminate any normalization factors in the relation between the phase and complex cepstra and $x_L(0) = 0$. From (10), (24), and (27) the phase cepstrum can be easily shown to be

$$x_L(nT) = \left(\hat{x}(nT) - \hat{x}(-nT)\right)^2. \tag{30}$$

Thus the phase cepstrum is to the phase as the power cepstrum is to the log magnitude. Once again the final squaring operation could be changed to magnitude squared or eliminated.

Empirically, it has been determined that the phase cepstrum is less useful than the power cepstrum in the determination of echo arrival times [26], [27]. This is apparently due to the phase unwrapping errors produced by additive noise and linear phase terms. The phase cepstrum is as difficult to compute as the complex cepstrum, since both require phase unwrapping. However, the phase cepstrum has proven valuable in evaluating the effects of noise on the signal phase [26]. Significant differences in the appearance of the phase and power cepstra can be indicative of phase unwrapping problems which might otherwise go unnoticed. This has proven to be the case in some of the work by one of the authors (DPS) on echoes generated by chirp signals.

III. PHASE PERPLEXITIES

Many problems arise in the computation of the phase sequence for the complex cepstrum. Here we address several of these problems along with their alleviation.

A. Linear Phase Components

The presence of a linear component in the phase sequence introduces rapidly decaying oscillations in the complex cepstrum, e.g., let the spectrum of such a signal be represented as $X(e^{j\omega T}) = e^{-jr\omega}X'(e^{j\omega T})$ or $X(z) = z^{-r/T}X'(z)$. Then the cepstrum of the linear phase term alone is

$$\hat{x}(nT) = \begin{cases} 0, & n = 0 \\ -\dfrac{r\cos n\pi}{nT} = -(-1)^n\dfrac{r}{nT}, & n \neq 0. \end{cases} \tag{31}$$

This term is added to the cepstrum of the remaining portion of the data being analyzed. Note that it changes sign at each sample and although it does decay, it may be quite large depending upon $r$. Such a term may mask echo peaks in the complex cepstrum, and should be removed by subtraction from the composite signal phase. Several procedures for doing this appear in the literature [9], [31]. Basically, this is just trend removal, which is standard practice for improving spectral estimates. The removed linear phase term can be recorded and then reinserted during the inversion process if necessary.

The presence of a linear phase term may influence the choice of liftering to be applied to the complex cepstrum. If the echo is to be removed and the basic wavelet is to be recovered, then the echo peaks should not be notch liftered (removed) by simply replacing them with the average of their adjacent points, since these adjacent points have contributions from the linear phase component (if it has not been completely removed) which are opposite in sign to the contribution of the echo
point to be removed. Instead, if the echo is located at $n_0$ in the complex cepstrum then this point should be replaced with the average of the $(n_0 + 2)$ and $(n_0 - 2)$ points. This form of liftering results in a smaller mean square error (MSE) in the recovered wavelet than when the average of the points adjacent to the echo peak is used. This has been found to be the case even when the linear phase component has been completely removed [26]. This liftering procedure is not claimed to be optimum. In fact the liftering procedure is undoubtedly signal and noise dependent and would in general involve averaging more than just two points in the complex cepstrum.

A serious problem in phase unwrapping is encountered when discontinuities in the phase occur in calculating the phase modulo $2\pi$ via the arctangent routine. The phase unwrapping algorithm previously described removes these discontinuities provided the phase changes by less than $\pi$ between samples. Recently, it has been pointed out that a linear phase component with a large slope will cause errors in this unwrapping procedure [26], [31]. If the phase changes between samples are greater than $\pi$ due to the presence of a linear phase term, then this unwrapping problem can be alleviated by increasing the record length with the addition of zeros [26]. This is equivalent to sampling the z-transform more frequently. If one is unsure whether the phase change between samples is less than $\pi$, then one can check such an hypothesis with the above procedure by comparing the unwrapped phase before and after the record length has been appended with zeros. Others suggest that an iterative approach to phase unwrapping is helpful [31].

One example of where the linear phase component gives problems is when $x(nT) = f(nT - n_0T)$ on $[0, N - 1]$, zero otherwise; then $X(e^{j\omega T}) = e^{-j\omega n_0T}F(e^{j\omega T})$. As expected the phase of $x$ is the sum of a linear phase component and the phase of $f$. If $\omega$ is sampled at the minimum rate ($\omega = (n2\pi)/(NT)$) then $X(e^{jn(2\pi/N)}) = e^{-j2\pi n(n_0/N)}F(e^{jn(2\pi/N)})$. If $n_0 > N/2$ the linear phase component will change by more than $\pi$ between samples and unless the phase of $f$ counteracts this change, the phase unwrapping algorithm will yield erroneous results. This has been observed in computer experiments when the composite signal is delayed by more than half the record length. As expected this not only reduces the echo detectability in both the phase and complex cepstra, but also severely distorts the recovered wavelet.

B. Spectrum Notching

It should also be noted that zeros near the unit circle in the z-transform of the echo sequence result in notches in the spectrum sequence wherein additive noise may dominate. We have seen earlier that one phase unwrapping algorithm requires that the changes in phase between samples must be less than $\pm\pi$, i.e., the derivative of the phase with respect to frequency must be less than $\pm\pi$.

Consider the z-transform of the data sequence evaluated on the unit circle, then

$$X(e^{j\omega T}) = |X(e^{j\omega T})|\,e^{j\angle X(e^{j\omega T})} = X_{\mathrm{Re}}(e^{j\omega T}) + jX_{\mathrm{Im}}(e^{j\omega T})$$

or

Thus the change in phase is inversely proportional to the magnitude squared of the spectrum. If a notch occurs in the spectrum, then the change in the phase may be quite large, and, therefore, proper phase unwrapping may be difficult to achieve. Further, the phase may change sign rapidly in these spectrum notches. This represents a serious problem even in the absence of noise as the above example illustrates. Therefore, it is quite possible for the unwrapped phase curve to contain discontinuities (jumps or steps) in the vicinity of a spectrum notch.

As was pointed out earlier, spectral nulls can be caused physically by 180° phase reversals in reflections at boundary interfaces [36], [37]. Nulls in the spectrum may be an important aid to data interpretation. The investigator needs to understand the physical situation under which the data are collected and to model it well [37].

C. Other Results

Finally, our results have indicated that when cepstrum analysis is performed on a bandpass function, phase unwrapping outside the signal band is of little value and may actually be detrimental to wavelet recovery since the spectrum outside the signal band is dominated by noise. Similar considerations lead us to avoid oversampling since this leads to large segments of the log spectrum being dominated by noise. (See the next section.)

IV. OTHER PROBLEMS

A. Aliasing [9], [26], [32]

Aliasing of the cepstrum is of course an ever present problem since the nonlinear complex logarithm introduces harmonics into $\hat{X}(z)$. The appending of zeros to the input data sequence reduces aliasing as will selecting the data record length $NT$ to be as large as possible. This latter choice is subject to the constraints imposed by the investigator on the number of points that can be analyzed and the minimum sampling rate. If the total data record length exceeds the duration of the composite signal contained within the record, then it is questionable if the total data record length should be further extended with still more "data." The reason for this doubt is that the spectral samples will increasingly reflect the effect of the noise rather than the signal as the total data record length surpasses that of the composite signal duration.

B. Oversampling

Oversampling of the data record when noise is present is also a problem. Outside the signal band noise dominates the spectrum. This usually presents no problem in ordinary spectrum analysis since these components frequently contain little power but this may not be the case for the cepstrum. Because of the nonlinear logarithmic operation, the regions of low power in the spectrum may contribute as much or more to the cepstrum as the regions which contain the signal in the spectrum. When this occurs it affects both echo detectability and wavelet recovery. Oversampling also aggravates phase unwrapping and aliasing since it shortens the data record (if the total number of data points or samples is fixed), which in turn
implies that the samples of the log spectrum are spaced farther apart.

C. Appending Zeros

It is well known that appending zeros to a data sequence increases the sampling "rate" of its discrete Fourier transform. This benefits the computation of the cepstrum in two ways. First, the increased sampling "rate" in the frequency domain reduces aliasing of the cepstrum. Second, increasing the fineness with which the phase curve is sampled reduces the number of phase unwrapping errors (which result from jumps greater than $\pi$ between samples). Our results have indicated that extending the record length with zeros results in a modest improvement in the recovered wavelet even when aliasing and phase unwrapping errors do not appear to be a problem. It should be noted that unless the record length is extended with zeros, then aliasing causes an ambiguity in the determination of the echo epoch (arrival time) and amplitude. This is due to the fact that there is no way to distinguish between an echo of relative amplitude $a$ and delay $n_0$ and one with amplitude $1/a$ and delay $(N - n_0)$ where $N$ is the total number of samples.

Mathematically, these statements are verified as follows: consider the z-transform of the sequence $x(nT)$ where $x(nT) = 0$ outside $[0, N - 1]$ denoted as

$$X(z) = \sum_{n=0}^{N-1} x(nT)\,z^{-n} \tag{33a}$$

which when evaluated on the unit circle gives

$$X(e^{j\omega T}) = \sum_{n=0}^{N-1} x(nT)\,e^{-j\omega nT}. \tag{33b}$$

If we sample at uniformly spaced intervals around the unit circle, we obtain

$$X(e^{j(2\pi m/N)}) = \sum_{n=0}^{N-1} x(nT)\,e^{-j(2\pi mn/N)} \tag{34}$$

which is just the DFT of $x(nT)$. It follows that

Since the logarithm (which is a zero memory nonlinearity) of a sampled function is equivalent to sampling the logarithm of the function, then with a little additional effort it follows that the complex cepstrum of the DFT of $x(nT)$ (or the z-transform of $x(nT)$ sampled on the unit circle) is just the periodic extension of the complex cepstrum of the original data sequence.

We see that the effect of appending zeros is to increase $N$. This implies we sample the log spectrum at smaller intervals, since the spacing between these samples is proportional to $1/N$. As described above, the errors introduced by a linear

cessing this is not the case. Here the data are highly nonstationary. And the investigator is frequently interested in analyzing the speech signal over one pitch period (or at most three pitch periods). In this case windowing is of considerable benefit.

One can see for the single echo case that windowing the input data record normally prevents the logarithmic operation from fully separating the basic wavelet and the echo series as follows:

$$x(nT) = \left[f(nT) + a f(nT - n_0T)\right] w(nT) \tag{36}$$

or

$$X(z) = \left[F(z)\left(1 + az^{-n_0}\right)\right] * W(z). \tag{37}$$

For arbitrary $W(z)$, the contributions of the basic wavelet and the echo cannot generally be separated by taking the logarithm of (37) since the term in brackets is convolved with $W(z)$. Fortunately, as will be discussed more fully below, in practice the cepstrum procedure can still be applied with effectiveness even though there is some error.

Schafer [21] suggested a window which does preserve the separability of the basic wavelet and echo series and which has proven extremely useful in cepstrum analysis. This window

$$w(nT) = \begin{cases} \alpha^{nT}, & 0 \le n \le N - 1, \quad 0 < \alpha < 1 \\ 0, & \text{otherwise} \end{cases} \tag{38}$$

was first proposed to reduce the error associated with truncating the echo when it extended beyond the end of the record [21]. Our results and those of others [26], [31], [58] have indicated that this window is quite useful because it reduces the aliasing of the echo impulse train in the complex cepstrum by imposing an $(\alpha^{n_0T})^n$ weighting on the impulses. This follows directly from a calculation of the z-transform of (36) with (38) used for $w(nT)$, i.e., for this case

$$X(z) = F(\alpha^{-T}z)\left(1 + a\,\alpha^{n_0T}z^{-n_0}\right) \tag{39}$$

provided no truncation error is present and that the basic wavelet begins at $n = 0$.

From (20) we see that when no window is used and $a$ is near unity and the echo delay is a substantial portion of the record length ($NT$), the higher order peaks may not decrease rapidly enough to avoid aliasing. This problem can be overcome with the window under consideration. Our results indicate that the choice of $\alpha$ is data dependent and $\alpha$ should be chosen as close to unity as possible, consistent with the desired reduction in aliasing. The closer the data sequence is to a maximum phase sequence, the more one can reduce $\alpha$, e.g., from 0.99 to 0.98 or 0.96. The choice of $\alpha$ is also dependent on the echo delay time which is discussed more fully later.

The exponential window can introduce some distortion into
phase componentor aliasing arereduced by increasing N
the recovered wavelet even if the data are unweighted by the
through theappendage of zeros.
inverse window in the recovery process [lo], [ 261. This is
V. WINDOWING primarily due to the distortions introduced into the data that
extend beyond the duration of the wavelet of interest.
A. The Composite Data In summary the exponential window performs nearly as well
Echo detection and extraction are degraded by applying to as the rectangular window when no noise is present but does
the data record a window ordinarily used to reduce leakage, introduce some distortion as noted above. Further, the echo
e.g., Hamming, Hanning, Tapering (Tukey window), unless the arrival time can be determined even when wavelet recovery
window is relatively constant (flat) over that portion of the cannot be effected. Also if rectangular windowing is judi-
data recordcontaining the composite signal. In speechpro- ciously applied, thenthe cepstrum can be used to detect
similar but not necessarily identical wavelets. We suspect that if the exponential window is used to make the composite signal minimum phase, then the echo, most probably, will be lost.

Finally, it should be noted that the exponential window may be used to alter the SNR of a data record more effectively than the rectangular window. This can be effected when the composite signal occupies only a portion of the total record. In this case the window may weight the signal more or less heavily than those portions of the record containing the noise. However, caution should be exercised in echo detection and extraction when the signal (wavelet) of interest occurs near the end of the data record and thus will be greatly reduced by the exponential window.

We wish to emphasize that the comments made in the last three paragraphs have been greatly influenced by our analyses of echo type data. It is our opinion (as well as that of others) that such observations are and will be highly data dependent.

Recently it has been proposed that the exponential window be generalized to include complex exponential weighting, i.e., $\alpha^{nT} e^{j\phi nT}$ [25]. It may at first appear that this phase factor will have no significant effect on the complex cepstrum, i.e., it will introduce only a phase shift. However, the procedure can be used to change the phase relation of the echo (multipath reflection) by 180° [25]. This may make it easier to detect a peak in the cepstrum. The complex exponential factor $\phi$ can be varied in a prescribed fashion so that it may be used as a hypothesis tester. Thus trial sweeps of the complex weight can be generated to confirm or deny a priori estimates of the echo delay [25]. It appears that this technique may prove to be a powerful investigative tool to assist the researcher in interpreting his data.

B. The Log Spectrum

One might be motivated to window the log spectrum in order to reduce leakage in the complex cepstrum which could be falsely interpreted as peaks due to echoes. Windowing of the log spectrum will, of course, introduce some loss in time resolution in the cepstrum domain. Then, if the echo contributions can be liftered from the complex cepstrum and if the recovered log spectrum can be corrected (by multiplying by the inverse of the windowing series), we should be able to recover the basic wavelet. Our results have, however, indicated that such windowing of the log spectrum raises the echo detection threshold by around 12 dB and severely distorts the recovered wavelet when additive noise is present [26]. This is apparently due to the fact that windowing the log spectrum is equivalent to smoothing the complex cepstrum. Thus, it appears that windowing the log spectrum may smooth out the very peaks one wishes to detect in the complex cepstrum. The distortion introduced into the recovered wavelet is undoubtedly due to this windowing of the log spectrum (or smoothing of the complex cepstrum).

C. The Complex Cepstrum

Since noise is usually interspersed throughout the data record and the composite signal may occupy only a portion of the record, it seems reasonable that the high quefrency components of the complex cepstrum may frequently contain more noise than signal information. Our results have shown that by judiciously zeroing the high quefrency components of the complex cepstrum we may significantly improve the fidelity of the recovered wavelet in a noisy environment [26]. At low SNR the MSE can be reduced by a factor of 2 by a judicious rectangular windowing of the complex cepstrum. This is essentially shortpass liftering [9], [12], [19], [21] in which the aim is not to eliminate the echo peaks (which are generally notch filtered prior to the windowing) but rather to eliminate the high quefrency noise dominated sections of the complex cepstrum. This concurs with the results of [12] in which it is reported that a Hanning smoothing of the log spectrum (which is equivalent to Hanning windowing of the complex cepstrum) improves wavelet recovery. It appears that there is little to choose between the rectangular or Hanning window of the complex cepstrum to improve the fidelity (MSE) of the recovered wavelet. We mention once again that these observations are probably data dependent and are influenced by the duration of the window as well.

D. Sequence Truncation

As mentioned above in subsection A, errors may be introduced by truncating the echo if it extends beyond the end of the record. In addition, aliasing of the echo impulse train in the complex cepstrum may occur. These errors can be reduced appreciably by exponentially windowing the sampled data sequence. But we suspect that if the exponential window is too severe, then the echo may be lost.

VI. DATA PROCESSING-SPEECH, SEISMIC, AND HYDROACOUSTIC

Three application areas which appear to be using cepstrum analysis quite frequently are speech research, seismology, and hydroacoustics. In the three subsections that follow we have tried to itemize the major data processing procedures that investigators in each of these areas tend to use. For some situations these lists may be simplistic but we feel that they are nonetheless indicative of the typical steps considered by some if not all investigators in these areas.

A. Speech [18], [19], [21], [30], [42], [46]-[51], [53], [55], [61]-[65], [66], [71]

Within speech research there are at least three problems to which cepstrum analysis is applied. The first is perhaps the most difficult. This problem seeks to achieve the deconvolution of three signals which form the basis for a model of the speech process. This simple model considers voiced sounds to be produced by quasi-periodic pulses of air which in turn cause the vocal cords to vibrate, producing glottal pulses which excite the vocal tract to finally produce speech. For nonnasal sounds the vocal tract is modeled as an all pole filter over short time intervals. The glottal source is modeled with zeros in the z domain, again over short time intervals. The vocalized speech signal is, therefore, modeled as the three fold convolution of an impulse train, the glottal impulse response, and the vocal tract impulse response. These three signals are to be deconvolved. This is difficult to achieve without considerable additional information or assumptions. It is less difficult to achieve the deconvolution of the pulse (impulse) train with the composite convolution of the glottal impulse response and vocal tract impulse response since these two time sequences occupy different quefrency ranges in the cepstrum domain.

Another related problem previously mentioned is to estimate the envelope of the speech spectrum. The speech spectrum is generally quite scalloped, i.e., it looks like an undulating picket fence. The scalloping or spectral modulation is
1438 PROCEEDINGS OF THE IEEE, VOL. 65, NO. 10, OCTOBER 1977
This will, of course, vary with the speech signal being analyzed and the individual investigator.

2) Zeros may be appended to the windowed speech signal to increase its effective record length; sometimes this is on the order of a factor of 10 but is usually less. A typical sampling rate is 10 kHz.

3) The speech bandpass is typically in the range of 50 Hz to 4 or 5 kHz with resonances appearing in the spectrum at the natural frequencies (formants) of the vocal tract. While the location of these formants varies with the phonation or articulation, a great deal is known about their typical location and bandwidth. This information helps considerably in the analysis of speech data.

4) Aliasing is not a serious problem in the analysis of speech data.

5) The composite convolution of the glottal impulse response and the vocal tract impulse response is generally less than 5 ms in the cepstrum domain. The first peak in the cepstrum due to the pulse train is in the vicinity of 8 ms. This information is used to design the shortpass and longpass lifters.

Fig. 4. (a) Speech signal. (b) The corresponding cepstrum.
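The quefrency figures in item 5 can be turned into a simple pitch estimator. The following is a minimal sketch under stated assumptions, not the procedure used for the figures: the test signal is a synthetic impulse train with an assumed 8 ms period passed through one hypothetical resonance, and the names `synthetic_voiced_frame`, `real_cepstrum`, and `pitch_period_ms` are illustrative only. The longpass lifter is reduced to a search window that starts at 5 ms.

```python
import numpy as np

FS = 10_000          # sampling rate in Hz (the typical value quoted above)
PITCH_SAMPLES = 80   # assumed pitch period: 8 ms at 10 kHz

def synthetic_voiced_frame(n_samples=512):
    """Impulse train passed through one decaying resonance (toy vocal tract)."""
    excitation = np.zeros(n_samples)
    excitation[::PITCH_SAMPLES] = 1.0
    n = np.arange(200)
    # Hypothetical single formant near 500 Hz with roughly 100 Hz bandwidth.
    resonance = np.exp(-np.pi * 100.0 * n / FS) * np.cos(2 * np.pi * 500.0 * n / FS)
    return np.convolve(excitation, resonance)[:n_samples]

def real_cepstrum(frame, nfft=1024):
    """Inverse DFT of the log magnitude of the Hanning-windowed, zero-padded frame."""
    windowed = frame * np.hanning(len(frame))
    log_mag = np.log(np.abs(np.fft.fft(windowed, nfft)) + 1e-12)
    return np.real(np.fft.ifft(log_mag))

def pitch_period_ms(cepstrum, low_ms=5.0, high_ms=20.0):
    """Pick the largest cepstral value above the 5 ms vocal-tract region."""
    lo = int(round(low_ms * 1e-3 * FS))
    hi = int(round(high_ms * 1e-3 * FS))
    peak = lo + int(np.argmax(cepstrum[lo:hi]))
    return 1e3 * peak / FS

cep = real_cepstrum(synthetic_voiced_frame())
estimated_ms = pitch_period_ms(cep)
```

The 5 ms lower bound keeps the search away from the low-quefrency region occupied by the glottal and vocal tract responses; the 20 ms upper bound is an arbitrary choice for this sketch.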
Example-Speech: In this simple example we compare the speech spectral envelopes obtained by both linear prediction and cepstrum processing.

The steps followed for cepstrum processing are those shown in Fig. 1(a) [or 1(b)]. The resultant envelope is obtained at step (6). Included in the complex logarithm step is the calculation of the squared magnitude of the DFT. All DFT calculations are performed by an FFT algorithm.

Fig. 5. (a) Speech spectrum of Fig. 4(a) and envelope derived by liftering the cepstrum in Fig. 4(b) (the superimposition is less than perfect). (b) Speech spectrum of Fig. 4(a) and envelope derived by linear prediction.

The speech record analyzed appears in Fig. 4(a) and was sampled at 10 kHz to give 256 data points with a 512 point FFT being used. The phonation was a sustained /i/. The record was then Hanning windowed (256 points) prior to cepstrum analysis. No zeros were appended for this example. The "cepstrum" (inverse DFT of $\log |X|^2$) is shown in Fig. 4(b). The peak at 6.7 ms corresponds to the pitch period, which can also be estimated from Fig. 4(a). Next, shortpass liftering was applied. The lifter was constant from zero to 1.5 ms and then
had a cosine taper from 1.5 ms to 2.5 ms and was zero beyond 2.5 ms. The final step was a forward FFT to yield the speech spectrum envelope shown in Fig. 5(a), which is superimposed, albeit imperfectly, upon the power spectrum of the windowed speech signal. The lifter selected gave the "best" spectral envelope fit in the opinion of the authors. However, the results are sensitive to the type of lifter used.

The peaks shown correspond to the formants, but it is possible in some cases for the cepstrum procedure described to yield false formants in between the actual formants.
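The shortpass lifter described in this example (flat to 1.5 ms, cosine taper to 2.5 ms, zero beyond) can be sketched as follows. This is an illustrative reconstruction under assumptions, not the authors' program: the function names are hypothetical, the lifter is applied symmetrically to positive and negative quefrencies so that the envelope stays real, and the input frame is a synthetic stand-in with an assumed 6.7 ms pulse spacing and one made-up resonance.

```python
import numpy as np

def cosine_taper_lifter(nfft, fs, flat_ms=1.5, stop_ms=2.5):
    """Shortpass lifter: 1 up to flat_ms, raised-cosine roll-off, 0 past stop_ms."""
    k = np.arange(nfft)
    q_ms = np.minimum(k, nfft - k) / fs * 1e3      # |quefrency| in ms
    lifter = np.zeros(nfft)
    lifter[q_ms <= flat_ms] = 1.0
    taper = (q_ms > flat_ms) & (q_ms < stop_ms)
    lifter[taper] = 0.5 * (1.0 + np.cos(np.pi * (q_ms[taper] - flat_ms)
                                        / (stop_ms - flat_ms)))
    return lifter

def cepstral_envelope(frame, fs, nfft=512):
    """Return (log power spectrum, smoothed log spectrum envelope)."""
    windowed = frame * np.hanning(len(frame))
    log_power = np.log(np.abs(np.fft.fft(windowed, nfft)) ** 2 + 1e-12)
    cepstrum = np.real(np.fft.ifft(log_power))
    envelope = np.real(np.fft.fft(cepstrum * cosine_taper_lifter(nfft, fs)))
    return log_power, envelope

# Synthetic stand-in for a voiced frame (assumed values, not the paper's data).
fs = 10_000
pulses = np.zeros(256)
pulses[::67] = 1.0                                  # about 6.7 ms pitch period
n = np.arange(150)
resonance = np.exp(-np.pi * 100.0 * n / fs) * np.cos(2 * np.pi * 500.0 * n / fs)
frame = np.convolve(pulses, resonance)[:256]

lifter = cosine_taper_lifter(512, fs)
log_power, envelope = cepstral_envelope(frame, fs)
```

Because everything above 2.5 ms is removed, the pitch ripple in the log spectrum is suppressed and only the slowly varying envelope remains; changing `flat_ms` and `stop_ms` shows how sensitive the envelope is to the lifter.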
The linear prediction procedure calculates the coefficients for an all pole filter

$$\frac{1}{1 - \sum_{k=1}^{p} a_k z^{-k}}$$

from the windowed speech signal by an autocorrelation technique [71]. For this example the windowed data was 20 ms with p = 16. The envelope of the spectrum was determined by finding the FFT of the sequence $\{1, -a_1, -a_2, \ldots, -a_p\}$ and then calculating the reciprocal of the FFT. Zeros were appended prior to calculating the 512 point FFT. The results appear in Fig. 5(b) superimposed upon the speech power spectrum. It will be noted that linear prediction provides a "better" spectral envelope and yields smaller formant bandwidths than cepstrum processing.

B. Seismic Data Processing [1], [18], [34], [36], [37], [57]-[59], [80]

For more than a decade cepstrum analysis has been applied to seismic data: 1) to determine the focal depth of a seismic event; 2) to remove spectral modulations caused by multipath reflection; and 3) to determine the slapdown phase which results from spallation of the earth's surface near ground zero, and other situations. Knowledge of the depth of the seismic event can be used to help discriminate whether the event is an earthquake or a man-made explosion. The elimination of multipath reflections assists the data interpreter in the determination of source (event) parameters. Similar remarks apply to spalling as well.

Usually, the power cepstrum, or a variation thereof (as discussed earlier), is used to estimate the (P - pP) time difference (see Fig. 6), which is the most realistic indicator of the depth of the event. The complex cepstrum is used to separate (deconvolve) the basic P phase wavelet from the impulse train caused by the echo or echoes. Epoch timing information can on occasion be obtained by longpass liftering the complex cepstrum. But this may require that the event be deep so that the P phase cepstral information and the echo information are adequately separated in quefrency.

Fig. 6. Simple representation of earth and two seismic waves resulting from a seismic event. The P wave is the direct, longitudinal wave. The pP wave is the single reflected wave. Neither passes through the inner or outer cores, but rather propagate through the mantle.

Good success in processing seismic data is, however, apparently dependent upon a good SNR and wide bandwidth data [34].

For many years predictive deconvolution (or linear prediction) has been used in the analysis of seismic data [66]-[86] (see [80] for a review). This procedure is parametric and proposes a model for the basic wavelet. As such it does not work particularly well if the wavelet to be removed is mixed phase. Cepstrum analysis (homomorphic deconvolution) is a more general method for deconvolution, and is effective when the cepstra of the signals to be deconvolved occupy different quefrency ranges. A method, called homomorphic prediction, has just been proposed which combines linear prediction and homomorphic deconvolution to more effectively analyze mixed phase data [18], [42], [57].

The specific information and procedures commonly used in processing seismic data are the following:

1) A short time series window (usually starting about 1 s before the onset of the signal and lasting from 3 to 7 s into the data record) is used. Some form of tapering is almost always applied, e.g., for a 7 s window, a linear taper may be applied for 1 s before the signal onset as well as to the last second of the record with the window being constant for the 5 s in between the two tapered ends. Longer windows are not generally used since these would include more of the coda (i.e., tails of the data) which contain too many multipath reflection signals.

2) A weighting of the time series may also be used, independent of whether a window is applied. This weighting is generally in the form $\alpha^{nT}$, where $0.96 < \alpha < 1.0$. This procedure tends to make the P phase more minimum phase.

3) Zeros are appended to the time series to increase its length, sometimes by as much as a factor of 10. A typical sampling rate is 20 samples per s. The windowed data record may then be 5 s to give 100 data samples. This record is then extended with zeros to a duration of 1024 samples.

4) The seismic bandpass for body waves is generally considered to be in the range 0.1 to 2-5 Hz depending on various factors [36]. Echo delay (epoch) times for seismic events are in the range $0.1 \le n_0 T \le$ several seconds. The lower end of this range requires 10 Hz bandwidth which is not always available. At 20 samples per s, the data is oversampled so aliasing is of no concern.

5) Since seismic data has a narrow passband, the spectrum is nonwhite. Anelastic absorption by the earth of teleseismic signals above 2 Hz means that the cepstrum is dominated by the bandpass characteristic of the earth [36]. This effect is aggravated by the frequency response of the sensing instrument, but this can be corrected by inverse filtering [36]. Pseudo-heterodyning, a translation in quefrency (a method analogous to conventional heterodyning), may be helpful for such data [9(a)].

6) Cepstra are also calculated using the coda and then averaged, taking into account the predictable travel time differences. Spectra are also similarly averaged to enhance spectral nulls [36], [37].

Example-Seismic Data: This example may be considered a simulation of seismic data with the echo being negative at the air-earth interface. In Fig. 7 we present four groupings of three graphs each. The first graph in the first group is the
composite signal, i.e., the wavelet plus the negative echo. The overall normalized peak amplitude of the composite signal is 1000 units. The echo occurs 0.5 s after the onset of the basic wavelet. The echo amplitude is 0.6 that of the original wavelet. The second graph in this group is the wavelet recovered by homomorphic filtering, i.e., notch liftering (rejecting) only the first peak in the cepstrum. The third graph is an estimate of the echo obtained by calculating the difference between the first and second graphs. At the top of this figure are the time scales for both the input data and the cepstrum. The spectrum frequency scale appears at the bottom of the figure.

An exponential window ($\alpha = 0.99$) was applied to the composite signal prior to calculating the cepstrum. This made the cepstrum of the composite signal nearly minimum phase.

The second grouping of three graphs presents the cepstra for the previous three waveforms respectively. The first two points at the extreme left in each cepstra were zeroed. The vertical line at 0.5 s shows the peak that was notch liftered. Ideally, the second and third cepstra should be identical when normalized, but due to the fact that only one cepstrum peak was notch liftered, these waveforms are in fact different.

The third grouping of three waveforms presents the respective amplitude spectra. Finally, we have in the last grouping the respective log spectra. The spectral nulls due to the echo can barely be seen due to the narrow bandwidth simulated in the first graph. The second graph is the log spectrum after notch liftering. Here it can be seen that the spectral nulls have not been completely removed. And finally we have the log spectrum of the estimate of the echo wavelet.

Fig. 7. Simulated seismic data example. The normalized amplitude of the wavelet plus a negative echo is 1000 units. The echo occurs at 0.5 s and is 0.6 that of the wavelet. An exponential window ($\alpha = 0.99$) was applied to the composite signal prior to calculating the cepstrum. The four groupings of graphs show the time domain data and the corresponding cepstra, spectra and log spectra, respectively.

C. Hydroacoustic Data Processing [18], [31], [35], [38], [45], [57]

The power cepstrum has been used to estimate the source depth of a known explosive charge by analyzing the data received at long ranges. This is accomplished by measuring the time period of the bubble pulse modulation on the spectrum which results in a peak in the power cepstrum [45].

The cepstrum has also been used to investigate multipath conditions in shallow water as well [31], [35], [38]. But the echoes are not all identical in waveshape as commonly assumed. The cepstrum is also apparently affected to a considerable degree by fluctuations in the transmission media, bottom reflections, and surface scattering [38].

As mentioned under the subsections on speech and seismic data processing, it has been proposed that linear prediction (predictive deconvolution) and homomorphic deconvolution be combined to more effectively analyze mixed phase signals [18], [42], [57]. This suggestion has been tested on marine seismic data with apparently good success [57].

The specific information and procedures commonly used in processing hydroacoustic data are the following:

1) The sampling rates used are dependent on specific applications as well as the computational resources available, but typically they are in the range of 100 to 1000 samples per s.

2) The effects of windowing are not considered to be as important as for seismic data. Windows commonly used include the rectangular (boxcar), Hamming, Hanning (cosine), and linear taper.

3) Rather than extend the data record with zeros, a longer time window is frequently used. It is commonly assumed that hydroacoustic data is stationary, thus time averaging can be employed. Typical window durations are 1024 or 2048 samples, or longer depending on the computational capacity available.

4) The spectrum is whitened prior to computing the power cepstrum. The whitening may be achieved with either the square root or the logarithmic operation.

5) The spectrum is often highpass liftered to remove low-quefrency components prior to computing the power cepstrum. This is a form of trend removal to reduce leakage in the cepstrum domain.

6) The expected echo delay (epoch) times are $0.02 \le n_0 T \le 2$ or 3 s. The bandwidth is at least 50 Hz.

VII. CONCLUDING REMARKS

A. The Effects of Noise

The effects of noise are discussed in a qualitative way at various points throughout the paper along with two recommended procedures for alleviating these effects for wavelet extraction and echo detection, namely, windowing the complex cepstrum and reducing errors due to aliasing and phase unwrapping by appending zeros to the sampled data sequence. Noise analysis is presented in a more quantitative and extensive manner in [2], [7], [9], [12], [26], [30]. We point out in particular that it has been recently shown that SNR alone is an insufficient measure for determining cepstrum performance and that the relative bandwidths of the signal and noise are also needed [7].
sis by homomorphic prediction," IEEE Trans. Acoust., Speech, and Signal Processing, vol. ASSP-24, pp. 327-332, Aug. 1976.
[19] A. V. Oppenheim, R. W. Schafer, and T. G. Stockham, Jr., "Nonlinear filtering of multiplied and convolved signals," Proc. IEEE, vol. 56, pp. 1264-1291, Aug. 1968.
[20] R. Rom, "On the cepstrum of two-dimensional functions," IEEE Trans. on Inform. Theory, pp. 214-217, Mar. 1975.
[21] R. W. Schafer, "Echo removal by discrete generalized linear filtering," Ph.D. Dissertation, M.I.T., Cambridge, MA, 1968.
[22] S. Senmoto, "Adaptive decomposition of composite signals in noise," Ph.D. Dissertation, University of Florida, Gainesville, FL, 1971.
[23] S. Senmoto and D. G. Childers, "Analysis of a composite signal by complex cepstrum and adaptive filter," Trans. Inst. Elec. Comm. Engrs., (Japan), pt. A, pp. 9-16, 1972.
[24] -, "Adaptive decomposition of a composite signal of identical unknown wavelets in noise," IEEE Trans. on Syst., Man, and Cybern., vol. SMC-2, pp. 59-66, Jan. 1972.
[25] M. J. Shensa, "Complex exponential weighting applied to homomorphic deconvolution," Geophys. J. Roy. Astron. Soc., vol. 44, pp. 379-387, 1976.
[26] D. P. Skinner, "Real-time composite signal decomposition," Ph.D. Dissertation, University of Florida, Gainesville, FL, 1974.
[27] D. P. Skinner and D. G. Childers, "The power, complex, and phase cepstra," presented at 1975 Proc. Nat. Telecommun. Conf., Dec. 1975.
[28] -, "Real-time composite signal decomposition," IEEE Trans. on Acoust., Speech, and Signal Processing, vol. ASSP-24, pp. 267-270, June 1976.
[29] R. G. Smith, "Cepstrum discrimination function," IEEE Trans. Inform. Theory, vol. IT-21, pp. 332-334, May 1975.
[30] T. G. Stockham, Jr., T. M. Cannon, R. B. Ingebretsen, "Blind deconvolution through digital signal processing," Proc. IEEE, vol. 63, pp. 678-692, Apr. 1975. (Refer to for other related reports and Master's theses.)
[31] P. L. Stoffa, P. Buhl, and G. M. Bryan, "The application of homomorphic deconvolution to shallow-water marine seismology-Part I: Models," Geophysics, vol. 39, no. 4, pp. 401-416, Aug. 1974.
[32] -, "Cepstrum aliasing and the calculation of the Hilbert transform," Geophysics, vol. 39, no. 4, pp. 543-544, Aug. 1974.

Theory-Related Material
[33] A. J. Berkhout, "On the minimum phase criterion of sampled signals," IEEE Trans. Geosci. Electron., vol. GE-11, pp. 186-198, Oct. 1973.

Applications-Homomorphic Filtering and Cepstra (with Theory)
[34] W. H. Bakun and L. R. Johnson, "The deconvolution of teleseismic P waves from explosions Milrow and Cannikin," Geophys. J. Roy. Astron. Soc., vol. 34, pp. 321-342, 1973.
[35] P. Buhl, P. L. Stoffa, and G. M. Bryan, "The application of homomorphic deconvolution to shallow-water marine seismology-Part II: Real data," Geophysics, vol. 39, no. 4, pp. 417-426, Aug. 1974.
[36] T. Cohen, "Source depth determinations using spectral, pseudo-autocorrelation and cepstral analysis," Geophys. J. Roy. Astron. Soc., vol. 20, pp. 223-231, 1970.
[37] T. J. Cohen, "Ps and pP phases from seven Pahute Mesa events," Bull. Seismological Soc. Amer., vol. 65, no. 4, pp. 1029-1032, Aug. 1975.
[38] P. O. Fjell, "Use of the cepstrum method for arrival times extraction of overlapping signals due to multipath conditions in shallow water," J. Acoust. Soc. Amer., vol. 59, no. 1, pp. 209-211, Jan. 1976.
[39] O. S. Halpeny, "Epoch detection by cepstrum analysis," Master's Thesis, University of Florida, Gainesville, FL, 1970.
[40] R. Kemerait and D. G. Childers, "Detection of multiple echoes immersed in noise," in Proc. 15th Midwest Symp. Circuit Theory, May 4-5, 1972, pp. 1-10.
[41] -, "Decomposition of pulse-type data by cepstrum techniques," 1973 IEEE Southeast Conf., Apr. 30, May 1-2, 1973, pp. F-4-1-F-44.
[42] G. E. Kopec, A. V. Oppenheim, and J. M. Tribolet, "Speech analysis by homomorphic prediction," submitted to IEEE Trans. Acoust., Speech and Signal Processing. (This paper is apparently an earlier version of [18].)
[43] L. R. LeBlanc, "Narrow-band sampled data techniques for detection via the underwater acoustic communication channel," IEEE Trans. Commun. Technol., vol. COM-17, pp. 481-488, Aug. 1969.
[44] J. H. Miles, G. H. Stevens, and G. G. Leininger, "Analysis and correction of ground reflection effects in measured narrowband sound spectra using cepstral techniques," presented at 90th Meeting of Acoust. Soc. Amer., Nov. 4-7, 1975.
[45] S. K. Mitchell and N. R. Bedford, "Long range sensing of explosive source depths using cepstrum," paper presented at 90th Meeting Acoust. Soc. Amer., Nov. 1975.
[46] A. Noll, "Short-time spectrum and 'cepstrum' techniques for vocal-pitch detection," J. Acoust. Soc. Amer., vol. 36, pp. 296-302, Feb. 1964.
[47] -, "Cepstrum pitch determination," J. Acoust. Soc. Amer., vol. 41, no. 2, pp. 293-309, Feb. 1967.
[48] -, "Clipstrum pitch determination," J. Acoust. Soc. Amer., vol. 44, no. 6, pp. 1585-1591, Dec. 1968.
[49] -, "Pitch determination of human speech by the harmonic product spectrum, the harmonic sum spectrum, and a maximum likelihood estimate," in Computer Processing in Communications, J. Fox, Ed. Brooklyn, NY: Polytechnic, 1970, pp. 779-797.
[50] A. Oppenheim, "Speech analysis-synthesis system based on homomorphic filtering," J. Acoust. Soc. Amer., vol. 45, no. 2, pp. 458-465, Feb. 1969.
[51] A. Oppenheim and R. W. Schafer, "Homomorphic analysis of speech," IEEE Trans. Audio Electroacoust., vol. AU-16, pp. 221-226, June 1968.
[52] J. Prabhakar and S. C. Gupta, "Separation of Rayleigh and Poisson density functions through homomorphic filtering," Nat. Elec. Conf., pp. 605-610, Dec. 1970.
[53] R. Schafer and L. R. Rabiner, "System for automatic formant analysis of voiced speech," J. Acoust. Soc. Amer., vol. 47, (Pt. 2), pp. 634-648, Feb. 1970.
[54] S. Senmoto and D. G. Childers, "Decomposition of a composite signal of unknown wavelets in noise," in Int. Conf. Comm. (ICC), 71C 28-COM, Montreal, Canada, 1971, pp. 5-14-5-19.
[55] J. Tenold, D. H. Crowell, R. H. Jones, T. H. Daniel, D. F. McPherson, A. N. Popper, "Cepstral and stationarity analysis of full-term and premature infants' cries," J. Acoust. Soc. Amer., vol. 56, no. 3, pp. 975-980, Sept. 1974.
[56] J. M. Tribolet, "A new phase unwrapping algorithm," submitted to IEEE Trans. Acoust., Speech, and Signal Processing.
[57] J. M. Tribolet, A. V. Oppenheim, and G. E. Kopec, "Deconvolution by homomorphic prediction," submitted to Geophysics.
[58] T. Ulrych, "Application of homomorphic deconvolution to seismology," Geophysics, vol. 36, no. 4, pp. 650-660, Aug. 1971.
[59] T. Ulrych, O. G. Jensen, R. M. Ellis, and P. G. Sommerville, "Homomorphic deconvolution of some teleseismic events," Bull. Seismological Soc. Amer., vol. 62, no. 5, pp. 1253-1265, Mar. 1972.

Extensions
[60] D. I. H. Moore and D. J. Parker, "On nonlinear filters involving transformation of the time variable," IEEE Trans. Inform. Theory, vol. IT-19, pp. 415-422, July 1973.

Books-Homomorphic Systems, Cepstrum Theory, and Applications
[61] B. Gold and C. M. Rader, Digital Processing of Signals. New York: McGraw-Hill, 1969.
[62] L. R. Rabiner and B. Gold, Theory and Application of Digital Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1975.
[63] A. V. Oppenheim and R. W. Schafer, Digital Signal Processing. Englewood Cliffs, NJ: Prentice-Hall, 1975.
[64] D. G. Childers and A. E. Durling, Digital Filtering and Signal Processing. St. Paul, MN: West Publishing, 1975.
[65] J. L. Flanagan, Speech Analysis, Synthesis, and Perception, 2nd Ed. New York: Springer-Verlag, 1972.

Linear Prediction, Predictive Deconvolution, Inverse Filtering, and Deconvolution
[66] B. S. Atal, "Effectiveness of linear prediction characteristics of the speech wave for automatic speaker identification and verification," J. Acoust. Soc. Amer., vol. 55, pp. 1304-1312, June 1974.
[67] D. G. Childers, R. S. Varga, and N. W. Perry, Jr., "Composite signal decomposition," IEEE Trans. Audio Electroacoust., vol. AU-18, pp. 471-477, Dec. 1970.
[68] M. P. Ekstrom, "A spectral characterization of the ill-conditioning in numerical deconvolution," IEEE Trans. Audio Electroacoust., vol. AU-21, pp. 344-348, Aug. 1973.
[69] B. R. Hunt, "A theorem on the difficulty of numerical deconvolution," IEEE Trans. Audio Electroacoust., vol. AU-20, pp. 94-95, Mar. 1972.
[70] -, "Deconvolution of linear systems by constrained regression and its relationship to the Wiener theory," IEEE Trans. Automat. Contr., vol. AC-17, pp. 703-705, Oct. 1972.
[71] J. Makhoul, "Linear prediction: a tutorial review," Proc. IEEE,
vol. 63, pp. 561-580, Apr. 1975.
[72] B. Mitchell, M. Landisman, and Z. A. Der, “Predictive deconvolution applied to long range seismic refraction observations,” Pure Appl. Geophysics, vol. 96, pp. 127-133, 1972.
[73] K. L. Peacock and S. Treitel, “Predictive deconvolution: theory and practice,” Geophysics, vol. 34, pp. 155-169, Apr. 1969.
[74] E. A. Robinson, “Predictive decomposition of seismic traces,” Geophysics, vol. 22, pp. 767-778, 1957.
[75] -, “Properties of the Wold decomposition of stationary stochastic processes,” Theory Prob. Appl., vol. 8, pp. 187-194, 1963.
[76] -, “Mathematical development of discrete filters for the detection of nuclear explosions,” J. Geophysics Res., vol. 68, pp. 5559-5567, 1963.
[77] -, “Multichannel z-transforms and minimum delay,” Geophysics, vol. 31, pp. 482-500, June 1966.
[78] -, “Predictive decomposition of time series with application to seismic exploration,” Geophysics, vol. 32, pp. 418-484, June 1967.
[79] J. Schell, “Dereverberation by linear systems techniques,” IEEE Trans. Geosci. Electron., vol. GE-9, no. 1, pp. 28-34, Jan. 1971.
[80] L. C. Wood and S. Treitel, “Seismic signal processing,” Proc. IEEE, vol. 63, pp. 649-661, Apr. 1975.

Books-Computational Seismology

[81] M. Bath, Mathematical Aspects of Seismology. Amsterdam: Elsevier, 1968.
[82] Computational Seismology, V. I. Keilis-Borok, Ed. New York: Consultants Bureau, 1972.
[83] E. A. Robinson, Multichannel Time Series Analysis with Digital Computer Programs. San Francisco: Holden-Day, 1967.
[84] -, Statistical Communication and Detection with Special Reference to Digital Data Processing of Radar and Seismic Signals. New York: Hafner, 1967.
[85] E. A. Robinson and S. Treitel, Robinson-Treitel Reader, 3rd Ed. Tulsa, OK: Seismographic Service Corp., 1973.
[86] Seismic Filtering, R. Van Nostrand, Ed. Tulsa, OK: Society of Exploration Geophysicists, 1971.
Abstract-A very general computer program is described that can be used for the synthesis of passive LC, active RC, and (infinite impulse response) digital filters. Although it operates in both batch and interactive modes, this discussion deals exclusively with the interactive mode, which is somewhat more general and very easy to use. Apart from offering superior accuracy and flexibility, this program offers many firsts, including the passive realization of complex quadruplets of transmission zeros, the simultaneous realization of two transmission zero pairs, the active RC leapfrog realization, and many others.

I. INTRODUCTION

were attempting to present a package of foolproof programs that could be used without any knowledge of filter design theory and still yield meaningful results. Great emphasis was hence placed on the computer second guessing the user and supplying answers to questions the designer should have answered in the first place. This philosophy necessarily led to considerable restrictions.

Today we should perhaps try to keep people off the computers (that should at least improve turnaround time) and the author himself came around to the point of view that no matter