
Original title: forensic audio.pdf

Uploaded by Lino García Morales


Adaptive Filtering Techniques for Forensic Audio

and Yolanda Blanco-Archilla¹

¹ Universidad Europea de Madrid
Departamento de Electrónica y Comunicaciones
E-mail: {lino.garcia,myolanda.blanco}@uem.es

² Universidad Politécnica de Madrid
Señales, Sistemas y Radiocomunicaciones
E-mail: berako@gaps.ssr.upm.es

³ Universidad de Vigo
Departamento de Teoría de la Señal y Comunicaciones
E-mail: marisol@gts.tsc.vigo.es

Summary. Adaptive filtering techniques have many applications in areas where the modeled signals or systems are constantly changing. An adaptive filter is a system whose structure is alterable or adjustable in such a way that its behavior or performance improves through contact with its environment [16]. This chapter focuses on adaptive filtering techniques for forensic audio applications. Multichannel multirate specialized structures are presented as general cases. Five approaches are studied: spectral equalization, adaptive linear prediction (ALP), adaptive noise cancellation (ANC), beamforming, and deconvolution or dereverberation. Objective and subjective measurements for the evaluation of intelligibility after speech enhancement are reviewed.

1 Introduction

criminal and civil investigation in courts of law. There are sophisticated technologies and scientific methodologies of forensic audio that include several specialties: voice and sound identification, audibility analysis, audio enhancement, authenticity analysis and others. The use of audio recordings of telephone conversations and interviews, as well as covert surveillance recordings, is an integral part of law enforcement. Such recordings are frequently of poor quality, but they are admissible if they are intelligible and meet the rules of evidence. It can be equally important that the recording is listenable and that the information it contains is easily discerned by the jury. Written transcripts can be vital evidence, and these can only be made with confidence if

2 Lino Garca et. al.

the recording is intelligible [5]. Forensic filtering can be used in the laboratory to reject noise and interference, as well as to restore, clarify or enhance the audio information, in order to assist law enforcement agencies in criminal investigation, civil investigation and the court process.

There are a number of possible degradations that can be found in a speech recording and that can affect its quality. On the one hand, the signal arriving at the microphone usually incorporates multiple sources: the desired signal plus other unwanted signals generally termed noise. On the other hand, there are different sources of distortion that can reduce the clarity of the desired signal: amplitude distortion caused by the electronics; frequency distortion caused by either the electronics or the acoustic environment; and time-domain distortion due to reflection and reverberation in the acoustic environment.

Adaptive filters have traditionally found a field of application in noise and reverberation reduction, thanks to their ability to cope with changes in the signals or in the sound propagation conditions of the room where the recording takes place. This chapter is an advanced tutorial on multichannel adaptive filtering techniques suitable for forensic audio, and aims to provide the relevant theoretical foundation. The use of more than one microphone is useful for audio surveillance purposes. This is possible when the room where the recording will be made can be prepared in advance. However, single-channel adaptive filtering can be seen as a particular case of the more complex and general multichannel adaptive filtering. The different adaptive filtering techniques are presented on a common foundation that is also useful in other forensic disciplines.

This chapter is organized as follows: in Sect. 1.1, we introduce a formal definition of the forensic audio scenario from the multiple-input and multiple-output (MIMO) perspective and the terminology that is used throughout the chapter. In Sect. 1.2, signals and systems related to forensic audio are briefly summarized. Section 1.3 discusses quality measurements. Section 2 is dedicated to the theoretical foundation of adaptive filters. In Sects. 2.1 and 2.2 the filter structures and the adaptation algorithms are discussed. The different cost functions, stochastic estimations and optimization strategies over transversal and lattice structures are shown. In Sect. 3 specialized structures based on multirate techniques are presented. These schemes make algorithms computationally efficient enough for the very long impulse responses involved in forensic audio applications and for real-time implementations. Two approaches are considered in Sects. 3.1 and 3.2: subband and frequency-domain partitioned adaptive filtering, respectively. In Sect. 3.3 the partitioned convolution is described and in Sect. 3.4 a delayless approach for real-time applications is discussed. Section 4 focuses on the adaptive filtering techniques for forensic audio applications: spectral equalization, linear prediction, noise cancellation, beamforming and deconvolution.

Adaptive Filtering Techniques for Forensic Audio 3

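[Figure 1: forensic audio scenario. Sources $s_1(n) \dots s_I(n)$ in a room described by $\mathbf{V}$, microphone signals $x_1(n) \dots x_P(n)$, recording noise $r(n)$, adaptive filter $\mathbf{W}$ and outputs $y_1(n) \dots y_O(n)$.]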

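The box on the left represents a room where the evidence is being recorded. $\mathbf{V}$ is a $P \times LI$ matrix that contains the acoustic impulse responses (AIR) between the $I$ sources and the $P$ microphones$^1$

$$\mathbf{V} = \begin{bmatrix} \mathbf{v}_{11} & \mathbf{v}_{12} & \cdots & \mathbf{v}_{1I} \\ \mathbf{v}_{21} & \mathbf{v}_{22} & \cdots & \mathbf{v}_{2I} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{v}_{P1} & \mathbf{v}_{P2} & \cdots & \mathbf{v}_{PI} \end{bmatrix}, \qquad \mathbf{v}_{pi} = \begin{bmatrix} v_{pi1} & v_{pi2} & \cdots & v_{piL} \end{bmatrix}. \tag{1}$$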

interference signals (to attenuate). $x_p(n)$, $p = 1 \dots P$, is a corrupted or poor-quality signal to be improved ($P = 1$ corresponds to the single-channel case). $r(n)$ is an additive noise or interference signal due to the recording device. The goal of forensic filtering is to obtain a matrix $\mathbf{W}$ such that $y_o(n) \approx s_i(n)$ corresponds to a restored or enhanced signal.

The signals in Fig. 1 are related by

$$\mathbf{y}(n) = \mathbf{W}\mathbf{x}(n). \tag{3}$$

$^1$ Note that the dashed lines represent only the direct path and some first reflections between the source $s_1(n)$ and the microphone with output signal $x_1(n)$. Each vector $\mathbf{v}_{pi}$ represents the AIR between the positions $i = 1 \dots I$ and $p = 1 \dots P$ and changes constantly with the position of the source or the microphone (e.g. a mobile recording device), the angle between them, the radiation pattern, etc.


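$$\mathbf{s}(n) = \begin{bmatrix} \mathbf{s}_1^T(n) & \mathbf{s}_2^T(n) & \cdots & \mathbf{s}_I^T(n) \end{bmatrix}^T, \tag{4}$$

$$\mathbf{s}_i(n) = \begin{bmatrix} s_i(n) & s_i(n-1) & \cdots & s_i(n-L+1) \end{bmatrix}^T.$$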

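$\mathbf{x}(n)$ is a $P \times 1$ vector (of length-$L$ blocks) that corresponds to the output of the convolutive system excited by $\mathbf{s}(n)$ and to the input of the adaptive filter of order $O \times LP$. $\mathbf{x}_p(n)$ is the input corresponding to channel $p$ and contains the last $L$ samples of the input signal $x$,

$$\mathbf{x}(n) = \begin{bmatrix} \mathbf{x}_1^T(n) & \mathbf{x}_2^T(n) & \cdots & \mathbf{x}_P^T(n) \end{bmatrix}^T, \tag{5}$$

$$\mathbf{x}_p(n) = \begin{bmatrix} x_p(n) & x_p(n-1) & \cdots & x_p(n-L+1) \end{bmatrix}^T.$$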

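$\mathbf{W}$ is an $O \times LP$ adaptive matrix that contains the impulse responses between the $P$ inputs and the $O$ outputs

$$\mathbf{W} = \begin{bmatrix} \mathbf{w}_{11} & \mathbf{w}_{12} & \cdots & \mathbf{w}_{1P} \\ \mathbf{w}_{21} & \mathbf{w}_{22} & \cdots & \mathbf{w}_{2P} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{w}_{O1} & \mathbf{w}_{O2} & \cdots & \mathbf{w}_{OP} \end{bmatrix}, \qquad \mathbf{w}_{op} = \begin{bmatrix} w_{op1} & w_{op2} & \cdots & w_{opL} \end{bmatrix}. \tag{6}$$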

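For a particular output $o = 1 \dots O$, the matrix $\mathbf{W}$ is normally adapted in rearranged form, as a column vector

$$\mathbf{w} = \begin{bmatrix} \mathbf{w}_1 & \mathbf{w}_2 & \cdots & \mathbf{w}_P \end{bmatrix}^T. \tag{7}$$

Finally, $\mathbf{y}(n)$ is an $O \times 1$ target vector,

$$\mathbf{y}(n) = \begin{bmatrix} y_1(n) & y_2(n) & \cdots & y_O(n) \end{bmatrix}^T.$$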

The notation is as follows: $a$ or $\alpha$ is a scalar, $\mathbf{a}$ is a vector and $\mathbf{A}$ is a matrix in the time domain; $\underline{\mathbf{a}}$ is a vector and $\underline{\mathbf{A}}$ is a matrix in the frequency domain. Equations (2) and (3) are in matrix form and correspond to convolutions in the time domain. The index $n$ is the discrete time instant, related to the time $t$ (in seconds) through the sampling frequency $F_s$ by $t = nT_s$, where $T_s = 1/F_s$ is the sampling period. Superscript $T$ denotes the transpose of a vector or a matrix, superscript $*$ denotes the conjugate, and superscript $H$ denotes the Hermitian (conjugate transpose). Note that, if the adaptive filters are $L \times 1$ vectors, $L$ samples have to be accumulated per channel (i.e. in a delay line) to compute the convolutions (2) and (3).

From a signal processing point of view, the particular problem of noise reduction generally involves two major steps: modeling and filtering. The modeling step generally involves determining some approximation of either the noise spectrum or the input signal spectrum. Then, some filtering is applied to emphasize the signal spectrum or to attenuate or reject the noise spectrum. If the parameters of the signal or noise model change over time, then the filtering must be adaptive [7].


The signals involved in forensic audio are very particular. In general, they can be grouped into three broad classes: speech, noise, and the acoustic impulse responses of the room involved.

Speech

Speech is produced by inhaling air into the lungs and exhaling it through a vibrating glottis and the vocal tract. The random, noise-like air flow from the lungs is spectrally shaped and amplified by the vibrations of the glottal cords and the resonance of the vocal tract. The effect of the glottal cords and the vocal tract is to introduce a measure of correlation and predictability on the random variations of the air from the lungs [29]. The speech signal (voice) is formed by silence, noisy and fricative segments, and harmonic or occlusive segments, and is highly modulated.

[Figure 2 panels: speech signal (amplitude vs. seconds); autocorrelation (vs. seconds); time-frequency analysis (Hz vs. seconds).]

Fig. 2. Example of a spoken word. The upper panel shows, in the time domain, a 0.5-second segment of the speech signal corresponding to the word "sota" [so ta]. The signal was sampled at 8192 Hz. The superimposed solid light gray line corresponds to the normalized energy of the segment and shows the low-frequency spectrum that defines the rate at which phonemes are uttered, while the dotted dark gray line corresponds to the normalized zero-crossing rate. The middle panel shows an autocorrelation analysis of the speech signal. The lower panel shows a time-frequency analysis of the same segment. Dark colors represent areas with high energy; light colors represent areas with low energy.
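The normalized energy and zero-crossing curves described in the caption can be approximated frame by frame. The following is a minimal sketch, not the authors' procedure: the frame length, the hop size and the synthetic two-tone test signal are assumptions chosen only for illustration.

```python
import math

def frame_features(x, frame_len, hop):
    """Return per-frame energy and zero-crossing rate, each scaled to a peak of 1."""
    energies, zcrs = [], []
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = x[start:start + frame_len]
        energies.append(sum(v * v for v in frame))
        # Count sign changes between consecutive samples of the frame.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        zcrs.append(crossings / (frame_len - 1))
    e_max = max(energies) or 1.0
    z_max = max(zcrs) or 1.0
    return [e / e_max for e in energies], [z / z_max for z in zcrs]

fs = 8192  # sampling rate quoted in the caption
n_half = fs // 10
# Hypothetical stand-in for speech: a strong 200 Hz tone ("voiced") followed
# by a weak 3 kHz tone ("fricative-like").
voiced = [math.sin(2 * math.pi * 200 * n / fs) for n in range(n_half)]
fricative = [0.2 * math.sin(2 * math.pi * 3000 * n / fs) for n in range(n_half)]
energy, zcr = frame_features(voiced + fricative, frame_len=256, hop=128)
```

In this toy signal the first frames have high energy and few zero crossings, while the last frames show the opposite pattern, which is the qualitative contrast between harmonic and noisy segments that the figure illustrates.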

At the bottom of Fig. 2 each of these parts can be seen: first, a fricative segment corresponding to "s", which is a noisy, relatively low-pass segment, although no particular spectral region has predominant energy.


energy is concentrated in narrow frequency bands which correspond to the formants, and it is directly related to the autocorrelation (a measure of predictability; see the middle of Fig. 2). The next segment, corresponding to the silence, has a low energy contribution (probably due to the recording device, since no anechoic chamber was used for the recording). The next segment, corresponding to "t", is of very short duration and is weakly harmonic. The last occlusive segment, corresponding to the phoneme "a", is clearly harmonic but with a different contribution from that of the phoneme "o". The probability density function (pdf) of the speech signal in the time domain is close to Laplacian, and the signal can be assumed to be sparse$^2$.

Noise

The noises and interferences present in the evidence recording can be strong, subtle and/or time varying, and they may occur in the acoustic environment where the microphones are located. Some common examples of acoustic noises and interferences are: air conditioning and fan hum, reverberation and echoes, engines and other machinery, wind and rain, radio and TV, live music, background speech in public places, other talkers, vehicular traffic and road noise, etc. Noise and unwanted sounds may lead to listener fatigue and to confusion for untrained listeners. Another class of noise is related to the distortion that can be introduced by the recording devices (bandwidth distortion). The two cases are treated differently. In the first case the noise and interference signal can be considered as one more input signal $s_i(n)$. In the second case the noise $r(n)$ is added equally to all the channels. Figure 3 shows an example of speech affected by additive noise.
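A contaminated example of the kind shown in Fig. 3 can be produced by scaling a noise sequence to a prescribed signal-to-noise ratio before adding it. This is a minimal sketch under stated assumptions: the sinusoid standing in for speech, the Gaussian noise and the helper names `power` and `mix_at_snr` are all hypothetical choices, not taken from the chapter.

```python
import math
import random

def power(x):
    """Mean squared value of a sequence."""
    return sum(v * v for v in x) / len(x)

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that 10*log10(P_speech / P_noise) equals snr_db, then add it."""
    gain = math.sqrt(power(speech) / (power(noise) * 10 ** (snr_db / 10)))
    scaled = [gain * v for v in noise]
    return [s + r for s, r in zip(speech, scaled)], scaled

random.seed(0)
speech = [math.sin(2 * math.pi * 0.01 * n) for n in range(2000)]  # stand-in for s_i(n)
noise = [random.gauss(0.0, 1.0) for _ in range(2000)]             # stand-in for r(n)
noisy, scaled_noise = mix_at_snr(speech, noise, snr_db=-3.0)

# Check the SNR that was actually obtained.
achieved_db = 10 * math.log10(power(speech) / power(scaled_noise))
```

At a negative SNR such as -3 dB the noise power exceeds the speech power, which is why the weaker segments are the first to become hard to see in the waveform.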

AIR

gymnasium can be very difficult. Reverberation is made up of sound reflections that smear or blur speech, making it less clear and distinct and therefore more difficult to understand. This effect grows stronger as the reverberation time of the room increases. The energy of a reverberating signal in a room depends on the size of the room and on the materials inside it (different materials have different reflection and absorption coefficients) [12]. The AIR, or acoustic transfer function, is the relation between the sound pressure at the receiver and the sound pressure at the source, and can be modeled with direct sound, first reflections and a diffuse field. The AIR of a typical room can be very long (thousands of taps) and in most cases is unknown. Figure 4 shows the consequences of a reverberant environment.

$^2$ A signal is sparse if only a few of its samples are significantly different from zero. Sparseness is usually modeled by a Laplacian pdf.


[Figure 3 panels: noisy speech signal (amplitude vs. seconds); autocorrelation (vs. seconds); time-frequency analysis (Hz vs. seconds).]

Fig. 3. Example of a contaminated speech word. The same speech segment depicted in Fig. 2 is contaminated with broadband noise at a signal-to-noise ratio (SNR) of -3 dB. It is difficult to recognize the occlusive segments; however, in spite of the high noise level, the speech signal is intelligible and perfectly recognizable (even down to SNR = -10 dB).

[Figure 4 panels: reverberant speech signal (amplitude vs. seconds); autocorrelation (vs. seconds); time-frequency analysis (Hz vs. seconds).]

Fig. 4. Example of a reverberant speech word. The same speech segment depicted in Fig. 2 is filtered with a room AIR to simulate a recording of the evidence in that room. Note the harmonic distortion.


introducing artifacts and to leave the remainder as musical as possible. Forensic audio is not constrained by these guidelines. The only goal, in most cases, is to audibly discern what is being said by certain individuals, so that the message previously obscured by the noise can be heard and understood. Audio enhancement is used to improve the listenability and/or intelligibility of a sound source by rejecting noise and interference signals (a process otherwise known as restoration). Noise reduction prepares the evidence so as to minimize confusion without altering the nature of the wanted signal.

Speech intelligibility is a measure of the effectiveness of understanding speech, as defined by the ISO standard [19]. The measurement is usually expressed as the percentage of a message that is understood correctly. Speech intelligibility does not imply speech quality. A synthesized voice message may be completely understood by the listener, yet be judged to be harsh, unnatural and of low quality. A message that lacks quality may still be intelligible.

Speech intelligibility can be assessed with two different methods: subjective assessment, based on the use of speakers and listeners; and objective assessment, based on the measurement of physical parameters of the transmission channel. Subjective tests are laborious, because they must involve a number of speakers and listeners to ensure representative results. Furthermore, the results are difficult to reproduce, even if the test includes several reference conditions. Objective measurements are much faster and more repeatable, but their results may not be reliable, because they do not measure intelligibility directly; instead, they determine physical parameters to predict intelligibility according to a certain model. Such a model may have restrictions that need to be considered.

The subjective intelligibility measure may be based on phonemes, words (meaningful or nonsense) and sentences; the results obtained with these three types of material can be related to each other, in principle and under controlled conditions, through the common intelligibility scale (CIS) [2]. The use of sentences is advantageous in the case of temporal distortion, such as that introduced by reverberation, because sentences allow a closer simulation of this kind of distortion on continuous speech. A very reproducible test based on sentence intelligibility is the speech reception threshold (SRT) [24]. The SRT gives an estimate of how much noise the enhancement algorithm can cope with.

Objective intelligibility assessment rests on the assumption that the intelligibility of a speech signal is based on the contribution of individual frequency bands. In [9], French and Steinberg showed that the information content of a speech signal is not equally distributed across frequencies, and developed a model of twenty contiguous frequency bands that contribute equally to an intelligibility index, the articulation index (AI). Based on this model, several objective measurements have been developed for different application fields. Among them, the only one that accounts correctly for band-limiting noise, reverberation, echoes and non-linear distortion is the speech transmission index (STI), standardized by the IEC [18]. The STI is based on the generation and analysis of an artificial probe signal that replaces the speech signal and on which it is easier to measure the effects of noise and distortion. Under some circumstances, it is possible to derive the STI mathematically from the impulse response of the transmission system, thus avoiding the use of the probe signal. The applicability of the STI to evaluating the performance of speech enhancement algorithms will be discussed in Sect. 5.

The major assumption in developing linear time-invariant (LTI) systems is that the unwanted noise can be modeled by an additive Gaussian process. However, in some physical and natural systems, noise cannot be modeled simply as an additive Gaussian process, and the signal processing solution may not be readily expressed in terms of mean squared errors (MSE)$^3$ [7]. Adaptive filtering techniques are largely used in audio applications where the ambient noise has a complicated spectrum, the statistics vary rapidly and the filter coefficients must change automatically in order to maintain good intelligibility of the speech signal. Thus, filtering techniques must be powerful, precise and adaptive.

The adaptive filter adjusts itself to remove the modeled signal that represents the unwanted signal (noise and interference) while preserving the target signal (speech), in such a way that the desired information can be recovered. There are two major approaches, depending on the number of reference signals available to help estimate the noise spectrum. Most non-referenced noise reduction systems have only a single input signal. The task of estimating the noise and/or signal spectra must then rely on the information available from that single input signal, and the noise reduction filter will also have only the input signal for filtering. Referenced adaptive noise reduction/cancellation systems work well only in constrained environments where a good reference input is available and the crosstalk problem is negligible or properly addressed.

$^3$ MSE is the best estimator for random (or stochastic) signals with a Gaussian distribution (normal process). The Gaussian process is perhaps the most widely applied of all stochastic models: most error processes, in an estimation situation, can be approximated by a Gaussian process; many non-Gaussian random processes can be approximated by a weighted combination of a number of Gaussian densities with appropriate means and variances; optimal estimation methods based on Gaussian models often result in linear and mathematically tractable solutions; and the sum of many independent random processes has a Gaussian distribution (central limit theorem) [29].


the evidence. More microphones are better than one. In a multichannel system ($P > 1$) it is possible to remove noise and interference signals by applying sophisticated adaptive filtering techniques that use spatial or redundant information. However, there are a number of noise and distortion sources that cannot be minimized by increasing the number of microphones. Examples are the surveillance, recording and playback equipment (e.g. a body wire on a police informant, a telephone tap, a wireless transmitter, an emergency service phone recorder, a memo recorder, a shotgun microphone or parabolic dish, a video tape recorder, or the like).

There are several classes of adaptive filtering [17] that can be useful for forensic audio, as will be shown in Sect. 4. The differences among them are based on the external connections to the filter.

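[Figure 5: adaptive filter applications. (a) Estimator: input $x(n)$, adaptive filter, parameters as output. (b) Predictor. (c) Joint-process estimator with inputs $x(n)$ and $d(n)$.]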

In the estimator application [see Fig. 5(a)], the internal parameters of the adaptive filter are used as the estimate.

In the predictor application [see Fig. 5(b)], the filter is used to filter an input signal, $x(n)$, in order to minimize the size of the output signal, $e(n) = x(n) - y(n)$, within the constraints of the filter structure. A predictor structure is a linear weighting of some finite number of past input samples used to estimate or predict the current input sample.


In the joint-process estimator application [see Fig. 5(c)] there are two inputs, $x(n)$ and $d(n)$. The objective is usually to minimize the size of the output signal, $e(n) = d(n) - y(n)$, in which case the objective of the adaptive filter itself is to generate an estimate of $d(n)$, namely $y(n)$, based on a filtered version of $x(n)$ [17].

structures. There are three types of linear filters with finite memory: the transversal filter, the lattice predictor and the systolic array [16].

Transversal

The transversal or finite impulse response (FIR) filter is the most suitable and the most commonly employed structure for an adaptive filter. The utility of this structure derives from its simplicity and generality. Its transfer function can be changed easily by controlling the $L$ coefficients $w_l$, $l = 1 \dots L$. There is a simple linear relationship between the parameters of the filter and its transfer function $W(z) = \sum_{l=1}^{L} w_l^{*} z^{-(l-1)}$. In a FIR filter, each output value $y(n)$ is determined by a finite weighted combination of the $L$ most recent values of the input signal

$$y(n) = \sum_{l=1}^{L} w_l^{*}\, x(n-l+1) = \mathbf{w}^H \mathbf{x}(n) = \langle \mathbf{w}, \mathbf{x}(n) \rangle, \tag{8}$$

with $\mathbf{w} = \begin{bmatrix} w_1 & w_2 & \cdots & w_L \end{bmatrix}^T$ and $\mathbf{x}(n) = \begin{bmatrix} x(n) & x(n-1) & \cdots & x(n-L+1) \end{bmatrix}^T$. Equation (8) is called the finite convolution sum$^4$.
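The finite convolution sum can be sketched directly as a tapped delay line. This is a minimal illustration for real-valued signals (so the conjugation has no effect); the function name `fir_filter` and the zero initial state of the delay line are assumptions of the example.

```python
def fir_filter(w, x):
    """Compute y(n) = sum_{l=1..L} w_l * x(n-l+1) for real w and x, zero initial state."""
    L = len(w)
    delay_line = [0.0] * L  # holds x(n), x(n-1), ..., x(n-L+1)
    y = []
    for sample in x:
        # Shift the delay line by one position and insert the newest sample.
        delay_line = [sample] + delay_line[:-1]
        # Inner product <w, x(n)>.
        y.append(sum(wl * xl for wl, xl in zip(w, delay_line)))
    return y

w = [0.5, 0.3, 0.2]
impulse_response = fir_filter(w, [1.0, 0.0, 0.0, 0.0])
y = fir_filter(w, [1.0, 2.0, 3.0])
```

Feeding a unit impulse returns the coefficients themselves followed by zeros, which is the sense in which the $L$ weights fully determine the transfer function.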

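[Figure: transversal filter diagram. The input $x(n)$ feeds a tapped delay line of $z^{-1}$ elements; the taps are weighted by $w_1, w_2, \dots, w_L$ and summed to give $y(n)$.]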

$^4$ Although audio signals are real, the Hermitian operator $H$ is used (instead of the transpose operator $T$) as the most general case, because these structures can be used within specialized multirate structures (Sect. 3) that generate complex signals.


transfer function of $W(z)$ in terms of delays and multipliers.

The output of the transversal predictor filter in Fig. 5(b) is the difference between the current input sample $x(n)$ and the predicted value, given by

$$y(n) = \sum_{l=1}^{L} w_l^{*}\, x(n-l) = \langle \mathbf{w}, \mathbf{x}(n-1) \rangle. \tag{9}$$
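The predictor differs from the plain transversal filter only in that the weights see past samples, never the current one. A minimal sketch for real signals follows; the first-order test signal $x(n) = 0.9^n$ and the name `prediction_error` are assumptions chosen so that the ideal predictor is known in advance.

```python
def prediction_error(w, x):
    """Return e(n) = x(n) - sum_{l=1..L} w_l * x(n-l), with zero initial state."""
    L = len(w)
    past = [0.0] * L  # holds x(n-1), ..., x(n-L)
    errors = []
    for sample in x:
        predicted = sum(wl * xl for wl, xl in zip(w, past))
        errors.append(sample - predicted)
        # Only after predicting does the current sample enter the delay line.
        past = [sample] + past[:-1]
    return errors

# A geometric sequence obeys x(n) = 0.9 * x(n-1), so a single weight of 0.9
# predicts it exactly from the second sample on.
x = [0.9 ** n for n in range(6)]
e = prediction_error([0.9], x)
```

Only the first sample, which has no past to be predicted from, survives in the error; the rest is removed up to rounding.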

illustrated in Fig. 5(c) is given by

$$y(n) = \sum_{l=1}^{L} w_l^{*}\, x(n-l+1) = \langle \mathbf{w}, \mathbf{x}(n) \rangle. \tag{10}$$

Multichannel Adaptive Transversal Filtering Extension

Figure 7 shows a multichannel adaptive filtering scheme using a transversal structure.

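[Figure 7: multichannel transversal scheme. Each channel input $x_p(n)$ is filtered by $\mathbf{w}_p$ to give $y_p(n)$, $p = 1 \dots P$; the channel outputs are combined and compared with $d(n)$.]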

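estimator as illustrated in Fig. 5(c) is given by

$$y(n) = \sum_{p=1}^{P} \sum_{l=1}^{L} w_{pl}^{*}\, x_p(n-l+1) = \sum_{p=1}^{P} \langle \mathbf{w}_p, \mathbf{x}_p(n) \rangle = \langle \mathbf{w}, \mathbf{x}(n) \rangle, \tag{11}$$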


Lattice

The lattice filter is an alternative to the transversal filter structure for the realization of a predictor [10]. Figure 8 shows a lattice-ladder joint-process estimator consisting of $L-1$ stages; the number $L-1$ is referred to as the predictor order. The coefficients of the lattice structure, $k_l$, $l = 1 \dots L-1$, are commonly called the reflection or PARCOR coefficients. In this framework, instead of applying the input signal to a tapped delay line, a linear prediction lattice structure is placed between the input and the weights. The $L$ observations in the vector $\mathbf{x}(n)$ are replaced by the set of backward prediction errors $\mathbf{b}(n)$.

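[Figure 8 diagram: stages $1 \dots L-1$ with forward errors $f_1(n) \dots f_L(n)$, backward errors $b_1(n) \dots b_L(n)$, reflection coefficients $k_1 \dots k_{L-1}$, unit delays $z^{-1}$, ladder weights $w_1 \dots w_L$ and output $y(n)$.]

Fig. 8. Multistage lattice filter.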

spanned by $x(n-1), \dots, x(n-L)$. The lattice filter results from finding a new set of vectors that span the same subspace and have the valuable property of being mutually orthogonal. Applying (9) to both prediction errors, forward and backward, and noting that, due to the symmetry of the autocorrelation function, $b_l = f_{L-l+1}$, $1 \le l \le L$, the recursive equations of the prediction errors are given by

$$b_l(n) = b_{l-1}(n-1) + k_l f_{l-1}(n), \qquad b_1(n) = x(n). \tag{13}$$

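Applying the projection theorem and knowing that the optimal backward and forward prediction errors have the same norm, $\|\mathbf{f}_l(n)\|^2 = \|\mathbf{b}_l(n)\|^2$, $1 \le l \le L$, the reflection coefficients can be obtained as

$$k_l = -\frac{\langle \mathbf{f}_{l-1}(n), \mathbf{b}_{l-1}(n-1) \rangle}{\|\mathbf{f}_{l-1}(n)\| \, \|\mathbf{b}_{l-1}(n-1)\|}. \tag{14}$$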
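One lattice stage can be sketched as follows. This is an illustration under explicit assumptions rather than the chapter's implementation: the signals are real, the reflection coefficient is taken as the negated inner product of the forward error and the delayed backward error divided by the product of their norms, and the forward recursion $f_l(n) = f_{l-1}(n) + k_l b_{l-1}(n-1)$, which does not appear in the surviving text, is assumed in its standard form.

```python
import math

def lattice_stage(f_prev, b_prev):
    """Run one lattice stage over a whole block; return (k, f_next, b_next)."""
    b_delayed = [0.0] + b_prev[:-1]  # b_{l-1}(n-1), zero initial state
    inner = sum(f * b for f, b in zip(f_prev, b_delayed))
    norm = math.sqrt(sum(f * f for f in f_prev) * sum(b * b for b in b_delayed))
    k = -inner / norm if norm > 0 else 0.0
    f_next = [f + k * b for f, b in zip(f_prev, b_delayed)]  # assumed forward recursion
    b_next = [b + k * f for f, b in zip(f_prev, b_delayed)]  # backward recursion
    return k, f_next, b_next

def energy(v):
    return sum(a * a for a in v)

# Strongly correlated test signal: x(n) = 0.9 * x(n-1).
x = [0.9 ** n for n in range(50)]
k1, f2, b2 = lattice_stage(x, x)  # both errors start as the input itself
```

For this signal the stage finds a coefficient close to -0.9 and the forward error energy drops well below the input energy. By the Cauchy-Schwarz inequality the coefficient computed this way always lies between -1 and 1.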

individual stages, each of them has the appearance of a lattice [16].


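useful for adaptive filtering because its predictor completely diagonalizes the autocorrelation matrix. The transfer function of a lattice filter structure is more complex than that of a transversal filter because the reflection coefficients are involved,

$$\mathbf{b}(n) = \mathbf{Q}\mathbf{x}(n), \tag{15}$$

$$y(n) = \mathbf{w}^H \mathbf{b}(n), \tag{16}$$

where $\mathbf{b}(n) = \begin{bmatrix} b_1(n) & b_2(n) & \cdots & b_L(n) \end{bmatrix}^T$ is the backward predictor vector and $\mathbf{Q}$ is a lower triangular matrix that depends on the reflection coefficients as follows

$$\mathbf{Q} = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 & 0 \\ k_1 & 1 & 0 & \cdots & 0 & 0 \\ k_2 & k_1 k_2 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ k_{L-2} & k_1 k_{L-2} & k_2 k_{L-2} & \cdots & 1 & 0 \\ k_{L-1} & k_1 k_{L-1} & k_2 k_{L-1} & \cdots & k_{L-2} k_{L-1} & 1 \end{bmatrix} \tag{17}$$

and $\mathbf{x}(n) = \begin{bmatrix} x(n) & x(n-1) & \cdots & x(n-L+1) \end{bmatrix}^T$. There are $L$ coefficients in the transversal part and $L-1$ reflection coefficients. $\mathbf{Q}$ is an $L \times L$ matrix that acts as a preconditioner on the vector $\mathbf{x}(n)$ and generates a decorrelated signal $\mathbf{b}(n)$. The error for both the predictor and the joint-process estimator applications illustrated in Fig. 5(b,c), respectively, depends on both sets of coefficients, $e = f\{\mathbf{w}, \mathbf{k}\}$.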

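structure. The multichannel version of the lattice-ladder structure [21] must consider the interchannel relationship of the reflection coefficients at each stage $l$

$$\mathbf{b}_l(n) = \mathbf{b}_{l-1}(n-1) + \mathbf{K}_l \mathbf{f}_{l-1}(n), \qquad \mathbf{b}_1(n) = \mathbf{x}(n), \tag{19}$$

where $\mathbf{f}_l(n) = \begin{bmatrix} f_{1l}(n) & f_{2l}(n) & \cdots & f_{Pl}(n) \end{bmatrix}^T$, $\mathbf{b}_l(n) = \begin{bmatrix} b_{1l}(n) & b_{2l}(n) & \cdots & b_{Pl}(n) \end{bmatrix}^T$, $\mathbf{x}(n) = \begin{bmatrix} x_1(n) & x_2(n) & \cdots & x_P(n) \end{bmatrix}^T$, and

$$\mathbf{K}_l = \begin{bmatrix} k_{11l} & k_{12l} & \cdots & k_{1Pl} \\ k_{21l} & k_{22l} & \cdots & k_{2Pl} \\ \vdots & \vdots & \ddots & \vdots \\ k_{P1l} & k_{P2l} & \cdots & k_{PPl} \end{bmatrix}.$$

A compact equivalent representation of (15) and (16) is possible:

$$y(n) = \mathbf{w}^H \mathbf{A}\mathbf{b}(n-1) + \mathbf{w}^H \mathbf{K}\mathbf{f}_1(n), \tag{22}$$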


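[Figure: multichannel adaptive filter diagram. Per-channel tapped delay lines with weights $w_{p1}, w_{p2}, \dots, w_{p(L-1)}, w_{pL}$, channel outputs $y_1(n) \dots y_P(n)$, combined output $y(n)$, desired signal $d(n)$ and error $e(n)$.]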

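where $\mathbf{w} = \begin{bmatrix} \mathbf{w}_1^T & \mathbf{w}_2^T & \cdots & \mathbf{w}_L^T \end{bmatrix}^T$ is an $LP \times 1$ vector of joint-process estimator coefficients, with $\mathbf{w}_l = \begin{bmatrix} w_{1l} & w_{2l} & \cdots & w_{Pl} \end{bmatrix}^T$. $\mathbf{b}(n) = \begin{bmatrix} \mathbf{b}_1^T(n) & \mathbf{b}_2^T(n) & \cdots & \mathbf{b}_L^T(n) \end{bmatrix}^T$ is an $LP \times 1$ backward predictor vector. $\mathbf{A}$ is an $LP \times LP$ matrix obtained by a recursive development of (18) and (19)

$$\mathbf{A} = \begin{bmatrix} \mathbf{0}_{P \times P} & \mathbf{0}_{P \times P} & \cdots & \mathbf{0}_{P \times P} & \mathbf{0}_{P \times P} & \mathbf{0}_{P \times P} \\ \mathbf{I}_{P \times P} & \mathbf{0}_{P \times P} & \cdots & \mathbf{0}_{P \times P} & \mathbf{0}_{P \times P} & \mathbf{0}_{P \times P} \\ \mathbf{K}_1 \mathbf{K}_2 & \mathbf{I}_{P \times P} & \cdots & \mathbf{0}_{P \times P} & \mathbf{0}_{P \times P} & \mathbf{0}_{P \times P} \\ \mathbf{K}_1 \mathbf{K}_3 & \mathbf{K}_2 \mathbf{K}_3 & \cdots & \mathbf{0}_{P \times P} & \mathbf{0}_{P \times P} & \mathbf{0}_{P \times P} \\ \vdots & \vdots & \ddots & \vdots & \vdots & \vdots \\ \mathbf{K}_1 \mathbf{K}_{L-3} & \mathbf{K}_2 \mathbf{K}_{L-3} & \cdots & \mathbf{0}_{P \times P} & \mathbf{0}_{P \times P} & \mathbf{0}_{P \times P} \\ \mathbf{K}_1 \mathbf{K}_{L-2} & \mathbf{K}_2 \mathbf{K}_{L-2} & \cdots & \mathbf{I}_{P \times P} & \mathbf{0}_{P \times P} & \mathbf{0}_{P \times P} \\ \mathbf{K}_1 \mathbf{K}_{L-1} & \mathbf{K}_2 \mathbf{K}_{L-1} & \cdots & \mathbf{K}_{L-2} \mathbf{K}_{L-1} & \mathbf{I}_{P \times P} & \mathbf{0}_{P \times P} \end{bmatrix}. \tag{23}$$

$\mathbf{I}_{P \times P}$ is the $P \times P$ identity matrix and $\mathbf{0}_{P \times P}$ is the $P \times P$ zero matrix. $\mathbf{K} = \begin{bmatrix} \mathbf{I}_{P \times P} & \mathbf{K}_1 & \mathbf{K}_2 & \cdots & \mathbf{K}_{L-1} \end{bmatrix}^T$ is an $LP \times P$ reflection coefficients matrix.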

An equivalence with the transfer function of a multichannel transversal filter is possible by rewriting the matrix $\mathbf{Q}$ of (17) and the vector $\mathbf{x}(n)$ used in (16) as


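$$\mathbf{Q} = \begin{bmatrix} \mathbf{I}_{P \times P} & \mathbf{0}_{P \times P} & \cdots & \mathbf{0}_{P \times P} & \mathbf{0}_{P \times P} \\ \mathbf{K}_1 & \mathbf{I}_{P \times P} & \cdots & \mathbf{0}_{P \times P} & \mathbf{0}_{P \times P} \\ \mathbf{K}_2 & \mathbf{K}_1 \mathbf{K}_2 & \cdots & \mathbf{0}_{P \times P} & \mathbf{0}_{P \times P} \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ \mathbf{K}_{L-3} & \mathbf{K}_1 \mathbf{K}_{L-3} & \cdots & \mathbf{0}_{P \times P} & \mathbf{0}_{P \times P} \\ \mathbf{K}_{L-2} & \mathbf{K}_1 \mathbf{K}_{L-2} & \cdots & \mathbf{I}_{P \times P} & \mathbf{0}_{P \times P} \\ \mathbf{K}_{L-1} & \mathbf{K}_1 \mathbf{K}_{L-1} & \cdots & \mathbf{K}_{L-2} \mathbf{K}_{L-1} & \mathbf{I}_{P \times P} \end{bmatrix}, \tag{24}$$

$$\mathbf{x}(n) = \begin{bmatrix} \mathbf{x}(n)^T & \mathbf{x}(n-1)^T & \cdots & \mathbf{x}(n-L+1)^T \end{bmatrix}^T, \tag{25}$$

where each block is $\mathbf{x}(n) = \begin{bmatrix} x_1(n) & x_2(n) & \cdots & x_P(n) \end{bmatrix}^T$. Note that (15), with $\mathbf{Q}$ defined as in (24), is the multichannel version of the Gram-Schmidt orthogonalization algorithm. Since $\mathbf{Q}$ is lower triangular with a unit diagonal, its determinant is one; therefore it is nonsingular and has an inverse.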

Once a filter structure has been selected, an adaptation algorithm must also be chosen. From a control engineering point of view, forensic filtering is a system identification problem that can be solved by choosing an optimality criterion or cost function $J(\mathbf{w})$ in a block or recursive approach. Several alternatives are available, and they generally trade increased complexity for improved performance (speed of adaptation and accuracy of the transfer function after adaptation, or misalignment$^5$).
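The misalignment mentioned above (see footnote 5) compares an adapted filter with the true impulse response. A minimal sketch for real vectors of equal length follows; the function names and the conversion to decibels are added conveniences of this example, not part of the chapter.

```python
import math

def misalignment(v, w):
    """Normalized squared distance ||v - w||^2 / ||v||^2 between true and adapted responses."""
    num = sum((a - b) ** 2 for a, b in zip(v, w))
    den = sum(a * a for a in v)
    return num / den

def misalignment_db(v, w):
    """Same measure in decibels; more negative means better identification."""
    return 10 * math.log10(misalignment(v, w))

v = [1.0, 0.5, 0.25]            # hypothetical true impulse response
w_before = [0.0, 0.0, 0.0]      # filter before adaptation
w_after = [0.99, 0.51, 0.24]    # filter after adaptation (made-up values)
```

An all-zero filter gives a misalignment of exactly one (0 dB), and a perfect match gives zero, so the measure reads directly as the fraction of the response energy that is still unexplained.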

Cost Functions

Cost functions are related to the statistics of the signals involved and depend on some error signal. The error signal $e(n)$ depends on the specific structure and on the adaptive filtering strategy, but it is usually some kind of similarity measure between the target signal $s_i(n)$ and the estimated signal $y_o(n) \approx s_i(n)$ (for $I = O$). The most common cost functions are listed in Table 1.

Stochastic Estimation

The input signal is divided into time blocks, and each block is processed independently or with some overlap. These algorithms have a finite memory.

$^5$ The misalignment of a transversal filter is defined by $\epsilon = \|\mathbf{v} - \mathbf{w}\|^2 / \|\mathbf{v}\|^2$. A conversion between lattice and transversal filters is possible and is useful for setting up an equivalent misalignment measure for lattice structures.


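Table 1. Cost functions

| $J(\mathbf{w})$ | Comments |
|---|---|
| $E\{e^2(n)\}$ | Mean squared error (MSE). Statistical mean operator |
| $\frac{1}{N} \sum_{n=0}^{N-1} e^2(n)$ | MSE estimator. The MSE is normally unknown |
| $e^2(n)$ | Instantaneous squared error |
| $\lvert e(n) \rvert$ | Absolute error. Instantaneous modulus of the error |
| $\sum_{m=0}^{n} \lambda^{n-m} e^2(m)$ | Least squares (weighted sum of the squared error) |
| $E\{\lVert \mathbf{f}_l(n) \rVert^2 + \lVert \mathbf{b}_l(n) \rVert^2\}$ | Mean squared predictor errors (for a lattice structure) |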
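The sample-based cost functions of the table can be evaluated on a short error sequence to see how they differ. This is a minimal sketch for a real-valued error; the weighting factor is written `lam` and given the value 0.5 purely for illustration, and the function names are hypothetical.

```python
def mse_estimate(e):
    """(1/N) * sum of e^2(n) over the block."""
    return sum(v * v for v in e) / len(e)

def instantaneous_squared_error(e):
    """e^2(n) at the most recent sample."""
    return e[-1] ** 2

def absolute_error(e):
    """|e(n)| at the most recent sample."""
    return abs(e[-1])

def weighted_least_squares(e, lam):
    """sum_{m=0..n} lam^(n-m) * e^2(m), with n the index of the most recent sample."""
    n = len(e) - 1
    return sum(lam ** (n - m) * e[m] ** 2 for m in range(n + 1))

e = [1.0, -2.0, 0.5]
costs = {
    "mse_estimate": mse_estimate(e),
    "instantaneous": instantaneous_squared_error(e),
    "absolute": absolute_error(e),
    "weighted_ls": weighted_least_squares(e, lam=0.5),
}
```

With a weighting factor below one the older errors count less, so the least squares cost sits between the instantaneous value and the plain sum of squares; with a factor of one it equals that plain sum.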

the adaptive algorithms because they emphasize the variations in the cross-correlation between the channels. However, this requires a careful structuring of the data, and it also increases the computational requirements: memory and processing. For a channel $p$, the input signal vector defined in (5) becomes a matrix of the form

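$$\mathbf{X}_p(n) = \begin{bmatrix} \mathbf{x}_p(n-N+1) & \mathbf{x}_p(n-N+2) & \cdots & \mathbf{x}_p(n) \end{bmatrix}, \tag{27}$$

$$\mathbf{X}_p(n) = \begin{bmatrix} x_p(n-N+1) & x_p(n-N+2) & \cdots & x_p(n) \\ x_p(n-N) & x_p(n-N+1) & \cdots & x_p(n-1) \\ \vdots & \vdots & \ddots & \vdots \\ x_p(n-N-L+2) & x_p(n-N-L+3) & \cdots & x_p(n-L+1) \end{bmatrix},$$

$$\mathbf{d}(n) = \begin{bmatrix} d(n-N+1) & d(n-N+2) & \cdots & d(n) \end{bmatrix}^T, \tag{28}$$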

where $N$ represents the memory size. The input signal matrix of the multichannel adaptive filter has the form

$$\mathbf{X}(n) = \begin{bmatrix} \mathbf{X}_1^T(n) & \mathbf{X}_2^T(n) & \cdots & \mathbf{X}_P^T(n) \end{bmatrix}^T. \tag{29}$$

In the most general case (with memory of order $N$), the input signal $\mathbf{X}(n)$ is a matrix of size $LP \times N$. For $N = 1$ (memoryless) and $P = 1$ (single channel), (29) reduces to (5).
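The structure of the per-channel data matrix can be checked with a small constructor. This sketch assumes a real signal stored as a list, zero values before the first sample, and the layout in which column $j$ holds the delay-line vector at time $n-N+1+j$; the name `data_matrix` is hypothetical.

```python
def data_matrix(x, n, L, N):
    """Build the L x N matrix whose columns are the delay-line vectors
    x_p(n-N+1), ..., x_p(n); samples before index 0 are taken as zero."""
    def sample(k):
        return x[k] if 0 <= k < len(x) else 0.0
    return [[sample(n - N + 1 + col - row) for col in range(N)]
            for row in range(L)]

x = [float(k) for k in range(10)]  # x(k) = k makes the indexing easy to read
X = data_matrix(x, n=5, L=3, N=2)
# Rows index the delay (0 .. L-1); columns index time (n-N+1 .. n).
X_memoryless = data_matrix(x, n=5, L=3, N=1)
```

With `N=1` the matrix collapses to the single column $[x(5)\ x(4)\ x(3)]^T$, which is the delay-line vector of the memoryless case.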

There are adaptive algorithms that use a memory $N > 1$ to modify the coefficients of the filter not only in the direction of the input signal $\mathbf{x}(n)$, but within the hyperplane spanned by the input vector $\mathbf{x}(n)$ and its $N-1$ immediate predecessors, $\begin{bmatrix} \mathbf{x}(n) & \mathbf{x}(n-1) & \cdots & \mathbf{x}(n-N+1) \end{bmatrix}$, per channel. The block adaptation algorithm updates its coefficients once every $N$ samples as

$$\mathbf{w}(m+1) = \mathbf{w}(m) + \Delta\mathbf{w}(m), \qquad \Delta\mathbf{w}(m) = \arg\min J(\mathbf{w}).$$

The time index $m$ refers to a single update of the weights from time $n$ to time $n+N$, based on the $N$ accumulated samples.


ministic iterative algorithms, allow the system to approach the solution with

the partial information of the signals every time using the general rule

$$\mathbf{w}(n) = \arg\min J(\mathbf{w}).$$

The new estimate $\mathbf{w}(n+1)$ is obtained from the previous estimate $\mathbf{w}(n)$ plus an adaptation step, or gradient, obtained from the minimization of the cost function $J(\mathbf{w})$. These algorithms have an infinite memory. The trade-off between convergence speed and accuracy is intimately tied to the length of the memory of the algorithm.

The error of the joint-process estimator using a transversal filter with memory can be rewritten as a vector,

$$\mathbf{e} = \mathbf{d} - \mathbf{X}^H\mathbf{w}. \tag{32}$$

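The unknown system solution, with the MSE as the cost function, leads to the normal or Wiener-Hopf equation. The energy of the error vector (the sum of the squared elements of the error vector) is given by the inner vector product as⁶

$$J(\mathbf{w}) = \|\mathbf{e}\|^2 = \mathbf{e}^H\mathbf{e} = (\mathbf{d} - \mathbf{X}^H\mathbf{w})^H(\mathbf{d} - \mathbf{X}^H\mathbf{w}), \tag{33}$$

$$\frac{\partial J(\mathbf{w})}{\partial \mathbf{w}} = -2\mathbf{X}\mathbf{d} + 2\mathbf{X}\mathbf{X}^H\mathbf{w}. \tag{34}$$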

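The Wiener filter coefficients are obtained by setting the gradient of the squared error function to zero, which yields

$$\mathbf{w} = \left(\mathbf{X}\mathbf{X}^H\right)^{-1}\mathbf{X}\mathbf{d} = \mathbf{R}^{-1}\mathbf{r}. \tag{35}$$

$$\mathbf{R} = \mathbf{X}\mathbf{X}^H = \begin{bmatrix}
\mathbf{X}_1\mathbf{X}_1^H & \mathbf{X}_1\mathbf{X}_2^H & \cdots & \mathbf{X}_1\mathbf{X}_P^H \\
\mathbf{X}_2\mathbf{X}_1^H & \mathbf{X}_2\mathbf{X}_2^H & \cdots & \mathbf{X}_2\mathbf{X}_P^H \\
\vdots & \vdots & \ddots & \vdots \\
\mathbf{X}_P\mathbf{X}_1^H & \mathbf{X}_P\mathbf{X}_2^H & \cdots & \mathbf{X}_P\mathbf{X}_P^H
\end{bmatrix}, \tag{36}$$

$$\mathbf{r} = \mathbf{X}\mathbf{d} = \begin{bmatrix} \mathbf{X}_1\mathbf{d} & \mathbf{X}_2\mathbf{d} & \cdots & \mathbf{X}_P\mathbf{d} \end{bmatrix}^T. \tag{37}$$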

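Fig. 1; for each input source $i = 1 \cdots I$, $P(P-1)/2$ relations are obtained: $\mathbf{x}_p^H\mathbf{w}_q = \mathbf{x}_q^H\mathbf{w}_p$ for $p, q = 1 \cdots P$, with $p \neq q$. For the vector $\mathbf{u} = \left[\sum_{p=2}^{P}\mathbf{w}_p^T\ \ -\mathbf{w}_1^T\ \cdots\ -\mathbf{w}_1^T\right]^T$, it is possible to verify that $\mathbf{R}\mathbf{u} = \mathbf{0}_{PL \times 1}$, thus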

6 The time index $n$ is omitted for simplicity. All vectors and matrices are referenced to the same time index $n$.


$\mathbf{R}$ is not invertible and no unique solution to the problem exists. The adaptive algorithm leads to one of many possible solutions, which can be very different from the target $\mathbf{v}$. This is known as the non-uniqueness problem.

For a prediction application, the cross-correlation vector $\mathbf{r}$ must be slightly modified, taking the particular form $\mathbf{r} = \mathbf{X}\mathbf{x}(n-1)$, with $P = 1$ and $\mathbf{x}(n-1) = [x(n-1)\ x(n-2)\ \cdots\ x(n-N)]^T$.

The optimal Wiener-Hopf solution $\mathbf{w}_{opt} = \mathbf{R}^{-1}\mathbf{r}$ requires knowledge of two quantities: the correlation matrix $\mathbf{R}$ of the input matrix $\mathbf{X}$ and the cross-correlation vector $\mathbf{r}$ between the input vector and the desired response $\mathbf{d}$. That is why it has little practical value. For the linear system given by (35) to have a solution, the correlation matrix $\mathbf{R}$ must be nonsingular.

It is possible to estimate both quantities according to the type of windowing applied to the input vector.

The sliding window method uses the sample data within a window of finite length $N$. The correlation matrix and the cross-correlation vector are estimated by averaging in time,

$$\mathbf{R}(n) = \mathbf{X}(n)\mathbf{X}^H(n)/N, \qquad \mathbf{r}(n) = \mathbf{X}(n)\mathbf{d}^*(n)/N. \tag{38}$$

The method that estimates the autocorrelation matrix as in (38), with samples organized as in (27), is known as the covariance method. The resulting matrix is positive semidefinite but it is not Toeplitz.
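The covariance method can be illustrated with a minimal sketch, assuming real-valued data (so the Hermitian transpose reduces to a plain transpose), a single channel and zero samples before the start of the record. The helper names `data_matrix` and `correlation_estimates` and the toy signal are hypothetical; the data matrix follows the layout of (27).

```python
def data_matrix(x, n, L, N):
    """L x N data matrix laid out as in (27): column j holds the tap vector
    [x(m), x(m-1), ..., x(m-L+1)] with m = n-N+1+j. Samples before 0 are zero."""
    def sample(i):
        return x[i] if 0 <= i < len(x) else 0.0
    return [[sample(n - N + 1 + j - i) for j in range(N)] for i in range(L)]


def correlation_estimates(X, d):
    """Sliding-window estimates R = X X^T / N and r = X d / N (real data)."""
    L, N = len(X), len(X[0])
    R = [[sum(X[a][j] * X[b][j] for j in range(N)) / N for b in range(L)]
         for a in range(L)]
    r = [sum(X[a][j] * d[j] for j in range(N)) / N for a in range(L)]
    return R, r


x = [1.0, 2.0, 0.0, -1.0, 3.0, 1.0]       # toy input record (assumed)
n, L, N = 5, 2, 3                          # current time, filter length, memory
X = data_matrix(x, n, L, N)
d = [x[n - N + 1 + j] for j in range(N)]   # toy desired signal: d(n) = x(n)
R, r = correlation_estimates(X, d)
print(R)   # symmetric, but the two diagonal entries differ: not Toeplitz
```

In this toy case the diagonal entries of `R` are 11/3 and 10/3, which shows directly why the estimate is symmetric and positive semidefinite without being Toeplitz.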

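The exponentially windowed method uses a recursive estimate governed by a forgetting factor in the range $0 < \lambda < 1$,

$$\mathbf{R}(n) = \lambda\mathbf{R}(n-1) + \mathbf{X}(n)\mathbf{X}^H(n), \qquad \mathbf{r}(n) = \lambda\mathbf{r}(n-1) + \mathbf{X}(n)\mathbf{d}^*(n). \tag{39}$$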

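With the matrix inversion lemma for $\mathbf{A} = \mathbf{B} + \mathbf{C}\mathbf{D}$,

$$\mathbf{A}^{-1} = \mathbf{B}^{-1} - \mathbf{B}^{-1}\mathbf{C}\left(\mathbf{I} + \mathbf{D}\mathbf{B}^{-1}\mathbf{C}\right)^{-1}\mathbf{D}\mathbf{B}^{-1}, \tag{40}$$

the following recursion is obtained

$$\mathbf{R}^{-1}(n) = \lambda^{-1}\mathbf{R}^{-1}(n-1) - \lambda^{-2}\mathbf{R}^{-1}(n-1)\mathbf{X}(n)\left[\mathbf{I} + \lambda^{-1}\mathbf{X}^H(n)\mathbf{R}^{-1}(n-1)\mathbf{X}(n)\right]^{-1}\mathbf{X}^H(n)\mathbf{R}^{-1}(n-1). \tag{41}$$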

When the excitation signal to the adaptive system is not stationary and the unknown system is time-varying, the exponential and sliding window methods allow the filter to forget or to eliminate errors that happened farther back in time. The price of this forgetfulness is a deterioration in the fidelity of the filter estimation [12].
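The recursive inverse of (41) can be checked numerically. The following is a minimal sketch for the memoryless, real-valued case ($N = 1$, so $\mathbf{X}(n)$ is a single vector and the bracketed term is a scalar). The function name `update_inverse`, the initialisation $\mathbf{R}(0) = 0.1\,\mathbf{I}$ and the sample vectors are assumptions made only for illustration.

```python
def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def update_inverse(P, x, lam):
    """Return R^{-1}(n) from P = R^{-1}(n-1), for R(n) = lam*R(n-1) + x x^T."""
    Px = matvec(P, x)                                   # P x (P is symmetric)
    denom = 1.0 + sum(a * b for a, b in zip(x, Px)) / lam
    L = len(x)
    return [[P[i][j] / lam - Px[i] * Px[j] / (lam * lam * denom)
             for j in range(L)] for i in range(L)]


lam = 0.9                                  # forgetting factor
R = [[0.1, 0.0], [0.0, 0.1]]               # R(0), assumed initialisation
P = [[10.0, 0.0], [0.0, 10.0]]             # P(0) = R(0)^{-1}
for x in ([1.0, 0.5], [-0.3, 2.0], [0.7, -1.2]):
    # Direct update of R(n), kept only to verify the recursion.
    R = [[lam * R[i][j] + x[i] * x[j] for j in range(2)] for i in range(2)]
    P = update_inverse(P, x, lam)

# P should now be the inverse of R, so their product is the identity.
PR = [[sum(P[i][k] * R[k][j] for k in range(2)) for j in range(2)]
      for i in range(2)]
print(PR)
```

The recursion never inverts a matrix explicitly; only a scalar division is needed per update in this memoryless case.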


A recursive estimator has the form defined in (31). In each iteration, the update of the estimator is made in a $\Delta\mathbf{w}(n)$ direction. For every deterministic iterative optimization scheme, a stochastic algorithm approach exists. It is enough to replace the terms related to the cost function and to calculate the approximate values from each new set of input/output samples. In general, most adaptive algorithms turn a stochastic optimization problem into a deterministic one⁷ and the solution obtained is an approximation to that of the original problem. The stochastic approximation method applied to the system identification problem leads to the least squares estimate as

$$\mathbf{w}(n) = \arg\min J(\mathbf{w}).$$

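$\mathbf{g} = 2(-\mathbf{r} + \mathbf{R}\mathbf{w})$, or by the equivalent $\mathbf{g} = \mathbf{X}\mathbf{e}^*$, considering $\mathbf{R}$ and $\mathbf{r}$ according to (38) or (39). It is possible to define recursive updating strategies, for each stage $l$, for lattice structures as

$$\mathbf{K}_l(n) = \arg\min J(\mathbf{K}_l).$$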

Optimization strategies

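the least squares type). It is possible to use a quadratic (second order) approximation of the error-performance surface around the current point, denoted $\mathbf{w}(n)$. Recalling the second-order Taylor series expansion of the cost function $J(\mathbf{w})$ around $\mathbf{w}(n)$, with $\Delta\mathbf{w} = \mathbf{w} - \mathbf{w}(n)$, we have

$$J(\mathbf{w} + \Delta\mathbf{w}) \cong J(\mathbf{w}) + \Delta\mathbf{w}^H\nabla J(\mathbf{w}) + \frac{1}{2}\Delta\mathbf{w}^H\nabla^2 J(\mathbf{w})\Delta\mathbf{w}. \tag{44}$$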

Deterministic iterative optimization schemes require knowledge of the cost function, the gradient (first derivatives) defined in (45) or the Hessian matrix (second order partial derivatives) defined in (46, 53), while stochastic recursive methods replace these functions by unbiased estimates.

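$$\nabla J(\mathbf{w}) = \begin{bmatrix} \dfrac{\partial J(\mathbf{w})}{\partial w_1} & \dfrac{\partial J(\mathbf{w})}{\partial w_2} & \cdots & \dfrac{\partial J(\mathbf{w})}{\partial w_L} \end{bmatrix}^T, \tag{45}$$

$$\nabla^2 J(\mathbf{w}) = \begin{bmatrix}
\dfrac{\partial^2 J(\mathbf{w})}{\partial w_1 \partial w_1} & \dfrac{\partial^2 J(\mathbf{w})}{\partial w_1 \partial w_2} & \cdots & \dfrac{\partial^2 J(\mathbf{w})}{\partial w_1 \partial w_L} \\
\dfrac{\partial^2 J(\mathbf{w})}{\partial w_2 \partial w_1} & \dfrac{\partial^2 J(\mathbf{w})}{\partial w_2 \partial w_2} & \cdots & \dfrac{\partial^2 J(\mathbf{w})}{\partial w_2 \partial w_L} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 J(\mathbf{w})}{\partial w_L \partial w_1} & \dfrac{\partial^2 J(\mathbf{w})}{\partial w_L \partial w_2} & \cdots & \dfrac{\partial^2 J(\mathbf{w})}{\partial w_L \partial w_L}
\end{bmatrix}. \tag{46}$$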

7

Sampled data of the random variable are used.


The vector $\mathbf{g}(n) = \nabla J(\mathbf{w})$ is the gradient evaluated at $\mathbf{w}(n)$, and the matrix $\mathbf{H}(n) = \nabla^2 J(\mathbf{w})$ is the Hessian of the cost function evaluated at $\mathbf{w}(n)$.
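As a worked illustration, and assuming real-valued signals (so that $\mathbf{X}^H = \mathbf{X}^T$), the gradient and Hessian of the quadratic cost (33) can be written out explicitly. For this cost the expansion (44) is exact, and a single full second-order step from any starting point reaches the solution of the normal equation (35):

```latex
\begin{aligned}
J(\mathbf{w}) &= (\mathbf{d} - \mathbf{X}^T\mathbf{w})^T(\mathbf{d} - \mathbf{X}^T\mathbf{w}), \\
\mathbf{g} &= \nabla J(\mathbf{w}) = -2\mathbf{X}\mathbf{d} + 2\mathbf{X}\mathbf{X}^T\mathbf{w} = 2(\mathbf{R}\mathbf{w} - \mathbf{r}), \\
\mathbf{H} &= \nabla^2 J(\mathbf{w}) = 2\mathbf{X}\mathbf{X}^T = 2\mathbf{R}, \\
J(\mathbf{w} + \Delta\mathbf{w}) &= J(\mathbf{w}) + \Delta\mathbf{w}^T\mathbf{g} + \tfrac{1}{2}\Delta\mathbf{w}^T\mathbf{H}\,\Delta\mathbf{w}, \\
\Delta\mathbf{w} &= -\mathbf{H}^{-1}\mathbf{g} = \mathbf{R}^{-1}\mathbf{r} - \mathbf{w}
\quad\Rightarrow\quad \mathbf{w} + \Delta\mathbf{w} = \mathbf{R}^{-1}\mathbf{r}.
\end{aligned}
```

The last line assumes that $\mathbf{R}$ is nonsingular; the first-order methods that follow avoid forming $\mathbf{H}^{-1}$ at the price of more iterations.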

First-order adaptation strategies choose a starting point $\mathbf{w}(0)$ and an increment $\Delta\mathbf{w}(n) = \mu(n)\mathbf{g}(n)$. Two decisions have to be made: the movement direction $\mathbf{g}(n)$, in which the cost function decreases fastest, and the step-size $\mu(n)$ in that direction. The iteration stops when $\|\Delta\mathbf{w}(n)\|$ falls below a prescribed error level,

$$\mathbf{w}(n+1) = \mathbf{w}(n) + \mu(n)\mathbf{g}(n). \tag{47}$$

Both parameters, $\mu(n)$ and $\mathbf{g}(n)$, are determined by a cost function. The second order methods generate values close to the solution in a minimum number of steps but, unlike the first order methods, they require second order derivatives, which are computationally very expensive. Adaptive filters and their performance are characterized by the criteria used to select the parameters $\mu(n)$ and $\mathbf{g}(n)$.

Table 2
SD | $\mu(n) = \|\mathbf{g}\|^2 / (\mathbf{g}^H\mathbf{R}\mathbf{g})$ | Steepest-Descent
CG | (See below) | Conjugate Gradient
NR | $\mu(n) = \mathbf{Q}$ | Newton-Raphson

a quadratic function, like in (33). Table 2 summarizes the optimization methods. SD is an iterative optimization procedure that is easy to implement and computationally very cheap. It is recommended for cost functions that have only one minimum and whose gradients are isotropic in magnitude with respect to any direction far from this minimum. The NR method improves on SD performance by using a carefully selected weighting matrix. The simplest form of NR uses $\mathbf{Q} = \mathbf{R}^{-1}$. Quasi-Newton methods (QN) are a special case of NR with $\mathbf{Q}$ simplified to a constant matrix. The solution to (33) is also the solution to the normal equation (35). The conjugate gradient (CG) [6] was designed originally for the minimization of convex quadratic functions but, with some variations, it has been extended to the general case. The first CG iteration is the same as that of the SD algorithm, and the successive new directions are selected in such a way that they form a set of vectors mutually conjugate with respect to the Hessian matrix (corresponding to the autocorrelation matrix, $\mathbf{R}$), $\mathbf{q}_i^H\mathbf{R}\mathbf{q}_j = 0$, $\forall i \neq j$. In general, CG methods have the form

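$$\mathbf{q}_l = \begin{cases} \mathbf{g}_l & l = 1, \\ \mathbf{g}_l + \beta_l\mathbf{q}_{l-1} & l > 1, \end{cases} \tag{48}$$

$$\mu_l = \frac{\langle \mathbf{g}_l, \mathbf{q}_l \rangle}{\langle \mathbf{q}_l, \mathbf{g}_l - \mathbf{p}_l \rangle}, \tag{49}$$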


$$\beta_l = \frac{\|\mathbf{g}_l\|^2}{\|\mathbf{g}_{l-1}\|^2}, \tag{50}$$

$$\mathbf{w}_{l+1}(n) = \mathbf{w}_l(n) + \mu_l(n)\mathbf{q}_l. \tag{51}$$

combination of previous $\mathbf{R}$-conjugate search directions. $\beta$ guarantees the $\mathbf{R}$-conjugation. Several methods can be used to obtain $\beta$. The method in (50) is known as Fletcher-Reeves. The gradients can be obtained as $\mathbf{g} = \nabla J(\mathbf{w})$ and $\mathbf{p} = \nabla J(\mathbf{w} - \mathbf{g})$.
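The CG recursion can be sketched on a small quadratic problem. The example below is a minimal illustration, not the chapter's exact formulation: it assumes real data, takes the descent direction as the residual $\mathbf{r} - \mathbf{R}\mathbf{w}$, and computes the step by an exact line search with $\mathbf{R}$ known, instead of the two-gradient form of (49). The Fletcher-Reeves ratio is the one in (50). The function name and the 2 x 2 system are hypothetical.

```python
def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(R, r, w, iterations):
    """Minimise the quadratic cost whose normal equation is R w = r."""
    g = [ri - ci for ri, ci in zip(r, matvec(R, w))]   # descent direction r - R w
    q = g[:]                                           # first direction: same as SD
    for _ in range(iterations):
        gg = dot(g, g)
        if gg == 0.0:                                  # already at the minimum
            break
        Rq = matvec(R, q)
        mu = gg / dot(q, Rq)                           # exact line search along q
        w = [wi + mu * qi for wi, qi in zip(w, q)]
        g = [gi - mu * v for gi, v in zip(g, Rq)]      # updated residual
        beta = dot(g, g) / gg                          # Fletcher-Reeves ratio
        q = [gi + beta * qi for gi, qi in zip(g, q)]   # next R-conjugate direction
    return w


R = [[4.0, 1.0], [1.0, 3.0]]     # symmetric positive definite (assumed example)
r = [1.0, 2.0]
w = conjugate_gradient(R, r, [0.0, 0.0], iterations=2)
print(w)
```

For an $L \times L$ positive definite $\mathbf{R}$ the method reaches the solution in at most $L$ iterations in exact arithmetic, which is why two iterations suffice here.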

Table 3
LMS | $\mu(n) = \alpha$ | Least Mean Squares
NLMS | $\mu(n) = \alpha / (\|\mathbf{x}(n)\|^2 + \delta)$ | Normalized LMS
FNLMS | $\mu(n) = \alpha / p(n)$, $p(n) = \gamma p(n-1) + L(1-\gamma)|x(n)|^2$ | Filtered NLMS
PNLMS | $\mu(n) = \alpha\mathbf{Q} / (\mathbf{x}^H(n)\mathbf{Q}\mathbf{x}(n) + \delta)$ | Proportionate NLMS

error cost function $J(\mathbf{w}) = e^2(n)$. The descent direction for all of them is the gradient $\mathbf{g}(n) = \mathbf{x}(n)e^*(n)$. The LMS algorithm is a stochastic version of the SD optimization method. NLMS makes the convergence speed of the algorithm independent of the signal power. FNLMS filters the signal power estimate; $0 < \gamma < 1$ is a weighting factor. PNLMS adaptively controls the size of each weight. $\mathbf{Q}$ is a diagonal matrix that weights the individual coefficients of the filters, $\alpha$ is a relaxation constant and $\delta$ guarantees that the denominator never becomes zero. These algorithms are computationally very cheap, but their convergence speed depends strongly on the spectral condition number of the autocorrelation matrix $\mathbf{R}$ (which relates the extreme eigenvalues) and can become unacceptable as the correlation between the $P$ channels increases.
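A normalized LMS update can be sketched in a few lines. This is a minimal single-channel, real-valued system identification example under assumed conditions: white Gaussian input, no measurement noise, and a hypothetical four-tap target system `v`. The names `alpha` and `delta` stand for the relaxation and regularisation constants, and the returned figure is the normalised misalignment $\|\mathbf{v} - \mathbf{w}\|^2 / \|\mathbf{v}\|^2$.

```python
import random


def nlms_identify(v, num_samples, alpha=0.5, delta=1e-6, seed=0):
    """Identify the FIR system v with NLMS; return (w, misalignment)."""
    rng = random.Random(seed)
    L = len(v)
    w = [0.0] * L
    x = [0.0] * L                              # tap vector [x(n), ..., x(n-L+1)]
    for _ in range(num_samples):
        x = [rng.gauss(0.0, 1.0)] + x[:-1]     # shift in a new input sample
        d = sum(vi * xi for vi, xi in zip(v, x))          # desired signal
        e = d - sum(wi * xi for wi, xi in zip(w, x))      # a priori error
        mu = alpha / (sum(xi * xi for xi in x) + delta)   # normalised step size
        w = [wi + mu * e * xi for wi, xi in zip(w, x)]    # w(n+1) = w(n) + mu g(n)
    num = sum((vi - wi) ** 2 for vi, wi in zip(v, w))
    den = sum(vi * vi for vi in v)
    return w, num / den


v = [0.8, -0.4, 0.2, 0.1]                      # hypothetical unknown system
w, misalignment = nlms_identify(v, num_samples=500)
print(w, misalignment)
```

With white input the correlation matrix is well conditioned and the misalignment decays quickly; with strongly correlated inputs the same code converges much more slowly, which is the limitation noted above.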

The projection algorithms in Table 4 modify the filter coefficients in the input vector direction and on the subspace spanned by the $N - 1$ predecessors. RLS is a recursive solution to the normal equation that uses the MSE as cost function⁸. There is an alternative fast version, FRLS. LMS-SW is a variant of SD that considers a data window. The step can be obtained by a line search. APA is a generalization of RLS and NLMS. APA is obtained by projecting the adaptive coefficients vector $\mathbf{w}$ onto the affine subspace. The affine subspace is obtained by means of a translation from the orthogonal origin to the subspace where the vector $\mathbf{w}$ is projected. PRA is a strategy to reduce the computational complexity of APA by updating the coefficients every $N$

8

Calculated according to (41).


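Table 4
RLS | $\mu(n) = \mathbf{R}^{-1}(n)$; $\mathbf{g}(n) = \mathbf{x}(n)e^*(n)$ | Recursive Least-Squares
LMS-SW | $\mu(n) = \|\mathbf{g}(n)\|^2 / (\mathbf{g}^H(n)\mathbf{X}(n)\mathbf{X}^H(n)\mathbf{g}(n) + \delta)$; $\mathbf{g}(n) = \mathbf{X}(n)\mathbf{e}^*(n)$ | Sliding-Window LMS
APA | $\mu(n) = \alpha\,[\mathbf{X}(n)\mathbf{X}^H(n) + \delta\mathbf{I}]^{-1}$; $\mathbf{g}(n) = \mathbf{X}(n)\mathbf{e}^*(n)$ | Affine Projection Algorithm
PRA | $\mathbf{w}(n+1) = \mathbf{w}(n-N+1) + \mu(n)\mathbf{g}(n)$; $\mu(n) = \alpha\,[\mathbf{X}(n)\mathbf{X}^H(n) + \delta\mathbf{I}]^{-1}$; $\mathbf{g}(n) = \mathbf{X}(n)\mathbf{e}^*(n)$ | Partial Rank Algorithm
DLMS | $\mu(n) = 1 / \langle \mathbf{x}(n), \mathbf{z}(n) \rangle$; $\mathbf{g}(n) = \mathbf{z}(n)e^*(n)$; $\mathbf{z}(n) = \mathbf{x}(n) - \dfrac{\langle \mathbf{x}(n), \mathbf{x}(n-1) \rangle}{\|\mathbf{x}(n-1)\|^2}\mathbf{x}(n-1)$ | Decorrelated LMS
TDLMS | $\mu(n) = \mathbf{Q} / \|\mathbf{x}(n)\|^2$, $\mathbf{Q}\mathbf{Q}^H = \mathbf{I}$; $\mathbf{g}(n) = \mathbf{x}(n)e^*(n)$ | Transform-Domain DLMS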

the last input (order 2). This changes the direction of the updating vector of the correlated input signals so that it corresponds to that of uncorrelated input signals. TDLMS decorrelates in the transform domain by means of a $\mathbf{Q}$ matrix.

The adaptation of the transversal section of the joint-process estimator in the lattice-ladder structure depends on the gradient $\mathbf{g}(n)$ and, indirectly, on the reflection coefficients, through the backward predictor. Note that, differentiating (16) with respect to $\mathbf{w}$, the gradient vector $\mathbf{g}(n) = \mathbf{b}(n)$ is obtained. However, the reflection coefficient adaptation depends on the gradient of $y(n)$ with respect to them

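$$\nabla J(\mathbf{K}) = \begin{bmatrix} \dfrac{\partial J(\mathbf{K})}{\partial K_1} & \dfrac{\partial J(\mathbf{K})}{\partial K_2} & \cdots & \dfrac{\partial J(\mathbf{K})}{\partial K_L} \end{bmatrix}^T, \tag{52}$$

$$\nabla^2 J(\mathbf{K}) = \begin{bmatrix}
\dfrac{\partial^2 J(\mathbf{K})}{\partial K_1 \partial K_1} & \dfrac{\partial^2 J(\mathbf{K})}{\partial K_1 \partial K_2} & \cdots & \dfrac{\partial^2 J(\mathbf{K})}{\partial K_1 \partial K_L} \\
\dfrac{\partial^2 J(\mathbf{K})}{\partial K_2 \partial K_1} & \dfrac{\partial^2 J(\mathbf{K})}{\partial K_2 \partial K_2} & \cdots & \dfrac{\partial^2 J(\mathbf{K})}{\partial K_2 \partial K_L} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 J(\mathbf{K})}{\partial K_L \partial K_1} & \dfrac{\partial^2 J(\mathbf{K})}{\partial K_L \partial K_2} & \cdots & \dfrac{\partial^2 J(\mathbf{K})}{\partial K_L \partial K_L}
\end{bmatrix}. \tag{53}$$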

matrix can be obtained as $\mathbf{G} = \nabla J(\mathbf{K})$. Two recursive updates are necessary

$$\mathbf{K}_l(n+1) = \mathbf{K}_l(n) + \nu_l(n)\mathbf{G}_l(n) \tag{55}$$


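Table 5
GAL | $\mu_l(n) = \alpha / \|\mathbf{b}_l(n)\|^2$; $\mathbf{g}_l(n) = e^*(n)\mathbf{b}_l(n)$; $\nu_l(n) = \sigma / B_{l-1}(n)$; $\mathbf{G}_l(n) = \mathbf{b}_{l-1}(n-1)\mathbf{f}_l^H(n) + \mathbf{b}_l(n)\mathbf{f}_{l-1}^H(n-1)$ | Gradient Adaptive Lattice
CGAL | (See below) | CG Adaptive Lattice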

GAL is an NLMS extension for a lattice structure that uses two cost functions: the instantaneous squared error for the transversal part and the prediction MSE for the lattice-ladder part, $B_l(n) = \lambda B_l(n-1) + (1-\lambda)\left(|\mathbf{f}_l(n)|^2 + |\mathbf{b}_l(n-1)|^2\right)$, where $\alpha$ and $\sigma$ are relaxation factors. For CGAL, the same algorithm described in (48-51) is used, but it is necessary to rearrange the gradient matrices of the lattice system in a column vector. It is possible to arrange the gradients of all lattice structures in matrices. $\mathbf{U}(n) = [\mathbf{g}_1^T(n)\ \mathbf{g}_2^T(n)\ \cdots\ \mathbf{g}_P^T(n)]^T$ is the $P \times L$ gradient matrix with respect to the transversal coefficients, $\mathbf{g}_p = [g_{p1}\ g_{p2}\ \cdots\ g_{pL}]^T$, $p = 1 \cdots P$. $\mathbf{V}(n) = [\mathbf{G}_1(n)\ \mathbf{G}_2(n)\ \cdots\ \mathbf{G}_P(n)]^T$ is a $P \times (L-1)P$ gradient matrix with respect to the reflection coefficients; rearranging these matrices in one single column vector, $[\mathbf{u}^T\ \mathbf{v}^T]^T$ is obtained

$$\mathbf{u} = [g_{11}\ \cdots\ g_{1L}\ g_{21}\ \cdots\ g_{2L}\ \cdots\ g_{P1}\ \cdots\ g_{PL}]^T,$$

$$\mathbf{v} = [G_{111}\ \cdots\ G_{1P1}\ \cdots\ G_{P11}\ \cdots\ G_{PP1}\ G_{112}\ \cdots\ G_{PP(L-1)}]^T.$$

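$$\mathbf{q}_l = \begin{cases} \mathbf{g}_l & l = 1, \\ \mathbf{g}_l + \beta_l\mathbf{q}_{l-1} & l > 1, \end{cases} \tag{56}$$

$$\mathbf{g}_l = \begin{cases} [\mathbf{u}_l^T\ \mathbf{v}_l^T]^T & l = 1, \\ \lambda\mathbf{g}_{l-1} + (1-\lambda)[\mathbf{u}_l^T\ \mathbf{v}_l^T]^T & l > 1, \end{cases} \tag{57}$$

$$\beta_l = \frac{\|\mathbf{g}_l\|^2}{\|\mathbf{g}_{l-1}\|^2}, \tag{58}$$

$$\mathbf{w}_{l+1} = \mathbf{w}_l + \mu\mathbf{u}_l, \tag{59}$$

$$\mathbf{K}_{l+1} = \mathbf{K}_l + \nu_l\mathbf{V}_l. \tag{60}$$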

The time index $n$ has been removed for simplicity. $0 < \lambda < 1$ is a forgetting factor which weights the importance of the innovation in the low-pass filtering specified in (57). The gradient selection is very important. A mean value that uses the more recent coefficients is needed for the gradient estimation and to generate more than one conjugate direction vector (57).


The adaptive filters used for forensic audio are probably very long (due to the AIRs). Multirate adaptive filtering works at a lower sampling rate, which reduces the complexity [26]. Depending on how the data and filters are organized, these approaches may improve performance and avoid end-to-end delay. Multirate schemes adapt the filters in smaller sections at a lower computational cost. This is only necessary for real-time implementations. Two approaches are considered. The subband adaptive filtering approach splits the spectrum of the signal into a number of subbands that can be adapted independently, after which the filtering can be carried out in fullband. The frequency-domain adaptive filtering approach partitions the signal in the time-domain and projects it into a transformed domain (i.e. frequency) with better properties for adaptive processing. In both cases the input signals are transformed into a more desirable form before adaptive processing, and the adaptive algorithms operate in transformed domains whose basis functions orthogonalize the input signal, speeding up the convergence. The partitioned convolution is necessary for fullband delayless convolution and can be seen as an efficient frequency-domain convolution.

band-pass filters for basis functions and replacing the fixed gains with adaptive filters. Several implementations are possible. A typical configuration uses an analysis filter bank, a processing stage and a synthesis filter bank. Unfortunately, this approach introduces an end-to-end delay due to the synthesis filter bank. Figure 10 shows an alternative structure which adapts in subbands and filters in full-band to remove this delay [25].

$K$ is the decimation ratio, $M$ is the number of bands and $N$ is the prototype filter length. $k$ is the low-rate time index. The sample rate in the subbands is reduced to $F_s/K$. The input signal per channel is represented by a vector $\mathbf{x}_p(n) = [x(n)\ x(n-1)\ \cdots\ x(n-L+1)]^T$, $p = 1 \cdots P$. The full-band adaptive filter per channel, $\mathbf{w}_p = [w_{p1}\ w_{p2}\ \cdots\ w_{pL}]^T$, is obtained by means of the T operator as

M/2

wp = (hmK wpm )K gm , (61)

m=1

from the subband adaptive filters of each channel $\mathbf{w}_{pm}$, $p = 1 \cdots P$, $m = 1 \cdots M/2$ [25]. The subband filters are very short, of length $C = \lceil (L+N-1)/K \rceil - \lceil N/K \rceil + 1$, which allows the use of much more complex algorithms. Although the input signal vector per channel $\mathbf{x}_p(n)$ has size $L \times 1$, it acts as a delay line which, for each iteration $k$, updates $K$ samples. K is an operator that means


Fig. 10. Subband adaptive filtering. This configuration is known as open-loop because the error is in the time-domain. An alternative closed-loop configuration can be used, where the error is in the subband-domain. Gray boxes correspond to efficient polyphase implementations. See details in [25].


synthesis filter in subband $m$ obtained by modulating a prototype filter. H is a polyphase matrix of a generalized discrete Fourier transform (GDFT) of an oversampled ($K < M$) analysis filter bank [8]. This is an efficient implementation of a uniform complex modulated analysis filter bank. In this way, only a prototype filter⁹ $\mathbf{p}$ is necessary.

It is possible to select different adaptive algorithms or parameter sets for each subband. For a delayless implementation, the full-band convolution may be carried out by a partitioned convolution.

The basic operation in frequency-domain adaptive filtering (FDAF) is to transform the input signal into a more desirable form before the adaptation process starts [26], in order to work with matrix multiplications instead of dealing with slow convolutions. The frequency-domain transform employs one or more discrete Fourier transforms (DFT), the T operator in Fig. 11, and can be seen as a pre-processing block that generates decorrelated output signals. In the more general FDAF case, the output of the filter in the time-domain (11) can be seen as the direct frequency-domain translation of the block LMS (BLMS) algorithm. The efficiency is obtained by taking advantage of the equivalence between the linear convolution and the circular convolution (multiplication in the frequency-domain). It is possible to obtain the linear convolution between a finite length sequence (the filter) and an infinite length sequence (the input signal) by overlapping certain elements of the data sequence and retaining only a subset of the DFT.

The partitioned block frequency-domain adaptive filtering (PBFDAF) was developed to deal efficiently with such situations [23]. The PBFDAF is a more efficient implementation of the LMS algorithm in the frequency-domain. It reduces the computational burden and bounds the user delay. In general, the PBFDAF is widely used due to its good trade-off between speed, computational complexity and overall latency. However, when working with long AIRs, the convergence properties provided by the algorithm may not be enough. This technique makes a sequential partition of the impulse response in the time-domain prior to a frequency-domain implementation of the filtering operation. This time segmentation allows individual coefficient updating strategies to be set up for different sections of the adaptive canceller, thus avoiding the need to disable the adaptation in the complete filter. In the PBFDAF case, the filter is partitioned transversally in an equivalent structure. Partitioning $\mathbf{w}_p$ into $Q$ segments (of length $K$) we obtain

$$y(n) = \sum_{p=1}^{P}\sum_{q=1}^{Q}\sum_{m=0}^{K-1} x_p(n - qK - m)\,w_{p(qK+m)}. \tag{62}$$

9 The prototype filter is a low-pass filter. The band-pass filters are obtained by modulating a prototype filter.


[Fig. 11. Block diagram of the PBFDAF: transforms T and T⁻¹, delays z⁻ᴷ, partition filters w₁₁ … w_QP per channel, desired signal d(n) and error e(k).]

where the total filter length $L$, for each channel, is a multiple of the length of each segment, $L = QK$, $K \leq L$. Thus, using the appropriate data sectioning procedure, the $Q$ linear convolutions (per channel) of the filter can be independently carried out in the frequency-domain with a total delay of $K$ samples instead of the $QK$ samples needed by standard FDAF implementations. Figure 11 shows the block diagram of the algorithm using the overlap-save method. In the frequency-domain, with matrix notation, (62) can be expressed as

$$\mathbf{Y} = \mathbf{X} \odot \mathbf{W}, \tag{63}$$

the Fourier transform of the $Q$ partitions and $P$ channels of the input signal matrix $\mathbf{X}$. $\mathbf{F}$ represents the DFT matrix, defined as $\mathbf{F} = W_M^{mn}$ of size $M \times M$, and $\mathbf{F}^{-1}$ its inverse. Of course, in the final implementation, the DFT matrix should be replaced by the much more efficient fast Fourier transform (FFT). $\mathbf{X}$ is $2K \times P$-dimensional (assuming 50% overlap between the new block and the previous one). It should be taken into account that the algorithm adapts every $K$ samples. $\mathbf{W}$ represents the filter coefficient matrix adapted in the frequency-domain (also $M \times Q \times P$-dimensional), while the operator $\odot$ multiplies the elements one by one, which in (63) represents a circular convolution. The output vector $\mathbf{y}$ can be obtained as the double sum (rows) of the $\mathbf{Y}$ matrix. First we obtain an $M \times P$ matrix which contains the output


all the outputs we obtain the whole system output, $\mathbf{y}$. Finally, the output in the time-domain is obtained by using $\mathbf{y}$ = last $K$ components of $\mathbf{F}^{-1}\mathbf{y}$. Notice that the sums are performed prior to the time-domain translation. In this way we save $(P-1)(Q-1)$ FFTs in the complete filtering process. As in any adaptive system, the error can be obtained as

$$\mathbf{e} = \mathbf{d} - \mathbf{y}, \tag{64}$$

with $\mathbf{d} = [d(mK)\ d(mK+1)\ \cdots\ d((m+1)K-1)]^T$. The error in the frequency-domain (for the update of the filter coefficients) can be obtained as

$$\mathbf{e} = \mathbf{F}\begin{bmatrix} \mathbf{0}_{K \times 1} \\ \mathbf{e} \end{bmatrix}. \tag{65}$$

volution implementation. In the same way, for the block gradient estimation, it is necessary to employ the same error vector in the frequency-domain for each partition $q$ and channel $p$. This can be achieved by generating an error matrix $\mathbf{E}$ with dimensions $M \times Q \times P$ which contains replicas of the error vector defined in (65) along the dimensions $P$ and $Q$ ($\mathbf{E} \leftarrow \mathbf{e}$ in the notation). The update of the weights is performed as

$$\mathbf{G} = \mathbf{X} \odot \mathbf{E}. \tag{67}$$

This is the unconstrained version of the algorithm, which saves two FFTs from the computational burden at the cost of decreasing the convergence speed. The constrained version basically makes a gradient projection. The gradient matrix is transformed into the time-domain and is transformed back into the frequency-domain using only the first $K$ elements of $\mathbf{G}$ as

$$\mathbf{G} = \mathbf{F}\begin{bmatrix} \mathbf{G} \\ \mathbf{0}_{K \times Q \times P} \end{bmatrix}. \tag{68}$$

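gradient matrix to vectors and the reverse [11]. The vectors $\mathbf{g}$ and $\mathbf{p}$ in (48, 49) should be replaced by $\mathbf{g}_l \rightarrow \mathbf{G}_l$, $\mathbf{G}_l = \nabla J(\mathbf{W}_l)$ and $\mathbf{p}_l \rightarrow \mathbf{P}_l$, $\mathbf{P}_l = \nabla J(\mathbf{W}_l - \mathbf{G}_l)$, with the gradient estimate obtained by averaging the instantaneous gradient estimates over $N$ past values, $\mathbf{G}_l = \nabla J(\mathbf{W}_l) = \frac{2}{N}\sum_{k=1}^{N} \mathbf{G}_{lk}\big|_{\mathbf{W}_l, \mathbf{X}_{lk}, \mathbf{d}_{lk}}$.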


$[\mathbf{v}_1\ \mathbf{v}_2\ \cdots\ \mathbf{v}_P]^T$ of size $N = LP \times 1$ and initially partitioned into a reasonable number $Q$ of equally-sized blocks $\mathbf{v}_q$, $q = 1 \cdots Q$, of length $K$. Each of these blocks is treated as a separate impulse response and convolved by a standard overlap-and-save process, using the T operator (FFT windows of length $L$). All input data are processed in overlapped blocks of $L$ samples (each block starting $L - K$ samples after the previous one). Each block is zero-padded to length $L$ (typically equal to $2K$) and transformed with the FFT, so that a collection of $Q$ frequency-domain filters $\mathbf{v}_q$ is obtained. The results of the multiplications of these $Q$ filters $\mathbf{v}_q$ with the FFTs of the $Q$ input blocks are summed, producing the same result as the unpartitioned convolution, by means of proper delays applied to the blocks of convolved data. Finally, a T⁻¹ operator (IFFT) is applied to the first accumulator to deliver an output data block (obviously, only the last $L - K$ samples of the block). Each block of input data needs to be FFT transformed just once, and thus the number of forward FFTs is minimized [1]. The main advantage compared to unpartitioned convolution is that the latency of the whole filtering process is just $M$ points instead of $2N$, and thus the I/O delay is kept to a low value, provided that the impulse response is partitioned into a sensible number of chunks (8-32). Figure 12 outlines the whole process.

Suppose that $A = (A_1, A_2, \cdots, A_Q)$ is the set of multiplications of the first data block and $B = (B_1, B_2, \cdots, B_Q)$ that of the second; then for time index 1 it is only necessary to consider $A_1$. At the next time index, corresponding to $K + 1$ samples, the sum is formed with $(B_Q, B_1 + A_2, B_2 + A_3, \cdots, B_{Q-1} + A_Q)$. If $C = (C_1, C_2, \cdots, C_Q)$ corresponds to the third block, the sum is formed with $(C_Q, C_1 + B_2 + A_3, C_2 + B_3 + A_4, \cdots, C_{Q-1} + B_Q)$. This sum can be implemented efficiently using a double buffering technique [1].
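A uniformly partitioned overlap-save convolution can be sketched as follows. This is a minimal single-channel illustration under stated assumptions: blocks of $2K$ samples with hop $K$, a naive DFT in place of an FFT, and a frequency-domain delay line of past input spectra instead of the accumulator and double buffering scheme described above (both arrangements give the same output). All function names and the test data are hypothetical.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(M^2) DFT; a practical implementation would use an FFT."""
    M = len(x)
    sign = 1.0 if inverse else -1.0
    out = [sum(x[n] * cmath.exp(sign * 2j * cmath.pi * m * n / M)
               for n in range(M)) for m in range(M)]
    return [v / M for v in out] if inverse else out


def partitioned_convolution(x, v, K):
    """Convolve x with v using Q = len(v)/K partitions and blocks of 2K."""
    assert len(v) % K == 0 and len(x) % K == 0
    Q = len(v) // K
    # Each partition is zero-padded to 2K and transformed only once.
    V = [dft(v[q * K:(q + 1) * K] + [0.0] * K) for q in range(Q)]
    history = [[0j] * (2 * K)] * Q       # spectra of the Q latest input blocks
    previous = [0.0] * K
    y = []
    for start in range(0, len(x), K):
        current = x[start:start + K]
        # One forward transform per input block (previous + current samples).
        history = [dft(previous + current)] + history[:-1]
        previous = current
        # Sum the Q spectral products before a single inverse transform.
        acc = [sum(history[q][m] * V[q][m] for q in range(Q))
               for m in range(2 * K)]
        # Overlap-save: only the last K samples of the block are valid.
        y.extend(val.real for val in dft(acc, inverse=True)[K:])
    return y


def direct_convolution(x, v):
    return [sum(v[k] * x[n - k] for k in range(len(v)) if n - k >= 0)
            for n in range(len(x))]


x = [1.0, -2.0, 0.5, 3.0, 0.0, 1.5, -1.0, 2.0]
v = [0.5, 0.25, -0.125, 1.0]
y_fast = partitioned_convolution(x, v, K=2)
y_ref = direct_convolution(x, v)
print(max(abs(a - b) for a, b in zip(y_fast, y_ref)))
```

Each output block becomes available after only $K$ new input samples, which is the latency advantage of the partitioned scheme over a single long transform.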

The filtering operation can be made delayless by operating the first block in the time-domain (direct convolution) while the rest of the blocks continue to operate in the frequency domain [22]. The fast convolution starts after the samples have been processed by direct convolution. The direct convolution allows samples to be delivered to the output while data is incoming. This approach is applicable to the multirate frameworks described.

Once the theoretical foundations of adaptive filtering have been reviewed, the most important techniques that can be applied to forensic filtering are introduced.


Fig. 12. Partitioned convolution. Each output signal block is produced taking only the $L - K$ last samples of the block.

The adaptive spectral equalization is widely used for noise suppression and corresponds to the single-input and single-output (SISO) estimator application (class a, Fig. 5); a single microphone, $P = 1$, is employed. This approach estimates a noiseprint spectrum and subtracts it from the whole signal in the frequency-domain.

The Wiener filter estimator is the result of estimating $y(n)$ from $s(n)$ so as to minimize the MSE $\|y(n) - s(n)\|^2$, given by $\mathbf{y} = \mathbf{Q}\mathbf{x}$, $\mathbf{x} = \mathbf{s} + \mathbf{r}$, which results in

$$\mathbf{q} = \frac{|\mathbf{x}|^2 - |\mathbf{d}|^2}{|\mathbf{x}|^2}, \tag{69}$$

$\mathbf{Q} = \mathrm{diag}[q_1\ q_2\ \cdots\ q_M]$ is a diagonal matrix which contains the spectral gain in the frequency-domain; normally T is a short-time Fourier transform (STFT), suitable for non-stationary signals, and T⁻¹ its inverse. In this case the algorithm is known as short-time spectral attenuation (STSA). The $M \times 1$ vector $\mathbf{q}$ contains the main diagonal components of $\mathbf{Q}$. $\mathbf{d}$ is the noise


[Fig. 13. Block diagram of the spectral equalizer: transform T, Wiener filter estimator w, inverse transform T⁻¹; signals d(n) and r(n).]

spectrum $\mathbf{d} = \mathbf{r}$ from the mixture $\mathbf{x}$ (noisy signal) is necessary (in intervals when the speech is absent and only the noise is present). The STFT is defined as $\mathbf{x}_k = \sum_{n=1}^{N} h(n)x(m-n)W_M^{mk}$, $m = 0 \cdots M-1$, where $k$ is the time index about which the short-time spectrum is computed, $m$ is the discrete frequency index, $h(n)$ is an analysis window, $N$ dictates the duration over which the transform is computed, and $M$ is the number of frequency bins at which the STFT is computed. For stationary signals the squared magnitude of the STFT provides a sample estimate of the power spectrum of the underlying random process. The form (69) is basic to nearly all the noise reduction methods investigated over the last forty years [12]. The specific form used to obtain $\mathbf{Q}$ is known as the suppression rule.

Power Subtraction

be estimated if its magnitude is estimated as

$$s^{\alpha} = |\mathbf{x}|^{\alpha} - \beta|\mathbf{d}|^{\alpha}, \tag{70}$$

and the phase of the noisy signal $\mathbf{x}$ can be used, if its SNR is reasonably high, in place of that of $\mathbf{s}$. $\alpha$ is an exponent and $\beta$ is a parameter introduced to control the amount of noise to be subtracted ($\beta = 1$ for full subtraction and $\beta > 1$ for over subtraction). A paramount issue in spectral subtraction is to obtain a good noise estimate; its accuracy greatly affects the noise reduction performance [3].
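A suppression rule of this kind can be sketched as a per-bin gain. The example below is a minimal illustration that assumes the generalised subtraction form $|s|^{\alpha} = |x|^{\alpha} - \beta|d|^{\alpha}$, with $\alpha = 2$ giving power subtraction. The clamping of negative results to zero and the optional `floor` argument are additions made for the example, not part of the rule described above; the function name and the magnitudes are hypothetical.

```python
def suppression_gains(noisy_mag, noise_mag, alpha=2.0, beta=1.0, floor=0.0):
    """Per-bin gain q such that q * |x| is the subtracted magnitude estimate.

    alpha is the exponent (2 gives power subtraction) and beta controls the
    amount of noise subtracted. floor is an assumed lower bound on the gain.
    """
    gains = []
    for x, d in zip(noisy_mag, noise_mag):
        if x <= 0.0:
            gains.append(floor)            # empty bin: nothing to scale
            continue
        # Negative differences are clamped to zero (assumed rectification).
        ratio = max(1.0 - beta * (d / x) ** alpha, 0.0)
        gains.append(max(ratio ** (1.0 / alpha), floor))
    return gains


noisy = [5.0, 1.0, 2.0]     # |x| per frequency bin (toy values)
noise = [3.0, 1.0, 4.0]     # |d| noiseprint per frequency bin (toy values)
gains = suppression_gains(noisy, noise)
print(gains)
```

The enhanced spectrum is then obtained by multiplying each noisy bin by its gain and reusing the noisy phase; bins where the noise estimate exceeds the observation are fully attenuated in this sketch.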

the deterministic part $y(n) \approx s(n)$ and the stochastic part $e(n) \approx r(n)$, assuming that the noise and interference signal has a broadband spectrum. ALP corresponds


with a single microphone, P = 1.

Fig. 14. Adaptive linear predictor.

Most signals, such as speech and music, are partially predictable and partially random. The random input models the unpredictable part of the signal, whereas the filter models the predictable structure of the signal. The aim of linear prediction is to model the mechanism that introduces the correlation in a signal [29]. The solution to this system corresponds to a Wiener solution (35) with the cross-correlation vector, $\mathbf{r}$, slightly modified. The delay $z^{-D}$ in the ALP filter should be selected in such a way that $d(n) = x(n) + r(n)$ and $d(n-D)$ are still correlated. If $D$ is too long, the correlation between $d(n)$ and $d(n-D)$ is weak and the signal is unpredictable for the ALP filter; for that reason it cannot be canceled suitably. If $D$ is too short, the deterministic part of the signal in $d(n)$ and $d(n-D)$ remains correlated after $D$; for that reason it can be predicted and cancelled by the ALP filter. $D = 1$ means that the voice in $d(n)$ and $d(n-D)$ is strongly correlated. A cascade of independently adapted ALP filters of lower order improves the modeling of the general ALP filter. In this case, the prediction is performed in successive refinements, the adaptation steps can be greater, and each stage is less affected by the disparity of eigenvalues, which results in a faster convergence.
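The role of the delay $D$ can be illustrated with a minimal sketch of an adaptive linear predictor driven by NLMS. The setting is an assumption chosen for clarity: a pure tone stands in for the predictable part and white Gaussian noise for the broadband part. The function name `adaptive_line_enhancer` and all parameter values are hypothetical.

```python
import math
import random


def adaptive_line_enhancer(d, L=16, D=1, alpha=0.1, delta=1e-6):
    """Predict d(n) from d(n-D), ..., d(n-D-L+1) with NLMS.

    Returns (y, e): the predictable part and the prediction error.
    """
    w = [0.0] * L
    y, e = [], []
    for n in range(len(d)):
        # Delayed tap vector; samples before the start of the record are zero.
        x = [d[n - D - k] if n - D - k >= 0 else 0.0 for k in range(L)]
        y_n = sum(wi * xi for wi, xi in zip(w, x))
        e_n = d[n] - y_n
        mu = alpha / (sum(xi * xi for xi in x) + delta)
        w = [wi + mu * e_n * xi for wi, xi in zip(w, x)]
        y.append(y_n)
        e.append(e_n)
    return y, e


rng = random.Random(1)
tone = [math.sin(2 * math.pi * 0.05 * n) for n in range(4000)]   # predictable
noise = [rng.gauss(0.0, 0.1) for _ in range(4000)]               # broadband
d = [s + r for s, r in zip(tone, noise)]
y, e = adaptive_line_enhancer(d)

tail = range(3000, 4000)                                  # after convergence
error_power = sum(e[n] ** 2 for n in tail) / 1000         # close to noise power
tone_error = sum((y[n] - tone[n]) ** 2 for n in tail) / 1000
print(error_power, tone_error)
```

After convergence the predictor output follows the tone while the error keeps mostly the broadband component. Increasing `D` beyond the correlation time of the component that should be kept in `e` is how the split between the two outputs is controlled.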

The adaptive noise cancellation (ANC) cancels the primary unwanted noise $r(n)$ by introducing a canceling antinoise of equal amplitude but opposite phase using a reference signal. This reference signal is derived from one or more sensors located at points near the noise and interference sources, where the signal of interest is weak or undetectable. A typical ANC configuration is depicted in Fig. 15. Two microphones are used, $P = 2$. The primary input $d(n) = s(n) + r(n)$ collects the sum of the unwanted noise $r(n)$ and the speech signal $s(n)$, and the auxiliary or reference input measures the noise signal $x(n) = r(n)$.

ANC corresponds to the multiple-input and single-output (MISO) joint-process estimator application (class c, Fig. 5) with at least two microphones, $P = 2$.


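[Fig. 15. Adaptive noise canceller: primary input d(n) = s(n) + r(n), reference input x(n), adaptive filter w, output y(n) and error e(n).]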

4.4 Beamforming

consists of advanced multichannel multidimensional (space-time domain) filtering techniques that enhance the desired signal as well as suppress the noise signal. In beamforming, two or more microphones are arranged in an array of some geometric shape. A beamformer is then used to filter the sensor outputs, and it amplifies or attenuates the signals depending on their direction of arrival (DOA), $\theta$. The spatial response, or beampattern, of a beamformer generally features a combination of mainlobes that may be aimed at the target sources, and smaller sidelobes and null points aimed at the interference sources. Beampatterns are generally frequency-dependent, unless the beamformer is specifically designed to be frequency independent.

Fig. 16. Fixed beamforming (FB) allows a given directivity pattern to be shaped. The adaptive block matrix (ABM) or blocking matrix, with coefficient-constrained adaptive filters, prevents the target signal from leaking into the adaptive interference canceller (AIC). The AIC uses norm-constrained adaptive filters that can further improve the robustness against target signal cancellation.


array, i.e. the distance of the source from the array is much greater than the distance between the microphones (the spherical wavefronts emanating from the sources can be approximated as plane wavefronts). Each source $s_i(n)$ arrives at microphone 1 with a delay $\tau_i = \ell\cos\theta_i / v$ relative to its arrival at microphone 2, because it has to travel an extra distance $\ell\cos\theta_i$; $\theta_i$ is the DOA of $s_i(n)$ and $v \approx 355\ \mathrm{m\,s^{-1}}$ is the velocity of sound. $\ell \leq v/F_s$ represents the spatial sampling interval of the wavefield; it has to fulfill this inequality to avoid spatial aliasing.
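The far-field geometry can be made concrete with a short numerical sketch. It assumes two microphones separated by a spacing in metres, a plane wave arriving at a given DOA, and the reading of the anti-aliasing condition as spacing no larger than $v/F_s$. The function names, the 8 kHz sampling rate and the 4 cm spacing are hypothetical; the sound velocity is the value quoted above.

```python
import math


def arrival_delay(spacing_m, doa_deg, v=355.0):
    """Delay in seconds between two microphones for a far-field plane wave."""
    return spacing_m * math.cos(math.radians(doa_deg)) / v


def max_spacing(sample_rate_hz, v=355.0):
    """Largest spacing without spatial aliasing up to the Nyquist frequency."""
    return v / sample_rate_hz


fs = 8000.0                    # sampling rate in Hz (assumed)
spacing = 0.04                 # microphone spacing in metres (assumed)
delay_samples = arrival_delay(spacing, 60.0) * fs
print(delay_samples)           # fractional delay, in samples, for a 60 degree DOA
print(spacing <= max_spacing(fs))
```

A source at 90 degrees (broadside) gives zero relative delay, while a source at 0 degrees (endfire) gives the maximum delay of `spacing / v` seconds.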

The generalized sidelobe canceller (GSC) is an adaptive beamformer that keeps track of the characteristics of the interfering signal, leading to a high interference rejection performance. Initially, the $P$ microphone inputs $x_p(n)$, $p = 1, \dots, P$, go through the FB, which directs the beam towards the expected DOA. The beamformer output $y(n) = \langle \mathbf{h}, \mathbf{x}(n) \rangle$ contains the enhanced signal originating from the pointed direction, which is used as a reference by the ABM. The coefficient vector $\mathbf{h}$ has to fulfill both spatial and temporal constraints, $\mathbf{C}^H \mathbf{h} = \mathbf{c}$, $\mathbf{h} = \mathbf{C}[\mathbf{C}^H \mathbf{C}]^{-1} \mathbf{c}$. The ABM adaptively subtracts the signal of interest, represented by the reference signal $y(n)$, from each channel input $x_p(n)$, and provides the interference signals. The columns of $\mathbf{C}$ must be orthogonal to the columns of the blocking matrix $\mathbf{B}$, $\mathbf{C}^H \mathbf{B} = \mathbf{0}$. The quiescent vector $\mathbf{h}$ is a component independent of the data, and $\mathbf{w} = \mathbf{h} - \mathbf{B}\mathbf{g}$ is a filter that satisfies the linear constraints $\mathbf{C}^H \mathbf{w} = \mathbf{C}^H (\mathbf{h} - \mathbf{B}\mathbf{g}) = \mathbf{C}^H \mathbf{h} = \mathbf{c}$. The upper signal path in Fig. 16 has to be orthogonal to the lower signal path.
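The constraint algebra of the GSC can be checked numerically. The following sketch builds a quiescent vector from a constraint matrix, takes the blocking matrix as an orthonormal basis of the null space of the constraint matrix (one possible choice, assumed here), and verifies that the combined filter satisfies the constraints for an arbitrary adaptive vector. The sizes, the random data and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_taps, n_constraints = 8, 2          # illustrative sizes, not taken from the text

C = rng.standard_normal((n_taps, n_constraints))   # constraint matrix
c = np.array([1.0, 0.0])                            # constraint response vector

# Quiescent vector: h = C (C^H C)^{-1} c
h = C @ np.linalg.solve(C.conj().T @ C, c)

# Blocking matrix: orthonormal basis of the null space of C^H, so C^H B = 0.
# The last n_taps - n_constraints left singular vectors of C span that space.
U, _, _ = np.linalg.svd(C, full_matrices=True)
B = U[:, n_constraints:]

g = rng.standard_normal(n_taps - n_constraints)    # arbitrary adaptive coefficients
w = h - B @ g                                       # overall GSC filter
```

Because the blocking matrix is orthogonal to the constraint matrix, the adaptive part can change freely without violating the constraints, which is what allows unconstrained adaptation in the lower path.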

In order to suppress only those signals that originate from a specific tracking region, the adaptive filter coefficients are constrained within predefined boundaries [3]. These boundaries are specified based on the maximum allowed deviation between the expected DOA and the actual DOA. The interference signals, obtained from the ABM, are passed to the AIC, which adaptively removes the signal components that are correlated to the interference signals from the beamformer output $y(n)$. The norm of the filter coefficients in the AIC is constrained to prevent them from growing excessively large. This minimizes undesirable target signal cancellation, when the target signal leaks into the AIC, further improving the robustness of the system [31].
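A norm constraint of this kind can be sketched as a standard adaptive update followed by a rescaling step. The example below is only an illustration for real-valued signals: the NLMS update rule, the step size `mu`, the threshold `max_sq_norm` and the function name are assumptions, and it is not claimed to be the specific algorithm of [31].

```python
import numpy as np

def norm_constrained_nlms_step(g, u, d, mu=0.5, max_sq_norm=1.0, eps=1e-8):
    """One NLMS update of coefficients g, then rescale so ||g||^2 <= max_sq_norm.

    g : current coefficient vector
    u : interference reference vector (e.g. ABM outputs)
    d : current sample of the fixed beamformer output
    """
    e = d - np.dot(g, u)                          # error = enhanced output sample
    g_new = g + mu * e * u / (np.dot(u, u) + eps)
    sq_norm = np.dot(g_new, g_new)
    if sq_norm > max_sq_norm:                     # apply the norm constraint
        g_new = g_new * np.sqrt(max_sq_norm / sq_norm)
    return g_new, e

rng = np.random.default_rng(1)
g = np.zeros(4)
for _ in range(200):
    u = rng.standard_normal(4)
    d = 5.0 * u[0]            # cancelling this would need a coefficient of 5
    g, e = norm_constrained_nlms_step(g, u, d)
```

In this toy run the unconstrained solution would require a large coefficient, so the rescaling keeps the filter from cancelling the component completely, which mirrors the intended protection against target leakage.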

In noise reduction systems, the beamformer can be used either to reject the noise (interference) by attenuating signals from certain DOAs, or to focus on the desired signal (target) by amplifying signals from the target DOA and attenuating all signals that do not come from the target DOA. For non-real-time forensic audio applications it is possible to select a set of DOAs to be tested. Therefore, adaptive algorithms with directional constraints, such as the RGSC, can be exploited to achieve better noise reduction performance.

4.5 Deconvolution

Both blind signal separation (BSS), also known as blind source separation, and multichannel blind deconvolution (MBD) are inverse problems with similarities and subtle differences between them: in the MBD


only one source is considered, and thus the system is single-input multiple-output (SIMO), while in BSS there are always multiple independent sources and the mixing system is MIMO; the interest of MBD is to deconvolve the source from the AIRs, while the task of BSS is twofold: on the one hand the sources must be separated, on the other hand the sources must be deconvolved from the multiple AIRs, since each sensor collects a combination of every original source convolved by different filters (AIRs) according to (2) [13].

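[Figs. 17 and 18 (block diagrams, not reproduced). Labels: source $s_i(n)$; mixing filters $v_1 \dots v_P$; noise $r(n)$; microphone signals $x_1(n) \dots x_P(n)$; demixing filters $w_1 \dots w_P$; output $y(n)$; delayed reference $d(n) = s_i(n)$ through $z^{-D}$; BSFS block.]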

the blind separation one, must estimate adaptively the inverse of the convolutive system that allows recovering the input signals and suppressing the noise. The goal is to adjust $\mathbf{W}$ so that $\mathbf{W}\mathbf{V} = \mathbf{P}\mathbf{D}$, where $\mathbf{P}$ is a permutation matrix and $\mathbf{D}$ is a diagonal matrix whose $(p, p)$th element is $\alpha_p z^{-\kappa_p}$; $\alpha_p$ is a nonzero scalar weighting, and $\kappa_p$ is an integer delay.

BSS deals with the problem of separating $I$ unknown sources by observing $P$ microphone signals. In the underdetermined case ($P < I$) there are infinitely many vectors $\mathbf{s}(n)$ that satisfy (3). There are mainly two ways


to achieve the minimum norm solution. In the first, the right generalized inverse of $\mathbf{V}$ is estimated and then applied to the set of microphone signals $\mathbf{x}(n)$. Another class of algorithms exploits the sparseness of speech signals to design better inversion strategies and identify the minimum norm solution. Many techniques for convolutive BSS have been developed by extending methods originally designed for blind deconvolution of just one channel. A usual practice is to use the blind source factor separation (BSFS) technique, where one source (factor) is separated from the mixtures, and combine it with a deflationary approach, where the sources are extracted one by one after deflating, i.e. removing, them from the mixed signals. The MIMO FIR filter $\mathbf{W}$ used for BSS becomes a multiple-input single-output (MISO) filter, depicted in Fig. 18. The output $y(n)$ corresponds to (11) and the tap-stacked column vector containing all demixing filter weights defined in (7) is obtained as

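$$\mathbf{u} = \mathbf{R}^{\dagger} \mathbf{p}, \qquad \mathbf{w} = \frac{\mathbf{u}}{\sqrt{\mathbf{u}^H \mathbf{R} \mathbf{u}}}, \tag{71}$$

where $\mathbf{R}^{\dagger}$ is the pseudo-inverse of $\mathbf{R}$, a block matrix whose blocks are the correlation matrices $\mathbf{R}_{pq}$ between the $p$-th channel and the $q$-th channel defined in (36), and $\mathbf{p}$ is a block vector whose blocks are the cross-cumulant vectors $\mathbf{p} = \operatorname{cum}\{\mathbf{x}(n), y(n), \dots, y(n)\}$ [13]. The second step in (71) is just the normalization of the output signal $y(n)$. This becomes apparent when left-multiplying by $\mathbf{x}(n)$.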
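The normalization step can be verified numerically: after dividing by the square root of $\mathbf{u}^H \mathbf{R} \mathbf{u}$, the output power $\mathbf{w}^H \mathbf{R} \mathbf{w}$ equals one. The sketch below assumes that the first step applies the pseudo-inverse of the correlation matrix to the cross-cumulant vector; the random matrix and vector are stand-ins for illustration only (no cumulants are actually estimated), and the function name is hypothetical.

```python
import numpy as np

def bsfs_normalized_weights(R, p):
    """u = pinv(R) p, then w = u / sqrt(u^H R u) so that w^H R w = 1."""
    u = np.linalg.pinv(R) @ p
    power = np.real(u.conj() @ R @ u)     # output power before normalization
    return u / np.sqrt(power)

rng = np.random.default_rng(2)
X = rng.standard_normal((6, 500))         # stand-in for stacked microphone taps
R = X @ X.T / X.shape[1]                  # sample correlation matrix
p = rng.standard_normal(6)                # stand-in for the cross-cumulant vector

w = bsfs_normalized_weights(R, p)
output_power = w @ R @ w                  # power of y(n) = <w, x(n)>
```

In an actual BSFS iteration the cumulant vector would be re-estimated from the current output at every pass until the weights converge.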

The deflationary BSS algorithm for $i = 1, \dots, I$ sources can be summarized as follows: one source is extracted with the BSFS iterative scheme (71) until convergence, and the microphone signals are filtered with the filters estimated by the BSFS method (11); the contribution of the extracted source to the mixtures $x_p$, $p = 1, \dots, P$, is estimated (with the LS criterion), and the contribution of the $i$-th source to the $p$-th mixture is computed by using the estimated filter $\mathbf{b}$, $c(n) = \langle \mathbf{b}, \mathbf{y}(n) \rangle$ with $\mathbf{y}(n) = [y(n)\ y(n-1)\ \dots\ y(n-B+1)]^T$, $B \ll L$; the contribution $c(n)$ is deflated from the $p$-th mixture, $x_p(n) \leftarrow x_p(n) - c(n)$, $p = 1, \dots, P$. This method is very suitable for forensic audio applications where only one source, i.e. speech, should be extracted.
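The deflation step for a single mixture can be sketched as an ordinary least-squares fit. In the example below a short filter of length `B` is estimated so that the filtered extracted source matches the mixture as closely as possible, and the resulting contribution is subtracted. The synthetic signals, the filter `b_true` and the function name `deflate` are assumptions made for the illustration.

```python
import numpy as np

def deflate(x_p, y, B):
    """Estimate a length-B filter b by least squares so that b * y matches x_p,
    then subtract that contribution c(n) = <b, y(n)> from the mixture x_p."""
    N = len(y)
    # Row n holds [y(n), y(n-1), ..., y(n-B+1)], with zeros before the start.
    Y = np.column_stack([np.concatenate([np.zeros(k), y[:N - k]]) for k in range(B)])
    b, *_ = np.linalg.lstsq(Y, x_p, rcond=None)
    c = Y @ b                                  # estimated contribution of the source
    return x_p - c, b

rng = np.random.default_rng(3)
y = rng.standard_normal(2000)                  # extracted source (BSFS output)
other = rng.standard_normal(2000)              # remaining, independent source
b_true = np.array([0.8, -0.3, 0.1])            # hypothetical short mixing filter
x_p = np.convolve(y, b_true)[:2000] + other    # p-th mixture

x_deflated, b_hat = deflate(x_p, y, B=3)
```

After deflation the residual is orthogonal to the delayed copies of the extracted source used in the fit, so a subsequent extraction pass no longer finds that source in the mixture.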

It is possible to consider the deflationary BSFS (DBSFS) structure as a GSC. The ABM corresponds exactly to the deflating filters of the deflationary approach. By comparing the different parts, i.e. the BSFS block and the fixed beamformer, it is concluded that it may be possible to construct algorithms similar to those of the GSC.

5 Conclusions

This chapter is an advanced tutorial about multichannel adaptive filtering for forensic audio. Different techniques have been examined in a common foun-


of channels increases.

The spectral equalization (power subtraction) can, in general, achieve more noise reduction than an ANC or a beamformer method. However, it relies on an estimate of the noise spectrum instead of the unknown noise spectrum at each time instant, which produces a distortion known as musical noise (because of the way it sounds). The performance of ANC depends on the coherence between the input noisy signal and the reference noise signal. The results are spectacular only if the coherence is very high, which limits its application to particular cases. The amount of noise that can be canceled by a beamformer relies on the number of microphones in the array and on the SNR of the input signal. More microphones can lead to more noise reduction. However, the effectiveness of a beamformer in suppressing directional noise depends on the angular separation between the signal and the noise source [3]. The ALP method is very simple because only second-order statistics are required, but the estimation is only optimal if the residue is i.i.d. Gaussian [27].

All these techniques are closely connected. The linear prediction of $x(n)$ is nothing but the deconvolution of $x(n)$ [27]. In [28], the problem of Wiener system blind inversion using source separation methods is addressed. This approach can also be used for blind linear deconvolution. In [13] the link between the deflationary approach (the extension of the single-channel blind deconvolution algorithm) and the traditional GSC structure is shown. Several combinations of different approaches are also possible, e.g. in [29] a Wiener filter that uses linear prediction to estimate the signal spectrum is presented.

The best filter to enhance a particular recording will be chosen based on experience and experimentation [20]. Nevertheless, the algorithm developer would find it useful to have a quality measure that helps to compare, in general terms, the performance of different implementations of a certain algorithm [30]. One substantial ingredient of this performance is the intelligibility attained after processing the recording or, even better, the increase in intelligibility compared to the unprocessed sample. Therefore, one possible way to measure the performance of an enhancement algorithm, and probably the best, would be to use a panel of listeners and one of the subjective tests introduced in Sect. 1.3. To attain significant results, different speech recordings with different types and degrees of noise and distortion should be used as inputs to the algorithm, and therefore the task would probably become unapproachable in terms of time and effort, setting aside the fact that the experiment would hardly be repeatable.

For these reasons, an objective intelligibility measure would be of great help to compare, or even adjust, voice enhancement algorithms. The STI measure, introduced in Sect. 1.3, is probably the most appropriate one, because it can deal with more types of degradations than other objective intelligibility measures. As previously mentioned, to compute the STI of a transmission system, which in our case comprises the acoustic environment where the recording was made, followed by the enhancement system, either the im-


signal is needed. As adaptive filters are time variant, they cannot be identified by an impulse response, thus the use of the probe signal is mandatory.

In order to properly monitor the performance of the algorithms, different types and degrees of degradation should be imposed on the test signal. The model used to deal with degradations can be as simple as additive noise, for a mono version of the test signal corrupted by random noise or by a second talker's speech, or as complex as a virtual room simulator for early reflections and a stochastic reverberation generator, for a detailed acoustic model of the recording room, where several noise sources can be placed in different positions. Measured impulse responses of a real chamber are another option to obtain very realistic mono or multichannel virtual recordings.
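The simplest degradation model mentioned above, additive noise at a controlled level, can be sketched as follows. The interferer is scaled so that the signal-to-noise ratio of the mixture equals a chosen value in dB; the sine wave used as test signal, the sampling rate and the function name `add_noise_at_snr` are assumptions for the example.

```python
import numpy as np

def add_noise_at_snr(clean, noise, snr_db):
    """Scale `noise` so that the clean-to-noise power ratio equals snr_db, then add."""
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    gain = np.sqrt(clean_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    return clean + gain * noise, gain

rng = np.random.default_rng(4)
fs = 8000                                     # assumed sampling rate in Hz
t = np.arange(fs) / fs
clean = np.sin(2 * np.pi * 440.0 * t)         # stand-in for the test signal
noise = rng.standard_normal(fs)               # random-noise interferer

noisy, gain = add_noise_at_snr(clean, noise, snr_db=5.0)
measured_snr_db = 10.0 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
```

Sweeping `snr_db` over a range of values gives a family of test recordings with graded degradation; a second talker can be used in place of the random noise in the same way.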

Although STI seems useful for this kind of analysis, providing a practical systematic tool to measure the degree of enhancement attained, it should be kept in mind that it was not designed to test adaptive filters, and its behaviour during the adaptation periods of the algorithms may be misleading [15]. Hence, the obtained results should be carefully contrasted with subjective listening tests. An example of the use of SRT to evaluate the performance of beamforming techniques can be found in [4].

References

1. Armelloni E, Giottoli C, Farina A (2003) Implementation of Real-time Partitioned Convolution on a DSP Board, 2003 IEEE Workshop on the Applications of Signal Processing to Audio and Acoustics, page(s):71-74.
2. Barnett P.W, Knight R.D (1995) The Common Intelligibility Scale, Proc. I.O.A. 17(7):199-204.
3. Benesty J, Huang Y (2003) Adaptive Signal Processing (Applications to Real-World Problems). Springer, Berlin Heidelberg New York Hong Kong London Milan Paris Tokyo
4. Beracoechea J (2007) Codificación de Audio Multicanal para entornos de tipo Ventana Acústica, Ph.D. Thesis, Universidad Politécnica de Madrid
5. Betts D, French A, Hicks C, Reid G (2005) The Role of Adaptive Filtering in Audio Surveillance. AES 26th International Conference.
6. Boray G.K, Srinath M.D (1992) Conjugate Gradient Techniques for Adaptive Filtering. IEEE Transactions on Circuits and Systems 39(1):1-10.
7. Chau E.Y-H (2001) Adaptive Noise Reduction Using A Cascaded Hybrid Neural Network. MS Thesis, University of Guelph, Ontario
8. Crochiere R.E, Rabiner L.R (1983) Multirate Digital Signal Processing. Prentice-Hall, London Sydney Toronto Mexico New Delhi Tokyo Singapore Rio de Janeiro
9. French N.R, Steinberg J.C (1947) Factors governing the intelligibility of speech sounds, The Journal of the Acoustical Society of America 19:90-119.
10. Friedlander B (1982) Lattice Filters for Adaptive Processing. Proceedings of the IEEE 70(8):829-867.


Politecnica de Madrid, Spain

12. Gay S.L, Benesty J (2000) Acoustic Signal Processing for Telecommunication. Kluwer Academic Publishers, Boston Dordrecht London
13. Gkalelis N (2004) Undetermined Blind Source Separation for Speech Signals. MS Thesis, Friedrich-Alexander-Universität Erlangen-Nürnberg, Germany
14. Glentis G.-O, Berberidis K, Theodoridis S (1999) Efficient least squares adaptive algorithms for FIR transversal filtering: A unified view, IEEE Signal Processing Magazine 16(4):13-41
15. Goldsworthy R.L, Greenberg J.E (2004) Analysis of speech-based speech transmission index methods with implications for nonlinear operations, The Journal of the Acoustical Society of America 116(6):3679-3689.
16. Haykin S (2002) Adaptive Filter Theory. Prentice-Hall, Inc., New Jersey
17. Honig M.L, Messerschmitt D.G (1984) Adaptive Filters: Structures, Algorithms and Applications. Kluwer Academic Publishers, Boston The Hague London Lancaster
18. IEC Standard 60268-16 (2003)
19. ISO standard 9921 (2003)
20. Koenig B.E, Lacey D.S, Killion S.A (2007) Forensic enhancement of digital audio recordings, Journal of the Audio Engineering Society, 55(5):352-371.
21. Mayyas K (2002) Stereophonic Acoustic Echo Cancellation Using Lattice Orthogonalization, IEEE Transactions on Speech and Audio Processing, 10(7):517-525.
22. Morgan D.R, Thi J.C (1995) A Delayless Subband Adaptive Filter Architecture, IEEE Transactions on Signal Processing, 43(8):1819-1830.
23. Paez Borrallo J.M, Otero M.G (1992) On The Implementation of a Partitioned Block Frequency Domain Adaptive Filter (PBFDAF) For Long Acoustic Echo Cancellation, Signal Processing, 27(3):301-315.
24. Plomp R, Mimpen A.M (1979) Improving the reliability of testing the speech reception threshold for sentences, Audiology 18:43-52.
25. Reilly J.P, Wilbur M, Seibert M, Ahmadvand N (2002) The Complex Subband Decomposition and its Application to the Decimation of Large Adaptive Filtering Problems, IEEE Transactions on Signal Processing, 50(11):2730-2743.
26. Shynk J.J (1992) Frequency-domain and Multirate Adaptive Filtering, IEEE Signal Processing Magazine, 9(1):14-37.
27. Solé-Casals J, Jutten C, Taleb A (2000) Source Separation Techniques Applied to Linear Prediction, 2nd International Workshop on Independent Component Analysis and Blind Source Separation Proceedings ICA2000, page(s):193-198.
28. Taleb A, Solé-Casals J, Jutten C (1999) Blind Inversion of Wiener Systems, IWANN 99, page(s):655-664.
29. Vaseghi S.V (1996) Advanced Signal Processing and Digital Noise Reduction. John Wiley & Sons Ltd. and B.G. Teubner, Chichester New York Brisbane Toronto Singapore Stuttgart Leipzig
30. Hu Y, Loizou P.C (2007) A comparative intelligibility study of single-microphone noise reduction algorithms, The Journal of the Acoustical Society of America, 122(3):1777-1786.
31. Yoon B-Y, Tashev I, Acero A (2007) Robust Adaptive Beamforming Algorithm Using Instantaneous Direction Of Arrival With Enhanced Noise Suppression Capability. IEEE International Conference on Acoustics, Speech and Signal Processing 1:I-133 to I-136.

