Short Time Fourier Transform

2012-11-28 Dan Ellis 1
ELEN E4810: Digital Signal Processing

Topic 10:
The Fast Fourier Transform
1. Calculation of the DFT
2. The Fast Fourier Transform algorithm
3. Short-Time Fourier Transform

1. Calculation of the DFT
Filter design so far has been oriented to time-domain processing - cheaper!
But: frequency-domain processing makes some problems very simple:
  use all of x[n], or use short-time windows
Need an efficient way to calculate DFT
[Block diagram: x[n] -> DFT -> X[k] -> Fourier domain processing -> Y[k] -> IDFT -> y[n]]
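
The DFT
Recall the DFT:
$$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{kn}, \qquad W_N = e^{-j\frac{2\pi}{N}}$$
  discrete transform of discrete sequence
  $W_N^r$ has only N distinct values (points on the unit circle spaced by $2\pi/N$)
Matrix form:
$$\begin{bmatrix} X[0] \\ X[1] \\ X[2] \\ \vdots \\ X[N-1] \end{bmatrix}
=
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
1 & W_N^{1} & W_N^{2} & \cdots & W_N^{(N-1)} \\
1 & W_N^{2} & W_N^{4} & \cdots & W_N^{2(N-1)} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & W_N^{(N-1)} & W_N^{2(N-1)} & \cdots & W_N^{(N-1)^2}
\end{bmatrix}
\begin{bmatrix} x[0] \\ x[1] \\ x[2] \\ \vdots \\ x[N-1] \end{bmatrix}$$
Structure ⇒ opportunities for efficiency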

Computational Complexity
$$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{kn}$$
N complex multiplies + N-1 complex adds per point (k)
  × N points (k = 0 .. N-1)
cpx mult: (a+jb)(c+jd) = ac - bd + j(ad + bc) = 4 real mults + 2 real adds
cpx add = 2 real adds
N points: $4N^2$ real mults, $4N^2 - 2N$ real adds
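
The operation count above can be checked with a small sketch of the direct evaluation. This is plain Python (an assumption of this note, not something the slides use); `dft_direct` and its counters are illustrative names, and the counters follow the slide's accounting of 4 real mults + 2 real adds per complex multiply and 2 real adds per complex add.

```python
import cmath
import math

def dft_direct(x):
    """Direct DFT, X[k] = sum_n x[n] W_N^{kn}, with W_N = exp(-j*2*pi/N).

    Returns the spectrum plus the real multiply/add counts implied by
    the complex operations actually performed.
    """
    N = len(x)
    # W_N^r takes only N distinct values, so tabulate them once.
    W = [cmath.exp(-2j * math.pi * r / N) for r in range(N)]
    X = []
    cpx_mults = 0
    cpx_adds = 0
    for k in range(N):
        acc = x[0] * W[0]
        cpx_mults += 1
        for n in range(1, N):
            acc += x[n] * W[(k * n) % N]
            cpx_mults += 1
            cpx_adds += 1
        X.append(acc)
    real_mults = 4 * cpx_mults                 # 4 real mults per complex mult
    real_adds = 2 * cpx_mults + 2 * cpx_adds   # 2 per complex mult + 2 per complex add
    return X, real_mults, real_adds

X, real_mults, real_adds = dft_direct([1.0, 2.0, 3.0, 4.0])
```

For N = 4 the counters come out as 64 real mults and 56 real adds, i.e. $4N^2$ and $4N^2 - 2N$.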
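
Goertzel's Algorithm
Now:
$$X[k] = \sum_{\ell=0}^{N-1} x[\ell]\, W_N^{k\ell} = W_N^{-kN} \sum_{\ell=0}^{N-1} x[\ell]\, W_N^{k\ell} = \sum_{\ell=0}^{N-1} x[\ell]\, W_N^{-k(N-\ell)}$$
  (using $W_N^{-kN} = 1$) looks like a convolution
i.e.
$$X[k] = y_k[N]$$
where
$$y_k[n] = x_e[n] * h_k[n]$$
$$x_e[n] = \begin{cases} x[n] & 0 \le n < N \\ 0 & n = N \end{cases}
\qquad
h_k[n] = \begin{cases} W_N^{-kn} & n \ge 0 \\ 0 & n < 0 \end{cases}$$
[Flowgraph: x_e[n] enters a summing node whose output y_k[n] is fed back through z^{-1} and a gain of W_N^{-k}; x_e[N] = 0, y_k[-1] = 0, y_k[N] = X[k]]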

Goertzel's Algorithm
Separate "filters" for each X[k]
  can calculate for just a few values of k
No large buffer, no coefficient table
Same complexity for full X[k] ($4N^2$ mults, $4N^2 - 2N$ adds)
  but: can halve multiplies by making the denominator real:
$$H(z) = \frac{1}{1 - W_N^{-k} z^{-1}} = \frac{1 - W_N^{k} z^{-1}}{1 - 2\cos\frac{2\pi k}{N}\, z^{-1} + z^{-2}}$$
  numerator: evaluate only for last step
  denominator: 2 real mults per step
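
The real-denominator form can be sketched as a second-order recursion with one real coefficient, with the complex numerator applied once at the end. This is a sketch assuming $W_N = e^{-j2\pi/N}$ as in the DFT definition; `goertzel` and `dft_bin_direct` are illustrative names, not library functions.

```python
import cmath
import math

def dft_bin_direct(x, k):
    """Reference: one DFT bin straight from the definition."""
    N = len(x)
    return sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))

def goertzel(x, k):
    """X[k] via H(z) = (1 - W_N^k z^-1) / (1 - 2cos(2*pi*k/N) z^-1 + z^-2)."""
    N = len(x)
    c = 2.0 * math.cos(2.0 * math.pi * k / N)  # the only coefficient used per step
    s1 = 0.0                                   # s[n-1], zero initial state
    s2 = 0.0                                   # s[n-2]
    for sample in list(x) + [0.0]:             # x_e[n]: x[0..N-1], then x_e[N] = 0
        s0 = sample + c * s1 - s2              # recursion from the real denominator
        s2, s1 = s1, s0
    # Here s1 = s[N] and s2 = s[N-1]; the numerator is evaluated only once.
    return s1 - cmath.exp(-2j * math.pi * k / N) * s2

x = [1.0, 2.0, 3.0, 4.0, -1.0, 0.5, 2.5, -3.0]
errors = [abs(goertzel(x, k) - dft_bin_direct(x, k)) for k in range(len(x))]
```

Calling `goertzel` for only a few values of k avoids computing the rest of the spectrum, which is the case the slide highlights.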

2. Fast Fourier Transform FFT
Reduce complexity of DFT from $O(N^2)$ to $O(N \log N)$
  grows more slowly with larger N
Works by decomposing large DFT into several stages of smaller DFTs
Often provided as a highly optimized library

Decimation in Time (DIT) FFT
Can rearrange DFT formula in 2 halves:
$$X[k] = \sum_{n=0}^{N-1} x[n]\, W_N^{nk}, \qquad k = 0, \ldots, N-1$$
Arrange terms in pairs...
$$= \sum_{m=0}^{\frac{N}{2}-1} \left( x[2m]\, W_N^{2mk} + x[2m+1]\, W_N^{(2m+1)k} \right)$$
Group terms from each pair
$$= \sum_{m=0}^{\frac{N}{2}-1} x[2m]\, W_{N/2}^{mk} + W_N^{k} \sum_{m=0}^{\frac{N}{2}-1} x[2m+1]\, W_{N/2}^{mk}$$
$$= X_0[\langle k \rangle_{N/2}] + W_N^{k}\, X_1[\langle k \rangle_{N/2}]$$
  $X_0$: N/2 pt DFT of x for even n
  $X_1$: N/2 pt DFT of x for odd n

Decimation in Time (DIT) FFT
We can evaluate an N-pt DFT as two N/2-pt DFTs (plus a few mults/adds):
$$\mathrm{DFT}_N\{x[n]\} = \mathrm{DFT}_{N/2}\{x_0[n]\} + W_N^{k}\, \mathrm{DFT}_{N/2}\{x_1[n]\}$$
  $x_0[n]$: x[n] for even n; $x_1[n]$: x[n] for odd n
But if $\mathrm{DFT}_N\{\cdot\} \sim O(N^2)$ then $\mathrm{DFT}_{N/2}\{\cdot\} \sim O((N/2)^2) = \frac{1}{4} O(N^2)$
⇒ Total computation ~ 2 × 1/4 $O(N^2)$ = 1/2 the computation (+ε) of direct DFT
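
A minimal sketch of this one-stage split, checked against the direct DFT. It is plain Python with illustrative names (`dft`, `dft_one_stage_dit`); the two half-length DFTs are still computed directly, so this shows only the decomposition, not a full FFT.

```python
import cmath
import math

def dft(x):
    """Direct DFT from the definition (reference)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def dft_one_stage_dit(x):
    """N-pt DFT as two N/2-pt DFTs plus twiddle factors; len(x) must be even."""
    N = len(x)
    half = N // 2
    X0 = dft(x[0::2])   # N/2-pt DFT of x[n] for even n
    X1 = dft(x[1::2])   # N/2-pt DFT of x[n] for odd n
    # X[k] = X0[<k>_{N/2}] + W_N^k X1[<k>_{N/2}] for k = 0..N-1
    return [X0[k % half] + cmath.exp(-2j * math.pi * k / N) * X1[k % half]
            for k in range(N)]

x = [1.0, -2.0, 3.5, 0.0, 2.0, 1.0, -1.0, 4.0]
max_err = max(abs(a - b) for a, b in zip(dft(x), dft_one_stage_dit(x)))
```

The modulo on the index is what lets each half-length DFT output be reused for both X[k] and X[k + N/2].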

One-Stage DIT Flowgraph
$$X[k] = X_0[\langle k \rangle_{N/2}] + W_N^{k}\, X_1[\langle k \rangle_{N/2}]$$
[Flowgraph, N = 8: even points from x[n] (x[0], x[2], x[4], x[6]) feed one DFT_{N/2} giving X_0[0..3]; odd points from x[n] (x[1], x[3], x[5], x[7]) feed a second DFT_{N/2} giving X_1[0..3]; each output X[0] .. X[7] adds an X_0 term to an X_1 term scaled by W_N^0 .. W_N^7]
Classic FFT structure
X[4..7]: same as X[0..3] except for factors on X_1[·] terms
"twiddle factors": always apply to odd-terms output
  NOT mirror-image

Multiple DIT Stages
If decomposing one $\mathrm{DFT}_N$ into two smaller $\mathrm{DFT}_{N/2}$s speeds things up ...
Why not further divide into $\mathrm{DFT}_{N/4}$s ?
i.e.
$$X[k] = X_0[\langle k \rangle_{N/2}] + W_N^{k}\, X_1[\langle k \rangle_{N/2}], \qquad 0 \le k < N$$
make:
$$X_0[k] = X_{00}[\langle k \rangle_{N/4}] + W_{N/2}^{k}\, X_{01}[\langle k \rangle_{N/4}], \qquad 0 \le k < N/2$$
  $X_{00}$: N/4-pt DFT of even points in even subset of x[n]
  $X_{01}$: N/4-pt DFT of odd points from even subset
Similarly,
$$X_1[k] = X_{10}[\langle k \rangle_{N/4}] + W_{N/2}^{k}\, X_{11}[\langle k \rangle_{N/4}]$$

Two-Stage DIT Flowgraph
[Flowgraph, N = 8: inputs in the order x[0], x[4], x[2], x[6], x[1], x[5], x[3], x[7] feed four DFT_{N/4} blocks (X_00, X_01, X_10, X_11); a combining stage with twiddle factors W_{N/2}^0 .. W_{N/2}^3 forms X_0[0..3] and X_1[0..3] (different from before); the final stage with W_N^0 .. W_N^7 forms X[0] .. X[7] (same as before)]

Multi-stage DIT FFT
Can keep doing this until we get down to 2-pt DFTs:
  $\mathrm{DFT}_2$: X[0] = x[0] + x[1], X[1] = x[0] - x[1]   ($1 = W_2^0$, $-1 = W_2^1$)
  "butterfly" element
⇒ $N = 2^M$-pt DFT reduces to M stages of twiddle factors & summation ($O(N^2)$ part vanishes)
⇒ real mults < M·4N , real adds < 2M·2N
⇒ complexity ~ $O(N \cdot M) = O(N \log_2 N)$

FFT Implementation Details
Basic butterfly (at any stage):
$$X_X[r] = X_{X0}[r] + W_N^{r}\, X_{X1}[r], \qquad X_X[r + N/2] = X_{X0}[r] + W_N^{r+N/2}\, X_{X1}[r]$$
  2 cpx mults
Can simplify:
$$W_N^{r+\frac{N}{2}} = e^{-j\frac{2\pi (r + N/2)}{N}} = e^{-j\frac{2\pi r}{N}}\, e^{-j\frac{2\pi N/2}{N}} = -W_N^{r}$$
$$X_X[r] = X_{X0}[r] + W_N^{r}\, X_{X1}[r], \qquad X_X[r + N/2] = X_{X0}[r] - W_N^{r}\, X_{X1}[r]$$
  just one cpx mult!
  i.e. SUB rather than ADD
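
A minimal recursive sketch of the radix-2 DIT FFT using the simplified butterfly (one complex multiply, then an add and a subtract). It is plain Python with an illustrative name, `fft_dit`; the recursion is taken down to 1-pt DFTs rather than stopping at 2-pt DFTs, which gives the same result.

```python
import cmath
import math

def fft_dit(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of 2."""
    N = len(x)
    if N < 1 or N & (N - 1):
        raise ValueError("length must be a power of 2")
    if N == 1:
        return [complex(x[0])]          # a 1-pt DFT is the sample itself
    X0 = fft_dit(x[0::2])               # N/2-pt DFT of even-indexed samples
    X1 = fft_dit(x[1::2])               # N/2-pt DFT of odd-indexed samples
    X = [0j] * N
    for r in range(N // 2):
        t = cmath.exp(-2j * math.pi * r / N) * X1[r]   # the one complex multiply
        X[r] = X0[r] + t
        X[r + N // 2] = X0[r] - t       # W_N^(r+N/2) = -W_N^r: SUB rather than ADD
    return X

X = fft_dit([1.0, 2.0, 3.0, 4.0])
```

For the input above the result is X = [10, -2+2j, -2, -2-2j] up to rounding.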

8-pt DIT FFT Flowgraph
[Flowgraph: three butterfly stages; inputs in bit-reversed order x[0], x[4], x[2], x[6], x[1], x[5], x[3], x[7] (index bits 000, 100, 010, 110, 001, 101, 011, 111), outputs X[0] .. X[7] in natural order; remaining twiddle factors W_4, W_8, W_8^2, W_8^3]
-1's absorbed into summation nodes
$W_N^0$ disappears
"in-place" algorithm: sequential stages
bit-reversed indexing
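
The in-place structure with bit-reversed input ordering can be sketched iteratively as follows. This is plain Python with illustrative names (`bit_reverse`, `fft_in_place`); it works on a reordered copy of the input and then overwrites that one buffer stage by stage.

```python
import cmath
import math

def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def fft_in_place(x):
    """Iterative radix-2 DIT FFT: bit-reversed load, then M butterfly stages."""
    N = len(x)
    M = N.bit_length() - 1
    if N < 1 or (1 << M) != N:
        raise ValueError("length must be a power of 2")
    a = [complex(x[bit_reverse(i, M)]) for i in range(N)]   # bit-reversed indexing
    size = 2
    while size <= N:                      # stages of 2-pt, 4-pt, ..., N-pt DFTs
        half = size // 2
        for start in range(0, N, size):
            for r in range(half):
                t = cmath.exp(-2j * math.pi * r / size) * a[start + r + half]
                u = a[start + r]
                a[start + r] = u + t          # both outputs overwrite their
                a[start + r + half] = u - t   # own inputs: no second buffer
        size *= 2
    return a

order = [bit_reverse(i, 3) for i in range(8)]
X = fft_in_place([1.0, 2.0, 3.0, 4.0])
```

`order` reproduces the input ordering of the 8-pt flowgraph: 0, 4, 2, 6, 1, 5, 3, 7.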

FFT for Other Values of N
Having $N = 2^M$ meant we could divide each stage into 2 halves = "radix-2 FFT"
Same approach works for:
  $N = 3^M$ radix-3
  $N = 4^M$ radix-4 - more optimized radix-2
  etc...
Composite N = a·b·c·d → mixed radix (different N/r point FFTs at each stage)
.. or just zero-pad to make $N = 2^M$
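
Inverse FFT
Recall IDFT:
$$x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, W_N^{-nk}$$
  only differences from forward DFT: the 1/N factor and the sign of the exponent
Thus:
$$N x^{*}[n] = \sum_{k=0}^{N-1} \left( X[k]\, W_N^{-nk} \right)^{*} = \sum_{k=0}^{N-1} X^{*}[k]\, W_N^{nk}$$
  Forward DFT of $x'[n] = X^{*}[k]|_{k=n}$, i.e. time sequence made from spectrum
Hence, use FFT to calculate IFFT:
$$x[n] = \frac{1}{N} \left[ \sum_{k=0}^{N-1} X^{*}[k]\, W_N^{nk} \right]^{*}$$
[Flowgraph: Re{X[k]} and Im{X[k]} (scaled by -1) enter the Re and Im inputs of a forward DFT; its Re output scaled by 1/N gives Re{x[n]} and its Im output scaled by -1/N gives Im{x[n]}: pure real flowgraph]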
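
A short sketch of the conjugation trick: conjugate the spectrum, run a forward transform, conjugate again and scale by 1/N. Plain Python; a direct `dft` stands in for the FFT here (an assumption made to keep the block self-contained), and `idft_via_dft` is an illustrative name.

```python
import cmath
import math

def dft(x):
    """Forward DFT from the definition (stand-in for an FFT routine)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft_via_dft(X):
    """x[n] = (1/N) * conj( DFT{ conj(X[k]) } ), using only the forward transform."""
    N = len(X)
    return [v.conjugate() / N for v in dft([c.conjugate() for c in X])]

x = [1 + 0j, 2 - 1j, 0.5j, -3 + 0j, 0.25 + 0.75j]
x_rec = idft_via_dft(dft(x))
max_err = max(abs(a - b) for a, b in zip(x, x_rec))
```

The round trip recovers x to rounding error, so any forward FFT routine can serve as an inverse with two conjugations and one scaling.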

DFT of Real Sequences
If x[n] is pure-real, DFT wastes mults
Real x[n] ↔ Conj. symm. $X[k] = X^{*}[-k]$
Given two real sequences, x[n] and w[n]
  call y[n] = j·w[n] , v[n] = x[n] + y[n]
N-pt DFT V[k] = X[k] + Y[k]
  but: $V[k] + V^{*}[-k] = X[k] + X^{*}[-k] + Y[k] + Y^{*}[-k]$, where $X^{*}[-k] = X[k]$ and $Y^{*}[-k] = -Y[k]$
⇒ $X[k] = \frac{1}{2}\left(V[k] + V^{*}[-k]\right)$ , $W[k] = \frac{-j}{2}\left(V[k] - V^{*}[-k]\right)$
i.e. compute DFTs of two N-pt real sequences with a single N-pt DFT
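
A sketch of the two-for-one computation, checked against separate DFTs of each sequence. Plain Python with illustrative names; the index -k is taken modulo N, which is the usual reading of X[-k] for an N-pt DFT.

```python
import cmath
import math

def dft(x):
    """Direct DFT from the definition."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def two_real_dfts(x, w):
    """DFTs of two real N-pt sequences from a single complex N-pt DFT."""
    N = len(x)
    V = dft([complex(a, b) for a, b in zip(x, w)])      # v[n] = x[n] + j*w[n]
    X = [0.5 * (V[k] + V[(-k) % N].conjugate()) for k in range(N)]
    W = [-0.5j * (V[k] - V[(-k) % N].conjugate()) for k in range(N)]
    return X, W

x = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5]
w = [0.0, -2.0, 1.5, 4.0, 1.0, -0.5]
X, W = two_real_dfts(x, w)
err_x = max(abs(a - b) for a, b in zip(X, dft(x)))
err_w = max(abs(a - b) for a, b in zip(W, dft(w)))
```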

3. Short-Time Fourier Transform (STFT)
Fourier Transform (e.g. DTFT) gives spectrum of an entire sequence:
How to see a time-varying spectrum?
e.g. slow AM of a sinusoid carrier:
$$x[n] = \left( 1 - \cos\frac{2\pi n}{N} \right) \cos\omega_0 n$$
[Figure: x[n] plotted for n = 0 .. 1000, amplitude between -2 and 2]

Fourier Transform of AM Sine
Spectrum of whole sequence indicates modulation indirectly...
... as cancellation between closely-tuned sines
  $2\cos A \cos B = \cos(A+B) + \cos(A-B)$
[Figure: |X[k]| against ω/π (k/(N/2)), and three closely-tuned sinusoids at bins k-1, k and k+1 plotted over n = 0 .. 896, whose sum gives the modulated waveform]

Fourier Transform of AM Sine
Sometimes we'd rather separate modulation and carrier:
  $x[n] = A[n] \cos\omega_0 n$
  A[n] varies on a different (slower) timescale
One approach:
  chop x[n] into short sub-sequences ..
  .. where slow modulator is ~ constant
  DFT spectrum of pieces → show variation
[Figure: spectral peak at $\omega_0$ on the $\omega$ axis, with height A[n]]

FT of Short Segments
Break up x[n] into successive, shorter chunks of length $N_{FT}$, then DFT each:
[Figure: x[n] for n = 0 .. 1024 = N split into chunks x_0[n] .. x_7[n], with DFT magnitudes X_0[k] .. X_7[k] plotted against k = 0 .. 64; the peak sits at $k_0 = \omega_0 N_{FT} / 2\pi$, with $N_{FT} = N/8$]
Shows amplitude modulation of $\omega_0$ energy

The Spectrogram
Plot successive DFTs in time-frequency:
[Figure: |X_i[k]| for frames X_0[k] .. X_7[k] placed side by side to form the image |X[k,n]|, frequency index k against time n]
This image is called the Spectrogram
  time hopsize (between successive frames) = 128 points

Short-Time Fourier Transform
Spectrogram = STFT magnitude plotted on time-frequency plane
STFT is (DFT form):
$$X[k, n_0] = \sum_{n=0}^{N_{FT}-1} x[n_0 + n] \cdot w[n] \cdot e^{-j\frac{2\pi k n}{N_{FT}}}$$
  k: frequency index; $n_0$: time index
  $x[n_0 + n]$: $N_{FT}$ points of x starting at $n_0$
  w[n]: window
  exponential term: DFT kernel
intensity as a function of time & frequency
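
The DFT form of the STFT can be sketched directly from the sum above. This is plain Python with illustrative names (`stft`, `hop`); the example uses a short AM signal with N = 256, N_FT = 32 and a carrier on bin k0 = 8, values assumed here only to keep a pure-Python run quick, not the 1024-point example from the slides.

```python
import cmath
import math

def stft(x, window, hop):
    """X[k, n0] for n0 = 0, hop, 2*hop, ...; each frame is a len(window)-pt DFT."""
    n_ft = len(window)
    frames = []
    for n0 in range(0, len(x) - n_ft + 1, hop):
        seg = [x[n0 + n] * window[n] for n in range(n_ft)]   # x[n0+n] * w[n]
        frames.append([sum(seg[n] * cmath.exp(-2j * math.pi * k * n / n_ft)
                           for n in range(n_ft))
                       for k in range(n_ft)])
    return frames

# Slow AM of a sinusoid carrier: x[n] = (1 - cos(2*pi*n/N)) * cos(w0*n)
N = 256
n_ft = 32
k0 = 8
w0 = 2 * math.pi * k0 / n_ft
x = [(1 - math.cos(2 * math.pi * n / N)) * math.cos(w0 * n) for n in range(N)]

frames = stft(x, [1.0] * n_ft, hop=n_ft)        # rectangular window, no overlap
peak = [abs(frame[k0]) for frame in frames]     # carrier-bin magnitude per frame
```

`peak` rises over the first half of the signal and falls over the second, tracing the slow modulator that a single whole-sequence DFT shows only indirectly.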

STFT Window Shape
w[n] provides "time localization" of STFT
  e.g. rectangular selects x[n], $n_0 \le n < n_0 + N_W$
But: resulting spectrum has same problems as windowing for FIR design:
DTFT form of STFT:
$$X(e^{j\omega}, n_0) = \mathrm{DTFT}\{ x[n_0 + n]\, w[n] \} = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{j\theta n_0}\, X(e^{j\theta})\, W(e^{j(\omega - \theta)})\, d\theta$$
  spectrum of short-time window is convolved with (twisted) parent spectrum

STFT Window Shape
e.g. if x[n] is a pure sinusoid,
[Figure: $X(e^{j\omega})$ convolved with $W(e^{j\omega})$]
  → blurring (mainlobe) + ghosting (sidelobes)
Hence, use tapered window for w[n]
  e.g. Hamming $w[n] = 0.54 + 0.46\cos\left(2\pi \frac{n}{2M+1}\right)$
  sidelobes < -40 dB
[Figure: w[n] against n, and $W(e^{j\omega})$ against ω]
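
The sidelobe claim can be checked numerically with a rough sketch that samples the window spectrum on a dense grid and reads off the highest lobe outside the mainlobe. Plain Python; M = 32, the grid size and the function names are assumptions of this sketch, and the sidelobe search (walk down the mainlobe to the first local minimum, then take the maximum beyond it) is a simple heuristic rather than a standard routine.

```python
import math

def window_spectrum_db(w, M, grid=2048):
    """|W(e^jw)| in dB relative to DC, for a window w[n], n = -M..M, symmetric about 0."""
    mags = []
    for i in range(grid + 1):
        omega = math.pi * i / grid
        # Symmetric window: the DTFT is real, W = sum_n w[n] cos(omega * n).
        W = sum(w[n + M] * math.cos(omega * n) for n in range(-M, M + 1))
        mags.append(abs(W))
    return [20.0 * math.log10(max(m, 1e-12) / mags[0]) for m in mags]

def peak_sidelobe_db(spec_db):
    """Highest level beyond the first local minimum (the edge of the mainlobe)."""
    i = 0
    while i + 1 < len(spec_db) and spec_db[i + 1] < spec_db[i]:
        i += 1
    return max(spec_db[i:])

M = 32
rect = [1.0] * (2 * M + 1)
hamming = [0.54 + 0.46 * math.cos(2 * math.pi * n / (2 * M + 1))
           for n in range(-M, M + 1)]

rect_sl = peak_sidelobe_db(window_spectrum_db(rect, M))
hamm_sl = peak_sidelobe_db(window_spectrum_db(hamming, M))
```

With these settings the rectangular window's highest sidelobe is roughly 13 dB below its peak, while the Hamming window's stays below -40 dB, at the cost of a mainlobe about twice as wide.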

STFT Window Length
Length of w[n] sets temporal resolution
Window length ∝ 1/(Mainlobe width)
  more time detail ↔ less frequency detail
[Figure: x[n] under a long window $w_L[n]$ and a short window $w_S[n]$, with the window spectra $W_L(e^{j\omega})$ and $W_S(e^{j\omega})$; windows of $N_1$ pts and $N_2$ pts have spectral zeros at $4\pi/N_1$ and $4\pi/N_2$]
  short window measures only local properties
  longer window averages spectral character
  shorter window → more blurred spectrum

STFT Window Length
Can illustrate time-frequency tradeoff on the time-frequency plane:
[Figure: alternate tilings of the time-frequency plane, n against k]
  Alternate tilings of time-freq: disks show "blurring" due to window length; area of disk is constant
  → Uncertainty principle: $\delta f \cdot \delta t \ge k$
half-length window → half as many DFT samples

Spectrograms of Real Sounds
[Figure: time-domain waveform, successive short DFTs, and the resulting spectrogram; axes time / s, freq / Hz (0 to 4000), intensity / dB]
  individual t-f cells merge into continuous image

Narrowband vs. Wideband
Effect of varying window length:
[Figure: two spectrograms of the same signal, freq / Hz (0 to 4000) against time / s, level / dB; Window = 256 pt: Narrowband; Window = 48 pt: Wideband]

Spectrogram in Matlab
>> [d,sr]=wavread('mpgr1_sx419.wav');
>> Nw=256;               % (hann) window length
>> specgram(d,Nw,sr)     % sr: actual sampling rate (to label time axis)
>> caxis([-80 0])
>> colorbar
[Figure: resulting spectrogram, Frequency (0 to 8000) against Time (0.5 to 3), colorbar in dB from -80 to 0]

STFT as a Filterbank
Consider one "row" of STFT:
$$X_k[n_0] = \sum_{n=0}^{N-1} x[n_0 + n] \cdot w[n] \cdot e^{-j\frac{2\pi k n}{N}} \qquad \text{(just one freq.)}$$
$$= \sum_{m=-(N-1)}^{0} h_k[m]\, x[n_0 - m] \qquad \text{(convolution with complex IR)}$$
where
$$h_k[n] = w[-n] \cdot e^{j\frac{2\pi k n}{N}}$$
Each STFT row is output of a filter (subsampled by the STFT hop size)
[Figure: complex sequence shown as Re{x[n]} and Im{x[n]} against n]
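
The equivalence between one STFT row and a filter output can be checked numerically. This is a sketch in plain Python with illustrative names; the impulse response is taken as $h_k[m] = w[-m]\, e^{j 2\pi k m / N}$ for m = -(N-1) .. 0, and the window and test signal are arbitrary choices made for the check.

```python
import cmath
import math

def stft_row(x, w, k, n0):
    """One STFT value: sum_n x[n0+n] w[n] exp(-j*2*pi*k*n/N)."""
    N = len(w)
    return sum(x[n0 + n] * w[n] * cmath.exp(-2j * math.pi * k * n / N)
               for n in range(N))

def filter_output(x, w, k, n0):
    """Same value as a convolution: sum_m h_k[m] x[n0 - m], m = -(N-1)..0."""
    N = len(w)
    total = 0j
    for m in range(-(N - 1), 1):
        h_km = w[-m] * cmath.exp(2j * math.pi * k * m / N)   # h_k[m] = w[-m] e^{+j...}
        total += h_km * x[n0 - m]
    return total

N = 8
w = [0.5 - 0.5 * math.cos(2 * math.pi * (n + 0.5) / N) for n in range(N)]  # a tapered window
x = [math.sin(0.7 * n) + 0.3 * math.cos(2.1 * n) for n in range(40)]
k = 2
max_diff = max(abs(stft_row(x, w, k, n0) - filter_output(x, w, k, n0))
               for n0 in range(len(x) - N + 1))
```

Evaluating `filter_output` at every n0 gives the full-rate filter output; the STFT keeps only the samples at multiples of the hop size.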

STFT as a Filterbank
If
$$h_k[n] = w[(-)n] \cdot e^{j\frac{2\pi k n}{N}}$$
then
$$H_k(e^{j\omega}) = W\left(e^{(-)j\left(\omega - \frac{2\pi k}{N}\right)}\right) \qquad \text{(shift-in-}\omega\text{)}$$
Each STFT row is the same bandpass response defined by $W(e^{j\omega})$, frequency-shifted to a given DFT bin:
  A bank of identical, frequency-shifted bandpass filters: "filterbank"
[Figure: $|W(e^{j\omega})|$ and its shifted copies $|H_1(e^{j\omega})|$, $|H_2(e^{j\omega})|$ against ω up to π]

STFT Analysis-Synthesis
IDFT of STFT frames can reconstruct (part of) original waveform
e.g. if
$$X[k, n_0] = \mathrm{DFT}\{ x[n_0 + n] \cdot w[n] \}$$
then
$$\mathrm{IDFT}\{ X[k, n_0] \} = x[n_0 + n] \cdot w[n]$$
Can shift by $n_0$, combine, to get $\hat{x}[n]$:
[Figure: x[n] and the windowed segment $\hat{x}[n] = x[n] \cdot w[n - n_0]$ against n]
Could divide by $w[n - n_0]$ to recover x[n]...

Dividing by small values of w[n] is bad
Prefer to overlap windows:
  i.e. sample $X[k, n_0]$ at $n_0 = r \cdot H$ where H = N/2 (for example)
  (H: hopsize, N: window length)
Then
$$\hat{x}[n] = \sum_r x[n]\, w[n - rH] = x[n] \sum_r w[n - rH] = x[n] \quad \text{if} \quad \sum_r w[n - rH] = 1$$
[Figure: overlapping windowed segments x[n]·w[n-rH] summing to $\hat{x}[n]$]

Hann or Hamming windows with 50% overlap sum to constant
$$\left( 0.54 + 0.46\cos\left(2\pi \frac{n}{N}\right) \right) + \left( 0.54 + 0.46\cos\left(2\pi \frac{n - \frac{N}{2}}{N}\right) \right) = 1.08$$
[Figure: w[n], w[n-N/2] and their sum w[n] + w[n-N/2] against n]
Can modify individual frames of X[k,n] and then reconstruct
  complex, time-varying modifications
  tapered overlap makes things OK
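
A sketch of overlap-add analysis-synthesis with a Hamming window at 50% overlap. Plain Python with illustrative names; the window is written for array indices 0 .. N-1 (the slide's formula with n shifted by N/2), frames are left unmodified, and N = 16 and the test signal are arbitrary assumptions for the check.

```python
import cmath
import math

def dft(x, sign=-1):
    """Direct DFT (sign=-1) or unscaled inverse kernel (sign=+1)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(sign * 2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    return [v / len(X) for v in dft(X, sign=+1)]

N = 16          # window length
H = N // 2      # hop size: 50% overlap
w = [0.54 - 0.46 * math.cos(2 * math.pi * i / N) for i in range(N)]
overlap_sum = [w[i] + w[i + H] for i in range(H)]      # constant 1.08

x = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(10 * H)]
y = [0.0] * len(x)
for n0 in range(0, len(x) - N + 1, H):
    frame = dft([x[n0 + n] * w[n] for n in range(N)])  # X[k, n0]
    # A frame-by-frame modification of `frame` would go here.
    seg = idft(frame)                                  # back to x[n0+n] * w[n]
    for n in range(N):
        y[n0 + n] += seg[n].real / 1.08                # overlap-add, undo the constant

# Only samples covered by two windows are fully reconstructed.
err = max(abs(y[n] - x[n]) for n in range(H, len(x) - H))
```

The first and last half-windows are covered by a single frame, so they are excluded from the error check.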

e.g. Noise reduction:
[Figure: spectrograms, frequency index k against frame index r, of: STFT of original speech; Speech corrupted by white noise; Energy threshold mask]
