Introduction
How is audio different from speech?
Human auditory system
Lossless audio coding
Reversibility of closed-loop DPCM
Inter-channel decorrelation
Perceptual audio coding
Psychoacoustics
Perceptual entropy
Principles
No physical model exists for audio production
Instead, more emphasis is put on the human auditory system
Encoder: $e(n) = x(n) - Q[\hat{x}(n)]$, where $\hat{x}(n) = \sum_{k=1}^{K} a_k x(n-k)$
Decoder: $x(n) = e(n) + Q[\hat{x}(n)]$, where $\hat{x}(n) = \sum_{k=1}^{K} a_k x(n-k)$
$Q[\cdot]$: rounding
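The reversibility of the encoder/decoder recursion above can be checked with a minimal sketch. It assumes integer input samples, treats samples before the start of the signal as zero, and uses Python's built-in `round` as the rounding operator Q. The function names and the example coefficients are illustrative, not taken from the slides.

```python
def quantize(value):
    """Q[.]: round the real-valued prediction to an integer.
    Encoder and decoder must use exactly the same rounding rule."""
    return int(round(value))


def predict(past, coeffs):
    """x_hat(n) = sum_{k=1..K} a_k * x(n-k).
    past[-k] holds x(n-k); samples before the start are taken as 0 (assumption)."""
    total = 0.0
    for k, a in enumerate(coeffs, start=1):
        if k <= len(past):
            total += a * past[-k]
    return total


def dpcm_encode(samples, coeffs):
    """e(n) = x(n) - Q[x_hat(n)], predicting from the original integer samples."""
    residual = []
    for n, x in enumerate(samples):
        residual.append(x - quantize(predict(samples[:n], coeffs)))
    return residual


def dpcm_decode(residual, coeffs):
    """x(n) = e(n) + Q[x_hat(n)], predicting from already decoded samples."""
    decoded = []
    for e in residual:
        decoded.append(e + quantize(predict(decoded, coeffs)))
    return decoded


if __name__ == "__main__":
    x = [3, 7, 12, 14, 13, 9, 4, -2]
    a = [1.8, -0.9]  # hypothetical predictor coefficients
    e = dpcm_encode(x, a)
    print(e, dpcm_decode(e, a) == x)
```

Because the residual is coded without loss, the decoder's past samples are identical to the encoder's, so both sides compute the same rounded prediction and the reconstruction is exact.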
Figure: analysis and synthesis stages
EE493Q: Digital Speech Processing
Filter Bank Illustration
Forward Transform
Inverse Transform
Pre-Echo Distortion
Example:
Model 2:
Calculate a “tonality” index, based on the previous two analysis windows,
to estimate how likely each spectral point is to be a tone
X: tonal
O: noise
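One common way to judge tonality from the previous two analysis windows is an unpredictability measure of the kind used in MPEG-1 psychoacoustic model 2. Magnitude and phase of each spectral point are linearly extrapolated from the two earlier windows and compared with the actual value. The sketch below assumes that formulation; the slides do not state the exact formula, and the function names and the plain DFT are illustrative.

```python
import cmath
import math


def dft(frame):
    """Plain DFT of a real frame, bins 0..N/2 (adequate for a small sketch)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def unpredictability(spec_now, spec_prev1, spec_prev2):
    """Per-bin measure c in [0, 1]: near 0 = predictable (tone-like),
    near 1 = unpredictable (noise-like or an onset)."""
    c = []
    for x, p1, p2 in zip(spec_now, spec_prev1, spec_prev2):
        # Linear extrapolation of magnitude and phase from the two past windows.
        r_hat = 2.0 * abs(p1) - abs(p2)
        phi_hat = 2.0 * cmath.phase(p1) - cmath.phase(p2)
        x_hat = cmath.rect(r_hat, phi_hat)
        denom = abs(x) + abs(r_hat)
        # Empty bins are treated as unpredictable (an assumption of this sketch).
        c.append(abs(x - x_hat) / denom if denom > 0.0 else 1.0)
    return c
```

A steady sinusoid yields values near 0 at its bin, while a sudden onset after silence yields 1 in every bin. A tonality index can then be derived from this measure, with low unpredictability mapping to high tonality.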
Asymmetry of masking: NMT (noise-masking-tone) vs. TMN (tone-masking-noise)
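The asymmetry can be made concrete with Johnston's masking offset, a widely cited formulation that interpolates between the tone-masking-noise and noise-masking-tone cases using a tonality value. The sketch below assumes that formulation (14.5 + B dB for a tonal masker in Bark band B, 5.5 dB for a noise masker); the slides do not give these numbers, and the function names are illustrative.

```python
def masking_offset_db(tonality, bark_band):
    """Offset (dB) subtracted from the masker energy to get the threshold.
    tonality = 1.0: pure tone masker (TMN), offset 14.5 + bark_band dB.
    tonality = 0.0: pure noise masker (NMT), offset 5.5 dB."""
    if not 0.0 <= tonality <= 1.0:
        raise ValueError("tonality must lie in [0, 1]")
    tmn_offset = 14.5 + bark_band
    nmt_offset = 5.5
    return tonality * tmn_offset + (1.0 - tonality) * nmt_offset


def band_threshold_db(energy_db, tonality, bark_band):
    """Masking threshold for one band: masker energy minus the offset."""
    return energy_db - masking_offset_db(tonality, bark_band)
```

For the same masker energy, a noise-like masker gives a higher threshold than a tonal one, which is the asymmetry: noise masks a tone more effectively than a tone masks noise.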
Step 3: Decimation and Reorganization of Maskers
“Smear” each signal within its critical band
Use either a masking function (Model 1) or a spreading
function (Model 2).
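As an illustration of smearing a masker across the critical-band (Bark) scale, the sketch below uses the Schroeder spreading function together with Zwicker's Hz-to-Bark approximation. Both are standard analytic forms, but the slides do not say which functions the course uses, so treat the choice and the function names as assumptions.

```python
import math


def hz_to_bark(f_hz):
    """Zwicker's approximation of the critical-band rate (Bark)."""
    return 13.0 * math.atan(0.00076 * f_hz) + 3.5 * math.atan((f_hz / 7500.0) ** 2)


def spreading_db(dz):
    """Schroeder spreading function in dB.
    dz = Bark position of the probe minus Bark position of the masker."""
    return 15.81 + 7.5 * (dz + 0.474) - 17.5 * math.sqrt(1.0 + (dz + 0.474) ** 2)


def spread_masker_db(masker_hz, masker_db, probe_hz):
    """Level (dB) that a masker contributes at the probe frequency."""
    dz = hz_to_bark(probe_hz) - hz_to_bark(masker_hz)
    return masker_db + spreading_db(dz)
```

The function falls off by roughly 10 dB per Bark above the masker and 25 dB per Bark below it, so masking spreads further toward higher frequencies.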
Adjust the calculated threshold by incorporating a “quiet” mask:
the masking threshold for each frequency when no other
frequencies are present (the threshold in quiet).
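The adjustment can be sketched with Terhardt's well-known approximation of the threshold in quiet (in dB SPL), combined with the calculated masking threshold by taking the larger value at each frequency. Taking the maximum is one simple option; other schemes add the two in the power domain. The function names and the assumption that levels are calibrated in dB SPL are illustrative.

```python
import math


def threshold_in_quiet_db(f_hz):
    """Terhardt's approximation of the absolute threshold of hearing (dB SPL)."""
    khz = f_hz / 1000.0
    return (
        3.64 * khz ** -0.8
        - 6.5 * math.exp(-0.6 * (khz - 3.3) ** 2)
        + 1e-3 * khz ** 4
    )


def apply_quiet_mask(freqs_hz, masking_db):
    """Raise the calculated threshold to the threshold in quiet wherever the
    calculated value falls below it (max rule, an assumption of this sketch)."""
    return [max(m, threshold_in_quiet_db(f)) for f, m in zip(freqs_hz, masking_db)]
```

For example, `apply_quiet_mask([100.0, 1000.0], [0.0, 40.0])` leaves the 1 kHz value at 40 dB but raises the 100 Hz value, since the ear is far less sensitive at low frequencies.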
http://www.technologyreview.com/read_article.aspx?id=17642&ch=infotech