Side-chain equalization through selective unmasking by phase vocoder analysis and synthesis
Zhiguang Eric Zhang
Equalization (EQ) within
• Make sonic elements fit
within the mix

• Address problematic or
clashing frequencies

• Emphasize or deemphasize
certain frequencies for
creative purposes
Modern EQ
• Dynamic / Active EQ

• Side-chain EQ

• New features

• spectrum grab

• natural phase

• match EQ

• New filter shapes
Time domain vs frequency
domain EQs
• Fast Fourier transform (FFT) frequency domain
filters are not widely accepted for EQ purposes

• Pre-echo, ripple, ringing, smearing, resonance?

• Typical time domain (minimum-phase) EQ filters introduce phase distortion;
frequency domain filters that apply real-valued gains to FFT bins are
zero-phase: no phase shift or distortion!

• FFT incurs latency - digital audio workstations
generally compensate for latency
Primary aims

• To attempt to implement a frequency domain FFT equalizer

• Objectively measure unmasking, clarity, and


• Introduce the concept of a vector to frequency-domain processing
Automatic mixing
• Automate tedium associated with audio engineering / mixing /

• Gain / level balancing, EQ, compression, panning

• Cross-adaptive (multi-track), auto-adaptive (single track or stem),
external-adaptive (side-chain)

• Should incorporate audio engineering best practices and domain

• Pestana, Pedro D.; Reiss, Joshua D. Intelligent Audio Production Strategies Informed by Best
Practices. Presented at the 53rd AES International Conference, January 27–29, London, UK, 2014.

• Pestana, Pedro D. Automatic Mixing Systems Using Adaptive Audio Effects. Ph.D. dissertation,
Catholic University of Portugal, 2013.
• Complete masking is defined as a
dominant sound rendering a quieter
(15 to 20 dB lower) sound unheard

• Partial masking - reducing the audibility of a simultaneously
occurring sound

• Partial loudness - the loudness of a sound in the presence of another

• Central to the goals of EQ

Bosi, Marina; Goldberg, Richard E. Introduction to Digital Audio Coding and Standards.
Boston: Kluwer Academic Publishers, 2003. Print.

• is also addressed by gain and

Prior work
• Pestana, Pedro D.; Reiss, Joshua D. A Cross-Adaptive Dynamic Spectral
Panning Technique. Proc. of the 17th Int. Conference on Digital Audio
Effects (DAFx-14), September 1-5, Erlangen, Germany, 2014.

• Dynamically panning FFT bins based on musical heuristics and unmasking

• Unmasking is subjectively competitive with a human audio engineer
• Subjective “excitement” and “other-worldliness”

• Hafezi, Sina; Reiss, Joshua D. Autonomous Multi-track Equalization Based
on Masking Reduction. J. Audio Eng. Soc., Vol. 63, No. 5, May, 2015.

• Cross-adaptive unmasking based on essential and non-essential

• Uses psychoacoustic model to detect masking and time-domain filters to
• Subjectively competitive with a human audio engineer
Phase vocoder

• Frequency domain FFT transformation followed by
time-frequency processing

• Traditionally used for pitch shifting, time stretching,
and other effects

Quackenbush, Schuyler. L06_Transforms_1.pptx.pd 1st ed. New York City, USA: New York University
Steinhardt, Music Technology, 2014.

Zolzer, Udo. DAFX - Digital Audio Effects, Second Edition. West Sussex: Wiley, 2011. Print.
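The analysis and synthesis stages of a phase vocoder can be sketched as a short-time Fourier transform followed by weighted overlap-add. This is a minimal sketch, not the implementation behind these slides: the function names `stft` and `istft`, the frame size `n_fft`, the hop size `hop`, and the periodic Hann window are all assumptions chosen for illustration.

```python
import numpy as np

def stft(x, n_fft=1024, hop=256):
    """Analysis: slice x into windowed frames and take the FFT of each."""
    window = np.hanning(n_fft + 1)[:-1]          # periodic Hann (assumed choice)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1), window   # shape: (frames, bins)

def istft(spectra, window, hop=256):
    """Synthesis: inverse FFT, window again, overlap-add, normalize."""
    n_frames = spectra.shape[0]
    n_fft = len(window)
    frames = np.fft.irfft(spectra, n=n_fft, axis=1) * window
    out = np.zeros(n_fft + hop * (n_frames - 1))
    norm = np.zeros_like(out)
    for i in range(n_frames):
        out[i * hop:i * hop + n_fft] += frames[i]
        norm[i * hop:i * hop + n_fft] += window ** 2
    return out / np.maximum(norm, 1e-12)         # undo the squared-window sum

rng = np.random.default_rng(0)
x = rng.standard_normal(8192)
spectra, window = stft(x)
# Any per-bin time-frequency processing would be applied to `spectra` here.
y = istft(spectra, window)
```

With no processing between the two stages, the interior of `y` matches `x`; only the first and last frame, where fewer windows overlap, are affected by edge effects.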
FFT bins are vectors

• The FFT is a linear transformation when taking
into account magnitude and phase angle

• Signal addition can be done in the frequency domain through
vector algebra
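The linearity claim above can be checked numerically. In this small sketch (the helper name `mix_spectrum` and the random test signals are assumptions), the spectrum of a sum equals the sum of the complex spectra, while the magnitudes alone do not add.

```python
import numpy as np

def mix_spectrum(a, b):
    """Return the spectra of a, b, and of their time-domain sum."""
    return np.fft.rfft(a), np.fft.rfft(b), np.fft.rfft(a + b)

rng = np.random.default_rng(0)
a = rng.standard_normal(1024)
b = rng.standard_normal(1024)
A, B, S = mix_spectrum(a, b)

# Complex (vector) addition per bin reproduces the spectrum of the mix.
linear = np.allclose(S, A + B)

# Magnitudes alone obey only the triangle inequality: |A + B| <= |A| + |B|.
bounded = np.all(np.abs(S) <= np.abs(A) + np.abs(B) + 1e-9)
```

This is why each bin has to be treated as a vector (magnitude and phase angle together) rather than as a magnitude only.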

Signal contribution to a sum
is its scalar projection

• Signals are often added with similar frequency content

• One track contributes to the mix at a specific frequency
through its scalar projection
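One way to compute this contribution is to treat each complex bin as a 2-D vector (real part, imaginary part) and project the track's bin onto the mix's bin. The sketch below assumes a hypothetical helper `scalar_projection` and two hand-picked bins per track; it illustrates the geometry and is not necessarily the exact formulation used in this work.

```python
import numpy as np

def scalar_projection(x, onto, eps=1e-12):
    """Signed length of bin x along the direction of bin `onto`.

    For complex numbers viewed as 2-D vectors, the dot product is
    Re(x * conj(onto)); dividing by |onto| gives the scalar projection.
    """
    return np.real(x * np.conj(onto)) / np.maximum(np.abs(onto), eps)

# Two tracks, two example bins each (values chosen for illustration).
a = np.array([1 + 0j, 1 + 0j])
b = np.array([0 + 1j, -0.5 + 0j])   # second bin is out of phase with a
mix = a + b

contrib_a = scalar_projection(a, mix)
contrib_b = scalar_projection(b, mix)
```

The two contributions add up to the magnitude of the mix in each bin, and a track that is out of phase with the mix has a negative contribution, as in the second bin of `b`.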
Subtractive EQ or attenuation
via scalar projection
Phase response

phase response = zero-phase
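The zero-phase property follows from scaling each bin by a real, non-negative gain: the magnitude changes and the phase angle does not. A minimal sketch, assuming a hypothetical helper `apply_real_gain` and an arbitrary attenuation curve `gain`:

```python
import numpy as np

def apply_real_gain(spectrum, gain):
    """Scale each bin's magnitude by a real, non-negative gain."""
    return spectrum * gain

rng = np.random.default_rng(1)
x = rng.standard_normal(512)
X = np.fft.rfft(x)
gain = np.linspace(1.0, 0.25, X.size)    # hypothetical attenuation curve
Y = apply_real_gain(X, gain)

# Compare unit phasors so that only the phase angle is tested.
phase_preserved = np.allclose(X / np.abs(X), Y / np.abs(Y))

# The equivalent impulse response is symmetric about n = 0 (circularly),
# which is the time-domain signature of a zero-phase filter.
h = np.fft.irfft(gain, n=x.size)
symmetric = np.allclose(h[1:], h[1:][::-1])
```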

Dynamic FFT EQ
• Time constants (attack, release) smooth the
action of the effect and prevent artifacts

• Add features of a compressor to the effect: threshold & ratio
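The compressor-style controls can be sketched as a per-bin gain computer followed by one-pole smoothing across frames. The function names, default values, and the one-pole smoother are assumptions for illustration; the slides do not specify how the time constants are implemented.

```python
import numpy as np

def gain_reduction_db(level_db, threshold_db=-30.0, ratio=4.0):
    """Static curve: reduction (in dB) applied above the threshold."""
    over = np.maximum(level_db - threshold_db, 0.0)
    return over * (1.0 - 1.0 / ratio)

def smooth(target_db, attack_s=0.01, release_s=0.1, hop=256, fs=44100):
    """Smooth a (frames, bins) reduction curve with attack/release times."""
    a_att = np.exp(-hop / (fs * attack_s))
    a_rel = np.exp(-hop / (fs * release_s))
    out = np.zeros_like(target_db)
    state = np.zeros(target_db.shape[1])
    for t, tgt in enumerate(target_db):
        # Rising reduction uses the attack constant, falling uses release.
        coef = np.where(tgt > state, a_att, a_rel)
        state = coef * state + (1.0 - coef) * tgt
        out[t] = state
    return out

levels = np.array([-40.0, -30.0, -10.0])
static = gain_reduction_db(levels)           # reduction per bin, in dB

step = np.full((50, 1), 12.0)                # sudden 12 dB reduction request
smoothed = smooth(step)                      # ramps toward 12 dB over frames
```

The smoothed curve would then be converted to a linear gain, for example `10 ** (-smoothed / 20)`, before scaling the bin magnitudes.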
Equal-loudness scaling

• Humans perceive relative loudness of frequencies
differently, resulting in equal-loudness curves

• Scale attenuation based on how we actually hear
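As an illustration of perceptually scaled attenuation, the sketch below uses the standard A-weighting curve, which approximates the inverse of an equal-loudness contour at moderate levels. A-weighting is a stand-in chosen here for brevity; the slides do not say which equal-loudness data was used, and the helper names `a_weighting_db` and `scale_attenuation` are hypothetical.

```python
import numpy as np

def a_weighting_db(freq_hz):
    """A-weighting in dB (approximately 0 dB at 1 kHz)."""
    f2 = np.asarray(freq_hz, dtype=float) ** 2
    num = (12194.0 ** 2) * f2 ** 2
    den = ((f2 + 20.6 ** 2)
           * np.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
           * (f2 + 12194.0 ** 2))
    r = np.maximum(num / den, 1e-12)         # avoid log(0) at DC
    return 20.0 * np.log10(r) + 2.0

def scale_attenuation(atten_db, freq_hz):
    """Weight a per-bin attenuation (dB) by relative hearing sensitivity."""
    weight = 10.0 ** (a_weighting_db(freq_hz) / 20.0)
    weight = weight / weight.max()           # most sensitive bin gets 1.0
    return atten_db * weight

fs, n_fft = 44100, 1024                      # assumed sample rate and frame size
freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
atten = np.full(freqs.size, 6.0)             # flat 6 dB attenuation request
scaled = scale_attenuation(atten, freqs)
```

Under this weighting the full attenuation is applied only where hearing is most sensitive (a few kHz), and progressively less toward the low-frequency end.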