
This article has been accepted for publication in a future issue of this journal, but has not been

fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TASLP.2019.2892895, IEEE/ACM
Transactions on Audio, Speech, and Language Processing
IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING 1

A Geometric Model for Prediction of Spatial Aliasing in 2.5D Sound Field Synthesis
Fiete Winter, Frank Schultz, Gergely Firtha, Sascha Spors

Abstract—The avoidance of spatial aliasing is a major challenge in the practical implementation of Sound Field Synthesis. Such methods aim at a physically accurate reconstruction of a desired sound field inside a target region using a finite ensemble of loudspeakers. In the past, different theoretical treatises of the inherent spatial sampling process led to anti-aliasing criteria for simple loudspeaker array arrangements, e.g. lines and circles, and fundamental sound fields, e.g. plane and spherical waves. Many criteria were independent of the listener’s position inside the target region. Within this article, a geometrical framework based on a ray approximation of the underlying synthesis problem is proposed. Unlike former approaches, this model predicts spatial aliasing artefacts for arbitrary convex loudspeaker arrays and as a function of the listening position and the desired sound field. Anti-aliasing criteria for distinct listening positions and extended listening areas are formulated based on the established predictions. For validation, the model is applied to different analytical Sound Field Synthesis approaches: the predicted spatial structure of the spatial aliasing agrees with numerical simulation of the synthesised sound fields. Moreover, it is shown within this framework that the active prioritisation of a control region using so-called Local Sound Field Synthesis approaches does indeed reduce spatial aliasing artefacts. For the scenario under investigation, a method for Local Wave Field Synthesis achieves an artefact-free synthesis up to a frequency which is between 2.9 and 17.3 times as high as for conventional Wave Field Synthesis.

Index Terms—sound field synthesis, spatial aliasing

Manuscript received *****; revised *****; accepted *****. Date of publication *****; date of current version *****. This work was supported by the German Research Foundation (DFG) under Grant SP 1295/9-1. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Federico Fontana. (Corresponding author: Fiete Winter.)
F. Winter, F. Schultz, and S. Spors are with the Institute of Communications Engineering, University of Rostock, Rostock 18051, Germany (e-mail: fiete.winter@uni-rostock.de; frank.schultz@uni-rostock.de; sascha.spors@uni-rostock.de).
G. Firtha is with the Department of Networked Systems and Services, Budapest University of Technology and Economics, Budapest, Budapest 1111, Hungary (e-mail: firtha@hit.bme.hu).
Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.
Digital Object Identifier 10.1109/TASLP.2019.*******
2329-9290 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

I. INTRODUCTION

The physically accurate reconstruction of a virtual, a.k.a. desired or target, sound field within a target region is the common goal of Sound Field Synthesis (SFS) techniques. They drive a distribution of (up to hundreds of) loudspeakers as so-called secondary sources surrounding the target region such that the superposition of their individual sound fields coincides with the desired one. As the density of the distribution is limited for practical reasons, spatial aliasing artefacts impair the synthesis accuracy. Spatial aliasing can be avoided as long as a critical number of actuators are employed [1], which grows with the spatial scale of the target region and the temporal frequency. It serves as a motivation for so-called Local Sound Field Synthesis (LSFS): A more accurate reproduction inside a downsized area which is smaller than the area surrounded by the Secondary Source Distribution (SSD) is pursued. To achieve this, stronger artefacts outside the prioritised area are permitted.

Two well-known representatives of SFS are Near-Field-Compensated Higher Order Ambisonics (NFC-HOA) [2] and Wave Field Synthesis (WFS) [3]. Both methods allow the driving signals to be described analytically using simple parametric models. NFC-HOA constitutes the analytical solution to the underlying synthesis problem for circular and spherical SSDs. It uses spatial bandwidth limitation in order to avoid spatial aliasing around the SSD’s centre. Extensions to NFC-HOA published in [4, Sec. IV.A] and [5, p. 145ff.] use multipole re-expansion in the circular/spherical harmonics domain in order to shift the region of high synthesis accuracy from the centre. Conventional WFS as a spatial full-bandwidth technique does not actively avoid spatial aliasing. Extensions for WFS to achieve LSFS were presented in [6]–[9]. Numerical approaches to SFS that are based on a local control region are inherently LSFS [4, Sec. I]. As the number of approaches is vast, we name [10]–[17] as examples. Approaches for LSFS have also been extended towards multizone SFS, which aims at independently controlling the synthesised sound field in two or more portions of a target area, e.g. [18]–[22]. In order to do so, these methods create bright zones with significant sound pressure and quiet zones with minimal sound pressure.

Several theoretical treatises investigated spatial aliasing in SFS with a dedicated focus on the aliasing frequency. This frequency describes the largest temporal frequency up to which no aliasing occurs given a distinct synthesis scenario. Exceeding this frequency can be regarded as a violation of the anti-aliasing criterion. For linear and circular SSDs driven by WFS, criteria were derived for fundamental virtual sound fields such as plane and spherical waves [23], [24]. A comparison to NFC-HOA with respect to the aliasing properties was presented in [25]. The criteria found are independent of the listening position. However, numerical simulations of the synthesised sound fields suggest a spatial heterogeneity of the aliasing frequency. Corteel et al. [6] used a time-domain model based on path-lengths to predict the position-dependent aliasing frequency for virtual point sources. It was further utilised by Oldfield [26] for his investigations on focused point sources in WFS. In our own work [4], a model was published that predicts the occurrence of spatial aliasing for virtual plane waves synthesised by an NFC-HOA approach for LSFS. It was further used to predict the


method’s optimal parametrisation to avoid aliasing. The model was extended and applied to multizone SFS by Donley et al. [27]. There, the impact of the spatial aliasing caused by the synthesis of the bright zone on the sound pressure inside the quiet zone was modelled. Analytic solutions for the aliasing frequency were derived for virtual plane waves synthesised by linear and circular SSDs.

This article proposes a geometric model to predict the spatial occurrence of aliasing artefacts depending on the listening position, on the geometry of the SSD, and on the virtual sound field. It generalises prior treatises towards the mentioned dependencies. The framework is based on a high-frequency, i.e. ray-based, approximation of the underlying SFS problem. The prediction will be used to formulate anti-aliasing criteria for listening positions and extended listening areas. For validation, the model is applied to WFS, NFC-HOA, Local Wave Field Synthesis (LWFS) and Multizone SFS. Moreover, the impact of LWFS on the aliasing properties compared to WFS is discussed within this framework. A comparison to prior approaches that predict spatial aliasing is made whenever possible.

The article is organised as follows: The mathematical preliminaries are described in Sec. II. The geometric model for SFS is presented in Sec. III followed by anti-aliasing criteria derived from it in Sec. IV. The model is applied to various SFS methods in Sec. V in order to compare its predictions against numerical simulations. The conclusion is given in Sec. VI.

II. PRELIMINARIES

A position vector $\mathbf{x}$ in the three-dimensional, right-hand coordinate system is defined by its Cartesian representation $[x, y, z]^T$, or its cylindrical representation $[r, \alpha, z]^T$. The distance of $\mathbf{x}$ from the z-axis is denoted as $r$, while $\alpha$ describes the azimuth angle between the x-axis and the projection of $\mathbf{x}$ onto the xy-plane. Elements of a vector are denoted using the same subscripted indices of their corresponding vector. For example, $x_i$, $y_i$, and $z_i$ belong to $\mathbf{x}_i$. The normalised version of a vector $\mathbf{x}$ is indicated via $\hat{\mathbf{x}}$. The scalar product between the vectors $\mathbf{x}_1$ and $\mathbf{x}_2$ is given as $\langle \mathbf{x}_1 | \mathbf{x}_2 \rangle$. The axial unit vectors of the Cartesian coordinate system are denoted as $\mathbf{u}_x$, $\mathbf{u}_y$, and $\mathbf{u}_z$. The radial frequency $\omega = 2\pi f$ is defined by the temporal frequency $f$. The speed of sound is denoted by $c$ and fixed to 343 m/s for all simulations within this article. A sound pressure field $p(\mathbf{x}, t)$ is a scalar function depending on the position and the time $t$. Its temporal Fourier transform [28, Eq. (9.1)]

\[ P(\mathbf{x}, \omega) = \int_{-\infty}^{\infty} p(\mathbf{x}, t)\, e^{-j\omega t}\, dt = A_P(\mathbf{x}, \omega)\, e^{+j\phi_P(\mathbf{x}, \omega)} \tag{1} \]

will be regularly expressed by its real-valued amplitude $A_P(\mathbf{x}, \omega)$ and phase $\phi_P(\mathbf{x}, \omega)$, here.

A. The Local Wavenumber Vector

The concept of the local wavenumber vector was introduced to the context of SFS by Firtha et al. [29, Eq. (15)]. For an arbitrary sound field $P(\mathbf{x}, \omega)$ fulfilling the linear wave equation [30, Eq. (2.1)], it is defined as

\[ \mathbf{k}_P(\mathbf{x}, \omega) := -\nabla\phi_P(\mathbf{x}, \omega) \approx \frac{\omega}{c}\, \hat{\mathbf{k}}_P(\mathbf{x}, \omega) \tag{2} \]

using the gradient operator $\nabla$. The normalised vector $\hat{\mathbf{k}}_P(\mathbf{x}, \omega)$ describes the local propagation direction of $P(\mathbf{x}, \omega)$ at a given coordinate $\mathbf{x}$. For elementary sound fields such as point and line sources, or plane waves, $\mathbf{k}_P(\mathbf{x}, \omega)$ fulfils the local dispersion relation, i.e. its length is fixed to $\omega/c$. For arbitrary sound fields, this statement is true for asymptotically high frequencies, see [31, Sec. 5.14]. A high-frequency approximation of the sound field gradient is given by [32, Eq. (57)]

\[ \nabla P(\mathbf{x}, \omega) \approx -j\, \frac{\omega}{c}\, \hat{\mathbf{k}}_P(\mathbf{x}, \omega)\, P(\mathbf{x}, \omega) . \tag{3} \]

B. The Stationary Phase Approximation

Given a complex-valued function $F(u) = A_F(u)\, e^{+j\phi_F(u)}$ with its phase term rapidly oscillating compared to its slowly changing amplitude, the following approximation of the integral

\[ \int_{-\infty}^{\infty} F(u)\, du \approx F(u^*) \sqrt{\frac{2\pi}{|\phi''_F(u^*)|}}\; e^{+j\frac{\pi}{4}\,\mathrm{sgn}(\phi''_F(u^*))} \tag{4} \]

holds. For the stationary point $u^*$, the first-order derivative of the phase w.r.t. $u$, $\phi'_F(u^*)$, vanishes and the second-order derivative $\phi''_F(u^*)$ is non-zero. The approximation is based upon the idea that the integration over a complex sinusoid with a rapidly changing phase yields zero except for the contributions from $u^*$ and its neighbourhood.

III. GEOMETRIC MODEL FOR SOUND FIELD SYNTHESIS

In the following, a derivation of the geometric model for spatial aliasing in SFS will be presented. A ray approximation is applied to the underlying synthesis problem, whose essentials are briefly recalled at first. The derivation is carried out for a linear SSD and is then generalised to convex geometries.

Fig. 1: Geometry for Sound Field Synthesis. (Labels in the sketch: $\Omega$, $\partial\Omega$, $\Omega_l$, $\mathbf{x}_0$, $\mathbf{n}_0(\mathbf{x}_0)$, $G(\mathbf{x} - \mathbf{x}_0, \omega)$, $S(\mathbf{x}, \omega)$.)

The fundamental task in SFS is to reproduce the virtual sound field $s(\mathbf{x}, t)$ with its temporal Fourier spectrum

\[ S(\mathbf{x}, \omega) = A_S(\mathbf{x}, \omega)\, e^{+j\phi_S(\mathbf{x}, \omega)} \tag{5} \]

within a defined listening region $\Omega_l \subseteq \Omega$, see Fig. 1 for graphical explanation of the geometry. In 2½-dimensional

(2.5D) scenarios [33, Sec. 2.3], correct synthesis is pursued in the horizontal plane ($z = 0$) using an SSD located in the same plane ($z_0 = 0$). Thus, $\Omega_l$ and $\Omega$ are 2D areas. Within the described scenario, it is assumed that the virtual sound field propagates only in horizontal directions, i.e. the z-component of its local wavenumber vectors $\mathbf{k}_S(\mathbf{x}, \omega)$ is zero for $z = 0$. For the case where $\Omega_l = \Omega$, methods are usually referred to as conventional SFS including methods like WFS and NFC-HOA. The remaining case, i.e. $\Omega_l \subset \Omega$, is usually termed LSFS. A distribution of loudspeakers is positioned along the boundary $\partial\Omega$ as secondary sources (see the loudspeaker symbols in Fig. 1). Each secondary source is oriented along the inward pointing boundary normal $\mathbf{n}_0(\mathbf{x}_0)$. The sound field emitted by an individual secondary source is commonly modelled by a point source. It is given by the 3D free-field Green’s function [30, Eq. (8.41)]

\[ G(\mathbf{x} - \mathbf{x}_0, \omega) = \frac{1}{4\pi|\mathbf{x} - \mathbf{x}_0|}\, e^{-j\frac{\omega}{c}|\mathbf{x} - \mathbf{x}_0|} \tag{6a} \]
\[ \phantom{G(\mathbf{x} - \mathbf{x}_0, \omega)} = A_G(\mathbf{x} - \mathbf{x}_0, \omega)\, e^{+j\phi_G(\mathbf{x} - \mathbf{x}_0, \omega)} \tag{6b} \]

with $\mathbf{x}_0 \in \partial\Omega$. The secondary source at $\mathbf{x}_0$ is driven by its respective driving function $D(\mathbf{x}_0, \omega)$ and the resulting superposition of all secondary sources constitutes the synthesised sound field $P(\mathbf{x}, \omega)$. The driving signals have to be chosen such that the synthesised and the virtual sound field coincide within $\Omega_l$. Mathematically, this is subsumed by the 2D Single Layer Potential (SLP)

\[ S(\mathbf{x}, \omega) \overset{!}{=} P(\mathbf{x}, \omega) = \oint_{\partial\Omega} D(\mathbf{x}_0, \omega)\, G(\mathbf{x} - \mathbf{x}_0, \omega)\, dl(\mathbf{x}_0) . \tag{7} \]

The equation is supposed to hold for all $\mathbf{x} \in \Omega_l$. A suitably chosen differential line segment for the integration along the boundary $\partial\Omega$ is denoted by $dl(\mathbf{x}_0)$. In order to establish a geometrical, i.e. ray-based, model for SFS, a high-frequency approximation of the underlying synthesis problem is reasonable. A solution to the SLP for asymptotically high frequencies and arbitrary convex boundaries is given by the Kirchhoff approximation [34, p. 57]. It divides the continuous boundary into piecewise linear segments. The assumption becomes more accurate the shorter the wavelength in comparison to the segment size. The Kirchhoff approximation constitutes the theoretical basis of WFS. The generic 2.5D WFS driving function and approximate solution of the SLP is given by [29, Eq. (47)]

\[ D(\mathbf{x}_0, \omega) = -a_S(\mathbf{x}_0) \sqrt{\frac{8\pi}{j\frac{\omega}{c}}}\, \sqrt{d(\mathbf{x}_0)}\; \big\langle \mathbf{n}_0 \,\big|\, \nabla S(\mathbf{x}, \omega)|_{\mathbf{x} = \mathbf{x}_0} \big\rangle \tag{8a} \]
\[ \phantom{D(\mathbf{x}_0, \omega)} \approx a_S(\mathbf{x}_0) \sqrt{j\frac{\omega}{c}}\, \sqrt{8\pi\, d(\mathbf{x}_0)}\; \big\langle \mathbf{n}_0 \,\big|\, \hat{\mathbf{k}}_S(\mathbf{x}_0, \omega) \big\rangle\, S(\mathbf{x}_0, \omega) . \tag{8b} \]

For the second expression, the high-frequency gradient approximation (3) was used. The secondary source selection criterion [29, Eq. (46)]

\[ a_S(\mathbf{x}_0) = \begin{cases} 1, & \text{if } \langle \mathbf{n}_0 | \hat{\mathbf{k}}_S(\mathbf{x}_0, \omega) \rangle \geq 0 \\ 0, & \text{otherwise} \end{cases} \tag{9} \]

ensures that only the secondary sources that are oriented along the propagation direction of the virtual sound field are active. As a systemic artefact in 2.5D synthesis, an inevitable mismatch between the amplitude decay of the synthesised and the virtual sound field occurs. The distance factor $d(\mathbf{x}_0)$ can be used to reference the synthesised sound field to a given contour or location at which its amplitude is correct, see [29].

A. Continuous Linear Secondary Source Distribution

In the following, a continuous linear SSD along the x-axis is assumed, see grey line in Fig. 2a. The secondary source positions are denoted by $\mathbf{x}_0 = [x_0, 0, 0]^T$ and the boundary normal vector $\mathbf{n}_0$ points into the positive y-direction. Correct synthesis is supposed to be achieved inside the positive y-half plane. The SLP given by (7) specialises to a 1D convolution integral

\[ P(x, y, \omega) = \int_{-\infty}^{\infty} D(x_0, \omega)\, G(x - x_0, y, \omega)\, dx_0 , \quad \forall y > 0 , \tag{10} \]

where the dependencies on the spatial variables are split in the following, for clarity. It can be inferred from the driving function (8b) that the only phase term depending on $x_0$ is the phase of the virtual sound field $\phi_S$. The driving function can hence be expressed by

\[ D(x_0, \omega) = A_D(x_0, \omega)\, e^{+j\phi_S(x_0, 0, \omega)} . \tag{11} \]

The phase of the virtual sound field $\phi_S$ is hereby evaluated at $y = 0$ since the secondary sources are distributed along the x-axis. Using amplitude-phase notations of the involved quantities, the synthesis integral in (10) is reformulated to

\[ P(x, y, \omega) = \int_{-\infty}^{\infty} A_D(x_0, \omega)\, A_G(x - x_0, y, \omega)\; e^{+j\left(\phi_G(x - x_0, y, \omega) + \phi_S(x_0, 0, \omega)\right)}\, dx_0 , \tag{12} \]

Fig. 2: The images show an exemplary synthesis scenario for a linear SSD along the x-axis. A virtual point source (grey dot) is to be synthesised in the upper half plane ($y > 0$). For the continuous SSD (a), the calculus is described in Sec. III-A. The discrete SSD with the sampling distance $\Delta_x$ (b) is discussed in Sec. III-B.

which is then approximated by

\[ P_{\mathrm{SPA}}(x, y, \omega) = D(x_0^*, \omega)\, G(x - x_0^*, y, \omega) \cdot \sqrt{\frac{2\pi}{|\phi''_S(x_0^*, 0, \omega) + \phi''_G(x - x_0^*, y, \omega)|}} \cdot e^{+j\frac{\pi}{4}\,\mathrm{sgn}\left(\phi''_S(x_0^*, 0, \omega) + \phi''_G(x - x_0^*, y, \omega)\right)} \tag{13} \]

using the Stationary Phase Approximation (SPA) introduced in Sec. II-B, see [32]. It states that the major part of the reproduced sound field at $\mathbf{x}$ is contributed by an individual secondary source located at $\mathbf{x}_0^* = [x_0^*, 0, 0]^T$. The stationary phase point $x_0^*$ has to fulfil the condition

\[ 0 \overset{!}{=} \phi'_S(x_0^*, 0, \omega) + \phi'_G(x - x_0^*, y, \omega) . \tag{14} \]

The terms $\phi'_{\cdot}(\cdot)$ and $\phi''_{\cdot}(\cdot)$ denote the first- and second-order derivative of the phase w.r.t. $x_0$ evaluated at the according arguments. With the definition of the normalised local wavenumber vector in (2), the condition is equivalent to

\[ \hat{k}_{S,x}(x_0^*, 0, \omega) \overset{!}{=} \hat{k}_{G,x}(x - x_0^*, y, \omega) \tag{15} \]

where $\hat{k}_{\cdot,x}$ denotes the x-component of the respective vector. With their lengths and their z-component fixed to unity and zero, respectively, the normalised local wavenumber vectors are determined by one of their remaining components (x and y) up to an unknown sign of the other component. For the virtual sound field, this ambiguity can be resolved taking the secondary source selection criterion into account: it demands a positive y-component of the local wavenumber vector. According to Eqs. (2) and (6), the normalised local wavenumber vector of the 3D free-field Green’s function (6) is given by

\[ \hat{\mathbf{k}}_G(x - x_0^*, y, \omega) = \frac{1}{\sqrt{(x - x_0^*)^2 + y^2}} \begin{bmatrix} x - x_0^* \\ y \\ 0 \end{bmatrix} , \tag{16} \]

which has a positive y-component for the target region $y > 0$. Hence,

\[ \hat{\mathbf{k}}_S(x_0^*, 0, \omega) \overset{!}{=} \hat{\mathbf{k}}_G(x - x_0^*, y, \omega), \quad \forall y > 0 , \tag{17} \]

is the equivalent condition for the normalised local wavenumber vectors. This relation is illustrated in Fig. 2a. Eqs. (17) and (16) are solved for $\mathbf{x}$ yielding

\[ \begin{bmatrix} x \\ y \\ 0 \end{bmatrix} = \begin{bmatrix} x_0^* \\ 0 \\ 0 \end{bmatrix} + \gamma \begin{bmatrix} \hat{k}_{S,x}(x_0^*, 0, \omega) \\ \hat{k}_{S,y}(x_0^*, 0, \omega) \\ 0 \end{bmatrix} , \quad 0 \leq \gamma \leq \infty \tag{18} \]

which is the parametric definition of a ray starting at $\mathbf{x}_0^*$ with the direction $\hat{\mathbf{k}}_S(x_0^*, 0, \omega)$. For asymptotically high frequencies, the reproduced sound field along the given ray is mainly determined by the secondary source located at $\mathbf{x}_0^*$. As shown in Fig. 3a, the rays (coloured lines) are perpendicular to the wavefront curvature of the synthesised sound fields. Fig. 3e shows the normalised error

\[ \hat{\varepsilon}_{\mathrm{SPA}}(x, y, \omega) = 20 \log_{10} \left| \frac{P_{\mathrm{SPA}}(x, y, \omega) - P(x, y, \omega)}{P(x, y, \omega)} \right| \tag{19} \]

of the reproduced sound field and its SPA for a distinct $\mathbf{x}_0^*$ (green circle). Along the corresponding ray (green line), the error is significantly reduced, which indicates that the approximation can be regarded as reasonable.

Fig. 3: The plots (a)–(d) show the real part of the aliasing components $P^S_\eta(x, y, \omega)$, see (23). A discrete linear SSD along the x-axis with $\Delta_x = 0.5$ m is assumed. The zeroth aliasing component $P^S_0(x, y, \omega)$ is equivalent to the sound field $P(x, y, \omega)$ synthesised by a continuous linear SSD, see (10). Each coloured line indicates the positions $\mathbf{x}$ for which the secondary source at the start of the respective line (circles) is the stationary secondary source. The yellow circles mark secondary sources, for which the condition in (27) is not fulfilled. For the secondary source at the green circle, the normalised error $\hat{\varepsilon}_{\eta,\mathrm{SPA}}(\mathbf{x}, \omega)$, see (19), between $P^S_\eta(x, y, \omega)$ and its approximation $P^S_{\eta,\mathrm{SPA}}(x, y, \omega)$ given by (24) is shown in the plots (e)–(h). Due to the axial symmetry, the plots for negative $\eta$ can be generated by negating the x-coordinate. (i) shows the real part of $S(x, y, \omega)$ for a monochromatic ($f = 1.5$ kHz) point source located at $\mathbf{x}_{\mathrm{ps}} = [0, -1, 0]^T$ m which radiates the virtual sound field. (Panels: (a)–(d) $P^S_0$ to $P^S_3$; (e)–(h) $\hat{\varepsilon}_{0,\mathrm{SPA}}$ to $\hat{\varepsilon}_{3,\mathrm{SPA}}$; (i) $S$. Axes: x/m, y/m; colour bar in dB.)

B. Discrete Linear Secondary Source Distribution

The practical application of SFS implies the discretisation of the SSD since the spacing between adjacent loudspeakers cannot be chosen arbitrarily small. For a uniform discretisation of a linear SSD, the reproduced sound field $P(x, y, \omega)$ is approximated by

\[ P^S(x, y, \omega) = \sum_{n=-\infty}^{\infty} D(n\Delta_x, \omega)\, G(x - n\Delta_x, y, \omega)\, \Delta_x \tag{20} \]

with the sampling distance denoted by $\Delta_x$, see the geometry in Fig. 2b. A commonly used model to describe this sampling

process is the multiplication of the continuous quantity by a Dirac impulse comb [28, Sec. 11.3.1]. The sampled driving function reads

\[ D^S(x_0, \omega) = D(x_0, \omega)\, \Delta_x \sum_{n=-\infty}^{\infty} \delta(x_0 - n\Delta_x) \tag{21} \]

with $\delta(x_0 - n\Delta_x)$ being a Dirac impulse [28, Sec. 8.3] imposed at $n\Delta_x$. Note that $D^S(x_0, \omega)$ is still a continuous function, however only non-zero at integer multiples of $\Delta_x$. With the Fourier series of the Dirac impulse comb [28, Eq. (11.12)], the sampled driving function is rearranged to

\[ D^S(x_0, \omega) = \sum_{\eta=-\infty}^{\infty} \underbrace{D(x_0, \omega)\, e^{-j2\pi\eta\frac{x_0}{\Delta_x}}}_{D^S_\eta(x_0, \omega)} . \tag{22} \]

The $\eta$-th aliasing component of the discrete driving function is denoted by $D^S_\eta(x_0, \omega)$, whereas the 0-th component is the original continuous driving function. The $\eta$-th aliasing component of the sound field $P^S(x, y, \omega)$ synthesised by the discrete SSD is given by

\[ P^S_\eta(x, y, \omega) = \int_{-\infty}^{\infty} D^S_\eta(x_0, \omega)\, G(x - x_0, y, \omega)\, dx_0 . \tag{23} \]

Superimposing $P^S_\eta(x, y, \omega)$ for all $\eta$ will result in $P^S(x, y, \omega)$ given by (20).

Since the aliasing components are individually treatable, they can be separately approximated via the SPA (cf. (13))

\[ P^S_{\eta,\mathrm{SPA}}(x, y, \omega) = D^S_\eta(x_0^*, \omega)\, G(x - x_0^*, y, \omega) \cdot \sqrt{\frac{2\pi}{|\phi''_S(x_0^*, 0, \omega) + \phi''_G(x - x_0^*, y, \omega)|}} \cdot e^{+j\frac{\pi}{4}\,\mathrm{sgn}\left(\phi''_S(x_0^*, 0, \omega) + \phi''_G(x - x_0^*, y, \omega)\right)} . \tag{24} \]

Compared to (14), the condition for the stationary phase point $x_0^*$ is extended with an additional phase term caused by the complex exponential of $D^S_\eta(x_0, \omega)$, see (22). The equivalent condition for the x-components of the normalised local wavenumber vectors is then given by

\[ \hat{k}_{S,x}(x_0^*, 0, \omega) + \frac{\eta c}{\Delta_x f} \overset{!}{=} \hat{k}_{G,x}(x - x_0^*, y, \omega) . \tag{25} \]

The same approach as in Sec. III-A is taken to solve the equation for $\mathbf{x}$. The corresponding ray equation for the aliasing components reads

\[ \begin{bmatrix} x \\ y \\ 0 \end{bmatrix} = \begin{bmatrix} x_0^* \\ 0 \\ 0 \end{bmatrix} + \gamma \begin{bmatrix} \hat{k}_{S,x}(x_0^*, 0, \omega) + \frac{\eta c}{\Delta_x f} \\ \sqrt{1 - \left( \hat{k}_{S,x}(x_0^*, 0, \omega) + \frac{\eta c}{\Delta_x f} \right)^2} \\ 0 \end{bmatrix} . \tag{26} \]

It is evident from the square-root term (defining the y-component of the ray’s direction vector) that the condition

\[ \left| \hat{k}_{S,x}(x_0^*, 0, \omega) + \frac{\eta c}{\Delta_x f} \right| \leq 1 \tag{27} \]

has to be fulfilled for a real-valued solution. Otherwise the $\eta$-th aliasing component is not excited by the secondary source located at $\mathbf{x}_0^*$. Fig. 3b-3d show examples for the aliasing components and their corresponding ray approximations. As for the continuous SSD, the rays are locally perpendicular to the wavefront curvature of the sound fields. The normalised error between the aliasing component and its SPA is significantly reduced along the given ray, again indicating that the SPA is valid, see Fig. 3f-3h.

Fig. 4: (a) shows a synthesis scenario for a discrete convex Secondary Source Distribution (SSD) (black arc) analogous to Fig. 2b. The sampling process and the involved quantities are described in Sec. III-C. (b) illustrates the involved quantities for the estimation of the aliasing frequency of an extended listening area $\Omega_l$, see Sec. IV-B.

C. Discrete Convex Secondary Source Distribution

The presented model for the linear SSD will now be extended towards general convex boundaries, including non-uniform sampling of the SSD. The boundary $\partial\Omega$ is described as a curve $\mathbf{x}_0(u)$ depending on the parameter $u \in [a, b]$, see Fig. 4a. The component-wise derivative of $\mathbf{x}_0$ w.r.t. $u$ is denoted as $\mathbf{x}'_0$. It is oriented along the unit tangent vector $\mathbf{t}_0$. The inward pointing boundary normal vector $\mathbf{n}_0$ is perpendicular to $\mathbf{x}'_0$ and $\mathbf{t}_0$. The SLP of (7) generalises to the line integral

\[ P(\mathbf{x}, \omega) = \int_a^b D(\mathbf{x}_0(u), \omega)\, G(\mathbf{x} - \mathbf{x}_0(u), \omega)\, |\mathbf{x}'_0(u)|\, du . \tag{28} \]

There exists an infinite number of parametrisations describing the same boundary. For example, $\mathbf{x}_0 = [u, 0, 0]^T$ and $\mathbf{x}_0 = [u^3, 0, 0]^T$ define the same linear SSD for $u \in [-\infty, \infty]$. However, an equidistant sampling w.r.t. $u$ would lead to different sampling schemes w.r.t. $\mathbf{x}_0$. Limiting the upcoming discussion to equidistant sampling for $u$ is sufficient since any deterministic non-uniform scheme can be realised with a suitable parametrisation. Analogous to the linear SSD, the sampling results in aliasing components for the driving function and synthesised sound field. For the SPA of the sound field, the stationary phase point $u^*$ has to fulfil

\[ \eta\, \frac{2\pi}{\Delta_u} = \left. \frac{\partial \left( \phi_S(\mathbf{x}_0(u), \omega) + \phi_G(\mathbf{x} - \mathbf{x}_0(u), \omega) \right)}{\partial u} \right|_{u = u^*} , \tag{29} \]

with $\Delta_u$ being the sampling distance in the u-domain. The chain rule for differentiation is used together with (2) to formulate the equivalent condition

\[ \langle \mathbf{x}'^*_0 \,|\, \mathbf{k}_S(\mathbf{x}_0^*) \rangle + \eta\, \frac{2\pi}{\Delta_u} \overset{!}{=} \langle \mathbf{x}'^*_0 \,|\, \mathbf{k}_G(\mathbf{x} - \mathbf{x}_0^*) \rangle \tag{30} \]

2329-9290 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TASLP.2019.2892895, IEEE/ACM
Transactions on Audio, Speech, and Language Processing
IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING 6

1: function A LIASING(Ω, x) 1: function A LIASING E XTENDED(Ω, Ωl )


2: fS ← ∞ 2: fΩSl ← ∞
3: for x0 , x00 ← ∂Ω+ do . (28), (37), densely sampled 3: for x0 , x00 ← ∂Ω+ do . (28), (40), densely sampled
4: ∆x0 ← ∆u |x00 | 4: ∆x0 ← ∆u |x00 |
c min max
5: f← . (35) 5: k̂G,t0
, k̂G,t0
← M IN M AX WAVENUMBER(Ωl , x0 )
c

∆x0 G,t0 (x − x0 ) − k̂S,t0 (x0 )
f min ← . (38)

6:
f S ← min(f S ; f )
min
. (37)

6: ∆x0 k̂G,t − k̂S,t0 (x0 )
0
7: c
end for 7: f max ← max . (38)
return f S

8: ∆x k̂G,t − k̂S,t (x0 )
0 0 0
9: end function 8: fΩSl ← min(fΩSl ; f min ; f max ) . (40)
Fig. 5: Brute-force search algorithm to determine the aliasing fre- 9: end for
S
quency f (x) given by (35) and (37). 10: return fΩSl
11: end function
Fig. 6: Brute-force search algorithm to determine the aliasing fre-
for the local wavenumber vectors. The asterisk denotes the S
quency fΩl for an extended listening area Ωl given by (38) and
according entities evaluated at u∗ . Normalising all involved (40). Major modifications compared to the algorithm in Fig. 5
vectors while preserving equality yields are marked by bold line numbers. An example of the function
ηc M IN M AX WAVENUMBER for a circular region is given in Fig. 8.
!
h t∗0 | k̂S (x∗0 ) i + = h t∗0 | k̂G (x − x∗0 ) i, (31)
| {z } ∆x0 (x∗0 )f | {z }
∗ ∗
k̂S,t (x0 )
0
k̂G,t (x−x0 )
0 be used to derive the highest frequency up to which no
propagating spatial aliasing artefacts occur. It is commonly
where k̂·,t0 denotes the tangential component of the respective
referred to as the spatial aliasing frequency. Exceeding this
vector, see Fig. 4a. The length of x00 and the sampling distance
frequency can be regarded as a violation of the anti-aliasing
∆u are combined to ∆x0 (x0 ) = |x00 |∆u , which can be
criterion. In the following, different aliasing frequencies are
interpreted as the local sampling distance in Cartesian space.
derived. For practical relevance, lower bounds for the aliasing
The equation establishes a connection between the tangential
frequency covering arbitrary virtual sound fields are explicitly
components of the normalised local wave vectors. Analogous
formulated. In the calculus, the asterisk of the stationary phase
to their $x$ and $y$ components, the tangential and normal components of unit vectors cannot be chosen independently. Hence, (31) uniquely defines $\hat{\mathbf{k}}_G(\mathbf{x} - \mathbf{x}_0^*)$ for $\mathbf{x} \in \Omega$. Solving it for $\mathbf{x}$ yields the desired ray equation

$$\mathbf{x} = \mathbf{x}_0^* + \gamma\, \hat{\mathbf{k}}_\eta^S(\mathbf{x}_0^*, \omega), \quad 0 \le \gamma \le \infty \tag{32a}$$

with

$$\hat{\mathbf{k}}_\eta^S(\mathbf{x}_0^*, \omega) = \mathbf{R}_0^* \begin{bmatrix} \hat{k}_{S,t_0}(\mathbf{x}_0^*) + \dfrac{\eta c}{\Delta_{x_0}(\mathbf{x}_0^*) f} \\[2ex] \sqrt{1 - \left( \hat{k}_{S,t_0}(\mathbf{x}_0^*) + \dfrac{\eta c}{\Delta_{x_0}(\mathbf{x}_0^*) f} \right)^2} \\[2ex] 0 \end{bmatrix}. \tag{32b}$$

The rotation matrix $\mathbf{R}_0 = [\mathbf{t}_0\ \mathbf{n}_0\ \mathbf{u}_z]$ contains the listed vectors as its columns. Analogous to the discrete linear SSD in Sec. III-B, a real-valued solution for the rays' direction only exists, if

$$\left| \hat{k}_{S,t_0}(\mathbf{x}_0^*) + \frac{\eta c}{\Delta_{x_0}(\mathbf{x}_0^*) f} \right| \le 1 \tag{33}$$

is fulfilled. For cross-validation of the calculus, a uniformly sampled linear SSD may be chosen as a special case of the convex SSD: $\mathbf{x}_0 = [u, 0, 0]^{\mathrm{T}}$, $\Delta_u = \Delta_x$, $\mathbf{t}_0 = \mathbf{u}_x$, $\mathbf{n}_0 = \mathbf{u}_y$, $\mathbf{R}_0^* = \mathbf{I}$ (identity matrix), and $\hat{k}_{\cdot,t_0} = \hat{k}_{\cdot,x}$. Thus, (26) is indirectly a special case of (32).

IV. ESTIMATION OF THE SPATIAL ALIASING FREQUENCY

In Sec. III, the connection between the listening position $\mathbf{x}$, the secondary source position $\mathbf{x}_0$, the sampling distance $\Delta_{x_0}(\mathbf{x}_0)$, and the temporal frequency $f$ was established for the occurrence of spatial aliasing. This relation will now

point for $\mathbf{x}_0$ is skipped for the sake of brevity.

A. Aliasing Frequency at the Position x

Solving (31) for $f$ yields the frequency

$$f_\eta^S(\mathbf{x}, \mathbf{x}_0) = \frac{\eta c}{\Delta_{x_0}(\mathbf{x}_0) \left( \hat{k}_{G,t_0}(\mathbf{x} - \mathbf{x}_0) - \hat{k}_{S,t_0}(\mathbf{x}_0) \right)}, \tag{34}$$

at which the secondary source located at $\mathbf{x}_0$ considerably contributes the $\eta$-th aliasing component $P_\eta^S$ to a distinct position $\mathbf{x}$ inside the target region. Since the aliasing frequency defines the bound up to which no aliasing is contributed to $\mathbf{x}$, the minimum of $|f_{\eta \neq 0}^S(\mathbf{x}, \mathbf{x}_0)|$ over all aliasing components has to be considered. For this pair of listening and secondary source positions it reads

$$f^S(\mathbf{x}, \mathbf{x}_0) = \frac{c}{\Delta_{x_0}(\mathbf{x}_0) \left| \hat{k}_{G,t_0}(\mathbf{x} - \mathbf{x}_0) - \hat{k}_{S,t_0}(\mathbf{x}_0) \right|}. \tag{35}$$

An infinite aliasing frequency is obtained, when $\hat{k}_{G,t_0}(\mathbf{x} - \mathbf{x}_0) = \hat{k}_{S,t_0}(\mathbf{x}_0)$. This is fulfilled, if the direction of $\mathbf{x}$ relative to the secondary sources is aligned with the propagation direction of the virtual sound field $\hat{\mathbf{k}}_S(\mathbf{x}_0)$. As $\hat{k}_{G,t_0}(\mathbf{x} - \mathbf{x}_0)$ and $\hat{k}_{S,t_0}(\mathbf{x}_0)$ do not exceed $\pm 1$, the frequency is lower bounded by

$$f^S(\mathbf{x}, \mathbf{x}_0) \ge \frac{c}{\Delta_{x_0}(\mathbf{x}_0) \left( 1 + |\hat{k}_{G,t_0}(\mathbf{x} - \mathbf{x}_0)| \right)} \ge \frac{c}{2 \Delta_{x_0}(\mathbf{x}_0)}. \tag{36}$$

The first inequality defines the lower bound for arbitrary virtual sound fields. Additionally, arbitrary positions $\mathbf{x}$ relative to $\mathbf{x}_0$ are included by the second bound. It corresponds to the well-known half-wavelength sampling criterion $\Delta_{x_0}(\mathbf{x}_0) \le \lambda/2$.
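As a minimal numerical sketch of (35) and (36), the following Python snippet evaluates the aliasing frequency for a single pair of listening and secondary source positions. It assumes, for illustration only, that the tangential component of the Green's function wavenumber vector is the projection onto the tangent of the unit vector pointing from the secondary source to the listener, that the tangential component of the virtual sound field is supplied by the caller, and that c = 343 m/s. All function and parameter names are hypothetical.

```python
import math

def tangential_component(x, x0, t0):
    """Tangential component of the unit vector from x0 to x.

    Assumption: this unit vector plays the role of the normalised
    wavenumber vector of the Green's function in (35).
    """
    dx, dy = x[0] - x0[0], x[1] - x0[1]
    return (dx * t0[0] + dy * t0[1]) / math.hypot(dx, dy)

def aliasing_frequency_pair(x, x0, t0, k_s_t, dx0, c=343.0):
    """Aliasing frequency (35) for one listener/source pair.

    k_s_t : tangential component of the virtual sound field's
            normalised wavenumber vector at x0 (between -1 and 1)
    dx0   : local sampling distance at x0 in metres
    c     : speed of sound in m/s (assumed value)
    """
    diff = abs(tangential_component(x, x0, t0) - k_s_t)
    if diff == 0.0:
        return math.inf  # the listener lies on the ray of this source
    return c / (dx0 * diff)

def lower_bounds(x, x0, t0, dx0, c=343.0):
    """Both bounds of (36): first for arbitrary sound fields,
    then additionally for arbitrary listening positions."""
    k_g_t = tangential_component(x, x0, t0)
    return c / (dx0 * (1.0 + abs(k_g_t))), c / (2.0 * dx0)

# Example: source at the origin of an array along the x-axis,
# listener 1 m in front of and 1 m beside the source.
x0, t0, dx0 = (0.0, 0.0), (1.0, 0.0), 0.2
x = (1.0, 1.0)
f_pair = aliasing_frequency_pair(x, x0, t0, k_s_t=0.0, dx0=dx0)
bound_field, bound_half = lower_bounds(x, x0, t0, dx0)
print(f_pair, bound_field, bound_half)
```

In this example the pairwise frequency stays above both bounds, and the second bound equals c/(2 dx0), i.e. the half-wavelength criterion.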


The aliasing frequency $f^S(\mathbf{x})$ for the position $\mathbf{x}$ is defined as the frequency up to which no secondary source contributes any aliasing to $\mathbf{x}$. Hence, the minimum of $f^S(\mathbf{x}, \mathbf{x}_0)$ over all secondary sources defines this frequency

$$f^S(\mathbf{x}) = \min_{\mathbf{x}_0 \in \partial\Omega_+} f^S(\mathbf{x}, \mathbf{x}_0). \tag{37}$$

The minimisation is carried out over the part of the boundary $\partial\Omega_+$ where the secondary source selection criterion $a_S(\mathbf{x})$ yields unity, see Eq. (9). Analytical solutions to the minimisation problem for elementary virtual sound fields $S(\mathbf{x}, \omega)$, e.g. point sources and plane waves, and simple geometries of the SSD are subject to further research. In order to illustrate the principle of the prediction model, it is sufficient to use a brute-force minimisation on a dense grid of $\mathbf{x}_0$. The algorithm used to predict the aliasing frequency is given in Fig. 5. While it is not the most efficient approach to a specific scenario, this method is feasible for the scenarios that are investigated in Sec. V. Moreover, numerical approaches like brute-force or iterative optimisation algorithms might be the only alternative for complex scenarios (w.r.t. sound field and SSD geometry) where an analytical solution cannot be derived in closed form.

B. Aliasing Frequency for an Extended Listening Area

So far, the aliasing frequency for a distinct position $\mathbf{x} \in \Omega$ has been discussed. It is of further interest to find anti-aliasing conditions for the area $\Omega_l \subseteq \Omega$. Fig. 4b shows an exemplary geometry. As a starting point, the aliasing frequency $f^S(\mathbf{x}, \mathbf{x}_0)$ for a distinct pair of $\mathbf{x}$ and $\mathbf{x}_0$ is considered, see (35). The minimum over all listening positions $\mathbf{x}$ inside $\Omega_l$ yields the aliasing frequency $f^S_{\Omega_l}(\mathbf{x}_0)$ for a secondary source not radiating any aliasing components into $\Omega_l$. For a convex boundary $\partial\Omega$, the angle between the normalised wavenumber vector $\hat{\mathbf{k}}_G(\mathbf{x} - \mathbf{x}_0)$ and the tangent vector $\mathbf{t}_0$ is in the range $[0, \pi]$. Hence, the tangential component $\hat{k}_{G,t_0}(\mathbf{x} - \mathbf{x}_0)$ as the cosine of this angle is a monotonically decreasing function of the angle. Searching for the minimum w.r.t. $\mathbf{x}$, only the extremal values $\hat{k}^{\min}_{G,t_0}(\mathbf{x}_0)$ and $\hat{k}^{\max}_{G,t_0}(\mathbf{x}_0)$ have to be considered, see Fig. 4b. The aliasing frequency is given by

$$f^S_{\Omega_l}(\mathbf{x}_0) = \min_{\mathbf{x} \in \Omega_l} f^S(\mathbf{x}, \mathbf{x}_0) = \frac{c}{\Delta_{x_0}(\mathbf{x}_0)} \cdot \min\left( \frac{1}{\left| \hat{k}^{\max}_{G,t_0}(\mathbf{x}_0) - \hat{k}_{S,t_0}(\mathbf{x}_0) \right|} ; \frac{1}{\left| \hat{k}^{\min}_{G,t_0}(\mathbf{x}_0) - \hat{k}_{S,t_0}(\mathbf{x}_0) \right|} \right). \tag{38}$$

It can be seen that for fixed shape and size of $\Omega_l$, the angular width between $\hat{k}^{\min}_{G,t_0}(\mathbf{x}_0)$ and $\hat{k}^{\max}_{G,t_0}(\mathbf{x}_0)$ decreases with increasing distance of $\Omega_l$ from $\mathbf{x}_0$. For the limiting case, both extremal values coincide.

Analogous to (36), the lower bound of $f^S_{\Omega_l}(\mathbf{x}_0)$ for arbitrary virtual sound fields

$$f^S_{\Omega_l}(\mathbf{x}_0) \ge \frac{c}{\Delta_{x_0}(\mathbf{x}_0) \left( 1 + \max\left( |\hat{k}^{\max}_{G,t_0}(\mathbf{x}_0)| ; |\hat{k}^{\min}_{G,t_0}(\mathbf{x}_0)| \right) \right)} \tag{39}$$

is found by inserting the extreme values for $\hat{k}_{S,t_0}(\mathbf{x}_0)$ into (38). A further generalisation towards arbitrary listening areas $\Omega_l$ yields a lower bound corresponding to the half-wavelength sampling criterion, again.

[Figure 7: sketches (a), (b), (c) of a circular listening region $C_l$ with centre $\mathbf{x}_l$ and radius $R_l$ relative to the secondary source at $\mathbf{x}_0$, showing $\mathbf{t}_0$, $\mathbf{n}_0$, $\hat{k}^{\min}_{G,t_0}$, and $\hat{k}^{\max}_{G,t_0}$]

Fig. 7: The three sketches illustrate the three different cases that have to be considered for the computation of $\hat{k}^{\max}_{G,t_0}(\mathbf{x}_0)$ and $\hat{k}^{\min}_{G,t_0}(\mathbf{x}_0)$ for a circular listening region $C_l$. In (a), $\mathbf{x}_0$ is part of the circle. For the second case depicted in (b), $\mathbf{x}_0$ is not part of the circle, but the circle intersects with the boundary $\partial\Omega$. The circle is completely inside $\Omega$ in (c).

function MINMAXWAVENUMBERCIRCLE($C_l$, $\mathbf{x}_0$)
    $\mathbf{x}_l, R_l \leftarrow C_l$
    $\varrho \leftarrow R_l / |\mathbf{x}_l - \mathbf{x}_0|$
    $\hat{k}_{l,t_0} \leftarrow \hat{k}_{G,t_0}(\mathbf{x}_l - \mathbf{x}_0)$
    if $\varrho > 1$ or $-\sqrt{1 - \varrho^2} > \hat{k}_{l,t_0}$ then
        $\hat{k}^{\min}_{G,t_0} \leftarrow -1$
    else
        $\hat{k}^{\min}_{G,t_0} \leftarrow \hat{k}_{l,t_0} \sqrt{1 - \varrho^2} - \varrho \sqrt{1 - \hat{k}_{l,t_0}^2}$
    end if
    if $\varrho > 1$ or $+\sqrt{1 - \varrho^2} < \hat{k}_{l,t_0}$ then
        $\hat{k}^{\max}_{G,t_0} \leftarrow 1$
    else
        $\hat{k}^{\max}_{G,t_0} \leftarrow \hat{k}_{l,t_0} \sqrt{1 - \varrho^2} + \varrho \sqrt{1 - \hat{k}_{l,t_0}^2}$
    end if
    return $\hat{k}^{\min}_{G,t_0}$, $\hat{k}^{\max}_{G,t_0}$
end function

Fig. 8: Algorithm to determine the minimum and maximum tangential component of the local wavenumber vector for a secondary source position $\mathbf{x}_0$ and a circular region $C_l$ with radius $R_l$ and centre $\mathbf{x}_l$. The derivation is given in App. I.

The aliasing frequency for $\Omega_l$ as the minimum over all active secondary sources reads

$$f^S_{\Omega_l} = \min_{\mathbf{x}_0 \in \partial\Omega_+} f^S_{\Omega_l}(\mathbf{x}_0). \tag{40}$$

The algorithm to determine this aliasing frequency is given in Fig. 6. Compared to the baseline algorithm in Fig. 5, it is augmented by the function MINMAXWAVENUMBER($\Omega_l$, $\mathbf{x}_0$) in line 5. It determines $\hat{k}^{\min}_{G,t_0}(\mathbf{x}_0)$ and $\hat{k}^{\max}_{G,t_0}(\mathbf{x}_0)$ for a given secondary source position $\mathbf{x}_0$ and listening area $\Omega_l$. For arbitrary shapes of $\Omega_l$, this determination is challenging as it requires finding the locations on $\partial\Omega_l$ whose tangent intersects with $\partial\Omega$ at $\mathbf{x}_0$. A circular listening area $C_l$, chosen for its practical relevance, simplifies the following discussion. Moreover, it is often regarded as an approximation of the listener's head. As shown in Fig. 7, three different cases have to be considered for the circular area centred at $\mathbf{x}_l \in \Omega$ with radius $R_l$. In Fig. 7a, the distance between $\mathbf{x}_l$ and $\mathbf{x}_0$ is smaller than the radius $R_l$. The secondary source is located inside the circle. No further restriction is applied to the


tangential components and they take the respective extremal values of $\pm 1$. A similar scenario is shown in Fig. 7b, where $R_l$ and/or the angle between $\mathbf{x}_l - \mathbf{x}_0$ and the normal vector $\mathbf{n}_0$ are large enough for the circle to be partly outside $\Omega$. Depending on the halfspace (w.r.t. $\mathbf{n}_0$) in which $\mathbf{x}_l$ is located, either the $\hat{k}^{\min}_{G,t_0}(\mathbf{x}_0)$ or the $\hat{k}^{\max}_{G,t_0}(\mathbf{x}_0)$ component reaches its extremal value. The last alternative depicted in Fig. 7c covers the case, where the circular area is completely inside $\Omega$. The derivation of $\hat{k}^{\min}_{G,t_0}(\mathbf{x}_0)$ and $\hat{k}^{\max}_{G,t_0}(\mathbf{x}_0)$ for the three cases is given in App. I. The resulting algorithm to determine $\hat{k}^{\min}_{G,t_0}(\mathbf{x}_0)$ and $\hat{k}^{\max}_{G,t_0}(\mathbf{x}_0)$ is given in Fig. 8.

C. Increased Aliasing Frequency

After the derivations for a distinct listening position and extended listening area, the discussion is now steered towards the potential increase of the aliasing frequency. The geometry for the following explanation is shown in Fig. 9a: According to the ray model of SFS introduced in Sec. III, an individual secondary source at $\mathbf{x}_0$ mainly contributes to the synthesised sound field along the ray whose direction is determined by the propagation direction of the virtual sound field at $\mathbf{x}_0$ (green and orange arrows). An SFS method may prioritise the synthesis inside an extended control region $\Omega_c$ which is not necessarily equal to the listening region $\Omega_l$. A secondary source contributes to the reproduced sound field inside $\Omega_c$, if its ray intersects with $\Omega_c$. Hence, only a subset of the SSD is required to synthesise the virtual sound field inside $\Omega_c$. The secondary source at $\mathbf{x}_0$ has a non-zero driving signal, only if $\hat{k}_{S,t_0}(\mathbf{x}_0)$ lies between $\hat{k}^{\min}_{S,t_0}(\mathbf{x}_0)$ and $\hat{k}^{\max}_{S,t_0}(\mathbf{x}_0)$, see Fig. 9b. These bounds depend on the shape and position of $\Omega_c$ relative to $\mathbf{x}_0$. The aliasing frequency of (38) is thus modified to

$$f^S_{\Omega_l|\Omega_c}(\mathbf{x}_0) = \begin{cases} f^S_{\Omega_l}(\mathbf{x}_0) & \text{for } \hat{k}^{\min}_{S,t_0}(\mathbf{x}_0) \le \hat{k}_{S,t_0}(\mathbf{x}_0) \le \hat{k}^{\max}_{S,t_0}(\mathbf{x}_0) \\ \infty & \text{otherwise.} \end{cases} \tag{41}$$

[Figure 9: sketches (a) and (b) of the control region $\Omega_c$, the secondary source at $\mathbf{x}_0$ with $\mathbf{t}_0$ and $\mathbf{n}_0$, and the bounds $\hat{k}^{\min}_{S,t_0}$ and $\hat{k}^{\max}_{S,t_0}$]

Fig. 9: (a) illustrates the secondary sources (green and orange loudspeaker symbols), which have to be active in order to synthesise the corresponding virtual point sources (green and orange dot) inside the control region $\Omega_c$. The local wavenumber vectors $\hat{\mathbf{k}}_S(\mathbf{x}_0)$ at each secondary source position are given by the arrows. (b) shows the bounds of $\hat{\mathbf{k}}_S(\mathbf{x}_0)$ for which the secondary source at $\mathbf{x}_0$ still has to be active.

To derive the lower bound for $f^S_{\Omega_l|\Omega_c}(\mathbf{x}_0)$ covering arbitrary virtual sound fields, the extreme values $\hat{k}^{\min}_{S,t_0}(\mathbf{x}_0)$ and $\hat{k}^{\max}_{S,t_0}(\mathbf{x}_0)$, for which $f^S_{\Omega_l|\Omega_c}(\mathbf{x}_0)$ is finite, have to be considered. Substituting $\hat{k}_{S,t_0}(\mathbf{x}_0)$ with these values in (38) yields

$$f^S_{\Omega_l|\Omega_c}(\mathbf{x}_0) \ge \frac{c}{\Delta_{x_0}(\mathbf{x}_0)} \cdot \min\left( \frac{1}{|\hat{k}^{\max}_{G,t_0}(\mathbf{x}_0) - \hat{k}^{\min}_{S,t_0}(\mathbf{x}_0)|} ; \frac{1}{|\hat{k}^{\max}_{S,t_0}(\mathbf{x}_0) - \hat{k}^{\min}_{G,t_0}(\mathbf{x}_0)|} \right). \tag{42}$$

The bound for $f^S_{\Omega_l|\Omega_c}(\mathbf{x}_0)$ is greater than or equal to the lower bound for $f^S_{\Omega_l}(\mathbf{x}_0)$ given by (39). This can be proven by inserting $\pm 1$ for $\hat{k}^{\min}_{S,t_0}(\mathbf{x}_0)$ and $\hat{k}^{\max}_{S,t_0}(\mathbf{x}_0)$ as worst case scenarios. Hence, actively prioritising the accurate synthesis inside $\Omega_c$ potentially increases the aliasing frequency for the region $\Omega_l$. However, $\Omega_c$ has to be sensibly chosen in order to provide correct synthesis in $\Omega_l$. If $\Omega_l = \Omega_c$, a comparison of Fig. 9b with Fig. 4b reveals that $\hat{k}^{\min}_{S,t_0}(\mathbf{x}_0)$ and $\hat{k}^{\max}_{S,t_0}(\mathbf{x}_0)$ coincide with $\hat{k}^{\min}_{G,t_0}(\mathbf{x}_0)$ and $\hat{k}^{\max}_{G,t_0}(\mathbf{x}_0)$, respectively. Hence, the bound can be further simplified to

$$f^S_{\Omega_l|\Omega_c=\Omega_l}(\mathbf{x}_0) \ge \frac{c}{\Delta_{x_0}(\mathbf{x}_0) \left( \hat{k}^{\max}_{G,t_0}(\mathbf{x}_0) - \hat{k}^{\min}_{G,t_0}(\mathbf{x}_0) \right)}. \tag{43}$$

In general, the aliasing frequency for a certain combination of $\Omega_l$ and $\Omega_c$ as the minimum over all active secondary sources reads

$$f^S_{\Omega_l|\Omega_c} = \min_{\mathbf{x}_0 \in \partial\Omega_+} f^S_{\Omega_l|\Omega_c}(\mathbf{x}_0). \tag{44}$$

The algorithm to estimate this frequency is shown in Fig. 10. The determination of $\hat{k}^{\min}_{S,t_0}(\mathbf{x}_0)$ and $\hat{k}^{\max}_{S,t_0}(\mathbf{x}_0)$ follows the same paradigm as for $\hat{k}^{\min}_{G,t_0}(\mathbf{x}_0)$ and $\hat{k}^{\max}_{G,t_0}(\mathbf{x}_0)$ in Sec. IV-B. Hence, the same abstract function MINMAXWAVENUMBER may be used. Again, the concrete implementation for arbitrary shapes of $\Omega_c$ is challenging and a circular control region $C_c$ with radius $R_c$ and centre $\mathbf{x}_c$ may be considered for simplification. In this case, the procedure given by Fig. 8 can be reused.

 1: function ALIASINGEXTENDEDCONTROL($\Omega$, $\Omega_l$, $\Omega_c$)
 2:     $f^S_{\Omega_l} \leftarrow \infty$
 3:     for $\mathbf{x}_0, \mathbf{x}_0' \leftarrow \partial\Omega_+$ do    ▷ (28), (44), densely sampled
 4:         $\hat{k}^{\min}_{S,t_0}, \hat{k}^{\max}_{S,t_0} \leftarrow$ MINMAXWAVENUMBER($\Omega_c$, $\mathbf{x}_0$)
 5:         if $\hat{k}_{S,t_0}(\mathbf{x}_0) < \hat{k}^{\min}_{S,t_0}$ or $\hat{k}_{S,t_0}(\mathbf{x}_0) > \hat{k}^{\max}_{S,t_0}$ then
 6:             continue    ▷ next iteration
 7:         end if
 8:         $\Delta_{x_0} \leftarrow |\Delta_u \mathbf{x}_0'|$
 9:         $\hat{k}^{\min}_{G,t_0}, \hat{k}^{\max}_{G,t_0} \leftarrow$ MINMAXWAVENUMBER($\Omega_l$, $\mathbf{x}_0$)
10:         $f^{\min} \leftarrow c / \left( \Delta_{x_0} |\hat{k}^{\min}_{G,t_0} - \hat{k}_{S,t_0}(\mathbf{x}_0)| \right)$    ▷ (38)
11:         $f^{\max} \leftarrow c / \left( \Delta_{x_0} |\hat{k}^{\max}_{G,t_0} - \hat{k}_{S,t_0}(\mathbf{x}_0)| \right)$    ▷ (38)
12:         $f^S_{\Omega_l} \leftarrow \min(f^S_{\Omega_l}; f^{\min}; f^{\max})$    ▷ (40)
13:     end for
14:     return $f^S_{\Omega_l}$
15: end function

Fig. 10: Generic brute-force search algorithm to determine the aliasing frequency $f^S_{\Omega_l|\Omega_c}$ given by (44). Major modifications compared to the algorithm in Fig. 6 are marked by bold line numbers. An example of MINMAXWAVENUMBER for a circular region is given in Fig. 8.
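The procedures of Fig. 8 and Fig. 10 can be sketched in Python for circular listening and control regions. This is a minimal illustration under stated assumptions and not the authors' implementation: positions are 2D tuples, the upper clipping to +1 is applied when the tangential component towards the circle centre exceeds $+\sqrt{1 - \varrho^2}$ (the mirror image of the lower test), the secondary source selection uses a simple sign test on the normal component of the point-source wavenumber vector, and c = 343 m/s. All names are hypothetical.

```python
import math

def min_max_wavenumber_circle(centre, radius, x0, t0):
    """Extremal tangential components for a circular region (cf. Fig. 8)."""
    dx, dy = centre[0] - x0[0], centre[1] - x0[1]
    dist = math.hypot(dx, dy)
    if dist <= radius:  # source on or inside the circle: no restriction
        return -1.0, 1.0
    rho = radius / dist
    k_l = (dx * t0[0] + dy * t0[1]) / dist
    root = math.sqrt(1.0 - rho ** 2)
    side = math.sqrt(max(0.0, 1.0 - k_l ** 2))
    k_min = -1.0 if -root > k_l else k_l * root - rho * side
    k_max = 1.0 if root < k_l else k_l * root + rho * side
    return k_min, k_max

def aliasing_extended_control(sources, listen, control, c=343.0):
    """Brute-force search over active sources (cf. Fig. 10).

    sources : iterable of (x0, t0, dx0, k_s_t) for already selected sources
    listen, control : (centre, radius) of the circular regions
    """
    f_alias = math.inf
    for x0, t0, dx0, k_s_t in sources:
        ks_min, ks_max = min_max_wavenumber_circle(control[0], control[1], x0, t0)
        if k_s_t < ks_min or k_s_t > ks_max:
            continue  # ray of this source misses the control region
        kg_min, kg_max = min_max_wavenumber_circle(listen[0], listen[1], x0, t0)
        for k_g in (kg_min, kg_max):
            diff = abs(k_g - k_s_t)
            if diff > 0.0:
                f_alias = min(f_alias, c / (dx0 * diff))
    return f_alias

def circular_ssd_point_source(radius, count, x_ps):
    """Uniform circular SSD driven for a virtual point source at x_ps."""
    sources = []
    for n in range(count):
        a = 2.0 * math.pi * n / count
        x0 = (radius * math.cos(a), radius * math.sin(a))
        t0 = (-math.sin(a), math.cos(a))
        n0 = (-math.cos(a), -math.sin(a))
        vx, vy = x0[0] - x_ps[0], x0[1] - x_ps[1]
        norm = math.hypot(vx, vy)
        k_s = (vx / norm, vy / norm)
        # Assumed selection rule: the source is active if the virtual
        # field propagates into the listening area at x0.
        if k_s[0] * n0[0] + k_s[1] * n0[1] > 0.0:
            k_s_t = k_s[0] * t0[0] + k_s[1] * t0[1]
            sources.append((x0, t0, 2.0 * math.pi * radius / count, k_s_t))
    return sources

sources = circular_ssd_point_source(1.5, 56, (0.0, 2.5))
head = ((0.0, 0.0), 0.085)
f_wfs = aliasing_extended_control(sources, head, ((0.0, 0.0), 10.0))
f_local = aliasing_extended_control(sources, head, head)
print(f_wfs, f_local)
```

A control circle that encloses the whole array leaves all selected sources active, whereas a control region equal to the small listening circle discards sources whose rays miss it. The second frequency can therefore never be lower than the first.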


[Figure 11: panels (a)-(d) and (e)-(h) for f = 1.0, 1.5, 2.0, and 2.5 kHz, panel (i) $f^S(\mathbf{x})$; axes x/m and y/m; colour bars in dB and kHz]

Fig. 11: The plots (a)-(d) show the real part of a virtual point source located at $\mathbf{x}_{\mathrm{ps}} = [0, 2.5, 0]^{\mathrm{T}}$ m synthesised by WFS for different frequencies. The corresponding error $\hat{\varepsilon}(\mathbf{x}, \omega)$ caused by aliasing is plotted in (e)-(h), see (47). For the positions above the solid black lines, the predicted anti-aliasing criterion involving $f^S(\mathbf{x})$ is violated. (i) shows the aliasing frequency $f^S(\mathbf{x})$ estimated by the algorithm in Fig. 5. A discrete colormap is used for better visibility.

[Figure 12: panels (a) and (b) show $\hat{\varepsilon}(\mathbf{x}, \omega)$ / dB over f/kHz from 1 to 10 for the position indices 0 to 9; panel (c) shows the evaluated positions]

Fig. 12: In (a) and (b), the solid lines show the error $\hat{\varepsilon}(\mathbf{x}, \omega)$ defined in (47) for the same synthesis scenario as in Fig. 11. The lines have been shifted to enhance visibility. The corresponding 0 decibel reference is indicated by horizontal dashed lines of the same color. The circles mark the estimated aliasing frequencies $f^S(\mathbf{x})$ given by (37). Plot (c) depicts the evaluated positions $\mathbf{x}$ with their corresponding index.

[Figure 13: panels (a)-(d) and (e)-(h) for f = 1.0, 2.0, 3.0, and 4.0 kHz, panel (i) $f^S(\mathbf{x})$; axes x/m and y/m; colour bars in dB and kHz]

Fig. 13: The plots show the quantities analogous to Fig. 11 for a virtual focused point source located at $\mathbf{x}_{\mathrm{fs}} = [0, 0.75, 0]^{\mathrm{T}}$ m with the orientation $\mathbf{n}_{\mathrm{fs}} = [0, -1, 0]^{\mathrm{T}}$ (yellow cross and arrow). For the positions outside the area surrounded by the black lines, the anti-aliasing criterion involving $f^S(\mathbf{x})$ is violated, see (37). In addition, the cyan circle indicates the aliasing-free area according to (48).


V. APPLICATION EXAMPLES OF THE MODEL

To further study its performance, the model is applied to different SFS methods. The predicted aliasing frequency will be compared to numerical simulations of the synthesised sound fields as well as to results of other theoretical treatises in the literature. Here, a uniformly discretised circular SSD of radius $R$ is assumed for most of the examples. It is chosen since some SFS methods, e.g. NFC-HOA, are limited to this geometry. The SLP for a continuous, circular SSD specialises to

$$P(\mathbf{x}, \omega) = \int_0^{2\pi} D(\mathbf{x}_0, \omega)\, G(\mathbf{x} - \mathbf{x}_0, \omega)\, R \,\mathrm{d}\alpha_0, \tag{45}$$

with the secondary source positions $\mathbf{x}_0 = R[\cos\alpha_0, \sin\alpha_0, 0]^{\mathrm{T}}$. The tangent and normal vector read $\mathbf{t}_0 = [-\sin\alpha_0, \cos\alpha_0, 0]^{\mathrm{T}}$ and $\mathbf{n}_0 = -[\cos\alpha_0, \sin\alpha_0, 0]^{\mathrm{T}}$, respectively. For the discrete SSD, all simulations use $R = 1.5$ m and $L = 56$ equi-angularly spaced secondary sources. This yields the sampling distance $\Delta_{x_0} = 2\pi R / L$, which coincides with the arc length of an equally partitioned circle. The half-wavelength sampling criterion is then met at $f \approx 1$ kHz for this parametrisation. The sound field reproduced by the discrete SSD reads

$$P^S(\mathbf{x}, \omega) = \frac{2\pi R}{L} \sum_{n=0}^{L-1} D(\mathbf{x}_0^{(n)}, \omega)\, G(\mathbf{x} - \mathbf{x}_0^{(n)}, \omega), \tag{46}$$

with $\mathbf{x}_0^{(n)} = R[\cos(n \tfrac{2\pi}{L}), \sin(n \tfrac{2\pi}{L}), 0]^{\mathrm{T}}$. The normalised error between the reproduced sound fields

$$\hat{\varepsilon}(\mathbf{x}, \omega) = 20 \log_{10} \left| \frac{P^S(\mathbf{x}, \omega) - P(\mathbf{x}, \omega)}{P(\mathbf{x}, \omega)} \right| \tag{47}$$

measures the influence of the spatial sampling on the accuracy. Note that the measure takes the synthesised sound field $P(\mathbf{x}, \omega)$ of the continuous SSD instead of the virtual sound field $S(\mathbf{x}, \omega)$ as the reference. This intentionally excludes other synthesis artefacts such as diffraction and 2.5D amplitude errors from the evaluation. If the error decreases considerably below the predicted aliasing frequency, the model's result can be regarded as reasonable.

A. Wave Field Synthesis

As one representative of conventional SFS approaches, WFS does not actively try to avoid aliasing artefacts inside a particular region. For the estimation of the aliasing frequency, the algorithm in Fig. 5 is used.

1) Point Source: For a virtual point source, the WFS driving function is given by Eq. (63) in App. II-A. The synthesised sound field, the sampling error, and the predicted aliasing frequency are shown in Fig. 11. The spatial structure of aliasing with stronger artefacts at positions closer to the virtual point source is in agreement with the plotted sound fields. A significant drop of the sampling error is observable near the predicted boundary between the aliasing-corrupted and aliasing-free region. As the presented ray model is an approximation of the underlying SFS problem, the strict separation between aliasing-free and aliased regions does not reflect the nature of the artefacts gradually reducing with increasing distance to the SSD. In Fig. 12, the error is plotted over frequency for nine different positions. Here, the drastic decrease of spatial aliasing artefacts near the predicted frequency (circles) becomes more obvious.

[Figure 14: panels (a) and (c) for L = 36, µ ≈ 1.4; panels (b) and (d) for L = 896, µ ≈ −19.0; axes x/m and y/m; colour bar in dB]

Fig. 14: The plots (a) and (b) show the real part of a monochromatic (f = 2.0 kHz) virtual point source located at $\mathbf{x}_{\mathrm{ps}} = [0, 2.5, 0]^{\mathrm{T}}$ m synthesised with an exponentially sampled circular SSD. The number of secondary sources $L$ and the spacing parameter $\mu$ for the exponential sampling are given above each plot. The SSD is driven by WFS. The corresponding error $\hat{\varepsilon}(\mathbf{x}, \omega)$ caused by aliasing is plotted in (c) and (d), see (47). For the positions above the solid black lines, the predicted anti-aliasing criterion involving $f^S(\mathbf{x})$ is violated. The aliasing frequency $f^S(\mathbf{x})$ is estimated by the algorithm in Fig. 5.

2) Focused Point Source: Amongst other sound field synthesis techniques, WFS allows for the synthesis of so-called focused sources. The corresponding driving function is given by Eq. (64) in App. II-A. It aims at creating the impression of a monopole source inside $\Omega$. The underlying principle is termed acoustic focusing by time reversal/phase conjugation [35]–[37]: The sound field emitted by the SSD converges towards a focus point $\mathbf{x}_{\mathrm{fs}}$ (yellow cross) in one half space and diverges afterwards, see Fig. 13. The half spaces are defined by the orientation of the focused source $\mathbf{n}_{\mathrm{fs}}$ (yellow arrow). An aliasing-free region around the focus point evolves, which narrows with increasing frequency. This phenomenon is correctly predicted by the geometrical model (black lines).

For an infinite linear SSD with sampling distance $\Delta_x$, Wierstorf [38, Eq. (3.4)] empirically found a formula for the radius of an aliasing-free circular region around the focus point. It is a function of the minimum distance $d_s$ between the focal point and the linear SSD. Transferring it to a circular SSD, the minimum distance is given by $d_s = R - r_{\mathrm{fs}}$, where $r_{\mathrm{fs}} \le R$ is the distance of the focal point from the centre of the SSD. The modified formula reads

$$R_l = \frac{d_s c}{f \Delta_x} = \left( 1 - \frac{r_{\mathrm{fs}}}{R} \right) \frac{L c}{2\pi f}. \tag{48}$$
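The numbers quoted for this setup can be checked with a short Python sketch. It evaluates the sampling distance of the uniform circular SSD, the frequency at which the half-wavelength criterion is met, and the radius given by (48). The speed of sound c = 343 m/s is an assumption, the focus distance of 0.75 m is taken from the caption of Fig. 13, and the function name is hypothetical.

```python
import math

C = 343.0   # assumed speed of sound in m/s
R = 1.5     # radius of the circular SSD in m
L = 56      # number of secondary sources

# Sampling distance of the uniformly discretised circle.
dx0 = 2.0 * math.pi * R / L

# Frequency at which dx0 equals half a wavelength.
f_half = C / (2.0 * dx0)

def wierstorf_radius(f, r_fs, radius=R, count=L, c=C):
    """Radius of the aliasing-free circle around the focus point, Eq. (48)."""
    return (1.0 - r_fs / radius) * count * c / (2.0 * math.pi * f)

print(round(dx0, 4), round(f_half, 1))
for f in (1000.0, 2000.0, 3000.0, 4000.0):
    print(f, round(wierstorf_radius(f, r_fs=0.75), 3))
```

Under these assumptions the half-wavelength frequency comes out close to 1 kHz, and at 2 kHz the radius from (48) is only slightly larger than the 0.75 m between the focus point and the array centre.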


The results of (48) are plotted as the cyan circles in Fig. 13: For 1 kHz, the radius underestimates the aliasing-free region. In the remaining plots, the circular region slightly exceeds the model's prediction. The higher the frequency, the closer the two predictions match. As the formula assumes a circular region, it is not capable of predicting the correct contour of the aliasing-free region. The radius $R_l$ in Eq. (48) may be replaced by the distance of a distinct coordinate $\mathbf{x}$ from the focus point $\mathbf{x}_{\mathrm{fs}}$. Solving the equation for $f$ yields the position-dependent aliasing frequency

$$f^S(\mathbf{x}) = \frac{L c}{2\pi R} \, \frac{R - r_{\mathrm{fs}}}{|\mathbf{x} - \mathbf{x}_{\mathrm{fs}}|} \tag{49}$$

predicted by the model of Wierstorf. For the synthesis scenario under investigation, Wierstorf's prediction yields approximately 2 kHz for the centre position, i.e. $\mathbf{x} = \mathbf{0}$. This also agrees with the plot of Fig. 13b, where the cyan circle barely includes the origin for f = 2 kHz. The approach proposed in this paper estimates an aliasing frequency of approximately 4 kHz. The coarse approximation of the aliasing-free region by a circle in Wierstorf's model hence yields an aliasing frequency which is lower by a factor of 2.

3) Non-Uniformly Discretised SSD: In order to demonstrate the capabilities of the geometric model to incorporate non-standard SSDs, an exponentially spaced circular SSD is chosen. The secondary source positions read $\mathbf{x}_0(u) = R[\cos\alpha_0(u), \sin\alpha_0(u), 0]^{\mathrm{T}}$ with the azimuth angle given as

$$\alpha_0(u) = \pi\, \mathrm{sgn}(u)\, \frac{e^{\mu |u|} - 1}{e^{\mu} - 1} + \frac{\pi}{2}, \quad u \in [-1, 1]. \tag{50}$$

Depending on whether the spacing parameter $\mu$ is negative/positive, the angle between two adjacent secondary sources in-/decreases the closer the secondary sources' azimuth is to $\pi/2$. The sound field reproduced by the discrete SSD given by (46) has to be adjusted to

$$P^S(\mathbf{x}, \omega) = \sum_{n=0}^{L-1} D(\mathbf{x}_0^{(n)}, \omega)\, G(\mathbf{x} - \mathbf{x}_0^{(n)}, \omega)\, \Delta_{x_0}(u^{(n)}), \tag{51}$$

with $\mathbf{x}_0^{(n)} = \mathbf{x}_0(u^{(n)})$, $\Delta_{x_0}(u) = (2\pi R \mu\, e^{\mu |u|}) / (L (e^{\mu} - 1))$, and $u^{(n)} = (2n - L)/L$. The synthesised sound field and the sampling error are shown in Fig. 14 for two different parametrisations of the SSD. The same point source as in Sec. V-A1 serves as the virtual sound field. The spacing parameter $\mu$ and the number of secondary sources $L$ have been chosen such that the number of active secondary sources selected by the selection criterion (9) is equal to the uniform case in Sec. V-A1. This allows for comparability as the sound field is always synthesised by the same number of secondary sources. For Fig. 14a and 14c, the positive $\mu$ leads to a denser SSD around $\pi/2$. Compared to the uniform case depicted in Fig. 11c and 11g, the area of low aliasing error is smaller. A sparser sampling around $\pi/2$ is chosen for Fig. 14b and 14d: Here, an improvement w.r.t. the aliasing error can be observed. For both parametrisations, the predictions of the aliasing frequency by the geometric model (black lines) agree with the error plots. The findings agree with the investigation by Corteel [39], which indicated a possibly positive impact of irregularly spaced arrays on the aliasing properties.

B. Near-Field-Compensated Higher-Order Ambisonics

For 2.5D synthesis scenarios, NFC-HOA states the explicit solution to the SLP in (45). Its driving function for a virtual point source is given by (66) in App. II-B. In the equation, the summation is truncated to $\pm M$ which constitutes a spatial lowpass filtering of the virtual sound field to avoid spatial aliasing artefacts. $M$ is usually referred to as the modal bandwidth. Fig. 15 illustrates that for an appropriately chosen $M$, a circular region of high accuracy emerges around the centre of the loudspeaker array. The radius of this control region can be approximated by $R_c \approx M c / 2\pi f$ [5, Eq. (2.41)]. Although reducing spatial aliasing, a small $M$ has the drawback of a small listening region with high synthesis accuracy. Its radius $R_l \approx M c / 2\pi f$ also decreases with frequency.

[Figure 15: panels (a) M = 27 and (b) M = 13; axes x/m and y/m]

Fig. 15: The plots show a monochromatic (f = 2 kHz) virtual point source at $[0, 2.5, 0]^{\mathrm{T}}$ m synthesised using NFC-HOA with two different modal bandwidths $M$. The black circle indicates the circular area of radius $R_l = R_c \approx M c / 2\pi f$.

In the literature [5, Eq. (4.26)], the optimal modal bandwidth w.r.t. a trade-off between spatial aliasing and available listening region is given by $M = \lfloor (L-1)/2 \rfloor$. This value will be compared with the predictions of the model: For the aliasing frequency, (41) together with (38) has to be considered. Since the listening and control area have the same radii and are co-located at the centre of the circular SSD, $\hat{k}^{\min}_{G,t_0}(\mathbf{x}_0) = \hat{k}^{\min}_{S,t_0}(\mathbf{x}_0)$ and $\hat{k}^{\max}_{G,t_0}(\mathbf{x}_0) = \hat{k}^{\max}_{S,t_0}(\mathbf{x}_0)$ holds for every $\mathbf{x}_0$. The corresponding values read $\mp R_l / R$ for $R_l < R$, and $\mp 1$ otherwise. They are given by (62) in App. I. Combining (41), (38), and (62) yields

$$f^S_{C_l|C_l}(\mathbf{x}_0) = \frac{c}{\Delta_{x_0}(\mathbf{x}_0)} \cdot \begin{cases} \dfrac{1}{1 + |\hat{k}_{S,t_0}(\mathbf{x}_0)|} & \text{for } \dfrac{R_l}{R} > 1, \\[2ex] \dfrac{1}{\frac{R_l}{R} + |\hat{k}_{S,t_0}(\mathbf{x}_0)|} & \text{for } |\hat{k}_{S,t_0}(\mathbf{x}_0)| \le \dfrac{R_l}{R} \le 1, \\[2ex] \infty & \text{for } \dfrac{R_l}{R} < |\hat{k}_{S,t_0}(\mathbf{x}_0)|. \end{cases} \tag{52}$$

Inserting $R_l = \frac{M c}{2\pi f}$ and $\Delta_{x_0}(\mathbf{x}_0) = 2\pi R / L$ leads to an implicit formulation w.r.t. $f^S_{C_l|C_l}(\mathbf{x}_0)$. Its solution reads

$$f^S_{C_l|C_l}(\mathbf{x}_0) = \frac{c}{2\pi R} \cdot \begin{cases} \dfrac{L}{1 + |\hat{k}_{S,t_0}(\mathbf{x}_0)|} & \text{for } M > \dfrac{L}{1 + |\hat{k}_{S,t_0}(\mathbf{x}_0)|}, \\[2ex] \dfrac{L - M}{|\hat{k}_{S,t_0}(\mathbf{x}_0)|} & \text{for } \dfrac{L}{2} \le M \le \dfrac{L}{1 + |\hat{k}_{S,t_0}(\mathbf{x}_0)|}, \\[2ex] \infty & \text{for } M < \dfrac{L}{2}. \end{cases} \tag{53}$$
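The step from the implicit form (52) to the closed form (53) can be verified numerically. The following Python sketch evaluates the right-hand side of (52) with $R_l = M c / (2\pi f)$ inserted and checks that the frequency returned by (53) reproduces itself. It is an illustration only: c = 343 m/s and the example values of $M$ and $|\hat{k}_{S,t_0}|$ are assumptions, the function names are hypothetical, and the degenerate case $|\hat{k}_{S,t_0}| = 0$ is mapped to infinity by choice.

```python
import math

def rhs_52(f, m, abs_k, count, radius, c=343.0):
    """Right-hand side of (52) with R_l = M c / (2 pi f) inserted."""
    dx0 = 2.0 * math.pi * radius / count
    rho = m * c / (2.0 * math.pi * f * radius)  # R_l / R
    if rho > 1.0:
        return c / (dx0 * (1.0 + abs_k))
    if abs_k <= rho:
        return c / (dx0 * (rho + abs_k))
    return math.inf

def closed_form_53(m, abs_k, count, radius, c=343.0):
    """Closed-form solution (53)."""
    scale = c / (2.0 * math.pi * radius)
    upper = count / (1.0 + abs_k)
    if m > upper:
        return scale * upper
    if m >= count / 2.0:
        # abs_k == 0 is degenerate here; treated as aliasing-free (assumption)
        return math.inf if abs_k == 0.0 else scale * (count - m) / abs_k
    return math.inf

L, R = 56, 1.5
for m in (40, 30, 27):
    f = closed_form_53(m, 0.5, L, R)
    # For a finite solution, inserting it into (52) must return it again.
    residual = 0.0 if math.isinf(f) else abs(rhs_52(f, m, 0.5, L, R) - f)
    print(m, f, residual)
```

For the assumed values, M = 40 falls into the first case, M = 30 into the second, and M = 27 into the aliasing-free third case.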


A generalisation towards arbitrary virtual sound fields is found by inserting $+1$ for $|\hat{k}_{S,t_0}(\mathbf{x}_0)|$. The lower bound of the aliasing frequency reads

$$f^S_{C_l|C_l}(\mathbf{x}_0) \ge \begin{cases} \dfrac{c L}{4\pi R} & \text{for } M \ge \dfrac{L}{2} \\[2ex] \infty & \text{for } M < \dfrac{L}{2}. \end{cases} \tag{54}$$

In order to maximise the lower bound, it is necessary for $M$ to be smaller than $L/2$. Under the premise that the size of the available listening area is supposed to be as large as possible, the best choice for $M$ is the largest integer fulfilling this criterion. Thus the prediction of the geometric model for the optimal modal bandwidth in NFC-HOA reads $M = \lfloor (L-1)/2 \rfloor$ and agrees with the results from the literature.

C. Local Wave Field Synthesis using Spatial Bandwidth Limitation (LWFS-SBL)

In the previous section, it was shown that an appropriate choice of the modal bandwidth $M$ can be utilised to reduce spatial aliasing. However, without further modification NFC-HOA does not allow for shifting the area of high synthesis accuracy from the array centre. As an example for a more flexible LSFS method, Local Wave Field Synthesis using Spatial Bandwidth Limitation (LWFS-SBL) is chosen. The approach was originally published in [9] with a discrete-time implementation presented in [40]. The details on the driving function are given in App. II-C. LWFS-SBL allows shifting the region of high accuracy to $\mathbf{x}_c$. If the location coincides with the centre of the loudspeaker array, the same optimal modal bandwidth $M$ as for NFC-HOA holds. Fig. 16 shows the synthesised sound fields, the normalised error, and aliasing frequency for $\mathbf{x}_c = [-0.75, 0.5, 0]^{\mathrm{T}}$ m and $M = 27$. For the circular SSD under investigation ($L = 56$), the chosen modal bandwidth is optimal for the centre position. It can be seen that aliasing artefacts are still present in the target region (dashed circle) although spatial lowpass filtering was applied. For the investigated frequencies $f$, the geometric model correctly predicts that parts of the target region are affected by spatial aliasing.

[Figure 16: panels (a)-(d) and (e)-(h) for f = 1.0, 2.0, 3.0, and 4.0 kHz, panel (i) $f^S(\mathbf{x})$; axes x/m and y/m; colour bars in dB and kHz]

Fig. 16: The plots (a)-(d) show the real part of a virtual point source located at $\mathbf{x}_{\mathrm{ps}} = [0, 2.5, 0]^{\mathrm{T}}$ m synthesised with LWFS ($M = 27$) for different frequencies. The expansion centre of the control area $\mathbf{x}_c = [-0.5, 0.75, 0]^{\mathrm{T}}$ m and a circle with a radius of $R_c = M c / 2\pi f$ are indicated by the cross and the dashed line, respectively. The plots (e)-(h) below show the error $\hat{\varepsilon}(\mathbf{x}, \omega)$ for the corresponding frequency. The plot in (i) shows the estimated aliasing frequency for the same synthesis scenario. It was computed using the algorithms in Fig. 8 and 10 with $R_l = 0$, $\mathbf{x}_l = \mathbf{x}$, $R_c = M c / 2\pi f$, and $\mathbf{x}_c$. A discrete colormap is used for better visibility in subplot (i).

The aliasing frequency $f^S_{C_l|C_c}$ as a function of $M$ is shown in Fig. 17 for ten different positions $\mathbf{x}_c$. The plot can also be used to find the optimal, i.e. aliasing-free, $M$ for a given frequency. For comparison, the aliasing frequency for $M \to \infty$ corresponding to conventional WFS is also shown. A circular listening region $C_l$ with $R_l = 8.5$ cm was chosen for the simulation to approximate the human head [41, Sec. IV.F]. For all positions, the aliasing frequencies increase with decreasing $M$. However, a low $M$ has the drawback of a small control region of radius $R_c$, where correct synthesis is provided. The frequencies at which the radius of the control region $R_c = M c / 2\pi f$ is smaller than $R_l$ are highlighted by the red-shaded area. The frequencies at its boundary ($R_l = R_c$) are marked by circles. They will be referred to as the maximum artefact-free frequencies and will be used for the upcoming discussion: At the positions 0-5, the artefact-free frequency is between ≈ 9.3 (pos. 5) and ≈ 17.3 (pos. 0) times as high as the aliasing frequency of WFS ($M = \infty$). With at least 17.9 kHz (pos. 5), the frequency reaches the upper end of the audible range. For the positions 6-9, which are closer to the virtual point source, the aliasing frequency is comparatively low: A very small $M$ has to be chosen to avoid spatial aliasing. Compared to the positions discussed before, the gain w.r.t. spatial aliasing frequency is relatively small. At pos. 9, the artefact-free frequency (black circle) is about 2.9 times as high as the frequency for WFS.
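The relation between modal bandwidth, control radius, and frequency used in this discussion can be illustrated by a small Python sketch. It computes the radius $R_c = M c / (2\pi f)$, the frequency at which this radius shrinks to a given listening radius $R_l$ (the boundary of the red-shaded area in Fig. 17), and the modal bandwidth $M = \lfloor (L-1)/2 \rfloor$. The value c = 343 m/s and all function names are assumptions made for illustration.

```python
import math

def control_radius(m, f, c=343.0):
    """Radius R_c = M c / (2 pi f) of the region with accurate synthesis."""
    return m * c / (2.0 * math.pi * f)

def control_limit_frequency(m, r_l, c=343.0):
    """Frequency at which R_c equals the listening radius R_l.

    Above this frequency the control region is smaller than the
    listening region for the given modal bandwidth.
    """
    return m * c / (2.0 * math.pi * r_l)

def optimal_modal_bandwidth(count):
    """Largest integer M smaller than L/2, i.e. floor((L - 1) / 2)."""
    return (count - 1) // 2

m_opt = optimal_modal_bandwidth(56)
f_limit = control_limit_frequency(m_opt, r_l=0.085)
print(m_opt, round(f_limit), round(control_radius(m_opt, 2000.0), 3))
```

The limit frequency grows linearly with $M$, which reflects the trade-off described above: a smaller $M$ raises the aliasing frequency but lowers the frequency up to which the control region still covers the listener's head.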


[Figure 17: panel (a) shows $f^S_{C_l|C_c}$ / kHz over the modal bandwidth M from 0 to 90 and M = ∞ for the position indices 0 to 9, with the curve $R_c = R_l$; panel (b) shows the positions]

Fig. 17: (a) shows the estimated aliasing frequency $f^S_{C_l|C_c}$, see (41), as a function of the modal bandwidth $M$ for different positions $\mathbf{x}_l = \mathbf{x}_c$ of the circular listening region. The aliasing frequency was computed using the algorithm in Fig. 10 with a circular listening and control region (see Fig. 7). The parameters are $R_l = 8.5$ cm, $R_c = M c / 2\pi f$, and $\mathbf{x}_l = \mathbf{x}_c$. The area shaded in red indicates combinations of $f$ and $M$, for which the radius of the control area $R_c$ is smaller than the radius of the listening area $R_l$. (b) depicts the positions $\mathbf{x}_l = \mathbf{x}_c$ with their index corresponding to the left plot.

[Figure 18: panels (a) and (b) for $R_c = 0.15$ m, panels (c) and (d) for $R_c = 0.5$ m; axes x/m and y/m]

Fig. 18: The plots show the real part of a monochromatic (f = 2.5 kHz) virtual plane wave with the propagation direction $\mathbf{n}_s = [0, -1, 0]^{\mathrm{T}}$ synthesised with LWFS ($M = \lceil \frac{\omega}{c} R_c \rceil$). For the driving function, see App. II-C. The centre of the control area $\mathbf{x}_c = [-0.9, 0, 0]^{\mathrm{T}}$ m and a circle with a radius of $R_c$ are indicated by the cyan cross and line, respectively. The centre line and the boundaries of the main lobe are plotted solid and dashed cyan. Analogous quantities for the aliasing a.k.a. grating lobe are drawn in magenta. The listening area with $R_l = 0.25$ m and its centre $\mathbf{x}_l = [0.5, -0.75, 0]^{\mathrm{T}}$ m (black cross) is indicated by the solid black circle. (a) and (c) show the predictions using the concept of Donley et al., while (b) and (d) correspond to the proposed method. The secondary sources used for the prediction are highlighted in green.

D. Multizone Sound Field Synthesis

As stated in Sec. I, many approaches for Multizone SFS utilise the concept of bright and quiet zones. A virtual sound field is supposed to be synthesised in the bright zone, while sound pressure in the quiet zone should be as low as possible. It was shown in the past sections that a discrete SSD may cause aliasing artefacts which radiate into undesired regions in space. In the context of Multizone SFS, it is of interest at which frequency these artefacts contribute to the sound pressure of the quiet zone. A geometric model to predict this frequency for virtual plane waves was proposed by Donley
et al. [27]. It is based upon own prior work [4]. For the aliasing a.k.a. grating lobe (dashed magenta) is assumed to
upcoming discussion, it is highly recommended to revisit [27, be equal to the width of the main lobe, i.e. 2Rc . The anti-
Sec. V] and [4, Sec. IV-D]. The difference between the model aliasing condition is violated, if the grating lobe intersects with
of Donley et al. and the proposed approach will be discussed the listening region, i.e. the quiet zone. For this, the distance
w.r.t. the concept, how the spatial structure of the aliasing between the centre line of the grating lobe (solid magenta) and
is described. A direct comparison of the calculus for the the centre xl (black cross) has to be smaller than Rl + Rc .
aliasing frequency is challenging, because Donley et al. used Compared to the work of Donley et al., the proposed
the dimensionality a.k.a. degrees of freedom of a sound field approach does not consider a single secondary source. It takes
[1] for their derivation. As this framework assumes integer the whole part of the SSD intersected by the main lobe into
numbers for various quantities, rounding operations yield to account, see green arc in Fig. 18b and 18d. The consideration
different formulas than for the approach presented in this is necessary whenever the propagation direction of the sound
article. field k̂(x0 ) relative to the SSD normal n0 or the sampling
The scenario depicted in Fig. 18 is used for the discussion: distance ∆x0 (x0 ) changes with x0 . Such dependencies are
A virtual plane wave is supposed to be synthesised within caused by the curvature of the SSD, curved wave fronts of the
the bright zone (cyan circle), which is equivalent to a circular virtual sound field or non-uniform sampling of the SSD. A
control region Cc in the current framework. The circular quiet uniformly discretised, linear SSD synthesising a plane wave
zone (solid black), where aliasing is supposed to be avoided, would be an example where the mentioned dependencies do
represents the listening region Cl . The width of the main lobe not exist. The consideration of a single secondary source
between the dashed cyan lines is equal to the diameter 2Rc would be sufficient and both models yield the same result
of the control region. The model of Donley et al. considers in this case. The outer bounds of the grating lobes (dashed
the intersection point of the main lobes’ centre line (solid magenta) predicted by the current approach are not parallel to
cyan) with the SSD (green dot in Fig. 18a and 18c). Using the centre line (solid magenta), see Fig. 18b and 18d. For
the relations given by (32), the propagation direction of the Rc = 0.15 m the predictions of the two models are very
aliasing ray (solid magenta) is calculated. The width of the similar due to the relatively small active part of the SSD (green

2329-9290 (c) 2018 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/TASLP.2019.2892895, IEEE/ACM
Transactions on Audio, Speech, and Language Processing
IEEE/ACM TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING 14

arc). It is reasonably represented by the single intersection 3

fCS |C / kHz
point. For Rc = 0.5 m, however, this arc is large enough
to cause an observable difference between the predictions. In 2

c
Fig. 18c, the model of Donley et al. already predicts a violation

l
of the anti-aliasing criterion, while no intersection is present
1
in Fig. 18d. In general, both models yield the same result for 0 0.5 1 1.5 2 2.5 ∞
Rc = 0. Rc / m
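The single-ray criterion described above for the model of Donley et al. reduces to a point-to-line distance test: the grating lobe reaches the quiet zone if the centre of the listening region is closer than $R_l + R_c$ to the lobe's centre line. The following Python sketch illustrates this test in two dimensions. It is not taken from either publication: the function names are hypothetical, the direction of the aliasing ray is passed in as an input because (32) is not reproduced here, and the centre line is modelled as a half-line starting at the intersection point on the SSD, which is an assumption of this example.

```python
import math


def distance_point_to_ray(p, origin, direction):
    """Shortest distance from p to the half-line origin + s * direction, s >= 0."""
    dx, dy = direction
    norm = math.hypot(dx, dy)
    ux, uy = dx / norm, dy / norm          # unit direction of the ray
    px, py = p[0] - origin[0], p[1] - origin[1]
    s = px * ux + py * uy                  # projection of p onto the ray
    if s <= 0.0:                           # closest point is the ray origin
        return math.hypot(px, py)
    return abs(px * uy - py * ux)          # perpendicular distance


def grating_lobe_hits_zone(x0, k_alias, x_l, R_l, R_c):
    """True if a lobe of width 2 R_c around the aliasing ray overlaps the
    circular listening region with centre x_l and radius R_l."""
    return distance_point_to_ray(x_l, x0, k_alias) < R_l + R_c


if __name__ == "__main__":
    # Toy geometry, not the scenario of Fig. 18
    for R_c in (0.15, 0.5):
        print(R_c, grating_lobe_hits_zone((0.0, 0.0), (1.0, 0.0),
                                          (2.0, 0.5), 0.25, R_c))
```

In this toy geometry the centre of the listening region lies 0.5 m from the ray, so the test only signals an overlap for the larger control radius. This mirrors the qualitative trend discussed for Fig. 18a and 18c, although the numbers are illustrative.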
The two principles can be further discussed by transferring the concept of Donley et al. to the framework presented in this article and comparing the resulting aliasing frequencies. As already outlined, only a single intersection point is used. This is equivalent to a control region with zero radius, i.e. $R'_c = 0$. Further, the grating lobe's width is assumed to be $2R_c$. An intersection of the grating lobe with the listening region of radius $R_l$ is hence equivalent to the intersection of its centre line with a circle of radius $R'_l = R_l + R_c$. A comparison of the resulting frequencies can be seen in Fig. 19: Both concepts align for $R_c = 0$, while the concept of Donley et al. (red line) results in a lower aliasing frequency for any other radius. It can be deduced that the concept of Donley et al. estimates a lower aliasing frequency especially for a large control area $C_c$. For the tested example, the maximum difference between the two estimations is approximately 1.2 kHz. It is, however, emphasised again that the two concepts were transferred to the same framework and compared with each other. A direct comparison of particular formulas for the aliasing frequency from the two publications would be misleading. While Donley et al. proposed an analytic solution for the aliasing frequency, the generic framework performs its estimation using brute-force search. A direct comparison to an analytic solution derived from the presented framework remains future research.

[Figure 19: $f^{S}_{C_l|C_c}$ / kHz plotted over $R_c$ / m.]

Fig. 19: The plot shows the estimated aliasing frequency $f^{S}_{C_l|C_c}$ for the scenario shown in Fig. 18 as a function of the control radius $R_c$. It was computed using the algorithms in Fig. 8 and 10 with $\mathbf{x}_c = [-0.9, 0, 0]^T$ m and $\mathbf{x}_l = [0.5, -0.75, 0]^T$ m. For the blue line, $R_l = 0.25$ m was used. The red line shows the aliasing frequency for $R'_l = R_l + R_c$ and $R'_c = 0$.

VI. CONCLUSIONS

This work presented a ray-based approximation of SFS allowing the description of spatial aliasing artefacts. From this, anti-aliasing conditions for different scenarios were derived. The predictions of the model have been discussed by different application examples including WFS, NFC-HOA, LWFS, and Multizone SFS. The estimated aliased regions agree with the spatial structure of the aliasing artefacts observed in the numerically simulated sound fields. For WFS, the model correctly predicts the influence of non-uniform discretisation of the SSD. In future research, it may be used to optimise the sampling scheme w.r.t. spatial aliasing for a given target sound field. Moreover, the geometric model can be used to find optimal parametrisations of SFS techniques and beam forming for sound reinforcement utilising line arrays, see [42]. Here, this was exemplarily shown for NFC-HOA and LWFS. For the scenario under investigation, the model predicts an artefact-free synthesis in LWFS up to a frequency which is at least 2.9 times as high as for conventional WFS.

With the presented framework, a generic tool to predict the spatial aliasing frequency without the actual simulation of the synthesised sound field emerged. Unlike prior approaches, the discussed algorithms are not limited to distinct types of virtual sound fields or SSD geometries.

Spatial aliasing affects the human perception, e.g. the perception of timbre, of the synthesised sound fields [38, Sec. 5.2]. Future research may investigate the connection between the output of the model and the results of listening tests. A prediction of distinct aspects of human perception based on the parametrisation of the SFS scenario now seems achievable.

To support reproducibility of the presented results, the source code for all simulations is publicly available under [43].

APPENDIX I
EXTREMAL VALUES OF THE LOCAL WAVENUMBER VECTOR FOR A CIRCULAR REGION

The tangential component of the local wavenumber vector for the Green's functions is given by

$$\hat{k}_{G,t_0}(\mathbf{x}-\mathbf{x}_0) = \langle \mathbf{t}_0 | \hat{\mathbf{k}}_G(\mathbf{x}-\mathbf{x}_0) \rangle = \left\langle \mathbf{t}_0 \,\middle|\, \frac{\mathbf{x}-\mathbf{x}_0}{|\mathbf{x}-\mathbf{x}_0|} \right\rangle . \tag{55}$$

If the circle $C_l$ is completely inside $\Omega$, see Fig. 7c, the outer bounds of the circle define the extremal values for the normalised local wavenumber vector of the Green's function. They are given by

$$\hat{\mathbf{k}}_G^{\{\min,\max\}}(\mathbf{x}_0) = \begin{bmatrix} \cos\alpha_C & \mp\sin\alpha_C \\ \pm\sin\alpha_C & \cos\alpha_C \end{bmatrix} \hat{\mathbf{k}}_G(\mathbf{x}_l-\mathbf{x}_0) , \tag{56}$$

which defines a rotation of the normalised local wavenumber vector $\hat{\mathbf{k}}_G(\mathbf{x}_l-\mathbf{x}_0)$ clockwise or counter-clockwise by the angle

$$\alpha_C = \arcsin\left(\frac{R_l}{|\mathbf{x}_l-\mathbf{x}_0|}\right) = \arcsin(\varrho) . \tag{57}$$

For brevity, the ratio $\varrho$ of the circle radius $R_l$ and the distance between $\mathbf{x}_0$ and the centre $\mathbf{x}_l$ is introduced. If the ratio is larger than 1, the circle includes $\mathbf{x}_0$ and the $\arcsin(\cdot)$ has no real solution. Hence, the case shown in Fig. 7a applies and $\hat{k}_{G,t_0}^{\{\min,\max\}}(\mathbf{x}_0) = \mp 1$ holds. Using the trigonometric identities, the bounding vectors can be expressed by

$$\hat{\mathbf{k}}_G^{\{\min,\max\}}(\mathbf{x}_0) = \begin{bmatrix} \sqrt{1-\varrho^2} & \mp\varrho \\ \pm\varrho & \sqrt{1-\varrho^2} \end{bmatrix} \hat{\mathbf{k}}_G(\mathbf{x}_l-\mathbf{x}_0) . \tag{58}$$
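As a numerical illustration of (56) to (58), the following Python sketch computes the two bounding unit vectors for a circle seen from a position $\mathbf{x}_0$ on the SSD. It is a minimal two-dimensional sketch under stated assumptions, not the published simulation code: the function name and the tuple-based interface are hypothetical, and the labels `k_min` and `k_max` simply follow the upper and lower signs of (58).

```python
import math


def bounding_directions(x0, x_l, R_l):
    """Return (k_min, k_max), the unit vectors of (58) for a circle with
    centre x_l and radius R_l seen from the SSD position x0 (2-D tuples).
    Returns None if x0 lies inside the circle (rho > 1, case of Fig. 7a)."""
    dx, dy = x_l[0] - x0[0], x_l[1] - x0[1]
    dist = math.hypot(dx, dy)
    rho = R_l / dist if dist > 0.0 else math.inf
    if rho > 1.0:
        return None
    kx, ky = dx / dist, dy / dist          # unit vector from x0 towards x_l
    c = math.sqrt(1.0 - rho * rho)         # cos(alpha_C), with sin(alpha_C) = rho
    # upper signs of (58): counter-clockwise rotation by alpha_C
    k_min = (c * kx - rho * ky, rho * kx + c * ky)
    # lower signs of (58): clockwise rotation by alpha_C
    k_max = (c * kx + rho * ky, -rho * kx + c * ky)
    return k_min, k_max


if __name__ == "__main__":
    print(bounding_directions((0.0, 0.0), (2.0, 0.0), 1.0))
```

Both returned vectors have unit length, and the rays they define starting at $\mathbf{x}_0$ touch the circle tangentially, which is the geometric meaning of the rotation by $\alpha_C = \arcsin(\varrho)$.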


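The corresponding tangential components are given by evaluating the vector-matrix-vector multiplication

$$\begin{aligned} \hat{k}_{G,t_0}^{\{\min,\max\}}(\mathbf{x}_0) &= \langle \mathbf{t}_0 | \hat{\mathbf{k}}_G^{\{\min,\max\}}(\mathbf{x}_0) \rangle \\ &= \sqrt{1-\varrho^2}\,\Big( t_{0,x}\,\hat{k}_{G,x}(\mathbf{x}_l-\mathbf{x}_0) + t_{0,y}\,\hat{k}_{G,y}(\mathbf{x}_l-\mathbf{x}_0) \Big) \\ &\quad \mp \varrho\,\Big( -t_{0,y}\,\hat{k}_{G,x}(\mathbf{x}_l-\mathbf{x}_0) + t_{0,x}\,\hat{k}_{G,y}(\mathbf{x}_l-\mathbf{x}_0) \Big) . \end{aligned} \tag{59}$$

The first bracket constitutes the scalar product of the tangent vector $\mathbf{t}_0$ and the normalised wavenumber vector $\hat{\mathbf{k}}_G(\mathbf{x}_l-\mathbf{x}_0)$, which is the tangential component $\hat{k}_{l,t_0} := \hat{k}_{G,t_0}(\mathbf{x}_l-\mathbf{x}_0)$. For the second bracket, the tangent vector is rotated by $\pi/2$, which is equivalent to the normal vector $\mathbf{n}_0$. As $\mathbf{x}_l$ lies inside the convex $\Omega$, the normal component is always positive. It can hence be expressed by $\sqrt{1-\hat{k}_{l,t_0}^2}$. This yields

$$\hat{k}_{G,t_0}^{\{\min,\max\}}(\mathbf{x}_0) = \hat{k}_{l,t_0}\sqrt{1-\varrho^2} \mp \varrho\sqrt{1-\hat{k}_{l,t_0}^2} . \tag{60}$$

As the remaining task, the case depicted in Fig. 7b has to be detected. Here, the circle overlaps with the boundary $\partial\Omega$. This is done by inserting $\mp 1$ for $\hat{k}_{G,t_0}^{\{\min,\max\}}(\mathbf{x}_0)$ and solving the equation for $\hat{k}_{l,t_0}$. As a result, $\hat{k}_{l,t_0} = \mp\sqrt{1-\varrho^2}$ marks the critical value below/above which $\hat{k}_{G,t_0}^{\{\min,\max\}}(\mathbf{x}_0)$ has to be assigned to $\mp 1$. A conditional expression covering all three cases of Fig. 7 is given by

$$\hat{k}_{G,t_0}^{\{\min,\max\}}(\mathbf{x}_0) = \begin{cases} \mp 1 & \text{if } \varrho > 1 \text{, else} \\ \mp 1 & \text{if } \hat{k}_{l,t_0} \lessgtr \mp\sqrt{1-\varrho^2} , \\ \hat{k}_{l,t_0}\sqrt{1-\varrho^2} \mp \varrho\sqrt{1-\hat{k}_{l,t_0}^2} & \text{otherwise,} \end{cases} \tag{61}$$

where the upper and lower option for $\mp$ and $\lessgtr$ applies for $\hat{k}_{G,t_0}^{\min}(\mathbf{x}_0)$ and $\hat{k}_{G,t_0}^{\max}(\mathbf{x}_0)$, respectively.

For the special case of $C_l$ being in the centre of a circular SSD $\Omega$ with radius $R$, $\hat{k}_{l,t_0} = 0$ and $\varrho = R_l/R$ hold for all $\mathbf{x}_0$. This yields

$$\hat{k}_{G,t_0}^{\{\min,\max\}}(\mathbf{x}_0) = \begin{cases} \mp 1 & \text{if } \frac{R_l}{R} > 1 \text{, else} \\ \mp\frac{R_l}{R} & \text{otherwise.} \end{cases} \tag{62}$$

APPENDIX II
DRIVING FUNCTIONS

A. Wave Field Synthesis

The 2.5D WFS driving function for a virtual point source located at $\mathbf{x}_{ps} \notin \Omega$ is given by [42, Eq. (2.131)]

$$D^{WFS}_{ps}(\mathbf{x}_0|\mathbf{x}_{ps},\omega) = a_{ps}(\mathbf{x}_0)\sqrt{j\frac{\omega}{c}}\sqrt{\frac{8\pi|\mathbf{x}_0-\mathbf{x}_{ps}|\,|\mathbf{x}_0-\mathbf{x}_{ref}|}{|\mathbf{x}_0-\mathbf{x}_{ps}|+|\mathbf{x}_0-\mathbf{x}_{ref}|}} \cdot \frac{\langle \mathbf{x}_0-\mathbf{x}_{ps} | \mathbf{n}_0 \rangle}{|\mathbf{x}_0-\mathbf{x}_{ps}|}\,\frac{e^{-j\frac{\omega}{c}|\mathbf{x}_0-\mathbf{x}_{ps}|}}{4\pi|\mathbf{x}_0-\mathbf{x}_{ps}|} . \tag{63}$$

The corresponding secondary source selection criterion $a_{ps}(\mathbf{x}_0)$ is given by (9) with $\hat{\mathbf{k}}_S(\mathbf{x}_0,\omega) = (\mathbf{x}_0-\mathbf{x}_{ps})/|\mathbf{x}_0-\mathbf{x}_{ps}|$. The 2.5D WFS driving function for a virtual focused point source located at $\mathbf{x}_{fs} \in \Omega$ with the orientation $\mathbf{n}_{fs}$ is given by

$$D^{WFS}_{fs}(\mathbf{x}_0|\mathbf{x}_{fs},\mathbf{n}_{fs},\omega) = a_{fs}(\mathbf{x}_0)\sqrt{-j\frac{\omega}{c}}\sqrt{8\pi|\mathbf{x}_0-\mathbf{x}_{fs}|} \cdot \sqrt{1+\frac{|\mathbf{x}_0-\mathbf{x}_{fs}|}{|\mathbf{x}_{ref}-\mathbf{x}_{fs}|}}\,\frac{\langle \mathbf{x}_{fs}-\mathbf{x}_0 | \mathbf{n}_0 \rangle}{|\mathbf{x}_{fs}-\mathbf{x}_0|}\,\frac{e^{+j\frac{\omega}{c}|\mathbf{x}_{fs}-\mathbf{x}_0|}}{4\pi|\mathbf{x}_{fs}-\mathbf{x}_0|} . \tag{64}$$

It is a modified version of the point source driving function given by [29, Eqs. (45) and (47)]. The corresponding secondary source selection criterion reads

$$a_{fs}(\mathbf{x}_0) = \begin{cases} 1 & \text{if } \langle \mathbf{x}_0-\mathbf{x}_{fs} | \mathbf{n}_{fs} \rangle \geq 0 \\ 0 & \text{otherwise.} \end{cases} \tag{65}$$

For both driving functions, $\mathbf{x}_{ref}$ defines a reference position at which correct synthesis is provided for asymptotically high frequency, i.e. $\omega \to \infty$. It is set to the coordinates' origin $\mathbf{0}$ within this article.

B. Near-Field-Compensated Higher Order Ambisonics

The 2.5D NFC-HOA driving function for a virtual point source located at $\mathbf{x}_{ps} = r_{ps}[\cos\alpha_{ps}, \sin\alpha_{ps}, 0]^T$ with $r_{ps} > R$ reads [44, Eq. (6)]

$$D^{HOA}_{ps}(\mathbf{x}_0|\mathbf{x}_{ps},\omega) = \frac{1}{2\pi r_0}\sum_{m=-M}^{M}\frac{h^{(2)}_{|m|}\left(\frac{\omega}{c}r_{ps}\right)}{h^{(2)}_{|m|}\left(\frac{\omega}{c}R\right)}\,e^{+jm(\alpha_0-\alpha_{ps})} \tag{66}$$

with the spherical Hankel function [45, Eq. (2.1.76)] of second kind and $n$-th order denoted as $h^{(2)}_n(\cdot)$.

C. Local Wave Field Synthesis using Spatial Bandwidth Limitation

The generic 2.5D LWFS-SBL driving function for the expansion centre $\mathbf{x}_c$ reads [40, Eq. (8)]

$$D^{LWFS-SBL}(\mathbf{x}_0|\mathbf{x}_c,\omega) = \frac{1}{2\pi}\int_0^{2\pi}\bar{S}(\mathbf{n}_{pw}|\mathbf{x}_c,\omega)\,D^{WFS}_{pw}(\mathbf{x}_0|\mathbf{n}_{pw},\mathbf{x}_c,\omega)\,\mathrm{d}\alpha_{pw} . \tag{67}$$

The integral states the superposition of the conventional 2.5D WFS driving function [42, Eq. (2.177)]

$$D^{WFS}_{pw}(\mathbf{x}_0|\mathbf{n}_{pw},\mathbf{x}_c,\omega) = a_{pw}(\mathbf{x}_0)\sqrt{j\frac{\omega}{c}}\sqrt{8\pi|\mathbf{x}_0-\mathbf{x}_c|} \cdot \langle \mathbf{n}_{pw} | \mathbf{n}_0 \rangle\,e^{-j\frac{\omega}{c}\langle \mathbf{n}_{pw} | \mathbf{x}_0-\mathbf{x}_c \rangle} \tag{68}$$

for an ensemble of plane waves with their propagation direction $\mathbf{n}_{pw} = [\cos\alpha_{pw}, \sin\alpha_{pw}, 0]^T$ distributed over the unit circle. The corresponding secondary source selection criterion $a_{pw}(\mathbf{x}_0)$ is given by (9) with $\hat{\mathbf{k}}_S(\mathbf{x}_0,\omega) = \mathbf{n}_{pw}$. The integral in (67) is approximated by a sum over $N_{pw}$ equiangularly spaced samples for $\alpha_{pw}$. For the simulations in this article, $N_{pw} = 1024$ was sufficient. The driving function for each plane wave is weighted by the corresponding plane wave coefficient expanded around $\mathbf{x}_c$. For a point source, the coefficient reads

$$\bar{S}_{ps}(\mathbf{n}_{pw}|\mathbf{x}_{ps},\mathbf{x}_c,\omega) = \sum_{m=-M}^{M} j^{|m|}\left(-j\frac{\omega}{c}\right)h^{(2)}_{|m|}\left(\frac{\omega}{c}r^{\dagger}_{ps}\right)e^{+jm(\alpha_{pw}-\alpha^{\dagger}_{ps})} \tag{69}$$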


with $\alpha^{\dagger}_{ps}$ and $r^{\dagger}_{ps}$ corresponding to the point source position $\mathbf{x}^{\dagger}_{ps} = \mathbf{x}_{ps} - \mathbf{x}_c$ in a shifted coordinate frame. The coefficient for a plane wave propagating into $\mathbf{n}_s = [\cos\alpha_s, \sin\alpha_s, 0]^T$ reads [46, Eq. (4.95)]

$$\bar{S}_{pw}(\mathbf{n}_{pw}|\mathbf{n}_s,\mathbf{x}_c,\omega) = \frac{\sin\left(\frac{2M+1}{2}(\alpha_{pw}-\alpha_s)\right)}{\sin\left(\frac{1}{2}(\alpha_{pw}-\alpha_s)\right)}\,e^{-j\frac{\omega}{c}\langle \mathbf{n}_{pw} | \mathbf{x}_c \rangle} . \tag{70}$$

REFERENCES

[1] R. A. Kennedy, P. Sadeghi, T. D. Abhayapala, and H. M. Jones, “Intrinsic Limits of Dimensionality and Richness in Random Multipath Fields,” IEEE Trans. Signal Process., vol. 55, no. 6, pp. 2542–2556, 2007.
[2] J. Daniel, “Spatial Sound Encoding Including Near Field Effect: Introducing Distance Coding Filters and a Viable, New Ambisonic Format,” in Proc. of 23rd Intl. Audio Eng. Soc. Conf. on Signal Process. in Audio Recording and Reproduction, Copenhagen, Denmark, May 2003, pp. 1–15.
[3] A. J. Berkhout, “A Holographic Approach to Acoustic Control,” J. Audio Eng. Soc., vol. 36, no. 12, pp. 977–995, 1988.
[4] F. Winter, J. Ahrens, and S. Spors, “On Analytic Methods for 2.5-D Local Sound Field Synthesis Using Circular Distributions of Secondary Sources,” IEEE/ACM Trans. Audio Speech Lang. Process., vol. 24, no. 5, pp. 914–926, 2016.
[5] J. Ahrens, Analytic Methods of Sound Field Synthesis. Berlin: Springer, 2012.
[6] E. Corteel, C. Kuhn-Rahloff, and R. Pellegrini, “Wave Field Synthesis Rendering with Increased Aliasing Frequency,” in Proc. of 124th Audio Eng. Soc. Conv., Amsterdam, The Netherlands, May 2008, pp. 1–10.
[7] S. Spors and J. Ahrens, “Local Sound Field Synthesis by Virtual Secondary Sources,” in Proc. of 40th Intl. Audio Eng. Soc. Conf. on Spatial Audio, Tokyo, Japan, October 2010, pp. 1–9.
[8] S. Spors, K. Helwani, and J. Ahrens, “Local sound field synthesis by virtual acoustic scattering and time-reversal,” in Proc. of 131st Audio Eng. Soc. Conv., New York, USA, October 2011, pp. 1–14.
[9] N. Hahn, F. Winter, and S. Spors, “Local Wave Field Synthesis by Spatial Band-Limitation in the Circular/Spherical Harmonics Domain,” in Proc. of 140th Audio Eng. Soc. Conv., Paris, France, June 2016, pp. 1–12.
[10] M. Miyoshi and Y. Kaneda, “Inverse filtering of room acoustics,” IEEE Trans. Acoust., Speech, Signal Process., vol. 36, no. 2, pp. 145–152, 1988.
[11] O. Kirkeby and P. A. Nelson, “Reproduction of plane wave sound fields,” J. Acoust. Soc. Am., vol. 94, no. 5, pp. 2992–3000, 1993.
[12] O. Kirkeby, P. A. Nelson, F. Orduna-Bustamante, and H. Hamada, “Local sound field reproduction using digital signal processing,” J. Acoust. Soc. Am., vol. 100, no. 3, pp. 1584–1593, 1996.
[13] S. Ise, “A principle of sound field control based on the Kirchhoff-Helmholtz integral equation and the theory of inverse systems,” Acta Acust. united Ac., vol. 85, no. 1, pp. 78–87, 1999.
[14] M. A. Poletti, “Three-dimensional surround sound systems based on spherical harmonics,” J. Audio Eng. Soc., vol. 11, no. 53, pp. 1004–1025, 2005.
[15] J. Hannemann and K. D. Donohue, “Virtual Sound Source Rendering Using a Multipole-Expansion and Method-of-Moments Approach,” J. Audio Eng. Soc., vol. 56, no. 6, pp. 473–481, 2008.
[16] M. Kolundžija, C. Faller, and M. Vetterli, “Reproducing sound fields using mimo acoustic channel inversion,” J. Audio Eng. Soc, vol. 59, no. 10, pp. 721–734, 2011.
[17] T. Betlehem, W. Zhang, M. A. Poletti, and T. D. Abhayapala, “Personal sound zones: Delivering interface-free audio to multiple listeners,” IEEE Signal Process. Mag., vol. 32, no. 2, pp. 81–91, 2015.
[18] J.-W. Choi and Y.-H. Kim, “Generation of an acoustically bright zone with an illuminated region using multiple sources,” J. Acoust. Soc. Am., vol. 111, no. 4, pp. 1695–1700, 2002.
[19] Y. J. Wu and T. D. Abhayapala, “Spatial Multizone Soundfield Reproduction: Theory and Design,” IEEE Trans. Audio Speech Lang. Process., vol. 19, no. 6, pp. 1711–1720, 2011.
[20] W. Jin, W. B. Kleijn, and D. Virette, “Multizone soundfield reproduction using orthogonal basis expansion,” in Proc. of 38th IEEE Intl. Conf. Acoust., Speech and Signal Process. (ICASSP), Vancouver, Canada, May 2013.
[21] P. Coleman, P. J. B. Jackson, M. Olik, M. Møller, M. Olsen, and J. Abildgaard Pedersen, “Acoustic contrast, planarity and robustness of sound zone methods using a circular loudspeaker array,” J. Acoust. Soc. Am., vol. 135, no. 4, pp. 1929–1940, 2014.
[22] P. Coleman, P. J. Jackson, M. Olik, and J. A. Pedersen, “Personal audio with a planar bright zone,” J. Acoust. Soc. Am., vol. 136, no. 4, pp. 1725–1735, 2014.
[23] S. Spors and R. Rabenstein, “Spatial Aliasing Artifacts Produced by Linear and Circular Loudspeaker Arrays used for Wave Field Synthesis,” in Proc. of 120th Audio Eng. Soc. Conv., Paris, France, May 2006, pp. 1–14.
[24] S. Spors and J. Ahrens, “Spatial Sampling Artifacts of Wave Field Synthesis for the Reproduction of Virtual Point Sources,” in Proc. of 126th Audio Eng. Soc. Conv., Munich, Germany, May 2009, pp. 1–19.
[25] S. Spors and J. Ahrens, “A comparison of wave field synthesis and higher-order Ambisonics with respect to physical properties and spatial sampling,” in Proc. of 125th Audio Eng. Soc. Conv., San Francisco, USA, October 2008, pp. 1–17.
[26] R. G. Oldfield, “The analysis and improvement of focused source reproduction with wave field synthesis,” Ph.D. dissertation, University of Salford, 2013.
[27] J. Donley, C. Ritz, and W. B. Kleijn, “Multizone Soundfield Reproduction With Privacy- and Quality-Based Speech Masking Filters,” IEEE/ACM Trans. Audio Speech Lang. Process., vol. 26, no. 6, pp. 1041–1055, 2018.
[28] B. Girod, R. Rabenstein, and A. Stenger, Signal and Systems. Chichester: Wiley, 2001.
[29] G. Firtha, P. Fiala, F. Schultz, and S. Spors, “Improved Referencing Schemes for 2.5D Wave Field Synthesis Driving Functions,” IEEE/ACM Trans. Audio Speech Lang. Process., vol. 25, no. 5, pp. 1117–1127, 2017.
[30] E. G. Williams, Fourier Acoustics. London: Academic Press, 1999.
[31] L. E. Kinsler, A. R. Frey, A. B. Coppens, and J. V. Sanders, Fundamentals of Acoustics, 4th ed. Hoboken: Wiley, 1999.
[32] G. Firtha, P. Fiala, F. Schultz, and S. Spors, “On the General Relation of Wave Field Synthesis and Spectral Division Method for Linear Arrays,” IEEE/ACM Trans. Audio Speech Lang. Process., vol. 26, no. 12, pp. 2393–2403, 2018.
[33] E. N. G. Verheijen, “Sound Reproduction by Wave Field Synthesis,” Ph.D. dissertation, Delft University of Technology, 1997.
[34] D. Colton and R. Kress, Inverse acoustic and electromagnetic scattering theory, 3rd ed. New York: Springer, 2013.
[35] S. Yon, M. Tanter, and M. Fink, “Sound focusing in rooms: The time-reversal approach,” J. Acoust. Soc. Am., vol. 113, no. 3, pp. 1533–1543, 2003.
[36] D. de Vries and A. J. Berkhout, “Wave theoretical approach to acoustic focusing,” J. Acoust. Soc. Am., vol. 70, no. 3, pp. 740–748, 1981.
[37] M. Fink, “Time reversal of ultrasonic fields. I. Basic principles,” IEEE Trans. Ultrason. Ferroelectr. Freq. Control., vol. 39, no. 5, pp. 555–566, 1992.
[38] H. Wierstorf, “Perceptual Assessment of sound field synthesis,” Ph.D. dissertation, Technische Universität Berlin, 2014.
[39] E. Corteel, “On the use of irregularly spaced loudspeaker arrays for Wave Field Synthesis, potential impact on spatial aliasing frequency,” in Proc. of 9th Int. Conf. on Digital Audio Effects (DAFx-06), Montreal, Canada, September 2006, pp. 209–214.
[40] F. Winter, N. Hahn, and S. Spors, “Time-Domain Realisation of Model-Based Rendering for 2.5D Local Wave Field Synthesis Using Spatial Bandwidth-Limitation,” in 25th European Signal Process. Conf. (EUSIPCO), Kos Island, Greece, August 2017, pp. 688–692.
[41] V. R. Algazi, C. Avendano, and R. O. Duda, “Elevation localization and head-related transfer function analysis at low frequencies,” J. Acoust. Soc. Am., vol. 109, no. 3, pp. 1110–1122, 2001.
[42] F. Schultz, “Sound Field Synthesis for Line Source Array Applications in Large-Scale Sound Reinforcement,” Ph.D. dissertation, University of Rostock, 2016.
[43] F. Winter, F. Schultz, G. Firtha, and S. Spors, “A Geometric Model for Prediction of Spatial Aliasing in 2.5D Sound Field Synthesis – Software,” DOI: 10.5281/zenodo.1158027, January 2019.
[44] S. Spors, V. Kuscher, and J. Ahrens, “Efficient realization of model-based rendering for 2.5-dimensional near-field compensated higher order Ambisonics,” in IEEE Workshop on Appl. of Signal Process. to Audio and Acoust. (WASPAA), New Paltz, New York, October 2011.
[45] N. A. Gumerov and R. Duraiswami, Fast multipole methods for the Helmholtz equation in three dimensions. Amsterdam: Elsevier, 2004.
[46] A. Kuntz, “Wave Field Analysis Using Virtual Circular Microphone Arrays,” Ph.D. dissertation, Friedrich-Alexander-Universität Erlangen-Nürnberg, 2009.
