Unified Structure and Parallel Algorithms For FBMC Transmitter and Receiver

2013 IEEE 24th International Symposium on Personal, Indoor and Mobile Radio Communications: Fundamentals and PHY Track
Yonghong Zeng, Ying-Chang Liang, Meng Wah Chia and Edward Chu Yeow Peh
Institute for Infocomm Research, A*STAR, 1 Fusionopolis Way, Singapore 138632
Abstract: In recent years, filter bank multicarrier (FBMC) has recaptured widespread interest
for its possible applications in cognitive radio and dynamic spectrum access. A distinctive
feature of cognitive radio is its adaptivity to the environment. When the environment changes,
a cognitive radio changes its parameters to optimize transmission and reception. Thus it is
desirable to design a unified structure and algorithm for FBMC that needs little change for
different parameters.
In this paper, we propose a unified structure and parallel algorithms to implement FBMC.
The FBMC system and the parallel algorithms are constructed based on the normalized prototype
filter. The coefficients of the normalized prototype filter can be pre-computed and stored.
The proposed parallel algorithms have the same structure for various choices of time duration,
subcarrier spacing and bandwidth. Combined with known parallel algorithms for the fast Fourier
transform (FFT), the proposed algorithms fully parallelize the computations for the transmitter
and receiver, and can therefore run much faster than conventional serial algorithms on modern
processors, which usually have massive parallel capability.
I. INTRODUCTION
Filter bank multicarrier (FBMC) is a transmission technology proposed a long time ago [1].
In recent years, FBMC has recaptured widespread interest for its possible applications in
cognitive radio, dynamic spectrum access and heterogeneous networks [2-7]. Like orthogonal
frequency division multiplexing (OFDM), FBMC is a multicarrier scheme. Compared to OFDM [8],
FBMC uses a prototype filter with a much better time-frequency product. The localization of
the FBMC signal in both time and frequency makes it much more robust to timing error and
frequency offset.
In general, FBMC has higher complexity than OFDM, which has limited FBMC's applications.
Several fast algorithms have been proposed to reduce the complexity. For example, a fast
algorithm for Offset Quadrature Amplitude Modulation (OQAM, also called OFDM-OQAM in some
literature) is proposed in [9]. In [10], a unified fast algorithm is proposed that is
applicable to arbitrary sampling rate and subcarrier spacing. In general, such fast algorithms
use the fast Fourier transform (FFT) and the polyphase structure to reduce the complexity.
The polyphase structure is realized by fast convolution algorithms [11]. In practice, the
prototype filter length is limited to just a few block symbol lengths. Thus the convolution
length for the polyphase structure becomes very short, and short convolution algorithms are
not efficient to implement on parallel processors [12, 13]. On the other hand, parallel
processors are widely used in various platforms. For example, an FPGA usually has large-scale
parallel processing capability. To fully use the capability of parallel processors, we should
design parallel algorithms for FBMC. The simplest and most efficient parallel algorithm is
the vector operation.
978-1-4577-1348-4/13/$31.00 ©2013 IEEE
In this paper, we propose a unified structure and parallel algorithms to implement FBMC.
We design the FBMC system and the parallel algorithms based on the normalized prototype
filter. The same design and algorithms can be used for different FBMC systems with different
parameters. The coefficients of the normalized prototype filter can be pre-computed and
stored. Combined with known parallel algorithms for the FFT, the proposed algorithms fully
parallelize the computations for the transmitter and receiver, and will therefore be much
faster than conventional serial algorithms on modern processors, which usually have massive
parallel capability.
The rest of the paper is organized as follows. We present a unified structure for FBMC in
Section II. In Section III, we propose a unified parallel algorithm for FBMC transmitters.
A unified parallel algorithm for FBMC receivers is given in Section IV. Simulation results
are shown in Section V. Finally, we give conclusions in Section VI.
II. UNIFIED STRUCTURE AND PROTOTYPE FILTER FOR FBMC
To design an FBMC system, we have three parameters to choose: the prototype filter $p(t)$,
the time duration $T$ and the subcarrier spacing $F$ [2-5, 7]. A good FBMC system should have
properties such as high bandwidth efficiency, robustness to doubly selective channels,
robustness to synchronization errors, and low complexity. However, some of these properties
conflict and cannot all be met at the same time. Thus the practical choice is a trade-off
among different factors that optimizes a certain cost function. For fixed bandwidth
efficiency, we can choose different $T$ and $F$ for different channel conditions with a given
time-frequency product $\beta$: $TF = \beta$.
In cognitive radio, depending on the system requirement, channel condition, and available
bandwidth, we may need to change the parameters adaptively [6, 14, 15]. Thus it is better to
design a unified structure that can be used for different parameters with little change. For
this purpose, we can design a normalized prototype filter $p_0(t)$ for normalized time
duration and subcarrier spacing $T = F = \sqrt{\beta}$. Using $p_0(t)$, we can construct the
prototype filter for any $T$ and $F$ with $TF = \beta$:
$$p(t) = p_0\!\left(\frac{F}{\sqrt{\beta}}\,t\right) = p_0\!\left(\frac{\sqrt{\beta}}{T}\,t\right). \tag{1}$$
Thus we only need to design one prototype filter for a given $\beta$.
Let the ambiguity function of the prototype filter $p(t)$ with given $T$ and $F$ be
$$A_p(n,k) = \int_{-\infty}^{\infty} p(t)\,p(t-nT)\,e^{-j2\pi kFt}\,dt. \tag{2}$$
Then it is easy to show, by the change of variable $t \to \sqrt{\beta}\,t/T$, that the
ambiguity function of $p_0(t)$ is
$$A_{p_0}(n,k) = \int_{-\infty}^{\infty} p_0(t)\,p_0(t-n\sqrt{\beta})\,e^{-j2\pi k\sqrt{\beta}\,t}\,dt = \frac{\sqrt{\beta}}{T}\,A_p(n,k). \tag{3}$$
Thus, to make the FBMC system orthogonal, we only need to design a normalized prototype
filter $p_0(t)$ such that
$$A_{p_0}(n,k) \neq 0 \ \text{ if and only if } \ k = n = 0. \tag{4}$$
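The scaling relations (1) and (3) can be checked numerically. The sketch below is only illustrative: it assumes a Gaussian pulse as a stand-in for $p_0(t)$ (it is not the IOTA filter used later and is not orthogonal), arbitrary example values of $\beta$ and $T$, and hypothetical helper names (`ambiguity`, `check`). The integrals are approximated by midpoint Riemann sums.

```python
import cmath
import math

BETA = 1.25          # assumed time-frequency product, T * F = BETA
T = 35e-6            # assumed symbol duration in seconds (illustrative)
F = BETA / T         # subcarrier spacing implied by T * F = BETA


def p0(t):
    """Stand-in normalized prototype filter (a Gaussian, assumed for illustration)."""
    return math.exp(-math.pi * t * t)


def p(t):
    """Prototype filter for the chosen T and F, built from p0 as in (1)."""
    return p0(math.sqrt(BETA) * t / T)


def ambiguity(pulse, shift, freq, half_width, steps=20000):
    """Midpoint Riemann sum of pulse(t) * pulse(t - shift) * exp(-j 2 pi freq t)."""
    dt = 2.0 * half_width / steps
    total = 0j
    for i in range(steps):
        t = -half_width + (i + 0.5) * dt
        total += pulse(t) * pulse(t - shift) * cmath.exp(-2j * math.pi * freq * t)
    return total * dt


def check(n, k):
    """Return (A_p0(n, k), sqrt(BETA) / T * A_p(n, k)); by (3) the two should agree."""
    root = math.sqrt(BETA)
    a_p0 = ambiguity(p0, n * root, k * root, half_width=8.0)
    a_p = ambiguity(p, n * T, k * F, half_width=8.0 * T / root)
    return a_p0, root / T * a_p
```

Calling `check(n, k)` for a few index pairs returns two values that agree up to rounding, which is the content of (3): the ambiguity function only needs to be designed once, for the normalized filter.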
Assume that we are going to transmit a data sequence using $N$ subcarriers. The data sequence
is divided into blocks of length $N$, denoted by $s_n(k)$, $k = -N/2, \ldots, N/2-1$;
$n = 0, 1, \ldots, M-1$, where $N$ is the number of subcarriers in the FBMC system. Thus $NF$
is the whole bandwidth used by the system. The data sequence is then modified to
$$\tilde{s}_n(k) = s_n(k)\,d_n(k), \tag{5}$$
where $d_n(k)$ are modulation coefficients independent of the data. The modified data
sequence is modulated to an analog signal expressed as
$$x(t) = \sum_{k=-N/2}^{N/2-1}\,\sum_{n=0}^{M-1} \tilde{s}_n(k)\,p(t-nT)\,e^{j2\pi kFt}. \tag{6}$$
The expressions here are very general and can be applied to different systems. In the
literature, systems such as FMT (Filtered Multitone) and OFDM/OQAM (Offset Quadrature
Amplitude Modulation) are treated as special cases of the expressions above. For general FBMC
or FMT, $s_n(k)$ can be any type of modulation (complex or real) and $d_n(k) = 1$. For
OFDM/OQAM, $s_n(k)$ must be real and $d_n(k) = j^{n+k}$, where $j = \sqrt{-1}$.
At the receiver, let $\tilde{x}(t)$ be the received signal. The signal is matched with the
prototype filter to obtain
$$z_n(k) = c_p \int_{-\infty}^{\infty} \tilde{x}(t)\,p(t-nT)\,e^{-j2\pi kFt}\,dt, \tag{7}$$
where $c_p$ is a positive constant to normalize the output. For OFDM/OQAM, only the real part
of $z_n(k)$ is used for equalization, and the imaginary part is discarded. Thus, for the
special case of OFDM/OQAM, only the real part is retained:
$$z_n(k) = \Re\left\{ c_p \int_{-\infty}^{\infty} \tilde{x}(t)\,p(t-nT)\,e^{-j2\pi kFt}\,dt \right\}. \tag{8}$$
In the following we will design unified parallel algorithms to compute (6) and (7) for
different $T$, $F$ and bandwidth using the normalized prototype filter $p_0(t)$ only.

III. UNIFIED PARALLEL ALGORITHM FOR THE TRANSMITTER
To reduce transmitter complexity, one usually computes the samples of (6) and then uses a
digital-to-analog converter (DAC) to convert the digital signal into an analog signal. How to
generate the discrete-time signal is a major part of the transmitter design. In this section,
we consider parallel algorithms for the discrete-time implementation of FBMC at the
transmitter. To simplify the notation, we only consider general FBMC systems, for which
$\tilde{s}_n(k) = s_n(k)$. Extension to OFDM/OQAM is straightforward. Let $W$ be the bandwidth
for transmission. To simplify the DAC and analog filtering, we can choose $N$ and $F$ such
that $NF \ge W$ and null some subcarriers so that the used bandwidth does not exceed $W$. The
sampling rate is usually chosen as $\alpha NF$, where $\alpha \ge 1$ is the over-sampling
factor. To simplify the notation, we denote $\tilde{N} = \alpha N$, so that the sampling
period is $T_s = \frac{1}{\alpha NF} = \frac{T}{\beta\tilde{N}}$. Note that $\alpha$ can be any
positive number not smaller than 1, but we require $\tilde{N} = \alpha N$ to be an integer.
The sampled signal of (6) is
$$x(mT_s) = \sum_{k=-N/2}^{N/2-1}\,\sum_{n=0}^{M-1} s_n(k)\,p(mT_s-nT)\,e^{j2\pi kFmT_s}
= \sum_{k=-N/2}^{N/2-1}\,\sum_{n=0}^{M-1} s_n(k)\,p_0\!\left(\frac{m}{\sqrt{\beta}\,\tilde{N}} - \sqrt{\beta}\,n\right) e^{j2\pi km/\tilde{N}}. \tag{9}$$
We divide the output into blocks and denote
$$y_l(m) = x\big((l\tilde{N}+m)T_s\big), \quad m = 0, 1, \ldots, \tilde{N}-1, \tag{10}$$
$$\hat{s}_n(m) = \sum_{k=-N/2}^{N/2-1} s_n(k)\,e^{j2\pi km/\tilde{N}}. \tag{11}$$
Then
$$y_l(m) = \sum_{n=0}^{M-1} \hat{s}_n(m)\,p_0\!\left(\frac{l}{\sqrt{\beta}} + \frac{m}{\sqrt{\beta}\,\tilde{N}} - \sqrt{\beta}\,n\right). \tag{12}$$
Usually the prototype filter is chosen to decay fast in time and frequency, like the
isotropic orthogonal transform algorithm (IOTA) filter. Thus we need only consider the filter
in a limited time interval $[-\Delta, \Delta]$. Hence the summation index $n$ is actually
limited to
$$-\frac{\Delta}{\sqrt{\beta}} + \frac{l}{\beta} \le n < \frac{\Delta}{\sqrt{\beta}} + \frac{l}{\beta} + \frac{1}{\beta}. \tag{13}$$
Let
$$M_1 = \left\lceil -\frac{\Delta}{\sqrt{\beta}} + \frac{l}{\beta} \right\rceil, \quad
M_2 = \left\lfloor \frac{\Delta}{\sqrt{\beta}} + \frac{l}{\beta} + \frac{1}{\beta} \right\rfloor, \tag{14}$$
with the understanding that $n$ also stays within $0, \ldots, M-1$. Then
$$y_l(m) = \sum_{n=M_1}^{M_2} \hat{s}_n(m)\,p_0\!\left(\frac{l}{\sqrt{\beta}} + \frac{m}{\sqrt{\beta}\,\tilde{N}} - \sqrt{\beta}\,n\right). \tag{15}$$
We define
$$g_k(m) = p_0\!\left(\frac{k}{\sqrt{\beta}} + \frac{m}{\sqrt{\beta}\,\tilde{N}}\right). \tag{16}$$
Then
$$y_l(m) = \sum_{n=M_1}^{M_2} \hat{s}_n(m)\,g_{l-\beta n}(m). \tag{17}$$
As we have shown above,
$$-\Delta\sqrt{\beta} + l \le \beta n < \Delta\sqrt{\beta} + l + 1, \tag{18}$$
so we obtain
$$-\Delta\sqrt{\beta} - 1 < l - \beta n \le \Delta\sqrt{\beta}. \tag{19}$$
Let
$$K_1 = -\Delta\sqrt{\beta} - 1, \quad K_2 = \Delta\sqrt{\beta}. \tag{20}$$
Thus we only need $g_k(m)$ for $k \in [K_1, K_2]$, where $k$ takes the values $l - \beta n$.
The output data sequence may be longer than the input. In fact, combining (19) with
$0 \le n \le M-1$, the index $l$ is constrained by
$$-\Delta\sqrt{\beta} - 1 < l \le \Delta\sqrt{\beta} + \beta(M-1). \tag{21}$$
In general, the number of output blocks is around $\beta$ times the number of input blocks.
Based on the above derivations, we have the parallel algorithm as follows.
Algorithm 1: Unified parallel algorithm for FBMC transmitter
1. Compute $\hat{s}_n(m) = \sum_{k=-N/2}^{N/2-1} s_n(k)\,e^{j2\pi km/\tilde{N}}$. This can be
realized by an $\tilde{N}$-point (inverse) FFT for any fixed $n$. Parallel algorithms for the
FFT have been studied extensively and efficient algorithms are available in [12, 13, 16].
2. Compute $g_k(m) = p_0\!\left(\frac{k}{\sqrt{\beta}} + \frac{m}{\sqrt{\beta}\,\tilde{N}}\right)$
for $k \in [K_1, K_2]$ and $m = 0, 1, \ldots, \tilde{N}-1$. This can be done in real time or
by pre-computation. If pre-computation is used, we pre-calculate these values and store them
in memory. The values are not related to $T$ and $F$ and can be used for different scenarios.
3. Compute $y_l(m) = \sum_{n=M_1}^{M_2} \hat{s}_n(m)\,g_{l-\beta n}(m)$. This can be done in
parallel by forming vector operations (element-wise products of length-$\tilde{N}$ vectors):
$$y_l(\cdot) = \sum_{n=M_1}^{M_2} \hat{s}_n(\cdot)\,g_{l-\beta n}(\cdot). \tag{22}$$
For fixed $l$, only $(M_2 - M_1 + 1) \le \left(\frac{2\Delta}{\sqrt{\beta}} + \frac{1}{\beta} + 1\right)$
vector multiplications and $(M_2 - M_1) \le \left(\frac{2\Delta}{\sqrt{\beta}} + \frac{1}{\beta}\right)$
vector additions are needed. The vector lengths are $\tilde{N}$.

IV. UNIFIED PARALLEL ALGORITHM FOR THE RECEIVER
At the receiver, signals are matched with the prototype filter and then equalization is used
to recover the transmitted signal. Usually the equalizer is a simple one-tap equalizer, so
the major computation at the receiver is the matched filtering. In the following, we focus on
the parallel algorithm for the matched filtering (7).
The received signal $\tilde{x}(t)$ is sampled at sampling rate $\alpha NF$, where $\alpha$ is
the over-sampling factor. Note that the over-sampling factor here can be the same as or
different from that at the transmitter. For simplicity, we use the same notation as at the
transmitter. Thus the sampling period is $T_s = \frac{1}{\alpha NF} = \frac{T}{\beta\tilde{N}}$.
The received signal samples are divided into blocks of length $\tilde{N}$ and denoted as
$$y_l(m) = \tilde{x}\big((l\tilde{N}+m)T_s\big), \quad m = 0, 1, \ldots, \tilde{N}-1; \ l = 0, 1, \ldots \tag{23}$$
where, with a slight abuse of notation, $y_l(m)$ now denotes the received blocks.
The matched filtering is approximated as
$$\begin{aligned}
z_n(k) &= c_p \int_{-\infty}^{\infty} \tilde{x}(t)\,p(t-nT)\,e^{-j2\pi kFt}\,dt \\
&\approx \frac{c_p T}{\beta\tilde{N}} \sum_{l}\,\sum_{m=0}^{\tilde{N}-1} y_l(m)\,p_0\!\left(\frac{l}{\sqrt{\beta}} + \frac{m}{\sqrt{\beta}\,\tilde{N}} - \sqrt{\beta}\,n\right) e^{-j2\pi km/\tilde{N}} \\
&= \frac{c_p T}{\beta\tilde{N}} \sum_{l}\,\sum_{m=0}^{\tilde{N}-1} y_l(m)\,g_{l-\beta n}(m)\,e^{-j2\pi km/\tilde{N}}.
\end{aligned} \tag{24}$$
Define
$$u_n(m) = \frac{1}{\tilde{N}} \sum_{l} y_l(m)\,g_{l-\beta n}(m). \tag{25}$$
Then, taking the normalizing constant as $c_p = \beta/T$ (so that $c_p T_s = 1/\tilde{N}$,
consistent with the factor $1/\tilde{N}$ in (25)), we have
$$z_n(k) = \sum_{m=0}^{\tilde{N}-1} u_n(m)\,e^{-j2\pi km/\tilde{N}}. \tag{26}$$
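The regrouping from (24) to (26) can be illustrated with a small numerical sketch. It is not the authors' Matlab program: it assumes a truncated Gaussian as a stand-in for $p_0(t)$, example values for $\beta$, $\tilde{N}$ and $\Delta$, the normalization $c_p T_s = 1/\tilde{N}$, and hypothetical function names. A plain DFT sum is used where a real implementation would call an FFT.

```python
import cmath
import math

BETA = 1.25      # time-frequency product (assumed example value)
N_TILDE = 8      # samples per block, N~ = alpha * N (assumed example value)
DELTA = 2.0      # p0 is treated as zero outside [-DELTA, DELTA] (assumed)


def p0(t):
    """Stand-in prototype filter: truncated Gaussian (illustrative, not IOTA)."""
    return math.exp(-math.pi * t * t) if abs(t) <= DELTA else 0.0


def g(k, m):
    """g_k(m) = p0(k / sqrt(beta) + m / (sqrt(beta) * N~)), as in (16)."""
    root = math.sqrt(BETA)
    return p0(k / root + m / (root * N_TILDE))


def z_direct(y, n, k):
    """Double sum of (24) with c_p * T_s = 1 / N~."""
    total = 0j
    for l, block in enumerate(y):
        for m, sample in enumerate(block):
            phase = cmath.exp(-2j * math.pi * k * m / N_TILDE)
            total += sample * g(l - BETA * n, m) * phase
    return total / N_TILDE


def u_vector(y, n):
    """u_n(m) of (25): sum over blocks first, giving one length-N~ vector per n."""
    return [sum(block[m] * g(l - BETA * n, m) for l, block in enumerate(y)) / N_TILDE
            for m in range(N_TILDE)]


def z_from_u(u, k):
    """z_n(k) of (26): one DFT bin of u_n (an FFT would return all bins at once)."""
    return sum(u[m] * cmath.exp(-2j * math.pi * k * m / N_TILDE)
               for m in range(N_TILDE))
```

For any list of received blocks `y`, `z_from_u(u_vector(y, n), k)` matches `z_direct(y, n, k)` up to rounding. The point of the regrouping is that `u_vector` consists only of element-wise vector products and additions, and the remaining sum over `m` is a single transform per symbol.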
As shown above, due to the limited time support of the prototype filter, we have the
constraint
$$-\Delta\sqrt{\beta} - 1 < l - \beta n \le \Delta\sqrt{\beta}. \tag{27}$$
Thus we only need $g_k(m)$ for $k \in [K_1, K_2]$. Let
$$\tilde{M}_1 = \left\lceil -\Delta\sqrt{\beta} - 1 + \beta n \right\rceil, \quad
\tilde{M}_2 = \left\lfloor \Delta\sqrt{\beta} + \beta n \right\rfloor. \tag{28}$$
Then the index $l$ is confined to the interval $[\tilde{M}_1, \tilde{M}_2]$.
Based on the above derivations, we have the parallel algorithm as follows.
Algorithm 2: Unified parallel algorithm for FBMC receiver
1. Compute $g_k(m) = p_0\!\left(\frac{k}{\sqrt{\beta}} + \frac{m}{\sqrt{\beta}\,\tilde{N}}\right)$
for $k \in [K_1, K_2]$ and $m = 0, 1, \ldots, \tilde{N}-1$. This can be done in real time or
by pre-computation. If pre-computation is used, we pre-calculate these values and store them
in memory. The values are not related to $T$ and $F$ and can be used for different scenarios.
2. Compute $u_n(m) = \frac{1}{\tilde{N}} \sum_{l=\tilde{M}_1}^{\tilde{M}_2} y_l(m)\,g_{l-\beta n}(m)$.
This can be done in parallel by forming vector operations:
$$u_n(\cdot) = \frac{1}{\tilde{N}} \sum_{l=\tilde{M}_1}^{\tilde{M}_2} y_l(\cdot)\,g_{l-\beta n}(\cdot). \tag{29}$$
For fixed $n$, only $(\tilde{M}_2 - \tilde{M}_1 + 1) \le (2\Delta\sqrt{\beta} + 2)$ vector
multiplications and $(\tilde{M}_2 - \tilde{M}_1) \le (2\Delta\sqrt{\beta} + 1)$ vector
additions are needed. The vector lengths are $\tilde{N}$. Note that the constant
$\frac{1}{\tilde{N}}$ can be combined into $g_k(m)$ or into the FFT of the following step.
3. Compute $z_n(k) = \sum_{m=0}^{\tilde{N}-1} u_n(m)\,e^{-j2\pi km/\tilde{N}}$. This can be
realized by an $\tilde{N}$-point FFT for any fixed $n$. Parallel algorithms for the FFT have
been studied extensively and efficient algorithms are available in [12, 13, 16].

Fig. 1. Normalized prototype filter for FBMC ($\beta = 1.25$). [Plot: normalized frequency response versus normalized frequency.]
Fig. 2. Equivalent normalized prototype filter for OFDM with CP length $N/4$. [Plot: normalized frequency response versus normalized frequency.]
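Algorithms 1 and 2 can be sketched together as a loopback over an ideal channel. This is a minimal illustration under stated assumptions, not the authors' Matlab program: the function names (`make_g`, `transmit`, `receive`, `rect`, `loopback`) are hypothetical, plain DFT sums stand in for the FFT, the normalization $c_p T_s = 1/\tilde{N}$ is assumed, $N$ is assumed even, and the loopback uses a rectangular prototype on $[0, 1)$ with $\beta = 1$, which corresponds to OFDM without a cyclic prefix treated as a special case of FBMC.

```python
import cmath
import math


def make_g(p0, beta, n_tilde):
    """Return g(k, m) = p0(k/sqrt(beta) + m/(sqrt(beta)*n_tilde)), as in (16)."""
    root = math.sqrt(beta)
    return lambda k, m: p0(k / root + m / (root * n_tilde))


def transmit(s, p0, beta, delta, n_tilde):
    """Algorithm 1. s[n][i] holds s_n(k) for k = i - N/2; returns {l: y_l}."""
    n_sub, n_sym = len(s[0]), len(s)
    root = math.sqrt(beta)
    g = make_g(p0, beta, n_tilde)
    # Step 1: s_hat_n(m), one inverse-DFT-like sum per symbol (an FFT in practice).
    s_hat = [[sum(s[n][i] * cmath.exp(2j * math.pi * (i - n_sub // 2) * m / n_tilde)
                  for i in range(n_sub)) for m in range(n_tilde)] for n in range(n_sym)]
    # Range of output block indices, following (21).
    l_lo = math.floor(-delta * root - 1)
    l_hi = math.ceil(delta * root + beta * (n_sym - 1))
    y = {}
    for l in range(l_lo, l_hi + 1):
        m1 = max(0, math.ceil(-delta / root + l / beta))                 # (14)
        m2 = min(n_sym - 1, math.floor(delta / root + (l + 1) / beta))
        # Step 3: accumulate element-wise vector products, as in (22).
        y[l] = [sum(s_hat[n][m] * g(l - beta * n, m) for n in range(m1, m2 + 1))
                for m in range(n_tilde)]
    return y


def receive(y, n_sym, n_sub, p0, beta, delta, n_tilde):
    """Algorithm 2 with c_p * T_s = 1/n_tilde; returns z[n][i] for k = i - N/2."""
    root = math.sqrt(beta)
    g = make_g(p0, beta, n_tilde)
    z = []
    for n in range(n_sym):
        lo = math.ceil(-delta * root - 1 + beta * n)                     # (28)
        hi = math.floor(delta * root + beta * n)
        # Step 2: u_n as a sum of element-wise vector products, as in (29).
        u = [sum(y[l][m] * g(l - beta * n, m) for l in range(lo, hi + 1) if l in y)
             / n_tilde for m in range(n_tilde)]
        # Step 3: DFT of u_n (an FFT in practice).
        z.append([sum(u[m] * cmath.exp(-2j * math.pi * (i - n_sub // 2) * m / n_tilde)
                      for m in range(n_tilde)) for i in range(n_sub)])
    return z


def rect(t):
    """Rectangular prototype on [0, 1): with beta = 1 this is OFDM without a CP."""
    return 1.0 if 0.0 <= t < 1.0 else 0.0


def loopback(s, alpha=2):
    """Send s through Algorithm 1 and then Algorithm 2 over an ideal channel."""
    n_tilde = alpha * len(s[0])
    y = transmit(s, rect, beta=1.0, delta=1.0, n_tilde=n_tilde)
    return receive(y, len(s), len(s[0]), rect, beta=1.0, delta=1.0, n_tilde=n_tilde)
```

With the rectangular prototype, `loopback(s)` returns the input symbols up to rounding. The same `transmit` and `receive` functions accept other values of `beta` and `delta` and any prototype function that vanishes outside $[-\Delta, \Delta]$, which reflects the unified structure: only the table of $g_k(m)$ values changes.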
V. SIMULATIONS
In the simulations, we use Matlab running on a general-purpose computer to verify the
correctness of the parallel algorithms. The Matlab program follows Algorithm 1 for the
transmitter and Algorithm 2 for the receiver exactly, using vector operations. Note that the
simulations are not intended to show the performance gain of the parallel algorithms, as the
actual performance of a parallel algorithm depends on the processing platform. Here we only
show that the parallel algorithms are correct and compare the performance of FBMC and OFDM
in various situations.
For a fair comparison, we choose the FBMC system and the OFDM system to have the same
bandwidth efficiency. The prototype filter for FBMC is designed based on the IOTA (Isotropic
Orthogonal Transform Algorithm) [2-4]. As an example, the frequency response of the
normalized prototype filter for $T = F = \sqrt{1.25}$ is shown in Figure 1. We can also treat
OFDM as a special type of FBMC with the rectangular function as its prototype filter. For
comparison, the frequency response of the equivalent normalized prototype filter for OFDM
with cyclic prefix (CP) length $N/4$ is shown in Figure 2. Clearly, the prototype filter for
FBMC is much better localized in frequency.
We consider practical situations with carrier frequency offset (CFO) and timing error. Let
the maximum CFO be $\epsilon_{\max}$ kHz and the maximum timing error be $\tau_{\max}$
samples. The actual CFO and timing error in each Monte Carlo test are randomly generated in
$[-\epsilon_{\max}, \epsilon_{\max})$ and $[-\tau_{\max}, \tau_{\max})$, respectively, with
uniform distribution.

Fig. 3. SER vs SNR (16QAM, no CFO, no timing error). [Plot: symbol error rate (SER) versus SNR for FBMC and OFDM.]

Frequency-selective channels with 65 taps are used in the simulations. The channel is assumed
to have an exponential power profile with profile factor 0.1.
The settings for the simulation are as follows: $N = \tilde{N} = 256$, subcarrier spacing
$F = 1/28$ MHz, $T = 1.25/F$, and sampling (baud) rate $1/T_s = 64/7$ MHz. There are 43 null
subcarriers to contain the spectrum within the 8 MHz bandwidth. The CP length in OFDM is 64,
which is just enough to overcome the channel frequency selectivity without error for OFDM.
We choose the frame length to be 100 OFDM/FBMC blocks, which corresponds to 3.5 milliseconds
(ms). For each frame, the first two OFDM/FBMC blocks are preambles, which are used to
estimate the channel at the beginning of each frame. Continual pilots are inserted into every
OFDM/FBMC block and are used to estimate the phase and channel change within a frame. There
are 5 continual pilots in each block. The phase/channel change is estimated at every
OFDM/FBMC block and compensated at the receiver.
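The stated settings can be cross-checked with exact arithmetic. The short sketch below only restates the numbers given above (the variable names are ours); it confirms the sampling rate, the equal block duration of the two systems, the occupied bandwidth and the frame length.

```python
from fractions import Fraction

N = 256                               # number of subcarriers (N = N~, no over-sampling)
F_MHZ = Fraction(1, 28)               # subcarrier spacing in MHz
BETA = Fraction(5, 4)                 # time-frequency product, T = 1.25 / F
NULL_SUBCARRIERS = 43
CP_LENGTH = 64                        # OFDM cyclic prefix, in samples
BLOCKS_PER_FRAME = 100

sample_rate_mhz = N * F_MHZ                                  # N * F, in MHz
fbmc_block_us = BETA / F_MHZ                                 # T, in microseconds
ofdm_block_us = Fraction(N + CP_LENGTH) / sample_rate_mhz    # (N + CP) samples
occupied_mhz = (N - NULL_SUBCARRIERS) * F_MHZ                # active subcarriers only
frame_ms = BLOCKS_PER_FRAME * fbmc_block_us / 1000

print(sample_rate_mhz, fbmc_block_us, ofdm_block_us, float(occupied_mhz), frame_ms)
```

Both systems spend 35 microseconds per block of $N$ subcarriers, which is what gives them the same bandwidth efficiency; the 213 active subcarriers occupy about 7.61 MHz, inside the 8 MHz channel.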
We have done extensive simulations under various conditions. Some of the results are shown
in Figure 3 to Figure 6.

Fig. 4. SER vs SNR (BPSK, $\epsilon_{\max} = 0$ kHz, $\tau_{\max} = 10$). [Plot: symbol error rate (SER) versus SNR for FBMC and OFDM.]
Fig. 5. SER vs SNR (BPSK, $\epsilon_{\max} = 3$ kHz, $\tau_{\max} = 0$). [Plot: symbol error rate (SER) versus SNR for FBMC and OFDM.]
Fig. 6. SER vs SNR (16QAM, $\epsilon_{\max} = 1$ kHz, $\tau_{\max} = 10$). [Plot: symbol error rate (SER) versus SNR for FBMC and OFDM.]

Based on the simulations, we have the following observations:
1. The proposed unified structure and parallel algorithms are valid and applicable to any
FBMC system.
2. The combined effect of frequency offset and timing error is very detrimental to OFDM,
while FBMC is quite robust to it.
VI. CONCLUSIONS
We have proposed a unified structure and parallel algorithms to implement FBMC. The advantage
is that the same structure and algorithm can be used for different applications with
different parameters. The proposed vector parallel algorithms can be used on platforms such
as FPGAs to greatly accelerate the transmitter and receiver. The correctness of the
algorithms has been verified by simulations. However, the actual performance of the
algorithms is yet to be quantified on real parallel processors. FPGA could be one of the best
platforms for the algorithms.
Acknowledgement: The authors thank Dr. The Hanh Pham for providing the prototype filter and
for helpful discussions.
REFERENCES
[1] S. Weinstein and P. Ebert, "Data transmission by frequency-division multiplexing using the discrete Fourier transform," IEEE Trans. Commun. Tech., vol. 19, pp. 628-634, Oct. 1971.
[2] B. L. Floch, M. Alard, and C. Berrou, "Coded orthogonal frequency division multiplex," Proceedings of IEEE, vol. 83, no. 6, pp. 982-996, 1995.
[3] R. Haas and J.-C. Belfiore, "A time-frequency well-localized pulse for multiple carrier transmission," Wireless Personal Communications, vol. 5, pp. 1-18, 1997.
[4] P. Siohan and C. Roche, "Cosine-modulated filterbanks based on extended Gaussian functions," IEEE Trans. Signal Processing, vol. 48, no. 11, pp. 3052-3061, 2000.
[5] B. Farhang-Boroujeny, "OFDM versus filter bank multicarrier," IEEE Signal Processing Magazine, pp. 92-112, May 2011.
[6] D. Noguet, M. Gautier, and V. Berg, "Advances in opportunistic radio technologies for TVWS," EURASIP Journal on Wireless Communications and Networking, vol. 170, pp. 1-12, 2011.
[7] PHYDYAS, "PHYDYAS - physical layer for dynamic spectrum access and cognitive radio," http://www.ict-phydyas.org/, 2013.
[8] M.-O. Pun, M. Morelli, and C.-C. J. Kuo, Multi-Carrier Techniques for Broadband Wireless Communications: A Signal Processing Perspective. UK: Imperial College Press, 2007.
[9] L. Vangelista and N. Laurenti, "Efficient implementations and alternative architectures for OFDM-OQAM systems," IEEE Trans. on Communications, vol. 49, no. 4, pp. 664-675, 2001.
[10] E. Gutierrez, J. A. Lopez-Salcedo, and G. Seco-Granados, "Unified framework for flexible multi-carrier communication system," in 8th International Workshop on Multi-Carrier Systems & Solutions (MC-SS), (Spain), May 2011.
[11] G. Bi and Y. H. Zeng, Transforms and Fast Algorithms for Signal Analysis and Representation. Boston, USA: Birkhauser-Springer, 2003.
[12] Y. H. Zeng, L. Z. Cheng, and M. Zhou, Parallel Algorithms for Digital Signal Processing. China: National University of Defense Technology Press, 1998.
[13] E. Chu and A. George, Inside the FFT Black Box: Serial and Parallel Fast Fourier Transform Algorithms. Taylor & Francis, 2010.
[14] Y. H. Zeng, Y.-C. Liang, Z. D. Lei, S. W. Oh, F. Chin, and S. M. Sun, "Worldwide regulatory and standardization activities on cognitive radio," in IEEE DySPAN, (Singapore), April 2010.
[15] Y.-C. Liang, K. Chen, Y. Li, and P. Mahonen, "Cognitive radio networking and communications: an overview," IEEE Trans. on Vehicular Technology, vol. 60, no. 6, pp. 1-23, 2011.
[16] M. Ayinala, M. Brown, and K. K. Parhi, "Pipelined parallel FFT architectures via folding transformation," IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 20, pp. 1068-1081, June 2012.