$_{k=0}^{N_c-1}$, where $N_c$ is the channel length. This corresponds to the $N_c \times 1$ channel impulse response vector $\mathbf{c} = [c_0, c_1, \ldots, c_{N_c-1}]^T$, where $(\cdot)^T$ is the transpose operator and the channel is stationary (a time-varying channel can be used as long as it does not vary faster than can be tracked by the equalization algorithm). The source symbol is a random variable that is independent and identically distributed (i.i.d.) with zero mean and variance $\sigma_s^2 = E\{|s(n)|^2\}$ and is drawn from a finite alphabet, which is given by the finite set $\{s_m = s_{m,R} + j s_{m,I}\}_{m=1}^{M}$ for an $M$-QAM constellation, where $E\{\cdot\}$ is the expectation operator and the subscripts $R$ and $I$ denote the magnitude of the real and imaginary quantities, respectively.
The received $T/2$-spaced input signal $u(k)$ is corrupted by ISI and the additive white Gaussian noise (AWGN) signal $v(k)$. The baseband receiver consists of an $N_f$-tap $T/2$-spaced feedforward equalizer (FFE) and an $N_b$-tap $T$-spaced FBE to form a nonlinear DFE, where the FFE removes the precursor ISI and the FBE removes the postcursor ISI. The FFE and FBE tap coefficients are characterized by the finite series $\{f_k\}_{k=0}^{N_f-1}$ and $\{b_k\}_{k=0}^{N_b-1}$, respectively, which correspond to the $N_f \times 1$ vector $\mathbf{f}(n) = [f_0(n), f_1(n), \ldots, f_{N_f-1}(n)]^T$ and the $N_b \times 1$ vector $\mathbf{b}(n) = [b_0(n), b_1(n), \ldots, b_{N_b-1}(n)]^T$, respectively. The output of the FFE is baud spaced and is formed by convolving the received $T/2$-spaced input signal sequence with the FFE tap coefficients. The $T/2$-spaced convolution matrix is constructed from the channel impulse response vector and is defined as [2]
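$$
\mathbf{C}_{FS} =
\begin{bmatrix}
c_0 & & & \\
c_1 & c_0 & & \\
\vdots & c_1 & \ddots & \\
c_{N_c-1} & \vdots & \ddots & c_0 \\
 & c_{N_c-1} & & c_1 \\
 & & \ddots & \vdots \\
 & & & c_{N_c-1}
\end{bmatrix}
\tag{1}
$$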
where $\mathbf{C}_{FS}$ is an $(N_c + N_f - 1) \times N_f$ matrix. The $T$-spaced convolution matrix is formed by the odd rows of (1) and is defined as
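$$
\mathbf{C} =
\begin{bmatrix}
c_1 & c_0 & & & & & \\
c_3 & c_2 & c_1 & c_0 & & & \\
\vdots & \vdots & c_3 & c_2 & \ddots & & \\
c_{N_c-1} & c_{N_c-2} & \vdots & \vdots & \ddots & c_1 & c_0 \\
 & & c_{N_c-1} & c_{N_c-2} & & c_3 & c_2 \\
 & & & & \ddots & \vdots & \vdots \\
 & & & & & c_{N_c-1} & c_{N_c-2}
\end{bmatrix}
\tag{2}
$$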
where $\mathbf{C}$ is a $P \times N_f$ matrix and $P = \lfloor (N_c + N_f - 1)/2 \rfloor$. The regressor vector of FFE input samples comprises the previous $N_f$ received $T/2$-spaced samples and is defined as
$$\mathbf{u}(n) = \mathbf{C}^T \mathbf{s}(n) + \mathbf{v}(n) \tag{3}$$
where $\mathbf{s}(n) = [s(n), s(n-1), \ldots, s(n-P+1)]^T$ is the $P \times 1$ transmitted source symbol vector and $\mathbf{v}(n) = [v_0(n), v_1(n), \ldots, v_{N_f-1}(n)]^T$ is the $N_f \times 1$ vector of AWGN samples. The FFE output is decimated by a factor of two and is defined as
$$y(n) = \mathbf{u}^T(n)\mathbf{f}(n) = \mathbf{s}^T(n)\mathbf{C}\mathbf{f}(n) + \mathbf{v}^T(n)\mathbf{f}(n). \tag{4}$$
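As an illustration of (1), (2) and (4), the following sketch builds the two convolution matrices in plain Python and evaluates the noise-free FFE output. The function names and the example channel are assumptions made for this sketch, and rows are counted from zero, so the "odd rows" are taken to be rows 1, 3, 5, and so on.

```python
def conv_matrix_fs(c, n_f):
    """T/2-spaced convolution matrix of (1): (N_c + N_f - 1) x N_f, Toeplitz."""
    n_c = len(c)
    rows = n_c + n_f - 1
    # Column k holds the channel taps shifted down by k rows.
    return [[c[i - k] if 0 <= i - k < n_c else 0 for k in range(n_f)]
            for i in range(rows)]


def conv_matrix_t(c, n_f):
    """T-spaced matrix of (2): rows 1, 3, 5, ... (zero-based) of the matrix in (1)."""
    return conv_matrix_fs(c, n_f)[1::2]


def ffe_output(s_vec, c, f):
    """Noise-free FFE output y(n) = s^T(n) C f(n) from (4)."""
    big_c = conv_matrix_t(c, len(f))
    return sum(s_vec[p] * big_c[p][k] * f[k]
               for p in range(len(big_c)) for k in range(len(f)))


# Hypothetical 4-tap channel and a 4-tap FFE: P = floor((4 + 4 - 1) / 2) = 3 rows.
channel = [1.0, 0.5, 0.25, 0.125]
for row in conv_matrix_t(channel, 4):
    print(row)
```

The slice `[1::2]` keeps exactly $\lfloor (N_c + N_f - 1)/2 \rfloor$ rows, which matches the stated row count $P$.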
The DFE output signal is defined as
$$z(n) = \mathbf{x}^T(n)\mathbf{w}(n) = \mathbf{u}^T(n)\mathbf{f}(n) - \hat{\mathbf{s}}^T(n)\mathbf{b}(n) \tag{5}$$
where the $N_w \times 1$ vectors $\mathbf{x}(n) = [\mathbf{u}^T(n), -\hat{\mathbf{s}}^T(n)]^T$ and $\mathbf{w}(n) = [\mathbf{f}^T(n), \mathbf{b}^T(n)]^T$ are the combined DFE input regressor and tap coefficient vectors, respectively, while $\hat{\mathbf{s}}(n) = [\hat{s}(n), \hat{s}(n-1), \ldots, \hat{s}(n-N_b+1)]^T$ is the $N_b \times 1$ regressor vector of past estimated symbol points and $N_w = N_f + N_b$.
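A minimal sketch of the DFE output in (5) is given below. It assumes that the minus sign of the feedback term is carried by the stacked regressor, so that the two forms of (5) agree; the function names and sample values are illustrative only.

```python
def dfe_output(u, f, s_hat, b):
    """z(n) = u^T(n) f(n) - s_hat^T(n) b(n): FFE output minus FBE output."""
    ffe = sum(uk * fk for uk, fk in zip(u, f))
    fbe = sum(sk * bk for sk, bk in zip(s_hat, b))
    return ffe - fbe


def dfe_output_stacked(u, f, s_hat, b):
    """Same output as x^T(n) w(n), assuming x(n) stacks u(n) and -s_hat(n)."""
    x = list(u) + [-sk for sk in s_hat]
    w = list(f) + list(b)
    return sum(xk * wk for xk, wk in zip(x, w))


# Illustrative values: two FFE taps and one FBE tap.
print(dfe_output([1 + 1j, 0.5], [0.5, 0.25j], [1 - 1j], [0.1]))
```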
The FBE does not exhibit noise enhancement since it utilizes past symbol estimates, which are assumed to be correct [1]. When an incorrect symbol estimate is fed back to the feedback tapped delay line, there is a greater likelihood of error propagation. Therefore, the SER must be sufficiently low before the FBE is activated. Initially, the FFE is utilized to reduce the SER, while the FBE tap coefficients are fixed at zero. After the SER is sufficiently low, the FBE is activated to reduce the postcursor ISI.
3. Blind Equalization Algorithms
3.1 Constant Modulus Algorithm
The constant modulus algorithm (CMA) [20], [21] achieves channel equalization by penalizing the dispersion of the squared output modulus, $|z(n)|^2$, from the constant $\gamma_c^2$. The cost function minimized by CMA is defined as
$$J_{cma} = \frac{1}{4} E\left\{\left(|z(n)|^2 - \gamma_c^2\right)^2\right\} \tag{6}$$
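where $\gamma_c^2 = E\{|s_m|^4\}/E\{|s_m|^2\}$ is the dispersion constant. A gradient-descent equalizer tap adjustment algorithm that minimizes $J_{cma}$ is defined as
$$
\begin{aligned}
\mathbf{w}(n+1) &= \mathbf{w}(n) + \mu\left(-\nabla_{\mathbf{w}} J_{cma}\right) \\
&= \mathbf{w}(n) + \mu \underbrace{z(n)\left(\gamma_c^2 - |z(n)|^2\right)}_{e_{cma}(n)} \mathbf{x}^*(n)
\end{aligned}
\tag{7}
$$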
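where $\mu$ is a positive stepsize, $\nabla_{\mathbf{w}}$ is the gradient operator with respect to the elements of vector $\mathbf{w}$, $e_{cma}(n)$ is the CMA error signal, and $(\cdot)^*$ denotes complex conjugation.
3.2 Multimodulus Algorithm
The multimodulus algorithm (MMA) penalizes the dispersion of $z_R^2(n)$ and $z_I^2(n)$ from the constant $\gamma_m^2$, where $z(n) = z_R(n) + j z_I(n)$. The cost function minimized by MMA is defined as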
$$J_{mma} = \frac{1}{4} E\left\{\left(z_R^2(n) - \gamma_m^2\right)^2 + \left(z_I^2(n) - \gamma_m^2\right)^2\right\} \tag{8}$$
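where $\gamma_m^2 = E\{s_{m,R}^4\}/E\{s_{m,R}^2\}$ is the dispersion constant. A gradient-descent equalizer tap adjustment algorithm that minimizes $J_{mma}$ is defined as
$$
\begin{aligned}
\mathbf{w}(n+1) &= \mathbf{w}(n) + \mu\left(-\nabla_{\mathbf{w}} J_{mma}\right) \\
&= \mathbf{w}(n) + \mu \Big[ \underbrace{z_R(n)\left(\gamma_m^2 - z_R^2(n)\right)}_{e^{mma}_R(n)} + j \underbrace{z_I(n)\left(\gamma_m^2 - z_I^2(n)\right)}_{e^{mma}_I(n)} \Big] \mathbf{x}^*(n)
\end{aligned}
\tag{9}
$$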
where $e^{mma}_R(n)$ and $e^{mma}_I(n)$ are the real and imaginary components of the MMA error signal, respectively.
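The CMA and MMA error terms in (7) and (9) can be sketched in a few lines of Python. The helper names, the use of `gamma2_c` and `gamma2_m` for the two dispersion constants, and the 16-QAM test constellation are assumptions made for illustration.

```python
def dispersion_constants(constellation):
    """Return (E|s|^4 / E|s|^2, E s_R^4 / E s_R^2) for a given symbol set."""
    power = [s.real ** 2 + s.imag ** 2 for s in constellation]
    gamma2_c = sum(p * p for p in power) / sum(power)
    gamma2_m = (sum(s.real ** 4 for s in constellation)
                / sum(s.real ** 2 for s in constellation))
    return gamma2_c, gamma2_m


def cma_error(z, gamma2_c):
    """CMA error of (7): z(n) * (gamma_c^2 - |z(n)|^2)."""
    return z * (gamma2_c - (z.real ** 2 + z.imag ** 2))


def mma_error(z, gamma2_m):
    """MMA error of (9), formed separately on the real and imaginary parts."""
    return complex(z.real * (gamma2_m - z.real ** 2),
                   z.imag * (gamma2_m - z.imag ** 2))


def tap_update(w, x, e, mu):
    """Stochastic-gradient step w(n+1) = w(n) + mu * e(n) * conj(x(n))."""
    return [wk + mu * e * complex(xk).conjugate() for wk, xk in zip(w, x)]


qam16 = [complex(a, b) for a in (-3, -1, 1, 3) for b in (-3, -1, 1, 3)]
g_c, g_m = dispersion_constants(qam16)
# Even at an exact 16-QAM point, both error terms are non-zero.
print(cma_error(1 + 1j, g_c), mma_error(1 + 1j, g_m))
```

For a multi-level constellation such as 16-QAM, both errors are non-zero even when the output sits exactly on a symbol point, which is why these algorithms keep adjusting the taps in steady state.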
3.3 Decision-Directed Algorithm
The cost function minimized by the decision-directed (DD) algorithm [24] utilizes the instantaneous error across the slicer and is defined as
$$J_{dd} = \frac{1}{2} E\left\{|z(n) - \hat{s}(n)|^2\right\} \tag{10}$$
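where $\hat{s}(n) = \hat{s}_R(n) + j\hat{s}_I(n)$ is the estimated QAM symbol. A gradient-descent equalizer update algorithm that minimizes $J_{dd}$ is defined as
$$
\begin{aligned}
\mathbf{w}(n+1) &= \mathbf{w}(n) + \mu\left(-\nabla_{\mathbf{w}} J_{dd}\right) \\
&= \mathbf{w}(n) + \mu \underbrace{\left(\hat{s}(n) - z(n)\right)}_{e_{dd}(n)} \mathbf{x}^*(n)
\end{aligned}
\tag{11}
$$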
where $e_{dd}(n)$ is the DD error signal. The DD algorithm requires the mean-squared error (MSE) to be lower than a specified threshold [25] and cannot be applied at the onset of equalization.
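To make the slicer error concrete, the sketch below uses a nearest-level decision on each axis of a square QAM grid. The default levels (an unnormalized 16-QAM grid) and the function names are assumptions for illustration, not values taken from the paper.

```python
def slicer(z, levels=(-3, -1, 1, 3)):
    """Nearest point of a square QAM grid, decided independently on each axis."""
    def nearest(v):
        return min(levels, key=lambda a: abs(v - a))
    return complex(nearest(z.real), nearest(z.imag))


def dd_error(z, levels=(-3, -1, 1, 3)):
    """DD error of (11): estimated symbol minus equalizer output."""
    return slicer(z, levels) - z


print(slicer(0.8 + 2.6j), dd_error(0.8 + 2.6j))
```

When the output lies on a grid point the error is exactly zero, so the DD update stops adjusting the taps, provided the decisions are correct.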
4. Computationally-Efficient Methods
This section discusses computationally-efficient methods that can be applied to both trained and blind adaptive equalizers. As illustrated in Fig. 2, adaptive equalization can be generalized into two operations: convolving the received symbol sequence with the equalizer tap coefficients and updating the equalizer tap coefficients. One method to improve computational efficiency is to simplify or reduce the number of multiplications needed to realize the equalizer. Signed-error [26]-[28] and power-of-two error [10], [11] are methods that simplify the multiplications in the equalizer tap adjustment to shift operations when a power-of-two step size is applied. The selective coefficient update method [12], [13] reduces the number of multiplications by updating only a subset of the total taps during an iteration, while frequency-domain block algorithms [15]-[18] perform time-domain convolution for a block of samples in the frequency domain.
4.1 Power-of-Two Error Method
The most common method of reducing the complexity of an adaptive algorithm is to retain only the sign of the error signal [26]-[28]. Signed-error algorithms simplify the multiplications in the equalizer tap adjustment portion to shift operations when a power-of-two step size is applied. However, signed-error algorithms are characterized by rough convergence and high steady-state MSE. An alternative method that avoids these characteristics is the power-of-two (POT) error method [10], [11], which quantizes the error signal of the respective algorithm to a power-of-two. The general equalizer tap adjustment algorithm for POT algorithms is defined as
$$\mathbf{w}(n+1) = \mathbf{w}(n) + \mu\, Q\{e(n)\}\, \mathbf{x}^*(n) \tag{12}$$
where $e(n)$ is the error signal of the respective algorithm and $Q\{\cdot\}$ is a nonlinear power-of-two quantizer, which can be defined as [11]:
$$
Q\{x\} =
\begin{cases}
\mathrm{csgn}(x), & |x| \ge 1 \\
2^{\lfloor \log_2 |x| \rfloor}\, \mathrm{sgn}(x), & 2^{-L+2} \le |x| < 1 \\
\tau\, \mathrm{csgn}(x), & |x| < 2^{-L+2}
\end{cases}
\tag{13}
$$
where $\mathrm{csgn}(\cdot)$ is the complex sign operator, $L$ is the data word length including the sign bit, and $\tau$ is typically set to either 0 or $2^{-L+1}$.
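A possible software model of the quantizer in (13) is sketched below. It assumes that a complex error is quantized separately on its real and imaginary parts, and that the small-signal output level (called `tau` here) defaults to 0; both choices, like the function names, are assumptions of this sketch.

```python
import math


def pot_real(x, L=8, tau=0.0):
    """Power-of-two quantizer of (13) for a real input and word length L."""
    sign = (x > 0) - (x < 0)
    mag = abs(x)
    if mag >= 1.0:
        return float(sign)                       # saturate at +/-1
    if mag >= 2.0 ** (-L + 2):
        return sign * 2.0 ** math.floor(math.log2(mag))
    return sign * tau                            # below the smallest level


def pot(x, L=8, tau=0.0):
    """Assumed complex form: quantize real and imaginary parts independently."""
    x = complex(x)
    return complex(pot_real(x.real, L, tau), pot_real(x.imag, L, tau))


print(pot(0.3 - 0.6j))
```

Since every non-zero output is a signed power of two, multiplying by it reduces to a shift in fixed-point hardware.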
The computational requirements for the POT method equalizer tap update are specified in Table 1. When coupled with a power-of-two step size, the multiplications are reduced to shift operations while the equalizer tap adjustment becomes shift and add operations.
4.2 Selective Coefficient Update Method
The complexity of an adaptive filter is proportional to the number of its tap coefficients. By partially updating the tap coefficients, the processor capacity can be utilized more efficiently while reducing the power consumption [12]-[14]. The general equalizer tap adjustment algorithm for selective coefficient update (SCU) algorithms is defined as
$$\mathbf{w}(n+1) = \mathbf{w}(n) + \mu\, e(n)\, \mathbf{A}_{I_P(n)}\, \mathbf{x}^*(n) \tag{14}$$
where $e(n)$ is the respective error signal and $\mathbf{A}_{I_P(n)}$ is a diagonal matrix having $P$ elements equal to one in the positions indicated by $I_P(n)$ and zeros elsewhere, where $I_P(n)$ is the $N_w \times 1$ update constraint vector. The update constraint vector is determined through information evaluation, which can be accomplished using a number of methods. We will consider the fixed and time-varying set-membership criteria discussed in [14]. In the first method, $P$ tap coefficients are updated during each iteration, where $P$ is a fixed value with $0 < P < N_w$. The equalizer input vector $\mathbf{x}(n)$ is sorted and the index positions that correspond to the largest $P$ input samples are set to one in $I_P(n)$, while all other positions are set to zero.
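The fixed-$P$ selection rule can be sketched as follows. Ranking complex input samples by magnitude is an assumption of this sketch, as are the function names.

```python
def update_mask(x, p):
    """Update constraint vector: ones at the P largest-magnitude entries of x(n)."""
    order = sorted(range(len(x)), key=lambda k: abs(x[k]), reverse=True)
    chosen = set(order[:p])
    return [1 if k in chosen else 0 for k in range(len(x))]


def scu_update(w, x, e, mu, p):
    """Tap update of (14): only the P selected taps are adjusted."""
    mask = update_mask(x, p)
    return [wk + mu * e * complex(xk).conjugate() if mk else wk
            for wk, xk, mk in zip(w, x, mask)]


x_n = [0.1, -2, 0.5 + 0.5j, 1]
print(update_mask(x_n, 2))
print(scu_update([0, 0, 0, 0], x_n, 1.0, 0.5, 2))
```

Only the $P$ selected taps incur update multiplications; the sort needed to rank the inputs is an extra cost that a practical design would have to weigh against that saving.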
4 Preprint submitted to Elsevier Science
AEÜ Int. J. Electron. Commun. (2006) No. 7
Fig. 3. Realization of the FDB method using the overlap-save sectioning procedure.
(17)
where the $\mathcal{T}$ operator transforms the $2N \times 1$ vector into a $2N \times 2N$ diagonal matrix. The frequency-domain equalizer output is $\mathbf{Y}(nN) = \mathbf{U}(nN)\mathbf{F}(nN)$, while the $N \times 1$ time-domain equalizer output vector is defined as
$$
\begin{aligned}
\mathbf{y}(nN) &= \mathbf{k}\, \mathcal{T}^{-1}\, \mathbf{U}(nN)\mathbf{F}(nN) \\
&= [y(nN-N+1), \ldots, y(nN)]^T
\end{aligned}
\tag{18}
$$
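where $\mathbf{k}$ is the $N \times 2N$ constraint matrix that ensures the output result is a linear convolution [17]. The frequency-domain error signal vector is defined as
$$\mathbf{E}(nN) = \mathcal{T}\left[\mathbf{0}_{N,1}^T, \mathbf{e}^T(nN)\right]^T \tag{19}$$
where $\mathbf{e}(nN) = [e(nN-N+1), \ldots, e(nN)]^T$ is the time-domain error signal vector and $\mathbf{0}_{N,1}$ is the $N \times 1$ zero vector. The gradient and convolution constraint matrices, $\mathbf{g}$ and $\mathbf{k}$, respectively, are defined as [17]:
$$
\mathbf{g} =
\begin{bmatrix}
\mathbf{I}_{N,N} & \mathbf{0}_{N,N} \\
\mathbf{0}_{N,N} & \mathbf{0}_{N,N}
\end{bmatrix},
\qquad
\mathbf{k} = \begin{bmatrix} \mathbf{0}_{N,N} & \mathbf{I}_{N,N} \end{bmatrix}
\tag{20}
$$
where $\mathbf{0}_{N,N}$ is an $N \times N$ zero matrix and $\mathbf{I}_{N,N}$ is an $N \times N$ identity matrix.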
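The role of the constraint matrix $\mathbf{k}$ in (18) can be checked numerically. The sketch below assumes that the transform is a $2N$-point DFT, uses a naive $O(N^2)$ DFT so that only the standard library is needed, and treats a baud-spaced filter for simplicity; all names and sample values are illustrative.

```python
import cmath


def dft(v, inverse=False):
    """Naive DFT; a stand-in for the FFT/IFFT blocks of an overlap-save design."""
    n = len(v)
    sign = 1 if inverse else -1
    out = [sum(v[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
               for m in range(n)) for k in range(n)]
    return [o / n for o in out] if inverse else out


def overlap_save_block(u_prev, u_curr, f):
    """One output block in the style of (18), with N = len(f)."""
    n = len(f)
    big_u = dft(list(u_prev) + list(u_curr))   # transform of [old block, new block]
    big_f = dft(list(f) + [0] * n)             # taps zero-padded to length 2N
    y_circ = dft([a * b for a, b in zip(big_u, big_f)], inverse=True)
    return y_circ[n:]                          # k = [0 I]: keep the last N samples


def direct_block(u_prev, u_curr, f):
    """Reference: time-domain linear convolution for the same N outputs."""
    n = len(f)
    u = list(u_prev) + list(u_curr)
    return [sum(f[k] * u[n + i - k] for k in range(n)) for i in range(n)]


taps = [0.5, -0.25, 0.125, 1.0]
old, new = [1.0, 2.0, -1.0, 0.5], [0.25, -0.75, 1.5, 2.0]
print([round(y.real, 6) for y in overlap_save_block(old, new, taps)])
print(direct_block(old, new, taps))
```

The first $N$ samples of the circular result are corrupted by wrap-around, which is why only the last $N$ are kept.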
The computational requirements for the FDB method equalizer tap update are specified in Table 1. The FDB method is applied to the FFE while the FBE is implemented in the time domain. As illustrated in Fig. 3, a total of three 2N-point FFTs and two 2N-point IFFTs are needed to implement the FFE utilizing the FDB method, where two FFTs and one IFFT are utilized specifically for the frequency-domain equalizer tap adjustment.
5. Radius-Directed Stop-and-Go Method
The radius-directed stop-and-go (RSG) method for QAM signals selectively updates the equalizer tap coefficients based on the equalizer output radius, $r(n) = |z(n) - \hat{s}(n)|$, which is the Euclidean distance between the equalizer output and estimated symbol. The equalizer tap coefficients are only adjusted during iterations when $r(n) > r_s$, where $r_s$ is a constant bound. The general equalizer tap adjustment algorithm for RSG algorithms is controlled by the flag $\phi(n)$ and is defined as
$$\mathbf{w}(n+1) = \mathbf{w}(n) + \mu\, \phi(n)\, e(n)\, \mathbf{x}^*(n) \tag{21}$$
where $e(n)$ is the error signal of the respective algorithm and $\phi(n)$ is defined as
$$
\phi(n) =
\begin{cases}
1, & \text{if } r(n) > r_s \\
0, & \text{otherwise}
\end{cases}
\tag{22}
$$
where the constant $r_s = \beta_s d/2$, $\beta_s$ is a user-defined parameter, and $d$ is the distance between symbol points. This parameter is to be chosen between the following limits
$$2N_w \epsilon_{max} < \beta_s \le \frac{2}{3} \tag{23}$$
where $\epsilon_{max}$ is the maximum adjustment error that can occur when the equalizer output is a constellation point (i.e., when $z(n) \in \{s_m\}_{m=1}^{M}$ for a square $M$-QAM constellation) and the upper limit of $\beta_s$ corresponds to the minimum level of MSE required for transfer to the DD algorithm, which is denoted $\xi_{dd}$ [32]. While there is no adjustment error for the DD algorithm in steady-state operation, this is not the case for statistical mean algorithms such as CMA and MMA, which have non-zero updates for $z(n) \in \{s_m\}_{m=1}^{M}$. These algorithms accentuate the "bottom of the bowl" scenario of classical gradient search methods, where the equalizer tap coefficients bounce around the optimal solution. As a result, these fluctuations cause the steady-state MSE to increase.
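The gating described above can be sketched as a flag that enables or suppresses an otherwise unchanged tap update. The parameter name `beta_s` for the user-defined scale factor and the example values are assumptions of this sketch.

```python
def rsg_flag(z, s_hat, beta_s, d):
    """Flag of (22): 1 when r(n) = |z(n) - s_hat(n)| exceeds r_s = beta_s * d / 2."""
    r_s = beta_s * d / 2.0
    return 1 if abs(z - s_hat) > r_s else 0


def rsg_update(w, x, e, mu, flag):
    """Tap update of (21); the taps are left untouched when the flag is 0."""
    if not flag:
        return list(w)
    return [wk + mu * e * complex(xk).conjugate() for wk, xk in zip(w, x)]


# Symbol spacing d = 2 and the largest scale factor allowed by (23).
print(rsg_flag(1.2 + 1.1j, 1 + 1j, 2 / 3, 2.0))
print(rsg_flag(1.8 + 1.5j, 1 + 1j, 2 / 3, 2.0))
```

With $d = 2$ and a scale factor of $2/3$, an output 0.22 away from its decision is skipped, while one 0.94 away triggers an update.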
At the uppermost limit of (23), the number of equalizer tap updates will be minimized at the expense of a high steady-state MSE. As $\beta_s$ approaches the lowermost limit, the steady-state MSE will be equivalent to the original algorithm with slightly fewer updates. When selecting $\beta_s$, it is important to note that when the MSE level is below $\xi_{dd}$, the steady-state MSE for the original algorithm, $\xi_{ss}$, can be approximated as the error across the slicer as follows
$$\xi_{ss} = E\left\{|\hat{s}(n) - z(n)|^2\right\} = E\left\{r^2(n)\right\}. \tag{24}$$
The relationship between the steady-state MSE for the selective update method, $\xi^{rsg}_{ss}$, and $\xi_{ss}$ for equalizers in the decision-directed mode of operation can be expressed as
$$
\xi^{rsg}_{ss}
\begin{cases}
= \xi_{ss}, & \text{if } \left(\xi_{ss} \le r_s^2 \text{ and } r_s^2 \not\gg \xi_{ss}\right) \text{ or } \xi_{ss} \ge r_s^2 \\
> \xi_{ss}, & \text{if } \xi_{ss} \le r_s^2 \text{ and } r_s^2 \gg \xi_{ss}.
\end{cases}
\tag{25}
$$
This can be explained as follows: if $r_s^2 \ge \xi_{ss}$, then $r_s^2 \ge E\{r(n)^2\}$, which will cause a significant reduction in equalizer tap updates. This can affect $\xi^{rsg}_{ss}$ constructively or destructively, depending on the selection of $r_s$. If $r_s^2 \gg \xi_{ss}$, the steady-state MSE will approach $r_s^2$ since the equalizer tap coefficients will only be updated once $\xi^{rsg}_{ss}$ is degraded. However, as $r_s^2 \to \xi_{ss}$, $\xi^{rsg}_{ss}$ will decrease to the point where $\xi^{rsg}_{ss} = \xi_{ss}$.
At the onset of equalization the equalizer output will be a random i.i.d. value, which will result in the following probability of an equalizer tap update:
$$P\{r(n) > r_s\} = 1 - \frac{\pi r_s^2}{d^2} = 1 - \frac{\pi \beta_s^2}{4}. \tag{26}$$
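As a rough numerical check, suppose the equalizer output is uniformly distributed over a $d \times d$ decision region centred on the estimated symbol; this uniform model is an assumption of the sketch, not a statement from the paper. The fraction of outputs farther than $r_s = \beta_s d/2$ from the symbol is then $1 - \pi r_s^2/d^2 = 1 - \pi\beta_s^2/4$, which the Monte Carlo estimate below reproduces. The function names are illustrative.

```python
import math
import random


def update_probability(beta_s):
    """Closed form 1 - pi * beta_s^2 / 4 (valid while r_s <= d / 2)."""
    return 1.0 - math.pi * beta_s ** 2 / 4.0


def simulate(beta_s, trials=200_000, seed=1, d=2.0):
    """Fraction of uniformly drawn outputs that fall outside the radius r_s."""
    rng = random.Random(seed)
    r_s = beta_s * d / 2.0
    hits = sum(1 for _ in range(trials)
               if math.hypot(rng.uniform(-d / 2, d / 2),
                             rng.uniform(-d / 2, d / 2)) > r_s)
    return hits / trials


print(update_probability(2 / 3), simulate(2 / 3))
```

Under this model, roughly two thirds of the iterations trigger an update at start-up when the scale factor is $2/3$, since $1 - \pi/9 \approx 0.65$.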
This probability decreases as the equalizer adapts and reaches a minimum when the equalizer is in steady-state operation. During the initial stages of adaptation, $E\{r(n)\} \gg r_s$, causing the equalizer taps to be updated frequently. This allows the respective algorithm to maintain its transient characteristics. As $E\{r(n)^2\} \to \xi_{ss}$ and $E\{r(n)\} \le r_s$, the equalizer is in steady-state operation and the number of equalizer tap updates will be at a minimum. If the channel should experience sudden changes, the MSE will increase and the process will repeat.
While the RSG method reduces the number of equalizer tap adjustments, there are no reductions in the hardware resources needed for implementation. The RSG method can be modified to reduce the hardware complexity by combining it with methods that simplify the multiplications in the tap coefficient update, such as the POT method. Here we propose retaining the first two leading ones of the error signal, which we term the double power-of-two (DPOT) method. This is proposed to improve the accuracy of the error signal estimate over the POT method, while minimizing the added hardware complexity over that method. The general equalizer tap adjustment algorithm for RSG-DPOT algorithms is defined as
$$\mathbf{w}(n+1) = \mathbf{w}(n) + \mu\, \phi(n)\, e_{dpot}(n)\, \mathbf{x}^*(n) \tag{27}$$
where $e_{dpot}(n) = Q\{e(n)\} + Q\{e(n) - Q\{e(n)\}\}$, $e(n)$ is the error signal of the respective algorithm and $Q\{\cdot\}$ is the nonlinear power-of-two quantizer that was defined in (13) for the POT method.
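The DPOT error can be modelled as two passes of a power-of-two quantizer: one on the error and one on the residual. The sketch below works on real values and assumes a word length of 8 bits with a zero small-signal level; these settings and the function names are illustrative.

```python
import math


def pot_real(x, L=8, tau=0.0):
    """Real power-of-two quantizer in the style of (13)."""
    sign = (x > 0) - (x < 0)
    mag = abs(x)
    if mag >= 1.0:
        return float(sign)
    if mag >= 2.0 ** (-L + 2):
        return sign * 2.0 ** math.floor(math.log2(mag))
    return sign * tau


def dpot_real(x, L=8, tau=0.0):
    """Keep the two leading ones: Q{x} + Q{x - Q{x}}."""
    first = pot_real(x, L, tau)
    return first + pot_real(x - first, L, tau)


for e in (0.3, 0.75, -0.6):
    print(e, pot_real(e), dpot_real(e))
```

For an error of 0.3, the single quantizer gives 0.25 while the two-term version gives 0.28125, illustrating the accuracy gained for one extra shift and add.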
The computational requirements for the RSG and RSG-DPOT methods are specified in Table 1. The RSG method maintains the same hardware complexity but reduces the number of equalizer tap adjustments and hence, the number of computations. However, in addition to reducing the number of equalizer tap adjustments, the RSG-DPOT method reduces the equalizer tap adjustment to shift and add operations when a power-of-two step size is applied. The calculation of $r(n)$ requires two real multiplications, three real additions and one real square root. However, if the operand precision is sufficient, $r^2(n)$ can be utilized to determine whether to adjust the equalizer taps, which eliminates the square root function. Alternatively, a look-up table (LUT) could be utilized to implement the square root function.
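The square-root-free comparison can be written directly on the squared radius. In this sketch the threshold is passed as $r_s$ and squared inside the function for clarity; in hardware the squared threshold would be a precomputed constant. The function name is an assumption.

```python
def should_update(z, s_hat, r_s):
    """Decide on a tap update from r^2(n) without taking a square root."""
    z, s_hat = complex(z), complex(s_hat)
    d_re = z.real - s_hat.real           # real addition 1
    d_im = z.imag - s_hat.imag           # real addition 2
    r_sq = d_re * d_re + d_im * d_im     # two real multiplications, real addition 3
    return r_sq > r_s * r_s              # r_s^2 can be precomputed once


print(should_update(1.8 + 1.5j, 1 + 1j, 2 / 3))
print(should_update(1.2 + 1.1j, 1 + 1j, 2 / 3))
```

Because both sides of the comparison are non-negative, comparing squares gives the same decision as comparing the radii themselves.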
6. Simulation Study
This section presents simulation results for computationally-efficient methods applied to CMA- and MMA-based DFEs. The algorithms are simulated over empirically derived cable and microwave channels from the Signal Processing Information Base (SPIB, located: http://spib.rice.edu/) and multi-tap Ricean fading channels. The simulation environment consists of a $T/2$-spaced channel in cascade with a 54-tap DFE consisting of an 18-tap $T/2$-spaced FFE and a 36-tap $T$-spaced FBE, where the channel, FFE, and FBE are modeled as complex FIR filters. The source symbol sequence is randomly generated using an i.i.d. process and is drawn from a normalized square QAM constellation. The received equalizer input samples are generated by convolving the source sequence with the channel impulse response and adding AWGN.
Fig. 4. DFE activation and de-activation flow chart. [Flow chart nodes: Calculate r(n); r(n) < d/3; count < r_th; count > 0; count = count + 1; count = count - 1; Turn on DFE; Turn off DFE.]
The DFE is controlled by the activation/de-activation
method illustrated in Fig. 4, which utilizes r(n) to de-
termine whether to activate or de-activate the FBE of the
DFE. At the onset of equalization, the FFE is initialized
with a dual center spike of 1/