Abstract—Precoding with block diagonalization (BD) is an attractive technique for approaching the sum capacity in the multiuser multiple-input multiple-output (MIMO) broadcast channel. Unfortunately, BD requires either global channel state information at every receiver or an additional training phase, which demands additional control overhead and additional system planning. In this paper we propose a new multiuser MIMO algorithm that combines BD with vector perturbation (VP). The proposed algorithm avoids the second training phase, reduces each user's receiver complexity thanks to pre-equalization with VP at the transmitter, and has comparable diversity performance to BD with maximum likelihood decoding algorithm. A bound on the achievable sum rate for the proposed technique is derived and used to show that BD with VP approaches the achievable sum rate of BD with water-filling. Numerical simulations confirm that the proposed technique provides better bit error rate and diversity performance than BD with a zero-forcing receiver as well as BD with zero-forcing precoding.

Index Terms—MIMO systems, broadcast channels, nonlinear system, perturbation methods, spatial filtering, interference suppression.

I. INTRODUCTION

inverse of the multiuser matrix channel [4]. Owing to the transmit power enhancement resulting from power normalization, the gap in the sum rate between DPC and these linear precoding methods, however, is large.

A way to avoid transmit power enhancement is to use nonlinear precoding such as lattice precoding [6]. Tomlinson-Harashima MIMO precoding is one example of transmit precoding with a modulo operation [6]. Another example is vector perturbation (VP) where the transmit signal vector is perturbed by another vector to minimize transmit power from the extended constellation [7]. Finding the optimal perturbation involves solving a minimum distance type problem and thus can be implemented using sphere-encoding or other full search-based algorithms. The multiuser precoding approaches mentioned above assume that the transmitter sends a single stream to each user. In next generation systems, mobile terminals may have multiple receive antennas and could thus receive multiple streams. In this paper, we propose a new algorithm that allows more than one transmit stream per user.

One precoding algorithm that supports multi-stream trans-
Authorized licensed use limited to: VELLORE INSTITUTE OF TECHNOLOGY. Downloaded on July 28, 2009 at 06:55 from IEEE Xplore. Restrictions apply.
4052 IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. 7, NO. 11, NOVEMBER 2008
Fig. 1. Structure of a lattice-based broadcast SM-MIMO precoding system using the BD algorithm.
channel matrix.$^{2}$ We assume that the transmitter has full CSI for all users and satisfies the dimensionality constraint as with conventional BD [11]. The proposed algorithm exploits the BD linear precoding algorithm to transmit interference free groups of data to different users. To avoid the extra coordination information between the transmitter and receiver, we propose to use a ZF prefilter combined with multi-stream VP, which differs from the previous perturbation [7] in operating domain. Hence, the main features of our approach are that (i) we do not require coordination information like global CSI at the receiver or an additional training phase, and that (ii) our approach has much lower receiver complexity, at the expense of additional transmit complexity over BD.

$^{2}$ Compared with our related conference paper [5], this paper includes a complexity comparison between different methods, provides a more detailed proof of the achievable rate, and contains new simulation results to show the effectiveness of our proposed precoding technique.

We derive an achievable rate, which we define as an error-free supportable rate that satisfies any given power constraint [16], of our system under an optimal perturbation assumption. Through numerical results, we show that the resulting sum rate is equivalent to that of BD combined with water-filling in [11] and [12]. We also compare the proposed algorithm with the conventional BD assuming equal power allocation [11] and a ZF or maximum likelihood (ML) receiver [9] in terms of the uncoded bit error rate (BER). We find that our approach has similar diversity performance to BD with an ML receiver and much better performance than BD with a ZF receiver even though we use a low complexity receiver and do not need the second training phase. Note that we could use the standard channel inversion approach in [4], [7] and treat the receive antennas as if they were separate users, and then full diversity gain can be achieved at the transmitter. The computational complexity of the channel inversion-based VP (CI-VP), however, would significantly increase as the number of transmit antennas at the transmitter increases. In addition, the same modulation order has to be used in CI-VP while the proposed systems could use different modulation orders for each user based on each user's channel gain, e.g., 4-QAM for user 1 and 16-QAM for user 2. Thus, from both rate/diversity and complexity perspectives, our approach achieves similar performance to BD with an optimal receiver but with much lower receiver complexity. This can be a particular advantage in multiuser systems with low-cost low-power receivers.

This paper is organized as follows. In Section II we begin with the system model and present a summary of BD and its limitations. In Section III we propose vector-perturbed MIMO BD and derive its achievable rate. We present numerical results including achievable rate, probability of bit error, and complexity in Section IV and conclude in Section V.

II. BROADCAST MIMO SYSTEM WITH BLOCK DIAGONALIZATION

In this section we discuss the narrow-band broadcast signal and channel model under consideration. Then we discuss BD and its knowledge of coordination information.

$^{3}$ Upper case and lower case boldface are used to denote matrices $\mathbf{A}$ and vectors $\mathbf{a}$, respectively. If $\mathbf{A}$ denotes a complex matrix, $\mathbf{A}^{T}$, $\mathbf{A}^{*}$, $\mathbf{A}^{-1}$ denote the transpose, conjugate transpose, and the inverse of $\mathbf{A}$, respectively. $\|\mathbf{A}\|_{F}$ denotes the Frobenius norm of matrix $\mathbf{A}$.

A. MIMO Broadcast Signal Model

Consider a MIMO broadcast signal model with $K$ users each employing $N_{r,k}$ receive antennas and receiving their own data streams, which are precoded at the transmitter with $N_t$ antennas. The channel is assumed to be flat fading, though this model can be extended to frequency selective fading channels using orthogonal frequency division multiplexing (OFDM) modulation. In the broadcast channel, the received signal at the $k$-th receiver is$^{3}$

$$ \mathbf{y}_k = \mathbf{H}_k \mathbf{F}_k \mathbf{x}_k + \mathbf{H}_k \sum_{l=1,\, l \neq k}^{K} \mathbf{F}_l \mathbf{x}_l + \mathbf{n}_k, \tag{1} $$

where $\mathbf{x}_k$ is the transmit signal of the $k$-th user, $\mathbf{H}_k$ is the $N_{r,k} \times N_t$ channel matrix, and $\mathbf{n}_k$ is the additive complex Gaussian noise vector with zero mean and covariance matrix
$\sigma_n^2 \mathbf{I}_{N_{r,k}}$. In (1), $\mathbf{F}_k$ denotes the precoder for the $k$-th user, which is a cascade of two precoding matrices $\mathbf{B}_k$ and $\mathbf{D}_k$ for BD, i.e., $\mathbf{F}_k = \mathbf{B}_k \mathbf{D}_k$, where $\mathbf{B}_k$ removes the inter-user interference and $\mathbf{D}_k$ is used for parallelizing and power allocation [11]–[15].

where $\mathbf{H}_{\mathrm{eff},k} = \mathbf{H}_k \mathbf{B}_k$ denotes the effective channel of the $k$-th user and the size of $\mathbf{x}_k$ is $L_k \times 1$. Since the $k$-th user receives its own data stream without interference from other users, the methodology for designing an appropriate decoder is similar to that of single user MIMO cases after channel estimation [9].

To achieve the highest sum rate, after removing the effect of the interfering users' streams, BD maximizes data throughput with the well-known water-filling (WF) algorithm [11]. Define the SVD of $\mathbf{H}_{\mathrm{eff},k}$

$$ \mathbf{H}_{\mathrm{eff},k} = \mathbf{U}_k \begin{bmatrix} \boldsymbol{\Lambda}_k & \mathbf{0} \end{bmatrix} \begin{bmatrix} \mathbf{V}_k^{(1)} & \mathbf{V}_k^{(0)} \end{bmatrix}^{*}, \tag{6} $$

where $\mathbf{V}_k^{(1)}$ denotes the set of the right singular vectors corresponding to non-zero singular values and $\mathbf{U}_k$ is the left singular matrix. Taking $\mathbf{D}_k = \mathbf{V}_k^{(1)} \mathbf{Q}_k^{1/2}$, where $\mathbf{Q}_k$ denotes a diagonal matrix whose elements scale the power transmitted into each of the columns of $\mathbf{V}_k^{(1)}$, the precoder of the $k$-th user $\mathbf{F}_k$ is given by

$$ \mathbf{F}_k = \left[ \tilde{\mathbf{V}}_k^{(0)} \right]_{(1:L_k)} \mathbf{V}_k^{(1)} \mathbf{Q}_k^{1/2} \tag{7} $$

at the transmitter and the decoder of the $k$-th user also has $\mathbf{U}_k$ at the receiver. Therefore the maximum achievable sum rate of the BD algorithm is given by

$$ C_{WF}^{BD} = \max_{\{\mathbf{Q} :\, \mathrm{Tr}(\mathbf{Q}) \le P_T\}} \log \det \left( \mathbf{I} + \frac{\boldsymbol{\Lambda}^2 \mathbf{Q}}{\sigma_n^2} \right), \tag{8} $$

A. The Proposed MIMO Block Diagonalization with Vector Perturbation

The proposed multiuser MIMO BD with VP transceiver is illustrated in Fig. 1. The transmitter encodes each user's data streams independently. The $k$-th transmitter consists of the cascade of two matrices $\mathbf{H}_{\mathrm{eff},k}^{-1}$ and $\mathbf{F}_k$, where the effective channel is $\mathbf{H}_{\mathrm{eff},k} = \mathbf{H}_k \mathbf{F}_k$. The precoding matrix $\mathbf{F}_k$ is $\mathbf{F}_k = \mathbf{B}_k$, where $\mathbf{B}_k$ is calculated in the same manner as in (4). Note that $\mathbf{F}_k$ has a different form from (7). As mentioned in Section II-B, using the optimal BD solution with diagonalization via SVD requires additional coordination information since the effective channel used for the SVD operation includes CSI from other users. To avoid the additional coordination information, the precoder $\mathbf{F}_k$ has only to remove multiuser interference. The inversion of $\mathbf{H}_{\mathrm{eff},k}$ is used to pre-equalize and parallelize each user's stream instead of SVD. In addition, transmit power optimization ($\mathbf{Q}_k^{1/2}$) as seen in (7) is not required since the same symbol constellation on each stream is used for vector-perturbing.

To prevent transmit power enhancement due to $\mathbf{H}_{\mathrm{eff},k}^{-1}$, the proposed transceiver applies a perturbation to the transmitted spatial multiplexing signal vector to reduce the norm of the
precoded signal vector for each user. The perturbation for user $k$ is given by

$$ \mathbf{l}_k = \arg\min_{\mathbf{l}_k \in A\mathbb{Z}^{2L_k}} \left\| \mathbf{H}_{\mathrm{eff},k}^{-1} (\mathbf{s}_k + \tau \mathbf{l}_k) \right\|^2 = \arg\min_{\mathbf{l}_k \in A\mathbb{Z}^{2L_k}} \left\| \mathbf{H}_{\mathrm{eff},k}^{-1} \tilde{\mathbf{s}}_k \right\|^2, \tag{9} $$

where $\tilde{\mathbf{s}}_k = \mathbf{s}_k + \tau \mathbf{l}_k$. In (9), $\mathbf{s}_k$ is the transmit signal vector of the $k$-th user before perturbing, $\mathbf{l}_k$ is the perturbing vector of the $k$-th user, and $\tau$ is a positive real number. Essentially we find the $k$-th user's perturbing vector $\mathbf{l}_k$ from the set of $2L_k$-dimensional lattice points. Unlike the work in [7], the proposed perturbation operates in the stream domain, not in the user domain. Since we generally accept that the number of transmit streams for each user is less than the number of users, i.e., $L_k < K$, the implementation of VP in the proposed transceiver requires a smaller search dimension than that in [7].
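To make the search in (9) concrete, the following is a minimal sketch of an exhaustive search over a truncated lattice, not the sphere encoder a practical transmitter would use. Several choices here are assumptions for illustration only: the function name `vp_search`, the truncation `radius`, a unit-scaled Gaussian-integer lattice in place of $A\mathbb{Z}^{2L_k}$, the value $\tau = 4$ for 4-QAM points $\pm 1 \pm j$, and a randomly drawn effective channel.

```python
import itertools
import numpy as np

def vp_search(H_eff, s, tau, radius=1):
    """Exhaustive search for the perturbation of (9) over a truncated lattice.

    Hypothetical helper: each complex entry of l takes integer real and
    imaginary parts in [-radius, radius], i.e. 2*L real dimensions.
    """
    H_inv = np.linalg.inv(H_eff)
    L = s.shape[0]
    offsets = range(-radius, radius + 1)
    best_l, best_norm = None, np.inf
    for cand in itertools.product(offsets, repeat=2 * L):
        l = np.array(cand[:L]) + 1j * np.array(cand[L:])
        norm = np.linalg.norm(H_inv @ (s + tau * l)) ** 2
        if norm < best_norm:
            best_l, best_norm = l, norm
    return best_l, best_norm

rng = np.random.default_rng(0)
L_k = 2                                    # streams for user k (assumed)
H_eff = (rng.standard_normal((L_k, L_k))
         + 1j * rng.standard_normal((L_k, L_k))) / np.sqrt(2)
s = np.array([1 + 1j, -1 + 1j])            # 4-QAM symbols (assumed mapping)
tau = 4.0                                  # assumed modulo base for 4-QAM

l_opt, gamma_vp = vp_search(H_eff, s, tau)
gamma_zf = np.linalg.norm(np.linalg.inv(H_eff) @ s) ** 2  # no perturbation
print("perturbation:", l_opt)
print("power with VP <= power without VP:", gamma_vp <= gamma_zf)
```

Because the zero vector is among the candidates, the perturbed norm can never exceed the unperturbed one. The candidate count grows as $(2\,\text{radius}+1)^{2L_k}$, which is why the search dimension $2L_k$ matters for transmitter complexity.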
Since the transmitter sends the pre-distorted symbol with a perturbation, the received signal is

$$ \mathbf{y}_k = \tilde{\mathbf{s}}_k + \mathbf{n}_k. \tag{10} $$

Note that the received signal in (10) consists of the perturbed symbol ($\tilde{\mathbf{s}}_k$) and the AWGN vector. The receiver has only to map the perturbed symbol back to the original symbol ($\mathbf{s}_k$) in the fundamental region using modulo operations [8], and the estimated symbol of $\mathbf{s}_k$ is given by

$$ \hat{\mathbf{s}}_k = \mathrm{mod}(\mathbf{y}_k) \tag{11} $$

where $\mathrm{mod}(\cdot)$ denotes a modulo operation that operates per dimension. As mentioned in Section II, $\mathrm{mod}(\cdot)$ results in a simple decoder at the receiver.

B. Achievable Rate Analysis of MIMO Block Diagonalization with Perturbation

The problem of achievable rate analysis is reduced to the single-user MIMO case thanks to the fact that the nulling matrix $\mathbf{F}_k$ removes multiuser interference. Recall the received signal model that uses the effective channel in (5). Define $\mathbf{H}_{\mathrm{eff},k} = \mathbf{U}_k \boldsymbol{\Lambda}_k \mathbf{V}_k^{*}$ where $\mathbf{U}_k = [\mathbf{u}_1 \cdots \mathbf{u}_{L_k}]$, $\mathbf{V}_k = [\mathbf{v}_1 \cdots \mathbf{v}_{L_k}]$ and $\boldsymbol{\Lambda}_k = \mathrm{diag}(\lambda_1 \cdots \lambda_{L_k})$. Then the received signal vector $\mathbf{y}_k$ for the proposed system is given by

$$ \mathbf{y}_k = \mathbf{H}_{\mathrm{eff},k} \mathbf{x}_k + \mathbf{n}_k = \frac{1}{\sqrt{\gamma}} \mathbf{I}_{L_k \times L_k} \tilde{\mathbf{s}}_k + \mathbf{n}_k, \tag{12} $$

where $\mathbf{x}_k = \frac{1}{\sqrt{\gamma}} \mathbf{H}_{\mathrm{eff},k}^{-1} \tilde{\mathbf{s}}_k$. In (12), the normalized scalar factor of the transmit signal $\gamma$ is given by

$$ \gamma = \left\| \mathbf{H}_{\mathrm{eff},k}^{-1} \tilde{\mathbf{s}}_k \right\|^2 = \tilde{\mathbf{s}}_k^{*} \left( \mathbf{H}_{\mathrm{eff},k} \mathbf{H}_{\mathrm{eff},k}^{*} \right)^{-1} \tilde{\mathbf{s}}_k = \tilde{\mathbf{s}}_k^{*} \mathbf{U}_k \boldsymbol{\Lambda}_k^{-2} \mathbf{U}_k^{*} \tilde{\mathbf{s}}_k = \sum_{l=1}^{L_k} \mu_l^2 \xi_l^2, \tag{13} $$

where $\mu_l = \frac{1}{\lambda_l} > 0$ and $\xi_l = |\mathbf{u}_l^{*} \tilde{\mathbf{s}}_k|$. Note that $\tau$ is chosen large enough so that no $\tilde{\mathbf{s}}$ can be made zero and $\mathbf{u}_i$ is a non-zero unit-norm singular vector. Since we consider to have integer components in perturbing vector $\mathbf{l}$, the case $\tilde{\mathbf{s}}_k$ to be chosen exactly parallel to $\mathbf{u}_l$ occurs with probability zero. Thus $\xi_l = |\mathbf{u}_l^{*} \tilde{\mathbf{s}}_k| > 0$ [7].

Fig. 2. Comparison between the sum capacity ($C_{sum}$, [20]) and the achievable sum rates of MIMO BD with WF ($C_{WF}^{BD}$, [11]), MIMO BD-TxZF ($C_{TxZF}^{BD}$, (22)) and the proposed MIMO BD-VP ($C_{VP}^{BD}$). $N_t = 4$ and $K = 2$.

From (12), the received signal-to-noise-ratio (SNR) of each stream, $SNR_l$, with a total transmit power constraint $E\{\|\mathbf{x}_k\|^2\} = P$ can be represented by

$$ SNR_l = \frac{\rho \xi_l^2}{\gamma} = \frac{\rho \xi_l^2}{\sum_{m=1}^{L_k} \mu_m^2 \xi_m^2}, \tag{14} $$

where $\rho = \frac{P}{\sigma_n^2}$. Therefore, the achievable rate of the $k$-th user, $R_k$, is given by

$$ R_k = \sum_{l=1}^{L_k} \log \left( 1 + SNR_l \right) = \sum_{l=1}^{L_k} \log \left( 1 + \frac{\rho \xi_l^2}{\sum_{m=1}^{L_k} \mu_m^2 \xi_m^2} \right). \tag{15} $$

Note that perturbation is not applied yet in calculating (15). The effect of perturbation is to force the perturbing vector $\mathbf{l}_k$ to minimize $\gamma$ and generate a $\tilde{\mathbf{s}}_k$ that can only be coarsely oriented in the coordinate system defined by $\mathbf{u}_1, \cdots, \mathbf{u}_{L_k}$ [7]. Therefore, from (15), if we find the proper perturbing vector and control $\xi_m$ to minimize the normalized factor $\gamma$, then we can obtain an upper bound on achievable rate of the proposed precoder for each user

$$ R_{k,\mathrm{prop}} = \sum_{l=1}^{L_k} \log \left( 1 + \frac{\rho \xi_l^2}{\min_{\xi_m} \sum_{m=1}^{L_k} \mu_m^2 \xi_m^2} \right). \tag{16} $$

From (16), we need to solve for

$$ \xi_l = \arg\min_{\xi_m} \sum_{m=1}^{L_k} \mu_m^2 \xi_m^2. \tag{17} $$

By the arithmetic-geometric mean inequality, solution of (17) occurs when

$$ \mu_1^2 \xi_1^2 = \cdots = \mu_{L_k}^2 \xi_{L_k}^2 = \omega_0^2 \tag{18} $$

for an arbitrary constant $\omega_0^2 > 0$. Note that, as mentioned, $\xi_l = |\mathbf{u}_l^{*} \tilde{\mathbf{s}}_k|$ cannot be zero since $\tau$ is chosen large enough
Fig. 3. Comparison of the achievable sum rate performance according to the number of users: $N_r = 2$ and SNR = 20 dB.

Fig. 4. BER performance comparison between the proposed MIMO BD-VP and the other MIMO precoding techniques ([7], [9]) for 4-QAM where $N_t = K N_r$. Note that CI-VP has better BER performance than the proposed (1) that has the same data rate at the cost of high computational complexity.
TABLE I
SYSTEM CHARACTERISTIC COMPARISON FOR THE PROPOSED SYSTEM (MIMO BD-VP), MIMO BD-TxZF, MIMO BD-RxZF AND MIMO BD-RxML

System         Channel estimation*   Transmitter complexity             Receiver complexity**
MIMO BD-VP     No                    $> O(KN^{M})$                      $O(N)$
MIMO BD-TxZF   No                    $O(KN^{\omega})$, $2 < \omega < 3$   $O(N)$
MIMO BD-RxZF   Required              -                                  $O(N^{\omega})$, $2 < \omega < 3$
MIMO BD-RxML   Required              -                                  $O(N^{M})$

* This channel estimation means estimation for the effective channel at each mobile station.
** This receiver complexity is required for each mobile station. We assume that $N = N_r = L_k$, and $M$ denotes a modulation order.
Fig. 2 compares the achievable sum rate of the proposed system with the other systems in the case of {2, 2}. From Fig. 2, we observe that the sum rate of the proposed MIMO BD-VP is better than that of MIMO BD-TxZF and also approaches the sum rate of MIMO BD-WF asymptotically for high SNR as we expected in Section III-B without any additional coordination information and iterative updates for implementing the precoding and decoding matrices. The sum rate of MIMO BD-TxZF is degraded by the power normalization from transmit precoding. The proposed MIMO BD-VP exploits the perturbation as a form of power allocation to compensate for the degradation of power normalization.

Fig. 3 shows the achievable sum rate performances according to the number of users. The sum rate of proposed MIMO BD-VP linearly increases as the number of transmit antennas and the number of users increase. We also observe that the performance gap between MIMO BD-WF and the proposed MIMO BD-VP increases as the number of users increases, which is because in this simulation the number of transmit antennas increases as the number of users increases. The performance gap between $C_{VP}^{BD}$ and $C_{WF}^{BD}$ resulted from the assumption that MIMO BD-VP uses equal power and the same constellation for each transmit antenna. Note that the achievable rate of proposed MIMO BD-VP approaches the sum rate of the specific case of $C_{WF}^{BD}$ mentioned in Section III-B.

We also compare the BER performance and diversity gain of the proposed MIMO BD-VP and the other MIMO BD precoding systems according to several configurations of the receiver antennas and users under the assumption that all the precoding systems use equal power and the same constellation for each transmit antenna.

The other precoding systems include MIMO BD-RxZF and MIMO BD-RxML. The MIMO BD-RxZF and the MIMO BD-RxML precoding systems are the same as the proposed MIMO BD-VP from the viewpoint of the usage of the nulling matrix to remove multiuser interference. These precoding systems, however, have ZF and ML decoders at the receiver to decode the transmit symbol, respectively [9]. Therefore, we assume that, for MIMO BD-RxZF and MIMO BD-RxML, the receiver uses the coordination information or channel estimation to give the information about the effective channel as mentioned in Section II. In contrast, the proposed MIMO BD-VP does not need to estimate precoded channel parameters. So the proposed precoder is suitable for the system that only exploits common pilot channels like 3GPP LTE standard [19], where each receiver estimates the real channel without multiplying by the precoder.

Fig. 4 shows BER performance comparing the proposed MIMO BD-VP with MIMO BD-RxZF, MIMO BD-RxML, and CI-VP for 4-QAM. We assume three $\{N_r, K\}$ scenarios to observe the BER performance: {2, 2}, {2, 3}, and {3, 2}. For CI-VP, we consider $N_t = 4$, $N_r = 1$, and $K = 4$. The proposed MIMO BD-VP supports 9 dB SNR gains at $10^{-2}$ BER compared with MIMO BD-RxZF. Compared with MIMO BD-RxML, the proposed precoding system shows the same diversity gain, but provides less performance in SNR gain. Note that MIMO BD-RxML requires perfect channel estimation for decoding the transmit symbol. On one hand, as long as perfect channel estimation is guaranteed at the receiver, ML decoding is the optimal solution and provides a lower bound of error rate performance compared with other decoding algorithms. On the other hand, the proposed MIMO BD-VP shows comparable BER performance to MIMO BD-RxML without any channel estimation. From the viewpoint of diversity gain, the proposed MIMO BD-VP has the same diversity order with the ML type receiver as mentioned earlier. In Fig. 4, we also observe that the proposed precoding system has the same slope as MIMO BD-RxML and provides better diversity gain than MIMO BD-RxZF. As mentioned in the introduction, we can see that CI-VP shows better performance than the proposed solution at the cost of high computational complexity.

B. Complexity

In this section we evaluate the approximate complexity of the precoding systems described in Section IV-A. It is hard to calculate the exact complexity of the proposed MIMO BD-VP because the perturbing algorithm used in the proposed precoding system adopts a scalar design parameter $\tau$ that provides a symmetric encoding region around every signal constellation point (see (9) in [7]). The proposed perturbing algorithm uses a complex version of $L_k$-dimensional integer-lattice least-square problem and we assume that $N = L_k = N_r$. Hence, the proposed precoder has an approximate complexity of $O(N^{M+\alpha})$ for each user, where $M$ denotes a modulation order and $\alpha$ is a positive value that depends on the encoding region parameter $\tau$. The complexity of $O(N^{M+\alpha})$ is greater than $O(N^{M})$ referred to as the complexity of ML decoding algorithm with $L_k$ transmit data streams. Consequently, MIMO BD-VP has the complexity of $O(KN^{M+\alpha})$ ($> O(KN^{M})$) totally at the transmitter. The receive complexity of MIMO BD-VP depends primarily on the modulo algorithm that simply demaps a perturbed symbol to an original symbol.
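The per-dimension modulo demapping of (11) can be sketched in a few lines, which illustrates why the receiver cost stays linear in the number of streams. The source does not spell out the exact modulo, so the symmetric form used below (folding each real dimension into $[-\tau/2, \tau/2)$) is an assumption, as are the helper name `mod_tau`, the value $\tau = 4$, the 4-QAM mapping $\pm 1 \pm j$, and the example perturbation and noise values.

```python
import numpy as np

def mod_tau(y, tau):
    """Symmetric modulo applied separately to real and imaginary parts.

    Hypothetical helper: folds each real dimension into [-tau/2, tau/2).
    """
    def fold(v):
        return v - tau * np.floor((v + tau / 2) / tau)
    return fold(y.real) + 1j * fold(y.imag)

tau = 4.0                                   # assumed modulo base for 4-QAM
s = np.array([1 + 1j, -1 - 1j])             # original symbols
l = np.array([1 - 2j, 0 + 1j])              # illustrative integer perturbation
s_tilde = s + tau * l                       # perturbed symbols, as in (9)
noise = np.array([0.1 - 0.2j, -0.3 + 0.1j]) # small illustrative noise
y = s_tilde + noise                         # received signal, as in (10)

s_hat = mod_tau(y, tau)                     # estimate, as in (11)
decided = np.sign(s_hat.real) + 1j * np.sign(s_hat.imag)  # 4-QAM hard decision
print(np.allclose(s_hat, s + noise), np.allclose(decided, s))
```

The receiver never needs the perturbation $\mathbf{l}_k$ or the effective channel: as long as the noise keeps each dimension inside the fundamental region, the modulo removes $\tau \mathbf{l}_k$ exactly, at a cost of one floor operation per real dimension.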