RESEARCH PROJECT REPORT, MASTER SAR, 2013-2014
MIMO Outage Capacity: Large Deviations Analysis

Mohamed Abouzrar, Graduate Student, Supélec
Abstract: We consider a MIMO system in an asymptotic regime and investigate its outage and error probabilities. Our main focus is the work of [1], whose authors present a new approach to computing the probability density function and the outage probability of the MIMO mutual information. This approach, based on large deviations, gives interesting results, especially in the non-Gaussian tails, where other approaches fail to approximate the outage probability well. We simulated the main results and compared them with those reported by the authors of [1]. We then studied the decoding error probability of the system, following [2]: the Gallager bound is used to obtain a tight upper bound, and the large deviations approach is introduced into the computation of the error probability.
Keywords: Large deviations principle, MIMO outage probability, error probability, Gallager bound.
I. INTRODUCTION
This paper is the report of our research project. We consider a MIMO system with a large number of transmit and receive antennas, such that their ratio stays finite. Our aim is to derive the outage and error probabilities, starting from the two works [1] and [2].
In the next section we introduce the tools we found useful for the subject, namely Wishart matrices, the Large Deviations Principle, the MIMO outage capacity and the Gallager upper bound. In the two following sections we summarize our results: first the derivation of the outage probability, then its use to improve an upper bound on the error probability.
II. USEFUL TOOLS
A. Random Matrices: Wishart Matrices
The study of random matrices started with the work of the statistician J. Wishart in 1928. He was essentially interested in the behavior of covariance matrices with i.i.d. entries, and his work provided an expression for the joint probability of the entries of such a covariance matrix when the entries are i.i.d. $\mathcal{CN}(0, 1)$.
(The author was working under the supervision of Mr. R. Couillet, Telecom Department, Supélec, Gif-sur-Yvette.)
What about a formal definition of a random matrix? Let $(\Omega, \mathcal{F}, P)$ be a probability space, where $\mathcal{F}$ is a sigma-algebra of subsets of $\Omega$ and $P$ is a probability measure on $(\Omega, \mathcal{F})$. Let $\mathrm{Mat}_N(\mathbb{F})$ denote the space of N-by-N matrices with entries in a field $\mathbb{F}$, e.g. $\mathbb{F} = \mathbb{C}$. A random matrix $X_N$ is then defined as a measurable map from $(\Omega, \mathcal{F})$ to $\mathrm{Mat}_N(\mathbb{F})$.
The main interest of random matrices lies in their eigenvalues. Recall that the eigenvalues of $X_N$ are the roots of the characteristic polynomial $P(z) = \det(zI_N - X_N)$, so they are functions of the entries of $X_N$. We will essentially use Hermitian matrices. By the perturbation theory of normal matrices, the real eigenvalues $\lambda_i(X_N)$ are continuous functions of $X_N$; since $X_N$ is a random matrix, the eigenvalues are therefore random variables. This reduces the complexity of the problem from $O(N^2)$ random variables to just $N$ random variables, namely the eigenvalues.
The first asymptotic results were developed by the physicist E. Wigner; the main one is known as Wigner's theorem. A Wigner matrix $X_N \in \mathrm{Mat}_N(\mathbb{F})$ is a Hermitian matrix whose entries are independent and identically distributed, up to the Hermitian constraint. Let
$$L_N = N^{-1} \sum_{i=1}^{N} \delta_{\lambda_i(X_N)}$$
denote the empirical measure of the eigenvalues of $X_N$. Wigner's theorem states that, for suitably normalized entries, $L_N$ converges weakly, in probability, to the semicircle distribution defined as follows:
$$\sigma(x) = \frac{1}{2\pi} \sqrt{4 - x^2} \, \mathbf{1}_{|x| \le 2} \tag{1}$$
That is, for any bounded continuous function $f$ on the real line and any $\epsilon > 0$:
$$\lim_{N \to \infty} P\left( |\langle L_N, f \rangle - \langle \sigma, f \rangle| > \epsilon \right) = 0$$
Let us go back to the beginning of this section and present some useful results on Wishart matrices. Let $Y$ be an M-by-N rectangular matrix with Gaussian entries over $\mathbb{F}$ (we take $M \ge N$); then $X_N = Y^H Y$ is a Wishart matrix, a square covariance matrix. In the case $\mathbb{F} = \mathbb{C}$, the joint distribution of the real eigenvalues $\lambda_i = \lambda_i(X_N)$ is
$$P(\lambda_1, \dots, \lambda_N) \propto \prod_{i} \lambda_i^{M-N} \prod_{i<j} (\lambda_i - \lambda_j)^2 \, e^{-\sum_{i=1}^{N} \lambda_i} \tag{2}$$
A physical interpretation of this expression is known as the 2D Coulomb gas method:
$$P(\lambda_1, \dots, \lambda_N) \propto e^{-E(\lambda)} \tag{3}$$
where $E$ is the energy of a system of particles confined to a line, each at a position given by an eigenvalue:
$$E(\lambda) = \sum_{i=1}^{N} \left( \lambda_i - (M-N) \log \lambda_i \right) - \sum_{i \ne j} \log |\lambda_i - \lambda_j| \tag{4}$$
Therefore
$$P[\lambda_1 \le t, \dots, \lambda_N \le t] = P[\lambda_{\max} \le t] = \frac{Z_N(t)}{Z_N(\infty)} \tag{5}$$
where
$$Z_N(t) = \int_0^t \cdots \int_0^t \prod_i d\lambda_i \, \exp[-E(\lambda)]$$
A result due to Marchenko and Pastur, dating back to 1967, states that the empirical measure of the eigenvalues of the suitably normalized Wishart matrix converges weakly to the Marchenko-Pastur distribution, given by:
$$f_{MP}(x) = \frac{1}{2\pi x} \sqrt{(x_+ - x)(x - x_-)} \tag{6}$$
such that
$$x_\pm = \left( 1 \pm \sqrt{M/N} \right)^2$$
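The convergence to the Marchenko-Pastur law can be checked numerically. The sketch below is only an illustration under stated assumptions: entries of $Y$ are i.i.d. $\mathcal{CN}(0,1)$, the matrix is normalized as $Y^H Y / N$, and $M \ge N$; the function names and the sizes `M = 400`, `N = 200` are hypothetical choices, not taken from the report.

```python
import numpy as np

def trapezoid(y, x):
    # Plain trapezoidal rule, written out to avoid depending on a NumPy version.
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def wishart_eigenvalues(M, N, rng):
    # Y is M-by-N with i.i.d. CN(0, 1) entries; return the eigenvalues of Y^H Y / N.
    Y = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2.0)
    return np.linalg.eigvalsh(Y.conj().T @ Y / N)

def mp_density(x, c):
    # Density (6) with c = M/N >= 1 and edges x_pm = (1 +/- sqrt(c))^2.
    x_minus, x_plus = (1 - np.sqrt(c)) ** 2, (1 + np.sqrt(c)) ** 2
    inside = np.clip((x_plus - x) * (x - x_minus), 0.0, None)
    return np.sqrt(inside) / (2 * np.pi * x)

rng = np.random.default_rng(0)
M, N = 400, 200                      # assumed sizes, ratio M/N = 2
eigs = wishart_eigenvalues(M, N, rng)
x_minus, x_plus = (1 - np.sqrt(M / N)) ** 2, (1 + np.sqrt(M / N)) ** 2
grid = np.linspace(x_minus, x_plus, 20001)
mass = trapezoid(mp_density(grid, M / N), grid)
print(f"density mass ~ {mass:.4f}")
print(f"eigenvalues in [{eigs.min():.3f}, {eigs.max():.3f}], "
      f"predicted edges [{x_minus:.3f}, {x_plus:.3f}]")
```

A histogram of `eigs` plotted against `mp_density` gives the usual visual comparison; the script only prints the support edges and the total mass of the density.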
B. Large Deviations Principle
The theory of large deviations is concerned with the probabilities of very rare events, and especially with the rates at which these probabilities decay as a natural parameter of the problem varies. Let $P_n$ be a sequence of probability distributions on $(X, \mathcal{B})$, where $X$ is a complete separable metric space (Polish space) and $\mathcal{B}$ is its Borel sigma-field. We say that $P_n$ satisfies a Large Deviation Principle (LDP) with rate function $I(x)$ if the following properties hold:
• The function $I(x) \ge 0$ is lower semicontinuous and the level sets $K_l = \{x : I(x) \le l\}$ are compact for any finite $l$.
• For any closed set $C \subset X$ we have
$$\limsup_{n \to \infty} \frac{1}{n} \log P_n[C] \le -\inf_{x \in C} I(x) \tag{7}$$
• For any open set $G \subset X$ we have
$$\liminf_{n \to \infty} \frac{1}{n} \log P_n[G] \ge -\inf_{x \in G} I(x) \tag{8}$$
A consequence of the definition is the following theorem.
Theorem 1: Let $P_n$ satisfy an LDP on $X$ with rate function $I$, and let $F : X \to \mathbb{R}$ be a bounded continuous function. Then
$$\lim_{n \to \infty} \frac{1}{n} \log \int e^{nF(x)} \, dP_n = \sup_{x} [F(x) - I(x)] \tag{9}$$
Another consequence, very important for our case, concerns the probability distribution $Q_n$ defined by
$$Q_n(A) = \frac{\int_A e^{nF(x)} \, dP_n}{\int_X e^{nF(x)} \, dP_n} \tag{10}$$
for Borel subsets $A \subset X$. Then $Q_n$ satisfies an LDP on $X$ as well, with the new rate function $J$ given by
$$J(x) = \sup_{y \in X} [F(y) - I(y)] - [F(x) - I(x)] \tag{11}$$
This theorem is used in [1], as we will see hereafter, where it is referred to as Varadhan's lemma.
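To make the decay rate in the LDP concrete, here is a minimal sketch on a toy case that is not part of the report: the empirical mean of $n$ Bernoulli($p$) variables, whose rate function is the classical Cramér expression $I(x) = x \log(x/p) + (1-x)\log((1-x)/(1-p))$. The function names and the values `x = 0.7`, `p = 0.5` are assumptions chosen for illustration.

```python
import math
import numpy as np

def bernoulli_rate(x, p):
    # Cramér rate function of the empirical mean of Bernoulli(p) variables.
    return x * math.log(x / p) + (1 - x) * math.log((1 - x) / (1 - p))

def log_tail(n, x, p):
    # log P(S_n / n >= x) for S_n ~ Binomial(n, p), computed in the log domain.
    ks = range(math.ceil(n * x), n + 1)
    logpmf = np.array([
        math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        + k * math.log(p) + (n - k) * math.log(1 - p)
        for k in ks
    ])
    m = logpmf.max()
    return m + math.log(np.exp(logpmf - m).sum())

x, p = 0.7, 0.5
for n in (50, 500, 2000):
    # -(1/n) log P_n should approach I(x) as n grows.
    print(n, -log_tail(n, x, p) / n, bernoulli_rate(x, p))
```

The printed decay rates approach $I(x)$ from above, the gap being of order $\log(n)/n$, which is the sub-exponential correction that the LDP ignores.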
C. MIMO Outage Capacity
Another concept to clarify is the channel capacity for single-user communications, in the Shannon-theoretic sense. For a time-invariant channel, the capacity is defined as the maximum mutual information between the channel input and output; Shannon's theorem shows that the capacity is the maximum data rate that can be transmitted over the channel with arbitrarily small error probability.
When the channel is time-varying, the definition of the capacity depends on the channel knowledge at both sides of the channel. If the Channel State Information (CSI), i.e. the instantaneous channel gains, is perfectly known at both sides, the ergodic capacity is defined as the maximum mutual information averaged over all channel states. When the CSI is known only at the receiver, the channel coefficients are typically assumed to be jointly Gaussian, so the channel is specified by its first two moments, i.e. the mean and the covariance. In this case the ergodic capacity defines the rate that can be achieved; however, the transmitter may send data at a rate that cannot be supported by all channel states, and an outage is then declared by the receiver in poor channel states.
On the other hand, for the non-ergodic channel, in which the channel is drawn at the beginning of the communication and remains constant over all channel uses, Telatar [3] showed that the maximum mutual information is in general not equal to the channel capacity and that the Shannon capacity is zero. In this case, one speaks of a tradeoff between outage probability and supportable rate.
Consider a transmitter with M transmit antennas and a receiver with N receive antennas. The channel can be represented by an N-by-M matrix $H$. The received signal is $y = Hx + z$, where $x$ is the transmitted signal vector and $z$ is the AWGN vector, normalized so that its covariance matrix is the identity matrix. We now give the expression of the channel capacity, from [5], in each of the cases discussed above:
• When the channel is constant and perfectly known at the transmitter and the receiver, with $Q$ the input covariance matrix:
$$C = \max_{Q : \mathrm{Tr}(Q) = P} \log \left| I_N + H Q H^H \right| \tag{12}$$
• For a fading MIMO channel with perfect CSI at both the transmitter and the receiver:
$$C = E_H \left[ \max_{Q : \mathrm{Tr}(Q) = P} \log \left| I_N + H Q H^H \right| \right] \tag{13}$$
• For a fading MIMO channel with perfect CSI at the receiver only:
$$C = \max_{Q : \mathrm{Tr}(Q) = P} E_H \left[ \log \left| I_N + H Q H^H \right| \right] \tag{14}$$
• And for the non-ergodic channel, the outage probability is given by:
$$P_{out}(R, P) = \inf_{Q \ge 0 : \mathrm{Tr}(Q) \le P} P \left[ \log \left| I_N + H Q H^H \right| < R \right] \tag{15}$$
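The outage probability (15) can be estimated by Monte Carlo simulation. The sketch below makes two simplifying assumptions that are not in the report: the infimum over $Q$ is replaced by the isotropic choice $Q = (P/M) I_M$, so the estimate is an upper bound on (15), and $H$ has i.i.d. $\mathcal{CN}(0,1)$ entries. Function names and parameter values are hypothetical; rates are in nats.

```python
import numpy as np

def mutual_information(H, Q):
    # log|I_N + H Q H^H| in nats, via a numerically stable log-determinant.
    N = H.shape[0]
    _, logdet = np.linalg.slogdet(np.eye(N) + H @ Q @ H.conj().T)
    return float(logdet)

def outage_probability(R, P, M, N, trials, rng):
    # Fraction of channel draws whose mutual information falls below R,
    # for the fixed isotropic input covariance Q = (P/M) I_M (assumption).
    Q = (P / M) * np.eye(M)
    outages = 0
    for _ in range(trials):
        H = (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))) / np.sqrt(2.0)
        if mutual_information(H, Q) < R:
            outages += 1
    return outages / trials

rng = np.random.default_rng(0)
p_out = outage_probability(R=2.0, P=10.0, M=2, N=2, trials=2000, rng=rng)
print(f"estimated outage probability: {p_out:.3f}")
```

Estimating very small outage probabilities this way requires a number of trials of the order of the inverse of the probability, which is one practical motivation for analytic approximations of the tails.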
D. Gallager Upper Bound
In most cases the error probability cannot be computed directly, which is why bounds are important tools for approximating it; upper bounds are usually the more interesting ones. We first recall the trivial bound, known as the Union Bound, and then move to the Gallager bound as a generalization of the Bhattacharyya bound, assuming random coding and ML decoding.
The Union Bound rests on the trivial inequality stating that the probability of a union of events is upper bounded by the sum of the probabilities of the individual events. Given a set of signals $s_1, s_2, \dots, s_N$, the Union Bound on the probability of error given that the k-th signal was sent is
$$P(E|s_k) \le \sum_{i \ne k} P(E_{i,k}|s_k) \tag{16}$$
For orthogonal signals, $|s_i - s_k| = \sqrt{2E_s}$ for $i \ne k$, and
$$P(E) \le (N - 1) \, Q\left( \sqrt{E_s / N_0} \right)$$
The bound (16) becomes an equality if the events are disjoint; otherwise it can be very loose. The looseness of the Union Bound comes from the fact that intersections of half-spaces related to codewords other than the transmitted one are counted more than once.
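We now present the Bhattacharyya bound. Here also we assume that $s_k$ was sent; then
$$E_{i,k} = \{ r : p(r|s_i) > p(r|s_k) \}$$
We define the indicator function
$$\phi(r) = \begin{cases} 1, & r \in E_{i,k} \\ 0, & \text{otherwise} \end{cases}$$
Thus,
\begin{align}
P(E_{i,k}|s_k) &= \int_{E_{i,k}} p(r|s_k) \, dr \tag{17} \\
&= \int_{\text{all } r} \phi(r) \, p(r|s_k) \, dr \tag{18}
\end{align}
Using the fact that, for $r \in E_{i,k}$,
$$\phi(r) = 1 \le \sqrt{\frac{p(r|s_i)}{p(r|s_k)}}$$
(and $\phi(r) = 0$ elsewhere), we get the Bhattacharyya bound,
$$P(E|s_k) \le \int_r \sum_{i \ne k} \sqrt{p(r|s_i) \, p(r|s_k)} \, dr$$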
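A generalization of this bound can be deduced from the fact that, writing $\theta \ge 0$ for the free parameter of the bound,
$$E_k \subset \bar{E}_k = \left\{ r : \sum_{i \ne k} \left( \frac{p(r|s_i)}{p(r|s_k)} \right)^{\frac{1}{1+\theta}} \ge 1 \right\}$$
hence we get the Gallager bound,
$$P(E|s_k) \le \int_r p(r|s_k)^{\frac{1}{1+\theta}} \left[ \sum_{i \ne k} p(r|s_i)^{\frac{1}{1+\theta}} \right]^{\theta} dr, \quad \text{with } \theta \ge 0 \tag{19}$$
Assuming random coding, we obtain the Gallager upper bound on the average error probability:
$$P(E) \le e^{-N (E_0(\theta, P) - \theta R)}$$
where
$$E_0(\theta, P) = -\log E_H \int_y \left[ \int_x P(x) \, p(y|x, H)^{\frac{1}{1+\theta}} \, dx \right]^{1+\theta} dy$$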
III. OUTAGE PROBABILITY: LD APPROACH
A. Summary of The Approach
The authors of [1] look for an analytic expression of the probability density function and the outage probability of the MIMO mutual information, using techniques from large deviations theory.
Previous approaches have shown that the mutual information tends asymptotically to a Gaussian behavior for a large number of antennas (N). This approximation is acceptable around the mean of the mutual information, i.e. the ergodic capacity, but it fails to capture the tails of the distribution, especially when the rate drops below half of the ergodic capacity. There are many variants of the Gaussian approximation, which fall into two methods: the large-N, fixed-SNR limit on the one hand, and the large-SNR, fixed-N limit on the other. Both fail to produce quantitative results for the outage probability outside their respective regions.
The tails of the mutual information PDF are very important because they correspond to the rate region of very low outage probability, where one would want to operate. One therefore still needs other approaches that give more relevant results on the mutual information PDF for arbitrary SNR and rate. That is the goal of [1], which uses a large deviations analysis.
We focus on i.i.d. Gaussian noise and input, with the MIMO channel model
$$y = Hx + z$$
The mutual information for a given channel matrix $H$ is
$$I_N = \log \left| I + \rho H^H H \right| \tag{20}$$
where $\rho$ is the signal-to-noise ratio and $H$ is an M-by-N matrix whose elements are independent $\mathcal{CN}(0, 1/N)$. The numbers N of transmit antennas and M of receive antennas are large, and the ratio $\beta = M/N$ stays finite. We can write (20) in terms of the eigenvalues $\lambda_k$ of the Wishart matrix $H^H H$:
$$I_N = \sum_{k=1}^{N} \log(1 + \rho \lambda_k) \tag{21}$$
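The equivalence between the determinant form (20) and the eigenvalue form (21) is easy to confirm numerically. In this minimal sketch, the variable names, the sizes `M = 6`, `N = 4` and the SNR value `rho = 10.0` (linear scale) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, rho = 6, 4, 10.0   # assumed sizes and SNR (linear scale)

# H is M-by-N with independent CN(0, 1/N) entries.
H = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2.0 * N)
W = H.conj().T @ H       # N-by-N Wishart matrix H^H H

# Equation (20): log-determinant form.
_, i_det = np.linalg.slogdet(np.eye(N) + rho * W)

# Equation (21): sum over the eigenvalues of H^H H.
lam = np.linalg.eigvalsh(W)
i_eig = float(np.sum(np.log1p(rho * lam)))

print(i_det, i_eig)
```

The eigenvalue form is the one that matters for the analysis, since it expresses $I_N$ as a linear statistic of the eigenvalues whose joint law is given by (2).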
We now use the distribution given in (2). Before that, we use an asymptotic expression of the energy term (4), written as a functional of the eigenvalue density $p$:
\begin{align}
E(p) = {} & \int x \, p(x) \, dx - (\beta - 1) \int p(x) \log(x) \, dx \tag{22} \\
& - \iint p(x) \, p(y) \log |x - y| \, dx \, dy \tag{23}
\end{align}
Then the probability density function of the normalized mutual information $I_N/N$ can be written, as in (5),
$$P_N(r) = \frac{Z_N(r)}{Z_N} \tag{24}$$
This expression corresponds to (10). By applying Varadhan's lemma (11), we get
$$\lim_{N \to \infty} \frac{1}{N^2} \log P_N(r) = E_0 - E_1(r) \tag{25}$$
where
$$E_0 = \inf_{p \in X} E(p) \tag{26}$$
$$E_1(r) = \inf_{p \in X_r} E(p) \tag{27}$$
and $p$ is a probability density, such that
$$X = \left\{ p : \text{the expectation } \int x \, p(x) \, dx \text{ exists and, for some } \epsilon > 0, \ \int |p(x)|^{1+\epsilon} \, dx < \infty \right\}$$
$$X_r = \left\{ p \in X : \int p(x) \log(1 + \rho x) \, dx = r \right\}$$
To obtain (27), it suffices to solve the following optimization problem:
$$\min E(p) \quad \text{subject to } p \in X_r \tag{28}$$
We can then deduce (26) by simply setting to zero the Lagrange multiplier corresponding to the constraint
$$\int p(x) \log(1 + \rho x) \, dx = r$$
The derivation of the solution is given in detail in Appendix C of [1].
The evaluation of $E_0$ is relatively easier, because it involves only one constraint:
E
0
=

2
32
+
a
2
log
1
2
log(a)

2
_
G
_
0,
a
_
+
1
2
G
_
a
,
a
_
_
(29)
where $\Delta = b - a$, the function
$$G(x, y) = \frac{1}{\pi} \int_0^1 \sqrt{t(1 - t)} \, \frac{\log(t + x)}{t + y} \, dt$$
is computed in [6], and $[a, b]$ is the support of the distribution $p$. When $\beta = 1$, $E_0 = 3/2$.
The evaluation of $E_1(r)$ is more difficult: this time two constraints have to be taken into account, so the result depends on $\rho$. We focus on the case $\beta = 1$, because it is the case we simulated. Since the support of $p$ cannot extend below zero, $a \ge 0$, and this case itself splits into two cases:
• $a = 0$:
$$p(x) = \frac{\sqrt{b - x}}{2\pi (1 + \rho x) \sqrt{x}} \left[ 1 + \rho x - \frac{k \rho}{\sqrt{1 + \rho b}} \right] \tag{30}$$
such that
$$k = \frac{b/2 - 2}{1 - 1/\sqrt{1 + \rho b}}$$
• $a > 0$:
$$p(x) = \frac{\rho}{2\pi} \, \frac{\sqrt{(b - x)(x - a)}}{1 + \rho x} \tag{31}$$
where $a$ and $b$ are the solutions of the equation
$$x^2 - \left( 2k + 4 - \frac{2}{\rho} \right) x + k^2 - \frac{2k + 4}{\rho} + \frac{1}{\rho^2} = 0$$
which leads to $a = (\sqrt{k + 1} - 1)^2 - \frac{1}{\rho}$ and $b = (\sqrt{k + 1} + 1)^2 - \frac{1}{\rho}$.
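A quick consistency check on the two optimal densities for $\beta = 1$ is that each integrates to one. The sketch below assumes the forms $p(x) = \frac{\sqrt{b-x}}{2\pi(1+\rho x)\sqrt{x}}\big[1+\rho x - \frac{k\rho}{\sqrt{1+\rho b}}\big]$ with $k = \frac{b/2-2}{1-1/\sqrt{1+\rho b}}$ for $a = 0$, and $p(x) = \frac{\rho}{2\pi}\frac{\sqrt{(b-x)(x-a)}}{1+\rho x}$ for $a > 0$, with $\rho$ the SNR. Function names, the quadrature scheme and the test values are assumptions made for illustration.

```python
import numpy as np

def trapezoid(y, x):
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

def mass_a_zero(b, rho, n=20001):
    # Total mass of the a = 0 density on [0, b].
    k = (b / 2 - 2) / (1 - 1 / np.sqrt(1 + rho * b))
    phi = np.linspace(0.0, np.pi / 2, n)
    x = b * np.sin(phi) ** 2
    # Substitution x = b sin^2(phi): sqrt((b-x)/x) dx = 2 b cos^2(phi) dphi,
    # which removes the integrable singularity at x = 0.
    smooth = (1 + rho * x - k * rho / np.sqrt(1 + rho * b)) / (2 * np.pi * (1 + rho * x))
    return trapezoid(smooth * 2 * b * np.cos(phi) ** 2, phi)

def mass_a_positive(k, rho, n=20001):
    # Total mass of the a > 0 density on [a, b], with a, b functions of k.
    a = (np.sqrt(k + 1) - 1) ** 2 - 1 / rho
    b = (np.sqrt(k + 1) + 1) ** 2 - 1 / rho
    mid, half = (a + b) / 2, (b - a) / 2
    phi = np.linspace(0.0, np.pi, n)
    x = mid + half * np.cos(phi)
    # Substitution x = mid + half cos(phi): sqrt((b-x)(x-a)) dx -> half^2 sin^2(phi) dphi.
    smooth = rho / (2 * np.pi * (1 + rho * x))
    return trapezoid(smooth * (half * np.sin(phi)) ** 2, phi)

rho = 10.0
print(mass_a_zero(b=3.0, rho=rho))      # a = 0 branch, rate below the ergodic one
print(mass_a_zero(b=4.0, rho=rho))      # k = 0: Marchenko-Pastur law for beta = 1
print(mass_a_positive(k=1.0, rho=rho))  # a > 0 branch
```

At $b = 4$ the multiplier $k$ vanishes and the $a = 0$ density reduces to the Marchenko-Pastur law for $\beta = 1$, which is the unconstrained minimizer associated with $E_0$.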
B. Evaluation of the Outage Probability
In this section we derive the expression of the outage probability, defined as
$$P_{out}(r) = P(I_N / N < r) \tag{32}$$
To do so, we use the probability density of the normalized mutual information given by (25).
First, we compute the normalization factor of $P_N$. It suffices to evaluate it at $r \approx r_{erg}$, where $r_{erg}$ is the ergodic average of $I_N/N$ and where the Gaussian approximation is valid, which leads to
$$P_N(r) \approx \frac{N}{\sqrt{2\pi v_{erg}}} \, e^{-N^2 (E_1(r) - E_0)} \tag{33}$$
To obtain $P_{out}$, we need the following lemma.
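Lemma 1 (Watson's lemma): Suppose $f(t) = O(e^{at})$ as $t \to \infty$ and that, in some neighborhood of $t = 0$, $f(t)$ can be expanded as
$$f(t) = t^{\alpha} \left[ \sum_{k=0}^{n} a_k t^k + R_{n+1}(t) \right], \quad 0 < t < \delta, \ \alpha > -1$$
where $|R_{n+1}(t)| < A t^{n+1}$ for $0 < t < \delta$. Then
$$F(s) = \int_0^{\infty} e^{-st} f(t) \, dt$$
has the asymptotic expansion
$$F(s) \sim \sum_{k=0}^{n} a_k \frac{\Gamma(\alpha + k + 1)}{s^{\alpha + k + 1}} + O\left( s^{-(\alpha + n + 2)} \right), \quad s \to \infty$$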
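Therefore, writing $E_1'$ and $E_1''$ for the first and second derivatives of $E_1$ with respect to $r$:
When $r < r_{erg}$,
$$P_{out}(r) \approx Q\left( \frac{N |E_1'(r)|}{\sqrt{E_1''(r)}} \right) \frac{e^{-N^2 \left( E_1(r) - E_0 - \frac{E_1'(r)^2}{2 E_1''(r)} \right)}}{\sqrt{E_1''(r) \, v_{erg}}} \tag{34}$$
and when $r > r_{erg}$,
$$P_{out}(r) \approx 1 - Q\left( \frac{N |E_1'(r)|}{\sqrt{E_1''(r)}} \right) \frac{e^{-N^2 \left( E_1(r) - E_0 - \frac{E_1'(r)^2}{2 E_1''(r)} \right)}}{\sqrt{E_1''(r) \, v_{erg}}} \tag{35}$$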
Proof:
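A sketch of the argument for $r < r_{erg}$, under the assumption that a second-order Taylor expansion of $E_1$ around $r$ is accurate enough (the report invokes Watson's lemma for the asymptotic evaluation; the shortcut below is only meant to show where each factor of (34) comes from). Here $E_1'$ and $E_1''$ denote the first and second derivatives of $E_1$, with $E_1'(r) < 0$ and $E_1''(r) > 0$ assumed for $r < r_{erg}$, and the lower integration limit is extended to $-\infty$ at the cost of exponentially small terms.

```latex
\begin{align*}
P_{out}(r) &= \int_{-\infty}^{r} P_N(x)\,dx
  \approx \frac{N}{\sqrt{2\pi v_{erg}}}
  \int_{0}^{\infty} e^{-N^2\left(E_1(r-u) - E_0\right)}\,du, \\
E_1(r-u) &\approx E_1(r) + |E_1'(r)|\,u + \tfrac{1}{2}E_1''(r)\,u^2, \\
\int_{0}^{\infty} e^{-N^2\left(|E_1'|u + \frac{1}{2}E_1''u^2\right)}du
  &= e^{\frac{N^2 E_1'^2}{2E_1''}}
     \int_{0}^{\infty} e^{-\frac{N^2E_1''}{2}\left(u + \frac{|E_1'|}{E_1''}\right)^2}du
   = \frac{\sqrt{2\pi}}{N\sqrt{E_1''}}\;
     e^{\frac{N^2 E_1'^2}{2E_1''}}\;
     Q\!\left(\frac{N|E_1'|}{\sqrt{E_1''}}\right).
\end{align*}
```

Multiplying the last line by $\frac{N}{\sqrt{2\pi v_{erg}}} e^{-N^2(E_1(r) - E_0)}$ gives the prefactor $1/\sqrt{E_1''(r) v_{erg}}$, the exponent $-N^2\big(E_1(r) - E_0 - E_1'(r)^2 / (2E_1''(r))\big)$ and the $Q$-function of (34). For $r > r_{erg}$ the same computation applied to the complementary event $\{I_N/N > r\}$ gives (35).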
C. Simulations and Discussion
The simplest approximation of the outage probability is the Gaussian approximation, which is valid in the region where $r \approx r_{erg}$, with $r_{erg}$ the ergodic average of the normalized mutual information,
= (1 ) log u + log(1 +u) +u
1
1
The ergodic variance is given by,
v
erg
= log
_
1
(1 u)
2
u
2
_
where,
u =
1
2
(1 + +( 1) +
_
(1 +( 1))
2
+ 4
Then we obtain
$$P_{Gout}(r) = 1 - Q\left( N \, \frac{r - r_{erg}}{\sqrt{v_{erg}}} \right) \tag{36}$$
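The Gaussian approximation (36) can be compared with a direct Monte Carlo estimate. In the sketch below, the closed forms of $r_{erg}$ and $v_{erg}$ are deliberately not used: both are replaced by sample estimates (mean of $I_N/N$, and $N^2$ times its variance), which is an assumption of this illustration. The function names, the 4x4 size and the SNR value (linear scale) are also hypothetical.

```python
import math
import numpy as np

def q_function(x):
    # Gaussian tail function Q(x) = P(Z > x) for a standard normal Z.
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def normalized_mi_samples(M, N, rho, trials, rng):
    # Samples of I_N / N for H of size M-by-N with CN(0, 1/N) entries.
    samples = np.empty(trials)
    for t in range(trials):
        H = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) / np.sqrt(2.0 * N)
        lam = np.clip(np.linalg.eigvalsh(H.conj().T @ H), 0.0, None)
        samples[t] = np.sum(np.log1p(rho * lam)) / N
    return samples

def gaussian_outage(r, N, r_erg, v_erg):
    # Equation (36).
    return 1.0 - q_function(N * (r - r_erg) / math.sqrt(v_erg))

rng = np.random.default_rng(1)
M = N = 4
rho = 10.0
samples = normalized_mi_samples(M, N, rho, 4000, rng)
r_erg_hat = float(samples.mean())          # stand-in for r_erg
v_erg_hat = float(N ** 2 * samples.var())  # stand-in for v_erg

r = r_erg_hat - 0.2
empirical = float(np.mean(samples < r))
approx = gaussian_outage(r, N, r_erg_hat, v_erg_hat)
print(f"r = {r:.3f}: empirical {empirical:.4f}, Gaussian {approx:.4f}")
```

Close to the mean the two numbers agree well; moving `r` further into the lower tail is where the discrepancy discussed in this section is expected to appear, although resolving it by simulation requires many more trials.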
To simulate the outage probability given by (34) and (35) for the case $\beta = 1$, we derived all the equations needed; they are presented in Appendix A.
Some of the simulation results are given in Fig. 1, Fig. 2 and Fig. 3. We make two remarks. First, our results, especially the $P_{LDout}$ curves, differ from those obtained in [1]. Nevertheless, our results do not contradict the approach, because we expect the LD approach to give an accurate expression of the outage probability at small error probabilities, which is what our simulations show. Second, the authors of [1] succeeded in obtaining good results even at higher error probabilities. However, we are still far from an exact analytical expression of the outage probability, because a rigorous treatment of the large-deviations region was not pursued.
IV. DECODING ERROR PROBABILITY
We use the outage probability (34) and (35) together with the Gallager bound (19) to derive an upper bound on the decoding error probability, assuming random coding.
According to [2], a decoding error occurs if the channel matrix is atypically ill-conditioned (which leads to outage), if the noise is atypically large, or if some codewords are atypically close to each other.
We write
\begin{align}
P_e(r) &= P_{outage}(r) \, P_{e|outage}(r) + (1 - P_{outage}(r)) \, P_{e|no\ outage}(r) \notag \\
&\le P_{outage}(r) + P_{e|no\ outage}(r) \tag{37}
\end{align}
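Let us now derive an upper bound on $P_{e|no\ outage}(r)$ by using the Gallager bound (19) and optimizing over its parameter $\theta$:
$$P_{e|no\ outage}(r) \le \exp(-N E(r)) \tag{38}$$
where
$$E(r) = \max_{\theta \in [0,1]} \{ E_0(\theta) - \theta r \} \tag{39}$$
and
$$E_0(\theta) = -\log E_H \left[ \det\left( I + \frac{\rho}{1+\theta} H^H H \right)^{-\theta} \right] = -\log E_H \left[ \prod_{k=1}^{N} \left( 1 + \frac{\rho}{1+\theta} \lambda_k \right)^{-\theta} \right] \tag{40}$$
Fig. 1: Outage probability: MIMO 2x2, $\rho$ = 10 dB
Fig. 2: MIMO 2x2, $\rho$ = 10 dB
Fig. 3: MIMO 3x3, $\rho$ = 10 dB
Using Jensen's inequality,
$$E_0(\theta) \le E_H \left[ -\log \prod_{k=1}^{N} \left( 1 + \frac{\rho}{1+\theta} \lambda_k \right)^{-\theta} \right] = \theta \, E_H \left[ \sum_{k=1}^{N} \log\left( 1 + \frac{\rho}{1+\theta} \lambda_k \right) \right] = \theta \, E_H(I_N)$$
where $I_N$ is exactly the mutual information at signal-to-noise ratio $\frac{\rho}{1+\theta}$.
In the following, we assume
$$E_0(\theta) \approx \theta \, E_H(I_N)$$
which simply means that we are interested in an asymptotic region where the mutual information behaves as an affine function.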
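Therefore,
\begin{align*}
P_{e|no\ outage}(r) &\le \min_{\theta \in [0,1]} e^{-\theta E_H(N I_N) + \theta N r} \\
&\le \min_{\theta \in [0,1]} E_H\left[ e^{-\theta N I_N} \right] e^{\theta N r} \\
&= \min_{\theta \in [0,1]} E_{I_N}\left[ e^{-\theta N I_N} \right] e^{\theta N r} \\
&= \min_{\theta \in [0,1]} \int e^{-\theta N I_N} \, dP_N \; e^{\theta N r}
\end{align*}
This expression still needs to be expanded and then checked.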
V. CONCLUSION
To sum up, we considered a MIMO system in an asymptotic regime and attempted to obtain its outage and error probabilities. The project was split into two parts: first, reviewing the work [1] and simulating some of its results; second, starting from a work on error probability [2], using the results of [1] and the Gallager bound to obtain an improvement. For the first part, we obtained simulation results that differ from those given in [1], but our simulations show that the LD approximation behaves as expected at small error probabilities. The second part still needs work to obtain an analytic expression of a tight upper bound on the decoding error probability.
APPENDIX A
SIMULATION EQUATIONS
We present the equations used to simulate the outage probability (34) for the case $\beta = 1$, where $[a, b]$ is the support of the density and $k$ is the Lagrange multiplier of the rate constraint:
When a = 0,
k
r
=
1
B
k
r
=
1
AB
s.t
A =
2(b + 1)
3
2
3b + 4 2
4
1 +b(2 +b 2
1 +b
B = 2 log
1 +
1 +b
2

1
2
log(1 +b)
where k and b are the solution of the following
equation, for a given r:
b
3
+ (4k
2
4k + 16 8 + 16k)b
b
+ (1 4k 8)b
2
+ 16k + 16 = 1
The rst two derivatives of E
1
(r) are as follows:
E
1
r
=
k
2
+
1
B
_
r
2

b
8
log
1 +
1 +b
2
_
+
1
AB
_
1
8
+
3b
16

1
b

k
8

k
2(1 +b +
1 +b)
_
and
2
E
1
r
2
=
1
B
1
AB
_
1
4
+

2(1 +b +
1 +b)
+

2(1 +b)
_
+
1
AB
_
3
16
+
1
b
2
+
k
2
(2
1 +b + 1)
2(b + 1)
3
2
(1 +
1 +b)
2
_
C
AB
3
_
r
2

b
8
log
1 +
1 +b
2
_
AC +BD
(AB)
3
_
1
8
+
3b
16

1
b
k
8

k
2(1 +b +
1 +b)
_
Where,
C =

2
b
2(1 +b)(1 +
1 +b)
2
and
D =
32(1 +
1 +b) b(b(b + 12) + 32
1 +b + 48
8b
3
(1 +b)
3
2
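When $a > 0$:
$$a = (\sqrt{k + 1} - 1)^2 - \frac{1}{\rho} ; \quad b = (\sqrt{k + 1} + 1)^2 - \frac{1}{\rho}$$
and $k$ is obtained for a given $r$.
$$\frac{\partial k}{\partial r} = \frac{1}{\log\left( 1 + \frac{1}{k} \right)}$$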
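The derivative $\partial k / \partial r = 1/\log(1 + 1/k)$ for the $a > 0$ branch can be checked by computing the rate $r(k) = \int p(x) \log(1 + \rho x)\,dx$ numerically and differentiating by finite differences. This is only a sketch: it assumes the density $p(x) = \frac{\rho}{2\pi}\frac{\sqrt{(b-x)(x-a)}}{1+\rho x}$ with $a, b = (\sqrt{k+1} \mp 1)^2 - 1/\rho$ and $\rho$ the SNR; the function names, the quadrature and the test point are illustrative choices.

```python
import math
import numpy as np

def rate_from_k(k, rho, n=20001):
    # r(k) = integral of p(x) log(1 + rho x) over [a, b] for the a > 0 branch.
    a = (math.sqrt(k + 1) - 1) ** 2 - 1 / rho
    b = (math.sqrt(k + 1) + 1) ** 2 - 1 / rho
    mid, half = (a + b) / 2, (b - a) / 2
    phi = np.linspace(0.0, np.pi, n)
    x = mid + half * np.cos(phi)
    # With x = mid + half cos(phi): sqrt((b-x)(x-a)) dx -> half^2 sin^2(phi) dphi.
    integrand = (rho * np.log1p(rho * x) / (2 * np.pi * (1 + rho * x))
                 * (half * np.sin(phi)) ** 2)
    return float(np.sum((integrand[1:] + integrand[:-1]) * np.diff(phi)) / 2.0)

def dk_dr_numeric(k, rho, eps=1e-4):
    # Central finite difference of r(k), inverted to give dk/dr.
    return 2 * eps / (rate_from_k(k + eps, rho) - rate_from_k(k - eps, rho))

rho, k = 10.0, 1.0
print(dk_dr_numeric(k, rho), 1 / math.log(1 + 1 / k))
```

In a simulation, the same `rate_from_k` function can be inverted numerically (for instance by bisection on `k`) to obtain $k$ for a given $r$, which is the step the appendix refers to.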
Then
E
1
r
=
r log +
1
2
2 log
_
1 +
1
k
_ +
k 1
2
+
log k
log
_
1 +
1
k
_
and,
2
E
1
r
2
=
1
log
_
1 +
1
k
_+
r log +
1
2
(k
2
+k) log
3
_
1 +
1
k
_
+
1
2k log
2
_
1 +
1
k
_+
log k
(k
2
+k) log
3
_
1 +
1
k
_
REFERENCES
[1] P. Kazakopoulos et al., "Living at the Edge: A Large Deviations Approach to the Outage MIMO Capacity," IEEE Transactions on Information Theory, 2011.
[2] S. Ray et al., "On Error Probability for Wideband MIMO Channels," Conference on Information Sciences and Systems, The Johns Hopkins University, March 16-18, 2005.
[3] I. E. Telatar, "Capacity of multi-antenna Gaussian channels," European Transactions on Telecommunications, vol. 10, no. 6, pp. 585-595, 1999.
[4] L. Zheng and D. N. C. Tse, "Diversity and multiplexing: A fundamental tradeoff in multiple antenna channels," IEEE Transactions on Information Theory, 2003.
[5] A. Goldsmith et al., "Capacity Limits of MIMO Channels," IEEE Journal, 2003.
[6] Y. Chen and S. M. Manning, "Some eigenvalue distribution functions of the Laguerre ensemble," J. Phys. A: Math. Gen., vol. 29, pp. 7561-7579, 1996.
[7] G. W. Anderson et al., An Introduction to Random Matrices, Cambridge University Press, 2010.
[8] R. Couillet and M. Debbah, Random Matrix Methods for Wireless Communications, Cambridge University Press, 2011.
[9] A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications, Springer, 2nd ed., 2010.
[10] S. R. S. Varadhan, Large Deviations, 2010, www.math.nyu.edu/faculty/varadhan.
[11] W. A. Pearlman, "The Gallager and Bhattacharyya Bounds, Orthogonal Signal and Random Coding Bounds," 2005, http://www.cipr.rpi.edu/~pearlman/.
