Lecture 1
h(t)
r(t)
y(t)
fc
fc
n(t)
Figure 1.1: Real bandpass channel model for point-to-point communications
The message bearing signal s(t) is a real-valued bandpass signal whose spectrum is concentrated in the
vicinity of some carrier frequency fc .
Distortions introduced by the channel are characterized by a linear time invariant system with impulse
may not be known at the receiver. In the simplest case, the response h(t) corresponds to a an ideal bandpass
filter with bandwidth corresponding to that of the signal s(t).
The additive noise process n(t) is a WSS bandpass random process. It may be idealized by White Gaussian
Noise (WGN) for the purposes of analysis.
The received signal r(t) is a real-valued bandpass process as well.
In the following we convert the bandpass channel model into a more convenient and equivalent complex
baseband channel model. (Also see chapter 4 of [1].)
Complex baseband representation for signal
) is symmetric about f = 0. Hence all of the information about
Since the signal s(t) is real, its spectrum S(f
), which we define to be
the signal s(t) is contained in the positive half of the spectrum S(f
)u(f ) .
S+ (f ) = 2S(f
(1.1)
where u() is the unit step function. The factor of 2 in the above equation makes the signal s+ (t) have the
same energy as the signal s(t). The inverse Fourier transform of the spectrum S+ (f ) is easily shown to be the
complex signal
1
s(t) + j s(t)] ,
(1.2)
s+ (t) = [
2
c V. Veeravalli, 2000
V.
)). The
where the signal s(t) is the Hilbert transform of s(t) (i.e., the Fourier transform of s(t) is jsgn(f )S(f
signal s+ (t) is called the pre-envelope of s(t). If we shift the spectrum of S+ (f ) down to the origin, we get
the baseband signal s(t) with
S(f ) = S+ (f + fc ),
(1.3)
Note that since S(f ) is not necessarily symmetric around the origin, the signal s(t) is in general complexvalued. The signal s(t) is called the complex envelope or the complex baseband representation of the real
signal s(t). From (1.2) and (1.3), we get
(1.5)
s(t) = 2[sI (t) cos 2fc t sQ (t) sin 2fc t] = 2 a(t) cos[2fc t + (t)] ,
where
q
a(t) =
sQ (t)
.
sI (t)
(1.6)
(1.7)
The signal a(t) is called the envelope of s(t), and (t) is called the phase of s(t). It is to be noted that every
bandpass signal can be written in the forms given in (1.6).
Equation (1.6) also suggests a practical way to generate the (components
of) complex envelope from the
sI (t)
2 cos 2fc t
LPF
2 sin 2fc t
2 cos 2fc t
s(t)
sQ (t)
2 sin 2fc t
References
[1] J. G. Proakis. Digital Communications. Mc-Graw Hill, New York, 3rd edition, 1995.
c V. Veeravalli, 2000
V.
Lecture 2
Note the additional factor of 2 in the equation relating h(t) and h(t).
(2.1)
Note that
y(t) = h ? s(t) = (hI + jhQ ) ? (sI + jsQ )(t)
(2.2)
(2.3)
1
w(t) = n+ (t)ej2fc t = [n(t) + j n
(t)]ej2fc t , and n(t) = Re[ 2 w(t)ej2fc t ] .
2
(2.4)
(2.5)
A complex process with the above property is said to be a proper complex process. We will study such
processes in greater detail later in Lecture 3. The autocorrelation function of w(t) is defined as
Rw ( ) = E[w(t + )w? (t)] .
1
(2.6)
For jointly WSS X(t) and Y (t), we define RXY ( ) = E[X(t + )Y (t)]
c V. Veeravalli, 2000
V.
Note the complex conjugate in the above definition. From (2.5), it is easy to show that
Rw ( ) = 2RwI ( ) + j2RwQ wI ( ) .
(2.7)
Rn ( ) = Re[Rw ( )ej2fc ] .
(2.8)
1
[Sw (f fc ) + Sw (f fc )]
2
(2.9)
and that
Sw (f ) = 2Sn (f + fc ) u(f + fc ) .
(2.10)
If n(t) is a stationary Gaussian process, then w(t) is a stationary proper complex Gaussian (PCG) process.(See Lecture 3.)
Based on the complex baseband representations of the signal, channel response and noise, we have the following complex baseband system shown in Figure 2.1 which is equivalent to the bandpass system of Figure 1.1.
Note that the signal r(t) is the complex envelope of r(t), i.e., r(t) = r+ (t)ej2fc t , t.
s(t)
h()
r(t)
y(t)
w(t)
Since SwLP (f ) is symmetric about f = 0, RwLP ( ) is a real function. From (2.7), we then conclude that the
real and imaginary parts of the process wLP (t) are uncorrelated (and hence independent). Note that
RwLP ( ) = N0 Bsinc(B ) .
(2.13)
We now idealize wLP (t) by a white noise process for the same reason we idealize the bandpass noise by WGN.
If we consider the limiting form of (2.13) as B , we get
Rw ( ) = N0 ( )
(2.14)
Again since Rw ( ) is real, from (2.7) we see that RwI wQ ( ) must be 0 for all . Also,
1
N0
Rw ( ) =
( ) .
2
2
(2.15)
1
N0
Sw (f ) =
for all f .
2
2
(2.16)
RwI ( ) = RwQ ( ) =
and
SwI (f ) = SwQ (f ) =
References
[1] J. G. Proakis. Digital Communications. Mc-Graw Hill, New York, 3rd edition, 1995.
c V. Veeravalli, 2000
V.
Lecture 3
(3.1)
(3.2)
(3.3)
Y = 0, i.e. if
Definition 3.1. Y is said to be a proper complex random vector if pseudocovariance matrix
YI = YQ and YQ YI = YI YQ .
(3.4)
(3.5)
(3.6)
In this case, if Y is proper, which means E[(Y mY )2 ] = 0, then YQ YI = YI YQ together with (3.6)
implies that YQ YI = 0, i.e. that YI and YQ are uncorrelated. Furthermore, from (3.5), we get
Y = Y2 = E[|Y mY |2 ] = 2YI = 2Y2I = 2Y2Q .
(3.7)
Thus for a complex random scalar Y to be proper, the in-phase and quadrature components must have the same
variance and be uncorrelated. Also we see that variance of Y is twice the variance of each of the components.
In the following, we give some general results for proper complex random vectors that also justify the use of
the term proper in decribing such complex random vectors.
Theorem 3.1. Let Y be a proper complex random n-vector, and suppose the random m-vector Z is defined by
Z = AY + b .
(3.8)
(3.9)
(Check: From the fact that Y is Hermitian and positive definite [1], it follows that det(Y ) is real and
positive. For the same reason the quantity inside the exponential is also real and positive. )
Note that (3.9) does not hold if Y is not proper. Proper complex Gaussian vectors are also called circularly
complex Gaussian vectors. This is because the pdf of Y is unchanged if we rotate each component about its
mean by some angle . That is, Z = (Y mY )ej + mY has the same pdf as Y (prove this!).
In the special case of a proper complex Gaussian scalar Y , the components YI and YQ are independent Gaussian
random variables with variance equal to Y2 /2, i.e.,
(yI mI )2 + (yQ mQ )2
1
exp
.
(3.10)
pY (y) = pYI YQ (yI , yQ ) =
Y2
Y2
Note that this joint pdf is circularly symmetric about the mean (mI , mQ ).
Corollary 3.1. If Y is a proper complex Gaussian (PCG) vector, then Z = AY + b is also a PCG vector.
Proper Complex Processes
Let Y (t) = YI (t) + jYQ (t) be a complex random process. Parallel to the vector case, we define covariance
and pseudocovariance functions:
CY (t + , t) = E [(Y (t + ) mY (t + ))(Y (t) mY (t))? ]
CY (t + , t) = E [(Y (t + ) mY (t + ))(Y (t) mY (t))]
c V. Veeravalli, 2000
V.
(3.11)
Q I
(3.12)
I Q
(3.13)
Note that for a zero mean process Y (t), the covariance functions C above may be replaced by correlation
functions R. Also, if YI (t) and YQ (t) are jointly WSS processes, then Y (t) is WSS, and (t + , t) in the above
equations may be replaced by .
For a proper complex process Y (t),
CY (t + , t) = 2CYI (t + , t) + j2CYQ YI (t + , t) .
(3.14)
Theorem 3.3. For any n, and any t1 , t2 , . . . , tn , the samples Y (t1 ), Y (t2 ), . . . , Y (tn ) of a proper complex
process Y (t) form a proper complex random vector
Definition 3.3. A proper complex process Y (t) is said to be proper complex Gaussian if, for all n, and all
t1 , t2 , . . . , tn , the samples Y (t1 ), Y (t2 ), . . . , Y (tn ) are jointly PCG.
We now give the continuous-time version of Theorem 3.1.
Theorem 3.4. If a proper complex process Y (t) is passed through a linear (possibly time-varying) system to
form
Z
Z(t) =
(3.15)
s=
Then Z(t) is a proper complex process as well. In addition, if Y (t) is proper complex Gaussian process, then
Z(t) a is PCG process as well.
References
[1] R. A. Horn and C. R. Johnson. Matrix Analysis. Cambridge, New York, 1985.
[2] F. D. Neeser and J. L. Massey. Proper complex random processes with applications to information theory.
IEEE Trans. Inform. Th., 39(4), July 1993.
c V. Veeravalli, 2000
V.
Lecture 4
hx(t), y(t)i =
x(u)y (u)du .
(4.1)
The inner product satisfies the necessary axioms (see, e.g., [1, Chap 3]):
hx(t), y(t)i = hy(t), x(t)i
hx(t) + y(t), z(t)i = hx(t), z(t)i + hy(t), z(t)i
hx(t), y(t)i = hx(t), y(t)i
hx(t), x(t)i 0, and hx(t), x(t)i = 0 iff x(t) = 0 for all t.
Signals x(t) and y(t) are said to be orthogonal if hx(t), y(t)i = 0. The orthogonality of x(t) and y(t) is
sometimes denoted by x(t) y(t).
Definition 4.2. (Norm) The inner product defined above induces the following norm:
p
kx(t)k = hx(t), x(t)i .
(4.2)
It is easy to show that the above quantity is a valid norm in that it satisfies the required axioms. (Based on
above, all that one needs to verify is the triangle inequality.)
Properties of Inner Product and Norm
Cauchy-Schwarz Inequality:
|hx(t), y(t)i| kx(t)kky(t)k
(4.3)
(4.4)
(4.5)
9
xi i (t) .
(4.6)
i=1
(4.7)
Suppose we further impose constraint that the complex baseband signal s(t) is approximately bandlimited to
W/2 Hz (and time-limited to [T /2, T /2], say), and impose no other constraints on the signal space. Then
the appropriate basis functions for the signal space are the Prolate Spheroidal Wave Functions (PSWFs). See
the papers by Slepian, Landau and Pollack [2, 3, 4] for a description of PSWFs. This basis is optimum in
the sense that, although there are a countably infinite number of functions in the set, at most W T of these
are enough to capture most of the energy for any signal in this signal space. So the signal space of complex
signals that are approximately bandlimited to W/2 Hz and time limited to [T /2, T /2] is approximately
finite dimensional.
More typically in communication systems, s(t) is one of M possible signals s0 (t), s1 (t), . . . , sM 1 (t). If we
let S = span{s0 (t), . . . , sM 1 (t)}, then dim(S) = n M . The signal s(t) can then be considered to belong
to the n-dim space S. One can find an orthonormal basis for S by the standard Gram-Schmidt procedure:
( (u)
0
if k0 (t)k 6= 0
(4.8)
0 (t) = s0 (t), 0 (u) = k0 (t)k
stop
otherwise
( (u)
1
if k1 (t)k 6= 0
(4.9)
1 (u) = s1 (u) hs1 (t), 0 (t)i0 (u), 1 (u) = k1 (t)k
stop
otherwise
( (u)
`1
`
X
if k` (t)k 6= 0
(4.10)
` (u) = s` (u)
hs` (t), i (t)ii (u), ` (u) = k` (t)k
stop
otherwise
i=0
For signal s(t) S, we can write
s(t) =
n
X
(4.11)
`=0
The signal s(t) S is equivalent to the vector s = [s1 s2 sn ] in the sense that
(4.12)
10
(4.13)
(4.14)
The correlation between two signals sk (t) and sm (t), which is a measure of the similarity between these two
signals, is given by
hsk (t), sm (t)i
hsk (t), sm (t)i
=
.
(4.15)
km =
ksk (t)kksm (t)k
Ek Em
The distance between two signals sk (t) and sm (t), which is also a measure of the similarity between these
two signals, is given by
1
p
2
dkm = ksk (t) sm (t)k = Ek + Em 2 Ek Em Re[k,m ] .
If Ek = Em = E, then
(4.16)
(4.17)
References
[1] D. G. Luenberger. Optimization by Vector Space Methods. Wiley, New York, 1969.
[2] D. Slepian and H. O. Pollack. Prolate Spheroidal Wave Functions, Fourier analysis and uncertainty-I. Bell
Syst. Tech. J., pages 4363, January 1961.
[3] H. J. Landau and H. O. Pollack. Prolate Spheroidal Wave Functions, Fourier analysis and uncertainty-II.
Bell Syst. Tech. J., pages 6485, January 1961.
[4] H. J. Landau and H. O. Pollack. PSWFs-III: The dimension of the space of essentially time- and bandlimted signals. Bell Syst. Tech. J., pages 12951320, July 1962.
[5] J. G. Proakis. Digital Communications. Mc-Graw Hill, New York, 3rd edition, 1995.
c V. Veeravalli, 2000
V.
11
Lecture 5
Digital Modulation
After possible source and error control encoding, we have a sequence {mn } of message symbols to be transmitted on the channel. The message symbols are assumed to come from a finite alphabet, say {0, 1, . . . , M
1}. In the simplest case of binary signaling, M = 2. Each symbol in the sequence is assigned to one of M
waveforms {s0 (t), . . . , sM 1 (t)}.
Memoryless modulation versus modulation with memory. If the symbol to waveform mapping is fixed from
one interval to the next, i.e., m 7 sm (t), then the modulation is memoryless. If the mapping from symbol
to waveform in the n-th symbol interval depends on previously transmitted symbols (or waveforms) then
the modulation is said to have memory.
For memoryless modulation, to send the sequence {mn } of symbols at the rate of 1/Ts symbols per second,
we transmit the signal
X
smn (t nTs ) .
(5.1)
s(t) =
n
Linear versus nonlinear modulation. A digital modulation scheme is said to be linear if we can write the
mapping from the sequence of symbols {mn } to the transmitted signal s(t) as concatenation of a mapping
from the sequence {mn } to a complex sequence {cn }, followed by a linear mapping from {cn } to s(t).
Otherwise the modulation is nonlinear.
Linear Memoryless Modulation
In this case, the mapping from symbols to waveforms can be written in complex baseband as:
p
sm (t) = Em ejm g(t) , m = 0, 1, . . . , M 1 ,
(5.2)
Q
sm
Em
m
s0
s1
s2
I
s3
sM 1
Representation of sm (t)
c V. Veeravalli, 2000
V.
Signal constellation
12
In real passband,
(5.3)
It is easy to see that the signal energy is the same in both the real passband and complex baseband domains
and equals Em .
The average symbol energy for the constellation is given by
Es =
M 1
1 X
Em .
M
(5.4)
m=0
The average bit energy for the constellation (assuming that M = 2 , for some integer ) is given by
Eb =
Es
Es
=
.
log2 M
(5.5)
The distance between signals sk and sm is dk,m = ksk sm k, and the minimum distance is given by
dmin = min dk,m .
k,m
(5.6)
d2min
.
Eb
(5.7)
p
d
Em = (2m + 1 M ) , m = 0, 1, . . . , M 1 .
2
(5.8)
p
2m
and Em = E , m = 0, 1, . . . , M 1 .
M
(5.9)
c V. Veeravalli, 2000
V.
13
E gm (t), m = 0, 1, . . . , M 1
(5.10)
where {gm (t)} are (possibly complex) unit energy signals, i.e., kgm (t)k = 1.
The correlation between signals sk (t) and sm (t) is given by:
km =
(5.11)
gm (t) = g (t mTs /M )
(5.12)
where g(t) is such that hg(t kTs /M ), g(t mTs /M )i = km . For example, g(t) = pTs /M (t), a
rectangular pulse of width Ts /M .
This signal set is completely orthogonal. We can also create a signal set of twice the size which satisfies
orthogonality only in the real component of the correlation by adding {jgm (t)} to the above signal set.
Separation in frequency:
(5.13)
(5.14)
(5.15)
References
[1] G. Stuber. Principles of Mobile Communication. Kluwer Academic, Norwell, MA, 1996.
[2] J. G. Proakis. Digital Communications. Mc-Graw Hill, New York, 3rd edition, 1995.
c V. Veeravalli, 2000
V.
14
Lecture 6
(6.1)
sm (t) = Eejf Bt .
To send the sequence {mn }, we transmit the signal:
X
Eejf Bn (tnTs ) .
s(t) =
(6.2)
The problem with memorlyless FSK is that abrupt frequency switching from one symbol interval to the next
can result in large spectral side lobes outside the main spectral lobe in the power spectrum of s(t). The
solution to this problem is continuous phase FSK (CPFSK)
CPFSK
Start with a real baseband PAM signal
d(t) =
Bn v(t nTs )
(6.3)
1
pT (t)
2Ts s
(6.4)
(6.5)
d( )d = 4Ts fd
"
#
Bn v( nTs ) d .
(6.6)
(t nTs )
2Ts
(6.7)
15
n
X
Bk
(6.8)
k=
Bk q(t kTs )
(6.9)
1
t
11
+ 11
.
2Ts {t[0,Ts ]} 2 {t>Ts }
(6.10)
(t; B) = 2h
k=
where
q(t) =
X
Bk hk q(t kTs )
(6.11)
(t; B) = 2
k=
where hk could be constant with k or varied cyclically between a finite set of values, and
Z t
Z t
v( )d =
v( )d
q(t) =
R
0
(6.12)
v( )d = 1/2.
t
1
1
pTs (t) , with q(t) =
11{t[0,Ts ]} + 11{t>Ts } .
2Ts
2Ts
2
The time domain raised cosine (TDRC) full response pulse is given by:
1
2t
v(t) =
1 cos
pTs (t)
2Ts
Ts
with
t
1
q(t) =
sin
2Ts 4
2t
Ts
11{t[0,Ts ]} +
1
.
11
2 {t>Ts }
(6.13)
(6.14)
(6.15)
If v(t) is of support [0, LTs ], where L > 1, the modulation is said to be partial response. For example, the
rectangular partial response pulse with L = 2 is given by
v(t) =
c V. Veeravalli, 2000
V.
1
t
1
p2Ts (t) , with q(t) =
11{t[0,2Ts ]} + 11{t>2Ts } .
4Ts
4Ts
2
(6.16)
16
For a full response TDRC pulse, the edges are sinusoidal curves.
For a L = 2 partial response rectangular pulse (see (6.16)) with h = hk ,
(t; B) = h
n2
X
Bk +
k=
State Trellis. A simpler description of the phase trajectories is given in terms of a state trellis, in which only
the transitions between phase states at the symbol boundaries is drawn. For the example of (6.17)
n = h
n1
X
Bk .
(6.19)
k=
For rational h = m/p, it is of interest to list the set of possible states in the state trellis. It is easy to show
that if m is even
(p 1)m
m 2m
s = 0,
,
,...,
(6.20)
p
p
2p
and we have a total of p states. If m is odd, we have a total of 2p states and
m 2m
(2p 1)m
s = 0,
,
,...,
, .
(6.21)
p
p
2p
For the example of (6.18), with h = m/p,
n = h
n2
X
k=
Bk +
hBn1
.
2
(6.22)
Here we have a maximum of pM states if m is even, and a maximum of 2pM states if m is odd. In general,
for partial response covering L symbol intervals,
(
if m is even
pM L1
(6.23)
max # states =
L1
if m is odd
2pM
References
[1] J. G. Proakis. Digital Communications. Mc-Graw Hill, New York, 3rd edition, 1995.
c V. Veeravalli, 2000
V.
17
Lecture 7
1
t
11{t[0,Ts ]} + 11{t>Ts } .
2Ts
2
(7.1)
n1
t nTs
X
Bk + Bn
, for t [nTs , (n + 1)Ts ] .
2
2Ts
(7.2)
k=
1 Bn
Bn
1
=
= .
2 2Ts
4Ts
Ts
(7.3)
The frequency difference f = 2T1 s is the smallest frequency separation for orthogonality in a symbol period.
Hence the name minimum shift keying.
MSK can also be considered to be a special case of offset QPSK (or OQPSK)2 . It can be shown that:
!
r
X
E
exp j
Bk q(t kTs )
s(t) =
Ts
k=
r
E X
=
[B2n g(t 2nTs ) + jB2n+1 g(t 2nTs Ts )]
2 n=
r
E X
=
[B2n g(t 2nTb ) + jB2n+1 g(t 2nTb Tb )]
2 n=
where
1
g(t) = sin
Tb
t
2Tb
(7.4)
p2Tb (t) .
(7.5)
Bn g(t nTs ) ,
(7.6)
n=
where {Bn } is a complex sequence produced by mapping the symbol sequence {mn } to the complex plane.
2
Note that Ts = Tb for MSK since it is a form of binary modulation. So the comparison with OQPSK made by fixing Tb , with the
understanding that Ts = 2Tb for OQPSK.
c V. Veeravalli, 2000
V.
18
If we model {mn } as a random sequence, then {Bn } is a random complex sequence. We assume that {Bn }
is WSS (discrete-time) process with mean B = E[Bn ], ACF RB [k] = E [Bn+k Bn? ], and PSD
SB (f ) =
RB [k] ej2f k .
(7.7)
k=
s (t) = E[s(t)] = B
g(t nTs )
(7.8)
n=
RB [`]
(7.9)
n=
`=
Since s (t) and Rs (t + , t) are periodic in t with period Ts , s(t) is a cyclostationary process.
The PSD of the cyclostationary process s(t) is given by the Fourier transform of the average ACF
s ( ) = 1
R
Ts
Ts /2
Ts /2
Rs (t + , t)dt .
(7.10)
s ( ) =
R
`=
1
RB [`]
Ts
(7.11)
(7.12)
(7.13)
1 X
RB [`]Rg ( `Ts )
Rs ( ) =
Ts
(7.14)
s ( ) = 1 SB (f Ts ) |G(f )|2 .
Ss (f ) = F R
Ts
(7.15)
`=
and hence
c V. Veeravalli, 2000
V.
19
Thus
2
+ 2B
SB (f ) = B
ej2f k .
(7.16)
(7.17)
k=
j2f Ts k
k=
1 X
n
=
f
.
Ts n=
Ts
(7.18)
2B X
k
1 2
k
2
Ss (f ) = B |G(f )| + 2
G Ts f Ts
Ts
Ts
(7.19)
k=
The second term in PSD corresponds to lines in the spectrum at the fundamental frequency 1/Ts . We can
suppress these spectral lines by forcing Bn to be zero mean. This is done in practice by ensuring that the
symbols are equally likely and symmetrical positioned around 0 in the complex plane.
For the special case of MPSK, it is easy see that, for all M ,
Ss (f ) =
E
|G(f )|2
Ts
(7.20)
X
s(t) =
An g(t nTs ) .
(7.21)
n=
E
2
(7.22)
n=
with Bn {+1, 1}. If {Bn } is an uncorrelated sequence with P{Bn = +1} = P{Bn = 1} = 1/2, then
it easy to show that
X
s ( ) = E
R
g(t nTb )g(t + nTb )
(7.23)
2 n
c V. Veeravalli, 2000
V.
20
Thus s(t) is cyclostationary with period Tb in this case, and following steps similar to those used above for
memoryless linear modulation we get
Ss (f ) =
E
E
|G(f )|2 = |G(f )|2 (for OQPSK) .
2Tb
Ts
(7.24)
Thus OQPSK has no spectral advantage over QPSK. However, as we discussed in class OQPSK is used in
practice to avoid abrupt zero-crossings in the passband signal.
The PSD analysis for CPM is considerably more complicated (see [1, Section 4.4.2] for details).
References
[1] J. G. Proakis. Digital Communications. Mc-Graw Hill, New York, 3rd edition, 1995.
c V. Veeravalli, 2000
V.
21
Lecture 8
(8.1)
ML (y) = y.
For example, if Y N (, 2 ), then it is easy to see that
If the observation is a random vector Y and the parameter is a vector , then the likelihood function is p(y),
and the joint ML estimate of given y is:
ML (y) = arg max p(y) .
(8.2)
For example, if Y = [Y1 Y2 Yn ] has components that are i.i.d. Gaussian with unknown mean and
unknown variance 2 , then it is easy to show that joint ML estimates of and 2 are simply the sample mean
and samlple variance, respectively.
Bayesian Estimation
If the parameter is assumed to be random with known prior distribution p (), then the estimation procedure
is said to be Bayesian. There are many forms of Bayesian estimators. Two important ones are given below.
Maximum A Posteriori (MAP):
MAP (y) = arg max p|Y (|y) = arg max pY | (y|)p () = arg max p (y)p () .
(8.3)
MAP (y) =
ML (y).
If p () is uniform on S (this may not be possible for some S ), then
MMSE (y) is the
Minimum Mean Squared Error (MMSE): Assuming that belongs to a Hilbert space,
2
estimator that minimizes E[k(Y ) k ]. It can be shown that [1, Pg. 143]
MMSE (y) = E [ | Y = y] .
(8.4)
c V. Veeravalli, 2000
V.
22
Minimum Probability of Error (MPE). If card(S ) = M < , then without loss of generality we can
consider S = {0, 1, . . . , M 1}. Let m = P{ = m}. We can define the probability of error as:
) 6= } =
Pe = P{(Y
M
1
X
) 6= m}|{ = m}) .
m P({(Y
(8.5)
m=0
Pc = 1 Pe =
m P({(Y ) = m}|{ = m}) =
m pm (y)dy
m=0
m=0
(8.6)
where m is the decision region for parameter value m. It is clear that Pc is maximized by placing y in
decision region m if m pm (y) is larger than j pj (y) for all j 6= m. Thus
MPE (y) = arg max p (y) =
MAP (y) .
(8.7)
Examples
Suppose the likelihood function is given by:
1
|y |2
p (y) =
exp
.
2 2
2 2
(8.8)
(8.9)
Then
2
MAP (y) = arg max log |y | .
MPE (y) =
1
0
(8.10)
1 if p1 (y) p0 (y)
0 otherwise
if L(y) 1
otherwise
(8.11)
(8.12)
where L(y) = p1 (y)/p0 (y) is called the likelihood ratio of the observations. The optimum detector is a
special case of a likelihood ratio test (LRT).
References
[1] H. V. Poor. An Introduction to Signal Detection and Estimation, 2nd Edition. Springer-Verlag, New York,
1994.
c V. Veeravalli, 2000
V.
23
Lecture 9
(9.1)
n
Y
k=1
|yk s,k |2
1
exp
.
2 2
2 2
(9.2)
For the purposes of optimum detection/estimation based on Y, it is okay to divide p (y) by the noise only
pdf
n
Y
|yk |2
1
exp 2 .
(9.3)
p(y) =
2 2
2
k=1
n
Y
k=1
.
(9.4)
(Note that L(y) is the Radon-Nikodym derivative of measure P w.r.t. the measure P [1, Page 443].)
Taking the logarithm of L(y), we get an equivalent likelihood function, which simplifies to
n
1 X
2Re[yk s?,k ] |s,k |2 .
log L (y) = 2
2
(9.5)
k=1
(9.6)
(9.7)
24
Using Grenanders Theorem [2, Page 272], we showed in class that the log-likelihood ratio for the estimation
of based on {Y (t), t [a, b]} is given by:
log L (Y ) =
1
2RehY (t), s (t)i ks (t)k2
2
2
(9.8)
where 2 = N0 /2.
Sufficient Statistics. Suppose that for all S , s (t) has a representation in terms of a finite set of
orthonormal basis functions {k (t)}nk=1 , i.e.,
s (t) =
n
X
(9.9)
k=1
n
X
s?,k Yk
(9.10)
k=1
(9.11)
k=1
This means that even though Y (t) has components outside the span of (1 (t), . . . , n (t)), these components
are irrelevant for the computation of the computation of the log-likelihood ratio, i.e., they are irrelevant for
optimum detection/estimation of based on {Y (t), t [a, b]}.
The correlations {Yk }nk=1 are said to form sufficient statistics for optimum detection/estimation.
It is easy to check that
Yres (t) = Y (t)
n
X
Yk k (t) = w(t)
k=1
n
X
wk k (t)
(9.12)
k=1
is independent of {Yk }nk=1 and is a function of the noise only. This is the justification that is given in many
texts for ignoring Yres (t) and using only the sufficient statistics, and is sometimes referred to as the Principle
of Irrelevance [3, Page 220].
References
[1] P. Billingsley. Probability and Measure. Wiley, New York, 1986.
[2] H. V. Poor. An Introduction to Signal Detection and Estimation, 2nd Edition. Springer-Verlag, New York,
1994.
[3] J. Wozencraft and I. Jacobs. Principles of Communication Engineering. John Wiley and Sons, New York,
1965.
c V. Veeravalli, 2000
V.
25
10
Lecture 10
(10.1)
if s (t) span(1 (t), . . . , n (t)), then Yk = hY (t), k (t)i form sufficient statistics. We may pose the
problem of optimum detection of based on {Y (t), t [a, b] in terms of {Yk }nk=1 without loss of optimality.
Example 1
Suppose
(10.2)
where kg(t)k2 = 1.
Clearly s (t) = g(t) is spanned by the single basis function g(t). Thus Y = hY (t), g(t)i is sufficient, and
is given by:
Y = + w
(10.3)
where w = hw(t), g(t)i CN (0, N0 ).
If {0, 1, . . . , M 1}, then, for equal priors,
MPE = arg max pm (y) = arg min |y m |2 .
(10.4)
(10.5)
Example 2
Suppose
with
s (t) =
N
1
X
mn g(t nTs )
(10.6)
n=0
(10.7)
(10.8)
26
Note that {Yk } can be formed by passing Y (t) through a LTI sytem with impulse response h(t) = g(Ts t),
and sampling the output every Ts seconds.
The likelihood function (based on the sufficient statistics) is given by:
N
1
Y
|yk mk |2
1
exp
.
p (y) =
N0
N0
(10.9)
k=0
min
m0 ,...,mN 1
N
1
X
|yk mk |2 .
(10.10)
k=0
It is clear from the above that the components of the ML solution satisfy:
ML,k = arg min |yk m |2 , k = 0, 1, . . . , N 1 .
k
mk
(10.11)
(10.12)
where = 2fc + 0 .
The delay can usually be accurately estimated at the receiver. But even with a fairly accurate estimate of
, we may be left with an unknown phase offset at the receiver3 . Thus after delay estimation, we may move
the time axis to the right by to get the equivalent model
y(t) = s (t)ej + w(t) .
(10.13)
If is accurately estimated at the receiver and is used in the demodulation, then we have coherent demodulation. In this case, we may multiply Y (t) of (10.13) by ej to get the equivalent model
y(t) = s (t) + w(t)
(10.14)
c V. Veeravalli, 2000
V.
27
N
1
X
p
Emn ejmn g(t nTs ) + w(t) .
(10.15)
n=0
Let m denote the region in the complex plane where a decision in favor of symbol m is made. These
decision regions are obtained using the mimumum distance criterion of (10.17).
Probability of (symbol) error. The probability of error, conditioned on symbol m being sent is given by:
Z
pm (y)dy .
(10.18)
Pe,m = 1 Pc,m , with Pc,m =
m
The average probability of error (assuming equally likely symbols) is given by:
M 1
1 X
Pe,m .
Pe =
M
(10.19)
m=0
r
Pe = 2Q
Es
N0
r
Q
Es
N0
!
.
Union Bound on Pe
2
[
X
dm,`
Pe,m = P
{decide `}{m sent}
P ({decide `}|{m sent}) = Q
2N0
`6=m
(10.21)
(10.22)
`6=m
28
Intelligent Union Bound (IUB). The Union Bound is generally too conservative. A better bound is obtained
by keeping only the terms in the Union Bounds that are required to cover the error region.
Nearest Neighbor Approximation (NNA). Let
dmin,m = min dm,`
`6=m
and let the number of neighbors that are at this minimum distance be Ndmin (m). Then
s
2
dmin,m
.
Pe,m Ndmin (m)Q
2N0
(10.23)
(10.24)
References
[1] J. G. Proakis. Digital Communications. Mc-Graw Hill, New York, 3rd edition, 1995.
c V. Veeravalli, 2000
V.
29
11
Lecture 11
N
1
X
(11.1)
n=0
1
where {gm (t)}M
m=0 are orthonormal signals.
Assuming memoryless modulation and equally likely symbol sequences, symbol-by-symbol detection is
optimum. For symbol corresponding to [0, Ts ]
The signal sm (t) = Egm (t) belongs to span(g0 (t), . . . , gM 1 (t)), and hence yk =< y(t), gk (t) >, k =
0, . . . , M 1, form sufficient statistics.
If symbol m is sent, then the vector of sufficient statistics is given by
(11.3)
(11.4)
E.
(11.5)
The probability of error for the MPE decision rule is calculated as follows. First note that by symmetry,
Pe,m = Pe = Pe,0 . Also Pe,0 = 1 Pc,0 , where Pc,0 is the probability of correct decision when symbol 0 is
sent. Now
k6=0
p
N0 /2
!#M 1
.
(11.7)
30
P X<
1
=
2
where s =
Es
N0
1
E + w pw0,I (w)dw =
N0
[1 Q(t)]M 1 e 2 (t
1
2
2s )
"
1Q
!#M 1
2
E +w
w
p
e N0 dw
N0 /2
(11.8)
dt
E
N0 .
In the special case of M = 2 (e.g. binary FSK), the calculation of Pe is much easier. In particular,
r !
i
h
E
(11.9)
ML
(11.10)
pmax
(y) = max p, (y) .
(11.11)
where
S ()
MAP Approach. Here we assume and are realizations of random variables and (Bayesian model).
Then it is clear that
Z
avg
p (y) = pY | (y|) = p, (y)p| (|)d
(11.12)
and hence
MAP (y) = arg max pavg (y) p () .
If card(S ) = M < , then using steps similar to those used in Lecture 8, one can show that
Z
(11.13)
(11.14)
MPE =
MAP .
MAP . Thus
is minimized by
c V. Veeravalli, 2000
V.
31
MPE (or
MAP ). In the
In the analysis of digital communication systems, we are typically interested in
MAP for
absence of nuisance parameters, the justification we gave for considering ML was that it equalled
MAP
uniform priors. This justification does not extend to case where we have nuisance parameters, since
(J)
(J)
and ML are not necessarily equal even if we have uniform priors on . We may justify the use of ML based
on asymptotic properties of ML estimation (see, e.g., [1, Section IV.D]). Also, in some cases of interest such
MAP and
(J) are indeed equal.
as the one considered in the next section,
ML
(11.15)
where m = [0 . . . 0 Eej 0 . . . 0]> , with the m-th element being E ej . This is a detection problem
with nuisance parameter .
For the joint ML approach,
(J)
m
ML (y) = arg max pmax
m (y)
m
where
pmax
m (y)
(11.18)
#
" P
j
1
2 ) 2Re[y
( M
|y
|
Ee
]
+
E
1
m
k
k=0
= max pm, (y) = max
exp
N0
[0,2]
[0,2] (N0 )M
#
" P
(11.19)
1
2 ) 2|y | E + E
( M
|y
|
1
m
k
k=0
=
.
exp
(N0 )M
N0
Thus
(J)
m
ML (y) = arg max |ym | .
m
(11.20)
For the MAP approach, assuming that is uniformly distributed on [0, 2] (and that m is random with
uniform priors)
MPE (y) = arg max pavg
(11.21)
m
MAP (y) = m
m (y)
m
c V. Veeravalli, 2000
V.
32
where
Z
pavg
m (y)
1
d
2
#
" P
j !
Z 2
1
2) + E
( M
|y
|
]
1
2Re[y
m Ee
k
k=0
d
exp
exp
N0
2 0
N0
#
" P
!
1
2
( M
2|ym | E
k=0 |yk | ) + E
I0
.
exp
N0
N0
pm, (y)
0
1
(N0 )M
1
(N0 )M
(11.22)
Since I0 (x) as x ,
m
MAP (y) = m
MPE (y) = arg max I0
m
!
2|ym | E
= arg max |ym | .
m
N0
(11.23)
(J)
MAP = m
MPE in this case.
Thus m
ML = m
Probability of error for MPE decision detection. As before, Pe = Pe,0 = 1 Pc,0 . And
Pc,0 = P {|y0 | > max |yk |} | {0 sent} .
k6=0
(11.24)
Now, when 0 is sent, |y0 | is Ricean( E, N0 ) and |yk |, k 6= 0, is Rayleigh with second moment N0 , i.e.,
!
2
r
r +E
r E
p|y0 | (r) = 2 exp
11{r0}
(11.25)
I0
2 2
2
and for k 6= 0,
r
r2
p|yk | (r) = 2 exp 2 11{r0}
(11.26)
(11.27)
(11.28)
References
[1] H. V. Poor. An Introduction to Signal Detection and Estimation, 2nd Edition. Springer-Verlag, New York,
1994.
c V. Veeravalli, 2000
V.
33
12
Lecture 12
1
M
M
1
X
d2min (bm )
2N0
s
Ndmin (bm , i) Q
m=0
(12.2)
(12.3)
d2min (bm )
2N0
1X
Pb,i .
(12.4)
(12.5)
i=1
For Gray coded constellations, the NNA approximation for Pb is at most equal to Pe /.
Suppose (t) changes slowly with time so that we can assume that it is constant over two consecutive symbol
intervals.
c V. Veeravalli, 2000
V.
34
2mn
, mn {0, 1, . . . , M 1} .
(12.7)
M
We know that this scheme performs poorly if we cannot estimate at the receiver. But if (t) changes slowly,
a differential modulation approach can be taken where the sequence {n } is generated from {mn } as
n = mn =
n n1 = mn =
2mn
(with 0 = 0) .
M
(12.8)
Sufficient statistics for demodulation are still given by yn = hy(t), g(t nTs )i, n = 0, 1, . . . , N 1. Note
that
yn = E ejn ejn + wn
(12.9)
where n n1 for all n.
Since the information about mn is contained in the phase difference between yn and yn1 , it is convenient to
form the statistics:
yn y ?
rn = n1 Eejmn + Xn (with r0 = y0 )
(12.10)
E
where
wn w?
?
(12.11)
ejn ejn + n1
Xn = wn ejn ejn1 + wn1
E
Since {yn } can be recovered from {rn }, there is no loss of optimality if we use {rn } in place of {yn } for
detection.
The statistics {rn } are related to the symbols {mn } in the same way as in standard PSK with coherent detection, except that Xn is not a sequence of i.i.d. CN (0, N0 ) random variables. The fact that {Xn } are
correlated implies that symbol-by-symbol detection is not optimum, even if g(t) satisfies the zero-ISI condition and the symbols are independent. The MAP (or MPE) detector for this problem is quite complicated
and impractical, and hence the following suboptimum detector is used.
Differential Detector. This detector makes a decision on mn based on rn alone. In particular, m
n is chosen
via a minimum distance criterion as
m
n = arg min |rn Eejm |2
(12.12)
m
c V. Veeravalli, 2000
V.
1
exp(b ) .
2
(12.13)
35
s(t)
h()
y(t)
x(t)
w(t)
BS
Figure 12.2: Multipath channel seen at location (d, ) for one Tx&Rx antenna pair
c V. Veeravalli, 2000
V.
36
If the mobile is fixed at location (d, ), the channel that it sees is time-invariant. The response of this timeinvariant channel is a function of the location, and is determined by all paths connecting the BS and the MS.
Thus we have the system shown in Figure 12.3, where hd, () is the impulse response of a causal LTI system,
which is a function of the multipath profile between the BS and MS.
hd, ()
s(t)
x(t)
Figure 12.3: Causal LTI system representing multipath profile at location (d, )
Referring to Figure 12.2, suppose the n-th path connecting the BS and MS has amplitude gain n (d, ) and
delay n (d, ). The delay of n (d, ) introduces a carrier phase shift of
n (d, ) = 2fc n (d, ) + constant
(12.14)
where the constant depends on the reflectivity of the surface(s) that reflect the path. Then we can write the
output x(t) in terms of the input s(t) as
X
n (d, ) ejn (d,) s(t n (d, ))
(12.15)
x(t) =
n
(12.16)
References
[1] J. G. Proakis. Digital Communications. Mc-Graw Hill, New York, 3rd edition, 1995.
c V. Veeravalli, 2000
V.
37
13
Lecture 13
cd, ()
s(t)
p
x(t)
G(d, )
(13.1)
where {n (d, )} is normalized so that n n2 (d,p) = 1. The large-scale variations in (average) amplitude
gain are then lumped into the multiplicative term G(d, ).
The goal of wireless channel modeling is to find useful analytical models for the variations in the channel.
Models for the large scale variations are useful in cellular capacity-coverage optimization and analysis, and
in radio resource management (handoff, admission control, and power control) [1, Chapter]. Models for the
small scale variations are more useful in the design of digital modulation and demodulation schemes (that are
robust to these variations). We hence focus on the small scale variations in this class.
c V. Veeravalli, 2000
V.
38
cd, ()
s(t)
x(t)
Figure 13.2: Small-scale variations in the channel (with large-scale variations treated as constant).
P
where the {n (d, )} are normalized so that n n2 (d, ) = 1. As (d, ) changes with t, the channel corresponding to the small-scale variations becomes time-varying and we get:
X
n (t) ejn (t) ( n (t)) .
(13.3)
c(t; ) := cd(t),(t) () =
n
(13.4)
Finally, we may absorb the scaling factor G into the signal s(t), with the understanding that the power of
s(t) is the received signal power after passage through the channel. Then
Z
x(t) =
c(t; )s(t )d .
(13.5)
0
From this equation it is clear that the magnitude of the impulse response |c(t; )| is roughly independent of
t. A typical plot of |c(t; )| is shown in Figure 13.3. The width of the delay profile (delay spread) is of the
order of tens of microseconds for outdoor channels, and of the order of hundreds of nanoseconds for indoor
channels. Note that the paths typically appear in clusters in the delay profile (why?).
To study the phase variations n (t) in more detail, consider a mobile that is traveling with velocity v and
suppose that the n-th path has an angle of arrival n (t) with respect to the velocity vector as shown in Figure 13.4. (Note that we may assume that n is roughly constant over the time horizon corresponding to a few
c V. Veeravalli, 2000
V.
39
|c(t; )|
0 (LOS)
DS
Figure 13.3: Typical delay profile of channel with LOS path having delay 0.
wavelengths.) Then for small t ,
n (t + t ) n (t)
2vt
2fc vt cos n
=
cos n ,
c
c
(13.7)
where c is the carrier wavelength and c is the velocity of light. The frequency shift introduced by the movement of the mobile is hence given by
lim
t0
v
n (t + t ) n (t)
=
cos n = fm cos n ,
2t
c
(13.8)
where fm = v/c is called the maximum Doppler frequency. We will use this model for the variations in n (t)
to characterize small-scale variations statistically in the following section.
vt cos n
n
vt
40
Definition 13.1. The quantity DS = max n (d, ) min n (d, ) is called the delay spread of the channel.
Without loss of generality, we may assume that the delay corresponding to the first path arriving at the receiver
is 0. Then min n (d, ) = 0, DS = max n (d, ), and (13.5) can be rewritten as:
Z DS
x(t) =
c(t; )s(t )d .
(13.9)
0
This implies that the multipath channel simply scales the transmitted signal without introducing significant
frequency distortion. The variations with time of this scale factor are referred to as frequency nonselective, or
flat, fading.
Note that the distortions introduced by the channel depend on the relationship between the delay spread of
the channel and the bandwidth of the signal. The same channel may be frequency selective or flat, depending
on the bandwidth of the input signal. With a delay spread of 10 s corresponding to a typical outdoor urban
environment, an AMPS signal (30 kHz) undergoes flat fading, whereas an IS-95 CDMA signal (1.25 MHz)
undergoes frequency selective fading.
For flat fading, the channel model simplifies to
x(t) = E(t)s(t)
Z
where
DS
c(t, )d =
E(t) =
0
(13.11)
n ejn (t) .
(13.12)
References
[1] G. Stuber. Principles of Mobile Communication. Kluwer Academic, Norwell, MA, 1996.
c V. Veeravalli, 2000
V.
41
14
Lecture 14
The process {en (t)} defined by en (t) = n ejn (t) is a proper complex random process.
Proof. We need to show that the pseudocovariance function of {en (t)} equals zero.
+ , t) = E[en (t)en (t + )]
C(t
= E[(n ejn (t) )(n ejn (t+ ) )]
= n2 E[ej(n (t)+n (t+ )) ]
n2 E[ej(2n (t)+2fm cos n ) ] = 0
where the approximation on the last line follows from (13.7).
If the number of paths is large, we may apply the Central Limit Theorem to conclude that {E(t)} is a
proper complex Gaussian (PCG) random process.
First order statistics of {E(t)} for purely diffuse scattering
For fixed t, E(t) = EI (t) + jEQ (t) is PCG random variable with
h
i X
E |E(t)|2 =
n2 = 1 .
(14.1)
Since E(t) is proper, EI (t) and EQ (t) are uncorrelated and have the same variance, which equals half the
variance of E(t). Since E(t) is also Gaussian, EI (t) and EQ (t) are independent as well. Thus EI (t) and
EQ (t) are independent N (0, 1/2) random variables.
c V. Veeravalli, 2000
V.
42
(14.2)
(14.3)
This means that for flat fading, the channel is seen as a single path with gain (t) and phase shift (t). Note
that and vary much more rapidly than the gain and phase of the individual paths n and n (why?).
For fixed t, using the fact that EI (t) and EQ (t) are independent
N (0, 1/2) random variables, it is easy to show that (t) and (t) are
independent random variables with (t) having a Rayleigh pdf and (t)
being uniform on [0, 2]. The pdf of (t) is given by
1
0.8
0.6
(14.4)
h
0.4
p
/4 and E[2 ] = E |E(t)|2 = 1.
0.2
0
0
Since the envelope has a Rayleigh pdf, purely diffuse fading is referred
to as Rayleigh fading.
Scattering with a LOS component Ricean Fading
(14.5)
E(t) = 0 ej0 (t) + 1 02 E(t)
2 = 1.
where {E(t)}
is a zero mean PCG, Rayleigh fading process with variance E |E(t)|
Note: {E(t)} is also a zero-mean process, but it is not Gaussian since the LOS component {0 ej0 (t) }
dominates the diffuse components in power. However, conditioned on {0 (t)}, {E(t)} is a PCG process
with mean {0 ej0 (t) }.
Rice Factor: The Rice factor is defined by
=
, and 1 02 =
.
0 =
+1
( + 1)
c V. Veeravalli, 2000
V.
(14.6)
(14.7)
43
For fixed t, the pdf of the envelope (t) can be found by first computing the joint pdf of (t) and (t),
conditioned on 0 (t). This is straightforward since, conditioned on 0 (t), E(t) is a CCG random variable
with mean 0 ej0 (t) .
We can then show that the pdf of (t) conditioned on 0 (t) is not a function of 0 (t), and we get:
2
2x
2x0
x + 02
p|0 (x) =
I0
exp
u(x) = p (x)
1 02
1 02
1 02
(14.8)
This pdf is called a Ricean pdf [1] and it can be rewritten in terms of as:
p
p (x) = 2x( + 1) I0 2x ( + 1) exp x2 ( + 1) u(x)
where I0 () is the zeroth order modified Bessel function of 1st kind [2], i.e.,
Z
1
I0 (y) =
exp(y cos )d .
2
(14.9)
(14.10)
p
p ( x)
(14.11)
Rayleigh
Ricean =1
Ricean =5
Ricean =10
1.5
0.5
0
0
References
[1] S. Rice. Statistical properties of a sine wave plus noise. Bell Syst. Tech. J., 27(1):109157, January 1948.
[2] M. Abramowitz and I. A. Stegun. Handbook of Mathematical Functions. Dover, New York, 1964.
c V. Veeravalli, 2000
V.
44
15
(15.1)
(15.2)
2 Es
.
N0
(15.3)
Es
Es
=
.
N0
N0
(15.4)
Es
Eb
.
, and b =
=
N0
N0
(15.5)
Suppose the symbol error probability with SNR s is denoted by Pe (s ). Then the average error probability
(averaged over the fading) is
Z
Pe =
Pe (x)ps (x)dx
(15.6)
(15.7)
45
For Ricean fading, 2 has the pdf given in (14.11), and hence s has pdf
s
!
x( + 1)
+1
x( + 1)
ps (x) =
exp
I0 2
u(x) .
s
s
s
(15.8)
Q( x )
dx =
1
.
2
2+
0
BPSK
Pb (b ) = Q(
2b ) .
(15.9)
(15.10)
s
"
#
1
1
b
1
Q( 2x )pb (x)dx =
(for large b ).
2
1 + b
4 b
(15.11)
Pb (b ) = Q( b ) .
Here Pb is the same as that for BPSK with b replaced by b /2, i.e.,
s
"
#
1
1
b
1
Pb =
(for large b ).
2
2 + b
2 b
(15.12)
(15.13)
Binary DPSK
1
Pb (b ) = eb .
2
In this we case we may integrate directly to get
Z
1
1
1 x ex/ b
e
Pb =
dx =
(for large b ).
2
2(1
+
)
2
0
b
b
b
(15.14)
(15.15)
(15.16)
Here Pb is the same as that for DPSK with b replaced by b /2, i.e.,
Pb =
1
1
(for large b ).
2 + b
b
(15.17)
Similar expressions may be derived for other M-ary modulation schemes. Note that without fading
the error probabilities decrease exponentially with SNR, whereas with fading the error probabilities
decrease much more slowly with SNR (inverse linear in case of Rayleigh fading).
c V. Veeravalli, 2000
V.
46
(15.18)
Pe |y|2 pY (y) dy
yC
0
!
Z
2
|
|y
m
1
Y
dy
Pe |y|2 exp
=
Y2 yC
Y2
Pe =
Pe (x)ps (x)dx =
(15.20)
Useful result 1
1
Q(x) =
/2
0
x2
exp
2 sin2
d (problem 3 of HW#4) .
(15.21)
This alternative representation was introduced recently by Simon and Divsalar [1] as a way to compute
general expressions for the error rates for digital modulation on fading channels. For more recent
results, see the book by Simon and Alouini [2].
Useful result 2 The following result is also very useful in computing closed-form expressions for the
error probability in some special cases.
In (c) =
Z
0
/2
sin2
sin2 + c
n
d = [A(c)]n
n1
X
i=0
n1+i
[1 A(c)]i
i
(15.22)
i
p
with A(c) = 12 1 c/(1 + c) . This result is derived in [3]. Note that In (c) also has the following
alternative expression whose form is similar to that obtained in Problem 6 of HW#4.
h
In (c) =
/2
sin2
sin2 + c
n
d =
n1
X 2i
1
1
[A(c)]i [1 A(c)]i .
A(c)
i
2
2
(15.23)
i=0
47
BPSK
1
Pb =
Y2
1
=
1
=
q
1
2
2
Q
2|y|
exp 2 |y mY |
dy
Y
yC
/2
0
/2
0
"
1
Y2
|y|2
exp 2
sin
yC
|y mY |2
exp
Y2
#
dy
(15.24)
( + 1) sin2
b
exp
d .
b + ( + 1) sin2
( + 1) sin2 + b
where the last line follows after completion of squares inside the exponential to compute the
complex Gaussian integral.
Note that for = 0 (i.e. Rayleigh fading), we have
1
Pb =
Z
0
/2
sin2
d .
b + sin2
(15.25)
Using (15.22) with n = 1, we can immediately see that the above expression is the same as the
one obtained in (15.11). Also, for , we see that we get back AWGN performance.
Binary coherent FSK. Same as BPSK with b replaced by b /2.
Binary DPSK.
Z
1
1
1
2
2
exp
exp
|y|
Pb =
|y
m
|
dy
Y
Y2 yC 2
Y2
(15.26)
b
+1
exp
=
.
2( + 1 + b )
+ 1 + b
where the second line follows easily by completion of squares as done in class. Again, it is easy
to check that we get the Rayleigh result when = 0 and the AWGN result as .
Binary noncoherent FSK. Same as DPSK with b replaced by b /2.
References
[1] M. K. Simon and D. Divsalar. Some new twists to problems involving the gaussian integral. IEEE Trans.
Commun., 46(2):200210, February 1998.
[2] M. K. Simon and M.-S. Alouini. Digital Communication over Fading Channels. Wiley, New York, 2000.
[3] M.-S. Alouini and M.K. Simon. Multichannel reception of digital signals over correlated nakagami fading
channels. In Proc. 36th Annual Allerton Conf., Monticello, IL, September 1998.
c V. Veeravalli, 2000
V.
48
16
Lecture 16
(16.1)
1
DS ,
(16.2)
L
X
Z
p
j`
y` (t)g` (t)dt .
` E` e
(16.3)
`=1
We proved that his was optimum in class; also see [1, 2] and Problem 1 of HW#5.
The sufficient statistic y may be rewritten as
y=
L
X
`2
`=1
L
X
p
p
Es,` am ejm +
` Es,` w` ,
(16.4)
`=1
c V. Veeravalli, 2000
V.
49
The MPE (ML) decision rule is the same as without diversity except that the constellation is scaled in
amplitude based on the fading on the channels.
Special Case: BPSK with diversity
The sufficient statistic in this case takes the form
y=
L
X
`2 Eb,` + w ,
(16.5)
`=1
where w =
PL
`=1 `
p
Eb,` w` is PCG with
E[|w|2 ] = N0
L
X
Eb,` `2 .
(16.6)
`=1
The MPE decision rule for equal priors (or the ML decision rule)is to decide +1 (bit 1) if yI > 0, and 1
(bit 0) if yI < 0.
For fixed {` },
(
Pb = P({yI > 0} | {bit 0 sent}) = P wI >
L
X
)
`2 Eb,`
`=1
v
v
u L
u L
p
u X
u X 2 Eb,`
`
t
t
=Q
=Q
2
2
b,` = Q
2b
N0
`=1
`=1
where b,` is the received bit SNR on the `-th channel, and b =
The average BER is given by
(16.7)
Z
Pb =
Q
0
PL
`=1 b,`
2x
pb (x)dx .
(16.8)
Thus, we may evaluate Pb by first finding the pdf pb (x). This works well for Rayleigh fading. However, as
shown below, Pb is more easily evaluated in the general case of Ricean fading using the complex Gaussian
approach of (15.20), and we get the Rayleigh fading result as a special case.
General Ricean analysis using the complex Gaussian approach:
Assume that ` is Ricean with Rice factor ` . Write b,` = |Y` |2 where {Y` } are PCG with means and
variances:
s
q
b,` ` j0,`
b,`
e
.
(16.9)
, and `2 = E[|Y` |2 ] =
m` = b,` 0,` ej0,` =
` + 1
` + 1
c V. Veeravalli, 2000
V.
50
Then
Q
2x pb (x)dx
0
v
!
u L
Z
L
2
u X
Y
1
m
|
|y
`
`
=
dy1 . . . dyL
Q t2
|y` |2
exp
`2
`2
y
k=1
`=1
"Z
!
#
!
Z
L
1 /2 Y
|y` |2
|y` m` |2
dy` d
=
exp 2
exp
0
`2
sin
y
`
`=1
!
Z /2 Y
L
b,` `
1
(` + 1) sin2
=
d .
exp
0
+ (` + 1) sin2
(` + 1) sin2 + b,`
`=1 b,`
Pb =
(16.10)
This is best we can do for general Ricean fading. Further simplification is possible for Rayleigh fading.
Special case: Rayleigh fading
If the fading is Rayleigh on all channels, i.e., ` = 0, for ` = 1, 2, . . . , L, then
1
Pb =
L
/2 Y
sin2
d .
+ sin2
`=1 b,`
(16.11)
X
sin2
sin2
C
=
,
`
+ sin2
b,` + sin2
`=1 b,`
`=1
L
where
C` =
Y
i6=`
Thus
Pb =
L
X
`=1
1
C`
/2
0
b,`
.
b,` b,i
s
"
#
L
X
b,`
sin2
C`
1
d =
2
1 + b,`
b,` + sin2
`=1
(16.12)
(16.13)
(16.14)
/2
sin2
b
L
!L
+ sin2
L L1
`
X L 1 + `
b
b
d = A
1A
`
L
L
(16.15)
`=0
with
A
b
L
s
"
#
b
1
=
1
.
2
L + b
(16.16)
Note that the equation for Pb given in (16.15) is identical to that for BPSK in Nakagami-m fading with
m = L (see Problem 6 of HW#4).
c V. Veeravalli, 2000
V.
51
For large b ,
A
Thus
Pb
L
4 b
b
L
L
and 1 A
4 b
L X
L
L1+`
`
`=1
=
b
L
L
4 b
1.
L
(16.17)
2L 1
.
L
(16.18)
Note that with diversity Pb decreases at ( b )L which is a significant improvement over the inverse
linear performance obtained without diversity. (See Figure 16.1.)
Performance of BPSK with diversity on Rayleigh fading channel
10
10
L=1
10
L=2
4
10
L=8
L=4
10
10
10
15
20
25
30
35
40
45
References
[1] R. Price. Optimum detection of random signals in noise. IRE Trans. Inform. Th., pages 125135, December
1956.
[2] T. Kailath. Correlation detection of signals perturbed by a random channel. IRE Trans. Inform. Th., pages
361368, June 1960.
c V. Veeravalli, 2000
V.
52
17
Lecture 17
(17.1)
(The multiplication and addition are the standard binary or GF(2) operations.)
Example: (7, 4) Hamming Code
1
0
G=
0
0
0
1
0
0
0
0
1
0
0
0
0
1
|
|
|
|
1
1
1
0
0
1
1
1
1
1
0
1
xi G = ci
(17.2)
Note that the codewords of this code are in systematic form with 4 information bits followed by 3 parity
bits, i.e.,
ci = [xi,1 xi,2 xi,3 xi,4 ci,5 ci,6 ci,7 ]
(17.3)
with ci,5 = xi,1 + xi,2 + xi,3 , ci,6 = xi,2 + xi,3 + xi,4 , and ci,7 = xi,1 + xi,2 + xi,4 . It is easy to write down
the 16 codewords of the (7,4) Hamming code. It is also easy to see that the minimum (Hamming) distance
between the codewords, dmin , equals 3.
General Result. If dmin = 2t + 1, then the code can correct t errors.
Example: Repitition Codes. A rate- n1 repitition code is defined by codebook:
0 7 [0 0 . . . 0] , and1 7 [1 1 . . . 1]
(17.4)
53
Coding Gain
The coding gain of a code is the gain in SNR, at a given error probability, that is achieved by using a code
before modulation.
The coding gain of a code is a function of: (i) the error probability considered, (ii) the modulation scheme
used, and (iii) the channel. We now compute the coding gain for BPSK signaling in AWGN for some simple
codes. Before we proceed, we introduce the following notation:
c = SNR per code bit
b = SNR per information bit =
c
R
(17.5)
For an AWGN channel, bit errors are independent across the codeword. It is easy to see that with majority
logic decoding
1
(17.6)
(17.7)
The rate- 12 repitition code results in a 3 dB coding loss for BPSK in AWGN at all error probabilities!
Example: Rate- 13 repitition code, BPSK in AWGN
p
p
P{code bit in error} = Q( 2c ) = Q( 2b /3 ) = p (say) .
Then it is again easy to show that with majority logic decoding
"
2b
3
!#2
"
2 Q
(17.8)
r
2b
3
!#3
. (17.9)
Furthermore one can show that the above expression for Pce is always larger than Pb without coding (which
equals Q( 2b )). Hence this code also has a coding loss (negative coding gain) for BPSK in AWGN.
Example: Rate- n1 repitition code, BPSK in AWGN
p
p
P{code bit in error} = Q( 2c ) = Q( 2b /n ) = p (say) .
Then we can generalize the previous two examples to get:
P
n n+1 n pq (1 p)nq
q
q=
Pb (with coding) = Pce = 1 n 2 n
P
n
2 n p 2 (1 p) 2 + nq= n +1
2
n
q
(17.10)
if n is odd
pq
(1
p)nq
if n is even
(17.11)
It is possible to show that the Rate- n1 repitition code (with hard decision decoding) results in a coding loss
for all n. (One way to show this is to see that even with optimum (soft-decision) decoding, the coding gain
for a Rate- n1 repitition code is 0.)
c V. Veeravalli, 2000
V.
54
(17.12)
Since the code can correct one code bit error (and will always have decoding error with more than one code
bit error), we have
Pce = P{2 or more code bits in error} =
7
X
7
q=2
pq (1 p)7q .
(17.13)
In general, it is difficult to find an exact relationship between the probability of information bit error Pb and
the probability of codeword error Pce . However, it is easy to see that Pb Pce always (see problem 4 of
HW#5). Thus the above expression for Pce serves as an upper bound for the Pb . Based on this bound, we can
show (see Fig. 17.1) that for small enough Pb we obtain a positive coding gain from this code. Of course,
this coding gain comes with a reduction in information rate (or bandwidth expansion).
Example: Rate R, t-error correcting code, BPSK in AWGN
p
p
P{code bit in error} = Q( 2c ) = Q( 2Rb ) = p (say) .
(17.14)
Now we can only bound Pce since the code may not be perfect like the (7, 4)-Hamming code. Thus
Pb Pce P{t + 1 or more code bits in error} =
n
X
n
pq (1 p)nq .
q
(17.15)
q=t+1
(17.16)
The code bits are sent using BPSK. If ci,` = 0, 1 is sent; if ci,` = 1, +1 is sent (2ci,` 1) is sent. Thus,
the received signal corresponding to the codeword ci in AWGN is given by
n
X
p
y(t) =
(2ci,` 1) Ec g(t `Tc ) + w(t)
(17.17)
`=1
where Tc is the code bit period, and g() is a unit energy pulse shaping function that satisfies the zero-ISI
condition w.r.t. Tc .
The task of the decoder is to classify the received signal y(t) into one of 2k classes corresponding to the
2k possible codewords. This is a 2k -ary detection problem, for which the sufficient statistics are given by
projecting y(t) onto g(t`Tc ), ` = 1, 2, . . . , n. Alternatively, we could filter y(t) with g(Tc t) and sample
the output at rate 1/Tc . The output of the matched filter for the `-th code bit interval is given by:
p
(17.18)
y` = (2ci,` 1) Ec + w`
where {w` } are i.i.d. CN (0, N0 ).
c V. Veeravalli, 2000
V.
55
10
10
10
10
10
10
10
10
Uncoded BPSK
(also rate1/n, soft)
(7,4) Hamming (hard, bd)
10
10
10
10
12
14
16
References
[1] J. G. Proakis. Digital Communications. Mc-Graw Hill, New York, 3rd edition, 1995.
c V. Veeravalli, 2000
V.
56
18
Lecture 18
= arg max
j
= arg max
j
n
Y
`=1
n
X
"
2#
|y` (2cj,` 1) Ec |
1
exp
N0
N0
y`,I cj,`
(18.1)
`=1
If we restrict attention to linear block codes, we can assume that the all-zeros codeword is part of the
code book. Without loss of generality, we can set c1 = [0 0 0]. Furthermore linear block codes have the
symmetry property that the probability of codeword error, conditioned on ci being transmitted, is the same
for all i.
Thus
Pce = P {i 6= 1}|{i = 1} = P
= P
( n
[ X
j=2
j=2
j=2
{i 6= j}{i = 1}
cj,` y`,I
cj,` y`,I
)
!
> 0 {i = 1}
`=1
( n
X
)
> 0 {i = 1}
2k
2k
2
[
(18.2)
`=1
p
y`,I = Ec + w`,I .
Thus
(
P
n
X
`=1
cj,` y`,I
(18.3)
)
)
!
( n
X
p
> 0 {i = 1}
cj,` ( Ec + w`,I ) > 0
= P
`=1
)
( n
n
X
p X
cj,` w`,I >
Ec
cj,`
= P
`=1
v
u
n
u 2Ec X
cj,`
= Q t
N0
`=1
(18.4)
`=1
where the last line follows from the fact that c2j,` = cj,` .
c V. Veeravalli, 2000
V.
57
Definition: The Hamming weight i of a codeword ci is the number of 1s in the codeword, i.e.,
X
cj,` .
i =
(18.5)
Thus
2
X
k
Pb Pce
Q(
j=2
2
X
k
2 c j ) =
Q(
2 R b j ) .
(18.6)
j=2
To compute the bound on Pb we need the weight distribution of the code. For example, for the (7,4) Hamming code, it can be shown that there are 7 codewords of weight 3, 7 of weight 4, and 1 of weight 7.
We can obtain a weaker bound on Pb using only the minimum distance dmin of the code as
2
X
k
Pb Pce
Q(
p
2 R b dmin ) = (2k 1) Q( 2 R b dmin )
(18.7)
j=2
(18.9)
(18.10)
See Figure 17.1 for the performance curves for soft decision making for the repitition and (7,4) Hamming
codes. We can see that soft decision decoding improves performance by about 2 dB over hard decision decoding for the (7,4) Hamming code.
c V. Veeravalli, 2000
V.
58
{g` (t)}n`=1
are shifted versions of the pulse shaping function g(t) corresponding to the appropriate
where
locations in time after interleaving.
Assuming that {` } are known at the receiver, the output of the matched filter for the `-th code bit interval
is given by:
p
(18.12)
y` = (2ci,` 1)` Ec + w`
where {w` } are i.i.d. CN (0, N0 ).
For hard decision decoding, we send sgn(y`,I ) to the decoder. Note that we do not need to know the fade
levels {` } to make hard decisions. However, the error probability corresponding to the hard decisions is a
function of the fade levels.
The conditional code bit error probability is given by:
`2 Ec
= `2 c .
N0
2c,` )
(18.13)
(18.14)
59
For Rayleigh fading, {c,` }n`=1 are i.i.d. exponential with mean c = R b .
Using this fact, we can show that for a t-error correcting code the average probability of codeword error Pc e
is given by:
s
s
!q
!nq
n
X
n
c
c
1 1
1 1
Pc e
(18.15)
+
q
2 2 1 + c
2 2 1 + c
q=t+1
Pb = Pc e
P
1 1 q b q 1 1 q b nq
n
n
n+1
2 2
n+ b
2 + 2
n+ b
q
q= 2
=
1 1 q b n2 1 1 q b n2
1 n
2 + 2
n+ b
2 2 2 2 n+ b
q
q
q
nq
P
b
b
n
n
1
1
1
1
+ q= n +1 q
+
2
2
n+
2
2
n+
b
if n is odd
(18.16)
if n is even
1 1
=
2 2
b
2 + b
(18.17)
c V. Veeravalli, 2000
V.
60
19
= arg max
j
n
Y
`=1
n
X
|y`,I (2cj,` 1)` Ec |2
1
exp
N0
N0
` r`I cj,`
(19.2)
`=1
Note that, unlike in hard decision decoding, we need to know the fade levels {` } for soft decision decoding.
Again, as in pure AWGN case, we can set c1 = 0 and compute a bound on Pce for fixed {` } as:
v
( n
)
!
u n
2k
X
X
u X
P
` cj,` y`I > 0 {i = 1} = Q t2
cj,` c,` .
Pce (1 , . . . , n )
j=2
`=1
(19.3)
`=1
Let
j =
n
X
cj,` c,` .
(19.4)
`=1
Then j is the sum of j i.i.d. exponential random variables, where j is the weight of cj . Thus
wj
1
xj 1
x
11{x0}
pj (x) =
exp
c
(j 1)!
c
(19.5)
Thus the average codeword error probability (averaged over the distribution of the {` }) is given by:
Pc e
2k Z
X
j=2
Q( 2x )pj (x) dx
(19.6)
and clearly Pb Pc e .
It is easy to show that (see Problem 5 of HW#5):
Pb Pc e
j 1
2k
X
1+
q
1
j X j 1 + q
q
2
2
j=2
c V. Veeravalli, 2000
V.
(19.7)
q=0
61
where
c
=
c + 1
R b
.
R b + 1
(19.8)
R
Also, since 0 Q( 2x )pj (x)dx decreases as j increases, we have the following weaker bound on Pc e
in terms of dmin .
Pb Pc e (2k 1)
dmin dmin
1
X
dmin 1 + q
1+
q
.
2
q
(19.9)
q=0
1
1
1
=
1 , and
2
2
4 c
4 R b
(19.10)
1
4 R b
dmin
2dmin 1
.
dmin
(19.11)
n n1
X n 1 + q 1 +
q
q
2
(19.12)
q=0
b /( b + n)
The average bit error probability is the same as that obtained with n-th order diversity and maximum ratio
combining (as expected).
Example: (7, 4)-Hamming code
Pb Pc e 7f (3) + 7f (4) + f (7)
where
f () =
and
=
1
X 1 + q 1 +
q
q
2
(19.13)
(19.14)
q=0
4 b /(4 b + 7)
Performance plots for these codes for both hard and soft decision making are shown in Figure 19.1.
[Plot: bit error probability versus bit SNR (dB); the coded curves are compared with uncoded BPSK in AWGN.]
Figure 19.1: Performance of block codes for BPSK in Rayleigh fading with perfect interleaving
$$ s(t) = \sum_n s_{m_n}(t - nT_s) \qquad (19.15) $$
where $m_n \in \{0, 1, \ldots, M-1\}$. The signal $s(t)$ occupies a bandwidth $W$ that depends on the modulation scheme used. To spread the spectrum, we simply multiply $s(t)$ by a high frequency chip waveform $c(t)$ that has bandwidth $NW$, where $N$ is said to be the processing gain.
DS/SS Linear Modulation
Without spreading,
$$ s(t) = \sum_n z^{(n)}\, g_{T_s}(t - nT_s) \qquad (19.16) $$
where $z^{(n)} \in \{\sqrt{E_0}\,e^{j\theta_0}, \ldots, \sqrt{E_{M-1}}\,e^{j\theta_{M-1}}\}$ is the complex symbol that is transmitted during the $n$-th symbol interval, and $g_{T_s}(\cdot)$ is a unit energy pulse shaping function that satisfies the zero ISI (Nyquist)
condition
$$ \int_{-\infty}^{\infty} g_{T_s}(t - iT_s)\, g_{T_s}(t - jT_s)\, dt = \delta[i - j] \; . \qquad (19.17) $$
Sinc pulse:
$$ g_{T_s}(t) = \frac{1}{\sqrt{T_s}}\, \mathrm{sinc}\!\left(\frac{t}{T_s} - 0.5\right) \qquad (19.18) $$
This is the pulse with smallest bandwidth satisfying the Nyquist condition of (19.17). The Fourier transform of this pulse is given by
$$ G_{T_s}(f) = \sqrt{T_s}\; \mathrm{rect}(f T_s)\; e^{-j\pi T_s f} \qquad (19.19) $$
where
$$ \mathrm{rect}(x) = \begin{cases} 1 & \text{if } |x| \le \frac12 \\ 0 & \text{otherwise} \end{cases} \qquad (19.20) $$
Rectangular pulse:
$$ g_{T_s}(t) = \frac{1}{\sqrt{T_s}}\, \mathrm{rect}\!\left(\frac{t}{T_s} - 0.5\right) \qquad (19.21) $$
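As a quick sanity check (not in the notes), the Nyquist condition (19.17) can be verified numerically for the rectangular pulse (19.21); the symbol period and grid spacing below are arbitrary illustrative choices.

```python
# Numerical check of the zero-ISI condition (19.17) for the unit-energy
# rectangular pulse (19.21).
import numpy as np

Ts, dt = 1.0, 1e-4
t = np.arange(-4 * Ts, 4 * Ts, dt)

def g_rect(t, Ts=1.0):
    # (1/sqrt(Ts)) rect(t/Ts - 0.5): unit-energy pulse supported on [0, Ts)
    return (1.0 / np.sqrt(Ts)) * ((t >= 0) & (t < Ts))

for shift in range(3):
    inner = np.sum(g_rect(t) * g_rect(t - shift * Ts)) * dt
    print("shift", shift, "symbols:", round(float(inner), 4))   # 1.0, 0.0, 0.0
```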
References
[1] A. J. Viterbi. Spread spectrum communications: myths and realities. IEEE Commun. Mag., 17(3):11-18, May 1979.
[2] J. L. Massey. Information theory aspects of spread-spectrum communications. In Proc. IEEE ISSSTA '94, pages 16-20, Oulu, Finland, July 1994.
[3] V. V. Veeravalli. The coding-spreading tradeoff in CDMA systems. In Proc. 27th Annual Allerton Conference, pages 831-840, Monticello, IL, September 1999.
[4] A. Mantravadi and V. V. Veeravalli. On chip-matched filtering and discrete sufficient statistics for asynchronous band-limited CDMA systems. IEEE Trans. Commun., 2000. Submitted. See http://www.comm.csl.uiuc.edu/vvv/cv/pubs/ for a copy.
Lecture 20
For DS/SS linear modulation, the transmitted signal of (19.16) becomes
$$ s(t) = \sum_n z^{(n)}\, c^{(n)}(t - nT_s) \qquad (20.1) $$
where $c^{(n)}(\cdot)$ is a unit energy waveform that replaces $g_{T_s}(\cdot)$ in (19.16), and is given by:
$$ c^{(n)}(t) = \sum_{j=0}^{N-1} c_j^{(n)}\, g_{T_c}(t - jT_c) \; . \qquad (20.2) $$
The sequence $\{c_j^{(n)}\}_{j=0}^{N-1}$ is the chip sequence for the $n$-th symbol interval, and can be written compactly using the vector notation
$$ \mathbf{c}^{(n)} = [c_0^{(n)}\; c_1^{(n)}\; \cdots\; c_{N-1}^{(n)}]^\top \qquad (20.3) $$
with $\mathbf{c}^{(n)}$ normalized such that $\mathbf{c}^{(n)\dagger} \mathbf{c}^{(n)} = 1$. There are two special cases that we can consider:
Short sequences: $\mathbf{c}^{(n)} = \mathbf{c}$ for all $n$.
Long sequences: $\mathbf{c}^{(n)}$ is different for each $n$, and the sequence may repeat after a long period that spans several symbols. Such sequences are generated using pseudorandom number generators and are also called random sequences.
The chip pulse $g_{T_c}(\cdot)$ is a unit energy pulse, with $T_c = T_s/N$, satisfying the chip-level Nyquist condition
$$ \int_{-\infty}^{\infty} g_{T_c}(t - iT_c)\, g_{T_c}(t - jT_c)\, dt = \delta[i - j] \qquad (20.4) $$
and just as with $g_{T_s}(\cdot)$, there is a range of choices for $g_{T_c}(\cdot)$, with the sinc pulse and the rectangular pulse being extreme cases. Since $g_{T_c}(\cdot)$ has unit energy and $\mathbf{c}^{(n)\dagger}\mathbf{c}^{(n)} = 1$, it follows that $c^{(n)}(\cdot)$ is a unit energy waveform. It is also clear from (20.4) that $c^{(n)}(\cdot)$ satisfies the zero ISI condition given in (19.17).
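A minimal sketch (illustrative parameters) of building the spreading waveform of (20.2) from a normalized chip vector, assuming a rectangular chip pulse.

```python
# Construct c(t) of (20.2) from a chip sequence, using a unit-energy
# rectangular chip pulse; N and the chip values are illustrative.
import numpy as np

rng = np.random.default_rng(0)
N, Ts = 8, 1.0
Tc = Ts / N                                                      # chip period
chips = rng.choice([+1.0, -1.0], size=N) / np.sqrt(N)            # c^T c = 1

dt = Tc / 100
t = np.arange(0.0, Ts, dt)
g_Tc = lambda tt: (1.0 / np.sqrt(Tc)) * ((tt >= 0) & (tt < Tc))  # chip pulse

c_t = sum(chips[j] * g_Tc(t - j * Tc) for j in range(N))
print("waveform energy:", float(np.sum(c_t ** 2) * dt))          # close to 1
```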
Single User Communications with DS/SS Linear Modulation
Consider single user communications over an AWGN channel with DS/SS linear modulation. The received
signal is given by:
$$ y(t) = e^{j\phi} \sqrt{E_s} \sum_n z^{(n)}\, c^{(n)}(t - nT_s) + w(t) \qquad (20.5) $$
Assuming zero ISI, symbol-by-symbol detection is optimum, and the sufficient statistic for detecting the
symbol corresponding to interval $[0, T_s]$ (say) is given by
$$ y = \langle y(t),\, e^{j\phi} c(t) \rangle = \sqrt{E_s}\, z + w \qquad (20.6) $$
where we have dropped the superscript $(0)$ for convenience, and where $w \sim \mathcal{CN}(0, N_0)$. Note that we need to know the sequence $\mathbf{c} = [c_0\; c_1\; \cdots\; c_{N-1}]^\top$ in addition to $\phi$ to compute the above correlation.
For soft decision decoding, we send $y$ to the decoder. The performance metric for soft decisions is the signal-to-noise ratio in the statistic $y$, which is given by:
$$ \mathrm{SNR} = \frac{E\big[|E[y \mid z]|^2\big]}{\mathrm{var}(y \mid z)} = \frac{E_s}{N_0} \; . \qquad (20.7) $$
For hard decision decoding, the MPE decision rule is given by:
$$ \hat z_{\mathrm{MPE}} = \arg\max_{z \in \mathcal{S}}\, p(y \mid z) = \arg\min_{z \in \mathcal{S}}\, \big| y - \sqrt{E_s}\, z \big|^2 \; . \qquad (20.8) $$
For BPSK symbols, for example, the resulting bit error probability with hard decisions is
$$ P_b = Q\!\left(\sqrt{2\,\mathrm{SNR}}\right) = Q\!\left( \sqrt{\frac{2 E_b}{N_0}} \right) . \qquad (20.9),\,(20.10) $$
Note that the performance is the same as without spreading: spreading results in zero coding gain in AWGN, i.e., it is akin to repetition coding with soft decision decoding.
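A chip-rate Monte Carlo sketch (illustrative parameters, not the notes' development) of the despreading correlation (20.6), showing that the post-despreading SNR is Es/N0 as in (20.7), independent of the spreading.

```python
# Despreading correlator y = c^dagger r in discrete (chip-rate) form, and an
# empirical estimate of the SNR of (20.7); all parameters are illustrative.
import numpy as np

rng = np.random.default_rng(1)
N, Es, N0, trials = 16, 1.0, 0.1, 20000
c = rng.choice([+1.0, -1.0], size=N) / np.sqrt(N)       # unit-norm chip vector

z = rng.choice([+1.0, -1.0], size=trials)               # BPSK symbols
noise = np.sqrt(N0 / 2) * (rng.standard_normal((trials, N))
                           + 1j * rng.standard_normal((trials, N)))
r = np.sqrt(Es) * z[:, None] * c[None, :] + noise       # received chip samples
y = r @ c                                               # despreading correlation

snr_hat = np.abs(np.mean(y * z)) ** 2 / np.var(y * z)
print("estimated SNR:", float(snr_hat), " Es/N0 =", Es / N0)
```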
Multiuser Communications
Users are indexed by k = 1, 2, . . . , K, with K being the total number of users.
The signal of user $k$ (for linear modulation) is given by
$$ s_k(t) = \sqrt{E_{s,k}} \sum_n z_k^{(n)}\, c_k^{(n)}(t - nT_s) \; . \qquad (20.11) $$
where $E_{s,k}$ is the average symbol energy of user $k$, $z_k^{(n)}$ is the $n$-th symbol of user $k$, $T_s$ is the symbol period, and $c_k^{(n)}(\cdot)$ is the signaling waveform for the $n$-th symbol of user $k$. Note that $c_k^{(n)}(\cdot)$ is not necessarily a spreading waveform.
Signal separation
For FDMA, $\{s_k(t)\}_{k=1}^{K}$ occupy orthogonal frequency slots (possibly separated by guard bands).
For TDMA, $\{s_k(t)\}_{k=1}^{K}$ occupy orthogonal time slots (possibly separated by guard times).
For CDMA, $\{s_k(t)\}_{k=1}^{K}$ have their energy spread out roughly uniformly over time and frequency. The signals are not necessarily orthogonal, and they may not even be linearly independent.
DS/SS CDMA
For DS/CDMA, the signaling waveform for the $n$-th symbol of user $k$ is the spreading waveform given by
$$ c_k^{(n)}(t) = \sum_{j=0}^{N-1} c_{k,j}^{(n)}\, g_{T_c}(t - jT_c) \qquad (20.12) $$
where the chip vector
$$ \mathbf{c}_k^{(n)} = [c_{k,0}^{(n)}\; c_{k,1}^{(n)}\; \cdots\; c_{k,N-1}^{(n)}]^\top \quad \text{is normalized so that} \quad \mathbf{c}_k^{(n)\dagger}\,\mathbf{c}_k^{(n)} = 1 \; . \qquad (20.13) $$
The total transmitted signal is then
$$ \sum_{k=1}^{K} s_k(t) = \sum_{k=1}^{K} A_k \sum_n z_k^{(n)}\, c_k^{(n)}(t - nT_s) \qquad (20.14) $$
where $A_k = \sqrt{E_{s,k}}$.
References
[1] S. Verdu. Multiuser Detection. Cambridge University Press, United Kingdom, 1998.
Lecture 21
Consider symbol-synchronous DS/CDMA with $K$ users over an AWGN channel. The received signal is given by
$$ y(t) = \sum_{k=1}^{K} A_k \sum_n z_k^{(n)}\, c_k^{(n)}(t - nT_s) + w(t) \; . \qquad (21.1) $$
The matched filter output for user $k$ in the $n$-th symbol interval is
$$ y_k^{(n)} = \langle y(t),\, c_k^{(n)}(t - nT_s) \rangle \qquad (21.2) $$
$$ \phantom{y_k^{(n)}} = \sum_{\ell=1}^{K} A_\ell\, z_\ell^{(n)} \big\langle c_\ell^{(n)}(t - nT_s),\, c_k^{(n)}(t - nT_s) \big\rangle + w_k^{(n)} \; . \qquad (21.3) $$
Note that $y_k^{(n)}$ and $\{w_\ell^{(n')}\}_{\ell=1}^{K}$ are independent for $n' \neq n$. Thus, one-shot multiuser processing is optimum, i.e., the symbol decisions for the users can be made one symbol interval at a time without loss of optimality.
Without loss of generality, consider symbol interval [0, Ts ], i.e., n = 0, and drop the superscript 0 for
convenience. Then
$$ y_k = \sum_{\ell=1}^{K} A_\ell\, z_\ell \,\langle c_\ell(t),\, c_k(t) \rangle + w_k \qquad (21.4) $$
$$ \phantom{y_k} = A_k z_k + \sum_{\ell \neq k} A_\ell\, z_\ell\, \rho_{\ell,k} + w_k \qquad (21.5) $$
where
$$ \rho_{\ell,k} = \langle c_\ell(t),\, c_k(t) \rangle = \sum_{i=0}^{N-1} c_{\ell,i}\, c_{k,i}^{\star} = \mathbf{c}_k^{\dagger}\, \mathbf{c}_\ell \qquad (21.6) $$
and $w_k \sim \mathcal{CN}(0, N_0)$. The noise components are not independent; in particular, $E[w_k w_\ell^\star] = N_0\, \rho_{\ell,k}$.
If $K \le N$, the chip sequences can be made orthogonal, i.e., $\rho_{\ell,k} = \delta[k - \ell]$, and hence
$$ y_k = A_k z_k + w_k \qquad (21.7) $$
which is the same as the expression for the MF output for a single user in AWGN. Thus single-user detection is optimum in this case, and the performance obtained is the same as that without the multiple-access interference (MAI) from other users. An example of an orthogonal sequence set is the set of Walsh-Hadamard sequences, which are used in the forward link of IS-95 based CDMA systems.
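A small sketch (hypothetical parameters) building length-8 Walsh-Hadamard sequences by the Sylvester construction and checking that the cross-correlations of (21.6) vanish for distinct users.

```python
# Walsh-Hadamard sequences of length N = 8 and the correlation matrix rho of (21.6).
import numpy as np

def hadamard(n):                        # Sylvester construction; n a power of 2
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

N = 8
C = hadamard(N) / np.sqrt(N)            # rows are unit-norm spreading sequences
rho = C @ C.T                           # rho[l, k] = c_k^T c_l for real sequences
print(np.allclose(rho, np.eye(N)))      # True: orthogonal, so no MAI
```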
When the sequences are not orthogonal, the matched filter output contains a multiple-access interference (MAI) term:
$$ y_k = A_k z_k + I_k + w_k \,, \qquad (21.8) $$
where
$$ I_k = \sum_{\ell \neq k} A_\ell\, z_\ell\, \rho_{\ell,k} \; . \qquad (21.9) $$
If we approximate $I_k$ by a zero mean, PCG random variable, then the conditional pdf of $y_k$ given $z_k$ is PCG with mean $A_k z_k$.
A single user (SU) detector treats $I_k$ as a CCG random variable and makes a decision on $z_k$ based purely on $y_k$, i.e., ignoring $\{y_\ell\}_{\ell \neq k}$.
For SU hard decision making, the ML decision for $z_k$ based on $y_k$ is given by:
$$ \hat z_{k,\text{SU-MF}} = \arg\max_{z_k \in \mathcal{S}}\, p(y_k \mid z_k) \qquad (21.10) $$
$$ \phantom{\hat z_{k,\text{SU-MF}}} = \arg\min_{z_k \in \mathcal{S}}\, |y_k - A_k z_k|^2 \; . \qquad (21.11) $$
For SU soft decision making, we send yk to the decoder and the decoder may use knowledge of Ak in
decoding.
For hard decisions, the performance metric of interest is of course the probability of error $P_{e,k} = P\{\hat z_k \neq z_k\}$.
For soft decisions, a useful performance metric is the signal-to-interference ratio (SIR) in the soft decision statistic, defined by:
$$ \mathrm{SIR}_k = \frac{E\big[|E[y_k \mid z_k]|^2\big]}{\mathrm{Var}(y_k \mid z_k)} \; . \qquad (21.12) $$
Single User Detection Performance Analysis
Binary signaling assumption: For the analysis in this section we make the simplifying assumption that symbols and spreading sequences are binary, i.e.,
$$ z_k \in \{+1, -1\} \,, \quad \text{and} \quad c_{k,i} \in \left\{ +\frac{1}{\sqrt N},\; -\frac{1}{\sqrt N} \right\} . \qquad (21.13) $$
With orthogonal sequences (no MAI),
$$ \mathrm{SIR}_k = \frac{E\big[|E[y_k \mid z_k]|^2\big]}{\mathrm{Var}(y_k \mid z_k)} = \frac{E\big[(A_k z_k)^2\big]}{N_0} = \frac{E_{b,k}}{N_0} \; . \qquad (21.15) $$
The BER for user $k$ for the MPE SU decision rule of (21.11) is given by:
$$ P_{b,k} = Q\!\left( \sqrt{\frac{2 E_{b,k}}{N_0}} \right) = Q\!\left( \sqrt{2\, \mathrm{SIR}_k} \right) . \qquad (21.16) $$
For nonorthogonal sequences,
$$ y_k = A_k z_k + \sum_{\ell \neq k} A_\ell\, z_\ell\, \rho_{\ell,k} + w_k \qquad (21.17) $$
and
$$ \mathrm{Var}(y_k \mid z_k) = \sum_{\ell \neq k} A_\ell^2\, \rho_{\ell,k}^2 + N_0 \; . \qquad (21.18) $$
Thus
$$ \mathrm{SIR}_k = \frac{A_k^2}{\sum_{\ell \neq k} A_\ell^2\, \rho_{\ell,k}^2 + N_0} = \frac{E_{b,k}/N_0}{1 + \sum_{\ell \neq k} (E_{b,\ell}/N_0)\, \rho_{\ell,k}^2} \; . \qquad (21.19) $$
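A minimal sketch evaluating SIR_k of (21.19) for one randomly drawn set of binary sequences; the amplitudes and noise level are illustrative.

```python
# SIR_k of (21.19) for fixed amplitudes and cross-correlations.
import numpy as np

rng = np.random.default_rng(2)
K, N, N0 = 6, 16, 0.05
C = rng.choice([+1.0, -1.0], size=(K, N)) / np.sqrt(N)   # one sequence per user
A = np.ones(K)                                            # A_k = sqrt(E_{b,k}), equal here
rho = C @ C.T                                             # rho[l, k] = c_k^T c_l

k = 0
mai = sum(A[l] ** 2 * rho[l, k] ** 2 for l in range(K) if l != k)
print("SIR_k =", A[k] ** 2 / (mai + N0),
      " single-user value Eb/N0 =", A[k] ** 2 / N0)
```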
The BER for user $k$ for the MPE decision rule of (21.11) is to be computed by averaging over the distribution of the bits of the other users. We do this by first computing $P_{b,k}$ conditioned on the bits of the other users, and then averaging over the distribution of these bits:
$$ P_{b,k} = P\big( \{\hat z_k = -1\} \mid \{z_k = +1\} \big) = E\Big[ P\big( \{y_{k,I} < 0\} \mid \{z_k = +1\}, \{z_\ell\}_{\ell \neq k} \big) \Big] = E\left[ Q\!\left( \frac{A_k + \sum_{\ell \neq k} A_\ell\, z_\ell\, \rho_{\ell,k}}{\sqrt{N_0/2}} \right) \right] $$
$$ = \frac{1}{2^{K-1}} \sum_{\tilde z \in \{+1,-1\}^{K-1}} Q\!\left( \sqrt{\frac{2 E_{b,k}}{N_0}} + \sum_{\ell \neq k} \sqrt{\frac{2 E_{b,\ell}}{N_0}}\, z_\ell\, \rho_{\ell,k} \right) \qquad (21.20) $$
where $\tilde z = [z_1 \cdots z_{k-1}\; z_{k+1} \cdots z_K]$. Note that from the above equation we can immediately conclude that
$$ P_{b,k} \ge Q\!\left( \sqrt{\frac{2 E_{b,k}}{N_0}} \right) = P_{b,k}(\text{orthogonal signaling}) \quad \text{[Why?]} \qquad (21.21) $$
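A sketch of the exact computation in (21.20), enumerating all 2^(K-1) interferer bit patterns (feasible only for small K); the sequences and parameters are illustrative.

```python
# Exact P_{b,k} of (21.20) for the single-user matched-filter detector.
import itertools, math
import numpy as np

def q_func(x):
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def pbk_exact(A, rho, k, N0):
    K = len(A)
    others = [l for l in range(K) if l != k]
    total = 0.0
    for bits in itertools.product([+1.0, -1.0], repeat=K - 1):
        arg = A[k] + sum(A[l] * z * rho[l, k] for l, z in zip(others, bits))
        total += q_func(arg / math.sqrt(N0 / 2.0))
    return total / 2 ** (K - 1)

rng = np.random.default_rng(3)
K, N, N0 = 5, 16, 0.05
C = rng.choice([+1.0, -1.0], size=(K, N)) / np.sqrt(N)
rho = C @ C.T
A = np.ones(K)
print("exact P_b,1:", pbk_exact(A, rho, 0, N0),
      " no-MAI value:", q_func(math.sqrt(2 * A[0] ** 2 / N0)))
```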
Also note that the number of terms in the sum grows exponentially with $K$, and hence it is difficult to compute $P_{b,k}$ exactly when $K$ is large.
Gaussian approximation for $P_{b,k}$: For large $K$, the MAI term in $y_{k,I}$ is approximately Gaussian,
$$ \sum_{\ell \neq k} A_\ell\, z_\ell\, \rho_{\ell,k} \;\approx\; \mathcal{N}\Big(0,\; \sum_{\ell \neq k} A_\ell^2\, \rho_{\ell,k}^2 \Big) \qquad (21.22) $$
where
$$ \sigma_1^2 = \mathrm{Var}(y_{k,I} \mid z_k) = \sum_{\ell \neq k} A_\ell^2\, \rho_{\ell,k}^2 + \frac{N_0}{2} \; . \qquad (21.23) $$
Thus
$$ P_{b,k} \approx Q\!\left( \frac{A_k}{\sigma_1} \right) . \qquad (21.24) $$
For random binary sequences, $\rho_{\ell,k}^2$ has mean $1/N$; replacing $\rho_{\ell,k}^2$ by $1/N$ gives
$$ \mathrm{SIR}_k = \frac{A_k^2}{\frac{1}{N}\sum_{\ell \neq k} A_\ell^2 + N_0} = \frac{E_{b,k}/N_0}{1 + \frac{1}{N}\sum_{\ell \neq k} E_{b,\ell}/N_0} \qquad (21.25) $$
and, for equal-power users,
$$ \mathrm{SIR}_k = \frac{E_b/N_0}{1 + \frac{K-1}{N}\, \frac{E_b}{N_0}} \approx \frac{N}{K-1} \quad \text{for large $K$ or large $E_b/N_0$.} \qquad (21.26) $$
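A few lines (illustrative N and K) comparing the SIR of (21.26) with the interference-limited asymptote N/(K-1).

```python
# SIR of (21.26) for equal-power users versus the asymptote N/(K-1).
N, K = 128, 33
for EbN0_dB in (5, 10, 20, 40):
    EbN0 = 10 ** (EbN0_dB / 10.0)
    sir = EbN0 / (1.0 + (K - 1) / N * EbN0)
    print(f"Eb/N0 = {EbN0_dB:2d} dB: SIR = {sir:6.2f}, N/(K-1) = {N / (K - 1):.2f}")
```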
The BER for user k for the MPE decision rule of (21.11) is to be computed by averaging over the
distribution of the bits as well as the chips. The procedure is similar to that used for Case 2.
$$ P_{b,k} = E\left[ Q\!\left( \sqrt{\frac{2 E_{b,k}}{N_0}} + \sum_{\ell \neq k} \sqrt{\frac{2 E_{b,\ell}}{N_0}}\, z_\ell\, \rho_{\ell,k} \right) \right] \qquad (21.27) $$
where the expectation is taken over the distribution of the bits and the chips. It is clear that computing
this expectation is even more cumbersome than in Case 2.
Gaussian approximation for Pb,k : For large K, and equal powers, using steps similar to those used in
Case 2, we can approximate Pb,k as
$$ P_{b,k} \approx Q\!\left( \sqrt{\mathrm{SIR}_k} \right) \approx Q\!\left( \sqrt{\frac{N}{K-1}} \right) . \qquad (21.28) $$
See Figures 21.1 and 21.2 for typical numerical results.
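A Monte Carlo sketch (illustrative parameters; not the simulation behind the figures) comparing the simulated matched-filter BER with the Gaussian approximation (21.28) for equal-power users and random binary sequences.

```python
# Simulated single-user matched-filter BER versus the approximation (21.28).
import math
import numpy as np

def q_func(x):
    return 0.5 * math.erfc(x / math.sqrt(2.0))

rng = np.random.default_rng(4)
K, N, EbN0_dB, trials = 16, 32, 25.0, 50000
Eb = 1.0
N0 = Eb / 10 ** (EbN0_dB / 10.0)

errors = 0
for _ in range(trials):
    C = rng.choice([+1.0, -1.0], size=(K, N)) / math.sqrt(N)   # fresh random sequences
    z = rng.choice([+1.0, -1.0], size=K)                       # users' bits
    rho_1 = C @ C[0]                                           # correlations with user 1
    y1 = math.sqrt(Eb) * float(np.dot(z, rho_1)) + rng.normal(0.0, math.sqrt(N0 / 2))
    errors += int(np.sign(y1) != z[0])
print("simulated P_b,1:", errors / trials,
      " Q(sqrt(N/(K-1))):", q_func(math.sqrt(N / (K - 1))))
```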
[Figure 21.1: SIR (dB) versus Eb/N0 (dB) for synchronous CDMA with random sequences and equal power users.]
[Figure 21.2: BER versus Eb/N0 for synchronous CDMA with random sequences and equal power users (N = 31), with the orthogonal-sequence curve shown for reference.]