
Solutions to Homework Set #4

Channel and Source coding


1. Rates
   (a) Channel coding rate: Assume you are sending 1024 different
       messages using 20 uses of a channel. What is the rate (in bits
       per channel use) at which you send?
   (b) Source coding rate: Assume you have a file with 10^6 ASCII
       characters, where the alphabet of ASCII characters has size 256.
       After compressing it we get 2 · 10^6 bits. What is the
       compression rate?
Solution: Rates.
(a) R = (1/20) log2(1024) = 10/20 = 1/2 bit per channel use.
(b) R = (2 · 10^6) / (10^6 · log2(256)) = (2 · 10^6) / (8 · 10^6) = 1/4.
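As a quick numeric sanity check, here is a small Python sketch of both
computations (the variable names are ours, not from the problem set):

    import math

    # Channel coding rate: log2(#messages) per channel use.
    n_messages, n_uses = 1024, 20
    channel_rate = math.log2(n_messages) / n_uses            # 0.5 bits/use

    # Source coding rate: compressed bits per source bit.
    n_chars, alphabet_size = 10**6, 256
    compressed_bits = 2 * 10**6
    source_rate = compressed_bits / (n_chars * math.log2(alphabet_size))

    print(channel_rate, source_rate)                         # 0.5 0.25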
2. Preprocessing the output.
   One is given a communication channel with transition probabilities
   p(y|x) and channel capacity C = max_{p(x)} I(X; Y). A helpful
   statistician preprocesses the output by forming Ỹ = g(Y), yielding
   a channel p(ỹ|x). He claims that this will strictly improve the
   capacity.
   (a) Show that he is wrong.
   (b) Under what conditions does he not strictly decrease the capacity?
Solution: Preprocessing the output.
(a) The statistician calculates Ỹ = g(Y). Since X → Y → Ỹ forms a
    Markov chain, we can apply the data processing inequality. Hence,
    for every distribution on X,

        I(X; Y) ≥ I(X; Ỹ).

    Let p̃(x) be the distribution on X that maximizes I(X; Ỹ). Then

        C = max_{p(x)} I(X; Y) ≥ I(X; Y)|_{p(x) = p̃(x)}
          ≥ I(X; Ỹ)|_{p(x) = p̃(x)} = max_{p(x)} I(X; Ỹ) = C̃.

    Thus, the helpful suggestion is wrong and processing the output
    does not increase capacity.
(b) We have equality (no decrease in capacity) in the above sequence
    of inequalities only if we have equality in the data processing
    inequality, i.e., for the distribution that maximizes I(X; Ỹ), we
    have X → Ỹ → Y forming a Markov chain. Thus, Ỹ should be a
    sufficient statistic.
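To see the data processing inequality at work numerically, here is a
minimal Python sketch. The channel matrix and the merging map g are our
own illustrative choices (binary input, ternary output, with g collapsing
outputs 1 and 2 into one symbol):

    import numpy as np

    def mutual_information(px, Q):
        """I(X;Y) in bits for input distribution px and channel Q[x, y]."""
        pxy = px[:, None] * Q
        py = pxy.sum(axis=0)
        mask = pxy > 0
        return (pxy[mask] * np.log2((pxy / (px[:, None] * py[None, :]))[mask])).sum()

    Q = np.array([[0.8, 0.1, 0.1],
                  [0.1, 0.1, 0.8]])          # hypothetical p(y|x)
    px = np.array([0.5, 0.5])

    G = np.array([[1, 0],
                  [0, 1],
                  [0, 1]])                   # G[y, g(y)] = 1
    Q_tilde = Q @ G                          # channel from X to g(Y)

    print(mutual_information(px, Q))         # I(X; Y)
    print(mutual_information(px, Q_tilde))   # I(X; g(Y)), never larger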
3. The Z channel.
   The Z-channel has binary input and output alphabets and transition
   probabilities p(y|x) given by the following matrix:

       Q = [  1    0  ]
           [ 1/2  1/2 ],    x, y ∈ {0, 1}.

   Find the capacity of the Z-channel and the maximizing input
   probability distribution.
Solution: The Z channel.
First we express I(X; Y), the mutual information between the input and
output of the Z-channel, as a function of α = Pr(X = 1):

    H(Y|X) = Pr(X = 0) · 0 + Pr(X = 1) · 1 = α
    H(Y) = H(Pr(Y = 1)) = H(α/2)
    I(X; Y) = H(Y) − H(Y|X) = H(α/2) − α.

Since I(X; Y) is strictly concave in α (why?) and I(X; Y) = 0 when α = 0
and α = 1, the maximum mutual information is obtained for some value of
α such that 0 < α < 1.
Using elementary calculus, we determine that

    (d/dα) I(X; Y) = (1/2) log2( (1 − α/2) / (α/2) ) − 1,

which is equal to zero for α = 2/5. (It is reasonable that
Pr(X = 1) < 1/2, since X = 1 is the noisy input to the channel.) So the
capacity of the Z-channel in bits is C = H(1/5) − 2/5 ≈ 0.722 − 0.4 = 0.322.
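A quick numeric check of the maximizer and the capacity (plain NumPy; a
sketch, not part of the graded solution):

    import numpy as np

    def H(p):
        """Binary entropy in bits."""
        p = np.clip(p, 1e-12, 1 - 1e-12)
        return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

    alpha = np.linspace(0.001, 0.999, 100_000)
    I = H(alpha / 2) - alpha           # I(X;Y) for the Z-channel
    k = np.argmax(I)
    print(alpha[k], I[k])              # ~0.4, ~0.322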
4. Using two channels at once.
   Consider two discrete memoryless channels (X1, p(y1|x1), Y1) and
   (X2, p(y2|x2), Y2) with capacities C1 and C2 respectively. A new
   channel (X1 × X2, p(y1|x1) · p(y2|x2), Y1 × Y2) is formed in which
   x1 ∈ X1 and x2 ∈ X2 are simultaneously sent, resulting in y1, y2.
   Find the capacity of this channel.
Solution: Using two channels at once.
To find the capacity of the product channel
(X1 × X2, p(y1, y2|x1, x2), Y1 × Y2), we have to find the distribution
p(x1, x2) on the input alphabet X1 × X2 that maximizes I(X1, X2; Y1, Y2).
Since the transition probabilities are given as
p(y1, y2|x1, x2) = p(y1|x1) p(y2|x2),

    p(x1, x2, y1, y2) = p(x1, x2) p(y1, y2|x1, x2)
                      = p(x1, x2) p(y1|x1) p(y2|x2).

Therefore, Y1 → X1 → X2 → Y2 forms a Markov chain and

    I(X1, X2; Y1, Y2) = H(Y1, Y2) − H(Y1, Y2|X1, X2)
                      = H(Y1, Y2) − H(Y1|X1, X2) − H(Y2|X1, X2)    (1)
                      = H(Y1, Y2) − H(Y1|X1) − H(Y2|X2)            (2)
                      ≤ H(Y1) + H(Y2) − H(Y1|X1) − H(Y2|X2)        (3)
                      = I(X1; Y1) + I(X2; Y2),

where Eqs. (1) and (2) follow from Markovity, and Eq. (3) is met with
equality if X1 and X2 are independent and hence Y1 and Y2 are
independent. Therefore,

    C = max_{p(x1, x2)} I(X1, X2; Y1, Y2)
      ≤ max_{p(x1, x2)} I(X1; Y1) + max_{p(x1, x2)} I(X2; Y2)
      = max_{p(x1)} I(X1; Y1) + max_{p(x2)} I(X2; Y2)
      = C1 + C2,

with equality iff p(x1, x2) = p*(x1) p*(x2), where p*(x1) and p*(x2) are
the distributions that maximize C1 and C2 respectively.
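The additivity can also be checked numerically with a generic capacity
solver. Below is a minimal Blahut–Arimoto sketch (a standard algorithm;
this particular implementation and the BSC example values are ours)
applied to the product of two BSCs:

    import numpy as np

    def blahut_arimoto(Q, iters=500):
        """Capacity in bits of a DMC with transition matrix Q[x, y].

        Assumes Q has no zero entries (true for the BSCs below).
        """
        n_x = Q.shape[0]
        p = np.full(n_x, 1.0 / n_x)
        for _ in range(iters):
            q = p @ Q                                # output distribution
            D = (Q * np.log2(Q / q)).sum(axis=1)     # D(Q(.|x) || q), bits
            p = p * np.exp2(D)                       # multiplicative update
            p /= p.sum()
        q = p @ Q
        D = (Q * np.log2(Q / q)).sum(axis=1)
        return (p * D).sum()

    def bsc(eps):
        return np.array([[1 - eps, eps], [eps, 1 - eps]])

    Q1, Q2 = bsc(0.1), bsc(0.2)
    Q_prod = np.kron(Q1, Q2)     # p(y1, y2 | x1, x2) = p(y1|x1) p(y2|x2)
    print(blahut_arimoto(Q1) + blahut_arimoto(Q2))   # C1 + C2 ~ 0.809
    print(blahut_arimoto(Q_prod))                    # same value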
5. A channel with two independent looks at Y.
   Let Y1 and Y2 be conditionally independent and conditionally
   identically distributed given X. Thus p(y1, y2|x) = p(y1|x) p(y2|x).
   (a) Show I(X; Y1, Y2) = 2 I(X; Y1) − I(Y1; Y2).
   (b) Conclude that the capacity of the channel X → (Y1, Y2) is less
       than twice the capacity of the channel X → Y1.
Solution: A channel with two independent looks at Y.
(a)
    I(X; Y1, Y2)
      = H(Y1, Y2) − H(Y1, Y2|X)                                    (4)
      = H(Y1) + H(Y2) − I(Y1; Y2) − H(Y1|X) − H(Y2|X)              (5)
        (since Y1 and Y2 are conditionally independent given X)    (6)
      = I(X; Y1) + I(X; Y2) − I(Y1; Y2)                            (7)
      = 2 I(X; Y1) − I(Y1; Y2)                                     (8)
        (since Y1 and Y2 are conditionally identically distributed).
(b) The capacity of the single-look channel X → Y1 is

        C1 = max_{p(x)} I(X; Y1).                                  (9)

    The capacity of the channel X → (Y1, Y2) is

        C2 = max_{p(x)} I(X; Y1, Y2)                               (10)
           = max_{p(x)} [2 I(X; Y1) − I(Y1; Y2)]                   (11)
           ≤ max_{p(x)} 2 I(X; Y1)                                 (12)
           = 2 C1.                                                 (13)

    Hence, two independent looks cannot be more than twice as good as
    one look.
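A numeric illustration of the identity in part (a), using a BSC looked
at twice (the crossover value 0.1 and the uniform input are our example
choices):

    import numpy as np

    eps = 0.1
    px = np.array([0.5, 0.5])
    Q = np.array([[1 - eps, eps], [eps, 1 - eps]])

    # Joint p(x, y1, y2) with two conditionally i.i.d. looks at X.
    p = np.array([[[px[x] * Q[x, y1] * Q[x, y2]
                    for y2 in (0, 1)] for y1 in (0, 1)] for x in (0, 1)])

    def I(pab):
        """Mutual information in bits from a 2-D joint distribution."""
        pa, pb = pab.sum(axis=1), pab.sum(axis=0)
        mask = pab > 0
        return (pab[mask] * np.log2((pab / np.outer(pa, pb))[mask])).sum()

    I_X_Y12 = I(p.reshape(2, 4))           # I(X; Y1, Y2)
    I_X_Y1 = I(p.sum(axis=2))              # I(X; Y1)
    I_Y1_Y2 = I(p.sum(axis=0))             # I(Y1; Y2)
    print(I_X_Y12, 2 * I_X_Y1 - I_Y1_Y2)   # equal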
6. Choice of channels.
   Find the capacity C of the union of two channels (X1, p1(y1|x1), Y1)
   and (X2, p2(y2|x2), Y2) where, at each time, one can send a symbol
   over channel 1 or over channel 2 but not both. Assume the output
   alphabets are distinct and do not intersect.
   (a) Show 2^C = 2^C1 + 2^C2.
   (b) What is the capacity of the channel in the figure below?
   [Figure: a channel with input and output alphabets {1, 2, 3}.
    Symbols 1 and 2 form a BSC with crossover probability p (transition
    probabilities 1 − p and p), while symbol 3 is received noiselessly.]
Solution: Choice of channels.
(a) Let

        Θ = { 1, if the signal is sent over channel 1
            { 2, if the signal is sent over channel 2.

    Consider the following communication scheme: the sender chooses
    between the two channels according to a Bern(α) coin flip, with
    α = Pr(Θ = 1). The channel input is then X = (Θ, X_Θ).
    Since the output alphabets Y1 and Y2 are disjoint, Θ is a function
    of Y.
    Therefore,

        I(X; Y) = I(X; Y, Θ)
                = I(X_Θ, Θ; Y, Θ)
                = I(Θ; Y, Θ) + I(X_Θ; Y, Θ | Θ)
                = I(Θ; Y, Θ) + I(X_Θ; Y | Θ)
                = H(Θ) + α I(X_Θ; Y | Θ = 1) + (1 − α) I(X_Θ; Y | Θ = 2)
                = H(α) + α I(X1; Y1) + (1 − α) I(X2; Y2).

    Thus, it follows that

        C = sup_α { H(α) + α C1 + (1 − α) C2 },

    which is a strictly concave function of α. Hence the maximum
    exists, and by elementary calculus one can easily show that
    C = log2(2^C1 + 2^C2), attained with α = 2^C1 / (2^C1 + 2^C2).
    If one interprets M = 2^C as the effective number of noise-free
    symbols, then the above result follows in a rather intuitive
    manner: we have M1 = 2^C1 noise-free symbols from channel 1, and
    M2 = 2^C2 noise-free symbols from channel 2. Since at each step we
    get to choose which channel to use, we essentially have
    M1 + M2 = 2^C1 + 2^C2 noise-free symbols for the new channel.
    Therefore, the capacity of this channel is C = log2(2^C1 + 2^C2).
    This argument is very similar to the effective alphabet argument
    given in Problem 19, Chapter 2 of the text.
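A short numeric check that sup_α {H(α) + αC1 + (1 − α)C2} equals
log2(2^C1 + 2^C2), for arbitrary example values of C1 and C2:

    import numpy as np

    def H(a):
        return -a * np.log2(a) - (1 - a) * np.log2(1 - a)

    C1, C2 = 0.7, 0.3                            # example sub-capacities
    a = np.linspace(1e-6, 1 - 1e-6, 100_001)
    print(np.max(H(a) + a * C1 + (1 - a) * C2))  # ~1.5139
    print(np.log2(2**C1 + 2**C2))                # ~1.5139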
(b) The channel in the figure is the union of a BSC with crossover
    probability p (capacity 1 − H(p)) and a noiseless channel with a
    single symbol (capacity 0). From part (a), the capacity is

        C = log2( 2^{1 − H(p)} + 2^0 ).
7. Cascaded BSCs.
   Consider the two discrete memoryless channels (X, p1(y|x), Y) and
   (Y, p2(z|y), Z). Let p1(y|x) and p2(z|y) be binary symmetric
   channels with crossover probabilities ε1 and ε2 respectively.
   [Figure: the cascade X → Y → Z of two BSCs. X, Y, Z ∈ {0, 1}; the
    first stage has transition probabilities 1 − ε1 and ε1, the second
    stage 1 − ε2 and ε2.]
(a) What is the capacity C1 of p1(y|x)?
(b) What is the capacity C2 of p2(z|y)?
(c) We now cascade these channels. Thus p3(z|x) = Σ_y p1(y|x) p2(z|y).
    What is the capacity C3 of p3(z|x)? Show C3 ≤ min{C1, C2}.
(d) Now let us actively intervene between channels 1 and 2, rather than
    passively transmitting y^n. What is the capacity of channel 1
    followed by channel 2 if you are allowed to decode the output y^n
    of channel 1 and then re-encode it as ỹ^n for transmission over
    channel 2? (Think W → x^n(W) → y^n → ỹ^n(y^n) → z^n → Ŵ.)
(e) What is the capacity of the cascade in part (c) if the receiver can
    view both Y and Z?
Solution: Cascaded channels.
(The computation below treats the cascade of a BSC with parameter p and
an erasure channel with erasure probability α. For the cascaded BSCs
above, note that p3(z|x) is itself a BSC with crossover probability
ε1(1 − ε2) + ε2(1 − ε1), so C1 = 1 − H(ε1), C2 = 1 − H(ε2), and
C3 = 1 − H(ε1(1 − ε2) + ε2(1 − ε1)) ≤ min{C1, C2}, the bound also
following from the data processing inequality.)
(a) Brute force method: Let C1 = 1 − H(p) be the capacity of the BSC
    with parameter p, and C2 = 1 − α be the capacity of the BEC with
    erasure probability α. Let Ỹ denote the output of the cascaded
    channel, and Y the output of the BSC. Then the transition rule for
    the cascaded channel is simply

        p(ỹ|x) = Σ_{y = 0, 1} p(ỹ|y) p(y|x)

    for each (x, ỹ) pair.
    Let X ∼ Bern(θ) denote the input to the channel. Then

        H(Ỹ) = H( (1 − α)((1 − p)θ + p(1 − θ)), α,
                  (1 − α)(pθ + (1 − p)(1 − θ)) )

    and also

        H(Ỹ|X = 0) = H( (1 − α)(1 − p), α, (1 − α)p )
        H(Ỹ|X = 1) = H( (1 − α)p, α, (1 − α)(1 − p) ) = H(Ỹ|X = 0).

    Therefore,

        C = max_{p(x)} I(X; Ỹ)
          = max_{p(x)} [ H(Ỹ) − H(Ỹ|X) ]
          = max_{p(x)} [ H(Ỹ) ] − H(Ỹ|X)
            (since H(Ỹ|X = x) does not depend on x)
          = max_θ H( (1 − α)((1 − p)θ + p(1 − θ)), α,
                     (1 − α)(pθ + (1 − p)(1 − θ)) )
            − H( (1 − α)(1 − p), α, (1 − α)p ).                  (14)

    Note that the maximum value of H(Ỹ) occurs at θ = 1/2, by the
    concavity and symmetry of H(·). (We can check this also by
    differentiating Eq. (14) with respect to θ.)
    Substituting the value θ = 1/2 in the expression for the capacity
    yields

        C = H( (1 − α)/2, α, (1 − α)/2 ) − H( (1 − p)(1 − α), α, p(1 − α) )
          = (1 − α)( 1 + (1 − p) log2(1 − p) + p log2 p )
          = C1 · C2.

(b) Elegant method:
    For the cascade of an arbitrary discrete memoryless channel (with
    capacity C) with the erasure channel (with erasure probability α),
    we will show that

        I(X; Ỹ) = (1 − α) I(X; Y).                               (15)

    Then, by taking suprema of both sides over all input distributions
    p(x), we can conclude that the capacity of the cascaded channel is
    (1 − α)C.
    Proof of Eq. (15): Let

        E = { 1, if Ỹ = e
            { 0, if Ỹ = Y.

    Then, since E is a function of Ỹ,

        H(Ỹ) = H(Ỹ, E)
             = H(E) + H(Ỹ|E)
             = H(α) + α H(Ỹ|E = 1) + (1 − α) H(Ỹ|E = 0)
             = H(α) + (1 − α) H(Y),

    where the last equality comes directly from the construction of E
    (given E = 1, Ỹ = e deterministically). Similarly,

        H(Ỹ|X) = H(Ỹ, E|X)
                = H(E|X) + H(Ỹ|X, E)
                = H(α) + α H(Ỹ|X, E = 1) + (1 − α) H(Ỹ|X, E = 0)
                = H(α) + (1 − α) H(Y|X),

    whence

        I(X; Ỹ) = H(Ỹ) − H(Ỹ|X) = (1 − α) I(X; Y).
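A brute-force numeric check of the product formula C = C1 · C2 for the
BSC–BEC cascade (the example parameter values are ours):

    import numpy as np

    def H(probs):
        p = np.asarray(probs, dtype=float)
        p = p[p > 0]
        return -(p * np.log2(p)).sum()

    p_c, alpha = 0.1, 0.3                        # BSC crossover, BEC erasure
    theta = np.linspace(1e-6, 1 - 1e-6, 20_001)  # theta = Pr(X = 1)
    q1 = (1 - p_c) * theta + p_c * (1 - theta)   # Pr(Y = 1)

    HYt = np.array([H([(1 - alpha) * (1 - q), alpha, (1 - alpha) * q])
                    for q in q1])
    HYt_X = H([(1 - alpha) * (1 - p_c), alpha, (1 - alpha) * p_c])

    C1, C2 = 1 - H([p_c, 1 - p_c]), 1 - alpha
    print(np.max(HYt - HYt_X), C1 * C2)          # both ~0.3717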
8. Channel capacity
   (a) What is the capacity of the following channel?

   [Figure: a channel with input X ∈ {0, 1, 2, 3} and output
    Y ∈ {0, 1, 2, 3, 4}. Transitions: 0 → 0 with probability 1;
    1 → 1 and 1 → 2 each with probability 1/2; 2 → 3 and 2 → 4 each
    with probability 1/2; 3 → 4 with probability 1.]

   (b) Provide a simple scheme that can transmit at rate R = log2(3)
       bits through this channel.
Solution for Channel capacity
(a) We can use the solution of the previous question: the channel is
    the union of three sub-channels with disjoint output alphabets, so

        C = log2( 2^C1 + 2^C2 + 2^C3 ).

    Now we need to calculate the capacity of each sub-channel:

        C1 = max_{p(x)} I(X; Y) = H(Y) − H(Y|X) = 0 − 0 = 0
        C2 = max_{p(x)} I(X; Y) = H(Y) − H(Y|X) = 1 − 1 = 0
        C3 = max_{p(x)} I(X; Y) = max_{p(x)} { H(Y) − H(Y|X) }
           = max_{p(x)} { −(p2/2) log2(p2/2)
                          − (p2/2 + p3) log2(p2/2 + p3) − p2 },

    where p2 = Pr(X = 2) and p3 = Pr(X = 3). Substituting p3 = 1 − p2
    gives p2/2 + p3 = 1 − p2/2, so the expression simplifies to
    I(X; Y) = H(p2/2) − p2, the same function as for the Z-channel of
    Problem 3. Differentiating with respect to p2,

        dI(X; Y)/dp2 = (1/2) log2( (1 − p2/2) / (p2/2) ) − 1 = 0,

    and as a result p2 = 2/5:

        C3 = H(1/5) − 2/5 ≈ 0.322.

    And, finally:

        C = log2( 2^0 + 2^0 + 2^0.322 ) ≈ 1.7.
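A short numeric confirmation of the final value:

    import math

    H = lambda p: -p * math.log2(p) - (1 - p) * math.log2(1 - p)
    C3 = H(0.2) - 0.4                          # ~0.3219
    print(math.log2(2**0 + 2**0 + 2**C3))      # ~1.7004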
(b) Here is a simple code that achieves this rate.
    Encoding: write the message in ternary and send each digit using
    inputs 0, 1, 2 but not 3 (or inputs 0, 1, 3 but not 2); the three
    inputs used then have disjoint output sets.
    Decoding: map each channel output back to its ternary digit.
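A sketch of this zero-error scheme in Python, using inputs 0, 1, 3
(output sets {0}, {1, 2}, {4} are disjoint, assuming the transition
structure read off the figure in part (a)):

    import random

    INPUTS = [0, 1, 3]                      # one channel input per digit
    CHANNEL = {0: [0], 1: [1, 2], 3: [4]}   # possible outputs per input
    DECODE = {0: 0, 1: 1, 2: 1, 4: 2}       # output -> ternary digit

    def send(digits):
        return [random.choice(CHANNEL[INPUTS[d]]) for d in digits]

    def receive(outputs):
        return [DECODE[y] for y in outputs]

    msg = [0, 2, 1, 1, 2, 0]                # ternary digits
    assert receive(send(msg)) == msg        # error-free at rate log2(3)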
9. Source-channel coding problem
   The next question deals with the separation of source and channel
   coding. Please read the lecture note on this topic before answering
   the question; it is on the website as an appendix to Lecture 6.
   Consider the source-channel coding problem given in Fig. 1, where
   V, X, Y, V̂ have a binary alphabet. The source V is i.i.d.
   Bernoulli(p), and the channel is given in Fig. 2.
   (a) What is the capacity of the channel given in Fig. 2? [8 points]
   (b) Assume that error-free bits can be transmitted through the
       channel. What is the minimum rate at which the source V can be
       encoded such that the source decoder can reconstruct the source
       V losslessly? [4 points]

   [Figure 1: A source-channel coding problem.
    V → Source Enc. → Channel Enc. → X → Channel → Y → Channel Dec. →
    Source Dec. → V̂]

   [Figure 2: The channel. A binary channel with
    p(y = 0 | x = 0) = p(y = 1 | x = 0) = 0.5, p(y = 0 | x = 1) = 0.1,
    and p(y = 1 | x = 1) = 0.9.]

   (c) For what values of p can the source V be reconstructed
       losslessly using the scheme in Fig. 1 (you may use the inverse
       of H, i.e., H^{-1}(q))? [4 points]
   (d) Would the answer to (c) change if joint source-channel coding
       and decoding were allowed? [4 points]
Solution to Source-channel coding:
(a) We first write the mutual information: I(X; Y) = H(Y) − H(Y|X).
    Note that H(Y|X) = pH(0.5) + (1 − p)H(0.1), where p = p(x = 0).
    As for H(Y), note that Y is distributed Bernoulli(0.9 − 0.4p).
    Hence,

        I(X; Y) = −(0.9 − 0.4p) log2(0.9 − 0.4p)
                  − (0.1 + 0.4p) log2(0.1 + 0.4p)
                  − pH(0.5) − (1 − p)H(0.1).

    Solving ∂I(X; Y)/∂p = 0 leaves us with

        0.4 log2( (0.9 − 0.4p) / (0.1 + 0.4p) ) = 1 − H(0.1),

    or p = 0.4623. Thus, C = 0.1476.
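A quick numeric verification of the maximizing input and the capacity:

    import numpy as np

    def H(q):
        q = np.clip(q, 1e-12, 1 - 1e-12)
        return -q * np.log2(q) - (1 - q) * np.log2(1 - q)

    p = np.linspace(0, 1, 1_000_001)               # p = Pr(X = 0)
    I = H(0.9 - 0.4 * p) - (p * H(0.5) + (1 - p) * H(0.1))
    k = np.argmax(I)
    print(p[k], I[k])                              # ~0.4623, ~0.1476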
(b) The minimum rate, of course, is R = H(p), since V ∼ Bernoulli(p).
(c) We require that R ≤ C. Recall that H^{-1}(C) has two values, given
    by a and 1 − a. Thus, the answer is p ≤ a or p ≥ 1 − a.
(d) No, it would not change. This is due to the fact that for DMCs it
    does not matter whether you do joint source-channel coding and
    decoding (the source-channel separation theorem).