2 Notes1

EE 387, John Gill, Stanford University Notes #1, 9/20/10, Handout #2
EE 387 course information

• Instructor: John Gill, Packard 266
• Textbook: Algebraic Codes for Data Transmission by Richard Blahut
• Weekly homework, including occasional programming problems
• Midterm exam, in class, Wednesday, November 3, 12:30–2:05pm
• Final exam, in class, Friday, December 10, 8:30–11:30am
• Class WWW page: http://www.stanford.edu/class/ee387
• CCNet: http://ccnet.stanford.edu/ee387 (used for recording grades)
EE 387 Notes #1, Page 1
Textbook coverage
Chapters 1 through 8 and 12 of Algebraic Codes for Data Transmission:
1. Introduction to error control codes
2. Introduction to algebra: groups, rings, vector spaces, Galois (finite) fields
3. Linear block codes: matrices, syndromes, weight distributions
4. Galois field structure and arithmetic
5. Cyclic codes: generator polynomials, shift registers
6. BCH and Reed-Solomon codes: definitions and properties
7. BCH and Reed-Solomon codes: decoders for errors and erasures
8. Implementation of Galois field arithmetic and shift register decoders
12. Product codes; coding gain

Communication systems
The familiar communication system block diagram is shown below.
Source Channel
Source Encrypt Modulator
Encoder Encoder
Channel Noise
Source Channel
Sink Decrypt Demodulator
Decoder Decoder
EE 387 covers only part of one subcomponent of communication systems:

• Design and analysis of block codes for error protection
• Algebraic rather than probabilistic techniques for decoding block codes
Trellis, turbo, and LDPC codes are not covered in EE 387; see EE 388.
Other communication systems courses

• Source coding
Data compression of digital data, quantization of analog data.
◦ Lossless coding: Huffman codes, arithmetic coding, Lempel-Ziv codes
◦ Lossy coding: for example, transform coding and motion estimation are used
in compressing images and video
Relevant courses: EE 392J, EE 398A, Music 422.
• Encryption
Private key vs. public key.
Encryption works best with nonredundant plaintext, so data compression
improves security
The output of a good encryption unit appears random; channel errors cause
error propagation, so good channel coding is very important
Relevant courses: CS 255, CS 355.

Other communication systems courses (2)
• Modulation
Modulator generates waveforms suitable for transmission over a physical
channel, e.g., deep space, telephone lines, magnetic media, optical fibers, optical
media such as CD-ROMs and DVDs.
◦ AM: x(t) → (1 + ax(t)) sin 2πfct: EE 279
◦ CD-ROMs, disk drives: 8-to-14 and RLL(1,7) codes satisfy run-length
constraints (“symbolic dynamics”): EE 392P (not offered)
◦ Modems: FSK, PSK, QAM, GMSK, and even fancier signal constellations:
EE 379
Advances in modulation schemes have resulted in significant improvements in
communication rates and storage capacity.
However, channel errors that the demodulator does not correct can result in
error propagation and must be corrected by the channel decoder.
Relevant results from EE 376A, Information Theory

The Source and Channel Coding Theorems imply the following:
• The communication problem can be broken down into the separate components
shown in the above diagram, without loss of reliability or efficiency — in
principle.
• Reliable communication can be achieved at any rate below the capacity of the
communication channel — in the limit for large amounts of data.
In EE 387 we will study how to achieve reliable communication with reasonable
block sizes and feasible decoding complexity.
We add controlled redundancy to data transmitted over the channel.
This redundancy lowers the raw data rate, but reduces the error rate when the
redundancy is used to correct errors.
The net effect is to increase the rate at which reliable data is delivered.

Error control classification
The error control problem can be classified in several ways.
• Type of errors: how much clustering — random, burst, catastrophic
• Type of modulator output: digital (“hard”) vs. analog (“soft”)
• Type of error control coding: detection vs. correction
• Type of codes: block vs. convolutional
• Type of decoder: algebraic vs. probabilistic

The first two classifications are used to select a coding scheme according to the
last three classifications.
Types of errors: noise characteristics

Consider only the channel portion of the communication diagram.
Noise
n1 , n2 , . . .
m1 , m2 , . . . Channel x1 , x2 , . . . y1 , y2 , . . . Channel m̂1 , m̂2 , . . .

Encoder Channel Decoder
The raw error rate is defined to be the fraction of incorrect senseword symbols.
n
1X
lim P(xi 6= yi)
n→∞ n
i=1
This definition assumes that the demodulator output is hard data.

Important special case: i.i.d. channel has raw error rate P(xi 6= yi).
We usually consider noise to be additive. Noise is the difference between received
and transmitted symbols:
ni = yi − xi ⇒ yi = xi + ni .

Noise clustering
Channel noise can be classified by the dependence between noise events.
1. Random: independent noise symbols, perhaps i.i.d. or Bernoulli. Each noise
event affects isolated symbols.
2. Burst: a noise event that causes a contiguous sequence of unreliable symbols.
Some causes of burst noise:
• Noise event is larger in duration or physical size than one symbol
• Error propagation may occur because of demodulator design. Example: when
a (17,16) code is used, a single raw error may affect 16 data bits.
3. Catastrophic: channel becomes unusable for a period of time comparable to or
longer than a data packet. Example: ethernet collisions.
Retransmission, perhaps at a much later time, is needed because packets are
hopelessly corrupted.
Types of error protection

• Error detection
Goal: avoid accepting faulty data.
Lost data may be unfortunate; wrong data may be disastrous.
Solution: checksums are included in messages (packets, frames, sectors).
With high probability the checksums are not valid when any part of the message
is altered.
• (Forward) error correction (FEC or ECC).
Use redundancy in encoded message to estimate from the received data
(senseword) what message was actually sent.
The best estimate is usually the “closest” message. The optimal estimate is the
message that is most probable given what is received. (MAP)
Error correction is more complex than error detection. “Proof”: ECC can be used
for detection as follows: reject a message if any correction is needed.

Types of error protecting codes
• Block codes
Data is blocked into k-vectors of information digits, then encoded into n-digit
codewords (n ≥ k) by adding p = n − k redundant check digits.
data data
Encoder Encoder
(general) (systematic)
codeword data checks
There is no memory betwen blocks. The encoding of each data block is

independent of past and future blocks.
An encoding in which the information digits appear unchanged in the codewords
is called systematic.
Types of error protecting codes (2)

• Convolutional codes
Time-invariant encoding scheme. Each n-bit codeword block depends on the
current information digits and on the past m information blocks
The parameter m is called the memory order. The constraint length is
(m + 1)n; this is the number of bits that the decoder must consider.
mi mi−1
c1i = mi
c2i = mi + mi−1
For this rate 1/2 convolutional code, m = 1 and n = 2.

Exercise: determine the error correction ability and a decoding method for the
above convolutional code.

Error control applications
• Planetary probes (Gaussian noise; ground-based decoders).
• Memory subsystems (9-bit SIMMs for detection, 72-bit DIMMs for correction).
• Computer buses (high speed, short blocklength). In many systems, the

possibility of errors is ignored.
• Modems. V.32/V.90 use trellis codes. V.42 uses error detection with
retransmission.
• Datacomm networks (Ethernet, FDDI, WAN, WiFi, Bluetooth) (usually error

detection only).
• Magnetic disks and tapes (detection for soft errors, correction for burst errors).
• CDs, DATs, minidisks, DVDs. Digital sound needs ECC!
• Satellite TV.
Coding schemes
The above examples exhibit a range of data rates, block sizes, error rates.
• No single error protection scheme works for all applications.
• Some applications use multiple coding techniques.
A common combination uses an inner convolutional code and an outer
Reed-Solomon code.
• Choosing a good coding scheme may be difficult because error characteristics
are not known.
Common solution: fall back on methods that correct multiple classes of errors.
Disadvantage: may not be “optimal” for any particular environment.

Alphabets
Definition: An alphabet is a discrete (usually finite) set of symbols.
Examples:
• B = {0, 1} is the binary alphabet
• T = {−1, 0, +1} is the ternary alphabet1
• X = {00, 01, . . . , FF} is the alphabet of 8-bit symbols (used in codes for
compact discs, DVDs, and most hard disk drives)
A channel alphabet symbol may be
• an indivisible transmission unit, e.g., one point from a signal constellation, or
• a sequence of modulation symbols encoded into a coding alphabet symbol
The alphabets encountered in EE 387 usually have 2m symbols.
1
The ternary alphabet is used by alternate mark inversion modulation; successive ones in data are represented by
alternating ±1.
Block codes: definition

The channel alphabet is the set of output symbols of the channel encoder — same
as set of input symbols to channel (modulator).
The senseword alphabet is the set of output symbols from channel demodulator,
i.e., input to channel decoder.
The senseword alphabet may be larger than the channel alphabet, e.g., when the
received symbols represent soft information.
Examples:
• Binary erasure channel has input alphabet {0, 1} and output alphabet {0, ?, 1}.
• Some disk drive read channel circuits quantize input signal to 6 bits; the channel
alphabet has 64 symbols.
Definition: A block code of blocklength n over an alphabet X is a nonempty set
of n-tuples of symbols from X .
C = {(c11, . . . , c1n), . . . , (cM 1, . . . , cM n)}
The n-tuples of the code are called codewords.
Block codes: rate
Suppose the channel alphabet has Q symbols.
The rate of a block code with blocklengh n with M codewords is defined to be
1
R= log Q M .
n
Typically, codewords of length n are generated by encoding messages of
k information (data, message) symbols using an invertible encoding function.
In this case,
• number of codewords is M = Qk .
1 1 k
• rate of code is R = logQ Qk = k = .
n n n
Such a code with blocklength n and rate k/n is called an (n, k) code.
The rate is a dimensionless fraction (bits/bits). It is the fraction of transmitted
symbols that carry information.
Systematic encoder
The error protection ability of a block code depends only on the set of codewords,
not on the mapping from source messages to codewords.
But obviusly an encoder is needed for practical applications.
message m Encoder codeword c

k symbols n symbols
An encoder is called systematic if it copies the k message symbols unchanged in

consecutive locations in the codeword. Codewords are of the form
c = [ m|p ] or c = [ p|m ]
where m is the vector of k message symbols and p is the vector of
n − k redundant or check symbols.
Almost all codes use systematic encoders. Exception: Reed-Muller codes.

Block codes: very simple examples
• C = {00010110} = {SYN}
Blocklength n = 8, M = 1, rate R = 18 log2 1 = 0.
Codes with rate 0 are called useless.
This code could be used for error rate analysis or byte synchronization.
• C = {00, 01, 10, 11}
Blocklength n = 2, M = 4, rate R = 12 log2 4 = 1.
This code has no redundancy, so it can neither correct nor detect errors.
• C = {001, 010, 100}
Blocklength n = 3, M = 3, rate R = 31 log2 3 = 0.528.
This code might be used over a channel that drops bits (1 → 0 may occur but
not 0 → 1), since any dropped 1 can be detected.
C ′ = {011, 101, 110} is a better code for this channel. Why?
Block codes: more interesting examples

• Parity SIMMs have rate 8/9 and blocklength 9 or 36. They can detect one error
per 8-bit byte.
• ECC DIMMs have blocklength 72 and rate 8/9. They can correct one error and
detect two errors in 72 bits.
• Ethernet packet sizes range from 64 to 1518 bytes (12144 bits).
Checksum is only 32 bits ⇒ very high rate code for large packets.
• The number of binary 5-tuples of weight 2 or 3 (nearly DC balanced) is

5 5
+ = 10 + 10 = 20 > 16 = 24 .
2 3
The 4B5B TAXI code for FDDI uses 16 of these 5-tuples to convey 4 bits of
data information:
{1E 09 14 15 0A 0B 0E 0F 12 13 16 17 1A 1B 1C 1D}
(A few other 5-tuples are used for control purposes.)

Hamming distance
The Hamming distance dH between n-tuples is the number of components in
which the n-tuples differ:
n
(
X 1 if xi 6= yi
dH (x, y) = dH (xi, yi) , where dH (xi, yi) =
i=1
0 if xi = yi
Hamming distance satisfies the axioms for a metric or distance measure:
• d(x, y) ≥ 0 with equality if and only if x = y (nonnegativity)
• d(x, y) = d(y, x) (symmetry)
• d(x, y) ≤ d(x, z) + d(z, y) (triangle inequality)
Hamming distance is a coarse or pessimistic measure of difference.
Other useful distances that occur in error control coding:
• Lee distance, distance on a circle, is applicable to phase shift coding.
• Euclidean distance is used with sensewords in Rn.
Minimum distance
The minimum (Hamming) distance d∗ of a block code is the distance between any
two closest codewords:
d∗ = min { dH (c1, c2) : c1, c2 are codewords and c1 6= c2 }
Some obvious properties of minimum distance of a code of blocklength n:
• d∗ ≥ 1 since Hamming distance between distinct codewords is a positive integer.
• d∗ ≤ n if code has two or more codewords.
• d∗ = n + 1 or d∗ = ∞ for the useless code with only one codeword.
(This is a convention, not a theorem.)
• d∗(C1) ≥ d∗(C2) if C1 ⊆ C2 — smaller codes have larger (or equal) minimum
distance.
The minimum distance of a code determines both its error-detecting ability and
error-correcting ability.

Error-detecting ability
Suppose that a block code is used for error detection only.
• If the received n-tuple is not a codeword, a detectable error has occurred
• If the received n-tuple is a codeword but not the transmitted codeword, an error
has occured that cannot be detected.
Let c be the transmitted codeword and let r the senseword — the received n-tuple.
If d(c, r) < d∗, then the senseword cannot be an incorrect codeword. Otherwise c
and r would be two codewords whose distance is less than the minimum distance.
Conversely, let c1, c2 be two closest codewords. If c1 is transmitted but c2 is
received, then an error of weight d∗ has occurred that cannot be detected.
Theorem: The guaranteed error-detecting ability is e = d∗ − 1.
The error-detecting ability is a worst case measure of the code.
Codes designed for error detection can detect the vast majority of errors when d∗
or more symbols are incorrect.
Error-correcting ability
We may think of a block code as a set of M vectors in an n-dimensional space.
The geometry of the space is defined by Hamming distance — quite different from
Euclidean geometry. Nonetheless, geometric intuition can be useful.
d*
decoding
spheres
decoding
regiion
The optimal decoding procedure is usually nearest-neighbor decoding: the

senseword r is decoded to the nearest codeword ĉ:
ĉ = argmin{dH (c, r) : c is a codeword}
Decoding regions in R2 are Voronoi regions defined by perpendicular bisectors of
lines connecting codewords. Hamming space is much more complicated.

Error-correcting ability (2)
Theorem: Using nearest-neighbor decoding, errors of weight t can be corrected
if and only if 2t < d∗. (For Hamming distance, equivalently, 2t + 1 ≤ d∗).
Proof: The “spheres” of radius t surrounding the codewords do not overlap.
Otherwise, there would be two codewords ≤ 2t distant. Therefore when ≤ t errors
occur, the decoder can unambiguously decide which codeword was sent.
d*
decoding
spheres
decoding
regiion
The maximal decoding region is usually larger than the decoding sphere.
Most decoders correct only when senseword r belongs to a decoding sphere of
radius t. They are called bounded-distance decoders.
Decoding outcomes
A particular block code can have more than one decoder, depending on
• the type of errors expected or observed
• the error rate
• the computational power available at the decoder
Suppose that codeword c is transmitted, senseword r is received, and ĉ is the
decoder’s output. The table below classifies the outcomes.
ĉ = c decoder success Successful correction (including no errors)
ĉ = ? decoder failure Uncorrectable error detected, no decision (not too bad)
ĉ 6= c decoder error Miscorrection (very bad)
Important: the decoder cannot distinguish the outcome ĉ = c from ĉ 6= c.
However, it can assign probabilities to the possibilities; more bit errors corrected
suggests a higher probability that the estimate is wrong.

Decoding outcomes (2)
The codeword transmitted and noise encountered are random variables, so the
decoder outcomes are probabilistic events, which are named in the following table.
ĉ = ? Pued Error detected but not corrected (decoder failure)
ĉ 6= c Pmc Miscorrection (decoder error)
ĉ 6= c Pue Undetectable error (error detection only)
Definition: A complete decoder is a decoder that decodes every senseword to
some codeword; i.e., the decoder never fails (to make a hard decisions).
For a complete decoder, Pued = 0.
Definition: A bounded-distance decoder corrects all errors of weight ≤ t but no
errors of weight > t. More than t errors results in failure or error.
For a fixed code, if we reduce t then decoder failure becomes more common while
decoder error becomes less likely.
The subscript “ued” can be read as “uncorrectable error detected,” whereas “ue” is “undetected error.”
Example: simple repetition code

Transmit a single information symbol n times.
For the binary alphabet, there are two codewords, i.e., C = {00 · · · 00, 11 · · · 11}.
Rate: 1/n
Minimum distance: d∗ = n
Error detection: up to n − 1 symbol errors
P{undetected error} = ǫn
Error correction: up to ⌊(n − 1)/2⌋ symbol errors
P{miscorrection} = P{more than n/2 errors} (for binary alphabet)
For binary alphabet, any two complementary codewords could be used.
Example: C ′ = {01010101, 10101010} = {55, AA}.
C ′ would be a better code if errors result in all 0s or all 1s being received.

Example: simple parity-check codes
Append one check bit to data bits so that all codewords have the same overall
parity — either even or odd.
Even-parity codewords are defined by a single parity-check equation:
c1 ⊕ c2 ⊕ · · · ⊕ cn = (c1 + c2 + · · · + cn) mod 2 = 0 ,
where ⊕ denotes the exclusive-or operation.
If we XOR cn to both sides of the above equation, we obtain an encoding equation:
cn = c1 ⊕ c2 ⊕ · · · ⊕ cn−1 .
This shows how to compute the check bit cn from the data bits c1, . . . , cn−1.
P
Any bit ci can be considered to be the check bit because it can be computed from the other n − 1 bits: ci = j6=i cj .
Simple parity-check codes: error protecting ability

Any single bit error (or any odd number of errors) can be detected.
Rate: (n − 1)/n = 1 − 1/n
Error detection: 1 bit and any odd number of errors
If bit errors are i.i.d. with probability ǫ, the pmf for the number of errors is binomial:
n
P{1 error} = ǫ(1 − ǫ)n−1 ≈ nǫ
1
n
P{2 errors} = ǫ2(1 − ǫ)n−2 ≈
2
1 2 2
n ǫ
2
n 1
P{k errors} = ǫk (1 − ǫ)n−k ≈ nk ǫk
k k!
(The approximations are valid for large n and small ǫ ≪ 1/n.)
For ǫ = 1/2, the undetected error probability is 21 − 2−n. This is special case of
general rule that undetected error probability with p check bits is ≈ 2−p.
X n k
Exercise: P{odd number of errors} = ǫ (1 − ǫ)n−k = 12 − 21 (1 − 2ǫ)n .
k odd k
Example: nonbinary simple check codes
Suppose we have defined an addition operation on a nonbinary alphabet, modulo 5
addition for alphabet {0, 1, 2, 3, 4}.
A block code can be defined by the single check equation or by the corresponding
encoding equation:
c1 + c2 + · · · + cn = 0 ⇔ cn = − c1 − c2 − · · · − cn−1 .
This code can detect all single symbol errors and most error patterns with two or
more symbol errors.
Some addition operations are better than others. For 8-bit symbols:
• 8-bit parallel exclusive: two bit errors in the same bit position in two different
bytes cannot be detected. Very bad.
• For 8-bit unsigned addition, two errors in the same bit positions are detected
when they change the value of the carry into the next bit position. Bad.
• Ones-complement arithmetic is a better choice for the addition operator (used
in the Fletcher’s OSI checksum). Better.
Simple product codes

Arrange data bits in a two-dimensional array. Append parity check bits at the end
of each row and column.
k1 info bits row checks
n = (k1 + 1)(k2 + 1)
k2 info bits
k = k1 k2
column checks n − k = k1 + k2 + 1
A single error causes failure of one row equation and one column equation.
The incorrect bit is located at the intersection of the bad row and bad column.
Double errors can be detected — two rows or two columns (or both) have the
wrong parity — but cannot be corrected.
Some triple errors cause miscorrection. Which?
Almost all multiple errors can be detected — product code has lots of redundancy.
Exercise: what fraction of errors is not detected?

Hamming codes
Simple product codes are simple but inefficient:
• a failed parity-check equation locates row or column of error
• however, a satisfied equation gives little information
An “efficient” equation gives one bit of information about the error location.
It “looks” at half the codeword bits and is “independent” of other equations.
The following table defines a (7, 4) Hamming parity-check code (BSTJ 1950).
c1 c2 c3 c4 c5 c6 c7

1 0 1 0 1 0 1 
0 1 1 0 0 1 1 3 parity-check equations

0 0 0 1 1 1 1
The 1’s indicate which codeword bits affect which parity-check equations.
Hamming codes: parity-check equations and matrix

The following three equations are satisfied by all (and only) valid codewords:
c1 ⊕ c3 ⊕ c5 ⊕ c7 = 0
c2 ⊕ c3 ⊕ c6 ⊕ c7 = 0
c4 ⊕ c5 ⊕ c6 ⊕ c7 = 0
The check equations can be described by a parity-check matrix:
 
1 0 1 0 1 0 1
H = 0 1 1 0 0 1 1
0 0 0 1 1 1 1
Codewords are characterized analytically by the following equations:
 
c1
H ..  = 0 3×1 ⇔ [ c1 . . . c7 ] H T = 0 1×3

c7
In other words, a 7-tuple c is a codeword if and only if cH T = 0 .

Hamming codes: encoding equations
Each of the codeword bits c1, c2, c4 appears in only one equation.
Therefore c1, c2, c4 can be computed from the other bits, c3, c5, c6, c7.
c1 = c3 ⊕ c5 ⊕ c7
c2 = c3 ⊕ c6 ⊕ c7
c4 = c5 ⊕ c6 ⊕ c7
These linear encoder equations can be written as a vector-matrix product.
 
1 1 0
1 0 1
[ c1 c2 c4 ] = [ c3 c5 c6 c7 ] P = [ c3 c5 c6 c7 ] 
0 1
.
1
1 1 1
We could choose other sets of check bits, such as {c2, c3, c4}.
Not all sets work. In particular, c1, c2, c3 cannot be determined from c4, c5, c6, c7
since the leftmost 3 columns of H form a singular (not invertible) matrix.
Hamming codes: error detection and correction

Each codeword bit affects ≥ 1 equation ⇒ every single-bit error can be detected.
Each bit is checked by a unique set of equations ⇒ error location can be
determined by which parity-check equations fail.
Definition: the syndrome s = [ s0 s1 s2 ] of a received vector r = [ r1 r2 . . . r7 ] is
the binary vector that tells which parity-check equations are not satisfied.
s0 = r1 ⊕ r3 ⊕ r5 ⊕ r7
s1 = r2 ⊕ r3 ⊕ r6 ⊕ r7 ⇔ [ s0 s1 s2 ] = [ r1 . . . r7 ] H T
s2 = r4 ⊕ r5 ⊕ r6 ⊕ r7
When s = 0, the decoder assumes that no error has occurred. This is the most
likely conclusion under reasonable assumptions.
Each nonzero value of s corresponds to an error in a different one of 23 − 1 = 7 bit
positions. If a single error has occurred, the syndrome identifies its location.
For this particular parity-check matrix H , the syndrome s = [ s0 s1 s2 ] is the binary representation of the assumed
error location (most significant bit is s2 ).

Hamming codes: minimum distance
Hamming codes can correct single errors. Therefore d∗ ≥ 2t + 1 = 2 · 1 + 1 = 3 .
Thus Hamming codes can detect double errors (when used for error detection only).
Fact: minimum distance is exactly 3. Therefore Hamming codes can either correct
single errors or detect double errors (but not both simultaneously).
A Hamming code with m ≥ 3 parity-check bits has 2m − 1 nonzero syndromes,
hence blocklength n = 2m − 1. The rate quickly approaches 1 for large n.
Some Hamming codes:
m n k rate
2 3 1 0.3333
3 7 4 0.5714
4 15 11 0.7333
5 31 26 0.8387
6 63 57 0.9047
8 255 247 0.9686
15 32767 32752 0.9995
32 4294967295 4294967263 1.0000
Extended (expanded, expurgated) Hamming codes

Two easy ways to “extend” a Hamming code:
• Add an overall parity-check bit: c0 = c1 ⊕ · · · ⊕ c7 ⇔ c0 ⊕ · · · ⊕ c7 = 0.
0 1 0 1 0 1 0 1
0 0 1 1 0 0 1 1
H1 =
0 0 0 0 1 1 1 1
1 1 1 1 1 1 1 1
This expanded code has blocklength 8 but the same number of codewords.
Code parameters: (8, 4, 4), rate 1/2.
• Add an overall parity-check equation: c1 ⊕ c2 ⊕ · · · ⊕ c6 ⊕ c7 = 0 .
1 0 1 0 1 0 1
0 1 1 0 0 1 1
H2 =
0 0 0 1 1 1 1
1 1 1 1 1 1 1
This expurgated code consists of the Hamming codewords with even parity.
Code parameters: (7, 3, 4), rate 3/7.
Extended Hamming codes: minimum distance
Both expanded and expurgated Hamming codes are constructed by adding
redundancy to code with minimum distance 3.
• The minimum distance of extended codes is no smaller, hence ≥ 3.
• All codewords have even parity, so the distance between codewords is even.
Therefore the minimum distance is an even number and so is ≥ 4.
• Hamming codes contain codewords of weight 3.
• The additional parity-check bit increases distance by at most 1.
Therefore the minimum distance of extended Hamming codes is d∗ = 4.
These codes can correct single errors and simultaneously detect double errors.
A double error is indicated by even overall parity but nonzero symdrome,
corresponding to failure of one or more of the original m Hamming equations.
General product codes

Let C1 be an (n1, k1) block code and let C2 be an (n2, k2) block code.
The product code C1 ⊗ C2 is an (n1n2, k1k2) code.
k1 n1 − k1
k2
n2 − k2
Encoder (systematic) for product code:

• First arrange k1k2 information symbols in a k2 × k1 array.
• Then encode first k2 rows using code C1.
• Finally encode all n1 columns using code C2.
Fact: the minimum distance of C1 ⊗ C2 is d∗ = d∗1 · d∗2 .
By definition, every column is a codeword of C2. But if C1 and C2 are linear codes,
then all rows are codewords of C1.
This definition assumes systematic encoders for C1 and C2 .

General product code example
Consider the product of two (8, 4, 4) expanded Hamming codes.
4 4
4 data
Product code parameters: (n, k, d∗) = (64, 16, 16). Rate: 1/4
Error correcting ability: t = ⌊(16 − 1)/2)⌋ = 7
Product codes can be decoded up to the guaranteed error correcting ability.
The decoding procedure requires a column decoder that can correct both errors
and erasures. (Blahut chapter 12.)
We will find more efficient codes; e.g., the (64,25,16) expanded BCH code needs
only 39 check bits for same minimum distance.
Nonbinary single error correcting code

The single check equation
c1 + c2 + · · · + cn = 0
allows detection of a single symbol error in a received n-tuple.
Furthermore, the syndrome s defined by
s = r1 + r2 + · · · + rn
indicates the magnitude of the error. If the error is in location i and the incorrect
symbol is ri = ci + ei, then
s = r1 + r2 + · · · + rn = c1 + · · · + (ci + ei) + · · · + cn = ei .
The syndrome tells exactly what should be subtracted from the incorrect symbol in
order to obtain a codeword.
What is not known is where the error is — which symbol is wrong.

More equations needed
A second equation is needed to identify the error location. The effect of an error
magnitude on the syndrome should be different for each location.
A reasonable choice for this second equation:
1 · c1 + 2 · c2 + · · · + n · cn = 0 .
Now every valid codeword satisfies two equations:
1 · c1 + 1 · c2 + · · · + 1 · cn = 0
1 · c1 + 2 · c2 + · · · + n · cn = 0
We can derive encoding equations to express c1, c2 in terms of c3, . . . , cn.
Example: Let symbols be 4-bit values with addition modulo 16. For n = 15,

1 1 1 ··· 1
H=
1 2 3 · · · 15
is parity-check matrix for a code that can correct single symbols errors. Almost.
Decoding procedure
Suppose there is a single error of magnitude ei 6= 0 in location i. The syndrome
s = [ s0 s1 ] can be expressed in terms of the unknowns i and ei:
n
X n
X
s0 = rj = ei + cj = ei
j=1 j=1
Xn Xn
s1 = jrj = iei + jcj = iei
j=1 j=1
We can determine ei and i from the syndrome equations:

ei = s0
iei s1
i = =
ei s0
Sadly, division is not always defined for modulo 16 arithmetic.
Example: Suppose s0 = 4, s1 = 8. Then s1 = is0 mod 16 has four solutions:
2, 6, 10, 14 .
We cannot be certain where the single error is located.
Finite fields
This problem with division is solved by using a “better” multiplication.
We will define GF(16), the field of 16 elements.
In GF(16), multiplication has an inverse operation of division, and most of the
other familiar properties of arithmetic are valid.
Another approach is modulo 17 arithmetic with channel alphabet {0, 1, . . . , 16}.
The “parity-check” matrix for a 1EC code over GF(17) is

1 1 1 ··· 1
H= .
1 2 3 · · · 16
The error pattern and location can be computed using the above equations:
error pattern: ei = s0
s1
error location: i =
s0
Using either GF(16) or modulo 17 arithmetic, these equations can be solved
whenever s0 6= 0.
Reed-Solomon codes
The previous codes over GF(16) and GF(17) are examples of Reed-Solomon codes.
Reed-Solomon codes use symbols from a finite field GF(Q) and have n = Q − 1.
Each row of H consists of consecutive powers of elements of GF(Q).
When the elements are chosen carefully, each additional check equation increases
the minimum distance by 1.
For example, the following parity-check matrix corresponds to 4 equations:
   
1 1 1 1 ··· 1 1 1 1 1 ··· 1
1 2 3 4 ··· 16  1 2 3
  4 · · · 16
H= 1 4 9 = 
16 · · · 256  1 4 9 16 · · · 1 
1 8 27 64 · · · 4096 1 8 10 13 · · · 16
This PC matrix defines a code over GF(17) with minimum distance 5. It can
correct two symbol errors in a codeword of length 16.
Decoding procedures for Reed-Solomon codes are the chief goal of this course.

Types of channel errors (review)
The raw error rate of a communication channel is the average number of symbol
errors per symbol sent.
The raw error rate may vary over time, so we average over segments of the data.
How do we measure error rate?
• Send known data and count incorrect symbols.
• Keep track of number of errors corrected and uncorrectable sensewords.
Another noise parameter is clumpiness (clustering, statistical dependence).
• Catastrophic: channel becomes unusable for a long time — many data packets.
Retransmission needed.
• Burst: channel becomes very noisy for short time, resulting in contiguous
sequence of bad symbols.
• Random: independent noise symbols, i.i.d. or slowly time-varying statistics.
Each noise event affects isolated symbols.
Three channels with bit error rate 10−3

• Catastrophic. A communication channel that is unusable for 9 hours each year
(solar storm?) but otherwise noiseless has average bit error rate 10−3 .
A small amount of redundancy allows detection of bad packets. Overall
redundancy approaches the 1% needed for retransmitted packets.
• Burst. Suppose errors occur in clusters of 10–100 bits. EE 387 presents codes
that effectively handle burst errors.
These codes are often used at a second level in conjunction with lower level
codes that deal with uniformly distributed errors.
• Random. P{yi 6= xi} = ǫ, a constant independent of time i and data xi. The
probability that any given received bit is wrong equals the average bit error rate.
For a given bit error rate, random errors are more difficult than burst errors,
which are more difficult than catastrophic errors.
“Proof”: reordering a large block of data to separate bits that were originally
close together transforms burst errors into isolated errors.

Communication channel
Recall the portion of the communication diagram that is covered in EE 387:
Channel Channel
Modulator Channel Demodulator
Encoder Decoder
Noise
The modulator converts the digital output of the channel encoder into waveforms
suitable for transmission over the physical channel, while the demodulator
estimates digital values from the received waveforms.
The channel coding scheme should take into account the modulation method.
However, once the channel code has been chosen, we can include the
modulator/demodulator in the channel block. We get a simplified channel model.
Channel Digital Channel

Noise
This simplified channel has digital inputs and outputs.

Digital channel model

The mathematical model of a discrete-time communication channel is probabilistic.
Channel Digital Channel
Encoder x1 , x2 , . . . , xn Channel y1 , y2 , . . . , yn Decoder
Noise
A discrete channel is completely characterized by a collection of conditional

probabilities.
P(outputs y1, y2, . . . , yn | inputs x1 , x2, . . . , xn)
We assume that the channel is causal: yj is independent of xi for i > j.

A channel is discrete if its input and output alphabets X and Y are discrete.
A channel is memoryless if the output at time i depends only on the input at time i:
P(y1, y2, . . . , yn | x1, x2, . . . , xn) = P(y1 | x1)P(y2 | x2 ) · · · P(yn | xn) .
The usual definition of a discrete memoryless channel (DMC) requires that the
transition probabilities P(yi | xi) do not depend on time i.

Binary symmetric channel
The simplest, most popular, and occasionally appropriate channel model2 is the
binary symmetric channel (BSC).

1−ǫ ǫ
P(y | x) =
ǫ 1−ǫ
(
1−ǫ y =x
=
ǫ y 6= x
The transition probabilities are shown in the following transition diagram.
1−ǫ
0 0
ǫ
ǫ
1 1
1−ǫ
2
Robert Gallager, a famous researcher in information theory, stated in a talk at Stanford that he had never met a
binary symmetric channel. He meant that most channels have correlated noise.
Capacity of binary symmetric channel

The capacity of the BSC is easy to compute but hard to achieve.
C(ǫ) = 1 − H(ǫ)
where H(ǫ) = −ǫ log2 ǫ − (1 − ǫ) log2(1 − ǫ) is the binary entropy function.
H(ǫ) is a measure of the amount of information or uncertainty in a binary decision
whose a priori probabilities are ǫ and 1 − ǫ.
Below are a plot of capacity and entropy and a table of capacity vs. bit error rate.
1
0.9
entropy
capacity ber H(ber) C(ber)
−1
0.8 10 0.46899559358928 0.53100440641072
0.7 10−2 0.08079313589591 0.91920686410409
0.6 10−3 0.01140775773746 0.98859224226254
0.5 10−4 0.00147303352833 0.99852696647167
0.4 10−5 0.00018052328302 0.99981947671698
0.3
10−6 0.00002137426289 0.99997862573711
0.2
10−7 0.00000246961916 0.99999753038084
0.1
10−8 0.00000028018120 0.99999971981880
0
10−9 0.00000003134005 0.99999996865995
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

Random bit errors
When we refer to random bit errors, we are assuming a binary channel with i.i.d.
errors — in other words, a binary symmetric channel.
A binary symmetric channel has a single noise parameter, ǫ:
bit error rate = ǫ = P{ri 6= ci} = crossover probability
This channel model assumes that errors are not data dependent.
The number of errors in an n-bit senseword has a binomial distribution:

n i
P{i errors} = ǫ (1 − ǫ)n−i , i = 0, . . . , n
i
A block code with minimum distance d∗ ≥ 2t + 1 can correct up to t errors.
If more than t errors occur, the decoder has two possible outcomes: it may
• recognize that the senseword is more than distance t from any codeword, or
• miscorrect to the (unique) codeword ĉ within distance t of r.
When miscorrection occurs, ĉ is a worse estimate than r.
Error probabilities for repetition codes

Consider the (n, 1) repetition code with odd blocklength n = 2t + 1.
There is only one message bit m. The decoder estimates m̂ by majority vote.
A decoder error occurs when senseword has more than n/2 bit errors.
The miscorrection probability Pmc is the probability that codeword is miscorrected.
Pmc = P{decoder error} = P{> t bit errors} = P{≥ t + 1 bit errors}
n
(nǫ)t+1

X n i n−i n t+1 n−t−1
= ǫ (1 − ǫ) ≈ ǫ (1 − ǫ) ≈
i=t+1
i t + 1 (t + 1)!
The approximations apply when ǫ ≪ 1/n and n ≫ t .

Since there is only one message bit, cooked error rate = decoder error rate:
P{m̂ 6= m} = P{ĉ 6= c} = Pmc
It can be shown (using Chernoff’s inequality or Stirling’s approximation) that
if ǫ < 1/2 then Pmc → 0 exponentially with n.

Error probabilities for Hamming codes
The (7, 4) Hamming code has 24 = 16 codewords.
Each codeword has a decoding region containing 1 + 7 = 8 sensewords.
Decoding regions do not overlap; so they contain 16 · 8 = 128 = 27 vectors,that is,
all possible n-tuples.
Every senseword r is decoded to some codeword by changing at most one bit.
noise
message m codeword c senseword r estimate ĉ
4 3 4 3 4 3 4
For two errors, dH (c, r) = 2 and decoder miscorrection causes dH (c, ĉ) = 3.
The bit error rate measured between c and ĉ is P{2 errors} · 3/7.
It can be shown that for the systematic encoder, the mistakes in ĉ are distributed
uniformly over the 7 codeword bits.
Matlab confirms: P{3 errors} = 4.5 × 10−12 ≪ 4.6 × 10−8 = P{2 errors}.
So for homework #1 problem #5 we can ignore ≥ 3 errors.
More general channel models

We often do not have a good statistical model of the noise source.
Sometimes we assume the worst case — i.i.d. errors.
An error-prevention code may have to protect against multiple error sources.
random noise + burst noise
noise n1 , n2 , . . .
m1 , m2 , . . . Channel x1 , x2 , . . . y1 , y2 , . . . Channel m̂1 , m̂2 , . . .

A classic burst error model is the Gilbert-Elliot model.

1−ǫ1 ǫ1 1−ǫ2
ǫ1 = 10−6
good bad ǫ2 = 10−1
ǫ2
In the simplest version of this model, the bit error rate is 0 in the good state and
1/2 in the bad state, resulting in a burst of errors.
Error model example
Suppose a combination of random (1-bit) and burst (45-bit) errors.
P{random error} = 10−6 , P{burst error} = 4 × 10−7
1
The raw bit error rate is 1 · 10−6 + 2 · 45 · 4 × 10−7 = 10−5.
Suppose 1000-bit packets and we require bit error rate after correction 10−12.
To achieve this with a random error correcting code, we consider
(1000 · 10−5)5 10−10
P{5 errors} ≈ = ≈ 0.83 × 10−12
5! 120
Correcting 5 random errors would be sufficient if only random errors occurred.
But packets with 45-bit bursts cannot be corrected. Burst errors contribute about
9 × 10−6 ≫ 10−12 to the final error rate.
Furthermore, correcting 5 random errors is overkill.
(1000 · 10−6)4 10−12
P{4 errors | no burst} ≈ = ≈ 4.2 × 10−14
4! 24
We need a method for correcting a mixture of random and burst errors.
Random and burst errors

Methods for correcting a mixture of random and burst errors:
• Interleave codewords so that a burst does smaller damage to multiple codewords.
A 45-bit burst affects at most 5 bits in each of the 10 codewords that make up
a 10000-bit supercodeword. We could use a 5-error correcting code. This code
requires 50 check bits or each 1000-bit subcodeword, 500 bits total.
10 1000-bit codewords or 1000 10-bit symbols
• Use a code that corrects symbols errors, where symbols contain multiple bits.
A Reed-Solomon code with 10-bit channel symbols has blocklength 210 − 1
symbols or 10230 bits. With 500 check bits, we can correct any pattern of
errors confined to 25 symbols. In particular, a 45-bit burst affects at most 6
symbols, so multiple bursts–up to 4–can be corrected.

Rules of thumb
1. Given p bits of redundancy, we can achieve an undetected error probability
of 2−p or smaller.
For example, CRC-16 is a 16-bit checksum that provides an undetected error
rate of 2−16 = 1.6 × 10−5.
2. For messages of length n bits, t log2 n check bits are approximately necessary
and sufficient to correct up to t bit errors.
For example, BCH codes with blocklength 255 can correct one error with
8 check bits, two errors with 16 check bits, and so on.
3. We can trade correction capability for error detection capability, at a rate of one
bit of error correction for two bits of error detection.
For example, a code that can correct 2 errors can detect 4 errors or correct
1 error while detecting 2 errors.

2 Notes1

Diunggah oleh

Informasi Dokumen

Deskripsi Asli:

Judul Asli

Hak Cipta

Format Tersedia

Bagikan dokumen Ini

Bagikan atau Tanam Dokumen

Opsi Berbagi

Apakah menurut Anda dokumen ini bermanfaat?

Apakah konten ini tidak pantas?

Hak Cipta:

Format Tersedia

2 Notes1

Diunggah oleh

Hak Cipta:

Format Tersedia

EE 387, John Gill, Stanford University Notes #1, 9/20/10, Handout #2

EE 387 course information

EE 387 Notes #1, Page 1

EE 387 Notes #1, Page 2

EE 387 covers only part of one subcomponent of communication systems:

EE 387 Notes #1, Page 3

Other communication systems courses

EE 387 Notes #1, Page 4

EE 387 Notes #1, Page 5

Relevant results from EE 376A, Information Theory

EE 387 Notes #1, Page 6

• Type of modulator output: digital (“hard”) vs. analog (“soft”)

• Type of error control coding: detection vs. correction

• Type of codes: block vs. convolutional

• Type of decoder: algebraic vs. probabilistic

EE 387 Notes #1, Page 7

Types of errors: noise characteristics

m1 , m2 , . . . Channel x1 , x2 , . . . y1 , y2 , . . . Channel m̂1 , m̂2 , . . .

This definition assumes that the demodulator output is hard data.

EE 387 Notes #1, Page 8

EE 387 Notes #1, Page 9

Types of error protection

EE 387 Notes #1, Page 10

codeword data checks

There is no memory betwen blocks. The encoding of each data block is

EE 387 Notes #1, Page 11

Types of error protecting codes (2)

For this rate 1/2 convolutional code, m = 1 and n = 2.

EE 387 Notes #1, Page 12

• Computer buses (high speed, short blocklength). In many systems, the

• Datacomm networks (Ethernet, FDDI, WAN, WiFi, Bluetooth) (usually error

• CDs, DATs, minidisks, DVDs. Digital sound needs ECC!

EE 387 Notes #1, Page 13

EE 387 Notes #1, Page 14

Block codes: definition

EE 387 Notes #1, Page 17

message m Encoder codeword c

An encoder is called systematic if it copies the k message symbols unchanged in

EE 387 Notes #1, Page 18

EE 387 Notes #1, Page 19

Block codes: more interesting examples

EE 387 Notes #1, Page 20

EE 387 Notes #1, Page 21

EE 387 Notes #1, Page 22

EE 387 Notes #1, Page 23

The optimal decoding procedure is usually nearest-neighbor decoding: the

EE 387 Notes #1, Page 24

EE 387 Notes #1, Page 25

EE 387 Notes #1, Page 26

EE 387 Notes #1, Page 27

Example: simple repetition code

EE 387 Notes #1, Page 28

EE 387 Notes #1, Page 29

Simple parity-check codes: error protecting ability

EE 387 Notes #1, Page 31

Simple product codes

EE 387 Notes #1, Page 32

EE 387 Notes #1, Page 33

Hamming codes: parity-check equations and matrix

EE 387 Notes #1, Page 34

EE 387 Notes #1, Page 35

Hamming codes: error detection and correction