Error Correction and LDPC Decoding

Error Correction in Communication Systems
Error: given the original (transmitted) frame and the received frame, how many corresponding bits differ?
This count is the Hamming distance (Hamming, 1950).
Example:
Transmitted frame: 1110011
Received frame: 1011001
Number of errors: 3
[Figure: Transmitter -> channel (with noise) -> Receiver. Binary information is sent as a frame, arrives as a corrupted frame, and leaves the receiver as corrected information.]
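The error count in the example can be checked with a few lines of Python. This is a minimal sketch; the function name `hamming_distance` is chosen here for illustration and is not from the slides.

```python
def hamming_distance(sent: str, received: str) -> int:
    """Count the bit positions where two equal-length frames differ."""
    if len(sent) != len(received):
        raise ValueError("frames must have the same length")
    return sum(a != b for a, b in zip(sent, received))

# Frames from the example above
print(hamming_distance("1110011", "1011001"))  # prints 3
```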
Error Detection and Correction
Add extra information to the original data being
transmitted.
Frame = k data bits + m bits for error control: n = k + m.
Error detection: enough info to detect error.
Need retransmissions.
Error correction: enough info to detect and
correct error.
Forward error correction (FEC).
[Figure: Binary information -> Encoder (adding redundancy) -> encoded information -> Channel (noise) -> corrupted information with noise -> Decoder (error detection and correction) -> corrected information.]
[Figure: timeline of error-correcting codes, 1960 to 2000. Labels: Hamming codes, Reed-Solomon codes, BCH codes, convolutional codes, LDPC introduced, Turbo codes, renewed interest in LDPC, practical implementation of codes, LDPC beats Turbo and convolutional codes.]
Modulation
The information format is changed for transmission.
Binary Phase Shift-Keying (BPSK) modulation: bit X is sent as the symbol 1-2X (0 -> +1, 1 -> -1).
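The 1-2X mapping named on this slide can be sketched as follows. The function name `bpsk_modulate` is an illustrative choice; the codeword is the one used in the worked example later in the deck.

```python
def bpsk_modulate(bits):
    """Map each bit x in {0, 1} to the symbol 1 - 2x, so 0 -> +1 and 1 -> -1."""
    return [1 - 2 * x for x in bits]

codeword = [1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1]
print(bpsk_modulate(codeword))
# prints [-1, 1, -1, 1, -1, 1, -1, 1, -1, -1, -1, -1]
```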
Key terms
Encoder : adds redundant bits to the sender's bit stream to
create a codeword.
Decoder: uses the redundant bits to detect and/or correct as
many bit errors as the particular error-control code will allow.
Communication Channel: the part of the communication
system that introduces errors.
Ex: radio, twisted wire pair, coaxial cable, fiber optic cable, magnetic
tape, optical discs, or any other noisy medium
Additive white Gaussian noise (AWGN)
Larger noise makes the distribution of received values wider
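To see the widening effect, the sketch below adds zero-mean Gaussian noise to a repeated BPSK symbol and prints the spread of the received values. The helper `awgn_channel`, the sample count, and the two sigma values are assumptions made for illustration only.

```python
import random
import statistics

def awgn_channel(symbols, sigma, seed=None):
    """Add zero-mean Gaussian noise with standard deviation sigma to each symbol."""
    rng = random.Random(seed)
    return [s + rng.gauss(0.0, sigma) for s in symbols]

sent = [1.0] * 2000  # the same BPSK symbol (+1) sent many times
for sigma in (0.5, 1.0):
    received = awgn_channel(sent, sigma, seed=0)
    # The measured spread grows with sigma: a wider distribution around +1
    print(sigma, round(statistics.pstdev(received), 2))
```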
Important metrics
Bit error rate (BER): The probability of bit error.
We want to keep this number small
Ex: BER = 10^-4 means that for every 10,000 bits transmitted there is, on average, 1 bit error.
BER is a useful indicator of system performance independent of error
channel
BER=Number of error bits/ total number of transmitted bits
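The ratio above can be computed directly from a sent and a received bit sequence. A minimal sketch, with `bit_error_rate` as an illustrative name:

```python
def bit_error_rate(sent, received):
    """BER = number of error bits / total number of transmitted bits."""
    if len(sent) != len(received) or not sent:
        raise ValueError("sequences must be non-empty and of equal length")
    errors = sum(a != b for a, b in zip(sent, received))
    return errors / len(sent)

sent = [0] * 10_000
received = list(sent)
received[1234] = 1  # a single bit error among 10,000 bits
print(bit_error_rate(sent, received))  # prints 0.0001
```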
Signal to noise ratio (SNR): quantifies how much a signal has
been corrupted by noise.
defined as the ratio of signal power to the noise power corrupting the
signal. A ratio higher than 1:1 indicates more signal than noise
often expressed using the logarithmic decibel scale:
$$\mathrm{SNR_{dB}} = 10 \log_{10}\left(\frac{P_{signal}}{P_{noise}}\right)$$
Important number: 3 dB means signal power is two times noise power
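The standard decibel conversion, 10·log10 of the power ratio, shows why 3 dB corresponds to a factor of about two. A short sketch with illustrative function names:

```python
import math

def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels: 10 * log10(P_signal / P_noise)."""
    return 10.0 * math.log10(signal_power / noise_power)

def db_to_ratio(db):
    """Convert a decibel value back to a linear power ratio."""
    return 10.0 ** (db / 10.0)

print(round(snr_db(2.0, 1.0), 2))   # prints 3.01
print(round(db_to_ratio(3.0), 3))   # prints 1.995
```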
Goal: Attain lower BER at smaller SNR
Error correction is a key component
in communication and storage
applications.
Coding example: Convolutional,
Turbo, and Reed-Solomon codes
What can 3 dB of coding gain buy?
A satellite can send data with half the
required transmit power
A cellphone can operate reliably with
half the required receive power
[Figure: bit error probability (10^0 down to 10^-4) versus signal-to-noise ratio (0 to 8 dB) for an uncoded system and a convolutional code, with a 3 dB gap marked between the curves. Figure courtesy of B. Nikolic, 2003 (modified).]
[Figure: Information (k-bit) -> Encoder (add parity) -> Codeword (n-bit) -> channel (noise) -> Receivedword (n-bit) -> Decoder (check parity, detect error) -> Corrected information (k-bit).]
LDPC Codes and Their Applications
Low Density Parity Check (LDPC) codes have superior
error performance
4 dB coding gain over convolutional codes

Standards and applications
10 Gigabit Ethernet (10GBASE-T)
Digital Video Broadcasting
(DVB-S2, DVB-T2, DVB-C2)
Next-Gen Wired Home
Networking (G.hn)
WiMAX (802.16e)
WiFi (802.11n)
Hard disks
Deep-space satellite missions

[Figure: bit error probability (10^0 down to 10^-4) versus signal-to-noise ratio (0 to 8 dB) for uncoded and convolutional-coded systems, with a 4 dB gap marked. Figure courtesy of B. Nikolic, 2003 (modified).]
Future Wireless Devices Requirements
Increased throughput
1Gbps for next generation of WiMAX
(802.16m) and LTE (Advanced LTE)
2.2 Gbps WirelessHD UWB [Ref]
Power budget likely
Current smart phones require 100 GOPS
within 1 Watt [Ref]
Required reconfigurability for
different environments
A rate-compatible LDPC code is proposed
for 802.16m [Ref]
Required reconfigurability for
different communication standards
Ex: LTE/WiMax dual-mode cellphones
require Turbo codes (used in LTE) and
LDPC codes (used in WiMAX)
Requires hardware sharing for silicon area
saving

Future Digital TV Broadcasting Requirements
High definition television for
stationary and mobile users
DTMB/DMB-TH (terrestrial/mobile),
ABS-S (satellite), CMMB
(multimedia/mobile)
Current Digital TV (DTV)
standards are not well-suited for
mobile devices
Require more sophisticated signal
processing and correction algorithms
Require Low power
Require Low error floor
Remove multi-level coding
Recently proposed ABS-S LDPC codes (15,360-bit code length, 11 code rates) achieve FER < 10^-7 without concatenation [Ref]

Future Storage Devices Requirements
Ultra high-density storage
2 Terabit per square inch [ref]
Worsening InterSymbol Interference (ISI) [ref]
High throughput
Larger than 5 Gbps [ref]
Low error floor
Lower than 10^-15 [ref]
Remove multi-level coding
Next generation of Hitachi IDRC read-channel technology [9]

Encoding Picture Example
$H \cdot V_i^T = 0$
1 0 1 1 1 0 1 1 0 0 0 0 1 1 1 1 0 0 0 1 1 1 1 1
0 1 0 0 1 0 0 0 1 0 1 1 0 1 1 0 0 1 0 0 0 1 1 0
0 1 0 1 0 0 0 1 1 1 1 1 1 0 0 1 0 0 1 1 1 0 0 1
0 1 0 0 0 1 1 0 0 1 0 0 0 1 0 0 1 1 1 0 0 1 0 0
...
V =
1 0 0 0 0 0 0 0 0 0 . . .1 1 0 0 0 1 1 1 1 1 0 0 0 0 0 1 0
0 1 0 0 0 0 0 0 0 0 . . .1 1 1 0 0 0 1 1 1 1 1 0 0 0 0 0 1
0 0 1 0 0 0 0 0 0 0 . . .0 0 1 1 0 1 1 0 0 0 1 1 0 0 0 1 0
0 0 0 1 0 0 0 0 0 0 . . .0 0 0 1 1 0 1 1 0 0 0 1 1 0 0 0 1
...
H=
Parity Image
V=

Binary multiplication called syndrome check
Decoding Picture Example
Iterative message passing decoding
[Figure: a picture sent from the transmitter over a noisy channel (Ethernet cable, wireless, or hard disk) to the receiver, shown as received and after decoding iterations 1, 5, 15, and 16.]
LDPC Codes: Parity Check Matrix
Defined by a large binary matrix, called a parity check matrix or H matrix
Each row is defined by a parity equation
The number of columns is the code length

Example: 6 x 12 H matrix for a 12-bit LDPC code
No. of columns = 12 (i.e. Receivedword (V) = 12 bits)
No. of rows = 6
No. of ones per row = 4 (row weight)
No. of ones per column = 2 (column weight)

      V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12
H =    1  0  0  0  0  1  0  1  0   0   0   1
       0  1  0  1  0  0  0  0  1   0   1   0
       0  0  1  0  1  0  1  0  0   0   1   0
       0  0  1  1  0  0  0  1  0   1   0   0
       1  0  0  0  1  0  0  0  1   0   0   1
       0  1  0  0  0  1  1  0  0   1   0   0

Each row is one check node (C1 to C6) and each column is one variable node (V1 to V12). In the worked examples, C1 is the check node whose row has ones in columns V3, V4, V8, and V10.
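The row and column weights of the example matrix can be checked by counting ones. The sketch below assumes the reading of H that the later worked example uses (the one for which the example codeword satisfies every parity check), with rows in printed order.

```python
# 6 x 12 parity check matrix, as read from the slides (an assumption of this sketch)
H = [
    [1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1],
    [0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0],
    [1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1],
    [0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0],
]

row_weights = [sum(row) for row in H]        # ones per parity equation
col_weights = [sum(col) for col in zip(*H)]  # ones per code bit

print(row_weights)  # prints [4, 4, 4, 4, 4, 4]
print(col_weights)  # prints [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
```

With 12 columns of weight 2 there are 24 ones in total, so each of the 6 rows must hold 4 of them.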
LDPC Codes: Tanner Graph
Interconnect representation of H matrix
Two sets of nodes: Check nodes and Variable nodes
Each row of the matrix is represented by a Check node
Each column of matrix is represented by a Variable node
A message passing method is used between nodes to correct errors
(1) Initialization with Receivedword
(2) Messages passing until correct
Example:
V3 to C1, V4 to C1,
V8 to C1, V10 to C1
C2 to V1, C5 to V1
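The Tanner graph is just the adjacency structure of H: every 1 at row i, column j is an edge between a check node and variable node Vj. A minimal sketch, assuming the same reading of H as in the worked example; the check numbering below follows printed row order and may not match the C labels drawn on the slide.

```python
H = [
    [1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1],
    [0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0],
    [1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1],
    [0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0],
]

# For each check node (row): the 1-based variable nodes it is connected to
check_to_vars = [[j + 1 for j, h in enumerate(row) if h] for row in H]

# For each variable node (column): the 1-based rows (check nodes) it is connected to
var_to_checks = [[i + 1 for i, row in enumerate(H) if row[j]]
                 for j in range(len(H[0]))]

for i, neighbours in enumerate(check_to_vars, start=1):
    print(f"row {i}: variables {neighbours}")
```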
[Figure: Tanner graph with check nodes C1 to C6 in the top row and variable nodes V1 to V12 in the bottom row; the variable nodes are loaded with the Receivedword from the channel, and each edge corresponds to a 1 in H.]
Transmission scenario
Message Passing: Variable node processing
$\lambda_j$ is the original received information from the channel
$\alpha_{ij}$: message from check node $i$ to variable node $j$
$\beta_{ij}$: message from variable node $j$ to check node $i$
$$\beta_{ij} = \lambda_j + \sum_{i' :\, h_{i'j} = 1,\ i' \neq i} \alpha_{i'j}$$
$$Z_j = \beta_{ij} + \alpha_{ij}$$
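The variable node update can be sketched for a single node as below: Z is the channel value plus all incoming check messages, and each outgoing message leaves out the message that came from its destination check. The function name and the dictionary layout are illustrative choices; the numbers are V1's channel value (-9.1) and the two incoming messages (-1.6 and -1.4) from the worked example, and which of the two checks sent which value is an assumption here.

```python
def variable_node_update(lam_j, alphas):
    """Update one variable node.

    lam_j  : channel value for this variable node
    alphas : dict {check index: incoming message alpha_ij}
    Returns (betas, z): outgoing messages per check, and the total Z_j.
    """
    z = lam_j + sum(alphas.values())
    # beta_ij excludes the message received from check i itself
    betas = {i: z - a for i, a in alphas.items()}
    return betas, z

betas, z = variable_node_update(-9.1, {2: -1.6, 5: -1.4})
print(round(z, 1))                                   # prints -12.1
print({i: round(b, 1) for i, b in betas.items()})    # prints {2: -10.5, 5: -10.7}
```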
Message Passing: Check node processing (MinSum)
$$\alpha_{ij} = S_{factor} \times \underbrace{\prod_{j' :\, h_{ij'} = 1,\ j' \neq j} \operatorname{sign}(\beta_{ij'})}_{\text{Sign}} \times \underbrace{\min_{j' :\, h_{ij'} = 1,\ j' \neq j} |\beta_{ij'}|}_{\text{Magnitude}}$$
After check node processing, the next iteration starts with another variable node processing (begins a new iteration).
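A direct, unoptimized sketch of the MinSum check node update for one check node follows. For every connected variable it multiplies the signs and takes the minimum magnitude over all the other inputs. The function name and dictionary layout are illustrative; the inputs are the channel values of V3, V4, V8, and V10 from the worked example, with Sfactor assumed to be 1.

```python
def check_node_update(betas, sfactor=1.0):
    """MinSum update for one check node.

    betas : dict {variable index: incoming message beta_ij}
    Returns dict {variable index: outgoing message alpha_ij}.
    """
    alphas = {}
    for j in betas:
        others = [b for k, b in betas.items() if k != j]
        sign = 1
        for b in others:
            if b < 0:
                sign = -sign
        alphas[j] = sfactor * sign * min(abs(b) for b in others)
    return alphas

print(check_node_update({3: -3.2, 4: 3.6, 8: 1.6, 10: -2.5}))
# prints {3: -1.6, 4: 1.6, 8: 2.5, 10: -1.6}
```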
Code Estimation
Based on your modulation scheme (here BPSK), estimate the transmitted bits from Z:
$$\hat{V}_j = \begin{cases} 1, & Z_j < 0 \\ 0, & Z_j \ge 0 \end{cases}$$
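For BPSK with bit 1 sent as -1, the estimate is a sign test on Z. A minimal sketch using the received values from the worked example (in the first iteration Z equals the channel values); mapping Z = 0 to bit 0 is an arbitrary tie-break assumed here.

```python
def hard_decision(z):
    """Estimate each bit from the sign of Z: negative -> 1, otherwise 0."""
    return [1 if value < 0 else 0 for value in z]

z = [-9.1, 4.9, -3.2, 3.6, -1.4, 3.1, 0.3, 1.6, -6.1, -2.5, -7.8, -6.8]
print(hard_decision(z))
# prints [1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1]
```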
Syndrome Check
Compute syndrome
Ex: $H \cdot \hat{V}_i^T = 0$ (binary multiplication)
If syndrome = 0, terminate decoding
Else, continue another iteration
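The syndrome check is a matrix-vector product over GF(2): each parity equation XORs the estimated bits it covers. The sketch below assumes the reading of the 6 x 12 example H used by the worked example, and applies it to the first-iteration estimate and to the transmitted codeword.

```python
H = [
    [1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1],
    [0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0],
    [1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1],
    [0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0],
]

def syndrome(H, v_hat):
    """Return H . v_hat^T over GF(2): one parity bit per row of H."""
    return [sum(h * v for h, v in zip(row, v_hat)) % 2 for row in H]

estimate = [1, 0, 1, 0, 1, 0, 0, 0, 1, 1, 1, 1]  # bit 7 is wrong
codeword = [1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1]

print(sum(syndrome(H, estimate)))  # prints 2 -> keep decoding
print(sum(syndrome(H, codeword)))  # prints 0 -> terminate
```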
Example
Encoded information V = [1 0 1 0 1 0 1 0 1 1 1 1]
BPSK modulated = [-1 1 -1 1 -1 1 -1 1 -1 -1 -1 -1]
$\lambda$ (received data from channel) = [-9.1 4.9 -3.2 3.6 -1.4 3.1 0.3 1.6 -6.1 -2.5 -7.8 -6.8]
Estimated code $\hat{V}$ = [1 0 1 0 1 0 0 0 1 1 1 1]
[Figure: Information (k-bit) -> Encoder -> Codeword V (n-bit) -> BPSK modulation -> channel -> Receivedword $\lambda$ (n-bit) -> Decoder (iterative MinSum) -> Corrected information (n-bit).]
Ex: Variable node processing (iteration 1)
Initialization: all $\alpha_{ij} = 0$
$\lambda$ = [-9.1 4.9 -3.2 3.6 -1.4 3.1 0.3 1.6 -6.1 -2.5 -7.8 -6.8]
With $\alpha = 0$, every variable node forwards its channel value: $\beta_{ij} = \lambda_j$. For example, V1 sends -9.1 to both C2 and C5.
$Z = \lambda$
Ex: Check node processing (Iteration 1)
$\beta_{ij} = \lambda_j$, with $\lambda$ = [-9.1 4.9 -3.2 3.6 -1.4 3.1 0.3 1.6 -6.1 -2.5 -7.8 -6.8]
Here assume Sfactor = 1
Messages from C1 (connected to V3, V4, V8, V10):
$|\alpha_{1,3}| = \min\{3.6, 1.6, 2.5\} = 1.6$, $\operatorname{Sign}(\alpha_{1,3}) = (+1)(+1)(-1) = -1$, so $\alpha_{1,3} = -1.6$
$|\alpha_{1,4}| = \min\{3.2, 1.6, 2.5\} = 1.6$, $\operatorname{Sign}(\alpha_{1,4}) = (-1)(+1)(-1) = +1$, so $\alpha_{1,4} = +1.6$
$|\alpha_{1,8}| = \min\{3.2, 3.6, 2.5\} = 2.5$, $\operatorname{Sign}(\alpha_{1,8}) = (-1)(+1)(-1) = +1$, so $\alpha_{1,8} = +2.5$
$|\alpha_{1,10}| = \min\{3.2, 3.6, 1.6\} = 1.6$, $\operatorname{Sign}(\alpha_{1,10}) = (-1)(+1)(+1) = -1$, so $\alpha_{1,10} = -1.6$
Ex: Code Estimation (Iteration 1)
$Z = \lambda$ = [-9.1 4.9 -3.2 3.6 -1.4 3.1 0.3 1.6 -6.1 -2.5 -7.8 -6.8]
$\hat{V}$ = [1 0 1 0 1 0 0 0 1 1 1 1]
Ex: Syndrome Check (iteration 1)
Compute syndrome $H \cdot \hat{V}^T$:
$$\mathrm{Syndrome}_i = \operatorname{XOR}\{\hat{v}_j : h_{ij} = 1\}, \qquad \mathrm{SumSyndrome} = \sum_i \mathrm{Syndrome}_i$$
$\hat{V}$ = [1 0 1 0 1 0 0 0 1 1 1 1]
The two parity checks that include V7 fail.
Sumsyndrome = 2, not zero => error, continue decoding
Second iteration
In variable node processing, compute $\beta$ and Z based on the algorithm, using $\lambda$ = [-9.1 4.9 -3.2 3.6 -1.4 3.1 0.3 1.6 -6.1 -2.5 -7.8 -6.8] and the check node messages from iteration 1.
For example, V1 receives -1.4 and -1.6, so $Z_1 = -9.1 - 1.4 - 1.6 = -12.1$.
Z = [-12.1 7.1 -4.5 7.7 -7.2 4.4 -4.2 7.2 -10.0 -7.7 -8.9 -8.1]
$\hat{V}$ = [1 0 1 0 1 0 1 0 1 1 1 1]
Ex: Syndrome Check (iteration 2)
Compute syndrome $H \cdot \hat{V}^T$:
$$\mathrm{Syndrome}_i = \operatorname{XOR}\{\hat{v}_j : h_{ij} = 1\}, \qquad \mathrm{SumSyndrome} = \sum_i \mathrm{Syndrome}_i$$
$\hat{V}$ = [1 0 1 0 1 0 1 0 1 1 1 1]
Every parity check is satisfied.
Sumsyndrome = 0 => corrected code, terminate decoding
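The whole worked example can be put together as one small decoder loop: variable node processing, code estimation, syndrome check, then check node processing. This is a sketch under stated assumptions, not the implementation behind the slides: it uses the reading of H that the example codeword satisfies, Sfactor = 1, plain Python lists, and a hypothetical function name `min_sum_decode`. It is intended to illustrate the flow and the stopping rule rather than to match every intermediate Z value quoted above digit for digit.

```python
H = [
    [1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1],
    [0, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 1, 0],
    [0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0],
    [1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1],
    [0, 1, 0, 0, 0, 1, 1, 0, 0, 1, 0, 0],
]

def min_sum_decode(H, lam, max_iters=20, sfactor=1.0):
    """Iterative MinSum decoding; returns (estimated bits, iterations used)."""
    m, n = len(H), len(lam)
    edges = [(i, j) for i in range(m) for j in range(n) if H[i][j]]
    alpha = {e: 0.0 for e in edges}          # init: all check messages = 0
    v_hat = [1 if x < 0 else 0 for x in lam]
    for iteration in range(1, max_iters + 1):
        # Variable node processing: Z_j and beta_ij = Z_j - alpha_ij
        z = [lam[j] + sum(alpha[(i, j)] for i in range(m) if H[i][j])
             for j in range(n)]
        beta = {(i, j): z[j] - alpha[(i, j)] for (i, j) in edges}
        # Code estimation (BPSK: negative -> bit 1)
        v_hat = [1 if zj < 0 else 0 for zj in z]
        # Syndrome check: stop as soon as every parity equation holds
        if all(sum(H[i][j] * v_hat[j] for j in range(n)) % 2 == 0
               for i in range(m)):
            return v_hat, iteration
        # Check node processing (MinSum) over the other inputs of each check
        for (i, j) in edges:
            others = [beta[(i, k)] for k in range(n) if H[i][k] and k != j]
            sign = -1 if sum(b < 0 for b in others) % 2 else 1
            alpha[(i, j)] = sfactor * sign * min(abs(b) for b in others)
    return v_hat, max_iters

lam = [-9.1, 4.9, -3.2, 3.6, -1.4, 3.1, 0.3, 1.6, -6.1, -2.5, -7.8, -6.8]
bits, iterations = min_sum_decode(H, lam)
print(bits, iterations)
# prints [1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1] 2
```

As in the slides, the first iteration fails the syndrome check because of bit 7, and the second iteration recovers the transmitted codeword.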
Full-Parallel Decoding
Every check node and variable
node is mapped to a processor
All processors directly connected
based on the Tanner graph
Very High throughput
No large memory storage elements
(e.g. SRAMs)
High routing congestion
Large delay, area, and power caused
by long global wires

[Figure: full-parallel decoder. Check node processors (Chk 1 to Chk 6) and variable node processors (Var 1 to Var 12) are wired according to the Tanner graph (check nodes C1 to C6, variable nodes V1 to V12). Init: all messages = 0; the variable processors are loaded from the channel.]
Full-Parallel LDPC Decoder Examples
For all data in the plot:
Same automatic place & route flow is used
CPU: Quad Core, Intel Xeon 3.0GHz
Ex 1: 1024-bit decoder, [JSSC 2002]
52.5 mm^2, 50% logic utilization, 160 nm CMOS
Ex 2: 2048-bit decoder, [ISCAS 2009]
18.2 mm^2, 25% logic utilization, 30 MHz, 65 nm CMOS
CPU time for place & route > 10 days

[Figure: CPU time for place & route (hours, 0 to 300) versus number of wire connections (10^5 to 10^7), with points labeled "512 Chk & 1024 Var Proc." and "384 Chk & 2048 Var Proc."]
Serial Decoder Example
[Figure: Tanner graph (check nodes C1 to C6, variable nodes V1 to V12) processed one node at a time, with memories (Mem) addressed by row and column.]
(1) Initialize memory (clear contents)
(2) Compute V1 and store, then V2, V3, ..., V12
(3) Now compute C1 and store, then C2, C3, ..., C6
Decoding Architectures
Partial parallel decoders
Multiple processing units
and shared memories
Throughput: 100 Mbps-Gbps
Requires Large memory
(depending on the size)
Requires Efficient Control and
scheduling

[Figure: partial-parallel decoder with several variable node processors (Var) and check node processors (Chk) sharing banks of memories (Mem), mapped from the H matrix.]
Reported LDPC Decoder ASICs
[Figure: throughput (Mbps, 10^1 to 10^5) of reported partial-parallel and full-parallel decoders versus year (2000 to 2010), with labels for 802.11a/g, 802.11n, DVB-S2, 802.16e, and 10GBASE-T.]
Throughput Across Fabrication Technologies
Existing ASIC implementations without early termination
Full-parallel decoders have the highest throughput
[Figure: throughput (Mbps, 10^1 to 10^4) versus CMOS technology (65, 90, 130, 160, 180 nm).]
Energy per Decoded Bit in Different Technologies
Existing ASIC implementations without early termination
Full-parallel decoders have the lowest energy dissipation
[Figure: energy per bit (nJ/bit, 10^-1 to 10^2) versus CMOS technology (65, 90, 130, 160, 180 nm).]
Circuit Area in Different Technologies
Full-parallel decoders have the largest area due to the high
routing congestion and low logic utilization
[Figure: area of decoder chip (mm^2, 10^0 to 10^3) versus CMOS technology (65, 90, 130, 160, 180 nm).]
Key optimization factors
Architectural optimization
Parallelism
Memory
Data path wordwidth (fixedpoint format)
[Figure: check node and variable node datapaths, clocked by clk, with Wc inputs summed at the variable node.]
Architectural optimization
BER performance versus quantization format
[Figure: BER versus SNR (dB) for different quantization formats.]
Check Node Processor
Wr/inputs
Log2(Wr/Spn) comp
stages
Split-Row Threshold
The same benefits as
Split-Row
Added two comparators
and a few logic gates

[Figure: check node processor. Sign logic combines the signs of the Wr/Spn inputs; magnitude logic uses a tree of comparators (Comp, L = log2(Wr) marked) to produce Min1, Min2, and IndexMin1.]
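The Min1, Min2, and IndexMin1 signals named in the check node processor suggest the usual MinSum shortcut: instead of recomputing a minimum for every output, find the two smallest input magnitudes once. Every output then uses Min1, except the output at IndexMin1, which uses Min2. The sketch below shows this for the plain MinSum update only (not the Split-Row Threshold variant); the function name is an illustrative choice.

```python
def check_node_min1_min2(betas, sfactor=1.0):
    """MinSum check node update using the two smallest input magnitudes.

    betas : list of incoming messages for one check node (at least two)
    Returns (min1, min2, index_min1, alphas).
    """
    mags = [abs(b) for b in betas]
    index_min1 = min(range(len(mags)), key=mags.__getitem__)
    min1 = mags[index_min1]
    min2 = min(m for k, m in enumerate(mags) if k != index_min1)

    total_sign = 1
    for b in betas:
        if b < 0:
            total_sign = -total_sign

    alphas = []
    for k, b in enumerate(betas):
        magnitude = min2 if k == index_min1 else min1
        # Dividing out this input's own sign equals multiplying by it (+1 or -1)
        sign = total_sign * (-1 if b < 0 else 1)
        alphas.append(sfactor * sign * magnitude)
    return min1, min2, index_min1, alphas

print(check_node_min1_min2([-3.2, 3.6, 1.6, -2.5]))
# prints (1.6, 2.5, 2, [-1.6, 1.6, 2.5, -1.6])
```

The outputs match the per-output minimum search for the same inputs, while needing only one pass over the magnitudes.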
Variable Node Processor
Based on the variable
update equation
The same as the
original MinSum and
SPA algorithms
Variable node
hardware complexity
is mainly reduced via
wordwidth reduction
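$$\beta_{ij} = \lambda_j + \sum_{i' :\, h_{i'j} = 1,\ i' \neq i} \alpha_{i'j}$$
[Figure: variable node datapath. Seven 5-bit inputs are converted from sign-magnitude (SM) to 2's complement, summed by adders with wider intermediate wordwidths, saturated (SAT), and converted from 2's complement back to SM.]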
Partial parallel decoder example
802.11ad LDPC code
