
Architectures and Implementations of

Low-Density Parity Check


Decoding Algorithms

Engling Yeo, Borivoje Nikolić, and Venkat Anantharam


Department of Electrical Engineering and Computer Sciences
University of California, Berkeley, CA 94720, USA

Engling Yeo University of California, Berkeley 1


Background: Iterative Codes

Year   Rate ½ Code             SNR Required for BER < 10^-5
1948   Shannon                 0 dB
1967   (255,123) BCH           5.4 dB
1977   Convolutional Code      4.5 dB
1993   Iterative Turbo Code    0.7 dB
2001   Iterative LDPC Code     0.0045 dB

[Figure: SNR vs. BER for rate 1/2 codes. BER axis from 10^0 to 10^-4, SNR axis from 0 to 6. Curves: uncoded, convolutional code with ML decoding, iterative code, capacity bound. Annotation: 4 dB.]

C. Berrou and A. Glavieux, "Near Optimum Error Correcting Coding And Decoding: Turbo-Codes," IEEE Trans. Comms., Vol. 44, No. 10, Oct 1996.

S. Chung, G.D. Forney, T.J. Richardson, and R. Urbanke, "On the design of low-density parity-check codes within 0.0045 dB of the Shannon limit," IEEE Communications Letters, vol. 5, no. 2, pp. 58-60, Feb. 2001.



Outline

• LDPC Codes
• Soft Decoding of LDPC Codes
• Parallel vs. Serial Architectures
• Platforms
• Methods for Code Construction



Low Density Parity Check Codes (LDPC)
[Figure: bipartite graph of check nodes and variable nodes]

- R. G. Gallager, IRE Trans. Info. Theory, Vol. 8 (1962), p. 21

• LDPC representation by bipartite graph
• Decoding by message computation and relay along edges
• Iteratively improved estimates of log-likelihood ratios $\ln \frac{p_n(1)}{p_n(0)}$
• Example code:
  4096 information bits
  512 parity bits
  4608 variable nodes, 512 check nodes
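As a small numerical illustration of the quantities on this slide, the Python sketch below computes a log-likelihood ratio from a bit probability and the rate of the example code. The helper name `llr` is made up for this example, and the rate calculation assumes the 4608 variable nodes are exactly the 4096 information bits plus the 512 parity bits.

```python
import math

def llr(p1):
    """Log-likelihood ratio ln(p(1)/p(0)) for a bit with P(bit = 1) = p1."""
    if not 0.0 < p1 < 1.0:
        raise ValueError("p1 must lie strictly between 0 and 1")
    return math.log(p1 / (1.0 - p1))

# Parameters of the example code on the slide.
info_bits = 4096
parity_bits = 512
variable_nodes = info_bits + parity_bits   # one variable node per code bit
code_rate = info_bits / variable_nodes     # fraction of bits carrying information

print(variable_nodes, code_rate)
print(llr(0.5), llr(0.9), llr(0.1))
```

A positive value means the bit is more likely 1, a negative value means it is more likely 0, and the magnitude expresses the confidence.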


Message from Variable n to Check m

$$Q_{nm} = \ln \frac{p_n(1)}{p_n(0)} + \left( \sum_{m' \in M(n)} R_{m'n} \right) - R_{mn}$$

[Figure: variable node n connected to check nodes 1, 2, 3, and m. Messages R_{1n}, R_{2n}, R_{3n} arrive at node n and message Q_{nm} leaves it.]

Decoder input: $\ln \frac{p_n(1)}{p_n(0)}$

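The variable-node update can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `variable_to_check` and the dictionary layout (check index mapped to incoming message) are hypothetical. It forms the full sum once and subtracts each check's own contribution, which mirrors the structure of the expression for Q_nm.

```python
def variable_to_check(prior_llr, incoming):
    """Compute Q_nm for one variable node n.

    prior_llr: decoder input ln(p_n(1)/p_n(0)) for this variable node.
    incoming:  dict mapping check index m -> R_mn received from that check.
    Returns a dict mapping check index m -> Q_nm sent back to that check.
    """
    # Sum the prior and every incoming check message once.
    total = prior_llr + sum(incoming.values())
    # Each outgoing message leaves out the message that came from its target.
    return {m: total - r for m, r in incoming.items()}

# Hypothetical message values for a variable node connected to checks 1, 2, 3.
R = {1: 0.5, 2: -1.25, 3: 2.0}
Q = variable_to_check(0.75, R)
print(Q)
```

Computing the total once costs one addition per edge plus one subtraction per outgoing message, instead of a separate partial sum for every check.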


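Message from Check m to Variable n

$$R_{mn} = \Phi^{-1}\left( \sum_{n' \in N(m)} \Phi(|Q_{n'm}|) - \Phi(|Q_{nm}|) \right) \cdot \left( \operatorname{sgn}(Q_{nm}) \prod_{n' \in N(m)} \operatorname{sgn}(Q_{n'm}) \right)$$

$$\Phi(x) = -\log\left( \tanh\left( \tfrac{1}{2} x \right) \right) = \Phi^{-1}(x), \quad x > 0$$

• Signed magnitude representation
• MSB represents parity information

[Figure: check node m connected to variable nodes 1, 2, 3, and n. Messages Q_{1m}, Q_{2m}, Q_{3m} arrive at node m and message R_{mn} leaves it.]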
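The check-node update can be sketched as follows. The names `phi` and `check_to_variable` and the dictionary layout are assumptions made for this illustration. The code evaluates the expression for R_mn as written on the slide, with magnitudes passed through Φ and signs handled separately. Sign conventions for log-likelihood ratios differ between texts, so the sign handling should be checked against the convention in use. The argument of Φ is clamped to avoid taking the logarithm of zero.

```python
import math

def phi(x):
    """Phi(x) = -log(tanh(x/2)) for x > 0; the function is its own inverse."""
    return -math.log(math.tanh(max(x, 1e-12) / 2.0))

def check_to_variable(incoming):
    """Compute R_mn for one check node m.

    incoming: dict mapping variable index n -> Q_nm received from that node.
    Returns a dict mapping variable index n -> R_mn.
    """
    total_phi = sum(phi(abs(q)) for q in incoming.values())
    total_sign = 1
    for q in incoming.values():
        if q < 0:
            total_sign = -total_sign
    outgoing = {}
    for n, q in incoming.items():
        own_sign = -1 if q < 0 else 1
        # Remove this edge's own contribution from the magnitude sum.
        magnitude = phi(total_phi - phi(abs(q)))
        # own_sign * total_sign equals the product of the other edges' signs.
        outgoing[n] = own_sign * total_sign * magnitude
    return outgoing

Q = {1: 1.0, 2: -2.0, 3: 0.5}   # hypothetical incoming messages
R = check_to_variable(Q)
print(R)
```

The magnitude path agrees with the familiar tanh product rule, since Φ turns the product of tanh terms into a sum that adders can handle.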



Hardware for Computation of Rmn

b : Wordlength of messages
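To make the role of the wordlength b concrete, here is a sketch of a b-bit signed-magnitude message format in Python. The quantization step of 0.25 and the function names are hypothetical choices for this example; the slides do not specify them. The most significant bit holds the sign, so the parity of a set of messages is the XOR of their MSBs.

```python
def to_sign_magnitude(llr, b, step=0.25):
    """Quantize an LLR to a b-bit word: 1 sign bit (MSB) + (b - 1) magnitude bits."""
    max_mag = (1 << (b - 1)) - 1
    mag = min(int(round(abs(llr) / step)), max_mag)   # saturate large values
    sign_bit = 1 if llr < 0 else 0
    return (sign_bit << (b - 1)) | mag

def from_sign_magnitude(word, b, step=0.25):
    """Recover the quantized LLR value from a b-bit signed-magnitude word."""
    mag = word & ((1 << (b - 1)) - 1)
    sign_bit = (word >> (b - 1)) & 1
    return (-1 if sign_bit else 1) * mag * step

def sign_parity(words, b):
    """XOR of the MSBs: 1 when an odd number of messages are negative."""
    parity = 0
    for w in words:
        parity ^= (w >> (b - 1)) & 1
    return parity

b = 4
words = [to_sign_magnitude(x, b) for x in (1.0, -1.0, 10.0)]
print([format(w, "04b") for w in words], sign_parity(words, b))
```

Keeping sign and magnitude in separate bit fields lets the sign product and the magnitude sum of the check-node update run on independent datapaths.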


Parallel Architecture of LDPC decoders

[Figure: check-to-variable processing elements PE_CV,1 ... PE_CV,M connected to variable-to-check processing elements PE_VC,1 ... PE_VC,N. Each PE_VC,n takes Soft Input n and produces Soft Output n.]

PE_CV: Check-to-Variable Processing Element
PE_VC: Variable-to-Check Processing Element

• Throughput Efficiency
• Power Efficiency
• Complex Interconnect

A. Blanksby and C. J. Howland, "A 220mW 1-Gbit/s 1024-Bit Rate-1/2 Low Density Parity Check Code Decoder," Proc. IEEE CICC, San Diego, CA, USA, pp. 293-6, May 2001.
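A software model can illustrate the schedule that a fully parallel decoder follows: all check-to-variable elements update, then all variable-to-check elements update. The sketch below is an assumption-laden illustration, not the cited chip's design. The (7,4) Hamming parity-check matrix, the function names, and the iteration limit are chosen for the example only. The inputs are ln(p(1)/p(0)) values, and the sign rule is used exactly as in the check-node expression. Every check in this small code has even degree; for other codes the sign convention should be verified.

```python
import math

# Illustrative parity-check matrix of a (7,4) Hamming code (not from the slides).
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

def phi(x):
    """Phi(x) = -log(tanh(x/2)); the argument is clamped to avoid log(0)."""
    return -math.log(math.tanh(max(x, 1e-12) / 2.0))

def sgn(x):
    return -1 if x < 0 else 1

def decode(H, prior, max_iters=10):
    """Flooding-schedule decoder; prior[n] = ln(p_n(1)/p_n(0))."""
    rows = [[n for n, h in enumerate(row) if h] for row in H]
    Q = {(m, n): prior[n] for m, row in enumerate(rows) for n in row}
    R = {}
    hard = [1 if p > 0 else 0 for p in prior]
    for _ in range(max_iters):
        # Step 1: every check-to-variable element computes its R_mn.
        for m, row in enumerate(rows):
            total_phi = sum(phi(abs(Q[m, n])) for n in row)
            total_sign = 1
            for n in row:
                total_sign *= sgn(Q[m, n])
            for n in row:
                magnitude = phi(total_phi - phi(abs(Q[m, n])))
                R[m, n] = sgn(Q[m, n]) * total_sign * magnitude
        # Step 2: every variable-to-check element computes its Q_nm.
        posterior = list(prior)
        for (m, n), r in R.items():
            posterior[n] += r
        for (m, n) in Q:
            Q[m, n] = posterior[n] - R[m, n]
        hard = [1 if p > 0 else 0 for p in posterior]
        # Stop once every parity check is satisfied.
        if all(sum(hard[n] for n in row) % 2 == 0 for row in rows):
            break
    return hard

# All-zero codeword with one unreliable bit (index 3 leans towards 1).
print(decode(H, [-2.0, -2.0, -2.0, 0.5, -2.0, -2.0, -2.0]))
```

In hardware each inner loop body becomes its own processing element, and the dictionaries `Q` and `R` become wires, which is where the complex interconnect comes from.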



Serial Architecture of LDPC decoders
[Figure: soft inputs, a bank of memories, a crossbar switch, a set of processing elements (PE), and soft outputs. PE: processing element for both classes of message computation.]

• High logic density
• Memory requirement grows linearly with number of edges

G. Al-Rawi, J. Cioffi, and M. Horowitz, "Optimizing the mapping of low-density parity check codes on parallel decoding architectures," Proc. IEEE ITCC, Las Vegas, NV, USA, pp. 578-86, Apr 2001.
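The linear growth of memory with the number of edges can be made concrete with a rough estimate. This is a back-of-the-envelope sketch: the function name, the assumption of two stored messages per edge (one in each direction), the column weight of 4, and the 5-bit wordlength are all hypothetical values picked for illustration.

```python
def message_memory_bits(num_edges, b, messages_per_edge=2):
    """Estimate message storage in bits for a serial decoder.

    num_edges:         number of edges in the bipartite graph.
    b:                 wordlength of each message in bits.
    messages_per_edge: stored messages per edge (assumed 2: one Q and one R).
    """
    return num_edges * messages_per_edge * b

variable_nodes = 4608   # size of the example code given earlier in the deck
column_weight = 4       # hypothetical number of checks per variable node
b = 5                   # hypothetical message wordlength

edges = variable_nodes * column_weight
print(edges, message_memory_bits(edges, b))
```

Doubling the block size at fixed column weight doubles the edge count and therefore the storage.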



Decoding with Software Approach
• General-purpose microprocessors and digital signal processors (DSPs)
• Limited number of processing elements (ALUs)
• Serial architecture
• Throughput of a few hundred kbps
• Suited to designing, simulating, and comparing LDPC codes
• Suited to low-throughput applications that need fast time to market



Decoding with Hardware Approach

• Parallel architecture
  • Power and throughput efficiency

• FPGA
  • Parallel adders and table lookups
  • PEs and routing must fit onto a single FPGA die
  • Existing implementations with serial architecture are limited to 56 Mbps throughput

[M. M. Mansour and N. R. Shanbhag, "Memory-efficient turbo decoder architectures for LDPC codes," Proc. IEEE SIPS 2002, San Diego, CA, Oct. 2002.]
[T. Zhang and Keshab Parhi, "A 56Mbps (3,6)-Regular FPGA LDPC Decoder," Proc. IEEE SIPS 2002, San Diego, CA, Oct. 2002.]



Decoding with Hardware Approach

• Custom ASIC
  • Parallel implementation demonstrated with 1 Gbps throughput

[A.J. Blanksby and C.J. Howland, "A 690-mW 1-Gb/s 1024-b, rate-1/2 low-density parity-check code decoder," IEEE Journal of Solid-State Circuits, vol. 37, no. 3, pp. 404-12, March 2002. (Proceedings of the IEEE 2001 Custom Integrated Circuits Conference, San Diego, CA, USA, 6-9 May 2001.)]

  • Routing congestion
  • Logic density is 50%
  • Design not scalable to codes with larger block sizes



Solving Routing Congestion in Hardware

[Figure: soft inputs, a crossbar switch, one bank of memories with variable-to-check PEs, one bank of memories with check-to-variable PEs, and soft outputs]

• Serial architecture with groups of parallel optimized processing elements
• Full utilization of pipelined hardware with alternating blocks
• E.g. 128x parallelism in commercial IP (Flarion™)
• Further memory reduction through staggered decoding schedule

[E. Yeo, P. Pakzad, B. Nikolic, and V. Anantharam, "High throughput low-density parity-check architectures," Proc. IEEE Globecom 2001, San Antonio, TX, pp. 3019-24, Nov 2001.]



Platform vs. Throughput Summary

[Figure: platforms placed along a throughput axis with ticks from 10^3 to 10^9]



Density Evolution
• Density Evolution
  • Very good codes (< 0.0045 dB from theoretical bound)
  • Large variable edge degree (~100)
  • Large block size (10^7)

• Cayley and Ramanujan Graphs
  • Unstructured interconnects

• Algebraic Constructions
  • Cyclic or quasi-cyclic properties
  • Use of shift registers
  • Parallel implementation has to address the sparse code / interconnect issue
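The link between quasi-cyclic structure and shift registers can be seen in a small construction. The sketch below assembles a parity-check matrix from cyclically shifted identity blocks. The block size and the shift table are hypothetical values chosen for illustration, not a code from the cited work, and the function names are made up for this example.

```python
def circulant(size, shift):
    """size x size identity matrix with each row's 1 moved right by `shift` (cyclically)."""
    return [[1 if (r + shift) % size == c else 0 for c in range(size)]
            for r in range(size)]

def quasi_cyclic_H(shifts, size):
    """Assemble H from a table of shifts, one circulant block per table entry."""
    H = []
    for block_row in shifts:
        blocks = [circulant(size, s) for s in block_row]
        for r in range(size):
            # Concatenate row r of every block in this block row.
            H.append([bit for blk in blocks for bit in blk[r]])
    return H

shifts = [[0, 1, 2],
          [0, 2, 4]]          # hypothetical shift table
H = quasi_cyclic_H(shifts, size=5)
for row in H:
    print("".join(str(bit) for bit in row))
```

Within each block, every row is the previous row rotated by one position, so a shift register can generate the connections instead of a stored table.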
Summary

• Implementations face difficulties with routing (parallel architectures) or memory requirements (serial architectures)
• Parallel architectures are optimal for power/throughput efficiency
• Different platforms (microprocessor/FPGA/ASIC) offer possibilities for various applications
• Methods for code construction need to consider implementability



END

Low Density Parity Check Codes (LDPC)

• LDPC representation by bi-partite graph.


• Non-zero entries in row m of H give the set of bits connected to check m: $N(m) = \{n : H_{m,n} = 1\}$
• Non-zero entries in column n of H give the set of checks connected to bit n: $M(n) = \{m : H_{m,n} = 1\}$

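The two neighbourhood sets can be read directly off a parity-check matrix, as the following Python sketch shows. The small matrix is a hypothetical example, and the names `N` and `M` follow common usage for the row and column sets; the set symbols are not legible in the slide text.

```python
# Hypothetical 3 x 6 parity-check matrix used only for illustration.
H = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]

def row_sets(H):
    """N(m): the bits (column indices) that take part in check m."""
    return {m: {n for n, h in enumerate(row) if h} for m, row in enumerate(H)}

def column_sets(H):
    """M(n): the checks (row indices) that bit n takes part in."""
    return {n: {m for m, row in enumerate(H) if row[n]} for n in range(len(H[0]))}

N = row_sets(H)
M = column_sets(H)
print(N)
print(M)
```

Each pair (m, n) with n in N(m) is one edge of the bipartite graph, so the total size of the row sets equals the total size of the column sets.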


Sparse Graph

Limited spatial locality between input edges


Rearrangement of nodes has limited effect in improving spatial locality

Engling Yeo University of California, Berkeley 24


Hardware Pipelining of Serial Architecture

[Figure: two pipelined datapaths backed by memories. One is built from butterfly units. The other is built from PE_CV,1 ... PE_CV,4 and PE_VC,1 ... PE_VC,4 and is marked with a pipeline stall.]

Traditional DSP Algorithms
• e.g. FFT, Digital Filters
• Throughput increases
• High spatial locality

LDPC Decoding Algorithms
• Sparse graph
• Pipeline stall delays
• Throughput hardly increased
• Limited spatial locality
