
Graduate Institute of Electronics Engineering, NTU

FFT VLSI Implementation


VLSI Signal Processing
Department of Electrical Engineering, National Taiwan University
An-Yeu Wu (吳安宇)

1. Shousheng He and Mats Torkelson, "A new approach to pipeline FFT processor," Proc. IEEE IPPS, pp. 766-770, 1996.
2. E. Bidet, D. Castelain, C. Joanblanq, and P. Senn, "A fast single-chip implementation of 8192 complex point FFT," IEEE J. Solid-State Circuits, pp. 300-305, March 1995.

ACCESS IC LAB

FFT Review

\[
X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}, \qquad k = 0, 1, \ldots, N-1,
\quad \text{with } W_N = e^{-j(2\pi/N)}
\]

[Figure: 8-point radix-2 FFT signal flow graph. Inputs in bit-reversed order x[0], x[4], x[2], x[6], x[1], x[5], x[3], x[7]; outputs in natural order X[0] to X[7]; branches carry twiddle factors W_N^0 to W_N^3 and factors of -1.]

Implementation
--- Two Extreme Methods

[Figure: the 8-point signal flow graph from the FFT Review, implemented either by reusing a single butterfly or by fully spreading every butterfly in hardware.]

              Reuse Single Butterfly          Fully Spread
Speed         Slow         <------------>     Fast
Area          Small        <------------>     Large
Control       Complicated  <------------>     Simple
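
To put numbers on the two extremes, the sketch below counts butterflies for a radix-2 FFT, which needs (N/2) log2 N butterfly operations in total. The function name and dictionary keys are hypothetical, and the model ignores memory and control cost.

```python
def butterfly_budget(n_points):
    """Hardware butterflies and sequential operations per transform (radix-2)."""
    stages = n_points.bit_length() - 1          # log2(N), N a power of two
    assert 1 << stages == n_points, "N must be a power of two"
    total_ops = (n_points // 2) * stages        # butterflies in the flow graph
    return {
        # One hardware butterfly per node of the flow graph.
        "fully_spread": {"hardware_butterflies": total_ops, "ops_per_unit": 1},
        # One hardware butterfly visits every node in turn.
        "single_butterfly": {"hardware_butterflies": 1, "ops_per_unit": total_ops},
    }

print(butterfly_budget(8))
```

For N = 8 the fully spread design needs 12 butterflies, while the single-butterfly design needs 12 sequential operations plus the addressing logic to steer them.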

Design Consideration
System Requirement
  e.g., speed, area, power …
To trade off between these two extreme cases, we need
  More Processing Elements (PEs)
  Better Processing Element Utilization Rate
  Better Control Scheme

FFT Processor
--- Block Diagram

[Figure: FFT processor block diagram. Blocks: DATA IN, INPUT BUFFER, Processing Element (Butterfly), COEF ROM, FFT RAM, CONTROL (issuing CONTROL SIGNALs), DATA OUT.]
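
One way to read the block diagram is as a memory-based processor in which a single butterfly works in place on the FFT RAM. The sketch below is a behavioural model under that assumption only: `rom` stands in for the COEF ROM, `ram` for the FFT RAM, and the nested loops for the CONTROL block. None of these names or the radix-2 in-place schedule are taken from the slides.

```python
import cmath

def fft_single_butterfly(x):
    """In-place radix-2 FFT using one butterfly; returns (spectrum, butterfly count)."""
    n = len(x)
    stages = n.bit_length() - 1
    assert n >= 2 and 1 << stages == n, "N must be a power of two"

    # COEF ROM: the N/2 twiddle factors W_N^k.
    rom = [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]

    # INPUT BUFFER -> FFT RAM, written in bit-reversed order.
    ram = [0j] * n
    for i, value in enumerate(x):
        ram[int(format(i, f"0{stages}b")[::-1], 2)] = complex(value)

    butterflies = 0
    span = 1
    while span < n:                       # CONTROL: one pass per stage
        step = n // (2 * span)            # ROM address stride for this stage
        for start in range(0, n, 2 * span):
            for j in range(span):
                a, b = start + j, start + j + span
                t = rom[j * step] * ram[b]            # twiddle multiply
                ram[a], ram[b] = ram[a] + t, ram[a] - t  # the one butterfly
                butterflies += 1
        span *= 2
    return ram, butterflies
```

The address generation in the loops is what makes the control of this extreme comparatively complicated, even though the datapath is a single butterfly.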

Some Current Schemes

Radix-2 Multi-path Delay Commutator (N = 16)

[Figure: pipeline of four BF2 butterflies with delay lengths 8, 4, 2, 1 and inter-stage multipliers (X, X, j).]

Radix-2 Single-path Delay Feedback (N = 16)



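Some Current Schemes (cont.)

[Figure: pipeline of four BF4 butterflies, labelled 8, 4, 2, 1, with inter-stage multipliers (X, X, j).]

Radix-4 Single-path Delay Feedback (N = 256)

[Figure: four stages, each a commutator C4 followed by a butterfly BF4, with multipliers (X) between stages. Delay lines: 192, 128, 64 before the first C4; then 16, 32, 48 and 48, 32, 16; then 4, 8, 12 and 12, 8, 4; then 1, 2, 3 and 3, 2, 1.]

Radix-4 Multi-path Delay Commutator (N = 256)

[Figure: DC6x64 -> BF4 -> X -> DC6x16 -> BF4 -> X -> DC6x4 -> BF4 -> X -> DC6x1 -> BF4]

Radix-4 Single-path Delay Commutator (N = 256)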



Distinctive merits of the above

Delay-feedback schemes are more efficient
than delay-commutator schemes in terms of
memory utilization.
Radix-4 has higher multiplier utilization;
however, radix-2 has simpler butterflies (BF),
which are better utilized.
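
The memory claim can be checked by adding up the delay lengths printed in the pipeline figures. The sketch below does that under two assumptions that are not stated on the slides: the labels 8, 4, 2, 1 are taken as the feedback delays of the radix-2 single-path delay-feedback pipeline (here called R2SDF), and "DC6xM" is read as a delay commutator holding 6*M words. The scheme abbreviations are used only as dictionary keys.

```python
# Delay-line lengths in complex words, read from the pipeline figures.
DELAYS = {
    ("R2SDF", 16): [8, 4, 2, 1],
    ("R4MDC", 256): [192, 128, 64,
                     16, 32, 48, 48, 32, 16,
                     4, 8, 12, 12, 8, 4,
                     1, 2, 3, 3, 2, 1],
    ("R4SDC", 256): [6 * 64, 6 * 16, 6 * 4, 6 * 1],  # assumed meaning of DC6xM
}

def memory_words(scheme, n_points):
    """Total delay memory of one pipeline, in complex words."""
    return sum(DELAYS[(scheme, n_points)])

def words_per_point(scheme, n_points):
    """Memory normalised by transform length, so different N can be compared."""
    return memory_words(scheme, n_points) / n_points

for scheme, n_points in DELAYS:
    print(scheme, n_points, memory_words(scheme, n_points),
          round(words_per_point(scheme, n_points), 2))
```

Under these assumptions the delay-feedback pipeline stores just under one word per point (N - 1), while the two delay-commutator pipelines store roughly 2.5 and 2 words per point.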

Comparison
Radix / Speed:              Low    <------------------>  High
Control Scheme:             Simple <------------------>  Complex
Processing Ability / Unit:  Low    <------------------>  High

Combine the advantages
=> Further decompose the high-radix PE

Decompose Method (1)


Simply "reuse" the repeated micro unit

[Figure: a radix-4 PE; the repeated micro unit is reused 4 times.]

Decompose Method (2)


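From the algorithm level
Applying a three-index map:
\[ n = \left\langle \tfrac{N}{2} n_1 + \tfrac{N}{4} n_2 + n_3 \right\rangle_N, \quad n_1, n_2 \in \{0, 1\},\ n_3 \in \{0, \ldots, N/4 - 1\} \]
\[ k = \left\langle k_1 + 2k_2 + 4k_3 \right\rangle_N, \quad k_1, k_2 \in \{0, 1\},\ k_3 \in \{0, \ldots, N/4 - 1\} \]

Summation over n1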

Decompose Method (2) cont.


Summation over n2

Requires only real-imaginary swapping and sign inversion
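
The equations for these two summations did not survive extraction. The block below is a reconstruction that follows the radix-2^2 derivation in reference 1 (He and Torkelson) using the three-index map. The symbols B and H follow that paper, and the typesetting on the original slides may have differed.

```latex
\begin{aligned}
X(k_1 + 2k_2 + 4k_3)
 &= \sum_{n_3=0}^{N/4-1} \sum_{n_2=0}^{1} \sum_{n_1=0}^{1}
    x\!\left(\tfrac{N}{2}n_1 + \tfrac{N}{4}n_2 + n_3\right)
    W_N^{\left(\frac{N}{2}n_1 + \frac{N}{4}n_2 + n_3\right)(k_1 + 2k_2 + 4k_3)} \\
% Summation over n_1: W_N^{(N/2) n_1 (k_1 + 2k_2 + 4k_3)} = (-1)^{n_1 k_1}
 &= \sum_{n_3=0}^{N/4-1} \sum_{n_2=0}^{1}
    B_{N/2}^{k_1}\!\left(\tfrac{N}{4}n_2 + n_3\right)
    W_N^{\left(\frac{N}{4}n_2 + n_3\right)(k_1 + 2k_2 + 4k_3)},
 \qquad B_{N/2}^{k_1}(m) = x(m) + (-1)^{k_1}\, x\!\left(m + \tfrac{N}{2}\right) \\
% Summation over n_2: W_N^{(N/4) n_2 (k_1 + 2k_2)} = (-j)^{n_2 (k_1 + 2k_2)}, W_N^{N n_2 k_3} = 1
 &= \sum_{n_3=0}^{N/4-1}
    \left[ H(k_1, k_2, n_3)\, W_N^{n_3 (k_1 + 2k_2)} \right] W_{N/4}^{n_3 k_3} \\
H(k_1, k_2, n_3)
 &= B_{N/2}^{k_1}(n_3) + (-j)^{(k_1 + 2k_2)}\, B_{N/2}^{k_1}\!\left(n_3 + \tfrac{N}{4}\right)
\end{aligned}
```

The factor (-j)^(k1 + 2 k2) takes only the values 1, -j, -1, j, which is why this step needs no multiplier.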



Graphical Explanation (N=16)

Trivial multiplication
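
The N = 16 case can also be checked numerically. The sketch below applies one level of the radix-2^2 decomposition from reference 1 (two butterfly steps, a twiddle multiplication, then N/4-point DFTs) and compares the result with the direct sum. The function names are illustrative, and the inner N/4-point transforms are computed directly rather than recursively to keep the example short.

```python
import cmath

def dft(x):
    """Direct DFT, used both as the N/4-point kernel and as the reference."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]

def radix22_one_level(x):
    """X(k1 + 2*k2 + 4*k3) via one radix-2^2 decomposition step."""
    n = len(x)
    assert n % 4 == 0
    q = n // 4
    out = [0j] * n
    for k1 in (0, 1):
        # First butterfly: sum over n1.
        b = [x[m] + (-1) ** k1 * x[m + n // 2] for m in range(n // 2)]
        for k2 in (0, 1):
            # Second butterfly: sum over n2; (-j)^(k1+2*k2) is a trivial factor.
            h = [b[n3] + (-1j) ** (k1 + 2 * k2) * b[n3 + q] for n3 in range(q)]
            # The only non-trivial multiplication: W_N^(n3*(k1+2*k2)).
            g = [h[n3] * cmath.exp(-2j * cmath.pi * n3 * (k1 + 2 * k2) / n)
                 for n3 in range(q)]
            for k3, value in enumerate(dft(g)):
                out[k1 + 2 * k2 + 4 * k3] = value
    return out

x = [complex(v, -v % 3) for v in range(16)]
error = max(abs(a - b) for a, b in zip(radix22_one_level(x), dft(x)))
print(error < 1e-9)
```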

Graphical Explanation (cont.)


The equations are equivalent to the operations
below

[Figure: a BF4 is replaced by a BF2I followed by a BF2II, each with its own control signal.]

Circuit of BF2I
[Figure: BF2I circuit. Inputs Xr(n), Xi(n), Xr(n+N/2), Xi(n+N/2); outputs Zr(n), Zi(n), Zr(n+N/2), Zi(n+N/2). The data path is switched between the first N/2 cycles and the second N/2 cycles.]
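
The two phases of BF2I can be modelled as a stream processor with a feedback FIFO. This is a behavioural sketch under assumptions not shown on the slide: the FIFO holds N/2 samples and starts at zero, and the phase is taken from a sample counter. During the first N/2 cycles the input is stored while the FIFO drains; during the second N/2 cycles the sum Z(n) = x(n) + x(n+N/2) is output and the difference Z(n+N/2) = x(n) - x(n+N/2) is fed back.

```python
from collections import deque

def bf2i_stream(samples, depth):
    """Stream samples through a BF2I-style butterfly with a depth-word feedback FIFO."""
    fifo = deque([0] * depth)
    out = []
    for t, x in enumerate(samples):
        delayed = fifo.popleft()
        if (t // depth) % 2 == 0:
            # First N/2 cycles: store the input, emit what the FIFO held
            # (the differences of the previous block, or the initial zeros).
            fifo.append(x)
            out.append(delayed)
        else:
            # Second N/2 cycles: butterfly between stored x(n) and incoming x(n+N/2).
            out.append(delayed + x)    # Z(n) goes to the next stage
            fifo.append(delayed - x)   # Z(n+N/2) goes back into the FIFO
    return out

# One 8-point block followed by four zeros to flush the differences.
print(bf2i_stream([1, 2, 3, 4, 10, 20, 30, 40, 0, 0, 0, 0], depth=4))
# [0, 0, 0, 0, 11, 22, 33, 44, -9, -18, -27, -36]
```

The phase test `(t // depth) % 2` is a single bit of the sample counter when `depth` is a power of two, which is why one binary counter can control such a stage.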



Circuit of BF2II
[Figure: BF2II circuit. Inputs Xr(n), Xi(n), Xr(n+N/2), Xi(n+N/2); outputs Zr(n), Zi(n), Zr(n+N/2), Zi(n+N/2); additional logic swaps Re and Im and inverts the sign.]
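
The swap and sign inversion implement a multiplication by -j without a multiplier, since (a + jb)(-j) = b - ja. The sketch below shows that identity and a BF2II-style butterfly built on it. The function names and the `rotate` flag are hypothetical, and complex values are held as (re, im) tuples to mirror the separate real and imaginary wires.

```python
def times_minus_j(re, im):
    """(re + j*im) * (-j) = im - j*re: swap the parts, negate the new imaginary part."""
    return im, -re

def bf2ii(stored, incoming, rotate):
    """Butterfly on (re, im) tuples; the incoming sample is rotated by -j when asked."""
    a_re, a_im = stored
    b_re, b_im = times_minus_j(*incoming) if rotate else incoming
    total = (a_re + b_re, a_im + b_im)   # Z(n)
    diff = (a_re - b_re, a_im - b_im)    # Z(n+N/2)
    return total, diff

print(times_minus_j(3, 4))               # (4, -3)
print(bf2ii((1, 1), (3, 4), rotate=True))  # ((5, -2), (-3, 4))
```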



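Radix-2^2 Single-path Delay Feedback

[Figure: radix-2^2 single-path delay-feedback pipeline for N = 256. x(n) enters a chain of eight butterflies alternating BF2I and BF2II, with feedback delays 128, 64, 32, 16, 8, 4, 2, 1, and leaves as X(k). Three multipliers (X) apply the twiddle factors W1(n), W2(n), W3(n). A counter driven by clk supplies control bits 7 to 0.]

FFT architecture using the above technique, for N=256

Compare with original architecture, for N=256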



Structural advantage
Radix-2^2 has the same multiplicative complexity as
radix-4, but still retains the radix-2 BF
structure
Only every second stage has a non-trivial multiplication
Control is simple:
synchronization controller
address counter for W^n

Conclusions
1. FFT applications: radar signal processing, fast
convolution, spectrum estimation, OFDM-based
modulation/demodulation.
2. Efficient VLSI architectures (parallel processing) are
required for real-time processing.
3. However, most systems still employ DSP processors (e.g.,
TI C3x/C5x) for the computations, running fast algorithms such as
DIT and DIF FFT.
4. VLIW (Very Long Instruction Word)-based processors
(TI C6x) need new programming skills to utilize the two
parallel MAC units.
