Typical Digital Signal Processing Operations
Prof. Shankar Prakriya
Indian Institute of Technology, Delhi
12/9/14

Outline

Why DSP processors?
Typical DSP operations: dot product, matrix product, convolution, filtering, FFT algorithm
Typical DSP operations required
Indexing issues in typical operations

The Vector Dot Product

$y = c^T x$, where $c = [c_1, \ldots, c_N]$ and $x = [x_1, \ldots, x_N]$
The multiply-accumulate (MAC) operation is required
In a normal processor, $c_i$ and $x_i$ are accessed sequentially from memory before the multiply-accumulate, which is slow
In a DSP processor, these are accessed simultaneously

The Vector Dot Product

DSP processors use the Harvard or modified Harvard architecture (separate program and data memory)
The von Neumann architecture (shared program and data memory) is used by general-purpose processors

The Vector Dot Product

DSP processors like the C5510 allow the user to define sections of memory as program or data memory
A crossover path is allowed for some sections (both program and data memory)
The C5510 allows three quantities to be accessed simultaneously

The Vector Dot Product

Indexing requirements for repeated operation of the MAC:
One pointer for $c_i$ and one for $x_i$
Increment of each pointer after each MAC
A check to see whether N operations have been completed
The RPT instruction is used by the C5510 to repeat the MAC
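
The indexing requirements above can be sketched in software. This is a minimal illustration, not C5510 code: the function name dot_product_mac and the variables c_ptr, x_ptr and remaining are assumptions made for this example, with array indices standing in for hardware pointers and the loop counter standing in for the repeat count.

```python
def dot_product_mac(c, x):
    """Dot product written as a repeated multiply-accumulate (MAC)."""
    if len(c) != len(x):
        raise ValueError("c and x must have the same length")
    acc = 0               # accumulator
    c_ptr = 0             # index standing in for the coefficient pointer
    x_ptr = 0             # index standing in for the data pointer
    remaining = len(c)    # repeat counter (the role a hardware repeat plays)
    while remaining > 0:
        acc += c[c_ptr] * x[x_ptr]   # one MAC
        c_ptr += 1                   # post-increment both pointers
        x_ptr += 1
        remaining -= 1               # check whether N operations are done
    return acc
```

On a general-purpose processor each of the three bookkeeping steps costs instructions; the point of the slide is that a DSP folds them into the MAC itself.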

The Vector Dot Product

When we multiply-accumulate with fixed-point arithmetic, the accumulator must have extra guard bits to avoid overflow
The C5510 has a 40-bit accumulator with 16-bit words
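
A minimal sketch of why guard bits matter, assuming two's-complement wraparound on overflow (real hardware may saturate instead). The helper names wrap_signed and mac_fixed are illustrative and do not come from the slides.

```python
def wrap_signed(value, bits):
    """Wrap an integer into a two's-complement word of the given width."""
    mask = (1 << bits) - 1
    value &= mask
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value


def mac_fixed(c, x, acc_bits):
    """Accumulate products of 16-bit words in an accumulator of acc_bits bits."""
    acc = 0
    for ci, xi in zip(c, x):
        acc = wrap_signed(acc + ci * xi, acc_bits)
    return acc


# Four worst-case positive 16-bit products
c = [32767] * 4
x = [32767] * 4
exact = sum(ci * xi for ci, xi in zip(c, x))

acc32 = mac_fixed(c, x, 32)   # no guard bits: the sum wraps around
acc40 = mac_fixed(c, x, 40)   # 8 guard bits: the sum is preserved
```

Each 16-bit by 16-bit product fits in 32 bits, so a 32-bit accumulator has no headroom for the sum, while the extra bits of a 40-bit accumulator absorb the growth.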

Convolution

Consider two finite-length sequences h and x of lengths $N$ and $N_x$ (convolution length $N + N_x - 1$)
Assume h is stored in program memory and x in data memory
Long convolutions are often implemented using the FFT in practice

Convolution

$y[n] = \sum_{i=0}^{N-1} h[i]\,x[n-i]$, $n = 0, \ldots, N + N_x - 2$
To compute y[n], x[n], ..., x[n-N+1] are used along with h[0], ..., h[N-1]
To compute y[n+1], x[n+1], ..., x[n-N+2] are used along with h[0], ..., h[N-1]
The pointer to h must increment after each MAC and return to h[0] for the computation of the next output
The pointer to x decrements after each MAC but must advance by 1 in preparation for the next output
This can be achieved without extra computation in DSP processors like the C5510
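
The pointer movement described above can be sketched as a direct-form convolution. The function name convolve and the index names h_ptr and x_ptr are assumptions for this example; samples outside the stored sequence are treated as zero.

```python
def convolve(h, x):
    """Direct convolution y[n] = sum_i h[i] * x[n - i]."""
    n_h, n_x = len(h), len(x)
    y = [0] * (n_h + n_x - 1)
    for n in range(len(y)):
        acc = 0
        x_ptr = n                      # start each output at x[n]
        for h_ptr in range(n_h):       # h pointer runs 0..N-1, then restarts
            if 0 <= x_ptr < n_x:       # samples outside x count as zero
                acc += h[h_ptr] * x[x_ptr]
            x_ptr -= 1                 # x pointer moves backwards per MAC
        y[n] = acc
    return y
```

Between outputs, h_ptr resets to 0 and the starting value of x_ptr advances by one, which is exactly the bookkeeping the slide says a DSP performs without extra instructions.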

Convolution

The pointer to h[n] can benefit from circular buffers
In a circular buffer, a pointer incremented beyond the last address wraps back to the start address
When decremented below the start address, it wraps forward to the last address automatically, which saves clock cycles
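
Circular addressing amounts to modulo arithmetic on the pointer offset. The sketch below assumes a buffer occupying addresses start to start + length - 1; the function name circ_step is hypothetical.

```python
def circ_step(ptr, step, start, length):
    """Move ptr by step (positive or negative) inside a circular buffer
    that occupies addresses start .. start + length - 1."""
    return start + (ptr - start + step) % length
```

In software the modulo costs an extra operation per access; circular-addressing hardware performs the wrap as part of the pointer update, which is where the cycle savings come from.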

Filtering

Filtering is the most common operation in signal processing
It is the same as convolution, but the input x[n] is an infinite-length stream (real-time filtering)
It is used to implement lowpass, highpass, bandpass, or bandstop filters
These frequency-selective filters change the frequency content of an input signal

Filtering

$y[n] = \sum_{i=0}^{N-1} h[i]\,x[n-i]$ is a Finite Impulse Response (FIR) filter
$H(e^{j\omega}) = \sum_n h[n]\,e^{-j\omega n}$ is referred to as the frequency response of the filter
Note that $H(e^{j\omega})$ is complex, so both the magnitude and phase responses should be studied
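
The frequency response can be evaluated numerically straight from its definition. A minimal sketch, assuming h[n] starts at n = 0; the function name freq_response and the two-tap averaging filter are chosen only for illustration.

```python
import cmath


def freq_response(h, omega):
    """Evaluate H(e^{j omega}) = sum_n h[n] * exp(-j * omega * n)."""
    return sum(hn * cmath.exp(-1j * omega * n) for n, hn in enumerate(h))


h = [0.5, 0.5]                           # two-tap moving average (lowpass)
H_dc = freq_response(h, 0.0)             # response at omega = 0
H_quarter = freq_response(h, cmath.pi / 2)

magnitude = abs(H_quarter)               # magnitude response at pi/2
phase = cmath.phase(H_quarter)           # phase response at pi/2, in radians
```

The magnitude falls from 1 at omega = 0 towards 0 at omega = pi, the expected lowpass behaviour for an averaging filter.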

Filtering

FIR filters are preferred for their guaranteed stability under finite word-length effects and in adaptive systems, where the coefficients h[n] are constantly adapted
FIR filters require a large number of coefficients but can be designed to have linear phase

IIR Filters

$y[n] = \sum_{i=0}^{N-1} h[i]\,x[n-i] - \sum_{i=1}^{M} g[i]\,y[n-i]$ is the difference equation of the Infinite Impulse Response (IIR) type
Note the feedback of the output in the filter
Stability is an issue for these causal filters since the transfer function $B(z) = H(z) / (1 + G(z))$, with $H(z) = \sum_{i=0}^{N-1} h[i]\,z^{-i}$ and $G(z) = \sum_{i=1}^{M} g[i]\,z^{-i}$, may have poles outside the unit circle
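
The difference equation can be run directly on a finite input. A minimal sketch under two assumptions: zero initial conditions, and feedback coefficients g[1..M] passed as a plain list whose first element is g[1]. The function name iir_filter is hypothetical.

```python
def iir_filter(h, g, x):
    """y[n] = sum_i h[i] x[n-i] - sum_i g[i] y[n-i], zero initial conditions.

    h holds h[0..N-1]; g holds g[1..M] (so g[0] in the list is g[1])."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for i, hi in enumerate(h):               # feedforward part
            if n - i >= 0:
                acc += hi * x[n - i]
        for i, gi in enumerate(g, start=1):      # feedback part
            if n - i >= 0:
                acc -= gi * y[n - i]
        y.append(acc)
    return y


impulse = [1.0, 0.0, 0.0, 0.0]
stable = iir_filter([1.0], [-0.5], impulse)     # pole at z = 0.5
unstable = iir_filter([1.0], [-2.0], impulse)   # pole at z = 2
```

With the pole inside the unit circle the impulse response decays; with the pole outside it grows without bound, which is the stability issue noted above.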

Indexing Issues in FIR Filtering

$y[n] = \sum_{i=0}^{N-1} h[i]\,x[n-i]$
Note that an input array of size N is all that is needed for x[n]
Computing y[n] requires x[n], ..., x[n-N+1], but y[n+1] requires x[n+1], ..., x[n-N+2] (x[n-N+1] is no longer needed)

Indexing Issues in FIR Filtering

Option 1: After y[n] is computed, x[n-N+2] is shifted to the location of x[n-N+1], x[n-N+3] to the location of x[n-N+2], and so on
x[n+1] can then be stored in the location that x[n] occupied
These move operations can consume a lot of time
Processors like TI's C5510 perform this move while computing each output
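
Option 1 can be sketched as a delay line that is shifted once per input sample. The layout (newest sample at index 0) and the function name fir_step_shift are assumptions for this example.

```python
def fir_step_shift(h, delay, sample):
    """One FIR output using a shifted delay line.

    delay[0] holds the newest sample and delay[-1] the oldest;
    the list is updated in place."""
    # Move every stored sample one place towards the old end,
    # overwriting the oldest sample, which is no longer needed.
    for k in range(len(delay) - 1, 0, -1):
        delay[k] = delay[k - 1]
    delay[0] = sample
    return sum(hk * dk for hk, dk in zip(h, delay))


h = [1, 2, 3]
delay = [0, 0, 0]
outputs = [fir_step_shift(h, delay, v) for v in [1, 1, 0, 0]]
```

The shift loop costs N - 1 moves per output in software, which is the overhead the slide refers to.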

Indexing Issues in FIR Filtering

Option 2: x[n] can instead be stored in a circular buffer of size N
A pointer is required to indicate the location where the next incoming sample is to be stored
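
Option 2 can be sketched with a write pointer and modulo indexing, so no samples are ever moved. The class name CircularFir and its attribute names are hypothetical, chosen for this illustration.

```python
class CircularFir:
    """FIR filter whose input history lives in a circular buffer of size N."""

    def __init__(self, h):
        self.h = list(h)
        self.buf = [0] * len(self.h)
        self.write = 0                    # where the next sample is stored

    def step(self, sample):
        n = len(self.h)
        self.buf[self.write] = sample     # overwrite the oldest sample
        acc = 0
        idx = self.write                  # newest sample pairs with h[0]
        for hk in self.h:
            acc += hk * self.buf[idx]
            idx = (idx - 1) % n           # step back in time, wrapping
        self.write = (self.write + 1) % n
        return acc


fir = CircularFir([1, 2, 3])
outputs = [fir.step(v) for v in [1, 1, 0, 0]]
```

Only one memory write happens per input sample; the cost moves into the wrapped index arithmetic, which circular-addressing hardware handles automatically.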

FIR Filtering - Issues

The accumulator must have a larger number of bits, including guard bits
To avoid overflow, it is important to scale inputs; DSP processors should be able to scale numbers while accessing or storing them
Having multiple MAC units speeds up filtering, but more buses are required
Multiple MACs also assist in complex multiplication and complex filtering

Generalized Linear Phase FIR Filters

In many applications, it is desirable that the phase $\theta(e^{j\omega})$ of $H(e^{j\omega}) = \sum_n h[n]\,e^{-j\omega n}$ possess the generalized linear phase (GLP) property
$\theta(e^{j\omega}) = \alpha\omega + \beta$ over $0 \le \omega \le 2\pi$
In these cases, the impulse response is either symmetric or antisymmetric
This symmetry should be exploited for computational and power efficiency

Generalized Linear Phase FIR Filters

There are four types of FIR filters:
N even, symmetric
N even, antisymmetric
N odd, symmetric
N odd, antisymmetric
E.g., h[0]x[n] + h[1]x[n+1] + h[0]x[n+2] should first add x[n] and x[n+2] and then multiply the sum by h[0]; this saves multiplications and power (less use of MAC units)
The C5510 provides the FIRSADD and FIRSSUB instructions for this
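
The add-first idea can be sketched for a symmetric impulse response. This is an illustration of the arithmetic only, not of the FIRSADD instruction itself; the function name fir_symmetric and the convention of passing only the first half of the coefficients are assumptions.

```python
def fir_symmetric(h_half, window):
    """One output of a symmetric FIR filter using folded additions.

    window holds the N samples that the taps multiply;
    h_half holds the first ceil(N / 2) coefficients, since h[k] == h[N-1-k]."""
    n = len(window)
    acc = 0
    for k in range(n // 2):
        # Add the two samples that share a coefficient, then multiply once
        acc += h_half[k] * (window[k] + window[n - 1 - k])
    if n % 2:
        # Odd length: the center tap has no partner
        acc += h_half[n // 2] * window[n // 2]
    return acc
```

For N taps this uses about N/2 multiplications instead of N. An antisymmetric response would subtract the paired samples instead of adding them.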

Issues Related to FIR Filters

Some applications require FIR filters with complex coefficients
Some processors provide multiple MAC units to facilitate this
Multiple MACs can also speed up FIR filtering
Care must be taken to avoid overflow and underflow

Matrix Products

Assume two matrices of sizes m x n and n x r are stored in memory row-wise
The first row of the first matrix and the first column of the second matrix must be accessed to compute the first element of the product
To step through the first column, the pointer must be incremented by r (such pointer manipulation is provided for in the C5510)
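
The stride-r column access can be sketched with both matrices stored as flat row-major lists. The function name matmul_flat and the index names a_ptr and b_ptr are assumptions for this example.

```python
def matmul_flat(a, b, m, n, r):
    """Product of an m x n matrix a and an n x r matrix b,
    both stored row-wise in flat lists. Returns a flat m x r result."""
    c = [0] * (m * r)
    for i in range(m):
        for j in range(r):
            a_ptr = i * n        # start of row i of a
            b_ptr = j            # top of column j of b
            acc = 0
            for _ in range(n):
                acc += a[a_ptr] * b[b_ptr]
                a_ptr += 1       # next element along the row: step 1
                b_ptr += r       # next element down the column: step r
            c[i * r + j] = acc
    return c
```

Each product element is a dot product in which one pointer steps by 1 and the other by r, so the operation maps onto the same MAC loop with a programmable pointer increment.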

Fast Fourier Transforms

Filtering is efficiently implemented using DFTs, which are implemented using FFTs
DFT: $X[k] = \sum_{n=0}^{N-1} x[n]\,W_N^{nk}$, where $W_N = e^{-j2\pi/N}$ (twiddle factor)
It requires $N^2$ complex multiplications for direct implementation
$X[k] = \sum_{n=0}^{N/2-1} x_e[n]\,W_{N/2}^{nk} + W_N^k \sum_{n=0}^{N/2-1} x_o[n]\,W_{N/2}^{nk}$
This is a sum of two DFTs requiring $(N/2)^2$ multiplications each, plus $N/2$ extra
Complexity has decreased ($x_e[n] = x[2n]$ and $x_o[n] = x[2n+1]$ are the even-indexed and odd-indexed parts of the sequence)
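
Applying the even/odd split recursively gives a radix-2 decimation-in-time FFT. The sketch below is a plain recursive version for power-of-two lengths, checked against the direct DFT; it is not the in-place form used on DSP hardware. The function names dft and fft_dit are assumptions, and the second output of each pair uses the identity W_N^{k+N/2} = -W_N^k.

```python
import cmath


def dft(x):
    """Direct DFT: N^2 complex multiplications."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]


def fft_dit(x):
    """Recursive radix-2 decimation-in-time FFT (len(x) a power of two)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_dit(x[0::2])     # N/2-point DFT of x[0], x[2], ...
    odd = fft_dit(x[1::2])      # N/2-point DFT of x[1], x[3], ...
    out = [0] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]   # W_N^k * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


x = [1, 2, 3, 4, 0, 0, 0, 0]
max_error = max(abs(a - b) for a, b in zip(fft_dit(x), dft(x)))
```

Both functions compute the same transform; the difference is only in the number of multiplications.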

Fast Fourier Transforms

$N \log_2 N$ computations are required across the $\log_2 N$ stages of the FFT
This algorithm is referred to as the decimation-in-time (DIT) algorithm
The sequence x[n] is often complex
The basic computation is implemented as a butterfly, which requires one complex multiplication
We can utilize the symmetry in $W_N$: $W_N^k = -W_N^{k+N/2}$ for $k = 0, 1, \ldots, N/2-1$

Fast Fourier Transforms

The butterfly (note that only one complex multiply is required)
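
The butterfly can be written out from the decimation-in-time split together with the identity $W_N^{k+N/2} = -W_N^k$. Here $E[k]$ and $O[k]$ are names assumed for the N/2-point DFTs of the even-indexed and odd-indexed samples; they do not appear in the original slides.

```latex
\begin{aligned}
X[k]       &= E[k] + W_N^{k}\,O[k], \\
X[k + N/2] &= E[k] - W_N^{k}\,O[k], \qquad k = 0, 1, \ldots, N/2 - 1.
\end{aligned}
```

The product $W_N^{k}\,O[k]$ is computed once and reused for both outputs, which is why a single complex multiplication suffices.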

Fast Fourier Transform - DIT

Fast Fourier Transform

Note that the input is in bit-reversed order, but the output is in normal order
Note the sequence of twiddle factors accessed in each stage; use of a circular buffer with a variable pointer step can help in the implementation
Computations are in situ (in place), and all numbers are complex

Fast Fourier Transforms

Note that the sequence of inputs accessed is in bit-reversed order
The last $\log_2 N$ bits are reversed in the first stage, the last $\log_2 N - 1$ in the second, etc.
The C5510 and other DSP processors implement bit-reversed ordering in hardware for the FFT, so no extra instructions are required
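
Bit-reversed ordering can be sketched in software as follows, assuming N is a power of two. The function names bit_reverse and bit_reversed_order are hypothetical; this loop is the work that bit-reversed addressing hardware removes.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)   # shift in the lowest bit
        index >>= 1
    return result


def bit_reversed_order(n):
    """Input ordering for an n-point DIT FFT (n a power of two)."""
    bits = n.bit_length() - 1                  # log2(n) for a power of two
    return [bit_reverse(i, bits) for i in range(n)]


order = bit_reversed_order(8)
```

For N = 8 the input samples are read in the order x[0], x[4], x[2], x[6], x[1], x[5], x[3], x[7].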

Pseudo-random Number Generation

PN Sequence

The sequence is periodic, and its autocorrelation is a periodic impulse train
The period has a maximum length of $2^n - 1$, where n is the number of shift-register stages
It is used widely in communications for scrambling, spread spectrum, etc.

PN Sequence

The processor must tap selected registers, XOR the bits, and find the parity in a single instruction
It should be able to rotate logically
Concatenation of larger registers (accumulators) is desirable for longer sequences
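
The tap, XOR and shift operations can be sketched as a linear feedback shift register (LFSR). This is a minimal illustration assuming a 3-stage register with taps on bit positions 0 and 1, a choice that happens to give the maximal period of 2^3 - 1 = 7; the function names lfsr_sequence and periodic_autocorr are hypothetical.

```python
def lfsr_sequence(taps, n_bits, seed=1):
    """One period (2^n_bits - 1 outputs) of a right-shifting LFSR."""
    state = seed
    out = []
    for _ in range((1 << n_bits) - 1):
        out.append(state & 1)                 # output the lowest bit
        feedback = 0
        for t in taps:                        # parity (XOR) of tapped bits
            feedback ^= (state >> t) & 1
        # Shift right and feed the parity bit in at the top
        state = (state >> 1) | (feedback << (n_bits - 1))
    return out


def periodic_autocorr(bits, lag):
    """Periodic autocorrelation after mapping 0 -> +1 and 1 -> -1."""
    s = [1 - 2 * b for b in bits]
    n = len(s)
    return sum(s[i] * s[(i + lag) % n] for i in range(n))


seq = lfsr_sequence([0, 1], 3)
```

For this sequence the periodic autocorrelation is 7 at lag 0 and -1 at every other lag, which is the impulse-like behaviour described on the previous slide.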

Viterbi Algorithm

Used in communication receivers and decoders
A computationally efficient implementation of maximum-likelihood (ML) receivers
Based on finite state machines

Viterbi Algorithm

It requires calculation of path metrics and rapid comparison of path metrics
It is useful to perform multiple additions simultaneously
Most DSPs allow addition of multiple quantities at the same time and access to multiple numbers simultaneously
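
The path-metric update is often described as an add-compare-select step. The sketch below shows one trellis stage under stated assumptions: smaller metrics are better, branch_metrics[p][s] is the cost of moving from state p to state s, and predecessors[s] lists the states that can reach s. All of these names are hypothetical and do not come from the slides.

```python
def acs_step(path_metrics, branch_metrics, predecessors):
    """One add-compare-select stage of the Viterbi algorithm.

    Returns the updated path metrics and, for each state,
    the surviving predecessor state."""
    new_metrics = []
    survivors = []
    for state, preds in enumerate(predecessors):
        # Add: extend each incoming path by its branch metric
        candidates = [(path_metrics[p] + branch_metrics[p][state], p)
                      for p in preds]
        # Compare and select: keep the path with the smallest metric
        best_metric, best_prev = min(candidates)
        new_metrics.append(best_metric)
        survivors.append(best_prev)
    return new_metrics, survivors


metrics, survivors = acs_step(
    [0, 3],                 # current path metrics for states 0 and 1
    [[1, 2], [0, 5]],       # branch metrics from state p to state s
    [[0, 1], [0, 1]],       # every state is reachable from both states
)
```

The additions for the competing paths into a state are independent of each other, which is why hardware that performs several additions at once speeds up this step.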

Summary

DSP requires very specialized functions to be implemented, with specific indexing and other requirements
DSP processors are designed to take care of these
Good programming requires a good understanding of the DSP architecture
