
A Presentation on
Design & Verification of DCT Algorithm

Guided by: Mr. Preet Jain (Asst. Prof., EC Department)
Submitted by: Atush Jain (0802EC09ME03)
Abstract
DCT stands for Discrete Cosine Transform.

It can be regarded as a discrete-time version of the Fourier
cosine series.

It is a technique for converting a signal into elementary
frequency components.

It is a common and well-known algorithm used for signal
and image compression.

Abstract Contd
The DCT core uses a direct implementation of the algorithm (i.e.,
as per the standard DCT equation).

The DCT core is implemented using Verilog HDL.

The output of the core is then verified against the output of
MATLAB.
Discrete Cosine Transform
A discrete cosine transform (DCT) expresses a sequence of
finitely many data points in terms of a sum of cosine functions
oscillating at different frequencies.
The Discrete Cosine Transform (DCT) of a one-dimensional
sequence x(n) of length N is defined as

$$Y[k] = c[k] \sum_{n=0}^{N-1} x(n) \cos\left\{\frac{(2n+1)k\pi}{2N}\right\}$$

where k = 0, 1, 2, ..., N-1.
The original signal vector x(n) can be reconstructed from
the DCT coefficients Y[k] by using the Inverse DCT (IDCT)
operation, which is defined as
Discrete Cosine Transform

$$x(n) = \sum_{k=0}^{N-1} c[k]\, Y[k] \cos\left\{\frac{(2n+1)k\pi}{2N}\right\}$$

where n = 0, 1, 2, ..., N-1.
In both of the above equations, c[k] is defined as

$$c[k] = \begin{cases} \sqrt{1/N}, & k = 0 \\ \sqrt{2/N}, & k = 1, 2, \ldots, N-1 \end{cases}$$
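The forward and inverse definitions can be checked numerically with a short Python sketch. It assumes the orthonormal scaling c[0] = sqrt(1/N) and c[k] = sqrt(2/N) otherwise; the function names and the sample vector are illustrative only and are not taken from the project.

```python
import math

def c(k, N):
    """Scale factor c[k], assuming the orthonormal convention."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

def dct(x):
    """Forward DCT: Y[k] = c[k] * sum_n x(n) cos((2n+1) k pi / 2N)."""
    N = len(x)
    return [c(k, N) * sum(x[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                          for n in range(N))
            for k in range(N)]

def idct(Y):
    """Inverse DCT: x(n) = sum_k c[k] Y[k] cos((2n+1) k pi / 2N)."""
    N = len(Y)
    return [sum(c(k, N) * Y[k] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                for k in range(N))
            for n in range(N)]

x = [10, 20, 30, 40, 50, 60, 70, 80]   # arbitrary example samples
Y = dct(x)
x_rec = idct(Y)                        # should reproduce x up to rounding error
print([round(v, 6) for v in x_rec])
```

With this scaling the same c[k] appears in both directions, which is why the round trip returns the original samples.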
DCT vs DFT
If we wish to find the frequency spectrum of a function that
we have sampled, the continuous Fourier Transform is not so
useful. For that, we need a discrete version such as the DFT.

When the input data is real and even-symmetric, the sine
components of the DFT are 0, and the DFT reduces to a Discrete
Cosine Transform (DCT).

The Discrete Fourier Transform (DFT) and Discrete Cosine
Transform (DCT) perform similar functions i.e. they both
decompose a finite-length discrete-time vector into a sum of
scaled-and-shifted basis functions.
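One way to make the link between the two transforms concrete is the following Python sketch. It mirrors a real sequence to make it even-symmetric, takes a naive DFT, and removes a half-sample phase shift; what remains is purely real and equals twice the unscaled DCT sums. The helper names and the sample vector are assumptions made for illustration.

```python
import cmath
import math

def dft(y):
    """Naive DFT: F[k] = sum_m y[m] exp(-j 2 pi k m / M)."""
    M = len(y)
    return [sum(y[m] * cmath.exp(-2j * math.pi * k * m / M) for m in range(M))
            for k in range(M)]

def dct_sums(x):
    """Unscaled DCT sums: S[k] = sum_n x[n] cos((2n+1) k pi / 2N)."""
    N = len(x)
    return [sum(x[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N)) for n in range(N))
            for k in range(N)]

x = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, 6.0]   # arbitrary real samples
N = len(x)
mirrored = x + x[::-1]                 # even-symmetric extension of length 2N
F = dft(mirrored)

# Undo the half-sample phase shift; only cosine (real) terms remain.
shifted = [cmath.exp(-1j * math.pi * k / (2 * N)) * F[k] for k in range(N)]
from_dft = [s.real / 2 for s in shifted]
max_imag = max(abs(s.imag) for s in shifted)
direct = dct_sums(x)
print(max_imag, max(abs(a - b) for a, b in zip(from_dft, direct)))
```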
DCT vs DFT
The difference between both the transforms is the type of basis
function used; the DFT uses a set of complex exponential
functions, while the DCT uses only (real-valued) cosine
functions.

The DCT and DFT are used because some tasks are much easier
to handle in the frequency domain than in the time domain. For
example, in a graphic equalizer, to boost the bass:
1. Transform to frequency domain.
2. Increase the magnitude of low frequency components.
3. Transform back to time domain.
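The three steps can be sketched in Python as follows. This is a minimal illustration, assuming an orthonormal DCT; `boost_bass`, `cutoff` and `gain` are hypothetical names, and a real equalizer would work on overlapping blocks of audio rather than on a single short vector.

```python
import math

def basis(k, n, N):
    """Orthonormal DCT basis value c[k] * cos((2n+1) k pi / 2N)."""
    scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    return scale * math.cos((2 * n + 1) * k * math.pi / (2 * N))

def dct(x):
    N = len(x)
    return [sum(x[n] * basis(k, n, N) for n in range(N)) for k in range(N)]

def idct(Y):
    N = len(Y)
    return [sum(Y[k] * basis(k, n, N) for k in range(N)) for n in range(N)]

def boost_bass(samples, cutoff, gain):
    """Scale the `cutoff` lowest-frequency coefficients by `gain`."""
    Y = dct(samples)                                               # 1. to the frequency domain
    Y = [gain * v if k < cutoff else v for k, v in enumerate(Y)]   # 2. boost the low frequencies
    return idct(Y)                                                 # 3. back to the time domain

x = [0.0, 1.0, 0.5, -0.5, -1.0, 0.0, 0.75, 0.25]   # arbitrary example samples
y = boost_bass(x, cutoff=2, gain=2.0)
print([round(v, 4) for v in y])
```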
Application of DCT
For audio:
The human ear has a different dynamic range for different
frequencies.
The DCT transforms the signal from the time domain to the
frequency domain, so that different frequencies can be quantized differently.

For images and video:
The human eye is less sensitive to fine detail.
The DCT transforms the image from the spatial domain to the
frequency domain, so that high frequencies can be quantized more coarsely (or not at all).
This has the effect of slightly blurring the image, which may not be
perceptible if done right.
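Frequency-dependent quantization can be illustrated with a small Python sketch. The step sizes below are hypothetical and chosen only to show the idea of coarser steps at higher frequencies; they do not come from any standard or from this project. Because the DCT used here is orthonormal, the reconstruction error cannot exceed the combined rounding error of the coefficients.

```python
import math

def basis(k, n, N):
    """Orthonormal DCT basis value c[k] * cos((2n+1) k pi / 2N)."""
    scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    return scale * math.cos((2 * n + 1) * k * math.pi / (2 * N))

def dct(x):
    N = len(x)
    return [sum(x[n] * basis(k, n, N) for n in range(N)) for k in range(N)]

def idct(Y):
    N = len(Y)
    return [sum(Y[k] * basis(k, n, N) for k in range(N)) for n in range(N)]

def quantize(Y, steps):
    """Round each coefficient to the nearest multiple of its step size."""
    return [s * round(v / s) for v, s in zip(Y, steps)]

steps = [1, 1, 2, 2, 4, 4, 8, 8]          # hypothetical: coarser for high frequencies
x = [52, 55, 61, 66, 70, 61, 64, 73]      # arbitrary example samples
Yq = quantize(dct(x), steps)
x_rec = idct(Yq)

err = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_rec)))
bound = math.sqrt(sum((s / 2) ** 2 for s in steps))   # worst-case rounding error
print(round(err, 4), round(bound, 4))
```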
Literature Review
Distributed Arithmetic [6, 9, 10]:
The DCT has been implemented using distributed arithmetic (DA).
The most often encountered form of its computation is a sum of
products.
A sum of products of two vectors can be realized using DA
when one of the vectors is constant.


(1)
Distributed Arithmetic Contd..
Where, Ak is a constant and
Xk is the input data.
If A1, ..., AL are all N-bit signed 2's-complement
binary numbers, (1) can also be represented as:


(2)


(3)
Distributed Arithmetic Contd..
In eq. (2), matrix A is an adder matrix


(4)


but it consists of only two elements: 0 and 1. It is easy to
see that Y0, Y1, ..., YN-1 are sums of some of the data from
X1, X2, ..., XL, so the computation of Y contains only two
operations: addition and shift.
Distributed Arithmetic Contd..
DA uses a look-up table and accumulators instead of
multipliers.
Each bit of each value of the two multiplied variables
contributes only once to the sum. Because each bit can only be
0 or 1, as discussed earlier, the number of possible partial sums is
restricted to 2^n, so they can be pre-calculated and saved in a
look-up table to be retrieved later.
The look-up table used by the distributed arithmetic method
requires a large amount of memory.
The shift operation is implemented by wiring, which costs
little delay and few hardware resources.
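The look-up-table idea can be sketched in Python. This follows the classic DA arrangement, in which one bit from every input forms the table address and the table holds pre-added constants; it is an illustration under that assumption, not the architecture of any cited paper. The names `build_lut` and `da_dot` and the example values are hypothetical.

```python
def build_lut(coeffs):
    """Pre-compute the sum of the constants selected by every L-bit address."""
    L = len(coeffs)
    return [sum(coeffs[k] for k in range(L) if (addr >> k) & 1)
            for addr in range(1 << L)]            # table size grows as 2^L

def da_dot(coeffs, inputs, nbits):
    """Compute sum(coeffs[k] * inputs[k]) with look-ups, additions and shifts only.

    Every input must fit in an nbits-bit two's-complement word.
    """
    lut = build_lut(coeffs)
    mask = (1 << nbits) - 1
    words = [v & mask for v in inputs]            # two's-complement bit patterns
    acc = 0
    for b in range(nbits):
        addr = 0
        for k, w in enumerate(words):
            addr |= ((w >> b) & 1) << k           # bit b of every input forms the address
        partial = lut[addr] << b                  # the shift applies the weight 2^b
        # The sign bit of a two's-complement word carries a negative weight.
        acc += -partial if b == nbits - 1 else partial
    return acc

coeffs = [3, -5, 7, 2]                            # constant vector (example values)
inputs = [10, -4, 0, -128]                        # 8-bit signed input data
direct = sum(a * v for a, v in zip(coeffs, inputs))
print(da_dot(coeffs, inputs, nbits=8), direct)
```

The table has 2^L entries for L inputs, which illustrates why the memory cost of the look-up table grows quickly.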
Fast DCT Algorithms
To overcome the extensive computation of the DCT, Chen et al.
[5, 15] proposed the fast DCT (FDCT).

Chen used the Fast Fourier Transform (FFT) method to
propose a more efficient algorithm involving only real operations
for computing what he called the Fast Discrete Cosine
Transform (FDCT).

The 8-point DCT can be written as a matrix transform:
Y = AX
Fast DCT Algorithms Contd
Where,






The Multiplier coefficients are given by

Fast DCT Algorithms Contd






Where,
Fast DCT Algorithms Contd
Due to the symmetry of the (8x8) multiplication matrix, it
can be replaced by two (4x4) x (4x4) matrices, which can be
computed in parallel, as can the sums and differences forming
the vectors below.

Fast DCT Algorithms Contd
The matrix operations of the design were expressed as a
signal-flow graph.

Fast DCT Algorithms Contd
The Chen fast DCT signal-flow requires a total of 18
multiplications.

Lee Algorithm [8, 15]:

The Lee algorithm [8] is based on the matrix representation.

In fact, the first step is nothing more than a butterfly decomposition
yielding an even and an odd part.


Fast DCT Algorithms Contd
The even part is just a 1-D DCT of order N/2, while the
odd part is computed through a matrix multiplication.









For a 1-D DCT of order N = 8, the number of operations
necessary for this algorithm is 32 multiplications and
32 additions.
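The butterfly decomposition into an even and an odd part can be sketched in Python as follows. The sketch shows only this first split, not the full recursion of any published fast algorithm, and it works on unscaled DCT sums (the factor c[k] is left out). Function names and sample data are illustrative.

```python
import math

def dct_sums(x):
    """Unscaled DCT sums: S[k] = sum_n x[n] cos((2n+1) k pi / 2N)."""
    N = len(x)
    return [sum(x[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N)) for n in range(N))
            for k in range(N)]

def dct_sums_split(x):
    """Same result, computed from the even/odd butterfly decomposition."""
    N = len(x)
    half = N // 2
    sums = [x[n] + x[N - 1 - n] for n in range(half)]    # butterfly: sums
    diffs = [x[n] - x[N - 1 - n] for n in range(half)]   # butterfly: differences
    even = dct_sums(sums)                                # DCT of order N/2
    odd = [sum(diffs[n] * math.cos((2 * n + 1) * (2 * m + 1) * math.pi / (2 * N))
               for n in range(half))
           for m in range(half)]                         # (N/2 x N/2) matrix product
    out = [0.0] * N
    out[0::2] = even                                     # coefficients 0, 2, 4, ...
    out[1::2] = odd                                      # coefficients 1, 3, 5, ...
    return out

x = [12.0, 45.0, 7.0, 89.0, 34.0, 56.0, 23.0, 90.0]      # arbitrary example samples
reference = dct_sums(x)
split = dct_sums_split(x)
print(max(abs(a - b) for a, b in zip(reference, split)))
```

For N = 8, evaluating each 4x4 half directly costs 16 multiplications, which would be consistent with the total of 32 quoted above.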

Project Design Flow
HDL Flow:
1. Take an input matrix of size 8 x 1.
2. Apply the DCT algorithm designed in HDL to it.
3. Check the simulation results for the DCT output (result A).

MATLAB Flow:
1. Take an input matrix of size 8 x 1.
2. Compute the DCT with the dct command in MATLAB.
3. Store the result of the computed DCT (result B).

Project Design Flow Contd
Comparison between HDL and MATLAB results:
Compare results A and B.
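The comparison step can be sketched in Python. Everything in this sketch is an assumption made for illustration: result A is replaced by a software stand-in that rounds each cosine coefficient to a hypothetical 8 fractional bits, and result B is the 8-point orthonormal DCT formula, which uses the same scaling as MATLAB's dct command. It does not reproduce the actual HDL core or its measured results.

```python
import math

def dct8(x, coeff=lambda v: v):
    """8-point DCT: y(u) = (c(u)/2) * sum_i x(i) cos((2i+1) u pi / 16).

    coeff() is applied to every cosine coefficient, so a rounding function
    can emulate finite-precision hardware constants.
    """
    out = []
    for u in range(8):
        cu = 1 / math.sqrt(2) if u == 0 else 1.0
        out.append(sum(x[i] * coeff(0.5 * cu * math.cos((2 * i + 1) * u * math.pi / 16))
                       for i in range(8)))
    return out

FRAC_BITS = 8   # hypothetical coefficient precision, not taken from the design

def quantize(v):
    """Round a coefficient to FRAC_BITS fractional bits."""
    return round(v * (1 << FRAC_BITS)) / (1 << FRAC_BITS)

x = [12, 45, 7, 89, 34, 56, 23, 90]    # arbitrary 8 x 1 input
result_a = dct8(x, quantize)           # stand-in for the HDL simulation output (A)
result_b = dct8(x)                     # floating-point reference (B)
max_diff = max(abs(a - b) for a, b in zip(result_a, result_b))
print(max_diff)
```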
Design of DCT Controller
The standard equation for the 1-D DCT is

$$Y(u) = \frac{c(u)}{2} \sum_{i=0}^{7} X(i) \cos\left[\frac{(2i+1)u\pi}{16}\right] \tag{1}$$

Where,
Y(u) = Coefficient value in transform domain
X(i) = Coefficient value in pixel domain
u = Co-ordinates in transform domain
i = Spatial co-ordinates in pixel domain
Design of DCT Controller Contd...
In eq. (1), c(u) is defined as

$$c(u) = \begin{cases} 1/\sqrt{2}, & u = 0 \\ 1, & u \neq 0 \end{cases} \tag{2}$$

From eq. (1), the following eight equations are inferred:

Y(0) = [X(0) + X(1) + X(2) + X(3) + X(4) + X(5) + X(6) + X(7)] P (3.1)

Y(1) = [X(0) - X(7)] A + [X(1) - X(6)] B + [X(2) - X(5)] C + [X(3) - X(4)] D (3.2)

Y(2) = [X(0) - X(3) - X(4) + X(7)] M + [X(1) - X(2) - X(5) + X(6)] N (3.3)

Y(3) = [X(0) - X(7)] B + [X(1) - X(6)] (-D) + [X(2) - X(5)] (-A) + [X(3) - X(4)] (-C) (3.4)
Design of DCT Controller Contd...
Y(4) = [X(0) - X(1) - X(2) + X(3) + X(4) - X(5) - X(6) + X(7)] P (3.5)

Y(5) = [X(0) - X(7)] C + [X(1) - X(6)] (-A) + [X(2) - X(5)] D + [X(3) - X(4)] B (3.6)

Y(6) = [X(0) - X(3) - X(4) + X(7)] N + [X(1) - X(2) - X(5) + X(6)] (-M) (3.7)

Y(7) = [X(0) - X(7)] D + [X(1) - X(6)] (-C) + [X(2) - X(5)] B + [X(3) - X(4)] (-A) (3.8)
Design of DCT Controller Contd...
Where,
M = 0.5 * Cos(pi/8) = 0.5 * Cos (2*pi/16)
N = 0.5 * Cos(3*pi/8) = 0.5 * Cos (6*pi/16)
P = 0.5 * Cos(pi/4) = 0.5 * Cos (4*pi/16)
A = 0.5 * Cos(pi/16)
B = 0.5 * Cos(3*pi/16)
C = 0.5 * Cos(5*pi/16)
D = 0.5 * Cos(7*pi/16)
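The eight equations and their coefficients can be cross-checked against the direct summation with a short Python sketch. It is a software check only, not the Verilog core; the function names and the input vector are illustrative.

```python
import math

# Coefficients as listed above.
M = 0.5 * math.cos(math.pi / 8)
N = 0.5 * math.cos(3 * math.pi / 8)
P = 0.5 * math.cos(math.pi / 4)
A = 0.5 * math.cos(math.pi / 16)
B = 0.5 * math.cos(3 * math.pi / 16)
C = 0.5 * math.cos(5 * math.pi / 16)
D = 0.5 * math.cos(7 * math.pi / 16)

def dct8_equations(X):
    """Evaluate equations (3.1) to (3.8) from shared sums and differences."""
    d07, d16, d25, d34 = X[0] - X[7], X[1] - X[6], X[2] - X[5], X[3] - X[4]
    e0 = X[0] - X[3] - X[4] + X[7]   # first bracket of Y(2) and Y(6)
    e1 = X[1] - X[2] - X[5] + X[6]   # second bracket of Y(2) and Y(6)
    return [
        sum(X) * P,
        d07 * A + d16 * B + d25 * C + d34 * D,
        e0 * M + e1 * N,
        d07 * B - d16 * D - d25 * A - d34 * C,
        (X[0] - X[1] - X[2] + X[3] + X[4] - X[5] - X[6] + X[7]) * P,
        d07 * C - d16 * A + d25 * D + d34 * B,
        e0 * N - e1 * M,
        d07 * D - d16 * C + d25 * B - d34 * A,
    ]

def dct8_direct(X):
    """Direct form: Y(u) = (c(u)/2) * sum_i X(i) cos((2i+1) u pi / 16)."""
    out = []
    for u in range(8):
        cu = 1 / math.sqrt(2) if u == 0 else 1.0
        out.append(0.5 * cu * sum(X[i] * math.cos((2 * i + 1) * u * math.pi / 16)
                                  for i in range(8)))
    return out

X = [12, 45, 7, 89, 34, 56, 23, 90]   # arbitrary 8 x 1 input
by_equations = dct8_equations(X)
by_formula = dct8_direct(X)
print(max(abs(a - b) for a, b in zip(by_equations, by_formula)))
```

Sharing the four differences X(0) - X(7), X(1) - X(6), X(2) - X(5) and X(3) - X(4) across the odd outputs is what lets one add/sub stage feed several multipliers.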
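Generalized Equation Implementer Block
[Block diagram: the 16-bit inputs are paired as (Xin0, Xin7), (Xin1, Xin6), (Xin2, Xin5) and (Xin3, Xin4). Each pair feeds an Add/Sub Block, each Add/Sub Block feeds a Multiplier, and an Adder combines the Multiplier outputs into the Output.]
Add/Sub Block
[Block diagram: Input 1 and Input 2 feed an Adder and a Subtractor; a Mux controlled by Sel selects the Output.]
Multiplier Block
[Block diagram: a Multiplier takes the output of the Add/Sub Block and a cos coefficient and produces the Output.]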
DCT Controller Interface
RTL View
Results
The DCT core is implemented in HDL. It is synthesized and simulated
using Xilinx ISE 9.2i on a Spartan 3 (xc3s4000-5fg900).

Synthesis Report:
S.No.  Logic Utilization            Used   Available  Utilization
1      Number of Slices             685    27648      2%
2      Number of Slice Flip Flops   788    55296      1%
3      Number of 4 input LUTs       1238   55296      2%
4      Number of bonded IOBs        259    633        40%

Advanced HDL Synthesis Report:
S.No.  Component              Used
1      16x16-bit Multiplier   22
2      16-bit Adder           13
3      16-bit Subtractor      15
4      16-bit Register        50
Matlab Results
Verification

1. Open the DCT core code in Xilinx ISE 9.2i.
2. Go to the process window and double-click the synthesis button.
3. After successful synthesis, create a test bench.
4. Apply stimulus to the test bench.
5. Select Behavioral Simulation from the source window.
6. From the process window, run the Xilinx ISE Simulator.
7. The simulation starts and generates the output.
Simulation Results
Comparison between HDL Simulation and MATLAB Results
Conclusion
The 1-D DCT algorithm was written in Verilog HDL. It was then
synthesized and simulated successfully with Xilinx ISE 9.2i.

Eight 8 x 1 input samples were taken; the DCT was calculated with the DCT
core designed in Verilog HDL, and the same inputs were used to calculate
the DCT in MATLAB.

The latency of the implemented core is five clock cycles, and the throughput is one
result per clock cycle.

A comparison between the MATLAB output and the HDL simulation result
shows that the accuracy of the implemented core is 93.75%.
References
1. R. C. Gonzalez and R. E. Woods, Digital Image Processing, 3rd Edition, Pearson Education, 2008.

2. David Salomon, Data Compression: The Complete Reference, 2nd Edition, Springer-Verlag, 1998.

3. Y. C. Lim, J. B. Evans, and B. Liu, "Decomposition of binary integers into signed power-of-two terms," IEEE Trans. Circuits Syst., vol. 38, no. 6, pp. 667-672, 1991.

4. R. J. Clark, "Relation between the Karhunen-Loève and cosine transform," IEEE Proc., vol. 128, pt. F, no. 6, pp. 359-360, Nov. 1981.

5. W. Chen, C. H. Smith, and S. C. Fralick, "A fast computational algorithm for the Discrete Cosine Transform," IEEE Trans. Commun., vol. COM-25, pp. 1004-1009, Sep. 1977.

6. Peng Chungan, Cao Xixin, Yu Dunshan, and Zhang Xing, "A 250MHz optimized distributed architecture of 2D 8x8 DCT," 7th International Conference on ASIC, pp. 189-192, Oct. 2007.

7. Bian Li Jian, Zeng Xuan, Tong Jia Rong, and Liu Yue, "An Efficient VLSI Architecture for 2D-DCT Using Direct Method."

8. B. G. Lee, "A new algorithm to compute the discrete cosine transform," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-32, pp. 1243-1245, Dec. 1984.

References Contd
9. Vijay Kumar Sharma, K. K. Mahapatra, and Umesh C. Pati, "An Efficient Distributed Arithmetic based VLSI Architecture for DCT," National Institute of Technology, Rourkela, India-769008.

10. Sungwook Yu, "DCT implementation with Distributed Arithmetic," IEEE Transactions on Computers, vol. 50, no. 9, pp. 985-991, Sep. 2001, ISSN 0018-9340.

11. Trevor W. Fox, "Rapid Prototyping of Field Programmable Gate Array-Based Discrete Cosine Transform Approximations," EURASIP JASP, 2003, pp. 543-554.

12. Anthony Edward Nelson, "Implementation of image processing algorithms on FPGA hardware," thesis, Master of Science in Electrical Engineering, Graduate School of Vanderbilt University, May 2000.

13. Latha Pillai, "Video compression using DCT," Xilinx Application Note XAPP610 (Virtex-II series), v1.4, April 10, 2008.

14. K. R. Rao and P. Yip, Discrete Cosine Transform: Algorithms, Advantages and Applications, Academic Press, Inc., 1990.

15. Hassan EL-Banna, Alaa A. EL-Fattah, and Waleed Fakhr, "An Efficient Implementation of the 1D DCT using FPGA Technology," ICM 2003, Dec. 9-11, Cairo, Egypt.
