mance
High clock speed (>250 MHz in 0.18 µm ASIC technologies)
Applications
The DCT-FI can be used in a variety of multimedia and image processing applications,
including:
Office automation equipment (multifunction printers, digital copiers, etc.)
Digital cameras & camcorders
Video production, video conferencing
Display-projection systems
Surveillance systems
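Block Diagram
[Figure: DCT-FI block diagram. Stage 1 and Stage 2 transform units are connected through the Transpose Memory and sequenced by a Control Unit. Signals shown: clk, rst, clr, enable, forward, din, din_wen, din_rdy, dout, dout_wen, dout_rdy, sob, eob, dp_din, dp_dout, dp_waddr, dp_wen, dp_raddr, dp_ren.]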
January 2008
Functional Description
The forward DCT (DCT) is a transform that converts a signal into its constituent frequency components, represented by a set of coefficients. The inverse DCT (IDCT) reconstructs the original signal from its DCT coefficients. Applying the DCT to 2-dimensional signals, such as images, yields a 2-dimensional array of coefficients. The megafunction receives image samples or DCT coefficients and outputs DCT coefficients or image samples on a block-by-block basis, where each block has a size of either 8x8 or 16x16. The megafunction implements the DCT or IDCT over the input blocks by performing two 1-dimensional transforms, using row-column decomposition, as defined by the following formulas:

DCT:
$$Y_{uv} = \frac{2}{N} C_u C_v \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} X_{ij} \cos\frac{(2i+1)u\pi}{2N} \cos\frac{(2j+1)v\pi}{2N} = \sqrt{\frac{2}{N}}\, C_u \sum_{i=0}^{N-1} \left[ \sqrt{\frac{2}{N}}\, C_v \sum_{j=0}^{N-1} X_{ij} \cos\frac{(2j+1)v\pi}{2N} \right] \cos\frac{(2i+1)u\pi}{2N}$$

IDCT:
$$X_{ij} = \frac{2}{N} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} C_u C_v Y_{uv} \cos\frac{(2i+1)u\pi}{2N} \cos\frac{(2j+1)v\pi}{2N} = \sqrt{\frac{2}{N}} \sum_{u=0}^{N-1} C_u \left[ \sqrt{\frac{2}{N}} \sum_{v=0}^{N-1} C_v Y_{uv} \cos\frac{(2j+1)v\pi}{2N} \right] \cos\frac{(2i+1)u\pi}{2N}$$

where $C_u, C_v = 1/\sqrt{2}$ for $u, v = 0$ and $C_u, C_v = 1$ otherwise, $X_{ij}$ are the image samples, and $Y_{uv}$ are the DCT coefficients.

The intermediate results produced by the first 1-dimensional transform are stored in the Transpose Memory. The Transpose Memory is a dual-ported RAM capable of storing an entire 8x8 or 16x16 block resulting from the first (row) stage of the decomposition. The Transpose Memory is written in row-major order, and the second stage of processing reads it in column-major order, effectively transposing the intermediate results.

The number of bits used for each intermediate result stored in the Transpose Memory, as well as the number of bits used to represent each of the cosine coefficients, is configurable at synthesis time. This allows designers to make their own tradeoffs between accuracy and megafunction area. Furthermore, the bit-width of both the megafunction inputs and outputs is also configurable at synthesis time. The default settings for these synthesis parameters result in a DCT/IDCT implementation that satisfies the accuracy criteria of the JPEG standard.

The first DCT coefficient/image sample of a block will appear at the output 89 clock cycles after the first image sample/DCT coefficient of an input block has been fed to the megafunction.

Implementation Results
DCT-FI reference designs have been evaluated in a variety of technologies. The following sample Altera results were obtained after area optimization during synthesis and place and route, assuming that all megafunction I/Os are routed off-chip.

Altera Device             Logic         Frequency   Special Features
Apex 20KE EP20K200E-1     3,887 LEs     64 MHz      1 ESB
Apex-II EP2A15-C7         3,889 LEs     102 MHz     1 ESB
Cyclone EP1C6-C6          3,576 LEs     103 MHz     1 M4K
Stratix EP1S10-C5         2,105 LEs     136 MHz     1 M4K, 16 DSP blocks 9 bit
Cyclone-II EP2C5-C6       2,231 LEs     151 MHz     1 M4K, 16 DSP blocks 9 bit
Stratix-II EP2S15-C3      2,081 ALUTs   205 MHz     1 M4K, 16 DSP blocks 9 bit

Support
The megafunction as delivered is warranted against defects for ninety days from purchase. Thirty days of phone and email technical support are included, starting with the first interaction. Additional maintenance and support options are available.

Verification
The megafunction has been verified through extensive simulation and rigorous code coverage measurements. Embedded in numerous products, the megafunction is silicon proven in both FPGA and ASIC technologies.

Deliverables
The megafunction is available in ASIC (synthesizable HDL) and FPGA (netlist) forms, and includes everything required for successful implementation. The Altera version includes:
Post-synthesis EDIF netlist
A bit-accurate model (BAM) of the megafunction, including support for custom test vector generation
Sophisticated self-checking testbench (Verilog versions use Verilog 2001) supporting test vectors, expected results, and verification
RTL and gate-level (FPGAs) simulation scripts
Place and route scripts
Comprehensive user documentation, including detailed specifications and a system integration guide
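As an illustration of the row-column decomposition described under Functional Description, the following Python sketch computes a 2-dimensional DCT or IDCT as two 1-dimensional passes separated by a transposition, and compares the result with a direct evaluation of the double-sum formula. It is a floating-point software model only, not the megafunction's implementation or its bit-accurate model. The function names (`dct_1d`, `transform_2d`, `dct_2d_direct`) are hypothetical, and the final transposition that restores the input orientation is an assumption of this sketch.

```python
import math


def c(k):
    """Normalization factor: 1/sqrt(2) for index 0, 1 otherwise."""
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0


def dct_1d(x, inverse=False):
    """1-D DCT (or IDCT when inverse=True) of a length-N sequence."""
    n = len(x)
    scale = math.sqrt(2.0 / n)
    out = []
    for a in range(n):
        acc = 0.0
        for b in range(n):
            if inverse:
                # a is the sample index, b is the frequency index
                acc += c(b) * x[b] * math.cos((2 * a + 1) * b * math.pi / (2 * n))
            else:
                # a is the frequency index, b is the sample index
                acc += x[b] * math.cos((2 * b + 1) * a * math.pi / (2 * n))
        out.append(scale * acc if inverse else scale * c(a) * acc)
    return out


def transpose(block):
    """Swap rows and columns of a square block."""
    return [list(col) for col in zip(*block)]


def transform_2d(block, inverse=False):
    """2-D DCT/IDCT by row-column decomposition."""
    # Stage 1: transform each row; results are stored row by row.
    stage1 = [dct_1d(row, inverse) for row in block]
    # Transpose buffer: the stored block is read back column by column.
    columns = transpose(stage1)
    # Stage 2: transform each column of the intermediate result.
    stage2 = [dct_1d(col, inverse) for col in columns]
    # Transpose once more so the output has the same orientation as the input.
    return transpose(stage2)


def dct_2d_direct(block):
    """Direct evaluation of the 2-D DCT double sum, used as a reference."""
    n = len(block)
    out = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            acc = 0.0
            for i in range(n):
                for j in range(n):
                    acc += (block[i][j]
                            * math.cos((2 * i + 1) * u * math.pi / (2 * n))
                            * math.cos((2 * j + 1) * v * math.pi / (2 * n)))
            out[u][v] = (2.0 / n) * c(u) * c(v) * acc
    return out


if __name__ == "__main__":
    sample = [[(7 * i + 3 * j) % 16 - 8 for j in range(8)] for i in range(8)]
    fast = transform_2d(sample)
    ref = dct_2d_direct(sample)
    diff = max(abs(fast[u][v] - ref[u][v]) for u in range(8) for v in range(8))
    print("max difference between decomposed and direct DCT:", diff)
```

The two passes cost on the order of N multiplications per output value instead of N squared for the direct double sum, which is the usual motivation for the decomposition.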
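The accuracy versus area tradeoff mentioned under Functional Description can be explored in software before choosing synthesis parameters. The sketch below is a simplified model under stated assumptions: it rounds each cosine coefficient to a given number of fractional bits and reports the worst-case deviation of a 1-dimensional DCT from the floating-point result. It does not model the width of the intermediate results stored in the Transpose Memory, and it does not reflect the megafunction's actual arithmetic or parameter names; `frac_bits`, `quantize`, and `coefficient_error` are hypothetical.

```python
import math


def dct_coefficient(k, j, n):
    """Scaled cosine coefficient for frequency k and sample j of a length-n DCT."""
    c = 1.0 / math.sqrt(2.0) if k == 0 else 1.0
    return math.sqrt(2.0 / n) * c * math.cos((2 * j + 1) * k * math.pi / (2 * n))


def quantize(value, frac_bits):
    """Round a value to the nearest multiple of 2**-frac_bits."""
    step = 2.0 ** -frac_bits
    return round(value / step) * step


def dct_1d_fixed_coeff(x, frac_bits=None):
    """1-D DCT; coefficients are quantized when frac_bits is given."""
    n = len(x)
    out = []
    for k in range(n):
        acc = 0.0
        for j in range(n):
            coeff = dct_coefficient(k, j, n)
            if frac_bits is not None:
                coeff = quantize(coeff, frac_bits)
            acc += coeff * x[j]
        out.append(acc)
    return out


def coefficient_error(x, frac_bits):
    """Largest absolute output error caused by coefficient quantization."""
    exact = dct_1d_fixed_coeff(x)
    approx = dct_1d_fixed_coeff(x, frac_bits)
    return max(abs(a - e) for a, e in zip(approx, exact))


if __name__ == "__main__":
    row = [52, 55, 61, 66, 70, 61, 64, 73]
    for bits in (4, 8, 12, 16):
        print(bits, "fractional bits -> max error", coefficient_error(row, bits))
```

Because each coefficient is off by at most half a quantization step, the output error in this model is bounded by the sum of the absolute input values times 2**-(frac_bits + 1), so every extra coefficient bit halves the bound.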
CAST, Inc. 11 Stonewall Court
Woodcliff Lake, NJ 07677 USA
tel 201-391-8300 fax 201-391-8694
Copyright © CAST, Inc. 2008, All Rights Reserved.
Contents subject to change without notice.
Trademarks are the property of their respective owners.
This megafunction was developed by the multimedia experts at Alma Technologies S.A.