
3/30/2012

Vietnam National University, Ho Chi Minh City
University of Technology (Bach Khoa)
Faculty of Electrical and Electronics Engineering
Department of Electronics Engineering
VIDEO AND IMAGE PROCESSING
USING DSP AND FPGA
Chapter 4: Hardware architecture for Video
and image processing
4.1 Buffer and memory for video and image
4.2 Hardware design for video interface
4.3 Hardware design for image filtering
Reference
Donald G. Bailey, Design for Embedded Image
Processing on FPGAs, Wiley, 2011
Roger Woods, FPGA-based Implementation of Signal
Processing Systems, Wiley, 2008.
4.1 Buffer and memory for video and image
A typical hardware design for real-time video and
image processing
Video/image source
Video buffer
Video/image processing
Video/image sink
[Figure: block diagram with a video/image source, video buffers, a video/image processing block and a video/image sink]
Video/image Buffer
Video/image buffer: holds data so that it is transferred and arrives in time for the next stage.
There are two types:
Line buffer: stores several lines of an image
Frame buffer: stores several frames of a video
Video/image Buffer
Video/Image stream data
raster scan stream
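A raster-scan stream can be illustrated with a minimal Python sketch. The function name `raster_scan` and the small test image are assumptions made for illustration; the point is only the pixel order, left to right within a row and rows from top to bottom.

```python
def raster_scan(image):
    """Yield (row, col, pixel) in raster order: left to right, top to bottom."""
    for r, row in enumerate(image):
        for c, pixel in enumerate(row):
            yield r, c, pixel


# A hypothetical 2x3 image; the stream visits row 0 first, then row 1.
image = [[1, 2, 3],
         [4, 5, 6]]
stream = [pixel for _, _, pixel in raster_scan(image)]
```

A processing block that receives this stream sees one pixel per step and has no random access to the image, which is why line and frame buffers are needed.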
Line buffer
Hardware Implementation:
Shift-register
Block RAM
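The shift-register option can be modelled behaviourally in Python before writing HDL. This is a sketch under assumptions, not the slide's design: the names `LineBuffer` and `three_line_taps` are hypothetical, and the three lines are obtained from the incoming stream plus two one-line delays.

```python
from collections import deque


class LineBuffer:
    """Behavioural model of a one-line delay built as a shift register."""

    def __init__(self, width, fill=0):
        # One register stage per pixel of an image line.
        self.taps = deque([fill] * width, maxlen=width)

    def shift(self, pixel):
        """Push one pixel in and return the pixel delayed by one full line."""
        out = self.taps[0]        # oldest stage leaves the register
        self.taps.append(pixel)   # full deque drops the oldest stage
        return out


def three_line_taps(stream, width):
    """Yield (line n-2, line n-1, line n) pixels of the same column."""
    lb1, lb2 = LineBuffer(width), LineBuffer(width)
    for pixel in stream:
        d1 = lb1.shift(pixel)   # same column, previous line
        d2 = lb2.shift(d1)      # same column, two lines back
        yield d2, d1, pixel


taps = list(three_line_taps(range(1, 10), width=3))
```

With a 3-pixel-wide image streamed as 1..9, the seventh output is (1, 4, 7): the three vertically aligned pixels of column 0. A block RAM version would replace the deque with a memory and a wrapping address counter.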
[Figure: 3-line buffer holding Line 1, Line 2 and Line 3]
Line buffer design in Quartus II
Line buffer design in Matlab
Signal Processing Blockset
Frame buffer
Hardware Implementation
SRAM
DDR2-SDRAM
[Figure: frame buffer built from an SDRAM and an SDRAM controller, with write and read paths to the video/image processing unit]
Frame buffer design in Quartus II
Frame buffer design in Matlab
Assignments
1. Implement a 1x128 line buffer in VHDL/Verilog using shift registers
2. Implement a 1x256 line buffer in VHDL/Verilog using RAM
3. Implement a block to convert raster-scan data to 3x3 block data in VHDL/Verilog
4. Simulate a video processing system in Matlab with throughput data using a line buffer
5. Create a 3x64 line buffer using a Megafunction in Quartus II
6. Create a 3x128 line buffer using the Signal Processing Blockset in Matlab
4.2 Hardware Design for Video Interface
Video source
camera
TV source
Video file
Video sink
Monitor, LCD
Video storage
Video source controller
video decoder (decompression)
TV decoder
file reader
Video sink controller
VGA, DVI, HDMI controller
Memory write controller
Video source
Analog video
Component video has three separate signals, one for each of the red, green
and blue components.
S-video (also called Y/C) has two signals, one for the luminance and one for
the chrominance.
Composite video combines the luminance and chrominance together as a
single signal.
For NTSC, the signal consists of 525 scan lines (480 with actual video data), interlaced at a field rate of about 60 Hz (about 30 frames per second).
For both PAL and SECAM, the signal has 625 lines (576 active), interlaced at a field rate of 50 Hz (25 frames per second).
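The line counts translate directly into horizontal line rates. The sketch below assumes interlaced scanning, so the nominal 60 Hz and 50 Hz figures are field rates, and it assumes the colour NTSC frame rate of 30000/1001 (about 29.97) frames per second and 25 frames per second for PAL/SECAM. The function name `line_rate_hz` is hypothetical.

```python
from fractions import Fraction


def line_rate_hz(total_lines, frame_rate_hz):
    """Lines scanned per second = total lines per frame x frames per second."""
    return total_lines * frame_rate_hz


# Assumed frame rates: two interlaced fields make one frame.
ntsc_line_rate = line_rate_hz(525, Fraction(30000, 1001))  # about 15.734 kHz
pal_line_rate = line_rate_hz(625, 25)                      # 15.625 kHz
```

Both systems end up with a line period of roughly 64 microseconds, which sets the time budget for any per-line processing on the decoded video.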
Video source
Camera Link
Camera Link is based on National Semiconductor's Channel Link technology.
A single Channel Link connection provides one-way transmission of 28 data
signals and an associated clock over five LVDS pairs.
One pair is used for the clock, while the 28 data signals are multiplexed
over the remaining four LVDS pairs.
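Multiplexing 28 signals over four pairs means each pair carries 7 bits per clock period. The Python sketch below illustrates that 7:1 serialisation. The lane assignment (lane k carries bits 7k to 7k+6) is a hypothetical mapping chosen for simplicity, not the bit order defined by Channel Link.

```python
DATA_SIGNALS = 28   # parallel data signals per Channel Link connection
DATA_PAIRS = 4      # LVDS pairs carrying data (the fifth carries the clock)


def bits_per_pair_per_clock():
    """Serialisation ratio on each data pair."""
    return DATA_SIGNALS // DATA_PAIRS


def serialize(word28):
    """Split a 28-bit word into four 7-bit lane words (hypothetical mapping)."""
    return [(word28 >> (7 * lane)) & 0x7F for lane in range(DATA_PAIRS)]


def deserialize(lanes):
    """Rebuild the 28-bit word from the four 7-bit lane words."""
    word = 0
    for lane, bits in enumerate(lanes):
        word |= (bits & 0x7F) << (7 * lane)
    return word
```

The serial bit rate on each pair is therefore seven times the pixel clock, which is why LVDS signalling is used.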
Video interface in DE2 Kit
Video source: from the ADV7181B TV decoder
Video sink: VGA controller with the ADV7123 video DAC
Video source/sink in Matlab
Video source: reads video frames from a compressed multimedia file
Video sink: displays images or video streams, or writes video to a multimedia file
Bayer Pattern
Most single-chip cameras obtain a color image by filtering the
light falling on each pixel using a color filter array
The most common pattern is the Bayer pattern
To form a full colour image, an interpolation process, sometimes called demosaicing, has to be performed.
The simplest form of filtering is nearest neighbor interpolation.
For the red and blue components, this is accomplished by duplicating the
available component within each 2x2 block
Within the green component, the missing pixels may be obtained either from
above or to the left
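Nearest neighbour interpolation can be sketched in a few lines of Python. The sketch assumes a GRBG layout for each 2x2 block (green and red on the top row, blue and green on the bottom row) and an image with even dimensions; the layout and the function name `demosaic_nearest` are assumptions, since the slide does not fix them.

```python
def demosaic_nearest(raw):
    """Nearest neighbour demosaic of a Bayer image (assumed GRBG layout).

    raw is a list of rows with even height and width.
    Returns a same-sized image of (R, G, B) tuples.
    """
    height, width = len(raw), len(raw[0])
    rgb = [[None] * width for _ in range(height)]
    for y in range(0, height, 2):
        for x in range(0, width, 2):
            g_top = raw[y][x]          # green sample, top-left
            red = raw[y][x + 1]        # the only red sample in the block
            blue = raw[y + 1][x]       # the only blue sample in the block
            g_bottom = raw[y + 1][x + 1]
            # Red and blue are duplicated across the whole 2x2 block.
            rgb[y][x] = (red, g_top, blue)
            rgb[y][x + 1] = (red, g_top, blue)         # green from the left
            rgb[y + 1][x] = (red, g_top, blue)         # green from above
            rgb[y + 1][x + 1] = (red, g_bottom, blue)
    return rgb


block = demosaic_nearest([[10, 200],
                          [50, 20]])
```

In hardware the same duplication needs only one line of storage, because every missing sample comes from the current or the previous row.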
Bayer Pattern Processing
Edge directed color filter demosaic
The horizontal and vertical edge strengths are estimated:
Then, these are used to weight the horizontal and vertical
averages:
Bayer Pattern Processing
Direct implementation of the demosaicing scheme
Video Sink
Video display controller
The image is sent to a display in a raster scanned format. At the end of
each line of active video data there is a blanking period
During this blanking period, a synchronisation pulse is used to indicate
to the monitor to begin the next line.
For CRT monitors, the blanking period is typically 20% of the scan time
for each row.
[Figure: Horizontal video timing]
Video Sink
VGA output:
Video Graphics Array (VGA) refers specifically to the display hardware
first introduced with the IBM PS/2 line of computers in 1987
The VGA output connector provides the display with analogue video signals for each of the red, green, and blue components.
These require a high speed D/A converter for each channel.
For simple displays, with only one or two bits per channel, these
signals may be provided reasonably simply using a resistor divider
chain.
However, for image display the simplest approach is to use a video
digital to analogue chip, which is available from several
manufacturers.
Video Sink
VGA Output
Parameter                   Value      Unit
Pixel clock frequency       25.175     MHz
Horizontal frequency        31.4686    kHz
Horizontal pixels           640
Horizontal sync polarity    Negative
Total time for each line    31.77      µs
Front porch (A)             0.94       µs
Sync pulse length (B)       3.77       µs
Back porch (C)              1.89       µs
Active video (D)            25.17      µs
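The timing values can be cross-checked with a short calculation, reading the durations as microseconds. The sketch below only uses the numbers from the table; the variable names are assumptions.

```python
PIXEL_CLOCK_MHZ = 25.175
H_FREQ_KHZ = 31.4686
FRONT_US, SYNC_US, BACK_US, ACTIVE_US = 0.94, 3.77, 1.89, 25.17

# A + B + C + D should equal the total time for each line.
line_time_us = round(FRONT_US + SYNC_US + BACK_US + ACTIVE_US, 2)

# Line period implied by the horizontal frequency, in microseconds.
period_us = 1000 / H_FREQ_KHZ

# Pixel clock cycles per line: the modulus of the horizontal counter.
clocks_per_line = round(PIXEL_CLOCK_MHZ * 1000 / H_FREQ_KHZ)
```

The four intervals add up to the 31.77 µs line time, and the pixel clock divided by the horizontal frequency gives 800 clock cycles per line, so a VGA controller for this mode can be built around a horizontal counter that wraps at 800.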
Video Sink
DVI Output
Digital Visual Interface (DVI) is a video display interface developed by
Digital Display Working Group (DDWG).
DVI's digital video transmission format is based on PanelLink, a serial
format devised by Silicon Image Inc.
PanelLink uses transition minimized differential signaling (TMDS), a
high-speed serial link developed by Silicon Image.
Like modern analog VGA connectors, the DVI connector includes pins
for the display data channel (DDC).
A newer version of DDC called DDC2 allows the graphics adapter to
read the monitor's extended display identification data (EDID).
Video Sink
The DVI connector on a device is therefore given one of three names, depending on which signals it implements:
DVI-D (digital only, both single-link and dual-link)
DVI-A (analog only)
DVI-I (integrated digital and analog)
Video Sink
HDMI Output
HDMI (High-Definition Multimedia Interface) is a compact audio/video
interface for transferring encrypted uncompressed digital audio/video
data
The maximum pixel clock rate for HDMI 1.0 was 165 MHz, which was sufficient to support 1080p and WUXGA (1920×1200) at 60 Hz.
HDMI 1.3 increased that to 340 MHz, which allows for higher resolution (such as WQXGA, 2560×1600) across a single digital link.
An HDMI connection can either be single-link (type A/C) or dual-link (type
B) and can have a video pixel rate of 25 MHz to 340 MHz (for a single-link
connection) or 25 MHz to 680 MHz (for a dual-link connection).
Video formats with rates below 25 MHz (e.g., 13.5 MHz for 480i/NTSC) are
transmitted using a pixel-repetition scheme.
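Pixel repetition can be sketched as follows. The rule used here (the smallest integer repeat count that reaches 25 MHz) is a simplifying assumption for illustration, not the rule set of the HDMI specification, and the function names are hypothetical.

```python
import math

MIN_PIXEL_RATE_MHZ = 25.0   # lower limit of the link quoted above


def repetition_factor(native_rate_mhz):
    """Smallest integer repeat count that lifts the rate to at least 25 MHz."""
    return max(1, math.ceil(MIN_PIXEL_RATE_MHZ / native_rate_mhz))


def repeat_pixels(line, factor):
    """Send every pixel of a line `factor` times in a row."""
    return [pixel for pixel in line for _ in range(factor)]


factor_480i = repetition_factor(13.5)    # each pixel is sent twice
link_rate_mhz = 13.5 * factor_480i       # 27 MHz on the link
```

For 480i the 13.5 MHz stream is doubled to 27 MHz, and the receiver discards the repeated pixels.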
4.3 Hardware design for image filtering
Window filter
The shaded pixels represent the input window located at X that
produces the filtered value for the corresponding location in the
output image.
Each possible window position generates the corresponding pixel
value in the output image.
Window Filter
Scanning the window through the image is equivalent to streaming the
image through the window.
The row buffers can either be placed in parallel with the window or, since
the window consists of a set of shift registers, in series with the window
The parallel row buffers need to be slightly longer (the full width of the
image) but have the advantage that they are kept independent of the
window and filter.
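Streaming the image through the window can be modelled in Python with parallel row buffers feeding a small array of shift registers. This is a behavioural sketch under assumptions: the name `stream_windows` is hypothetical, the buffers start filled with zeros, and only fully valid windows are produced (no edge extension).

```python
from collections import deque


def stream_windows(image, w=3):
    """Yield (centre_row, centre_col, window) for each valid w x w window."""
    width = len(image[0])
    half = w // 2
    # w-1 row buffers, each one full image row long (parallel arrangement).
    row_buffers = [deque([0] * width, maxlen=width) for _ in range(w - 1)]
    # The window itself: w rows of w-stage shift registers.
    window = [deque([0] * w, maxlen=w) for _ in range(w)]
    for r, row in enumerate(image):
        for c, pixel in enumerate(row):
            column = [pixel]
            data = pixel
            for buf in row_buffers:
                delayed = buf[0]   # same column, one row earlier
                buf.append(data)
                data = delayed
                column.append(delayed)
            column.reverse()       # oldest row at the top of the window
            for register, value in zip(window, column):
                register.append(value)
            if r >= w - 1 and c >= w - 1:
                yield r - half, c - half, [list(reg) for reg in window]


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12]]
windows = list(stream_windows(image))
```

Each input pixel is read exactly once, yet every 3x3 neighbourhood becomes available one clock after its bottom-right pixel arrives.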
Window Filter
Partially unrolling horizontally, streaming multiple pixels each clock cycle
Partially unrolling the loop vertically, streaming in multiple rows in parallel
Window Filter
Filtering with single-port row buffers
Window Filter
Edge extension schemes
Border pixel duplication
Mirroring with duplication
Mirroring without duplication.
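The three schemes differ only in how an out-of-range index is mapped back into the image. A minimal Python sketch, with hypothetical names `extend_index` and `extend_row`, and assuming the border is narrower than the row so that one reflection is enough:

```python
def extend_index(i, n, scheme):
    """Map index i (possibly outside 0..n-1) to a valid index of a length-n row."""
    if 0 <= i < n:
        return i
    if scheme == "duplicate":            # border pixel duplication
        return 0 if i < 0 else n - 1
    if scheme == "mirror_dup":           # mirroring with duplication
        return -i - 1 if i < 0 else 2 * n - 1 - i
    if scheme == "mirror_nodup":         # mirroring without duplication
        return -i if i < 0 else 2 * n - 2 - i
    raise ValueError("unknown scheme: " + scheme)


def extend_row(row, border, scheme):
    """Return the row extended by `border` pixels on each side."""
    n = len(row)
    return [row[extend_index(i, n, scheme)] for i in range(-border, n + border)]


row = list("abcde")
duplicated = "".join(extend_row(row, 2, "duplicate"))
mirrored_dup = "".join(extend_row(row, 2, "mirror_dup"))
mirrored_nodup = "".join(extend_row(row, 2, "mirror_nodup"))
```

For the row `abcde` and a border of two pixels the results are `aaabcdeee`, `baabcdeed` and `cbabcdedc`. The same mapping applies to row indices for the top and bottom borders.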
[Figure: Border pixel duplication, mirroring with duplication, and mirroring without duplication]
Window Filter
Part of priming the filter window includes loading the extended input
image (beyond the borders of the original image).
Consider the case of boundary replication (border pixel duplication): the first and last rows must be loaded (W+1)/2 times, and similarly the first and last pixels of each row.
Rather than explicitly reload them, it is possible to reuse the loaded
values
As the first pixel of each row is streamed in, the value is loaded into all of
the shift register stages on the input.
Window Filter Architecture
Direct implementation of linear filtering
When the filter is symmetric, the input pixel values that share a coefficient can be added prior to the multiplication.
Alternatively, since each input pixel is multiplied by a number of
different coefficients, the input pixel value can be multiplied by each
unique coefficient once, with the results cached until needed
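Adding before multiplying can be checked with a one-dimensional example. The sketch assumes a symmetric, odd-length filter; the function names and the 1-4-6-4-1 coefficients are chosen for illustration only.

```python
def direct_fir(window, coeffs):
    """One multiplication per tap."""
    return sum(p * c for p, c in zip(window, coeffs))


def symmetric_fir(window, half_coeffs):
    """Symmetric odd-length filter: add mirrored pixels first, then multiply.

    half_coeffs holds the coefficients from the outer tap to the centre tap.
    """
    n = len(window)
    mid = n // 2
    acc = window[mid] * half_coeffs[mid]
    for k in range(mid):
        acc += (window[k] + window[n - 1 - k]) * half_coeffs[k]
    return acc


pixels = [1, 2, 3, 4, 5]
full = direct_fir(pixels, [1, 4, 6, 4, 1])   # 5 multiplications
folded = symmetric_fir(pixels, [1, 4, 6])    # 3 multiplications
```

Both versions return 48, but the folded form needs three multipliers instead of five, at the cost of two extra adders in front of them.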
Window Filter Architecture
Pipelined linear filter
Note the different order of coefficients because the transpose filter
structure is used for each row.
Window Filter Architecture
Transposed implementation of linear filtering.
Note the changed coefficient order
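The equivalence of the direct and transposed structures can be checked with a one-dimensional Python model. This is a sketch under assumptions: the function names are hypothetical and the registers start at zero. In the transposed form each input is multiplied by all coefficients at once and the partial sums travel through the registers, which is why the coefficients appear in the opposite order along the delay chain.

```python
def direct_form(samples, h):
    """y[n] = sum over k of h[k] * x[n-k], with x = 0 before the first sample."""
    out = []
    for n in range(len(samples)):
        out.append(sum(h[k] * samples[n - k] for k in range(len(h)) if n - k >= 0))
    return out


def transposed_form(samples, h):
    """Same filter with registers holding partial sums instead of pixels."""
    state = [0] * (len(h) - 1)          # partial-sum registers
    out = []
    for x in samples:
        out.append(h[0] * x + state[0])
        for i in range(len(state) - 1):
            state[i] = h[i + 1] * x + state[i + 1]
        state[-1] = h[-1] * x
    return out


x = [1, 0, 0, 2]
h = [1, 2, 3]
y_direct = direct_form(x, h)
y_transposed = transposed_form(x, h)
```

Both calls give `[1, 2, 3, 2]`. The transposed form has a register after every adder, so it is pipelined without extra latency registers.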
Window Filter Architecture
A separable two-dimensional filter may be decomposed into a cascade of two one-dimensional filters.
This reduces the number of both multiplications and associated additions from W² to 2W.
Note that the column filter does not require a separate pass through the
image.
It can be streamed by replacing the pixel delays within the window by row
buffers
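The decomposition can be verified numerically. The Python sketch below assumes a separable kernel formed as the outer product of the coefficients 1-2-1 and computes only fully valid output pixels; all function names are hypothetical.

```python
def filter2d_valid(image, kernel):
    """Direct W x W filtering (correlation) over the fully valid region."""
    kh, kw = len(kernel), len(kernel[0])
    h, w = len(image), len(image[0])
    return [[sum(image[y + j][x + i] * kernel[j][i]
                 for j in range(kh) for i in range(kw))
             for x in range(w - kw + 1)]
            for y in range(h - kh + 1)]


def row_filter(image, coeffs):
    """1-D filter along each row (pixel delays)."""
    k = len(coeffs)
    return [[sum(row[x + i] * coeffs[i] for i in range(k))
             for x in range(len(row) - k + 1)]
            for row in image]


def column_filter(image, coeffs):
    """1-D filter down each column (pixel delays replaced by row buffers)."""
    k = len(coeffs)
    return [[sum(image[y + j][x] * coeffs[j] for j in range(k))
             for x in range(len(image[0]))]
            for y in range(len(image) - k + 1)]


coeffs = [1, 2, 1]
kernel = [[a * b for b in coeffs] for a in coeffs]   # 3x3 outer product
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
direct = filter2d_valid(image, kernel)               # 9 multiplications per pixel
cascade = column_filter(row_filter(image, coeffs), coeffs)   # 6 per pixel
```

The two results are identical, while the cascade uses 2W = 6 multiplications per output pixel instead of W² = 9.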
[Figure: Converting a row filter to a column filter by replacing pixel delays with row buffers]
Window Filter Architecture
If any of the coefficients is a power of two, the corresponding
multiplication is a trivial shift of the corresponding pixel value.
Such shifts can be implemented without logic
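The same idea extends to coefficients that are sums of a few powers of two: each set bit of the coefficient contributes one shifted copy of the pixel, and only the additions cost logic. A minimal Python sketch for non-negative integer coefficients, with hypothetical function names:

```python
def shift_add_multiply(x, coeff):
    """Multiply x by a non-negative integer coefficient using shifts and adds."""
    acc = 0
    shift = 0
    while coeff:
        if coeff & 1:
            acc += x << shift   # a shift is only wiring in hardware
        coeff >>= 1
        shift += 1
    return acc


def adders_needed(coeff):
    """Adders required: one fewer than the number of set bits."""
    return max(bin(coeff).count("1") - 1, 0)


product = shift_add_multiply(7, 10)   # 10 = 8 + 2, so (7 << 3) + (7 << 1)
```

A coefficient of 8 needs no adder at all, while 10 needs a single adder, which is how an adder-only filter can be built from suitably quantised coefficients.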
[Figure: Adder-only implementation of a 21x1 Gaussian filter with σ = 3]
Window Filter Architecture
The equal weights make
the filter amenable to a
recursive implementation:
[Figure: Efficient implementation of a W x W box average filter]
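One common recursive form is the running sum S[n] = S[n-1] + x[n] - x[n-W], which needs one addition and one subtraction per pixel regardless of W. The Python sketch below shows the one-dimensional case; it is an assumed illustration rather than the circuit in the figure, and the function name is hypothetical.

```python
def running_sum_filter(samples, w):
    """Sliding sum of the last w samples using S[n] = S[n-1] + x[n] - x[n-w]."""
    out = []
    total = 0
    for n, x in enumerate(samples):
        total += x                    # newest sample enters the window
        if n >= w:
            total -= samples[n - w]   # oldest sample leaves the window
        if n >= w - 1:
            out.append(total)         # window is full from here on
    return out


sums = running_sum_filter([1, 2, 3, 4, 5], 3)
```

A W x W box average can then be obtained by applying the running sum along the rows and again down the columns (with row buffers as the delays) and dividing by W², which is a shift when W² is a power of two.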
Video/image processing architecture for debugging
It is much harder to debug a failing algorithm in hardware than it is in software.
It may be important to view the image after any of the operations, not just
the output.
Note that if stream processing is used, it is not necessary to have a frame
buffer on the output; the output is generated on-the-fly and routed to the
display driver.
However, non-streamed operations or non-image based information may
require a frame buffer to display the results.
Assignments
1. Implement Border Extension Block in Verilog/VHDL
2. Implement Mean Filter in Verilog/VHDL
3. Implement Gaussian Filter in Verilog/VHDL
4. Implement Prewitt Filter in Verilog/VHDL
5. Implement Laplacian Filter in Verilog/VHDL
6. Design VGA controller in Verilog/VHDL
7. Design Prewitt Filter for running on DE2 Kit