
The purpose of this laboratory is to show a simple image processing application that takes as input a YUV420 planar image and applies various filters to it.
1. The YUV420 planar image format.
The YUV color space is one of the standard image formats output by a
wide variety of camera sensors. It encodes a color image or video taking
human perception into account. The YUV model defines a color space in
terms of one luma (Y) and two chrominance (UV) components. There are
several sub-formats of the YUV color space, but in this laboratory we'll
focus on the planar YUV420. Further reading can be found here:
http://en.wikipedia.org/wiki/YUV
A YUV420 image has its pixels coded on 8 bits, so every pixel can take values between 0 and 255. This corresponds to a grayscale representation ranging from black (0) to white (255), and it applies to both the luma and the chroma planes. In the 420 format, each chroma pixel corresponds to one 2x2 (4-pixel) block in the luma plane. This means that for a HEIGHT x WIDTH image, the luma plane will have HEIGHT x WIDTH pixels and each of the chroma planes will have (HEIGHT x WIDTH)/4 pixels. The graphical representation of this is as follows:
Figure 1. YUV420 planar format
(http://en.wikipedia.org/wiki/File:Yuv420.svg)

In this manner, a HEIGHT x WIDTH image with 8-bit pixels will occupy in memory a total space of (HEIGHT x WIDTH) * 1.5 bytes. We'll come back to this further on.
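
To make this concrete, here is a minimal sketch in C of how such a frame could be allocated and its three planes addressed. The buffer and plane names, as well as the fixed WIDTH and HEIGHT, are assumptions made for this example only:

#include <stdlib.h>

#define WIDTH  640
#define HEIGHT 480

/* Total size of a YUV420 planar frame: one full-resolution luma plane
 * plus two quarter-resolution chroma planes = 1.5 bytes per pixel. */
#define FRAME_SIZE ((WIDTH * HEIGHT * 3) / 2)

int main(void)
{
    unsigned char *yuv_buffer = malloc(FRAME_SIZE);
    if (yuv_buffer == NULL)
        return 1;

    /* The three planes are stored one after another in memory. */
    unsigned char *y_plane = yuv_buffer;                           /* WIDTH x HEIGHT bytes */
    unsigned char *u_plane = y_plane + WIDTH * HEIGHT;             /* (WIDTH/2) x (HEIGHT/2) */
    unsigned char *v_plane = u_plane + (WIDTH / 2) * (HEIGHT / 2); /* (WIDTH/2) x (HEIGHT/2) */

    /* Luma pixel at (x, y) and the chroma pair covering its 2x2 block. */
    int x = 10, y = 20;
    unsigned char luma = y_plane[y * WIDTH + x];
    unsigned char cb   = u_plane[(y / 2) * (WIDTH / 2) + (x / 2)];
    unsigned char cr   = v_plane[(y / 2) * (WIDTH / 2) + (x / 2)];

    (void)luma; (void)cb; (void)cr;
    free(yuv_buffer);
    return 0;
}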

2. Filters.
Image processing filters are kernels that are applied to each pixel of an input image in order to obtain a desired effect on the output. Considering that an image in memory is represented by a WIDTH x HEIGHT matrix, a 3x3 kernel applied to an arbitrary pixel takes into account the value of that particular pixel and the values of the pixels adjacent to it. Applying the kernel to all pixels in the image, we obtain the desired effect on the whole image.
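
As a generic sketch of this idea (the function name apply_kernel_3x3, the integer weights with a normalization divisor, and the simple border handling are assumptions of this example, not necessarily the approach of the laboratory application), a 3x3 kernel could be applied to an 8-bit plane as follows:

/* Apply a 3x3 kernel (with a normalization divisor) to an 8-bit plane.
 * in and out are width x height planes; border pixels are copied as-is. */
void apply_kernel_3x3(const unsigned char *in, unsigned char *out,
                      int width, int height,
                      const int weights[3][3], int divisor)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            if (x == 0 || y == 0 || x == width - 1 || y == height - 1) {
                out[y * width + x] = in[y * width + x]; /* skip the border */
                continue;
            }
            int sum = 0;
            for (int ky = -1; ky <= 1; ky++)
                for (int kx = -1; kx <= 1; kx++)
                    sum += in[(y + ky) * width + (x + kx)] * weights[ky + 1][kx + 1];
            sum /= divisor;
            if (sum < 0)   sum = 0;   /* clamp to the 8-bit range */
            if (sum > 255) sum = 255;
            out[y * width + x] = (unsigned char)sum;
        }
    }
}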
2.1 The Gaussian filter

The first filter that we'll talk about is the Gaussian filter. This uses a Gaussian kernel that replaces every pixel in the input image with a weighted average of itself and its neighbors. Various implementations of this filter exist, according to the desired strength of the effect. The Gaussian filter is used to create a blurred image, because every output pixel is based on the values of its neighbors. Typically we can have 3x3, 5x5, 7x7 and larger Gaussian kernels; the bigger the kernel, the more accentuated the blur effect. The weights of the kernel are, again, subject to discussion, but the main thing to take into account is that they decrease radially, the biggest weight belonging, of course, to the center pixel. Let's take, for example, the simplest kernel:
    2   8   2
    8  16   8
    2   8   2

Figure 2. A 3x3 Gaussian kernel


Applying this kernel to a pixel means centering it on the pixel in question and multiplying all 9 pixels by their respective weights. The resulting sum is then divided by the total weight of the kernel (in this example the total weight is 56) to obtain the output pixel. Remember that other weights can be used, and feel free to try different values.
So for example, for a random pixel with coordinates (x,y) the
filter is applied as follows:
out_pix[x,y] = (in_pix[x-1,y-1]*2 + in_pix[x,y-1]*8  + in_pix[x+1,y-1]*2 +
                in_pix[x-1,y]*8   + in_pix[x,y]*16   + in_pix[x+1,y]*8   +
                in_pix[x-1,y+1]*2 + in_pix[x,y+1]*8  + in_pix[x+1,y+1]*2) / 56

Repeating this calculation for all pixels in the image, the output will be a blurred copy of the original image. We'll see exactly how this looks for an image saved in memory in the application part of the laboratory.
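
As an illustration only (not necessarily the exact code of the laboratory application), here is how this blur could be applied to the luma plane, reusing the hypothetical apply_kernel_3x3 helper sketched in Section 2:

/* 3x3 Gaussian kernel from Figure 2; the weights sum to 56. */
static const int gaussian_3x3[3][3] = {
    { 2,  8, 2 },
    { 8, 16, 8 },
    { 2,  8, 2 }
};

/* Blur an 8-bit luma plane: every output pixel becomes the weighted
 * average of its 3x3 neighborhood, divided by the total weight of 56. */
void gaussian_blur_luma(const unsigned char *in_y, unsigned char *out_y,
                        int width, int height)
{
    apply_kernel_3x3(in_y, out_y, width, height, gaussian_3x3, 56);
}

The same call could be repeated on the two chroma planes with width/2 and height/2 if the chromas are to be blurred as well.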
2.2 The Sobel operator
The Sobel operator is used in image processing, particularly within edge detection algorithms. Technically, it's a discrete differentiation operator, used to calculate approximations of the derivatives for each pixel - one for horizontal changes and one for vertical changes.
Further reading can be found here: http://en.wikipedia.org/wiki/Sobel_filter

The operator consists of two 3x3 kernels that are applied to each pixel. We'll define them as follows:
        | -1  0  +1 |                   | -1  -2  -1 |
Sx  =   | -2  0  +2 | * A   and   Sy =  |  0   0   0 | * A,
        | -1  0  +1 |                   | +1  +2  +1 |

where Sx and Sy are two images which at each point contain the horizontal and vertical derivative approximations, and A is the original 3x3 block of the image, centered on the pixel in question.
This can be interpreted as follows:
Sx[x,y] = (in_pix[x-1,y-1]*(-1) + in_pix[x,y-1]*0 + in_pix[x+1,y-1]*1 +
           in_pix[x-1,y]*(-2)   + in_pix[x,y]*0   + in_pix[x+1,y]*2   +
           in_pix[x-1,y+1]*(-1) + in_pix[x,y+1]*0 + in_pix[x+1,y+1]*1),

and accordingly for Sy[x,y].

At each point in the image, the resulting gradient approximations can be combined to give the gradient magnitude, using:

out_pix = sqrt(Sx^2 + Sy^2)

Doing these calculations on each pixel in the original image, the output will be an image that shows how each pixel's derivative looks - more formally, an image which highlights the edges of the original, because the edges have the biggest derivatives. This can be explained by the fact that an edge means a sudden change in pixel intensity, and thus a bigger derivative. Things will hopefully be clearer in the application part. The Sobel operator can also be used to calculate the direction of the gradient (or the direction of the edge) using the formula out_dir = arctan(Sy / Sx), but this will not be discussed in this laboratory.
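
As an illustration only (not necessarily the exact code used in the laboratory application), here is a sketch of the Sobel gradient magnitude computed on an 8-bit luma plane; the function name sobel_magnitude and the zeroed border are assumptions of this example:

#include <math.h>

/* Compute the Sobel gradient magnitude of an 8-bit plane.
 * in and out are width x height planes; border pixels are set to 0. */
void sobel_magnitude(const unsigned char *in, unsigned char *out,
                     int width, int height)
{
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            if (x == 0 || y == 0 || x == width - 1 || y == height - 1) {
                out[y * width + x] = 0;  /* no full 3x3 neighborhood here */
                continue;
            }
            /* Horizontal derivative approximation (the Sx kernel). */
            int sx = -in[(y-1)*width + (x-1)]  + in[(y-1)*width + (x+1)]
                     - 2*in[y*width + (x-1)]   + 2*in[y*width + (x+1)]
                     - in[(y+1)*width + (x-1)] + in[(y+1)*width + (x+1)];
            /* Vertical derivative approximation (the Sy kernel). */
            int sy = -in[(y-1)*width + (x-1)] - 2*in[(y-1)*width + x] - in[(y-1)*width + (x+1)]
                     + in[(y+1)*width + (x-1)] + 2*in[(y+1)*width + x] + in[(y+1)*width + (x+1)];
            /* Gradient magnitude, clamped to the 8-bit range. */
            int mag = (int)sqrt((double)(sx*sx + sy*sy));
            if (mag > 255)
                mag = 255;
            out[y * width + x] = (unsigned char)mag;
        }
    }
}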
3. The C Application
