
Image Compression

Need for compression


Proliferation of multimedia data
Images form bulk of this data

Demand for access via Internet, transfer across telecommunication network
Examples:
1 hour of TV-quality movie requires (640 x 480 x 30 fps) x 24 b/pixel = .. Mbps bandwidth for transmission and .. Gb for storage
If it is HDTV: (1280 x 720 x 60 fps) x 24 b/pixel = .. Mbps and .. Gb for storage
A 14 x 17 in² X-ray scanned @ 70 µm requires .. storage
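The raw (uncompressed) figures in the examples above follow from width x height x frame rate x bits per pixel. A minimal sketch of that arithmetic, assuming 1 Mbps = 10^6 bit/s and 1 Gb = 10^9 bits; the function name `raw_video_bits` is illustrative and not from the slides:

```python
def raw_video_bits(width, height, fps, bits_per_pixel, seconds=1):
    """Number of bits needed for `seconds` of uncompressed video."""
    return width * height * fps * bits_per_pixel * seconds


# TV-quality and HDTV parameters as stated in the examples.
for name, (w, h, fps) in {"TV": (640, 480, 30), "HDTV": (1280, 720, 60)}.items():
    per_second = raw_video_bits(w, h, fps, 24)
    per_hour = raw_video_bits(w, h, fps, 24, seconds=3600)
    print(f"{name}: {per_second / 1e6:.1f} Mbps, {per_hour / 1e9:.1f} Gb per hour")
```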

What is image compression?


Processing images in order to reduce their storage requirement
I.e., given an image x, find y such that y needs fewer bits to store than x

Desirable features
minimal loss of information and perceptual quality; highest reduction in amount of data

Related fields, terminologies


Data (signal) compression
Information coding
Equivalent terms: compression / coding; noiseless / lossless; noisy / lossy

CODEC: the complete pipeline of coder and decoder

Types of compression
Lossy techniques
decompressed image is not identical to the original, though it may appear visually identical

Lossless (coding) techniques


decompressed image is identical to original

Basic strategy for compression


Reduce different redundancies in images
- Perceptual
- Spatial
- Statistical
- Structural

Perceptual redundancy
Refers to limitations in the human visual system
Eye has greater sensitivity to distortion in
- smoother regions
- dark regions
- luminance (I) component

Regions other than these can therefore be represented with less data.

Spatial redundancy
Refers to correlation among neighbouring pixels in an image
Ex. Neighbouring pixels in an object have (by and large) similar brightness and motion values

Predictive coding can exploit this point

Statistical redundancy
Not all brightness values occur in an image with equal frequency
i.e., the image histogram is not always uniform

Nonuniform bit allocation can be used to exploit this point

Structural redundancy
Images are 2-D projections of 3-D objects
An image can be represented in terms of structural image models
Ex. a texture image coded as {C_i, T_i}
- C_i: contours of regions
- T_i: texture in each region

Definitions
Entropy: a measure of the disorder in the image and hence of its information content
A high degree of order means fewer bits are needed for representation
- Ex. a line image can be represented by the length of the line in pixels and the pixel value

$$H = -\sum_{i=0}^{2^m - 1} p_i \log_2 p_i$$

p_i ~ probability of the ith grey value in an image; i = 0, 1, .., 2^m - 1

H in bits/greylevel is the lowest bit rate required for storage
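The entropy definition above can be evaluated directly from an image histogram. A minimal sketch, assuming the image is given as a flat list of grey values and that p_i is estimated by relative frequency; `entropy_bits` is an illustrative name:

```python
import math
from collections import Counter


def entropy_bits(pixels):
    """First-order entropy H in bits/greylevel of a flat list of grey values."""
    counts = Counter(pixels)
    total = len(pixels)
    # Grey values that never occur contribute nothing to the sum.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


two_level = [0, 0, 1, 1]      # two equally likely grey values
four_level = [0, 1, 2, 3]     # four equally likely grey values
print(entropy_bits(two_level), entropy_bits(four_level))
```

A highly ordered image (few distinct values, very uneven histogram) gives a small H, which is the point made on the slide.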

Definitions contd.
Compression ratio: $C = b_o / b_c$
b_o and b_c are the bit rates for the original and compressed images
$C_{max} = m / H$

Mean squared error (MSE)


a measure of information loss in compression
For an image x and decompressed image x_d (size M x N):

$$MSE = \frac{1}{MN} \sum_{m} \sum_{n} \left( x_d[m, n] - x[m, n] \right)^2$$

Definitions contd.
PSNR: peak signal-to-noise ratio in dB (255 is the peak value of an 8-bit image)
$$PSNR = 20 \log_{10} \left( \frac{255}{\sqrt{MSE}} \right)$$

Not a true measure of image quality
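Both error measures are easy to compute once the original and decompressed images are available. A minimal sketch, assuming images are lists of rows of equal size and an 8-bit peak value of 255; the function names are illustrative:

```python
import math


def mse(x, xd):
    """Mean squared error between original x and decompressed xd (M x N lists)."""
    rows, cols = len(x), len(x[0])
    total = sum((xd[m][n] - x[m][n]) ** 2 for m in range(rows) for n in range(cols))
    return total / (rows * cols)


def psnr(x, xd, peak=255):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(x, xd)
    if err == 0:
        return math.inf
    return 20 * math.log10(peak / math.sqrt(err))


original = [[100, 100], [100, 100]]
decoded = [[101, 99], [100, 100]]
print(mse(original, decoded), psnr(original, decoded))
```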

Low complexity techniques

Run-length coding
A lossless method (C ~ 2 or 3)
A run is a sequence of pixels with identical values (0 or 1) in a scan direction
1-D RLC: each scan line = {g_i l_i} (grey value g_i, run length l_i); terminate with End Of Line symbol.
Ex: I = [100 100 100 32 32 200 200 50]

[{100 3}{32 2}{200 2}{50 1}]

Extensions are also possible (used in baseline JPEG)

Ex: I(m) = [26 29 32 35 38 41 44 34]
1. Find first difference D(m) = I(m) - I(m+1) = [-3 -3 -3 -3 -3 -3 10]
2. Apply RLC on D to get [{-3 6}{10 1}]
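Both run-length examples above can be reproduced in a few lines. A minimal sketch, assuming a scan line is a Python list and runs are returned as (value, length) pairs; the End Of Line symbol is omitted and the function names are illustrative:

```python
def run_length_encode(line):
    """1-D RLC: collapse a scan line into (value, run length) pairs."""
    runs = []
    for value in line:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1          # extend the current run
        else:
            runs.append([value, 1])   # start a new run
    return [tuple(run) for run in runs]


def first_difference(line):
    """D(m) = I(m) - I(m+1), as defined in the example."""
    return [line[m] - line[m + 1] for m in range(len(line) - 1)]


print(run_length_encode([100, 100, 100, 32, 32, 200, 200, 50]))
print(run_length_encode(first_difference([26, 29, 32, 35, 38, 41, 44, 34])))
```

Note that the difference signal is one sample shorter than the input, so a decoder would also need one reference sample (for example I(0)) to rebuild the line.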

RLC contd.
2-D RLC uses connected areas of pixels with identical values
Applications
- Fax transmission standard (due to max efficiency for binary images)
- Bit plane encoding for grey scale images

Variable Length coding


Also known as entropy coding
Uses variable word length to code different grey values
Examples: Huffman and arithmetic coding
- Huffman: code is derived by assigning a VL codeword to each grey value
- Arithmetic: entire message is coded at once

Huffman code - summary


Assigns variable length code to each input symbol (pixel level)
Achieved by sequentially combining symbols with least probabilities

Code length grows as the frequency of a pixel level falls (about log2(1/p) bits for a level of probability p)


The two least frequent pixel values have the same word length and differ only in the LSB

Average code length ~ entropy of the image

Huffman code - algorithm


Generate a Huffman tree
1. Start with a node for each pixel value g_i, labelled p(g_i)
Rank ordering first is sometimes used

2. Find and combine the 2 nodes with lowest p(g_i) to produce a parent node. Label this with the sum of p(g_i) of the 2 child nodes
3. Continue until only 1 root node remains, with label = probability of 1

Assigning the Huffman code


1. Starting with the root, assign a code 1 for the right branch and 0 for the left branch. Concatenate for each level
2. Continue until all branches are covered.

Example of HC
3-bit image with pixel values g_i in [0, 1, 2, .., 7]
Probabilities:
g_i     0    1     2     3    4     5     6     7
p(g_i)  0.4  0.08  0.08  0.2  0.12  0.08  0.04  0

Huffman tree and code


Symbol  B-code  p(s_i)  H-code
s0      000     0.4     1
s1      001     0.08    0111
s2      010     0.08    0110
s3      011     0.2     010
s4      100     0.12    001
s5      101     0.08    0001
s6      110     0.04    00001
s7      111     0       00000

[Figure: Huffman tree; internal nodes carry probabilities 0.04, 0.12, 0.24, 0.16, 0.36, 0.6 and 1.0 (root), with branches labelled 0 and 1]
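The tree-building and code-assignment steps can be sketched with a priority queue. This is one possible implementation, not necessarily the one used to draw the slide: ties between equal probabilities are broken by insertion order here, so individual codewords may differ from the table while the average code length stays the same. The names `huffman_code` and `probs` are illustrative.

```python
import heapq
from itertools import count


def huffman_code(probs):
    """Return {symbol: bit string} for a dict {symbol: probability}."""
    tiebreak = count()  # keeps heap entries comparable when probabilities tie
    heap = [(p, next(tiebreak), sym) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Combine the two least probable nodes into a parent node.
        p_left, _, left = heapq.heappop(heap)
        p_right, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (p_left + p_right, next(tiebreak), (left, right)))

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node: (left child, right child)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                            # leaf: a symbol name
            codes[node] = prefix or "0"

    walk(heap[0][2], "")
    return codes


probs = {"s0": 0.4, "s1": 0.08, "s2": 0.08, "s3": 0.2,
         "s4": 0.12, "s5": 0.08, "s6": 0.04, "s7": 0.0}
codes = huffman_code(probs)
average_length = sum(probs[s] * len(codes[s]) for s in probs)
print(codes, average_length)
```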

Huffman coding - performance


Average code length:
$$\bar{n} = \sum_{i=0}^{2^m - 1} p(s_i)\, l_i = 2.52$$

s_i is the symbol, p(s_i) its probability and l_i the length of the Huffman codeword for s_i; m is the no of bits used for the original image

Compression ratio C = m/n̄ = 3/2.52 = 1.19
Upper bound for C is set by the source entropy: C_max = m/H = 3/2.42 = 1.24
H = 2.42 bits/greylevel
$$H = -\sum_{i=0}^{2^m - 1} p_i \log_2 p_i$$
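The performance figures for the example can be checked numerically from the probabilities and Huffman codeword lengths in the table. A minimal sketch; the variable names are illustrative, and zero-probability symbols are skipped in the entropy sum since they contribute nothing:

```python
import math

probs = [0.4, 0.08, 0.08, 0.2, 0.12, 0.08, 0.04, 0.0]   # p(s0) .. p(s7)
lengths = [1, 4, 4, 3, 3, 4, 5, 5]                       # Huffman codeword lengths
m = 3                                                    # bits per pixel, original

average_length = sum(p * l for p, l in zip(probs, lengths))
entropy = -sum(p * math.log2(p) for p in probs if p > 0)

ratio = m / average_length      # achieved compression ratio m / n
max_ratio = m / entropy         # bound m / H set by the source entropy
print(round(average_length, 2), round(entropy, 2), round(ratio, 2), round(max_ratio, 2))
```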

Arithmetic coding (AC)


Huffman derives a code for each symbol
each code has variable word-length

AC represents a whole sequence of symbols by a single interval in [0.0, 1.0)
Most efficient for binary image coding
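The interval idea can be illustrated by narrowing [0, 1) once per symbol. A minimal sketch of the interval computation only (no bit output or decoder), assuming a fixed, known probability model; the three-symbol alphabet and the name `arithmetic_interval` are hypothetical, and exact fractions are used to avoid rounding:

```python
from fractions import Fraction


def arithmetic_interval(message, probs):
    """Return the [low, high) interval that represents `message`."""
    # Give each symbol a sub-range of [0, 1) proportional to its probability.
    ranges, start = {}, Fraction(0)
    for sym, p in probs.items():
        ranges[sym] = (start, start + p)
        start += p

    low, width = Fraction(0), Fraction(1)
    for sym in message:
        sym_low, sym_high = ranges[sym]
        # Shrink the current interval to the symbol's sub-range within it.
        low, width = low + width * sym_low, width * (sym_high - sym_low)
    return low, low + width


model = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}
low, high = arithmetic_interval("aab", model)
print(low, high)
```

The final width equals the product of the symbol probabilities, so likelier messages get wider intervals and need fewer bits to identify.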

Predictive coding
Can be lossless or lossy
Differential Pulse Code Modulation (DPCM)
- Lossless version of predictive coding
- Used in the sequential lossless mode of JPEG
- Exploits spatial redundancy
- Predicts a pixel value based on the values of neighbouring pixels

DPCM - coding
1. Predict a pixel value based on its neighbours. Round to the nearest integer
Prediction is usually a linear one

2. Find the error between predicted and actual pixel values
3. Encode the error using variable length coding

Prediction function options


Neighbourhood of the pixel x being predicted:
c b
a x

1-D prediction
1. x̂ = a or
2. x̂ = b or
3. x̂ = c

2-D prediction
4. x̂ = a + b - c or
5. x̂ = a + 0.5(b - c) or
6. x̂ = b + 0.5(a - c) or
7. x̂ = 0.5(a + b)

Special rule is used to handle pixels in the first row
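Lossless DPCM with the 2-D predictor a + b - c can be sketched as below, where a is the left neighbour, b the neighbour above and c the one above-left. The border handling is an assumption made for this sketch (predict 0 for the first pixel, use the left neighbour along the first row and the neighbour above along the first column); the slides only say that a special rule exists. The entropy coding of the errors is left out.

```python
def predict(img, r, c):
    """Predict pixel (r, c) from already-coded neighbours."""
    if r == 0 and c == 0:
        return 0                      # assumed rule: nothing to predict from
    if r == 0:
        return img[r][c - 1]          # first row: left neighbour (a)
    if c == 0:
        return img[r - 1][c]          # first column: neighbour above (b)
    return img[r][c - 1] + img[r - 1][c] - img[r - 1][c - 1]   # a + b - c


def dpcm_encode(img):
    """Prediction errors, same shape as the image."""
    return [[img[r][c] - predict(img, r, c) for c in range(len(img[0]))]
            for r in range(len(img))]


def dpcm_decode(errors):
    """Rebuild the image in raster order from the prediction errors."""
    rows, cols = len(errors), len(errors[0])
    img = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            img[r][c] = errors[r][c] + predict(img, r, c)
    return img


image = [[10, 12, 14], [11, 13, 15], [12, 14, 16]]
errors = dpcm_encode(image)
print(errors)
```

On smooth image content most errors are zero or small, which is what makes the subsequent variable length coding effective.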

Lossy predictive coding


A quantiser is added to the DPCM to quantise the error before encoding

[Block diagram: input -> error -> quantiser -> encoder -> compressed output, with a predictor in the feedback path]

Ex. Delta modulation, which uses a 1-bit quantiser and horizontal prediction
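Delta modulation along one scan line can be sketched as follows. The assumptions here are not from the slides: the first sample is sent unquantised, the step size is fixed, and the prediction is the previously reconstructed pixel to the left. The function names are illustrative.

```python
def delta_modulate(row, step):
    """Return (first sample, list of 1-bit codes, encoder-side reconstruction)."""
    recon = row[0]
    bits, recon_row = [], [recon]
    for x in row[1:]:
        # 1-bit quantiser: only the sign of the prediction error is kept.
        bit = 1 if x - recon >= 0 else 0
        recon += step if bit else -step
        bits.append(bit)
        recon_row.append(recon)
    return row[0], bits, recon_row


def delta_demodulate(first, bits, step):
    """Decoder: repeat the same +/- step updates from the received bits."""
    out = [first]
    for bit in bits:
        out.append(out[-1] + (step if bit else -step))
    return out


first, bits, recon_row = delta_modulate([10, 12, 14, 16, 16, 16], step=2)
print(bits, recon_row)
```

The reconstruction follows the ramp exactly but oscillates around the flat part, which shows why the scheme is lossy.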

Transform coding
Image is compressed in the transform domain by modifying the image transform
Steps
1. Image is divided into n x n blocks and the transform is computed for each block
2. Transform coefficients are processed (quantisation etc) and then encoded
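The per-block steps can be sketched with an orthonormal 2-D DCT followed by uniform quantisation. This is a plain-Python illustration under stated assumptions: a single quantiser step for all coefficients (real codecs use per-coefficient tables), square blocks, and helper names chosen for this sketch.

```python
import math


def dct_matrix(n):
    """Orthonormal DCT-II basis as an n x n matrix."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * j + 1) * k / (2 * n))
                     for j in range(n)])
    return rows


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def dct2(block):
    """2-D DCT of a square block: C * B * C^T."""
    c = dct_matrix(len(block))
    return matmul(matmul(c, block), transpose(c))


def idct2(coeffs):
    """Inverse 2-D DCT: C^T * X * C."""
    c = dct_matrix(len(coeffs))
    return matmul(matmul(transpose(c), coeffs), c)


def quantise(coeffs, step):
    """Uniform quantisation: this is where information is discarded."""
    return [[round(v / step) for v in row] for row in coeffs]


flat_block = [[100] * 4 for _ in range(4)]
coeffs = dct2(flat_block)
print(quantise(coeffs, step=16))
```

For a flat block all the energy lands in the single DC coefficient, so the remaining coefficients quantise to zero and cost almost nothing to encode.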

Issues in transform coding


Which transform to use?
Discrete Cosine and wavelet transforms are standards

What should be the block size?


8 and 16 are popular

What level of quantisation to use?
Should quantisation level be adapted to local image content?

JPEG
JPEG: Joint Photographic Experts Group
Four modes:
1. Sequential lossless mode
- (de)compression requires a single scan of the image
- DPCM + entropy encoding for the prediction error

JPEG modes
DCT-based lossy modes
2. Sequential
3. Progressive scan
- (de)compression requires multiple scans of the image

4. Hierarchical
- Compress at multiple resolutions for different display devices
Drawback: Blocking artifacts appear at high compression rates

Baseline JPEG

[Block diagram: input image block -> DCT -> quantiser -> zig-zag ordering -> entropy encoding -> coded image; quantisation table specs feed the quantiser and entropy table specs feed the entropy encoder]
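The zig-zag ordering step in the pipeline reads the quantised coefficients of a block from low to high frequency, so that the many zero-valued high-frequency coefficients end up in long runs. A minimal sketch that generates the scan order for an n x n block; `zigzag_indices` is an illustrative name:

```python
def zigzag_indices(n):
    """(row, col) positions of an n x n block in zig-zag scan order."""
    order = []
    for s in range(2 * n - 1):                 # s = row + col, one anti-diagonal
        diagonal = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even anti-diagonals run bottom-left to top-right, odd ones the reverse.
        if s % 2 == 0:
            diagonal.reverse()
        order.extend(diagonal)
    return order


def zigzag_scan(block):
    """Flatten a square block into a 1-D list in zig-zag order."""
    return [block[r][c] for r, c in zigzag_indices(len(block))]


print(zigzag_indices(3))
```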

Baseline JPEG for colour images


RGB is converted to a format which separates luminance and chrominance info (YCbCr, YUV etc)
Only chrominance data is subsampled
m:n:k indicates the subsampling format
4:2:0 means chrominance is subsampled by 4 (by 2 in each direction)
4:2:2 means chrominance is subsampled by 2 row-wise
4:4:4 means no subsampling is done
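The colour conversion and 4:2:0 chrominance subsampling can be sketched as below. The conversion uses the full-range YCbCr equations from the JFIF convention; averaging each 2 x 2 neighbourhood is one common way to subsample and is an assumption here, as are the function names and the even plane dimensions.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range (JFIF-style) RGB to YCbCr for 8-bit samples."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def subsample_420(plane):
    """Halve a chrominance plane in both directions by averaging 2 x 2 blocks."""
    return [[(plane[r][c] + plane[r][c + 1]
              + plane[r + 1][c] + plane[r + 1][c + 1]) / 4
             for c in range(0, len(plane[0]), 2)]
            for r in range(0, len(plane), 2)]


print(rgb_to_ycbcr(255, 0, 0))
print(subsample_420([[1, 3], [5, 7]]))
```

Only the Cb and Cr planes would be passed through `subsample_420`; the luminance plane keeps its full resolution.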

JPEG 2000
Strategy: compress once and decode in many modes
Based on discrete wavelet transform
- Daubechies biorthogonal spline wavelets

Region of interest (ROI) coding


Parts of images can be coded at higher rate than the rest

Compressed domain processing possible


Cropping, translation etc.
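To make the wavelet step concrete, here is one level of the reversible 5/3 biorthogonal lifting transform on a 1-D integer signal. The 5/3 filter is the reversible filter associated with JPEG 2000; the slides do not say which filter they have in mind, so treat the choice as an assumption. The sketch also assumes an even-length input and mirrors samples at the borders.

```python
def lift_53_forward(x):
    """One level of 5/3 lifting: returns (low-pass s, high-pass d)."""
    even, odd = x[0::2], x[1::2]
    half = len(even)
    # Predict step: detail = odd sample minus the mean of its even neighbours.
    d = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]   # mirror at right edge
        d.append(odd[i] - (even[i] + right) // 2)
    # Update step: smooth the even samples using the neighbouring details.
    s = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]                 # mirror at left edge
        s.append(even[i] + (left + d[i] + 2) // 4)
    return s, d


def lift_53_inverse(s, d):
    """Undo the lifting steps in reverse order; exact for integer input."""
    half = len(s)
    even = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]
        even.append(s[i] - (left + d[i] + 2) // 4)
    x = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        x.append(even[i])
        x.append(d[i] + (even[i] + right) // 2)
    return x


signal = [26, 29, 32, 35, 38, 41, 44, 34]
low, high = lift_53_forward(signal)
print(low, high)
```

Because every lifting step is undone exactly, integer input is recovered without loss, and on smooth signals the high-pass band is close to zero.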

JPEG2000
[Block diagram: input image block -> multicomponent transformation -> wavelet transform -> quantiser -> encoding -> coded image, with rate control and region of interest blocks]

Multicomponent transformation: tiling (optional), DC level shifting and RGB to YUV conversion with or without subsampling
