
DIGITAL IMAGE PROCESSING

Time schedule
MATLAB BASICS - 2 DAYS
MATLAB GUI - 2 DAYS
BASIC IMAGE PROCESSING - 2 DAYS
PROJECT EXPLANATION - 1 DAY
MODULE 1 - 2 DAYS
MODULE 2 - 2 DAYS
MODULE 3 - 2 DAYS
Q-A - 2 DAYS
1 HOUR A DAY

PRESENTATION
Objective  Literature survey  Problem statement  Proposed solution  Simulation results & validation  References


IMAGE REPRESENTATION
GRAY  RGB  BINARY  2D  3D


IMAGE FILE FORMATS


BMP: Bitmap file format
JPG: JPEG (Joint Photographic Experts Group)
TIF: Tagged Image File Format
PNG: Portable Network Graphics
PGM: Portable Gray Map


VIDEO FILE FORMATS


AVI: Audio Video Interleave
MPEG: Moving Picture Experts Group
CIFF: Common Interchange File Format
WMV: Windows Media Video


Why Do We Need Compression?


 

To reduce memory (storage) requirements
To make efficient use of bandwidth

Two types of compression: lossy compression and lossless compression

COMPRESSION TECHNIQUES
Lossless compression:
Data can be completely recovered after decompression
Recovered data is identical to the original
Exploits redundancy in the data

Lossy compression:
Data cannot be completely recovered after decompression
Some information is lost forever
Gives more compression than lossless
Discards insignificant data components

COMPONENTS OF COMPRESSION

Redundancy reduction: removes duplication from the signal.
Irrelevancy reduction: omits parts of the signal that will not be noticed by the signal receiver, which is the Human Visual System (HVS).


Types of redundancy
Spatial redundancy: correlation between neighboring pixel values.
Spectral redundancy: correlation between different color planes or spectral bands.
Temporal redundancy: correlation between adjacent frames in a sequence of images (in video applications).


ENTROPY ENCODING
Entropy: the minimum average number of bits per symbol required to represent a signal without loss
Huffman
Arithmetic
RLE


Example Huffman encoding




   

A = 0
B = 100
C = 1010
D = 1011
R = 11
ABRACADABRA = 01001101010010110100110
This is eleven letters in 23 bits.
A fixed-width encoding would require 3 bits for five different letters, or 33 bits for 11 letters.
Notice that the encoded bit string can be decoded!
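The code table above can be checked with a few lines of Python. This is only a sketch: the function names `encode` and `decode` are illustrative, and the decoder relies on the fact that no codeword is a prefix of another.

```python
# Code table from the slide.
CODE = {"A": "0", "B": "100", "C": "1010", "D": "1011", "R": "11"}

def encode(text, code=CODE):
    """Concatenate the codeword of every letter."""
    return "".join(code[ch] for ch in text)

def decode(bits, code=CODE):
    """Read bits left to right and emit a letter whenever a codeword completes."""
    inverse = {word: letter for letter, word in code.items()}
    letters, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:      # safe because the code is prefix-free
            letters.append(inverse[current])
            current = ""
    if current:
        raise ValueError("trailing bits do not form a codeword")
    return "".join(letters)

bits = encode("ABRACADABRA")
print(bits, len(bits))              # 23 bits in total
print(decode(bits))
```

Because the code is prefix-free, the decoder never has to look ahead or backtrack.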

Why it works
In this example, A was the most common letter.
In ABRACADABRA:
5 As: code for A is 1 bit long
2 Rs: code for R is 2 bits long
2 Bs: code for B is 3 bits long
1 C: code for C is 4 bits long
1 D: code for D is 4 bits long

Example, step I

Assume that the relative frequencies are:
A: 40, B: 20, C: 10, D: 10, R: 20
(I chose simpler numbers than the real frequencies)
The smallest numbers are 10 and 10 (C and D), so connect those.

Example, step II
 

C and D have already been used, and the new node above them (call it C+D) has value 20.
The smallest values are B, C+D, and R, all of which have value 20.
Connect any two of these.

Example, step III


 

The smallest value is R (20), while A and B+C+D both have value 40.
Connect R to either of the others.

Example, step IV


Connect the final two nodes

Example, step V
 

Assign 0 to left branches, 1 to right branches.
Each encoding is a path from the root.

A = 0
B = 100
C = 1010
D = 1011
R = 11
Each path terminates at a leaf.
Do you see why encoded strings are decodable?
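Steps I to V can be sketched in Python with a priority queue. The function name `huffman_code` and the tie-breaking rule are assumptions of this sketch. When several nodes have the same value, a different choice gives a different tree, so the individual codewords may differ from the slide while the total cost stays the same.

```python
import heapq
from itertools import count

def huffman_code(freqs):
    """Build a Huffman code {symbol: bitstring} from {symbol: frequency}."""
    tie = count()  # tie-breaker so the heap never has to compare tree nodes
    heap = [(f, next(tie), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Connect the two nodes with the smallest values.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tie), (left, right)))
    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node: 0 = left, 1 = right
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                            # leaf: the path from the root is the code
            codes[node] = prefix or "0"

    walk(heap[0][2], "")
    return codes

freqs = {"A": 40, "B": 20, "C": 10, "D": 10, "R": 20}
codes = huffman_code(freqs)
cost = sum(freqs[s] * len(codes[s]) for s in freqs)
print(codes, cost)
```

The slide's code (lengths 1, 3, 4, 4, 2) has weighted cost 40 + 60 + 40 + 40 + 40 = 220, and any valid Huffman tree for these frequencies reaches the same total.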

RLE
  

Yet another simple compression technique is applied to the AC components:
The 1 x 64 vector has lots of zeros in it.
Encode as (skip, value) pairs, where skip is the number of zeros and value is the next non-zero component.
Send (0,0) as the end-of-block sentinel value.
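A minimal sketch of the (skip, value) scheme in Python follows. The function names and the default length of 63 AC values (the 64-element vector minus the DC term) are assumptions of this sketch, and the real JPEG limit on the skip count is ignored.

```python
def rle_encode_ac(ac):
    """Turn a list of AC coefficients into (skip, value) pairs plus (0, 0)."""
    pairs, skip = [], 0
    for value in ac:
        if value == 0:
            skip += 1                 # count zeros in front of the next value
        else:
            pairs.append((skip, value))
            skip = 0
    pairs.append((0, 0))              # end of block: remaining zeros are implied
    return pairs

def rle_decode_ac(pairs, length=63):
    """Rebuild the AC list; everything after the sentinel is zero."""
    out = []
    for skip, value in pairs:
        if (skip, value) == (0, 0):
            break
        out.extend([0] * skip)
        out.append(value)
    out.extend([0] * (length - len(out)))
    return out

ac = [5, 0, 0, -3, 0, 0, 0, 2] + [0] * 55
pairs = rle_encode_ac(ac)
print(pairs)
```

Here 63 values shrink to four pairs, because the long run of trailing zeros is covered by the sentinel alone.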

Image compression Techniques


DCT
ENERGY EFFICIENT WAVELET
SPIHT: SET PARTITIONING IN HIERARCHICAL TREES
EZW: EMBEDDED ZEROTREE WAVELET


VIDEO COMPRESSION
MPEG  H.264 BASED VIDEO COMPRESSION


What is JPEG?

"Joint Photographic Experts Group": an international standard since 1992.
Works with color and grayscale images.
Many applications, e.g., satellite, medical, ...
JPEG compression involves the following:

JPEG

The Major Steps in JPEG Coding involve


     

DCT (Discrete Cosine Transform)
Quantization
Zigzag Scan
DPCM on DC component
RLE on AC Components
Entropy Coding

In JPEG, each F[u,v] is divided by a constant q(u,v) and the result is rounded to the nearest integer. The table of q(u,v) values is called the quantization table.

The eye is most sensitive to low frequencies (upper left corner) and less sensitive to high frequencies (lower right corner).
The standard defines 2 default quantization tables, one for luminance and one for chrominance.
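The divide-and-round step can be sketched as below. The 2 x 2 coefficient block `F` and the table fragment `q` are illustrative values chosen for this sketch, not the full 8 x 8 tables of the standard.

```python
def quantize(F, q):
    """Divide each coefficient by its table entry and round to the nearest integer."""
    def nearest(x):
        return int(x + 0.5) if x >= 0 else -int(-x + 0.5)
    return [[nearest(f / step) for f, step in zip(f_row, q_row)]
            for f_row, q_row in zip(F, q)]

def dequantize(Fq, q):
    """Decoder side: multiply back. The rounding error is not recovered."""
    return [[v * step for v, step in zip(v_row, q_row)]
            for v_row, q_row in zip(Fq, q)]

F = [[-415.0, -30.0], [4.0, -22.0]]   # hypothetical DCT coefficients
q = [[16, 11], [12, 12]]              # hypothetical quantization steps
Fq = quantize(F, q)
print(Fq)
print(dequantize(Fq, q))
```

Larger steps q(u,v) produce more zeros and more rounding error, which is why the coarser steps are placed at the high frequencies the eye notices least.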

Quantizer

Scalar Quantization (SQ): performed on each individual coefficient
Vector Quantization (VQ): performed on a group of coefficients

Zigzag Scan
What is the purpose of the Zigzag Scan?

To group the low-frequency coefficients at the top of the vector.
Maps the 8 x 8 block to a 1 x 64 vector.
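One way to generate the zigzag order is to walk the anti-diagonals of the block and alternate direction. This is a sketch with illustrative function names; it assumes the usual JPEG ordering, which starts at the top-left corner and moves right first.

```python
def zigzag_indices(n=8):
    """Return the (row, col) visiting order of an n x n zigzag scan."""
    order = []
    for s in range(2 * n - 1):                      # s = row + col of a diagonal
        diag = [(i, s - i) for i in range(n) if 0 <= s - i < n]
        # diag runs from top-right to bottom-left; even diagonals are
        # traversed the other way, from bottom-left to top-right.
        order.extend(diag[::-1] if s % 2 == 0 else diag)
    return order

def zigzag(block):
    """Flatten a square block into a vector in zigzag order."""
    return [block[i][j] for i, j in zigzag_indices(len(block))]

block = [[8 * i + j for j in range(8)] for i in range(8)]
vector = zigzag(block)
print(vector[:10])
```

Since the low frequencies sit in the upper left corner, they land at the start of the vector and the mostly zero high frequencies form a long tail.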

Overview of Wavelet


What are Wavelets?


Wavelets are mathematical functions that cut up data into different frequency components and then study each component with a resolution matched to its scale. These basis functions, or baby wavelets, are obtained from a single prototype wavelet, called the mother wavelet, by dilations or contractions (scaling) and translations (shifts).

Wavelet Transform
What is the wavelet transform?

The wavelet transform is a type of signal representation that can give the frequency content of the signal at a particular instant of time.

Wavelet Transform


Why do we need the wavelet transform?


Wavelet analysis has advantages over traditional Fourier methods in analyzing physical situations where the signal contains discontinuities and sharp spikes.

Discrete Wavelet Transform(DWT)




The wavelet transform is related to two functions: the scaling function $\phi(x)$ and the mother wavelet $\psi(x)$.

$g_i = (-1)^i h_{1-i}$

$h_i$, $g_i$ are called coefficients (filters).
$h_i$: smooth filters (lowpass filters)
$g_i$: detail filters (highpass filters)
Each kind of wavelet transform defines its own filter coefficients, e.g. Daubechies.


DWT Subband Structure


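[Figure: one-level 2-D DWT filter bank. The N x M image corresponding to resolution level R is filtered horizontally (rows) with L and H and downsampled by 2 (N/2 x M), then filtered vertically (columns) with L and H and downsampled by 2 (N/2 x M/2), giving the subbands LL, LH, HL and HH. LL is the image corresponding to resolution level R-1; LH, HL and HH are the detail images corresponding to information visible at resolution level R.]

L: Lowpass filter
H: Highpass filter
2: downsample by 2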

DWT Subband Structure


LL: Horizontal Lowpass & Vertical Lowpass
LH: Horizontal Lowpass & Vertical Highpass
HL: Horizontal Highpass & Vertical Lowpass
HH: Horizontal Highpass & Vertical Highpass
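The four subbands can be illustrated with the simplest wavelet, the Haar filter pair. This sketch assumes an unnormalized averaging and differencing form of Haar and an image with even width and height; the function names are illustrative.

```python
def haar_1d(samples):
    """One lowpass/highpass split followed by downsampling by 2."""
    low = [(samples[k] + samples[k + 1]) / 2 for k in range(0, len(samples), 2)]
    high = [(samples[k] - samples[k + 1]) / 2 for k in range(0, len(samples), 2)]
    return low, high

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def filter_columns(matrix):
    """Apply haar_1d to every column; return (lowpass, highpass) images."""
    lows, highs = [], []
    for column in transpose(matrix):
        low, high = haar_1d(column)
        lows.append(low)
        highs.append(high)
    return transpose(lows), transpose(highs)

def haar_2d(image):
    """One DWT level: horizontal (rows) first, then vertical (columns)."""
    L, H = [], []
    for row in image:
        low, high = haar_1d(row)
        L.append(low)
        H.append(high)
    LL, LH = filter_columns(L)   # horizontal lowpass, vertical low/high
    HL, HH = filter_columns(H)   # horizontal highpass, vertical low/high
    return LL, LH, HL, HH

LL, LH, HL, HH = haar_2d([[1, 3], [5, 7]])
print(LL, LH, HL, HH)
```

Applying `haar_2d` again to LL gives the next stage of the decomposition.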

DWT Subband Structure

[Figure: DWT with D = 3 stages (Stage 1, Stage 2, Stage 3)]

A DWT Example

[Figure: example decomposition of an image (LL0) into the subbands LL1, LH1, HL1, HH1 and then LL2, LH2, HL2, HH2]

3D Wavelet Transform (3DWT)


[Figure: 3D wavelet transform applied along the time axis over frames F1 to F8]

DWT over DCT


No need to divide the input image into non-overlapping 2-D blocks, which avoids blocking artifacts and allows higher compression ratios.
Allows good localization in both the time and spatial-frequency domains.
Transformation of the whole image introduces inherent scaling.
Better identification of which data is relevant to human perception, hence a higher compression ratio.

Discrete Wavelet Transform




Higher flexibility: the wavelet function can be freely chosen.
No need to divide the input image into non-overlapping 2-D blocks, which avoids blocking artifacts and allows higher compression ratios.
Transformation of the whole image introduces inherent scaling.
Better identification of which data is relevant to human perception, hence a higher compression ratio (64:1 vs. 500:1).

OVERVIEW OF IMAGE COMPRESSION

Encoder: Discrete Wavelet Transform -> Quantization -> Entropy Coding -> Compressed Data Stream (1 1 0 1 0 0 1 0 1 0 1 ...)

Decoder: Compressed Data Stream -> Entropy Decoding -> Dequantization -> Inverse Discrete Wavelet Transform

Discrete Wavelet Transform


[Figure: three-level decomposition. Input image (LL0) -> (LL1), (LH1), (HL1), (HH1); (LL1) -> (LL2), (LH2), (HL2), (HH2); (LL2) -> (LL3), (LH3), (HL3), (HH3)]

Uniform quantization
Why do we need to quantize? To throw out bits.
Example: 101101 = 45 (6 bits). Truncate to 4 bits: 1011 = 11. Truncate to 3 bits: 101 = 5.
Quantization error is the main source of loss in lossy compression.
Divide by a constant N and round the result (N = 4 or 8 in the examples above).
Non-powers-of-two give finer control (e.g., N = 6 loses about 2.6 bits, since log2 6 is about 2.58).
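The arithmetic on this slide can be reproduced directly. This sketch uses truncating division to match the slide's "truncate to 4 bits" examples; the function names are illustrative.

```python
import math

def quantize(value, n):
    """Divide by the step n and drop the remainder (truncation)."""
    return value // n

def dequantize(index, n):
    """Decoder side: scale back up. The dropped remainder is lost."""
    return index * n

def bits_lost(n):
    """Dividing by n removes log2(n) bits of precision."""
    return math.log2(n)

print(quantize(45, 4), quantize(45, 8))   # 6-bit value cut to 4 and 3 bits
print(dequantize(quantize(45, 4), 4))     # reconstruction differs from 45
print(bits_lost(4), bits_lost(8), bits_lost(6))
```

The gap between 45 and its reconstruction is the quantization error, and it grows with the step size N.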

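[Figure: DWT -> quantize one band -> inverse DWT; the difference image is shown by subtracting the original image and adding 128]

[Figure: DWT -> quantize two bands -> inverse DWT; the difference image is shown by subtracting the original image and adding 128]

[Figure: quantize all subbands]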

Energy Efficient wavelet




HH Elimination method

H* Elimination Method

Computation cost

SPIHT Algorithm


The SPIHT algorithm is based on 3 concepts:

Ordered bit plane progressive transmission.
Set partitioning sorting algorithm.
Spatial orientation trees.

Spatial Orientation Trees (1)

Tree root is 2 x 2

Spatial Orientation Trees (2)




The following sets of coordinates are used to present the new coding method:
O(i, j): set of coordinates of all offspring of node (i, j);
D(i, j): set of coordinates of all descendants of node (i, j);
H: set of coordinates of all spatial orientation tree roots;
L(i, j) = D(i, j) - O(i, j).
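These sets can be sketched in Python. The sketch assumes the usual SPIHT coordinate rule in which the offspring of (i, j) are the four positions (2i, 2j), (2i, 2j+1), (2i+1, 2j), (2i+1, 2j+1), and that the root at (0, 0) has no offspring. The function names are illustrative.

```python
def offspring(i, j, size):
    """O(i, j): the direct children of node (i, j) in a size x size transform."""
    if (i, j) == (0, 0):
        return []                      # assumed: this root has no descendants
    kids = [(2 * i, 2 * j), (2 * i, 2 * j + 1),
            (2 * i + 1, 2 * j), (2 * i + 1, 2 * j + 1)]
    return [(r, c) for r, c in kids if r < size and c < size]

def descendants(i, j, size):
    """D(i, j): children, grandchildren and so on."""
    found = []
    for r, c in offspring(i, j, size):
        found.append((r, c))
        found.extend(descendants(r, c, size))
    return found

def beyond_offspring(i, j, size):
    """L(i, j) = D(i, j) - O(i, j)."""
    kids = set(offspring(i, j, size))
    return [p for p in descendants(i, j, size) if p not in kids]

print(offspring(0, 1, 4))
print(len(descendants(0, 1, 8)), len(beyond_offspring(0, 1, 8)))
```

With a 4 x 4 transform and a 2 x 2 tree root, D(0,1) is just the four coefficients at (0,2), (0,3), (1,2), (1,3), so L(0,1) is empty.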

Coding Algorithm


In a practical implementation the significance information is stored in three ordered lists:

LIS: list of insignificant sets
LIP: list of insignificant pixels
LSP: list of significant pixels

A Simple Example (1): Encoder, First Pass

Initialization. Wavelet coefficients:
 26   6  13  10
 -7   7   6   4
  4  -4   4  -3
  2  -2  -2   0

Choose T0 = 2^floor(log2 26) = 16 (n = 4)
LIP = { (0,0), (0,1), (1,0), (1,1) }
LSP = { }
LIS = { D(0,1), D(1,0), D(1,1) }
LSPT = { }
1. Process LIP
   Sn(0,0): 26 >= T0, we transmit 1; 26 > 0, we transmit 0; then move (0,0) to LSPT.
   Sn(0,1), Sn(1,0), Sn(1,1) are insignificant, transmit three 0s.
2. Process LIS
   Sn( D(0,1) ): 13, 10, 6, 4 < T0, we transmit 0.
   Sn( D(1,1) ), Sn( D(1,0) ) are insignificant, transmit two 0s.
3. No need to process LSP (because LSP is empty).
4. Update LSPT to LSP.
The transmitted bit stream: 10000000 (8 bits)
LIP = { (0,1), (1,0), (1,1) }
LIS = { D(0,1), D(1,0), D(1,1) }
LSP = { (0,0) }

A Simple Example (3): Encoder, Second Pass

n = 4 - 1 = 3, T1 = 2^n = 8. Wavelet coefficients:
 26   6  13  10
 -7   7   6   4
  4  -4   4  -3
  2  -2  -2   0

LIP = { (0,1), (1,0), (1,1) }
LSP = { (0,0) }
LIS = { D(0,1), D(1,0), D(1,1) }
LSPT = { (0,0) }
1. Process LIP
   Sn(0,1), Sn(1,0), Sn(1,1) are insignificant, transmit three 0s.
2. Process LIS
   Sn( D(0,1) ) is significant (13, 10 >= T1), we transmit 1.
   13 >= T1 and 10 >= T1, both positive: transmit 10 10; then move (0,2), (0,3) to LSPT.
   6, 4 < T1: transmit two 0s; then move (1,2), (1,3) to LIP.
   Sn( D(1,0) ), Sn( D(1,1) ) are insignificant, transmit two 0s.
3. Process LSP (because LSP is not empty)
   c(0,0) = 26 = (11010)2, transmit the bit of weight 2^n = 8, which is 1.
4. Update LSPT to LSP.
The transmitted bit stream: 0001101000001 (13 bits)
LIP = { (0,1), (1,0), (1,1), (1,2), (1,3) }
LIS = { D(1,0), D(1,1) }
LSP = { (0,0), (0,2), (0,3) }

A Simple Example (2): Decoder, First Receive

Initialization. Reconstructed coefficients after this pass:
 24   0   0   0
  0   0   0   0
  0   0   0   0
  0   0   0   0

Get n = 4, T0 = 2^n = 16
LIP = { (0,0), (0,1), (1,0), (1,1) }
LIS = { D(0,1), D(1,0), D(1,1) }
LSP = { }
The transmitted bit stream: 10000000 (8 bits)
1. Process LIP
   Get 1: Sn(0,0) is significant; next is 0: positive value. Move (0,0) to LSP, then reconstruct c'(0,0) = +(3/2) T0 = 24.
   Get three 0s: Sn(0,1), Sn(1,0), Sn(1,1) are insignificant.
2. Process LIS
   Get three 0s: Sn( D(0,1) ), Sn( D(1,0) ), Sn( D(1,1) ) are insignificant.
LIP = { (0,1), (1,0), (1,1) }
LIS = { D(0,1), D(1,0), D(1,1) }
LSP = { (0,0) }

A Simple Example (5): Decoder, Second Receive (cont.)

Reconstructed coefficients:
 28   0  12  12
  0   0   0   0
  0   0   0   0
  0   0   0   0

LIP = { (0,1), (1,0), (1,1), (1,2), (1,3) }
LIS = { D(1,0), D(1,1) }
LSP = { (0,0), (0,2), (0,3) }

SPIHT vs. VSPIHT

VSPIHT
Wavelet decomposition and child-parent relationship
Spatial orientation tree in SPIHT
Virtual decomposition of the LL band
Virtual zero tree


EZW

Morton order scan

Raster scan

EZW the algorithm


What is inside the second step?
threshold = initial_threshold;
do {
    dominant_pass(image);
    subordinate_pass(image);
    threshold = threshold / 2;
} while (threshold > minimum_threshold);
The main loop ends when the threshold reaches a minimum value, which can be specified to control the encoding performance; a minimum value of 0 gives lossless reconstruction of the image.
The initial threshold t0 is decided as:

$t_0 = 2^{\lfloor \log_2 ( \mathrm{MAX}(|\gamma(x,y)|) ) \rfloor}$

Here MAX() means the maximum coefficient value in the image and $\gamma(x,y)$ denotes the coefficient. With this threshold we enter the main coding loop.
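The threshold schedule of the main loop can be sketched as follows. The sketch assumes integer coefficients with at least one non-zero value, uses integer halving so that a minimum threshold of 0 terminates, and leaves the two passes as a comment. The function names are illustrative.

```python
def initial_threshold(coeffs):
    """Largest power of two not exceeding the largest coefficient magnitude."""
    peak = max(abs(c) for row in coeffs for c in row)
    return 1 << (peak.bit_length() - 1)

def threshold_schedule(coeffs, minimum_threshold=0):
    """Thresholds visited by the main loop, from t0 down to the minimum."""
    threshold = initial_threshold(coeffs)
    schedule = []
    while True:
        schedule.append(threshold)   # dominant and subordinate pass run here
        threshold //= 2
        if threshold <= minimum_threshold:
            break
    return schedule

coeffs = [[63, -34], [49, 10]]       # hypothetical coefficients
print(initial_threshold(coeffs))
print(threshold_schedule(coeffs))
```

Raising `minimum_threshold` stops the loop earlier, which trades reconstruction quality for a shorter bit stream.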

          

Overview
Discrete Wavelet Transform
Zerotree Coding of Wavelet Coefficients
Successive-Approximation Quantization (SAQ)
Adaptive Arithmetic Coding
Relationship to Other Coding Algorithms
A Simple Example
Experimental Results
Conclusion
References
Q&A

EZW


 

E: The EZW encoder is based on progressive encoding. Progressive encoding is also known as embedded encoding.
Z: A data structure called the zerotree is used in the EZW algorithm to encode the data.
W: The EZW encoder is specially designed for use with the wavelet transform. It was originally designed to operate on images (2-D signals).

Overview - Embedded Zerotree Wavelet (EZW)




2 properties:
Producing a fully embedded bit stream
Providing competitive compression performance

4 features:
Discrete wavelet transform
Zerotree coding of wavelet coefficients
Successive-approximation quantization (SAQ)
Adaptive arithmetic coding

2 advantages:
Precise rate control
No training of any kind required

Discrete Wavelet Transform (2-1)




Identical to a hierarchical subband system
Subbands are logarithmically spaced in frequency
Subbands arise from separable application of filters

[Figure: first stage (LL1, HL1, LH1, HH1) and second stage (LL1 split further into LL2, HL2, LH2, HH2)]

Zerotree Coding (3-1)




A typical low-bit rate image coder
Large bit budget spent on encoding the significance map
Binary decision as to whether a coefficient has a zero or nonzero quantized value
True cost of encoding the actual symbols:
Total Cost = Cost of Significance Map + Cost of Nonzero Values

Zerotree Coding (3-2)




What is a zerotree?
A new data structure to improve the compression of significance maps of wavelet coefficients.

What is insignificant?
A wavelet coefficient x is said to be insignificant with respect to a given threshold T if |x| < T.

The zerotree is based on an empirically true hypothesis:
IF a wavelet coefficient at a coarse scale is insignificant with respect to a given threshold T, THEN all wavelet coefficients of the same orientation in the same spatial location at finer scales are likely to be insignificant with respect to T.

Parent-child dependencies (descendants & ancestors)
Parent: the coefficient at the coarse scale.
Child: all coefficients corresponding to the same spatial location at the next finer scale of similar orientation.

Scanning of the coefficients
Scanning rule: no child node is scanned before its parent.

An element of a zerotree for threshold T:
A coefficient x is said to be an element of a zerotree for threshold T if itself and all of its descendants are insignificant with respect to T.

A zerotree root:
An element of a zerotree for threshold T is a zerotree root if it is not the descendant of a previously found zerotree root for threshold T.

[Figure: parent-child dependencies of subbands]
[Figure: scanning order of the subbands]

Zerotree Coding (3-3)




Encoding

The significance map can be efficiently represented as a string of symbols from a 4-symbol alphabet:
1. Zerotree root (zr) => if (|xWT| < Ti) && (all descendants of xWT have magnitude < Ti)
2. Isolated zero (iz) => if (|xWT| < Ti) && ((some descendant of xWT has magnitude >= Ti) || (xWT is the last item))
3. Significant positive (sp) => if (|xWT| >= Ti) && (xWT > 0)
4. Significant negative (sn) => if (|xWT| >= Ti) && (xWT < 0)
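The four rules translate into a small classifier. This sketch takes the descendant values as a plain list rather than walking a real subband tree, and it reads "xWT is the last item" as "the coefficient has no descendants". Both choices are assumptions of the sketch, as is the function name.

```python
def classify(x, descendant_values, threshold):
    """Return the dominant-pass symbol for coefficient x at this threshold."""
    if abs(x) >= threshold:
        return "sp" if x > 0 else "sn"          # significant, with sign
    if descendant_values and all(abs(d) < threshold for d in descendant_values):
        return "zr"                             # whole subtree is insignificant
    return "iz"                                 # a descendant is significant,
                                                # or there are no descendants

print(classify(63, [-31, 23, 15, 14], 32))
print(classify(-34, [49, 10, 14, -13], 32))
print(classify(23, [3, -12, -14, 8], 32))
print(classify(-31, [15, 14, -9, -7, -1, 47], 32))
print(classify(7, [], 32))
```

A zerotree root lets the encoder skip every descendant of that coefficient for the current threshold, which is where the saving on the significance map comes from.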

SAQ (3-1)
Successive-Approximation Quantization (SAQ)
Sequentially applies a sequence of thresholds T0, ..., TN-1 to determine significance

Thresholds
Chosen so that Ti = Ti-1 / 2
T0 is chosen so that |xj| < 2 T0 for all coefficients xj

Two separate lists of wavelet coefficients

Dominant list: contains the coordinates of those coefficients that have not yet been found to be significant, in the same relative order as the initial scan.
Subordinate list: contains the magnitudes of those coefficients that have been found to be significant.

SAQ (3-2)
Dominant pass
During a dominant pass, coefficients with coordinates on the dominant list are compared to the threshold Ti to determine their significance and, if significant, their sign.

Subordinate pass
During a subordinate pass, all coefficients on the subordinate list are scanned and the specifications of the magnitudes available to the decoder are refined to an additional bit of precision.

Encoding process
SAQ encoding process:
FOR I = T0 TO TN-1
    Dominant Pass;
    Subordinate Pass (generating string of symbols);
    String of symbols is entropy encoded;
    Sorting (subordinate list in decreasing magnitude);
    IF (Target stopping condition = TRUE) break;
NEXT;

A Simple Example (2-1)


  

Only the string of symbols is shown (no adaptive arithmetic coding)
Simple 3-scale wavelet transform of an 8 x 8 image
T0 = 32 (largest coefficient is 63)

Example:
 63 -34  49  10   7  13 -12   7
-31  23  14 -13   3   4   6  -1
 15  14   3 -12   5  -7   3   9
 -9  -7 -14   8   4  -2   3   2
 -5   9  -1  47   4   6  -2   2
  3   0  -3   2   3  -2   0   4
  2  -3   6  -4   3   6   3   6
  5  11   5   6   0   3  -4   4

A Simple Example (2-2)


 

First dominant pass First subordinate pass


Magnitudes are partitioned into the uncertainty intervals [32, 48) and [48, 64), with symbols 0 and 1.
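The outcome of the first two passes on the 8 x 8 example can be reproduced with a short sketch. It lists the significant coefficients in raster order rather than the EZW subband scan order and ignores the zerotree symbols, so it shows only which values become significant and which refinement bit each one gets. The matrix is the example array as read from the slide.

```python
COEFFS = [
    [63, -34, 49, 10, 7, 13, -12, 7],
    [-31, 23, 14, -13, 3, 4, 6, -1],
    [15, 14, 3, -12, 5, -7, 3, 9],
    [-9, -7, -14, 8, 4, -2, 3, 2],
    [-5, 9, -1, 47, 4, 6, -2, 2],
    [3, 0, -3, 2, 3, -2, 0, 4],
    [2, -3, 6, -4, 3, 6, 3, 6],
    [5, 11, 5, 6, 0, 3, -4, 4],
]
T0 = 32

# Dominant pass: coefficients whose magnitude reaches the threshold.
significant = [c for row in COEFFS for c in row if abs(c) >= T0]

# Subordinate pass: 1 for the upper interval [48, 64), 0 for [32, 48).
refinement = [1 if abs(c) >= T0 + T0 // 2 else 0 for c in significant]

# Decoder's estimate: the centre of the chosen uncertainty interval.
estimates = [56 if bit else 40 for bit in refinement]

print(significant, refinement, estimates)
```

Only four of the 64 coefficients are significant at T0 = 32, so most of the first dominant pass is spent on zerotree and isolated-zero symbols.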

PSNR AND MSE


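$\mathrm{MSE} = \frac{1}{w h} \sum_{x=1}^{w} \sum_{y=1}^{h} \left( O(x,y) - C(x,y) \right)^2$

$\mathrm{PSNR} = 10 \log_{10} \frac{\mathrm{MAX}^2}{\mathrm{MSE}}$

w and h are the width and height of the image.
O is the original image data.
C is the compressed image data.
MAX is the maximum value that a pixel can have, 255.

Bit-stream c of length ||c||. N1 x N2 are the numbers of rows and columns.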
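A minimal sketch of the two quality measures, using the standard definitions of MSE and PSNR for 8-bit images. Images are plain nested lists here, and the function names are illustrative.

```python
import math

def mse(original, compressed):
    """Mean squared error between two images of the same size."""
    h, w = len(original), len(original[0])
    total = sum((o - c) ** 2
                for o_row, c_row in zip(original, compressed)
                for o, c in zip(o_row, c_row))
    return total / (w * h)

def psnr(original, compressed, max_value=255):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    error = mse(original, compressed)
    if error == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / error)

O = [[100, 100], [100, 100]]   # hypothetical original pixels
C = [[101, 99], [100, 100]]    # hypothetical pixels after compression
print(mse(O, C), psnr(O, C))
```

For this toy pair the MSE is 0.5 and the PSNR is about 51.1 dB; a higher PSNR means the compressed image is closer to the original.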

Compression ratio
COMPRESSION RATIO = original file size / compressed file size

DATA HIDING
STEGANOGRAPHY  WATERMARKING  WATERCASTING AUDIO STEGANOGRAPHY VIDEO STEGANOGRAPHY


STEGANOGRAPHY