
Chapter 1


1. Introduction

Often the signals we wish to process are in the time domain, but in order to process them more easily other information, such as frequency, is required. Mathematical transforms translate the information of signals into different representations. For example, the Fourier transform converts a signal between the time and frequency domains, so that the frequencies present in a signal can be seen. However, the Fourier transform cannot show which frequencies occur at specific times in the signal, as time and frequency are viewed independently. To address this, the Short-Time Fourier Transform (STFT) introduced the idea of windows through which different parts of a signal are viewed. For a given window in time, the frequencies can be examined. However, Heisenberg's uncertainty principle states that as the resolution of the signal improves in the time domain, by zooming in on different sections, the frequency resolution gets worse. Ideally, a method of multiresolution is needed, which allows certain parts of the signal to be resolved well in time, and other parts to be resolved well in frequency. The power of wavelet analysis is exactly this multiresolution.

Images contain large amounts of information that requires much storage space, large transmission bandwidths and long transmission times. It is therefore advantageous to compress an image by storing only the essential information needed to reconstruct it. An image can be thought of as a matrix of pixel (or intensity) values. In order to compress the image, redundancies must be exploited; for example, areas where there is little or no change between pixel values. Images with large areas of uniform colour therefore have large redundancies, and conversely images that have frequent and large changes in colour are less redundant and harder to compress.
The approximation subsignal shows the general trend of the pixel values, and the three detail subsignals show the vertical, horizontal and diagonal details or changes in the image. Ideally, during compression the number of zeros and the energy retention will both be as high as possible. However, as more zeros are obtained more energy is lost, so a balance between the two needs to be found.

Chapter 2


2.1 The need for Wavelets

Often the signals we wish to process are in the time domain, but in order to process them more easily other information, such as frequency, is required. A good analogy for this idea is given by Hubbard: multiplying two numbers written as Roman numerals. In order to do this calculation we would find it easier to first translate the numerals into our number system, carry out the multiplication, and then translate the answer back into a Roman numeral. The result is the same, but taking the detour into an alternative number system made the process easier and quicker. Similarly, we can take a detour into frequency space to analyse or process a signal.

2.1.1 Multiresolution and Wavelets

The power of wavelets comes from the use of multiresolution. High frequency parts of the signal use a small window to give good time resolution; low frequency parts use a big window to get good frequency information. As the frequency resolution gets better, the time resolution must get worse.

In Fourier analysis a signal is broken up into sine and cosine waves of different frequencies; it effectively rewrites the signal in terms of these sines and cosines. Wavelet analysis does a similar thing: it takes a mother wavelet, and the signal is then rewritten in terms of shifted and scaled versions of this mother wavelet.

Figure 2.1 The different transforms provide different resolutions of time and frequency.

2.1.2 The Continuous Wavelet Transform (CWT)

The continuous wavelet transform is the sum over all time of scaled and shifted versions of the mother wavelet. Calculating the CWT results in many coefficients C, which are functions of scale and translation. The translation, τ, is proportional to time information, and the scale, s, is proportional to the inverse of the frequency information. To find the constituent wavelets of the signal, the coefficients should be multiplied by the relevant version of the mother wavelet. The scale of a wavelet simply means how stretched it is along the x-axis; larger scales are more stretched:

Figure 2.2 The db8 wavelet shown at two different scales

The translation is how far the wavelet has been shifted along the x-axis. Figure 2.3 shows a wavelet; Figure 2.4 shows the same mother wavelet translated by k:

Figure 2.3 The untranslated wavelet.

Figure 2.4 The wavelet translated by k

The Wavelet Toolbox User's Guide [7] suggests five easy steps to compute the CWT coefficients for a signal:
1. Choose a wavelet, and compare this to the section at the start of the signal.
2. Calculate C, which should measure how similar the wavelet and the section of the signal are.
3. Shift the wavelet to the right by translation τ, and repeat steps 1 and 2, calculating values of C for all translations.
4. Scale the wavelet, and repeat steps 1-3.
5. Repeat step 4 for all scales.
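The five steps above amount to computing a correlation between the signal and a scaled, shifted wavelet. A minimal sketch in C, assuming a sampled signal and using the Haar function as a stand-in mother wavelet (the function names and the discretisation are our own illustration, not taken from the toolbox):

```c
#include <stddef.h>

/* Haar mother wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere.
   Any other mother wavelet could be substituted here. */
static double haar(double t) {
    if (t >= 0.0 && t < 0.5) return 1.0;
    if (t >= 0.5 && t < 1.0) return -1.0;
    return 0.0;
}

/* Steps 1-5: slide the wavelet across the signal (translation tau) and
   compute the similarity coefficient C at each shift, repeating for every
   scale:  C(scale, tau) = sum_n x[n] * psi((n - tau) / scale). */
double cwt_coefficient(const double *x, size_t n, double scale, double tau) {
    double c = 0.0;
    for (size_t i = 0; i < n; i++)
        c += x[i] * haar(((double)i - tau) / scale);
    return c;   /* a larger |C| means the wavelet matches this section better */
}
```

Looping `cwt_coefficient` over a grid of scales and translations yields the full coefficient array C(s, τ).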

Chapter 3

The wavelet transform is capable of providing time and frequency information simultaneously, hence it gives a time-frequency representation of the signal. When we are interested in what spectral components exist at a given instant of time, it can be very beneficial to know the time intervals in which those particular spectral components occur. For example, in EEGs, the latency of an event-related potential is of particular interest (an event-related potential is the response of the brain to a specific stimulus such as a flash of light; the latency of this response is the amount of time elapsed between the onset of the stimulus and the response).

Wavelets (small waves) are functions defined over a finite interval and having an average value of zero. The basic idea of the wavelet transform is to represent an arbitrary function f(t) as a superposition of a set of such wavelets or basis functions. These basis functions are obtained from a single mother wavelet by dilations or contractions (scaling) and translations (shifts). For pattern recognition the continuous wavelet transform is better suited. The Discrete Wavelet Transform of a finite-length signal x(n) having N components, for example, can be expressed by an N x N matrix, similar to the Discrete Cosine Transform (DCT).

3.1 WAVELET-BASED COMPRESSION

A digital image is represented as a two-dimensional array of coefficients, each coefficient representing the brightness level at that point. We can differentiate between more important coefficients and less important ones. Most natural images have smooth color variations, with the fine details represented as sharp edges in between the smooth variations. Technically, the smooth variations in color can be termed low frequency variations and the sharp variations high frequency variations. The basic difference between wavelet-based and Fourier-based techniques is that short-time Fourier-based techniques use a fixed analysis window, while wavelet-based techniques can be considered as using a short window for high spatial frequency data and a long window for low spatial frequency data. This makes the DWT more accurate in analyzing image signals at different spatial frequencies, and thus it can represent both smooth and dynamic regions in an image more precisely. The compressor consists of a forward wavelet transform, a quantizer and a lossless entropy encoder. The corresponding decompressor is formed by a lossless entropy decoder, a de-quantizer and an inverse wavelet transform. Wavelet-based image compression gives good compression results in both the rate and distortion sense.
3.2 PRINCIPLES OF IMAGE COMPRESSION

Image compression is different from binary data compression. When we apply techniques used for binary data compression to images, the results are not optimal. In lossless compression the data (binary data such as executables, documents etc.) are compressed such that when decompressed, they give an exact replica of the original data; they need to be reproduced exactly. Images, on the other hand, need not be reproduced 'exactly'.

In images the neighboring pixels are correlated and therefore contain redundant information. Before we compress an image, we first find the pixels which are correlated. The fundamental components of compression are redundancy and irrelevancy reduction. Redundancy means duplication, and irrelevancy means the parts of the signal that will not be noticed by the signal receiver, which is the Human Visual System (HVS). Three types of redundancy can be identified:
1) Spatial redundancy, i.e. correlation between neighboring pixel values.
2) Spectral redundancy, i.e. correlation between different color planes or spectral bands.
3) Temporal redundancy, i.e. correlation between adjacent frames in a sequence of images (in video applications).
Image compression focuses on reducing the number of bits needed to represent an image by removing the spatial and spectral redundancies.

3.3 PROCEDURE FOR COMPRESSING AN IMAGE

As in other image compression schemes, the wavelet method follows the same general procedure. For encoding:
i) Decomposition of the signal using filter banks.
ii) Downsampling of the filter bank output.
iii) Quantization of the downsampled coefficients.
iv) Encoding.
For decoding:
i) Upsampling.
ii) Recomposition of the signal with the filter bank.

3.4 PRINCIPLES OF USING A TRANSFORM AS SOURCE ENCODER

Before going into the details of wavelet theory and its application to image compression, we describe a few terms most commonly used in this report. Mathematical transformations are applied to signals to obtain further information that is not readily available in the raw signal. The wavelet transforms considered here are:
1) Discrete Wavelet Transform
2) Haar Wavelet Transform

3.5 DISCRETE WAVELET TRANSFORM

At the heart of the analysis (or compression) stage of the system is the forward discrete wavelet transform (DWT). A block diagram of a wavelet based image compression system is shown in Figure 3.1. Here, the input image is mapped from the spatial domain to a scale-shift domain. This transform separates the image information into octave frequency subbands. The expectation is that certain frequency bands will have zero or negligible energy content; information in these bands can then be thrown away or reduced, so that the image is compressed without much loss of information. The DWT coefficients are then quantized to achieve compression. Information lost during the quantization process cannot be recovered, and this impacts the quality of the reconstructed image. Due to the nature of the transform, DWT coefficients exhibit spatial correlation, which is exploited by quantization algorithms such as the embedded zero-tree wavelet (EZW) and set partitioning in hierarchical trees (SPIHT) coders for efficient quantization. The quantized coefficients may then be entropy coded; this is a reversible process that eliminates any remaining redundancy at the output of the quantizer.

Figure 3.1: A wavelet based image compression system.

In the synthesis (or decompression) stage, the inverse discrete wavelet transform recovers the original image from the DWT coefficients. In the absence of any quantization the reconstructed image will be identical to the input image. However, if any information was discarded during the

quantization process, the reconstructed image will only be an approximation of the original image; this is called lossy compression. The more an image is compressed, the more information is discarded by the quantizer, and the reconstructed image exhibits increasingly more artifacts. Certain integer wavelet transforms produce DWT coefficients that can be quantized without any loss of information. These give lossless compression, where the reconstructed image is an exact replica of the input image. However, the compression ratios achieved by these transforms are small compared to lossy transforms (e.g. 4:1 compared to 40:1).

3.6 2-D DISCRETE WAVELET TRANSFORM

A 2-D separable discrete wavelet transform is equivalent to two consecutive 1-D transforms. For an image, a 2-D DWT is implemented as a 1-D row transform followed by a 1-D column transform. Transform coefficients are obtained by projecting the 2-D input image x(u, v) onto a set of 2-D basis functions that are expressed as the product of two 1-D bases. The 2-D DWT (analysis) can be expressed as the set of equations shown below. The scaling function and the three wavelets (and the corresponding transform coefficients X(N, j, m), X(1)(N, j, m), X(2)(N, j, m) and X(3)(N, j, m)) correspond to the different subbands in the decomposition. X(N, j, m) are the coarse coefficients that constitute the LL subband. The X(1)(N, j, m) coefficients contain the vertical details and correspond to the LH subband. The X(2)(N, j, m) coefficients contain the horizontal details and correspond to the HL subband. The X(3)(N, j, m) coefficients represent the diagonal details in the image and constitute the HH subband. The histograms of the coefficients in these subbands illustrate the advantage of transform coding: the original 512 × 512 lighthouse image had few zero pixel values, whereas the transform concentrates most detail coefficients near zero. Thus a single-level decomposition at scale (N + 1) has four subbands of coefficients, as shown in Figure 3.2.


Figure 3.2: Single-level 2-D wavelet decomposition (analysis).

The synthesis bank performs the 2-D IDWT to reconstruct the image. The 2-D IDWT is

given in equation below.

A single stage of a 2-D filter bank is shown in Figure 3.3. First, the rows of the input image are filtered by the highpass and lowpass filters. The outputs from these filters are downsampled by

two, and then the columns of the outputs are filtered and downsampled to decompose the image into four subbands. The synthesis stage performs upsampling and filtering to reconstruct the original image. Multiple levels of decomposition are achieved by iterating the analysis stage on only the LL band.

Figure 3.3: One level filter bank for computation of 2-D DWT and IDWT. For l levels of decomposition, the image is decomposed into 3l + 1 subbands. Figure 3.4 shows the filter bank structure for three levels of decomposition of an input image, and Figure 3.5 shows the subbands in the transformed image.
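The separable row-then-column procedure can be sketched with a Haar filter pair on a toy image. The 4 × 4 size and the function names below are our own illustration; a real implementation would use the filter coefficients of the chosen wavelet:

```c
#include <stddef.h>

#define N 4   /* image side length; must be even */

/* One Haar analysis pass over a length-N line: the first half receives
   the averages (lowpass + downsample), the second half the differences
   (highpass + downsample). */
static void haar_1d(double *line) {
    double tmp[N];
    for (int i = 0; i < N / 2; i++) {
        tmp[i]       = (line[2*i] + line[2*i + 1]) / 2.0;  /* approximation */
        tmp[N/2 + i] = (line[2*i] - line[2*i + 1]) / 2.0;  /* detail */
    }
    for (int i = 0; i < N; i++) line[i] = tmp[i];
}

/* Separable 2-D DWT: a 1-D row transform followed by a 1-D column
   transform. Resulting quadrants: LL (top-left), HL, LH, HH. */
void dwt2d(double img[N][N]) {
    double col[N];
    for (int r = 0; r < N; r++) haar_1d(img[r]);           /* rows */
    for (int c = 0; c < N; c++) {                          /* columns */
        for (int r = 0; r < N; r++) col[r] = img[r][c];
        haar_1d(col);
        for (int r = 0; r < N; r++) img[r][c] = col[r];
    }
}
```

For a perfectly uniform image, every detail coefficient comes out zero and all the energy ends up in the LL quadrant, which is exactly the redundancy the quantizer exploits.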

Figure 3.4: Analysis section of a three level 2-D filter bank.


Figure 3.5: Three level wavelet decomposition of an image.

Chapter 4

4.1 THE HAAR WAVELET

The Haar wavelet algorithms presented here apply to time series whose number of samples is a power of two (e.g. 2, 4, 8, 16, 32, 64, ...). The Haar wavelet uses a rectangular window to sample the time series. The first pass over the time series uses a window width of two. The window width is doubled at each step until the window encompasses the entire time series. Each pass over the time series generates a new time series and a set of coefficients. The new time series is the average of the previous time series over the sampling window. The coefficients represent the average change in the sample window. For example, if we have a time series consisting of the values v0, v1, ..., vn-1, a new time series with half as many points is calculated by averaging the points in each window. On the first pass the window width is two, so two points are averaged:

for (i = 0; i < n; i = i + 2)
    s[i/2] = (v[i] + v[i+1]) / 2;

The 3-D surface below graphs nine wavelet spectrums generated from the 512 point AMAT close price time series. The x-axis shows the sample number, the y-axis shows the average value at that point and the z-axis shows log2 of the window width.


Figure 4.1: The Wavelet Spectrum

The wavelet coefficients are calculated along with the new average time series values. The coefficients represent the average change over the window. If the window's width is two this would be:

for (i = 0; i < n; i = i + 2)
    c[i/2] = (v[i] - v[i+1]) / 2;

The graph below shows the coefficient spectrums. As before, the z-axis represents log2 of the window width. The y-axis represents the time series change over the window width. Somewhat counter-intuitively, negative values mean that the time series is moving upward: vi is less than vi+1, so vi - vi+1 is less than zero. Positive values mean that the time series is going down, since vi is greater than vi+1. Note that the high frequency coefficient spectrum (log2(windowWidth) = 1) reflects the noisiest part of the time series; here the change between values fluctuates around zero.


Figure 4.2: Plot of the Haar coefficient spectrum. The surface plots the highest frequency spectrum at the front and the lowest frequency spectrum at the back. Note that the highest frequency spectrum contains most of the noise.
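The two averaging and differencing loops above, with the window width doubling on each pass, combine into a full in-place transform. A sketch in C, assuming a power-of-two series length as stated earlier (the function names are our own):

```c
#include <stddef.h>

/* One full pass: replace a length-len series by len/2 pairwise averages,
   writing the len/2 change coefficients into coeff. */
static void haar_pass(double *v, size_t len, double *coeff) {
    for (size_t i = 0; i < len; i += 2) {
        double avg = (v[i] + v[i + 1]) / 2.0;
        coeff[i/2] = (v[i] - v[i + 1]) / 2.0;
        v[i/2]     = avg;
    }
}

/* Full transform of a power-of-two length series, done in place: each
   pass halves the series (doubling the effective window width) until a
   single value remains. Afterwards v[0] is the overall average and
   v[1..n-1] holds the coefficients from coarsest to finest window. */
void haar_transform(double *v, size_t n) {
    double coeff[/* VLA */ 1];
    (void)coeff; /* placeholder; real buffer declared below */
    for (size_t len = n; len >= 2; len /= 2) {
        double c[len / 2];          /* C99 variable-length array */
        haar_pass(v, len, c);
        for (size_t i = 0; i < len / 2; i++) v[len/2 + i] = c[i];
    }
}
```

For example, {5, 1, 2, 8} becomes {3, 5} with coefficients {2, -3} after the first pass, then {4} with coefficient {-1}, giving the final array {4, -1, 2, -3}.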

4.2 FILTERING SPECTRUMS

The wavelet transform allows some or all of a given spectrum to be removed by setting its coefficients to zero. The signal can then be rebuilt using the inverse wavelet transform. Plots of the AMAT close price time series with various spectrums filtered out are shown here.

4.3 NOISE FILTERS

Each spectrum that makes up a time series can be examined independently. A noise filter can be applied to each spectrum, removing the coefficients that are classified as noise by setting them to zero.

4.4 WAVELETS VS. SIMPLE FILTERS

How do Haar wavelet filters compare to simple filters, like windowed mean and median filters? Whether a wavelet filter is better than a windowed mean filter depends on the application.
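The filtering in sections 4.2 and 4.3 relies on the transform being exactly invertible. A sketch of the inverse of one Haar pass (the name is our own): given the averages s and change coefficients c, each pair is rebuilt as v[2i] = s[i] + c[i] and v[2i+1] = s[i] - c[i]. Zeroing selected coefficients before inverting removes that part of the spectrum:

```c
#include <stddef.h>

/* Inverse of one Haar pass: rebuilds a length-len series from its len/2
   averages s and len/2 change coefficients c. With the original
   coefficients the reconstruction is exact; with coefficients zeroed,
   the rebuilt series is the filtered (smoothed) version. */
void haar_unpass(const double *s, const double *c, size_t len, double *v) {
    for (size_t i = 0; i < len / 2; i++) {
        v[2*i]     = s[i] + c[i];
        v[2*i + 1] = s[i] - c[i];
    }
}
```

With averages {3, 5} and coefficients {2, -3} this recovers {5, 1, 2, 8} exactly; with the coefficients zeroed it yields the smoothed series {3, 3, 5, 5}.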

The wavelet filter allows specific parts of the spectrum to be filtered. For example, the entire high frequency spectrum can be removed, or selected parts of the spectrum can be removed, as is done with the Gaussian noise filter. The scaling function (which uses the hn coefficients) produces N/2 elements that are a smoother version of the original data set. In the case of the Haar transform, these elements are a pairwise average. The last step of the Haar transform calculates one scaling value and one wavelet value; in the case of the Haar transform, the scaling value will be the average of the input data. The power of Haar wavelet filters is that they can be efficiently calculated and they provide a lot of flexibility. They can potentially leave more detail in the time series, compared to the mean or median filter. To the extent that this detail is useful for an application, the wavelet filter is a better choice.

4.5 LIMITATIONS OF THE HAAR WAVELET TRANSFORM

The Haar wavelet transform has a number of advantages:

It is conceptually simple. It is fast. It is memory efficient, since it can be calculated in place without a temporary array. It is exactly reversible, without the edge effects that are a problem with other wavelet transforms.

The Haar transform also has limitations, which can be a problem for some applications. In generating each set of averages for the next level and each set of coefficients, the Haar transform performs an average and difference on a pair of values, then shifts over by two values and calculates another average and difference on the next pair. The high frequency coefficient spectrum should reflect all high frequency changes, but the Haar window is only two elements wide. If a big change takes place across a pair boundary (from an even-numbered element to the next odd-numbered element), the change will not be reflected in the high frequency coefficients. For example, in the 64 element time series graphed below, there is a large drop between elements 16 and 17, and between elements 44 and 45.


Figure 4.3: High frequency coefficient spectrum

Chapter 5


Images require much storage space, large transmission bandwidth and long transmission time. The only way currently to improve on these resource requirements is to compress images, so that they can be transmitted more quickly and then decompressed by the receiver. In image processing there are 256 intensity levels (scales) of grey: 0 is black and 255 is white. Each level is represented by an 8-bit binary number, so black is 00000000 and white is 11111111. An image can therefore be thought of as a grid of pixels, where each pixel is represented by the 8-bit binary value of its grey-scale intensity.

Figure 5.1 Intensity level representation of image


The resolution of an image is measured in pixels per inch (so 500 dpi means that a pixel is 1/500th of an inch across). To digitise a one-inch square image at 500 dpi requires 8 x 500 x 500 = 2 million storage bits. Using this representation it is clear that image data compression is a great advantage if many images are to be stored, transmitted or processed.

According to [6], "Image compression algorithms aim to remove redundancy in data in a way which makes image reconstruction possible." This basically means that image compression algorithms try to exploit redundancies in the data; they calculate which data needs to be kept in order to reconstruct the original image, and therefore which data can be thrown away. By removing the redundant data, the image can be represented in a smaller number of bits, and hence can be compressed. During compression, however, energy is lost, because thresholding changes the coefficient values and hence the compressed version contains less energy.

But what is redundant information? Redundancy reduction is aimed at removing duplication in the image. According to Saha there are two different types of redundancy relevant to images:
(i) Spatial redundancy - correlation between neighbouring pixels.
(ii) Spectral redundancy - correlation between different colour planes and spectral bands.
Where there is high correlation, there is also high redundancy, so it may not be necessary to record the data for every pixel. There are two parts to the compression:
1. Find the image data properties: grey-level histogram, image entropy, correlation functions etc.
2. Find an appropriate compression technique for an image with those properties.

5.1 Image Data Properties relevant to wavelets

In order to make meaningful comparisons of different image compression techniques it is necessary to know the properties of the image. One property is the image entropy; a highly correlated picture will have a low entropy.
For example, a very low frequency, highly correlated image will be compressed well by many different techniques. A compression algorithm that is good for some images will not necessarily be good for all images; it would be better if we could say what the best compression technique would be given the type of image we have. One measure suggested by [6] is the information redundancy, r = b - He, where b is the smallest number of bits with which the image quantisation levels can be represented and He is the image entropy.

Information redundancy can only be evaluated if a good estimate of the image entropy is available, but this is usually not the case because some statistical information is not known. An estimate of He can be obtained from a grey-level histogram. If h(k) is the frequency of grey-level k in an image f of size M x N, then the probability P(k) can be estimated as P(k) = h(k) / (MN), giving the entropy estimate He = -Σk P(k) log2 P(k). The compression ratio is then K = b / He. (As an aside on resolution: in the DWT, downsampling removes every second sample, so the scale is doubled; the filtering improves the frequency resolution but reduces the time resolution.)

5.2 WAVELETS AND COMPRESSION

Wavelets are useful for compressing signals, but they also have far more extensive uses. They can be used to process and improve signals; in fields such as medical imaging, where image degradation is not tolerated, they are of particular use. They can be used to remove noise from an image: for example, if the noise occurs at very fine scales, wavelets can be used to cut out these fine scales, effectively removing the noise.

5.2.1 The Fingerprint example

The FBI have been using wavelet techniques in order to store and process fingerprint images more efficiently. The problem the FBI faced was that they had over 200 million sets of fingerprints, with up to 30,000 new ones arriving each day, so searching through them was taking too long. The FBI thought that computerising the fingerprint images would be a better solution; however, it was estimated that each fingerprint would use 600 Kbytes of memory, and worse, 2000 terabytes of storage space would be required to hold all the image data. Earlier attempts were unsatisfactory: trying to compress the images into less than 10% of their original size caused "tiling artefacts", leaving marked boundaries in the image. The FBI then turned to wavelets, adapting a technique to compress each image into just 7% of the original space; even more impressively, according to Kiernan [8], when the images are decompressed they show "little distortion". The basic steps used in the fingerprint compression were:
(1) Digitise the source image into a signal s.
(2) Decompose the signal s into wavelet coefficients w.

(3) Modify the coefficients in w, using thresholding, to obtain a sequence w'.
(4) Use quantisation to convert w' to a sequence q.
(5) Apply entropy encoding to compress q to e.
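Step (3), hard thresholding, can be sketched as follows (the function name is our own). Every coefficient whose magnitude falls below the threshold is zeroed; more zeros compress better under entropy coding, but each zeroed coefficient discards some energy, which is exactly the trade-off discussed earlier:

```c
#include <math.h>
#include <stddef.h>

/* Hard-threshold an array of wavelet coefficients in place, zeroing any
   whose magnitude is below t. Returns the total number of zeros in the
   result, a proxy for how compressible the sequence has become. */
size_t threshold_coeffs(double *w, size_t n, double t) {
    size_t zeros = 0;
    for (size_t i = 0; i < n; i++) {
        if (fabs(w[i]) < t) w[i] = 0.0;
        if (w[i] == 0.0) zeros++;
    }
    return zeros;
}
```

Raising t produces more zeros (better compression) at the cost of retained energy; choosing t per detail subsignal rather than globally is the local thresholding discussed in the conclusions.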

5.3 WAVELETS AND DECOMPRESSION

Whenever we compress an image, at the receiver end we have to decompress it to get the original form of the image. This is done by an image decompression model, whose main blocks are the channel decoder and the source decoder. The source decoder contains only two components, a symbol decoder and an inverse mapper; these blocks perform, in reverse order, the inverse operations of the source encoder's symbol encoder and mapper blocks. Because quantization results in irreversible information loss, an inverse quantizer block is not included in the general decoder model.

Chapter 6 SUMMARY

The importance of the threshold value for the energy level was something we did not appreciate before collecting the results. To be more specific, we understood that thresholding had an effect, but did not realise the extent to which thresholding could change the energy retained and the compression rates. Therefore when the investigation was carried out, more attention was paid to choosing the wavelets, images and decomposition levels than to the thresholding strategy.

ADVANTAGES:
1) Blocking artefacts in the transformed image can be avoided.
2) Multiresolution is possible with the wavelet transform.
3) It is well suited to image compression.

DISADVANTAGES:
1) Compared with the DCT (Discrete Cosine Transform) algorithm, performance is lower.
2) Blockiness can still occur after this transformation.

Chapter 7

The general aim of the project was to investigate how wavelets could be used for image compression. We believe that we have achieved this. We have gained an understanding of what wavelets are, why they are required and how they can be used to compress images. We understand the problems involved with choosing threshold values. While the idea of thresholding is simple and effective, finding a good threshold is not an easy task. We have also gained a general understanding of how decomposition levels, wavelets and images change the division of energy

between the approximation and detail subsignals. So we did achieve an investigation into compression with wavelets. Using global thresholding is not incorrect; it is a perfectly valid way to threshold. The problem is that global thresholding masked the true effect, of the decomposition levels in particular, on the results. This meant that the true potential of a wavelet to compress an image whilst retaining energy was not shown. The investigation then moved to local thresholding, which was better than global thresholding because each detail subsignal had its own threshold based on the coefficients that it contained. This meant it was easier to retain energy during compression. However, even better local thresholding techniques could be used. These techniques would be based on the energy contained within each subsignal rather than the range of coefficient values, and would use cumulative energy profiles to find the required threshold values. If the actual energy retained value is not important, and only a near-optimal trade-off is required, then a method called BayesShrink could be used. This method performs a denoising of the image which thresholds the insignificant details, and hence produces zeros, while retaining the significant energy. We were perhaps too keen to start collecting results in order to analyse them, when we should have spent more time considering the best way to go about the investigation. Having analysed the results, it is clear that the number of thresholds analysed (only 10 for each combination of wavelet, image and level) was not adequate to conclude which is the best wavelet and decomposition level to use for an image. There is likely to be an optimal value that the investigation did not find. So it was difficult to make quantitative predictions for the behaviour of wavelets with images; only the general trends could be investigated.
We feel, however, that the reason behind our problems with thresholding was that thresholding is a complex problem in general. Perhaps it would have been better to do more research into thresholding strategies and images prior to collecting results. This would not have removed the problem of thresholding, but it would have allowed us to make more informed choices and obtain more conclusive results. At the start of the project our aim was to discover the effect of decomposition levels, wavelets and images on the compression. We believe that we have discovered the effect each has. How well a wavelet can compact the energy of a subsignal into the approximation subsignal depends on the spread of energy in the image. An attempt was made to study image properties, but it is still unclear how to link image properties to a best wavelet basis function. Therefore an investigation into the best basis to use for a given image could be another extension.

Only one family of wavelets was used in the investigation, the Daubechies wavelets. However, there are many other wavelets that could be used, such as the Meyer, Morlet and Coiflet wavelets. Wavelets can also be used for more than just images; they can be used for other signals, such as audio signals, and for processing signals, not just compressing them. Compression and denoising are done in similar ways, so compressing a signal also performs a denoising of it. Overall we feel that we have achieved quite a lot given the time constraints, considering that before we could start investigating wavelet compression we first had to learn about wavelets and how to use VC++. We feel that we have learned a great deal about wavelets, compression and how to analyse images.

1. D. Donoho and I. Johnstone, "Adapting to Unknown Smoothness via Wavelet Shrinkage", JASA, v. 19, pp. 1200-1224, 1995.
2. A. Chambolle, R. DeVore, N. Lee and B. Lucier, "Nonlinear Wavelet Image Processing", IEEE Transactions on Image Processing, v. 7, pp. 319-335, 1998.
3. K. Castleman, "Digital Image Processing", Prentice Hall, 1996.
4. V. Cherkassky and F. Mulier, "Learning from Data", Wiley, N.Y., 1998.
5. V. Cherkassky and X. Shao, "Model Selection for Wavelet-based Signal Estimation", Proc. International Joint Conference on Neural Networks, Anchorage, Alaska, 1998.
6. Amir Said and William A. Pearlman, "A New Fast and Efficient Image Codec Based on Set Partitioning in Hierarchical Trees", IEEE Transactions on Circuits and Systems for Video Technology, vol. 6, pp. 243-250, June 1996.

7. Amir Said and William A. Pearlman, "An Image Multi-resolution Representation for Lossless and Lossy Image Compression", first submitted July 1994, published in IEEE Transactions on Image Processing, vol. 5, pp. 1303-1310, Sept. 1996.