Images are everywhere. No wonder, since we as human beings rely on the images we perceive with our eyes more than any other sensory stimulus. Almost all of the information we digest comes to us in the form of an image; whether we look at a photograph, watch television, admire a painting, or read a book, it all makes use of imagery. Images are so natural to us that we go to great lengths to convert almost any kind of information to images. For example: the TV weather forecast shows the temperature distribution in some geographical area as an image with different colors representing different temperatures, medical scanners can show human metabolic activity as an image where bright spots indicate high activity, etc. Moreover, our vision is usually the most efficient of our senses: consider, for example, a computer keyboard. The function of each key is represented by a small image (a character). We could also have identified each key by a specific relief texture, but it would be far less efficient. We could even try to give each key a specific smell, but it is easy to imagine (!) the trouble we would have typing.
We are also adept at a number of image processing tasks. For example, the focusing of our eyes: when we look at something, the first image our eyes send to the brain is probably out of focus. The brain then tries to correct this by adjusting the eye lens, whereupon a new image is sent from the eyes to the brain, and so on. This feedback process is so fast that we aren't even aware of it. Another example is stereo vision: our eyes send two two-dimensional images to the brain, but the brain is able to merge these into one three-dimensional image virtually instantaneously.
The science of image processing combines this natural way humans use images with the science of mathematics. This provides a unique blend, since images and image processing are described in a rigorous mathematical way without loss of the intuitive character of imagery.
Introduction
Image processing can be defined as the manipulation and analysis of information contained in images. This definition is of course very broad, and includes a wide range of natural and artificial processes, from the use of a pair of glasses to automatic analysis of images transmitted by the Hubble telescope. Simple forms of image processing can be found all around us; examples include:
• the use of glasses or contact lenses
• brightness, contrast, etc. controls on a television or monitor
• taking (and developing) a picture using a photocamera
• natural examples: reflection of scenery on a water surface, distortions of scenery in the mist, a fata morgana, etc.
Examples of the use of advanced image processing include:
• forensic science: enhancement of images of video surveillance cameras, automatic recognition and classification of faces, fingerprints, DNA codes, etc. from images
• industry: checking of manufactured parts, application to CAD/CAM
• information processing: reading of handwritten and printed texts (frequently referred to as OCR, optical character recognition), scanning and classification of printed images
A large number of applications of image processing can be found in the medical sciences, using one or more medical images of a patient, e.g.,
• Visualization. For example: before we can make a 3D visualization of a three-dimensional object (such as the head in figure 1.1), we often need to extract the object information first from two-dimensional images.
• Computer aided diagnosis. For example: in western countries it is common to regularly make breast radiographs of women above a certain age in order to detect breast cancer at an early stage. In practice, the number of images involved is so large that it would be very helpful to do part of the screening by means of automated computer image processing.
• Image segmentation, i.e., the division of an image into meaningful structures. For example: the division of a brain image into structures like white brain matter, grey brain matter, cerebrospinal fluid, bone, fat, skin, etc. An example can be seen in figure 1.2. Segmentation is useful in many tasks, ranging from improved visualization to the monitoring of tumor growth.
• Image registration (also called image matching), i.e., the exact alignment of two or more images of the same patient, which is necessary if the information contained in these images is to be combined in a meaningful new image. An example is shown in figure 1.3.
Figure 1.1 Example of extraction of information by image processing: the head visualized in three dimensions on the left was extracted from two-dimensional images such as the one on the right.
Figure 1.2 Example of segmentation: the brain on the left was segmented into six structures (indicated by different grey values in the right image) by an automatic algorithm.
Figure 1.3 Example of two medical images that have been brought into registration: the upper left and lower right quarters belong to the first image, the lower left and upper right belong to the second image. Visualizations like these enable the viewer to make use of the information contained in the separate images simultaneously.
00 01 00 00 00 00 00 00
00 00 00 04 3d 0a 00 00
01 46 4a 4c 4c 4d 48 01
01 4a 5b 4c 4c 4c 50 02
01 45 4c 4c 4b 4d 4d 10
03 7f 63 4c 4c 4d 57 00
01 01 00 4d 49 17 01 01
00 01 01 00 00 00 00 01
Figure 1.4 Example of sampling and quantization. The left image shows an original scene with continuous intensity values everywhere. After sampling using 8 × 8 locations (middle image), the image has real intensity values at specific locations only. After quantization (right image), these real values are converted to discrete numbers (here: hexadecimal values).
task. Moreover, the use of analog hardware is rapidly becoming obsolete in many image processing areas, since it can often be replaced by digital hardware (computers), which is much more flexible in its use.
But what exactly is a digital image? Obviously, we start with some kind of imaging device such as a video camera, a medical scanner, or anything else that can convert a measure of a physical reality to an electrical signal. We assume this imaging device produces a continuous (in space) electrical signal. Since such an analog signal cannot directly be handled by digital circuits or computers, the signal is converted to a discretized form by a digitizer. The resulting image can then directly be used in digital image processing applications. The digitizer performs two tasks, known as sampling and quantization (see figure 1.4). In the sampling process, the values of the continuous signal are sampled at specific locations in the image. In the quantization process, the real values are discretized into digital numbers. After quantization we call the result a digital image. So this answers the question at the beginning of this section: a digital image is nothing more than a matrix of numbers. Each matrix element, i.e., a quantized sample, is called a picture element or pixel. In the case of three-dimensional images this is named a volume element or voxel.
We can indicate the location of each pixel in an image by two coordinates (x, y). By convention, the (0, 0) pixel (the origin) is in the top left corner of the image, the x axis runs from left to right, and the y axis runs from top to bottom (see figure 1.5). This may take a little getting used to, because this differs from the conventional mathematical notation of two-dimensional functions¹, as well as from conventional matrix coordinates².
¹ Where the origin is at the bottom left, and the y axis runs from bottom to top.
² Where the origin is at the top left, the x axis runs from top to bottom, and the y axis from left to right.
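The two tasks of the digitizer can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `scene` is a hypothetical stand-in for the continuous signal (real digitizers work on an electrical signal, not on a formula), and the names `sample` and `quantize`, the 8 × 8 grid, and the 256 quantization levels are choices made here to mimic figure 1.4.

```python
import math

def scene(x, y):
    """Hypothetical continuous scene: intensity in [0.0, 1.0] for x, y in [0.0, 1.0]."""
    return 0.5 + 0.5 * math.sin(2 * math.pi * x) * math.cos(2 * math.pi * y)

def sample(f, width, height):
    """Sampling: evaluate the continuous signal at the centres of a width x height grid."""
    return [[f((col + 0.5) / width, (row + 0.5) / height) for col in range(width)]
            for row in range(height)]

def quantize(samples, levels=256):
    """Quantization: map each real value in [0, 1] to an integer in 0 .. levels - 1."""
    return [[max(0, min(levels - 1, int(v * levels))) for v in row]
            for row in samples]

samples = sample(scene, 8, 8)        # real values at specific locations only
digital_image = quantize(samples)    # a matrix of discrete numbers

# Print the digital image as hexadecimal values, in the style of figure 1.4.
for row in digital_image:
    print(" ".join(f"{v:02x}" for v in row))
```

After `sample` the image still holds real numbers; only after `quantize` is every element a discrete number, i.e., a pixel of a digital image.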
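[Figure 1.5 graphic: pixel grid with pixel (0, 0) in the top left corner, the x axis along the top and the y axis along the left side.]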
Figure 1.5 Coordinate conventions for images. Pixel (3, 1) is marked in grey.
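The difference between image coordinates and matrix coordinates is easy to get wrong in code, so a small sketch may help. It stores the quantized values of figure 1.4 as a list of rows (assuming the rows are listed from top to bottom) and uses a hypothetical helper `pixel` that takes image coordinates (x, y). Because the matrix is stored row by row, the row index is y and the column index is x.

```python
# Quantized values from figure 1.4, stored row by row (top row first).
image = [
    [0x00, 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00],
    [0x00, 0x00, 0x00, 0x04, 0x3d, 0x0a, 0x00, 0x00],
    [0x01, 0x46, 0x4a, 0x4c, 0x4c, 0x4d, 0x48, 0x01],
    [0x01, 0x4a, 0x5b, 0x4c, 0x4c, 0x4c, 0x50, 0x02],
    [0x01, 0x45, 0x4c, 0x4c, 0x4b, 0x4d, 0x4d, 0x10],
    [0x03, 0x7f, 0x63, 0x4c, 0x4c, 0x4d, 0x57, 0x00],
    [0x01, 0x01, 0x00, 0x4d, 0x49, 0x17, 0x01, 0x01],
    [0x00, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00, 0x01],
]

def pixel(img, x, y):
    """Value at image coordinates (x, y): origin top left, x to the right, y downward."""
    return img[y][x]  # row index is y, column index is x

# Image convention: x = 1, y = 5 is the second pixel of the sixth row.
print(f"{pixel(image, 1, 5):02x}")

# Matrix convention: image[1][5] is row 1, column 5, a different element.
print(f"{image[1][5]:02x}")
```

Swapping the two indices silently reads a different pixel, which is why it is worth wrapping the convention in one small function.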
If a digital image is nothing more than a matrix of numbers, a very sceptical person might say that digital image processing is nothing more than a collection of mathematical algorithms that operate on a matrix. Fortunately, reality is not nearly as boring as this sounds, because in practice we will seldom use the matrix representation shown in figure 1.4. Instead, we work with the middle image from figure 1.4, which is in fact the same matrix with an intensity value assigned to each number, and which usually makes much more sense to a human being. Throughout this book, you will find that image processing algorithms will be formulated as mathematical operators working on pixel values or pixel matrices, but the results of these algorithms will also be displayed using images.
Further on in this book, we will come across images with more than two dimensions. The mathematical representations will be logical extensions of the ones above for two-dimensional images. For example, a movie can be regarded as a time series of two-dimensional images (frames), so a pixel value at location (x, y) of frame t can be represented by, e.g., f(x, y, t). Another example is given by some medical images that have three spatial dimensions, and are therefore represented as, e.g., f(x, y, z). If we want to visualize such three-dimensional images we often use volume renderings or simply display the three-dimensional image one two-dimensional slice at a time (see figure 1.1).
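As a rough sketch of how such a representation might look in code, the example below builds a small three-dimensional image f(x, y, z) as nested lists and walks through it one two-dimensional slice at a time. The names `make_volume`, `f` and `slice_at`, the storage order `volume[z][y][x]`, and the synthetic voxel values are all assumptions made for illustration. A movie f(x, y, t) could be stored the same way, with the frame number t in place of z.

```python
def make_volume(width, height, depth):
    """Hypothetical 3D image with voxel value x + 10*y + 100*z, stored as volume[z][y][x]."""
    return [[[x + 10 * y + 100 * z for x in range(width)]
             for y in range(height)]
            for z in range(depth)]

def f(volume, x, y, z):
    """Voxel value at location (x, y, z)."""
    return volume[z][y][x]

def slice_at(volume, z):
    """The two-dimensional image (list of rows) at depth z."""
    return volume[z]

volume = make_volume(4, 3, 2)

print(f(volume, 3, 1, 1))  # 3 + 10*1 + 100*1 = 113

# Display the volume one two-dimensional slice at a time.
for z in range(len(volume)):
    print(f"slice z = {z}")
    for row in slice_at(volume, z):
        print(row)
```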
You could argue that they are all part of image processing, given the broad definition we gave to image processing. A (debatable) graph showing the relations between image processing and other disciplines can be seen in figure 1.6.
Figure 1.6 The relation of image processing and various connected disciplines.
Here, the various disciplines are defined by their actions regarding images and models, where the model is usually a compact mathematical structure describing a phenomenon using its essential parameters. For example, the image could be a photograph of a face, and the model a graph of facial feature points such as the corners of the eyes, the tip of the nose, etc., and the model parameters could be the distances between these points. A typical computer vision task would be to find the feature points in the image and match them to the model, e.g., for face recognition. A typical computer graphics task would be to generate an image of a face given the facial model and a set of distances. A typical image processing task would be to produce a new image by somehow enhancing the photograph, e.g., by improving the contrast, red eye removal, etc. Although the disciplines mentioned in the graph all overlap to some extent, such overlap is small in the case of both computational geometry and computer graphics. This is not the case for image processing, computer vision, and image analysis, and attempting to find the borders between these areas often leads to artificial results. In this book, we will therefore not attempt it and simply consider all three to be image processing.
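To make the idea of "producing a new image from an image" concrete, here is a minimal sketch of one possible contrast enhancement. The text above does not prescribe a method; linear contrast stretching is chosen here only as a simple illustration, and the function name `stretch_contrast` and the 0 to 255 output range are assumptions. The input and the output are both matrices of pixel values.

```python
def stretch_contrast(img, out_max=255):
    """Linearly rescale pixel values so the darkest becomes 0 and the brightest out_max."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    if hi == lo:
        # A completely flat image has no contrast to stretch; return a copy.
        return [row[:] for row in img]
    return [[(v - lo) * out_max // (hi - lo) for v in row] for row in img]

# A dull image that uses only the values 50 .. 150.
dull = [[50, 100],
        [150, 100]]

enhanced = stretch_contrast(dull)
print(enhanced)  # [[0, 127], [255, 127]]
```

The result is again an image, in contrast with a computer vision task, which would produce model parameters from an image.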