
COMPARISON OF CBF, ANN AND SVM CLASSIFIERS FOR OBJECT BASED CLASSIFICATION OF HIGH RESOLUTION SATELLITE IMAGES

Krishna Mohan Buddhiraju and Imdad Ali Rizvi
Centre of Studies in Resources Engineering, Indian Institute of Technology Bombay, Mumbai 400076, INDIA.
ABSTRACT

Image classification is an important task for many aspects of global change studies and environmental applications. This paper analyses and compares advanced image classification techniques, namely Cloud Basis Function (CBF) neural networks, Artificial Neural Networks (ANN) and Support Vector Machines (SVM), for object based classification aimed at better accuracy. For comparison, adaptive Gaussian filtered images were also classified using an ANN and post-processed with a relaxation labelling process (RLP). The results are demonstrated on high spatial resolution remotely sensed images.

Index Terms: Object Based Image Classification, ANN, SVM, Radial Basis Functions, High Resolution Satellite Images

1. INTRODUCTION

Remotely sensed satellite image analysis is a challenging task considering the volume of data and the combination of channels in which the image is acquired. When applied to high resolution images, traditional techniques that classify images on a pixel by pixel basis suffer from radiometric differences between adjacent pixels, as well as noise due to short observation times and large radiometric resolution. Such images are information rich, containing spectral as well as textural, shape, contextual and topological information. Object based classification methods are therefore increasingly used to map land cover/use units from high resolution images, and the final result is often close to the way a human analyst would interpret the image. To deal with the complexity of high resolution images, the image is first segmented into homogeneous regions and a set of features is computed for each region (segment). These segments are then classified using one or more machine learning algorithms. The regions are described by their spectral, textural and shape attributes, and one can also incorporate topological relationships between neighbouring regions. The method comprises three steps: 1) image segmentation to extract regions from the pixel information based on homogeneity criteria; 2) calculation, for each region, of spectral parameters such as the mean vector and NDVI, textural descriptors such as co-occurrence statistics, and spatial/shape parameters such as aspect ratio, convexity and solidity; 3) classification of the image using the region feature vectors and suitable classifiers such as ANN and SVM (a short code sketch of the per-region feature computation is given at the end of this section). The remainder of the paper is organized as follows. Section 2 elaborates object based image segmentation and feature extraction. Section 3 presents the classifiers. Section 4 presents the experimental results and discussion. Section 5 summarizes the findings and points out avenues for future work.
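To make step 2 of this workflow concrete, the following is a minimal sketch of per-region feature extraction, assuming a label image from step 1 is already available. It uses Python with NumPy and scikit-image (the authors' own implementation was written in C/C++); the function name, the band indices for red and NIR, and the synthetic data are illustrative assumptions, not taken from the paper.

```python
import numpy as np
from skimage.measure import regionprops

def region_feature_vectors(labels, bands, red_idx=2, nir_idx=3):
    """labels: (H, W) int array of region ids (0 = background).
    bands:  (H, W, B) array of spectral bands."""
    feats = []
    for rp in regionprops(labels):
        mask = labels == rp.label
        mean_spectrum = bands[mask].mean(axis=0)          # per-band mean of the region
        red, nir = mean_spectrum[red_idx], mean_spectrum[nir_idx]
        ndvi = (nir - red) / (nir + red + 1e-9)            # region NDVI
        shape = [rp.area, rp.solidity, rp.eccentricity]    # simple shape descriptors
        feats.append(np.concatenate([mean_spectrum, [ndvi], shape]))
    return np.asarray(feats)

# Tiny synthetic example: a 4-band image with two labelled regions.
rng = np.random.default_rng(0)
bands = rng.random((64, 64, 4))
labels = np.zeros((64, 64), dtype=int)
labels[8:24, 8:24] = 1
labels[32:56, 30:60] = 2
X = region_feature_vectors(labels, bands)
print(X.shape)   # (2, 8): 4 band means + NDVI + area + solidity + eccentricity
```

With real segments in place of the synthetic labels, the resulting feature matrix can be fed to any of the classifiers discussed in Section 3.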
2. IMAGE SEGMENTATION AND FEATURE EXTRACTION

In the present study, region based segmentation is used. In order to reduce over-segmentation and the local variability between pixels caused by high spatial and radiometric resolution, the input images are smoothed with an adaptive Gaussian filter before segmentation.

2.1 Region Growing Using Morphological Watershed Transformation

Region growing algorithms take one or more pixels, called seeds, and grow regions around them based upon certain homogeneity criteria. The watershed transformation is a powerful tool based on this concept. Regions of terrain that drain to the same point are defined to be part of the same watershed, and the same analogy can be applied to images by viewing intensity as height. In this case, the image gradient is used to predict the direction of drainage in the image. By following the image gradient downhill from each point, the set of points (dark regions) that drain to each local intensity minimum can be identified; these disjoint regions are called the watersheds of the image. Similarly, the gradients can be followed uphill to the local intensity maxima, defining the inverse watersheds (bright regions) of the image [1], [2]. After the input image is smoothed, watershed segmentation starts by locating the regional maxima in the image. The regional maxima are flat zones (maximal connected components of a grey scale image with the same pixel value) surrounded by flat zones of strictly lower grey values. To avoid producing too many regions, the process first selects marker (seed) pixels that correspond to minima of the local intensity gradient. Using the markers as seeds, the regions are grown by simulating a flooding of the terrain, i.e., adjacent pixels are added to growing regions, and region boundaries occur where two growing floods (regions) meet. The regions formed in the above steps are labelled using connected component labelling so that each region can be indexed and accessed independently [3]. Once each region is uniquely labelled, its shape, size and average grey level / spectral / textural properties can be computed. This makes it possible to deal with the image in an object oriented manner rather than on a per-pixel basis. A minimal implementation sketch of this segmentation step is given below.
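The smoothing, gradient computation, marker selection and flooding described above can be sketched as follows. This is a minimal illustration using SciPy and scikit-image, not the authors' implementation: the fixed-sigma Gaussian (standing in for the paper's adaptive filter), the peak-based marker selection and all parameter values are assumptions made for the example.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import gaussian, sobel
from skimage.feature import peak_local_max
from skimage.segmentation import watershed

def watershed_regions(gray, sigma=2.0, min_marker_distance=10):
    smoothed = gaussian(gray, sigma=sigma)               # suppress local variability
    gradient = sobel(smoothed)                           # "terrain" surface to flood
    # Markers: local minima of the gradient, found as maxima of its negation.
    marker_coords = peak_local_max(-gradient, min_distance=min_marker_distance)
    markers = np.zeros(gradient.shape, dtype=int)
    markers[tuple(marker_coords.T)] = np.arange(1, len(marker_coords) + 1)
    # Flood the gradient surface from the markers; boundaries form where floods meet.
    return watershed(gradient, markers)

# Synthetic smooth test image; a real single-band or intensity image would be used instead.
rng = np.random.default_rng(1)
image = ndi.gaussian_filter(rng.random((128, 128)), 4)
labels = watershed_regions(image)
print(labels.max(), "regions")
```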


2.2 Region Parameters

While it is possible to compute a large number of parameters for each region, some of the prominent ones are: 1) size; 2) spatial moments; 3) roundness, convexity and solidity; 4) mean vector; 5) NDVI; 6) texture statistics (entropy, contrast, angular second moment, etc.); 7) length/width ratio; 8) area.

3. IMAGE CLASSIFICATION

Once the parameters for each region are generated, any classifier can be used, but it is often appropriate to use distribution-free classifiers such as neural networks [4], [5] and support vector machines [4]. We experimented with ANNs using radial basis function (RBF) and cloud basis function (CBF) kernels, and with SVMs.

3.1 Radial Basis Function Neural Networks

In its most basic form, an RBF neural network consists of three layers with entirely different roles. The input layer is made up of source nodes (sensory units) that connect the network to its environment. The second layer, the only hidden layer in the network, applies a nonlinear transformation from the input space to the hidden space, which is in general of high dimensionality. The output layer is linear, supplying the response of the network to the activation pattern applied to the input layer. The following parameters were used while implementing the RBF ANN (a simplified sketch of such a network follows this list):
Number of input nodes = 7
Number of output nodes = 9
Number of hidden layers = 1
Number of nodes in the hidden layer = 8
Learning rate = 0.85
Momentum = 0.5
Normalization factor for patterns = 255
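For illustration, a compact RBF network with the dimensions listed above (7 inputs, 8 Gaussian hidden units, 9 output classes) can be written as follows. The paper trains the network by gradient descent with momentum; this sketch instead uses the common k-means-centres plus least-squares formulation, so it is a stand-in under that assumption rather than the authors' exact training procedure, and the training data shown are random placeholders.

```python
import numpy as np
from scipy.cluster.vq import kmeans2

class RBFNet:
    def __init__(self, n_centers=8, width=1.0):
        self.n_centers, self.width = n_centers, width

    def _phi(self, X):
        # Gaussian activations: phi_j(x) = exp(-||x - c_j||^2 / (2 * width^2))
        d2 = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2.0 * self.width ** 2))

    def fit(self, X, y, n_classes):
        # Hidden layer: centres chosen by k-means over the training features.
        self.centers, _ = kmeans2(X, self.n_centers, minit="points")
        # Linear output layer solved by least squares against one-hot targets.
        Phi = self._phi(X)
        T = np.eye(n_classes)[y]
        self.W, *_ = np.linalg.lstsq(Phi, T, rcond=None)
        return self

    def predict(self, X):
        return (self._phi(X) @ self.W).argmax(axis=1)

# Toy usage with the dimensions above: 7 region features, 8 hidden units, 9 classes.
rng = np.random.default_rng(0)
X = rng.random((135, 7))
y = rng.integers(0, 9, size=135)
model = RBFNet(n_centers=8, width=0.5).fit(X, y, n_classes=9)
print("training accuracy:", (model.predict(X) == y).mean())
```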

3.2 Cloud Basis Function (CBF) Neural Networks

A drawback of RBF (kernel) based classifiers is that the class boundaries assumed by radial basis function networks are spherical, which is not true for the majority of real data. The boundaries of the classes vary in shape, leading to reduced accuracy. The cloud basis functions (CBFs) use a different feature weighting, derived to emphasize features relevant to class discrimination [5]. Further, these basis functions are designed to have multiple boundary segments rather than the single boundary of an RBF. These enhancements to the basis functions, together with a suitable training algorithm, allow the neural network to better learn class-specific properties. The following parameters were used while implementing the CBF NN:
Total number of training samples = 135
Maximum number of training iterations = 10
Maximum number of iterations of the iterative gradient descent used to update the scale factors in each training iteration = 2
Learning rate of the iterative gradient descent used to update the scale factors in each training iteration = 0.2
Maximum number of neighbouring functions for each basis function = 7 to 11
Maximum number of misclassified samples beyond which a new basis function is added to the network = 10

3.3 Support Vector Machines (SVM)

The support vector machine (SVM) is among the most powerful machine learning algorithms and is based on statistical learning theory; a number of publications detail its mathematical formulation and algorithm development [6], [7]. The inductive principle behind the SVM is structural risk minimization (SRM), which constructs a hyperplane between two classes such that the distance from the support vectors to the hyperplane is maximized. To deal with classes that are not linearly separable, the input data are first mapped by a kernel to a higher dimensional space; the radial basis function (RBF) kernel is popularly used. The object based SVM was implemented using the open source library LibSVM (version 2.91). The classification accuracy obtained with the RBF kernel depends on two parameters:
i. C > 0, the penalty parameter of the error term;
ii. gamma, the RBF kernel parameter.
It is not known beforehand which C and gamma are best for a given problem; consequently some kind of model selection (parameter search) must be done. The goal is to identify a good (C, gamma) pair so that the classifier can accurately predict unknown (testing) data. A sketch of such a parameter search is given below.
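The (C, gamma) search described above can be reproduced with a plain grid search and k-fold cross-validation. The sketch below uses scikit-learn's SVC, which wraps LibSVM, rather than the LibSVM tools the authors used directly; the parameter grid and the random placeholder data are assumptions for illustration only.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV

rng = np.random.default_rng(0)
X = rng.random((135, 7))                    # region feature vectors (placeholder data)
y = rng.integers(0, 9, size=135)            # 9 land-cover classes

param_grid = {"C": 2.0 ** np.arange(-3, 6),        # coarse log2 grid for C
              "gamma": 2.0 ** np.arange(-5, 4)}    # and for the RBF width gamma
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)   # five-fold cross-validation
search.fit(X, y)
print(search.best_params_, search.best_score_)
```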


In our case the best values were found to be C = 2.0 and gamma = 2.0, with a five-fold cross-validation accuracy of 89.62%.

3.4 Relaxation Labelling Process (RLP)

Common classification methods based on pixel level information leave some pixels wrongly classified. The accuracy of pixel based classification can be further improved by incorporating the contextual information embedded in the neighbourhood of each pixel. The Relaxation Labelling Process (RLP) is one such approach; it was investigated in the 1980s but, owing to its high computational demands, it has not been very popular in remote sensing. RLP requires initial class likelihoods to begin the refinement process. These probabilities are then updated iteratively using the rule proposed by Rosenfeld et al. [8], which combines the neighbourhood supports and label probabilities to move from the kth iteration to the (k+1)th iteration:
p_i^{(k+1)}(\lambda) = \frac{p_i^{(k)}(\lambda)\,[1 + q_i^{(k)}(\lambda)]}{\sum_{\lambda'} p_i^{(k)}(\lambda')\,[1 + q_i^{(k)}(\lambda')]}    (1)

q_i^{(k)}(\lambda) = \frac{1}{|N(i)|} \sum_{j \in N(i)} q_{ij}^{(k)}(\lambda)    (2)

q_{ij}^{(k)}(\lambda) = \sum_{\lambda'} r_{ij}(\lambda, \lambda')\, p_j^{(k)}(\lambda')    (3)
where p_i^{(k)}(\lambda) denotes the probability that pixel i belongs to class \lambda in the kth iteration, q_{ij}(\lambda) denotes the support that pixel i receives from neighbour j for label \lambda, q_i(\lambda) denotes the total neighbourhood support, and |N(i)| is the cardinality of the set of neighbours of i. In equation (3), r_{ij}(\lambda, \lambda') denotes the compatibility coefficients, i.e., the compatibility of label \lambda at pixel i co-occurring with label \lambda' at the neighbouring pixel j. The input image was smoothed as in the case of region classification, after which it was classified using the neural network. In the present study, the initial class probabilities were generated by normalizing the responses of the nodes of the output layer of the neural network classifier. The Canny operator was employed to derive the edges of the image, which were later used to define an edge pixel constraint while updating the label likelihoods: when a pixel lies on an edge between two regions, the label likelihoods are updated with reduced neighbour influence so that one class does not overgrow into another. It was found that, compared to simple per-pixel classification, smoothing followed by classification and relaxation labelling was significantly better, yet not as good as the region segmentation-classification approach described above. The use of the context associated with neighbouring pixels makes the classification task easier and produces more reliable results. A sketch of one relaxation labelling iteration is given below.
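The following is a vectorised sketch of one RLP iteration implementing equations (1)-(3). The neighbourhood structure, the compatibility matrix and the initial probabilities are all synthetic placeholders; in practice the initial probabilities would come from the normalized neural network outputs described above, and the edge pixel constraint is omitted for brevity.

```python
import numpy as np

def rlp_step(p, neighbors, r):
    """p: (N, L) label probabilities; neighbors: (N, K) neighbour indices;
    r: (L, L) compatibility matrix r(lambda, lambda') with values in [-1, 1]."""
    # Eq. (3): q_ij(lambda) = sum_lambda' r(lambda, lambda') * p_j(lambda')
    q_ij = p[neighbors] @ r.T                 # shape (N, K, L)
    # Eq. (2): average neighbourhood support q_i(lambda), i.e. divide by |N(i)|
    q_i = q_ij.mean(axis=1)                   # shape (N, L)
    # Eq. (1): multiplicative update followed by renormalisation over the labels
    p_new = p * (1.0 + q_i)
    return p_new / p_new.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
N, L, K = 100, 9, 4                           # pixels, labels, neighbours per pixel
p = rng.random((N, L)); p /= p.sum(1, keepdims=True)
neighbors = rng.integers(0, N, size=(N, K))
r = rng.uniform(-1, 1, size=(L, L))
for _ in range(10):
    p = rlp_step(p, neighbors, r)
print(p.sum(1)[:3])                           # rows remain valid probability distributions
```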

4. RESULTS AND DISCUSSION

For the purpose of this study, we implemented object based RBF NN, CBF NN and SVM classifiers in order to compare their performance. All the algorithms were implemented in C/C++ on a Windows machine with a 2.8 GHz Pentium processor. The methodology was tested on a QuickBird window (2000 x 2000 pixels) of an urban fringe area comprising a few buildings, a quarry site, ponds, a road, vegetation and footpaths. This image is shown in Fig. 1.

Fig. 1. High resolution image used as the study area

The image was classified into 9 prominent classes covering a majority of the land cover features, as shown in the legend of Fig. 2, and accuracy and error statistics were computed for each classifier. Fig. 2 depicts the output of object based classification using the CBF NN; it clearly indicates that object based classification is not a universal remedy, as some regions are evidently misclassified. For example, the roads and buildings, or the grass and trees, are spectrally similar and have a significant amount of spectral overlap.

Fig. 2. Classified image using CBF NN

Fig. 3. Classified image (ANN + RLP)

This is the primary reason for the large number of misclassifications between these classes; similarly, part of a lake is classified as a pool, an entire lake is classified as shadow, and so on, owing to the spectral closeness of these regions. In order to reduce such misclassification we need to take into consideration the ancillary (contextual) information available about the image: for example, if a region is classified as a shadow, then there has to be a tall structure in the vicinity of the shadow. Accordingly, some improvement can be observed in Fig. 3, which shows the output of pixel based classification using the RBF NN after RLP.


A summary of the accuracy and error statistics of all the classifiers can be found in Table 1. The object based classifier using CBF outperforms the other kernel based NN classifiers in overall accuracy. The kappa coefficient of the RBF classifier, 0.8491, is comparatively low, indicating that the RBF method is still unsatisfactory for classifying these remotely sensed images, whereas for CBF it is 0.8789. Table 2 shows the accuracy of the ANN classified image before and after RLP is applied.

5. CONCLUSION

This paper studies and compares ANN, SVM and CBF classifiers for object based image classification. Object based image analysis greatly reduced the salt-and-pepper effect in the classified image without adversely affecting classification accuracy, which greatly improves the visual quality of the classified image. The study shows that the CBF NN improves the classification accuracy. It also emphasizes that the kernel type and kernel parameters affect the shape of the decision boundaries and thus influence the performance of the SVM. The neural network classifier trained using the standard back-propagation algorithm produced marginally better results than the other methods, although the cloud basis function, being a relatively new technique in the remote sensing arena, requires further study. A combined approach using object based methods together with the contextual information available about the image seems promising and needs further exploration.

Table 1. Comparison of Classification Accuracies
Classes            ANN (CBF)            ANN (RBF)            SVM (RBF)
                   Cons.     Prod.      Cons.     Prod.      Cons.     Prod.
Vegetation 1       0.9808    0.8750     0.8974    0.9127     1.0000    1.0000
Water              0.8105    0.8045     1.0000    1.0000     1.0000    0.8571
Building 1         1.0000    0.8750     0.8473    0.8494     0.8889    0.8750
Open Area 1        0.9000    0.8750     0.9308    0.8242     1.0000    0.8750
Vegetation 2       0.8954    1.0000     0.5064    0.6873     0.7778    0.6667
Shadow             0.9000    1.0000     1.0000    0.9711     1.0000    1.0000
Building 2         0.7988    0.8873     1.0000    1.0000     0.8571    0.8750
Road               0.9059    0.7660     0.5760    0.6388     0.8750    1.0000
Open Area 2        0.8514    0.8000     0.5395    0.6445     0.6364    0.6667
Accuracy                 0.8764               0.8542               0.8962
Kappa Coefficient        0.8789               0.8491               0.8735
(Cons. = Consumer's Accuracy, Prod. = Producer's Accuracy)

Table 2. Classification Accuracies (%)

Classes            ANN       ANN+RLP
Vegetation 1       100       100
Vegetation 2       90        90
Water              85.18     85.18
Shadows            94.44     96.29
Buildings 1        91.67     91.67
Buildings 2        100       100
Open Area 1        91.14     95.83
Road               84.84     84.84
Open Area 2        100       100
Overall Accuracy   90.67     92.66
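For reference, the overall accuracy and kappa coefficient reported in the tables can be computed from a classifier's confusion matrix as sketched below; the matrix used here is a small made-up example, not the paper's data.

```python
import numpy as np

def accuracy_and_kappa(cm):
    cm = np.asarray(cm, dtype=float)
    n = cm.sum()
    po = np.trace(cm) / n                        # observed agreement (overall accuracy)
    pe = (cm.sum(0) * cm.sum(1)).sum() / n ** 2  # chance agreement from the marginals
    return po, (po - pe) / (1.0 - pe)            # Cohen's kappa

cm = np.array([[50,  2,  1],
               [ 3, 40,  4],
               [ 0,  5, 45]])
acc, kappa = accuracy_and_kappa(cm)
print(round(acc, 4), round(kappa, 4))
```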

REFERENCES

[1] F. Meyer and S. Beucher, "Morphological segmentation," Journal of Visual Communication and Image Representation, vol. 1, pp. 21-46, 1990.
[2] P. Soille, Morphological Image Analysis, Springer-Verlag, Berlin, 2003.
[3] L. G. Shapiro and G. C. Stockman, Computer Vision, Prentice Hall, NJ, 2001.
[4] J. Han, S. Lee, K. Chi and K. Ryu, "Comparison of neuro-fuzzy, neural network, and maximum likelihood classifiers for land cover classification using IKONOS multispectral data," Proc. IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2002), pp. 3471-3473, 2002.
[5] C. R. De Silva, S. Ranganath and L. C. De Silva, "Cloud basis function neural network: A modified RBF network architecture for holistic facial expression recognition," Pattern Recognition, vol. 41, pp. 1241-1253, 2008.
[6] V. N. Vapnik, The Nature of Statistical Learning Theory, Springer-Verlag, New York, 1995.
[7] C. J. C. Burges, "A tutorial on support vector machines for pattern recognition," Data Mining and Knowledge Discovery, vol. 2, no. 2, pp. 121-167, 1998.
[8] A. Rosenfeld, R. Hummel and S. Zucker, "Scene labeling by relaxation operations," IEEE Trans. Systems, Man and Cybernetics, vol. SMC-6, pp. 420-433, 1976.

