IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 45, NO. 5, MAY 2007
Classification of High Spatial Resolution Imagery Using Improved Gaussian Markov Random-Field-Based Texture Features

Yindi Zhao, Liangpei Zhang, Member, IEEE, Pingxiang Li, Member, IEEE, and Bo Huang
Abstract—Gaussian Markov random fields (GMRFs) are used to analyze textures. GMRFs measure the interdependence of neighboring pixels within a texture to produce features. In this paper, neighboring pixels are taken into account in a priority sequence according to their distance from the center pixel, and a step-by-step least squares method is proposed to extract a novel set of GMRF texture features, named PS-GMRF. A complete procedure is first designed to classify texture samples of QuickBird imagery. After texture feature extraction, a subset of PS-GMRF features is obtained by the sequential floating forward-selection method. Then, the maximum a posteriori–iterated conditional modes classification algorithm is applied to the selected PS-GMRF texture features in combination with spectral features. The experimental results show that the performance of classifying texture samples of high spatial resolution QuickBird satellite imagery improves when texture and spectral features are used jointly, and that PS-GMRF features have higher discrimination power than the classical GMRF features, raising the classification accuracy from 71.84% to 94.01%. In addition, one of the PS-GMRF texture features—the lowest order variance—is found to be effective for residential-area detection. Results for IKONOS and SPOT-5 images show that integrating the lowest order variance with spectral features improves the classification accuracy compared to classification with purely spectral features.

Index Terms—Classifying texture samples, Gaussian Markov random fields (GMRFs), least squares (LS) method, priority sequence, residential-area detection.

Manuscript received August 26, 2006; revised December 14, 2006. This work was supported in part by the 973 Program of the People's Republic of China under Grant 2006CB701302, in part by the National Natural Science Foundation of China under Grants 40471088 and 40523005, and in part by the Foundation of State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing under Grant 904151695. Y. Zhao, L. Zhang, and P. Li are with the State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, Wuhan 430079, China (e-mail: [email protected]). B. Huang is with the Department of Geography and Resource Management, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong. Digital Object Identifier 10.1109/TGRS.2007.892602
I. INTRODUCTION
Since the successful launch of high-resolution sensors and the wide application of high spatial resolution satellite data, it has been possible to monitor environmental changes on a small spatial scale. However, improved spatial resolution does not always lead to better classification results when employing conventional spectral classification methods [1]. While increases in spatial resolution create increased amounts of
information detail, this also creates higher spectral variance within each class corresponding to land-cover units, which decreases their spectral separability and results in lower classification accuracy. Texture is a very useful property of spatial-structure information in high-resolution images [2]. By incorporating texture information in the classification, heterogeneous objects and different objects with the same spectral characteristics can be detected effectively, and higher classification accuracies can be obtained. Texture analysis, which provides a complementary tool for the interpretation of high-resolution satellite images, has received great attention in image processing. Over the last two decades, many approaches to texture feature extraction have been developed; they can be categorized roughly as feature-based methods such as the gray-level co-occurrence matrix, model-based methods such as Markov random-field (MRF) models, and structural methods [3]. Comparisons and fusion of different types of texture features can be found in the literature [4], [5]. This paper focuses on extracting texture features based on MRF models. These models characterize spatial interactions between the central pixel of a texture image and its neighboring pixels to describe the local structure of an image [6]. Early work on texture classification using MRF was done by Chellappa and Chatterjee [7], wherein texture features were extracted based on the Gaussian MRF (GMRF) and a simple minimum-distance classifier was used for texture image classification. More systematic theories about satellite image analysis were built using MRF [8], [9], with emphasis on the application of MRF to spatial information extraction from satellite images. Torres-Torriti and Jouan [10] evaluated the performance of MRF models and Gabor filters for computing texture properties, which were then used as discrimination features to classify different land-cover types in synthetic aperture radar (SAR) imagery. Clausi and Deng [11] demonstrated that both texture and tonal features are necessary to properly perform segmentation and classification of SAR sea-ice imagery. Subsequently, Clausi and Yue compared MRF and gray-level co-occurrence probability (GLCP) features for texture analysis of SAR sea-ice imagery [12]. However, most of the relevant MRF research treats all the neighboring pixels equally and feeds the whole neighborhood into the corresponding MRF model equation at once [7]–[12]. In this paper, the neighboring pixels are instead treated in a priority sequence, in accordance with their distance from the center pixel, and a new set of MRF texture features is computed based on this consideration, followed by feature
selection to reduce data dimensions and increase computational efficiency. The objectives of this paper are twofold: to assess the potential of the new MRF texture features for the classification of high spatial resolution imagery and to assess the effect of integrating spectral features with texture features in high-resolution-imagery classification. The rest of this paper is organized as follows. Section II describes the novel MRF texture features. Section III briefly illustrates the classification method. Section IV provides some comparison experiments using QuickBird samples. Detecting residential areas using the new MRF features is studied in Section V, and conclusions are given in Section VI.

II. TEXTURE FEATURE EXTRACTION

A texture feature extraction method is a computational method that analyzes the pixels contained in a certain window of an input image and generates a set of values that represents the contents of that window. An ideal texture feature extraction method should be capable of generating different values for different textures. One important property of an MRF model is that it uses a finite number of parameters to characterize the spatial interactions of pixels and thereby describe an image region. A typical MRF model is the GMRF model, which is widely used to model image textures by utilizing spectral as well as spatial information.

A. Novel Texture Features Based on GMRF Models

In order to make this paper self-contained, some important concepts of GMRF models are restated here. Markovianity refers to the fact that the central pixel of an image interacts only with its neighboring pixels, independently of all other pixels [6]. A GMRF model specifies spatial dependencies between the central pixel and its neighboring pixels by defining a local conditional probability distribution, which is assumed to be Gaussian [13]. The number and position of neighboring pixels are determined by the neighbor order, defined using a Euclidean distance [14]. Up to fifth-order neighbors are depicted in Fig. 1, wherein the number indicates the neighbor order to which the underlying site belongs. Note that a neighbor set of order n includes all the neighbors of the sets of orders one to n. For example, a fourth-order GMRF model centered on an image cell s would include those cells marked one to four, and its corresponding neighbor set can be denoted as a set of shift vectors R_N = {(0, 1), (1, 0), (1, −1), (1, 1), (0, 2), (2, 0), (1, −2), (1, 2), (2, −1), (2, 1)} ∪ {(0, −1), (−1, 0), (−1, 1), (−1, −1), (0, −2), (−2, 0), (−1, 2), (−1, −2), (−2, 1), (−2, −1)}.

Fig. 1. First- to fifth-order neighborhood of site s.

Let S = {s = (i, j) | 0 ≤ i ≤ M − 1, 0 ≤ j ≤ N − 1} denote the set of grid points in the M × N lattice corresponding to the pixels in the image region. Supposing that the image is modeled as a fourth-order GMRF model f(s), the gray-level intensity of a pixel s has the local conditional probability density function

p\left(f(s) \mid f_{R_N}(s)\right) = \frac{1}{\sqrt{2\pi\nu}} \exp\left\{ -\frac{1}{2\nu} \left[ f(s) - \mu - \sum_{r \in R_N} \beta(r)\left(f(s+r) - \mu\right) \right]^2 \right\}    (1)

where f_{R_N}(s) = {f(s + r) | r ∈ R_N} stands for the set of values at the neighboring sites of the central pixel s, µ is the mean gray value of the whole image, the β(r)s are the model parameters, and ν is the conditional variance. On the other hand, the gray value of the central pixel s, f(s), can also be represented as a linear combination of the values of its neighbor pixels plus an additive noise [6], obeying the following difference equation:

f(s) - \mu = \sum_{r \in R_N} \beta(r)\left(f(s+r) - \mu\right) + e(s)    (2)

where {e(s)} is a zero-mean Gaussian noise sequence with the correlation structure

\operatorname{Cov}\left[e(s), e(s+r)\right] = \begin{cases} \nu, & r = (0, 0) \\ -\beta(r)\,\nu, & r \in R_N \\ 0, & \text{otherwise.} \end{cases}    (3)

Since the power spectrum associated with (2) must be real and positive [15], we have r ∈ R_N ⇔ −r ∈ R_N and β(r) = β(−r). Moreover, (2) can be rewritten as

f(s) - \mu = e(s) + \sum_{r \in \bar{R}_N} \beta(r)\left[\left(f(s+r) - \mu\right) + \left(f(s-r) - \mu\right)\right]    (4)

where R̄_N is an asymmetric neighbor set such that if r ∈ R̄_N, then −r ∉ R̄_N, and R_N = {r | r ∈ R̄_N} ∪ {−r | r ∈ R̄_N}. The parameters β(r) and the conditional variance ν describe the GMRF model and characterize textures. When they are unknown, they must be estimated. There are many existing methods for estimating these unknowns, but none can guarantee consistency¹ as well as stability.² The choice of the least squares (LS) method [7] is motivated by the simplicity–stability tradeoff. Denote the set of GMRF parameters by β = col[β(r) | r ∈ R̄_N]. The LS estimate β̂ of the model parameter vector β is given by

\hat{\beta} = \left[ \sum_{s \in S} Q(s)\,Q^{T}(s) \right]^{-1} \sum_{s \in S} Q(s)\left(f(s) - \mu\right)    (5)

where Q(s) = col[(f(s + r) − µ) + (f(s − r) − µ) | r ∈ R̄_N]. The estimate ν̂ of the conditional variance ν is calculated by

\hat{\nu} = \frac{1}{M \times N} \sum_{s \in S} \left[\left(f(s) - \mu\right) - Q^{T}(s)\,\hat{\beta}\right]^2.    (6)

¹The estimates converge toward the true values of the parameters.
²The covariance matrix in the expression for the joint probability density of an MRF must be positive definite.
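To make the estimation procedure concrete, the following is a minimal sketch, in Python with NumPy, of how the asymmetric fourth-order neighbor set R̄_N and the vectors Q(s) could be assembled and the LS estimates of (5) and (6) computed for one image window. The function names (build_Q, ls_estimate) are illustrative; the authors' own implementation was written in C++ and is not reproduced here.

```python
import numpy as np

# Asymmetric half of the fourth-order neighbor set (its union with the negated
# shifts gives the full neighbor set R_N of Fig. 1, orders one to four).
R_BAR = [(0, 1), (1, 0),                      # first-order neighbors
         (1, -1), (1, 1),                     # second-order neighbors
         (0, 2), (2, 0),                      # third-order neighbors
         (1, -2), (1, 2), (2, -1), (2, 1)]    # fourth-order neighbors

def build_Q(window):
    """Stack Q(s) = col[(f(s+r)-mu)+(f(s-r)-mu) | r in R_BAR] for interior sites s.

    The window mean is used as a local stand-in for mu (an assumption; the paper
    defines mu as the mean gray value of the whole image).
    """
    g = window.astype(float) - window.mean()
    H, W = g.shape
    m = 2                                      # border margin of a fourth-order model
    Q = np.stack([g[m+di:H-m+di, m+dj:W-m+dj] + g[m-di:H-m-di, m-dj:W-m-dj]
                  for di, dj in R_BAR], axis=-1)
    return Q.reshape(-1, len(R_BAR)), g[m:-m, m:-m].reshape(-1)

def ls_estimate(window):
    """Classical LS estimates of beta and nu for one window, cf. (5) and (6)."""
    Q, y = build_Q(window)
    beta, *_ = np.linalg.lstsq(Q, y, rcond=None)   # = (sum Q Q^T)^-1 sum Q (f - mu)
    nu = np.mean((y - Q @ beta) ** 2)              # conditional-variance estimate
    return beta, nu
```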
A general GMRF-based approach employs η = {β̂, ν̂} as a texture feature vector. The classical GMRF (CGMRF)-based method assumes that all of the neighboring pixels, treated equally, act on the center pixel simultaneously. However, it is reasonable to let the neighboring pixels influence the center pixel in a priority sequence, i.e., the closer a neighboring pixel is to the center pixel, the higher its priority. To study this situation, a step-by-step LS (SSLS) method is proposed to derive a different set of GMRF texture features. For this purpose, the parameters of the GMRF model are divided into a certain number of groups and then estimated step by step, adopting the following rules.

Rule 1) The number of groups is equal to the order of the GMRF model.
Rule 2) The parameters for neighbors with different distances from the center pixel should be divided into different groups.
Rule 3) The parameters for neighbors closer to the center pixel should be estimated before those for neighbors farther from the center pixel.

Note that the purpose is not to generate facsimiles of the texture but to obtain a discrimination measure among different textures. For a fourth-order GMRF model, β is divided into four groups: β^(1), β^(2), β^(3), and β^(4). Each group β^(k) corresponds to a group of neighbors, denoted by R̄_N^(k); that is, R̄_N^(1) = {(0, 1), (1, 0)}, R̄_N^(2) = {(1, −1), (1, 1)}, R̄_N^(3) = {(0, 2), (2, 0)}, and R̄_N^(4) = {(1, −2), (1, 2), (2, −1), (2, 1)}. Let k stand for the current group of model parameters, let Q^(k)(s) = col[(f(s + r) − µ) + (f(s − r) − µ) | r ∈ R̄_N^(k)], and let m be the total number of groups. The SSLS estimation method is designed as follows; the corresponding parameter estimates are underlined to distinguish them from those of the LS method.

1) For k = 1:

\underline{\beta}^{(1)} = \left[ \sum_{s \in S} Q^{(1)}(s)\,Q^{(1)T}(s) \right]^{-1} \sum_{s \in S} Q^{(1)}(s)\left(f(s) - \mu\right)    (7)

\underline{\nu}^{(1)} = \frac{1}{M \times N} \sum_{s \in S} \left[\left(f(s) - \mu\right) - Q^{(1)T}(s)\,\underline{\beta}^{(1)}\right]^2.    (8)

2) For 1 < k ≤ m:

\underline{\beta}^{(k)} = \left[ \sum_{s \in S} Q^{(k)}(s)\,Q^{(k)T}(s) \right]^{-1} \sum_{s \in S} Q^{(k)}(s)\left(f(s) - \mu + R\right)    (9)

\underline{\nu}^{(k)} = \frac{1}{M \times N} \sum_{s \in S} \left[\left(f(s) - \mu\right) - \sum_{i=1}^{k} Q^{(i)T}(s)\,\underline{\beta}^{(i)}\right]^2    (10)

where

R = \frac{1}{m} \sum_{i=1}^{k-1} \left[\left(f(s) - \mu\right) - Q^{(i)T}(s)\,\underline{\beta}^{(i)}\right].

From (7)–(10), it is obvious that the parameters in a low-level group are computed independently of those in the higher level groups, whereas the parameters in the high-level groups are estimated based on those in the lower level groups. For example, the vector of parameters in the first group, β̲^(1), is calculated by (7) without reference to the parameters or neighbors of the other groups. However, the vector of parameters in the fourth group, β̲^(4), is calculated by (9) with k = 4 and depends on β̲^(1), β̲^(2), and β̲^(3) via the supplementary variable R.

For the sake of simplicity, we refer to the original GMRF features as "classical GMRF" (CGMRF) features and to the new GMRF features that consider the priority sequence, computed using the SSLS estimation method, as PS-GMRF features. A PS-GMRF feature vector is constructed from the results of each estimation step, namely, ψ = {β̲^(k), ν̲^(k) | 1 ≤ k ≤ m}. By means of SSLS, the lower order parameters are independent of those of higher orders. As a result, it is not necessary to determine the exact order of the GMRF model for a certain texture,³ and a higher order model is preferred. If the order of the model used is higher than the exact order of some texture, the correlations between the center pixel and the higher order neighbor pixels are very weak, and the higher order parameters, which would not play an important role in texture recognition, are abandoned by means of feature selection (described in Section II-B). The proposed SSLS algorithm is also computationally more economical, since the PS-GMRF texture features are calculated group by group, thereby obviating the large matrix computation of the classical LS method.

³In the traditional way, the model order is one of the important concerns for successful texture classification. Usually, the more complicated the texture, the higher the model order that should be used. Kashyap and Chellappa [6] presented a method, derived from a Bayes procedure, for the estimation and choice of the model order for GMRF textures.
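As a concrete illustration of the step-by-step estimation, here is a minimal, self-contained sketch of how the 14-D PS-GMRF feature vector of (7)–(10) could be computed for one window. It assumes the reconstruction of R given above (including the 1/m factor), and names such as ps_gmrf_features are illustrative rather than the authors' C++ implementation.

```python
import numpy as np

# Neighbor groups in priority order (distance from the center pixel), cf. Rules 1)-3).
GROUPS = [[(0, 1), (1, 0)],
          [(1, -1), (1, 1)],
          [(0, 2), (2, 0)],
          [(1, -2), (1, 2), (2, -1), (2, 1)]]

def shifted_sums(g, shifts, m):
    """Stack (f(s+r)-mu)+(f(s-r)-mu) over the interior sites for the given shifts."""
    H, W = g.shape
    return np.stack([g[m+di:H-m+di, m+dj:W-m+dj] + g[m-di:H-m-di, m-dj:W-m-dj]
                     for di, dj in shifts], axis=-1).reshape(-1, len(shifts))

def ps_gmrf_features(window):
    """Step-by-step LS (SSLS) estimation: returns the 14-D PS-GMRF vector psi."""
    g = window.astype(float) - window.mean()   # window mean as a local stand-in for mu
    margin = 2                                 # a fourth-order model needs a 2-pixel border
    y = g[margin:-margin, margin:-margin].reshape(-1)
    n_groups = len(GROUPS)

    features = []
    fitted = np.zeros_like(y)                  # accumulates sum_i Q^(i)T(s) beta^(i)
    residual_sum = np.zeros_like(y)            # accumulates terms of R for the next group
    for k, shifts in enumerate(GROUPS, start=1):
        Qk = shifted_sums(g, shifts, margin)
        target = y if k == 1 else y + residual_sum / n_groups   # f(s) - mu + R, cf. (9)
        beta_k, *_ = np.linalg.lstsq(Qk, target, rcond=None)
        fitted += Qk @ beta_k
        nu_k = np.mean((y - fitted) ** 2)                        # cf. (8) and (10)
        residual_sum += y - Qk @ beta_k
        features.extend(beta_k.tolist() + [nu_k])
    return np.array(features)                  # 2+1 + 2+1 + 2+1 + 4+1 = 14 values
```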
B. Texture Feature Selection

A set of texture features having good discriminating power is essential for a texture-image-classification system. The presence of a large number of texture features could lead to poor
classification results if care is not given to the contribution of these features to the interclass separability [16]. In order to reduce data dimensions and improve classification quality, the GMRF features extracted from the input image should be processed by feature selection. The problem of feature subset selection is to select a subset of b features from a larger set of a features (b < a) so as to optimize the value of a criterion over all subsets of size b. There are a!/(b!(a − b)!) such subsets; for example, 11!/(3!8!) = 165 and 14!/(3!11!) = 364. Exhaustive evaluation of all the subsets is computationally prohibitive. To avoid the exhaustive search for an optimal feature subset, direct methods that select a suboptimal feature subset instead of searching for an optimal one have been developed [17], [18], such as sequential forward selection (SFS), sequential backward selection (SBS), plus-l take-away-r (PTA), sequential forward floating selection (SFFS), and sequential backward floating selection (SBFS). SFS (SBS) begins with a feature subset and sequentially adds (removes) features until the termination criterion is met. Both of them suffer from the so-called "nesting effect," which implies that for SFS the features selected cannot be removed and for SBS the features discarded cannot be reselected later. PTA was proposed to prevent the nesting effect by repeating the following process: go forward l stages by adding l features with SFS, then go backward r stages by deleting r features with SBS. However, there is no theoretical guidance for determining appropriate values of l and r. SFFS and SBFS are the floating versions of PTA [19]. SFFS can be understood as "plus one, minus x" (and SBFS as "minus one, plus x"), where x is dynamically changed according to the backtracking effect. Unlike PTA, SFFS can backtrack without limit as long as the backtracking finds a better feature subset than the best subset obtained so far at the same size. SBFS is the backward version of SFFS. In this paper, the SFFS method is adopted in view of its good performance in both the quality of the obtained feature subset and computational efficiency. Texture features are evaluated according to their discriminatory power. The criterion used here for feature selection is the Kappa coefficient of the testing samples, computed with the Gaussian maximum-likelihood classifier [20].
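The following is a minimal sketch of how SFFS could be organized under the setup described above. The criterion function score(subset) is assumed to be supplied externally (e.g., the Kappa coefficient of a Gaussian maximum-likelihood classifier on the testing samples); the skeleton is a generic SFFS search, not the authors' exact implementation.

```python
def sffs(n_features, score, target_size):
    """Sequential floating forward selection over feature indices 0..n_features-1.

    score(subset) returns the selection criterion to maximize (e.g., Kappa).
    Returns a dict mapping subset size -> (best score, best subset) found.
    """
    current, best_at_size = [], {}
    while len(current) < target_size:
        # Inclusion: add the single feature that improves the criterion most.
        candidates = [f for f in range(n_features) if f not in current]
        best_f = max(candidates, key=lambda f: score(current + [f]))
        current = current + [best_f]
        k = len(current)
        if k not in best_at_size or score(current) > best_at_size[k][0]:
            best_at_size[k] = (score(current), list(current))
        # Conditional exclusion (the "floating" step): backtrack while removing one
        # feature yields a strictly better subset than the best one seen at that size.
        while len(current) > 2:
            worst = max(current, key=lambda f: score([g for g in current if g != f]))
            reduced = [g for g in current if g != worst]
            k = len(reduced)
            if score(reduced) > best_at_size.get(k, (float("-inf"), None))[0]:
                best_at_size[k] = (score(reduced), reduced)
                current = reduced
            else:
                break
    return best_at_size
```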
III. CLASSIFICATION METHODOLOGY

Maximum a posteriori (MAP) estimation is known to be optimal in the sense that it is a probabilistic relaxation method that enforces spatial constraints in the classification. The fundamental notion of spatial context is that the class attributions of two spatially adjacent pixels are highly related. For example, if s and s′ are two neighboring pixels and s belongs to class c, then there is a high probability that pixel s′ also belongs to class c. Thus, classification should be carried out based not only on the feature values of each pixel but also on the statuses of the neighboring pixels. An input multidimensional feature image to be classified is described by y = {y_1, y_2, . . . , y_w}, where w is the number of pixels in the input image and y_i = (y_{i1}, y_{i2}, . . . , y_{id})^T represents a vector of d features (spectral and/or texture features) for the ith pixel. Assuming that there are C predetermined classes in the input feature image, the corresponding classified image can be denoted by l = {l_1, l_2, . . . , l_w}, where l_i takes a value in {1, 2, . . . , C}, and each pixel is assigned to one of the C classes according to predefined rules. The MAP estimator maximizes the following expression [21]:

p(l \mid y) = \frac{p(y \mid l)\, p(l)}{p(y)}    (11)

where p(l | y) is the posterior probability of l conditioned on y, p(y | l) denotes the likelihood of y conditioned on l, p(l) is the a priori probability of l, and p(y) is the probability distribution of y. As p(y) is normally considered constant, maximizing (11) is equivalent to maximizing the product of p(y | l) and p(l). In this paper, p(y | l) is modeled in terms of the Gaussian distribution, while p(l) is modeled by the second-order multilevel logistic (MLL) model. When MAP estimation is applied to image classification, the task becomes a combinatorial optimization problem [22]. A deterministic relaxation algorithm, iterated conditional modes (ICM) [23], is used because of its computational efficiency. The ICM solution is obtained by performing the following optimization: for the ith pixel, the class attribute is determined by

l_i = \arg\max_{c \in \{1, 2, \ldots, C\}} \left\{ p(y_i \mid c)\; p\left(c \mid l_j,\, j \in V(i)\right) \right\}    (12)

where V(i) denotes the eight-connectivity neighbor set of the ith pixel. The likelihood p(y_i | c) is given by

p(y_i \mid c) = \frac{1}{\sqrt{(2\pi)^d \left|\Sigma_c\right|}} \exp\left[ -\frac{1}{2}\, (y_i - \mu_c)^T \Sigma_c^{-1} (y_i - \mu_c) \right]    (13)

where µ_c is the mean vector and Σ_c is the covariance matrix associated with class c. The chosen a priori term is an MLL model, which yields

p\left(c \mid l_j,\, j \in V(i)\right) = \frac{1}{Z} \exp\left[ -\sum_{j \in V(i)} \alpha\, \delta(c, l_j) \right]    (14)

where α is the model parameter expressing how strongly an occurrence of class c at pixel i is affected by the class attributes of its neighbors {l_j, j ∈ V(i)}, Z is a normalizing constant, and δ(c, l_j) is an indicator function: δ(c, l_j) = −1 if c = l_j, and δ(c, l_j) = 1 otherwise. In fact, the variation of α does not greatly influence the results [24], and α is taken as 1.0 in our experiments. Then, using (13) and (14) and after some manipulation, (12) is equivalent to

l_i = \arg\min_{c \in \{1, 2, \ldots, C\}} \left\{ \frac{1}{2} (y_i - \mu_c)^T \Sigma_c^{-1} (y_i - \mu_c) + \frac{1}{2} \ln\left|\Sigma_c\right| + \sum_{j \in V(i)} \alpha\, \delta(c, l_j) \right\}.    (15)

Equation (15) achieves the minimum of the entire energy, which can be considered as the sum of the energy of the observed field (the first term plus the second term) and the energy of the labeled field (the third term).
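As an illustration only, the per-pixel decision of (15) might be coded as below. This is a sketch assuming precomputed class statistics and an existing label image; the helper name local_map_label is not from the paper.

```python
import numpy as np

def local_map_label(y_i, labels, i, j, means, covs, alpha=1.0):
    """Return the class minimizing the energy of (15) for pixel (i, j).

    means[c] and covs[c] hold the class mean vectors and covariance matrices;
    labels is the current label image (eight-connected neighborhood is used).
    """
    H, W = labels.shape
    neighbors = [labels[i + di, j + dj]
                 for di in (-1, 0, 1) for dj in (-1, 0, 1)
                 if not (di == 0 and dj == 0)
                 and 0 <= i + di < H and 0 <= j + dj < W]
    best_c, best_energy = None, np.inf
    for c in means:
        diff = y_i - means[c]
        maha = 0.5 * diff @ np.linalg.solve(covs[c], diff)    # 1/2 (y-mu)^T Sigma^-1 (y-mu)
        logdet = 0.5 * np.log(np.linalg.det(covs[c]))         # 1/2 ln|Sigma|
        prior = alpha * sum(-1.0 if lj == c else 1.0 for lj in neighbors)  # labeled-field energy
        energy = maha + logdet + prior
        if energy < best_energy:
            best_c, best_energy = c, energy
    return best_c
```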
Fig. 3. QuickBird image mosaic of the six sampled land-cover classes.
Fig. 2. MAP–ICM classification algorithm flow chart.
The classification method based on the MAP criterion with the ICM update scheme, named the MAP–ICM algorithm, is implemented by the steps outlined below and described by the flow chart in Fig. 2.

Step 1) Estimate the statistical parameters (µ_c, Σ_c) for each class.
Step 2) Based on (µ_c, Σ_c), obtain an initial classification using the noncontextual pixelwise maximum-likelihood decision rule, i.e., by minimizing only the observed-field energy of (15).
Step 3) Update (µ_c, Σ_c) by

\mu_c = \frac{1}{n_c} \sum_i y_i\, I(l_i, c)    (16)

\Sigma_c = \frac{1}{n_c} \sum_i (y_i - \mu_c)(y_i - \mu_c)^T I(l_i, c)    (17)

where n_c is the number of pixels in class c, and I(l_i, c) = 1 if pixel i belongs to class c and zero otherwise.
Step 4) For each pixel of the image, perform the local minimization defined by (15) and update the corresponding label in parallel.
Step 5) Go back to Step 3) until the specified termination condition is satisfied. In this paper, we take a maximum of 80 iterations as the stopping criterion.

Note that in Step 1), for supervised classification, (µ_c, Σ_c) is computed from the training samples of each class, while for unsupervised classification, (µ_c, Σ_c) is obtained by the fuzzy C-means (FCM) algorithm [25]. Supposing a partition matrix U = [u_{ci}], where u_{ci} represents the probability of pixel i belonging to cluster c, µ_c and Σ_c are estimated from the u_{ci} as

\mu_c = \frac{\sum_{i=1}^{w} u_{ci}^2\, y_i}{\sum_{i=1}^{w} u_{ci}^2}    (18)

\Sigma_c = \frac{\sum_{i=1}^{w} u_{ci}^2\, (y_i - \mu_c)(y_i - \mu_c)^T}{\sum_{i=1}^{w} u_{ci}^2}.    (19)
IV. CLASSIFYING LAND-COVER TYPES USING QUICKBIRD IMAGERY

In this paper, we consider a QuickBird true-color image with 2.4-m spatial resolution of a suburban area in Beijing, China. Fig. 3 is a mosaic of six different land-cover types selected from the QuickBird image, coded C1–C6 from left to right and top to bottom. C1 is a residential district made of concrete, in which houses are arranged in an orderly manner. C2 and C3 are woodlands of different density and have similar spectral characteristics; C2 is sparse and fine in texture, while C3 looks very thick and exhibits large particles in texture. C4, C5, and C6 are three different kinds of crops. In terms of reflective spectral values, C5 is different from C4 and C6, whereas C4 is quite similar to C6. Moreover, in terms of spatial structure, C5 is the coarsest and C6 the finest. The size of each image chip is 128 × 128.

The classification scheme outlined in Fig. 4 was used for classifying the different classes of the QuickBird image. First, band 1 (the blue band) of the QuickBird data was chosen for computing texture features, because this band has the maximum variability in terms of standard deviation. Texture features are extracted using a moving window of fixed size. We seek a window that is not too large, in order to limit computation time and avoid the influence of the texture of adjacent land-cover types; on the other hand, the window should not be too small, as otherwise a robust GMRF estimate could not be obtained [26]. A 17 × 17 pixel window was used in the experiments as a tradeoff between too large and too small a window, and the odd size ensures that the window is centered on the current pixel. Using a fourth-order GMRF model and a program written in the C++ language, a 14-D PS-GMRF feature vector is produced for each pixel of the input image. All experiments in this paper are performed on a computer with a 1.60-GHz Intel Pentium processor and 512 MB of memory.
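A minimal sketch of the per-pixel, moving-window texture extraction just described is given below, reusing the ps_gmrf_features function sketched in Section II-A. The window size and band-selection rule follow the text; the reflect padding at the image border is an illustrative assumption.

```python
import numpy as np

def select_band(image):
    """Pick the band with the largest standard deviation (bands on the last axis)."""
    stds = image.reshape(-1, image.shape[-1]).std(axis=0)
    return image[..., int(np.argmax(stds))]

def texture_image(band, feature_fn, win=17):
    """Slide a win x win window over the band and stack per-pixel feature vectors."""
    assert win % 2 == 1, "an odd window keeps the center pixel well defined"
    half = win // 2
    padded = np.pad(band.astype(float), half, mode='reflect')   # border handling (assumption)
    H, W = band.shape
    out = None
    for i in range(H):
        for j in range(W):
            feats = feature_fn(padded[i:i + win, j:j + win])     # e.g., ps_gmrf_features
            if out is None:
                out = np.zeros((H, W, feats.size))
            out[i, j] = feats
    # Normalize each feature to zero mean and unit standard deviation, as in the text.
    flat = out.reshape(-1, out.shape[-1])
    return ((flat - flat.mean(axis=0)) / flat.std(axis=0)).reshape(out.shape)
```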
Fig. 6. CGMRF texture features and the corresponding classified image. (a) False-color image composed of the optimal three CGMRF texture features β̂(0, 1), β̂(1, −1), and ν̂, extracted from the QuickBird texture mosaic. (b) Classified image using the combined spectral and selected CGMRF texture data.
Fig. 4. Flow diagram for classifying the QuickBird image mosaic.
Fig. 5. PS-GMRF texture features and the corresponding classified image. (a) False-color image composed of the optimal three PS-GMRF texture features ν^(1), ν^(2), and ν^(3), extracted from the QuickBird texture mosaic. (b) Classified image using the combined spectral and selected PS-GMRF texture data.
The computational time for PS-GMRF texture feature extraction from Fig. 3 is 133 s. All the extracted texture features are normalized so that the measurements have zero mean and unit standard deviation. Afterwards, SFFS is applied, and the selected PS-GMRF texture features are ν^(1), ν^(2), and ν^(3). Fig. 5(a) shows the false-color image composed of these features. Finally, the MAP–ICM classifier designed in Section III is applied to the QuickBird spectral bands combined with all the selected PS-GMRF features, in which the initial classification is obtained by the maximum-likelihood technique. The classified image is depicted in Fig. 5(b).

For comparison, the 11-D CGMRF feature vectors were also computed from band 1 of the original QuickBird image and tested according to the flow chart of Fig. 4. The computation time for CGMRF texture feature extraction from Fig. 3 is 191 s, more than the 133 s for PS-GMRF. After feature selection, the selected CGMRF features are β̂(0, 1), β̂(1, −1), and ν̂, which are displayed in Fig. 6(a). Likewise, the classification scheme using spectral information combined with the selected CGMRF texture information was employed to classify the land-cover types, and Fig. 6(b) shows the corresponding result. From Figs. 5 and 6, it is obvious that the PS-GMRF texture features perform better than the CGMRF texture features. Since a precise ground truth for the QuickBird texture mosaic is
available, the performance of the two types of texture features can be evaluated exactly in a quantitative manner. The overall accuracy and the Kappa coefficient, which have proven to be general and robust for any classification algorithm (pixel-based classification methods, neighbor/region-based classification methods [27], [28], or object-based classification methods), serve as the criteria to summarize the classification results [29]. Incorporating PS-GMRF features, a Kappa value of 0.9281 was obtained with an overall accuracy of 94.01%, whereas with CGMRF features the Kappa coefficient was only 0.6621 and the overall accuracy 71.84%.

In order to further demonstrate that PS-GMRF features are superior to CGMRF features, 2000 samples of each land-cover type of Fig. 3 were chosen to analyze the distributions of both. For the PS-GMRF texture features, the histograms of ν^(1), ν^(2), and ν^(3) are plotted in Fig. 7(a)–(c), respectively. Each graph contains the histograms of a single feature for all six land-cover types C1–C6 simultaneously. The average profile of the three selected PS-GMRF features is shown in Fig. 7(d). In the same way, the histograms of the selected CGMRF texture features β̂(0, 1), β̂(1, −1), and ν̂ are shown in Fig. 8(a)–(c) and their average profile in Fig. 8(d). A good feature should have minimal intraclass distance and maximal interclass distance. The histograms in Fig. 7 are approximately detached, while those in Fig. 8 almost coincide across the texture types. Obviously, PS-GMRF features are more discriminatory than CGMRF features. Moreover, the transformed divergence (TD) measures between class pairs of Fig. 3 are given in Table I(a) and (b), using the selected PS-GMRF and CGMRF features, respectively. TD values vary from 0.0 to 2.0, corresponding to complete overlap and ideal separation between two classes. For the selected PS-GMRF features, the overall average of the TD values among classes was 1.99, whereas it dropped to 1.31 for the selected CGMRF features.

Fig. 7. (a)–(c) Histograms of the selected PS-GMRF features ν^(1), ν^(2), and ν^(3). Each corresponds to the distribution of one feature for the six texture types of Fig. 3. (d) Average profile of the selected PS-GMRF features.

Fig. 8. (a)–(c) Histograms of the selected CGMRF features β̂(0, 1), β̂(1, −1), and ν̂. Each corresponds to the distribution of one feature for the six texture types of Fig. 3. (d) Average profile of the selected CGMRF features.

TABLE I. TD MEASURES BETWEEN QUICKBIRD CLASS PAIRS USING THE SELECTED TEXTURE FEATURES: (a) PS-GMRF AND (b) CGMRF.

TABLE II. CLASSIFICATION RESULTS FOR THE QUICKBIRD IMAGE MOSAIC USING DIFFERENT TYPES OF FEATURES.

Next, in order to evaluate the contribution of integrating the spectral and texture features to the discrimination of the six land-cover types, the spectral data and the texture data were also each used alone to classify the QuickBird texture mosaic. The comparison results are listed for five combinations in Table II: 1) spectral features alone; 2) the selected CGMRF texture features alone; 3) the spectral and selected CGMRF features combined; 4) the selected PS-GMRF texture features alone; and 5) the spectral and selected PS-GMRF features combined. From this table, two valuable observations can be made. One is that the PS-GMRF texture features provide better discrimination than the CGMRF texture features: the Kappa coefficient increased from 0.3026 when using the selected CGMRF features to 0.8474 when using the selected PS-GMRF features. The other is that the integrated use of spectral and texture features yields a more satisfactory result than either spectral or texture features alone: the Kappa coefficient was 0.6214 using the spectral data alone, 0.8474 using the selected PS-GMRF texture data alone, and 0.9281 using the combination of spectral and selected PS-GMRF texture data.
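For reference, the accuracy measures quoted above can be computed from a confusion matrix as in the short sketch below (standard definitions of overall accuracy and the Kappa coefficient; not code from the paper).

```python
import numpy as np

def overall_accuracy_and_kappa(confusion):
    """Overall accuracy and Kappa coefficient from a C x C confusion matrix.

    confusion[i, j] = number of samples of reference class i labeled as class j.
    """
    confusion = np.asarray(confusion, dtype=float)
    n = confusion.sum()
    observed = np.trace(confusion) / n                                        # overall accuracy
    expected = (confusion.sum(axis=0) * confusion.sum(axis=1)).sum() / n ** 2 # chance agreement
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa
```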
V. RESIDENTIAL-AREA EXTRACTION USING LOW-ORDER VARIANCE

Residential areas are characterized by low correlation and high variance and cannot be accurately detected from spectral data alone. It was found that one of the PS-GMRF texture features, the lowest order variance ν^(1), is effective for residential-area extraction. The following describes experiments using IKONOS and SPOT-5 images, which were provided by the Space Imaging Corporation and the Satellite Imaging Corporation, respectively.

A. Experiment on IKONOS Imagery

Fig. 9(a) shows a true-color image collected by the IKONOS satellite over the Brazil/Paraguay border. The image consists of three scenes of 350 lines, each 400 pixels long. Using the spectral features alone, the classified image shown in Fig. 9(b) was obtained with the MAP–ICM algorithm. The initialization was performed by means of FCM, with the number of clusters fixed at five beforehand.
Fig. 9. (a) Initial IKONOS image. (b) Classified result of (a) using the spectral data.
Fig. 11. (a) Overlap image between the detected residential areas (in white lines) and the original IKONOS image. (b) Final classified result of Fig. 9(a) after combining the detected residential areas.

TABLE III. CODES OF THE FIVE TYPES OF LAND COVER IN FIG. 11(b).
Fig. 10. PS-GMRF texture features extracted from Fig. 9(a). (a) Feature image of ν^(1), (b) ν^(2), (c) ν^(3), and (d) ν^(4).
From Fig. 9, it is obvious that the spectrally heterogeneous residential areas, which are composed of small objects such as buildings and green plants, cannot be detected as a single entity using spectral features, while spectrally homogeneous land covers, such as sea and sandy areas, can be classified well with spectral information only. In order to reduce the blurring-border effect, which is inherent in texture analysis and introduces important errors in the transition areas between different texture units [12], a classification procedure was designed for such satellite images as follows: first, residential areas are detected using texture features; afterward, the other objects are classified using spectral features.

Based on the fourth-order GMRF model, PS-GMRF texture features were computed from the band with the largest spectral variability. Fig. 10 displays the four extracted PS-GMRF variance features ν^(1), ν^(2), ν^(3), and ν^(4). It is easy to see that the lowest order variance ν^(1) has a larger capacity for discriminating between residential areas and other land-cover types. Two clusters, one corresponding to residential areas and the other not, were detected in the lowest order variance space using FCM. A mask was then generated to eliminate the residential areas so that the other land-cover types would be classified accurately, without interference from the spectrally heterogeneous residential areas. The MAP–ICM classification algorithm was performed under this mask on the input data of Fig. 9(a), and the result was then combined with the residential map, as shown in Fig. 11(a). Fig. 11(b) is the final classified result. Comparing Figs. 9(b) and 11(b), it is obvious that the classification performance in residential areas was improved by joining texture features with spectral features.

In order to prove the effectiveness of the lowest order variance for residential-area detection in a quantitative manner, the discriminative abilities of the four variances (ν^(1), ν^(2), ν^(3), and ν^(4)) are compared based on the classification result of Fig. 11(b). For the sake of simplicity, five different types of land cover are coded: the residential area labeled in white in Fig. 11(b) is denoted D1, and the other land covers are numbered D2–D5 as given in Table III. As in Section IV, 2000 samples of each kind of land cover were selected according to the final classified image in Fig. 11(b), and histograms were plotted to analyze the distributions of the four variances for each land-cover type, as shown in Fig. 12. All PS-GMRF texture features were normalized to zero mean and unit standard deviation. Fig. 12 contains four graphs, (a)–(d), corresponding to ν^(1), ν^(2), ν^(3), and ν^(4), respectively. Each graph has five histogram curves, and each curve represents the distribution of one feature for one type of land cover. In Fig. 12, graph (a) has the largest gaps between D1 (residential areas) and D2–D5 (the other types) and no overlap between them. It is not difficult to find that the first-order variance ν^(1) has a higher ability to discriminate between residential areas and the other land covers than the higher order variances ν^(2)–ν^(4).
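The two-step procedure (texture-based residential masking followed by spectral classification of the remaining pixels) could be organized as in the sketch below; fcm_cluster and map_icm_classify are assumed helpers standing in for any FCM and MAP–ICM implementation, so this is a workflow outline rather than the authors' code.

```python
import numpy as np

def detect_residential(nu1, fcm_cluster):
    """Split the lowest order variance image nu1 into two FCM clusters and
    return a boolean mask for the high-variance (residential) cluster."""
    memberships, centers = fcm_cluster(nu1.reshape(-1, 1), n_clusters=2)
    labels = memberships.argmax(axis=0).reshape(nu1.shape)
    residential_label = int(np.argmax(centers))   # residential areas show the higher variance
    return labels == residential_label

def classify_with_mask(spectral, nu1, fcm_cluster, map_icm_classify, n_other_classes):
    """Residential areas from texture, everything else from spectral features."""
    mask = detect_residential(nu1, fcm_cluster)
    other = map_icm_classify(spectral, exclude_mask=mask, n_classes=n_other_classes)
    final = np.where(mask, n_other_classes, other)  # residential areas get their own label
    return final, mask
```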
Fig. 12. Histograms of the four variances extracted from Fig. 9(a): (a) ν^(1), (b) ν^(2), (c) ν^(3), and (d) ν^(4). Each graph consists of five curves, and each curve corresponds to the distribution of one texture feature for one land-cover type of Fig. 11(b).

Fig. 14. PS-GMRF texture variance features extracted from Fig. 13(a). (a) Feature image of ν^(1), (b) ν^(2), (c) ν^(3), and (d) ν^(4).
Fig. 13. (a) Initial SPOT-5 data. (b) Classified result of (a) using the spectral information.
B. Experiment on SPOT-5 Imagery

Subsequent experiments were made on a SPOT-5 image. Fig. 13(a) is part of a SPOT-5 satellite image of Reno, NV, acquired on September 14, 2003; the image size is 400 × 400. Using the spectral features, MAP–ICM classification with FCM initialization was carried out to detect five clusters. The result is shown in Fig. 13(b); the residential areas are not separated accurately from the other land-cover types. Residential areas with high intraclass spectral variance can be detected exactly when texture features are considered. The band with the largest spectral variability was selected from Fig. 13(a), and the PS-GMRF texture features were computed with the fourth-order GMRF model. Fig. 14 shows the PS-GMRF variances ν^(1), ν^(2), ν^(3), and ν^(4). In Fig. 14(a), the values are high for pixels inside the residential areas, which are characterized by a high lowest order variance, while the values are low for pixels in other land covers, such as fields and water areas, which are characterized by a small lowest order variance. To avoid the adverse effect of blurring at the boundaries, the residential areas were first extracted according to the lowest order variance.
Fig. 15. (a) Overlap image between the detected residential areas (in white lines) and the original SPOT-5 image. (b) Final classified result of Fig. 13(a) after combining detected residential areas.
White lines outline the detected residential areas in Fig. 15(a). After masking the residential areas, the MAP–ICM procedure was applied to Fig. 13(a), with FCM used to initialize four clustering centers. By integrating the above two results, the final classified image was obtained, as shown in Fig. 15(b). The proposed classification approach improves the classification accuracy and the visual interpretation compared to classification with purely spectral features.

Subsequently, the residential-area extraction capabilities of the different-order variances are compared, to reconfirm the conclusion that the lowest order variance is more effective for the detection of residential areas than the higher order variances, based on the classification result. Five land-cover types are numbered E1–E5, as listed in Table IV; E1 represents the residential areas and E2–E5 the other land covers.
TABLE IV. CODES OF THE FIVE TYPES OF LAND COVER IN FIG. 15(b).
Fig. 16. Histograms of the four variances extracted from Fig. 13(a): (a) ν^(1), (b) ν^(2), (c) ν^(3), and (d) ν^(4). Each graph consists of five curves, and each curve corresponds to the distribution of one texture feature for one land-cover type of Fig. 15(b).
In the same way, 2000 samples of each type of land cover, selected according to the classified image in Fig. 15(b), were used for the statistical calculation, and the distribution curves of each order of variance are given in Fig. 16, where Fig. 16(a)–(d) correspond to ν^(1), ν^(2), ν^(3), and ν^(4), respectively. In Fig. 16(a), there are five curves; each curve depicts the distribution of the lowest order variance ν^(1) for one kind of land cover, and the curve of E1 (residential areas) is well apart from the other four curves of E2–E5 (the other land covers). In Fig. 16(b)–(d), however, the five curves overlap over large areas. Hence, it is established that the lowest order variance is more effective for residential-area extraction than the higher order variances.

VI. CONCLUSION

Traditional per-pixel classification methods based upon spectral comparisons are not efficient for spatially heterogeneous land-cover and land-use classes in high spatial resolution imagery. The effectiveness of spatial information has been tested via texture analysis based on the GMRF model to improve classification accuracy. This paper has developed a new set of GMRF texture features, the PS-GMRF features. CGMRF features are computed when all the neighboring pixels are considered at the same time and treated equally.
PS-GMRF features are derived when neighboring pixels are taken into account in a priority sequence that depends on their distance from the center pixel.

A complete procedure has been designed to classify texture samples from high spatial resolution satellite imagery. First, PS-GMRF texture features are extracted. Then, SFFS is applied to obtain an optimal subset of PS-GMRF features. The selected PS-GMRF texture features, in combination with spectral features, are given as input to the MAP–ICM classifier. The case study on the QuickBird image mosaic shows that the combination of texture and spectral features significantly improves the classification accuracy compared to classification with spectral features only: the Kappa coefficient increased from 0.6214 to 0.9281 with the addition of the selected PS-GMRF texture features. When CGMRF features are used instead of PS-GMRF features for comparison, the Kappa coefficient increased from 0.6214 to only 0.6621 with the addition of the selected CGMRF texture features. It is obvious that PS-GMRF features are superior to CGMRF features.

Furthermore, one of the PS-GMRF texture features—the lowest order variance—proved to be valid for extracting residential areas from high spatial resolution satellite imagery. In order to avoid the blurring-border effect of texture features and the disturbance caused by the spectrally heterogeneous residential areas, the spectrally homogeneous objects were classified after residential-area detection based on the lowest order variance, using the MAP–ICM classifier with FCM initialization. Experimental results on IKONOS and SPOT-5 images demonstrated that the proposed method outperforms classification with spectral features alone in terms of visual evaluation.

ACKNOWLEDGMENT

The authors would like to thank the Space Imaging Corporation for providing the IKONOS image and the Satellite Imaging Corporation for providing the SPOT-5 image.

REFERENCES

[1] S. P. Lennartz and R. G. Congalton, "Classifying and mapping forest cover types using IKONOS imagery in the northeastern United States," in Proc. ASPRS Annu. Conf., Denver, CO, 2004. [Online]. Available: http://www.definiens.com/pdf/publications/ASPRS-2004-0125.pdf
[2] S. E. Franklin, R. J. Hall, L. M. Moskal et al., "Incorporating texture into classification of forest species composition from airborne multispectral images," Int. J. Remote Sens., vol. 21, no. 1, pp. 67–79, Jan. 2000.
[3] T. R. Reed and J. M. H. Du Buf, "A review of recent texture segmentation and feature extraction techniques," CVGIP: Image Underst., vol. 57, no. 3, pp. 359–372, May 1993.
[4] P. P. Ohanian and R. C. Dubes, "Performance evaluation for four classes of textural features," Pattern Recognit., vol. 25, no. 8, pp. 819–833, 1992.
[5] D. A. Clausi and H. Deng, "Design-based texture feature fusion using Gabor filters and co-occurrence probabilities," IEEE Trans. Image Process., vol. 14, no. 7, pp. 925–936, Jul. 2005.
[6] R. L. Kashyap and R. Chellappa, "Estimation and choice of neighbors in spatial-interaction models of images," IEEE Trans. Inf. Theory, vol. IT-29, no. 1, pp. 60–72, Jan. 1983.
[7] R. Chellappa and S. Chatterjee, "Classification of textures using Gaussian Markov random fields," IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-33, no. 4, pp. 959–963, Aug. 1985.
[8] M. Datcu, K. Seidel, and M. Walessa, "Spatial information retrieval from remote-sensing images—Part I: Information theoretical perspective," IEEE Trans. Geosci. Remote Sens., vol. 36, no. 5, pp. 1431–1445, Sep. 1998.
[9] M. Schroder, H. Rehrauer, K. Seidel, and M. Datcu, "Spatial information retrieval from remote-sensing images—Part II: Gibbs–Markov random fields," IEEE Trans. Geosci. Remote Sens., vol. 36, no. 5, pp. 1446–1455, Sep. 1998.
[10] M. Torres-Torriti and A. Jouan, "Gabor and GMRF features for SAR imagery classification," in Proc. IEEE Int. Conf. Image Process., 2001, vol. 3, pp. 1043–1046.
[11] D. A. Clausi and H. Deng, "Operational segmentation and classification of SAR sea ice imagery," in Proc. IEEE Workshop Adv. Techn. Anal. Remotely Sensed Data, 2003, pp. 268–275.
[12] D. A. Clausi and B. Yue, "Comparing co-occurrence probabilities and Markov random fields for texture analysis of SAR sea ice imagery," IEEE Trans. Geosci. Remote Sens., vol. 42, no. 1, pp. 215–228, Jan. 2004.
[13] G. R. Cross and A. K. Jain, "Markov random field texture models," IEEE Trans. Pattern Anal. Mach. Intell., vol. PAMI-5, no. 1, pp. 25–39, Jan. 1983.
[14] N. Balram and J. M. F. Moura, "Noncausal Gauss Markov random fields: Parameter structure and estimation," IEEE Trans. Inf. Theory, vol. 39, no. 4, pp. 1333–1355, Jul. 1993.
[15] G. Sharma and R. Chellappa, "A model-based approach for the estimation of two-dimensional maximum entropy power spectra," IEEE Trans. Inf. Theory, vol. IT-31, no. 1, pp. 90–99, Jan. 1985.
[16] K. P. Price, X. Guo, and J. M. Stiles, "Optimal Landsat TM band combinations and vegetation indices for discrimination of six grassland types in eastern Kansas," Int. J. Remote Sens., vol. 23, no. 23, pp. 5031–5042, Dec. 2002.
[17] D. Zongker and A. Jain, "Algorithms for feature selection: An evaluation," in Proc. 13th Int. Conf. Pattern Recog., 1996, vol. 2, pp. 18–22.
[18] M. Kudo and J. Sklansky, "Comparison of algorithms that select features for pattern classifiers," Pattern Recognit., vol. 33, no. 1, pp. 25–41, Jan. 2000.
[19] P. Pudil, J. Novovicova, and J. Kittler, "Floating search methods in feature selection," Pattern Recognit. Lett., vol. 15, no. 11, pp. 1119–1125, Nov. 1994.
[20] X. P. Jia and J. A. Richards, "Efficient maximum likelihood classification for imaging spectrometer data sets," IEEE Trans. Geosci. Remote Sens., vol. 32, no. 2, pp. 274–281, Mar. 1994.
[21] B. Tso and R. C. Olsen, "Scene classification using combined spectral, textural and contextual information," in Proc. SPIE, 2004, vol. 5425, pp. 135–146.
[22] Y. Hu and T. J. Dennis, "Simulated annealing and iterated conditional modes with selective and confidence enhanced update schemes," in Proc. 5th Annu. IEEE Symp. Comput.-Based Med. Syst., 1992, pp. 257–264.
[23] J. Besag, "On the statistical analysis of dirty pictures," J. R. Stat. Soc. B, vol. 48, no. 3, pp. 259–302, 1986.
[24] S. Krishnamachari and R. Chellappa, "Multiresolution Gauss–Markov random field models for texture segmentation," IEEE Trans. Image Process., vol. 6, no. 2, pp. 251–267, Feb. 1997.
[25] R. Krishnapuram, H. Frigui, and O. Nasraoui, "Fuzzy and possibilistic shell clustering algorithms and their application to boundary detection and surface approximation—Part I," IEEE Trans. Fuzzy Syst., vol. 3, no. 1, pp. 29–43, Feb. 1995.
[26] Q. Chen and P. Gong, "Automatic variogram parameter extraction for textural classification of the panchromatic IKONOS imagery," IEEE Trans. Geosci. Remote Sens., vol. 42, no. 5, pp. 1106–1115, May 2004.
[27] J. Stuckens, P. R. Coppin, and M. E. Bauer, "Integrating contextual information with per-pixel classification for improved land cover classification," Remote Sens. Environ., vol. 71, no. 3, pp. 282–296, Mar. 2000.
[28] X. Gigandet, M. B. Cuadra, A. Pointet et al., "Region-based satellite image classification: Method and validation," in Proc. IEEE Int. Conf. Image Process., 2005, pp. 832–835.
[29] J. Whiteside and W. Ahmad, "A comparison of object-oriented and pixel-based classification methods for mapping land cover in northern Australia," in Proc. SSC Spatial Intell., Innovation and Praxis: National Biennial Conf. Spatial Sci. Inst., 2005, pp. 1225–1231.
Yindi Zhao received the Ph.D. degree in photogrammetry and remote sensing from Wuhan University, Wuhan, China, in 2006. Her current interests are in model-based texture analysis and pattern recognition, with applications in high spatial resolution remote sensing.
Liangpei Zhang (M'06) received the B.S. degree in physics from Hunan Normal University, Changsha, China, in 1982, the M.S. degree in optics from the Xi'an Institute of Optics and Precision Mechanics, Chinese Academy of Sciences, Xi'an, in 1988, and the Ph.D. degree in photogrammetry and remote sensing from Wuhan University, Wuhan, China, in 1998. From 1997 to 2000, he was a Professor with the School of the Land Sciences, Wuhan University. In August 2000, he joined the State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University, as a Professor and Head of the Remote Sensing Section. He has published more than 120 technical papers. His research interests include hyperspectral remote sensing, high-resolution remote sensing, image processing, and artificial intelligence. Dr. Zhang has served as the Cochair of the International Society for Optical Engineers Series Conferences on Multispectral Image Processing and Pattern Recognition (MIPPR) and the Conference on Asia Remote Sensing in 1999, as an Editor of the MIPPR'01, MIPPR'05, and Geoinformatics Symposiums, as an Associate Editor of the Geo-spatial Information Science Journal, as a member of the Chinese National Committee for the International Geosphere-Biosphere Programme, and as an Executive Member of the China Society of Image and Graphics.
Pingxiang Li (M’06) received the B.S., M.S., and Ph.D. degrees in photogrammetry and remote sensing from Wuhan University, Wuhan, China, in 1986, 1994, and 2003, respectively. Since 2002, he has been a Professor with the State Key Laboratory of Information Engineering in Surveying, Mapping, and Remote Sensing, Wuhan University. His research interests include photogrammetry and SAR image processing.
Bo Huang received the Ph.D. degree in geographic information system from the Institute of Remote Sensing Applications, Chinese Academy of Sciences, Beijing, China, in 1997. He is currently an Associate Professor with the Department of Geography and Resource Management, The Chinese University of Hong Kong, Shatin, N.T., Hong Kong. His current research interests include spatial optimization, spatial statistics for change analysis, and image processing. He has authored and coauthored over 30 papers in refereed international journals and developed several GIS, spatial analysis, and optimization software packages.