IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, VOL. 51, NO. 1, JANUARY 2013


Tree Species Discrimination in Tropical Forests Using Airborne Imaging Spectroscopy

Jean-Baptiste Féret and Gregory P. Asner

Abstract—We identify canopy species in a Hawaiian tropical forest using supervised classification applied to airborne hyperspectral imagery acquired with the Carnegie Airborne Observatory-Alpha system. Nonparametric methods (linear and radial basis function support vector machine, artificial neural network, and k-nearest neighbor) and parametric methods (linear, quadratic, and regularized discriminant analysis) are compared for a range of species richness values and training sample sizes. We find a clear advantage in using regularized discriminant analysis, linear discriminant analysis, and support vector machines. No unique optimal classifier was found for all conditions tested, but we highlight the possibility of improving support vector machine classification with a better optimization of its free parameters. We also confirm that a combination of spectral and spatial information increases accuracy of species classification: we combine segmentation and species classification from regularized discriminant analysis to produce a map of the 17 discriminated species. Finally, we compare different methods to assess spectral separability and find a better ability of Bhattacharyya distance to assess separability within and among species. The results indicate that species mapping is tractable in tropical forests when using high-fidelity imaging spectroscopy.

Index Terms—Carnegie Airborne Observatory (CAO), hyperspectral imaging, image classification, tree species identification, tropical biodiversity.

Manuscript received September 29, 2011; revised January 27, 2012 and February 29, 2012; accepted April 18, 2012. Date of publication July 16, 2012; date of current version December 19, 2012. The authors are with the Department of Global Ecology, Carnegie Institution for Science, Stanford, CA 94305 USA (e-mail: [email protected]; gpa@stanford.edu). Digital Object Identifier 10.1109/TGRS.2012.2199323

I. INTRODUCTION

TROPICAL forests harbor a high diversity of canopy species, which can be viewed in high-resolution airborne and satellite imagery. The importance of monitoring changes in tropical canopy diversity has long been recognized from an ecological perspective and has become even more evident with changing land use and climate worldwide. However, discrimination of canopy species in high-resolution imagery continues to be a major challenge, and much progress is needed to facilitate monitoring of canopy composition in tropical forests.

One of the most promising technologies for species mapping continues to be imaging spectroscopy, also known as hyperspectral remote sensing. Several studies have demonstrated the feasibility of mapping tropical species, despite the problem of spectral similarity, which is particularly challenging for species discrimination [1]–[3]. Species discrimination at the leaf level has been successfully performed on data collected in dry and humid tropical forests [4], [5], showing the potential to separate up to 188 species so far tested [6]. Remotely sensed spectroscopic signatures are directly related to the chemical and structural properties of foliage and canopies [7], and several authors have reported successful assessment of indices related to biodiversity [8]–[10]. However, the heterogeneous structure of vegetation and the geometry of measurement also strongly influence reflectance throughout the optical domain, leading to lower classification accuracy compared to leaf-scale discrimination [11]. In addition to these factors, the logistical difficulties inherent to field data acquisition in tropical forests have resulted in a very restricted set of accurately identified species in these ecosystems: about seven species per study thus far in the literature [11]–[13].

Various methods exist to identify and map tree crowns using imaging spectroscopy, but only a few have been applied in the context of species identification in tropical forests, and no consensus has been reached on which method performs best for operational use. The most popular method continues to be discriminant analysis [11], [13]–[18], which belongs to the category of parametric classifiers because it assumes that each population is normally distributed. However, several studies instead suggest the use of nonparametric classifiers to address the incorrect assumption of normal distribution [14], [19]. At the leaf scale, Castro-Esau et al. [14] showed that nonparametric classifiers performed systematically better than parametric classifiers. These results were tested and confirmed at the canopy scale by [20]. Zhang et al. [3] observed the non-normal distribution of spectral angle in individual tree crowns (ITCs) measured with hyperspectral airborne imagery and related it to the better performance of nonparametric classifiers.
Although spectral angle may not be suitable for assessing the statistical distribution of each spectral feature, the robustness of parametric classifiers to non-normality should be further considered in the case of species discrimination in tropical forests [21], [22]. A comparison of classifiers needs to be performed with respect to some important factors such as the training sample size, as this strongly influences classification accuracy and may be a limiting factor in tropical rainforests. In addition to pixel-based discrimination with spectral data, the spatial component is important in the classification of objects, taking advantage of the correlations among neighboring pixels. Spatial-spectral classifiers have proven to be particularly interesting when applied to high spatial and spectral resolution data, showing pure or recognizable objects defined by a group of contiguous pixels [23], [24]. For tree species identification, preliminary delineation of tree crowns, sometimes combined with pixel filtering to avoid shaded pixels, for example,

0196-2892/$31.00 © 2012 IEEE


provides better results than does a simple pixel-based classification [11], [16], [25]. However, automatic tree crown delineation is particularly challenging in humid tropical forests because of the very dense vegetation, overlapping tree crowns, and the absence of clear boundaries between individuals. Spatial information is usually derived from image segmentation or from texture analysis [26]. Classical segmentation algorithms provide good results for ITC delineation when applied to simple forests such as coniferous plantations [27], whereas their application to deciduous and tropical forests has proven much more challenging, but not impossible [28].

This paper aims to develop an operational method to map canopy species in tropical forests using a combination of spatial and spectral information obtained from airborne imaging spectroscopy. First, several classifiers are compared to determine whether nonparametric classifiers are more suitable than parametric classifiers for tropical species discrimination. Two factors of performance, particularly important in the context of our research, are explicitly studied: the quantity of data required to train the classifier and the variation in classification accuracy with increasing species richness. Once the best method is selected based on pixel-scale classification accuracy, its performance is assessed when combined with spatial information. Finally, segmentation of an RGB image derived from the hyperspectral image, based on the mean shift clustering method [29], is applied to delineate tree clusters.

TABLE I. PURE SPECIES IDENTIFIED AND LOCATED IN THE CAO-ALPHA IMAGERY

TABLE II. DATA CORRESPONDING TO CANOPY CROWNS CONTAINING MIXED SPECIES IDENTIFIED AND LOCATED IN THE CAO-ALPHA IMAGERY

II. DATA COLLECTION

In this section, we first briefly introduce the study site; we then give the characteristics of the hyperspectral imagery used in our study; finally, we describe the ground data collection and the species studied.

A. Study Site

This study is conducted at the Nanawale Forest Reserve, Island of Hawaii. The Nanawale forest is classified as lowland humid tropical forest, with an average elevation of 150 m above sea level. Mean annual precipitation and temperature are 3200 mm yr⁻¹ and 23 °C, respectively. The forest canopy is composed of about 17 species, mostly invasive non-native trees, with some native species remaining [8].

B. Hyperspectral Imagery

Imaging spectrometer data were acquired with the Carnegie Airborne Observatory (CAO)-Alpha sensor package [30] in September 2007 at an altitude of 1000 m above ground surface. The spatial resolution (ground sampling distance) of the image acquired during this flight is 0.56 m. A 1980 × 1420 pixel image is used, covering an area of about 70 ha. The spectral data contain 24 spectral bands evenly spaced between 390 nm and 1044 nm, covering the visible (VIS) and part of the near-infrared (NIR) domain. The spectra were radiometrically calibrated in the laboratory following the flight. Atmospheric corrections were applied using the ACORN 5LiBatch (Imspec LLC) model and a MODTRAN look-up table to correct for Rayleigh scattering and aerosols [31].
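As a quick check on the band layout described above, the following sketch lists nominal band centers for 24 evenly spaced bands. It assumes that 390 nm and 1044 nm are the centers of the first and last bands, which the text does not state explicitly. The function name `band_centers` is hypothetical.

```python
def band_centers(first_nm=390.0, last_nm=1044.0, n_bands=24):
    """Return nominal band centers (nm) for evenly spaced bands.

    Assumption: first_nm and last_nm are the centers of the first and
    last bands, so there are n_bands - 1 equal intervals between them.
    """
    step = (last_nm - first_nm) / (n_bands - 1)
    return [first_nm + k * step for k in range(n_bands)]


centers = band_centers()
spacing = centers[1] - centers[0]  # roughly 28.4 nm under this assumption
```

Under this assumption, the bands at indices 2, 6, and 9 fall near 447 nm, 561 nm, and 646 nm, close to the three visible bands reported later for the segmentation step.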

C. Biodiversity and Tree Location

The field observations were made in November 2010 using a tablet PC with an integrated differentially corrected Global Positioning System (PC-GPS), and by delineating the tree crowns on the PC-GPS using the hyperspectral imagery as a guide. A mixed crown is defined as one in which two or more species occupy the same canopy space at a scale of 1–2 m spatial resolution. Most mixed crowns were composed of Cecropia spp. combined with another species. A total of 920 ITCs were identified and located, corresponding to 17 "pure" species and 12 types of mixed crowns, resulting in 29 different classes to be discriminated. Tables I and II list the species encountered in the study and provide basic statistics on the number of pixels and tree crowns, as well as the average crown size, for each class. They show that the information collected in the field is highly variable in terms of the number of labeled pixels, the number of tree crowns delineated, and the average size of the tree crowns. This unevenness led us to discard species with too few data for certain experiments.

FÉRET AND ASNER: TREE SPECIES DISCRIMINATION IN TROPICAL FORESTS

TABLE III. DESCRIPTION OF THE CLASSIFIERS STUDIED, THE FREE PARAMETERS TO OPTIMIZE, AND THE OPTIMIZATION METHOD APPLIED

III. CLASSIFICATION, SEGMENTATION, AND SPECIES SEPARABILITY

In this section, we first describe the pixel-wise classification, including the different classifiers and the method used to compare them; we continue with the combination of spatial and spectral information and the method used for segmentation; finally, we describe the methods that we compare for assessing class separability.

A. Pixel-Wise Classification

Description of the classifiers: Several supervised classifiers are applied to perform tree species discrimination. These classifiers fall into two categories, parametric and nonparametric; Table III summarizes the classifiers compared here and gives a brief description of each.

Parametric Classifiers: Several variants of discriminant analysis exist [32] and have already been successfully applied to classify vegetation types or to identify tree species at the canopy scale [11], [13]–[18], [33]. The two main methods are linear discriminant analysis (LDA) and quadratic discriminant analysis (QDA). The assumed normal distributions are derived from a training data set, usually by maximum likelihood estimation: LDA uses a pooled within-class covariance matrix, whereas each class covariance is estimated separately in QDA. This makes LDA less sensitive to ill- and poorly posed problems. Several publications focusing on the leaf/branch scale confirmed the good performance of LDA for species discrimination [4], [6], [11], [34]. Numerous variations of discriminant analysis using regularization or extraction of orthogonal features have been developed to deal with ill-posed problems caused by singular covariance matrices. Bandos et al. [35] proposed a comparative study of different LDA-based methods and concluded that regularized discriminant analysis (RDA), initially developed by [36], is effective. In RDA, a covariance mixing parameter (λ) and a covariance shrinkage parameter (γ_RDA) must be adjusted between 0 and 1.
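A minimal sketch of how two such parameters can regularize a class covariance matrix is given below. It uses a simplified convex combination in the spirit of [36]; the original formulation also weights the class and pooled scatter matrices by sample counts, which is omitted here. The function name and the nested-list representation are illustrative assumptions, not the authors' implementation.

```python
def regularized_covariance(class_cov, pooled_cov, lam, gamma):
    """Blend a class covariance toward the pooled covariance, then shrink.

    lam   (0..1): 0 keeps the class covariance (QDA-like),
                  1 uses the pooled covariance (LDA-like).
    gamma (0..1): shrinks the blended matrix toward a scaled identity.
    Simplification: sample-count weighting is omitted in this sketch.
    """
    p = len(class_cov)
    # Step 1: mix class and pooled covariance element by element.
    mixed = [[(1.0 - lam) * class_cov[r][c] + lam * pooled_cov[r][c]
              for c in range(p)] for r in range(p)]
    # Step 2: shrink toward (trace / p) * identity.
    mean_var = sum(mixed[r][r] for r in range(p)) / p
    return [[(1.0 - gamma) * mixed[r][c] + (gamma * mean_var if r == c else 0.0)
             for c in range(p)] for r in range(p)]


# Hypothetical 2 x 2 covariance matrices used only for illustration.
class_cov = [[4.0, 1.0], [1.0, 2.0]]
pooled_cov = [[3.0, 0.5], [0.5, 3.0]]
qda_like = regularized_covariance(class_cov, pooled_cov, lam=0.0, gamma=0.0)
lda_like = regularized_covariance(class_cov, pooled_cov, lam=1.0, gamma=0.0)
spherical = regularized_covariance(class_cov, pooled_cov, lam=0.0, gamma=1.0)
```

With both parameters at 0 the class covariance is returned unchanged, which corresponds to QDA, whereas λ = 1 with γ_RDA = 0 returns the pooled covariance used by LDA.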
The tuning of these parameters is performed with a comprehensive grid search over their domain of definition (with an increment of 0.1 for each parameter). The regularization is optimized by minimizing the leave-one-out cross-validated estimate of misclassification risk, evaluated at each point of the grid [36]. The training of LDA and QDA is very fast, as is that of RDA when a reasonable number of nodes is used in the grid search.

Nonparametric Classifiers: The complexity of the optimization stage and the computational resources required for nonparametric classifiers are highly variable. The k-nearest neighbor algorithm (k-NN) classifies a sample by a majority vote among its nearest neighbors [37]. It is one of the simplest machine learning algorithms, and it has been widely used due to its good performance even with no prior knowledge about the distribution of the data. Here, the optimal number of neighbors to take into account for classification is selected by leave-one-out cross-validation. The MATLAB toolbox PRTools 4.1 [38] is used to perform k-NN classification.

Artificial neural networks (ANNs) are nonlinear statistical data modeling tools. They are widely used for classification and regression with remote sensing data [39]–[42]. ANNs have proved their ability to model complex problems, but their efficiency depends on the parameterization of the learning process. The most commonly used ANN in remote sensing is the multilayer perceptron with back-propagation algorithm (MLP) [41]. The selection of a proper architecture and the tuning of different parameters precede the MLP learning process. One single hidden layer is usually sufficient, and the number of neurons in this layer is critical to avoid poor classification and possibly overfitting. The learning rate η, the momentum μ, and the maximum number of iterations ("epochs") also have to be set by the user before the learning process, and an activation function controlling the amplitude of the output is assigned to each layer. In this paper, a MATLAB program [43] is used to perform classification with MLP-ANN.
The data are scaled, and twofold cross-validation is performed during the learning process. A trial-and-error strategy is used to determine appropriate values for the parameters. The resulting characteristics of our classifier are identical for all iterations: 25 neurons in the hidden layer; η = 0.001; μ = 0.001; number of epochs = 5000; activation function: hyperbolic tangent for the input and hidden layers, sigmoid function for the output layer. The values obtained for the learning rate and momentum are relatively low compared to the recommendations of [44], but such values are not uncommon [45].

SVM is a kernel-based method that performs classification tasks by constructing hyperplanes in a multidimensional space to separate samples of different class labels [46]. This classifier is highly efficient and usually produces results comparable to or better than those of other methods. Our study focuses on two kernels, linear (L) and radial basis function (RBF). One or more parameters have to be optimized before performing the training stage. The penalty parameter C (also called error term) is common to all SVMs. It controls the tradeoff between the complexity of the decision rule and the frequency of training errors [47]. This is the only parameter to tune in linear SVM (L-SVM). The kernel parameter γ_RBF also needs to be defined prior to applying RBF-SVM. SVM classifications are performed using the MATLAB interface of the LIBSVM package [48], and the procedure described by [49] is followed: first, the data are scaled between 0 and 1; then the optimal parameters are


assessed by an exhaustive grid search; finally, SVM is trained using the optimal parameters, and a tenfold cross-validation is used to estimate the training performance. The optimal γ_RBF is tuned in the range from 1e-5 to 1e5, whereas the optimal C is selected in the range from 1e0 to 1e10. SVMs are known to show good generalization performance and require no prior knowledge of the problem [50]. Contrary to ANNs, they can handle a large input space efficiently and are less subject to the Hughes phenomenon [51]. However, in our situation, the number of spectral channels used (24) is relatively modest.

Estimation of the performance of the classifiers: Several metrics exist to judge the performance of a classifier: the F-measure, the κ statistic, and overall accuracy are commonly used, depending on the type of classification problem. The overall classification accuracy is inadequate when the class distribution is imbalanced, so other metrics are preferred to compare classifier performance in most practical cases [52], [53]. However, in our study, we chose to balance the number of training and test samples equally between the different species. This choice is driven by the lack of prior information about the distribution of each species and by the desire to treat each species equally. Both the F-measure and overall accuracy were computed, and these two metrics showed a very similar trend. Because the latter is also commonly used in several publications performing tree species identification, the average overall accuracy is the only metric we used to compare classifiers.

Definition of the training and test data: Labeled pixels are extracted from the image to create two distinct pools of data used to train and test the classifiers. The sensitivity of the classifiers to two factors is assessed.
1) Classification accuracy is compared among classifiers with a varying number of training samples (3 to 400 randomly picked samples per species), and the test data set includes 1000 samples per species, excluding the training samples. Species with fewer than 1400 labeled pixels are discarded during this experiment, so only ten pure species are selected (see Table I). The minimum number of samples per species required to perform optimal classification is critical, as it may vary among classifiers, and is also a limiting factor when studying environments such as tropical rainforests due to the difficulty and high cost of tree location on the ground.

2) Classification accuracy is compared among classifiers for different levels of biodiversity, from two-class discrimination to discrimination of all 17 species. For each level of species richness, the classes are randomly selected from the 17 species and are added in a random order to average out the bias induced by a systematic sorting. Fifty samples per species are used for both the training and test stages, without sample repetition between the two data sets. An additional experiment is done with an added constraint on the number of tree crowns used in the training data set in order to assess the influence of a limited number of ITCs on classification, which corresponds to a possible decrease of the within-species variability. For this experiment, ten ITCs per species are randomly selected to supply 50 training samples for each species. The remaining ITCs are used to create the test data set including 50 samples

per species. The range of species richness then includes at most ten species.

The performance of a classifier is strongly dependent upon the representativeness and the distribution of the training samples. Gougeon [25] reported more than 20% variation in classification accuracy with the same classifier, depending on the samples used for training. Foody [54] reported that the use of a nonexhaustively defined set of classes significantly degraded the accuracy of ANN, and [55] showed that classification accuracy would vary depending on the forest composition. For each condition studied, 200 repetitions are processed in order to take this variability into account. Each repetition is characterized by the random selection of the training and test samples, as well as the random selection of the species when species richness is studied.

B. Spatial-Spectral Classification

Pixel-wise classification uses only spectral information, whereas the combination of spectral information with spatial information usually results in a superior classification [11], [16], [23], [26], [28]. Clark et al. [11] reported systematic improvement in species classification by using only one mean reflectance spectrum per tree crown (object-based classification), or by applying the majority-class rule to pixels within each tree crown (the most frequent class among the pixels is assigned to the whole tree crown). Lucas et al. [16] delineated ITCs in Australian forests and performed pixel selection from each tree crown based on different spectral criteria (point maxima or mean-lit area determined using different wavebands or derived measures). They showed significant differences in classification depending on the criterion. Here, we compare pixel-wise classification to object-based and majority-class rule classification using the delineated tree crowns. This comparison is performed using only the classifier showing the best performance on a pixel basis.
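The two crown-level strategies described above can be illustrated with a short sketch. The labels, spectra, and function names below are hypothetical and serve only to show the mechanics; the per-pixel classifier itself is not reproduced here.

```python
from collections import Counter


def majority_class(pixel_labels):
    """Return the most frequent predicted label among a crown's pixels.

    Ties are resolved in favor of the label encountered first,
    which is an arbitrary choice made for this sketch.
    """
    return Counter(pixel_labels).most_common(1)[0][0]


def mean_spectrum(pixel_spectra):
    """Average the spectra of all pixels in a crown, band by band."""
    n = len(pixel_spectra)
    return [sum(band_values) / n for band_values in zip(*pixel_spectra)]


# Hypothetical crown: five pixels, each already classified pixel-wise.
crown_labels = ["A", "B", "A", "A", "C"]
crown_label = majority_class(crown_labels)

# Hypothetical three-band spectra for a two-pixel crown.
crown_mean = mean_spectrum([[0.1, 0.2, 0.5], [0.3, 0.4, 0.7]])
```

In the majority-class rule, classification still happens per pixel and the vote is taken afterward, whereas in the object-based approach the classifier only ever sees one mean spectrum per crown.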
The same ten species selected while studying the influence of training sample size are used for this comparison because the number of pixels and tree crowns labeled for the discarded species is not sufficient to properly train the classifiers (Table I). Fifty training samples per species are used in the case of pixel-wise classification and crown classification based on the majority-class rule. The classification accuracy is then tested on all remaining samples for pixel-wise classification. This pixel-wise classification accuracy corresponds to the proportion of the surface correctly classified for each species, averaged among all species. The quantity and size of tree crowns vary strongly among species; thus, the percentage of tree crowns correctly identified would be misleading if compared to pixel-wise classification. The classification accuracy corresponding to tree crown classification based on the majority rule is therefore computed by assessing the proportion of the surface correctly identified for each species, averaged among all species. In the case of object-based classification, the mean reflectance is computed for each tree crown, and the classifier is trained and tested using these mean values. This results in a dramatic decrease in data availability, so a leave-one-out cross-validation is performed with the tree crowns. In the two other methods, the overall accuracy is expressed in terms of the mean proportion of surface correctly classified per species. These three approaches


are compared when all pixels are used in each tree crown and also when only sunlit pixels are selected. For each tree crown, sunlit pixels are defined by a reflectance at 800 nm greater than the mean reflectance measured over the tree crown at that wavelength.

C. Image Segmentation

Automatic tree crown segmentation in humid tropical forests is very challenging due to the high density and biodiversity among overlapping tree crowns. With hyperspectral imagery, some methods, including hyperspectral segmentation [56] and classification using spectral-spatial approaches [23], [24], showed promising results for automatic clustering and segmentation. However, these methods can be extremely time-consuming for operational use and have never been applied to humid tropical forests. The mean shift clustering algorithm implemented in the Edge Detection and Image SegmentatiON system (EDISON) [57] gave satisfactory results for automatic segmentation of tree crowns with a subset of three visible bands of our data (R = 646 nm; G = 560.7 nm; B = 447 nm). This nonparametric method only requires user settings for resolution, and the default values were left unchanged, with the exception of the minimum cluster size. The EDISON system quickly processed our data, so the adjustment of the minimum cluster size was based on a visual comparison between the original image and its segmented counterpart. The segments do not exactly correspond to ITCs. However, [28] found that even polygons produced through automated methods that were only partially in agreement with detailed ground mapping improved tree species classification. After this segmentation, a pixel-wise classification is performed using the classifier showing the best performance for pixel-wise classification, with 50 pixels per species for training, and a majority vote rule is applied to decide the class assigned to each region.

D. Species Separability

Supervised classifiers take advantage of high variability among classes relative to low variability within classes. However, within-species spectral variability induced by seasonal and environmental factors can lead to confusion in species recognition. Here, we compare a selection of metrics commonly used to assess spectral distances between classes. We create a reference data set including a limited number of tree crowns per species: five ITCs are randomly selected when more than ten ITCs are available, three ITCs when between five and ten ITCs are available, and at least one ITC is randomly selected if two to four ITCs are available. A subset of 50 samples is then randomly selected for each species. The remaining ITCs are included in a test data set, and 50 samples are also randomly selected for each species. Within-species distance is based on the distance between the reference data set and the test data set of the same species, whereas among-species distance corresponds to the distance between the reference data set of a given species and the test data set of each other species. This experiment is repeated 200 times, and the mean within- and among-species distances are compared. In the case of Cananga odorata and Zingiber zerumbet, the two species encompassing only one ITC, 100 samples are randomly picked and equally distributed between the reference data set and the test data set.
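The rule used to build the reference data set can be sketched as follows. This is an illustrative reading of the text rather than the authors' code: "at least one" is taken as exactly one, single-crown species are left to the separate pixel-splitting step, and the function names and the fixed random seed are assumptions.

```python
import random


def n_reference_crowns(n_available):
    """Number of crowns (ITCs) drawn for the reference set.

    Follows the rule in the text; "at least one" is taken as exactly one
    here, which is an assumption of this sketch.
    """
    if n_available > 10:
        return 5
    if n_available >= 5:
        return 3
    if n_available >= 2:
        return 1
    # Single-crown species are handled separately (pixels are split instead).
    return 0


def split_crowns(crown_ids, rng):
    """Randomly split crown ids into reference and test lists."""
    k = n_reference_crowns(len(crown_ids))
    reference = rng.sample(crown_ids, k)
    test = [c for c in crown_ids if c not in reference]
    return reference, test


rng = random.Random(0)  # fixed seed so the sketch is repeatable
reference, test = split_crowns(list(range(12)), rng)
```

Repeating the split with a fresh random draw at each of the 200 repetitions is what allows the mean within- and among-species distances to reflect crown-to-crown variability.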


Several metrics exist to assess class separability for applications in remote sensing. Here, we compare the performance of a selection of metrics by assessing, for each species, the spectral distance between the reference data set and each species included in the test data set. Price [1] proposed two metrics based on the difference in amplitude (D) and spectral angle (θ). D corresponds to the root mean square difference between a pair of spectra S_i and S_j averaged over the spectral interval of observation (λ_a to λ_b):

$$D_{S_i,S_j} = \left[ \frac{1}{\lambda_b - \lambda_a} \int_{\lambda_a}^{\lambda_b} \left[ S_i(\lambda) - S_j(\lambda) \right]^2 d\lambda \right]^{1/2}. \tag{1}$$

θ represents the angle between two spectra and may be interpreted as the difference in shape between a pair of spectra; it is insensitive to illumination and albedo effects:

$$\theta_{S_i,S_j} = \cos^{-1} \left( \frac{\int_{\lambda_a}^{\lambda_b} S_i(\lambda)\, S_j(\lambda)\, d\lambda}{\left[ \int_{\lambda_a}^{\lambda_b} S_i(\lambda)^2\, d\lambda \right]^{1/2} \left[ \int_{\lambda_a}^{\lambda_b} S_j(\lambda)^2\, d\lambda \right]^{1/2}} \right). \tag{2}$$

Here, S_i corresponds to the mean reflectance spectrum of the reference data set for species i, and S_j corresponds to the mean reflectance spectrum of the test data set for species j. These nonparametric metrics are fast and compare two individual samples, with no assumption about the distribution of the variables of each sample. Two separability tests commonly used in remote sensing, the Bhattacharyya distance (Bhatt) and the divergence (Div), are also studied [58], [59]. They are computed under the Gaussian distribution hypothesis as follows:

$$\mathrm{Bhatt}_{i,j} = \frac{1}{8} (\mu_i - \mu_j)^T \left( \frac{C_i + C_j}{2} \right)^{-1} (\mu_i - \mu_j) + \frac{1}{2} \ln \left( \frac{\left| (C_i + C_j)/2 \right|}{\sqrt{|C_i|\,|C_j|}} \right) \tag{3}$$

$$\mathrm{Div}_{i,j} = \frac{1}{2} \mathrm{Tr} \left[ (C_i - C_j) \left( C_j^{-1} - C_i^{-1} \right) \right] + \frac{1}{2} \mathrm{Tr} \left[ \left( C_i^{-1} + C_j^{-1} \right) (\mu_i - \mu_j)(\mu_i - \mu_j)^T \right] \tag{4}$$

where μ_i and μ_j correspond to the mean reflectance for classes i and j, C_i and C_j correspond to their covariance matrices, and Tr denotes the matrix trace. Bhatt and Div require a number of samples per species greater than the dimensionality of the data in order to avoid singular matrices, which cannot be inverted.

IV. RESULTS

We first report the results of the pixel-wise classification for the different experiments; we then show the influence of the combined spatial-spectral information. We also consider pixel-wise classification with mixed species classes, which is a common situation in tropical forests. Finally, we analyze species separability based on the metrics introduced previously.

A. Classifier Performance

1) Influence of the Training Data: The influence of the training sample size per class varies strongly depending on the method used for classification (Fig. 1). Three groups can be


Fig. 1. Variation of classification accuracy (±1 SD) with respect to an increasing training sample size per species, performed with parametric classifiers (solid lines) and nonparametric classifiers (dotted lines). Classification includes ten species.

Fig. 2. Variation of classification accuracy (±1 SD) with the species richness level, performed with parametric classifiers (solid lines) and nonparametric classifiers (dotted lines). Training data includes 50 samples per species and 17 species are included in the experiment.

defined. The first group includes MLP-ANN and k-NN, which are clearly outperformed by every other classifier, except QDA when fewer than 30 samples per species are used (an ill-posed classification). However, the performance of these two classifiers differs: the accuracy of k-NN increases quasi-linearly from 100 to 400 training samples, reaching 62%, whereas MLP-ANN reaches a maximum of 45% overall accuracy at about 100 samples per species and then decreases to 41% for 400 samples per species. The second group corresponds to QDA, which shows very poor performance when using fewer than 60 samples per species, but performs identically to RDA when more than 250 samples per species are included in the training data set. The third group includes LDA, RDA, L-SVM, and RBF-SVM. These classifiers behave relatively well even with very low sample sizes (about 50% accuracy when using five samples per species) and rapidly reach more than 70% overall accuracy with 30 samples per species or more. RDA is the most accurate classifier when the training sample size ranges from 30 to 70 samples per species (reaching 77% overall accuracy), but RBF-SVM and L-SVM outperform RDA when using more than 70 and 100 samples per species, respectively. LDA performs 5% to 10% worse than the three other classifiers when using more than 50 samples per species, reaching a maximum of 72%. Remarkably, the overall accuracy of RBF-SVM (and to a lesser extent L-SVM) keeps increasing with training sample size, reaching 83% for 400 samples per species.

We also considered the optimal parameter values in k-NN, RDA, L-SVM, and RBF-SVM. The mean number of neighbors required to train the k-NN classifier increases quasi-linearly from two neighbors when using three training samples per species to six neighbors for 400 training samples per species.
The regularization parameters applied to RDA converge toward 0 when more than 250 training samples per species are used, explaining why RDA and QDA perform equally in such situations. The parameter λ shows a higher sensitivity to training sample size, whereas the optimal γ_RDA is systematically 0. Finally, for both SVMs, the mean optimal C parameter ranges from 1e4, for very low training sample sizes corresponding to low classification accuracy, to a quasi-constant value between 1e7 and 1e8 with more than 20 training samples per species. The value of γ_RBF first decreases from 1e-1 to 1e-3 when

C increases, and then increases linearly from 1e−3 to 1e−1 while C remains quasiconstant. The selection of 50 to 100 training samples per species is enough to obtain near-optimal classification with LDA and RDA when using 24 bands. QDA requires at least 100 to 200 training samples per species to perform in near-optimal conditions. The classification accuracy obtained with L-SVM, RBF-SVM, and k-NN increases with training sample size over the whole range of the experiment. Finally, MLP-ANN shows very poor results compared to the other classifiers, which can be explained by a deficiency in the optimization of the parameters used during the training stage.

2) Influence of Biodiversity Level: Increasing species richness results in decreased classification accuracy for all classifiers (Fig. 2). RDA, L-SVM, and RBF-SVM show the highest classification accuracy when including six species or more, the slight advantage observed for RDA with low species richness tending to decrease with increasing species richness. In contrast, LDA and RDA perform identically for two-class discrimination, but the difference in overall accuracy increases with species richness, in favor of RDA. Moreover, k-NN, MLP-ANN, and QDA are outperformed over the whole range of species richness, and their inaccuracy increases with species richness at a faster rate than that of the other classifiers. The decrease in classification accuracy does not show a linear trend, but the species richness studied here is not large enough to clearly confirm this nonlinear trend. Classification accuracy obtained when classifying ten species randomly picked from the 17 species available (Fig. 2) is significantly higher than classification accuracy obtained with ten given species when using 50 training samples per species (Fig. 1). This can be explained by a better discrimination of some or all of the seven additional species used here.
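For reference, the roles of the two RDA regularization parameters can be written out as follows, under the assumption that RDA follows the formulation of Friedman [36]. The notation (S_k, S, N_k, N, p) is introduced here for illustration only: S_k is the scatter matrix of class k, S is the sum of the S_k over all classes, N_k is the number of training samples in class k, N is the total number of training samples, and p is the number of spectral bands.

```latex
\hat{\Sigma}_k(\lambda)
  = \frac{(1-\lambda)\,S_k + \lambda\,S}{(1-\lambda)\,N_k + \lambda\,N},
\qquad
\hat{\Sigma}_k(\lambda,\gamma_{RDA})
  = (1-\gamma_{RDA})\,\hat{\Sigma}_k(\lambda)
  + \frac{\gamma_{RDA}}{p}\,\operatorname{tr}\!\left[\hat{\Sigma}_k(\lambda)\right] I_p
```

Under this formulation, λ = 0 with γRDA = 0 gives one covariance matrix per class (QDA), whereas λ = 1 with γRDA = 0 gives a single pooled covariance matrix (LDA). This is consistent with RDA and QDA performing equally when both parameters approach 0.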
The variation of the regularization parameters with species richness confirms that RDA tends to be regularized toward LDA when discrimination is a two-class problem, with γRDA = 0 and λ = 0.7 on average. When adding classes, γRDA remains equal to 0 whereas λ converges toward 0.1. The average number of neighbors used with k-NN decreases with increasing species richness, from 4 for 2 species to 2 for 17 species. Finally, for SVMs, the value of C increases with


TABLE IV. PRODUCER’S ACCURACY (%), USER’S ACCURACY (%), AND OVERALL ACCURACY (%) OBTAINED FOR THE DISCRIMINATION OF 17 SPECIES WITH RDA TRAINED WITH 50 SAMPLES PER SPECIES AND TESTED WITH 50 SAMPLES PER SPECIES
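The three accuracy measures named in the caption of Table IV can be derived from a confusion matrix, as in the following minimal sketch. It assumes the common remote sensing convention in which rows are reference (true) classes and columns are predicted classes. The function name `accuracy_metrics` and the toy matrix in the usage note are illustrative and are not taken from the study.

```python
def accuracy_metrics(confusion):
    """Return (producer's accuracies, user's accuracies, overall accuracy).

    confusion[i][j] is the number of test samples whose reference class is i
    and whose predicted class is j (assumed convention).
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n))
    producers, users = [], []
    for i in range(n):
        row_total = sum(confusion[i])                        # reference samples of class i
        col_total = sum(confusion[r][i] for r in range(n))   # samples labeled as class i
        # Producer's accuracy: share of reference samples of class i recovered.
        producers.append(confusion[i][i] / row_total if row_total else float("nan"))
        # User's accuracy: share of samples labeled i that truly belong to class i.
        users.append(confusion[i][i] / col_total if col_total else float("nan"))
    return producers, users, correct / total
```

For a two-class toy matrix `[[40, 10], [5, 45]]`, the producer's accuracies are 0.80 and 0.90 and the overall accuracy is 0.85.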


TABLE V. CLASSIFICATION OVERALL ACCURACY (%) OBTAINED WHEN CLASSIFYING TEN SPECIES WITH RDA ON A PIXEL BASIS, MAJORITY VOTE RULE CLASSIFICATION DERIVED FROM ITCs DELINEATED IN THE FIELD AND OBJECT-BASED CLASSIFICATION DERIVED FROM ITCs
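The majority vote rule named in the caption of Table V can be sketched as follows: each crown receives the label most frequently assigned to its pixels by the pixel-wise classifier, optionally after discarding shaded pixels. This is a minimal sketch under stated assumptions; the function name `crown_label`, the boolean sunlit mask, and the fallback to all pixels when no sunlit pixel exists are illustrative choices, not the authors' implementation.

```python
from collections import Counter


def crown_label(pixel_labels, sunlit=None):
    """Assign one species label to a crown from its per-pixel labels.

    pixel_labels: labels predicted for each pixel of the crown.
    sunlit: optional list of booleans, True where the pixel is sunlit.
    """
    if sunlit is not None:
        kept = [label for label, lit in zip(pixel_labels, sunlit) if lit]
        # Assumed fallback: use all pixels if the crown has no sunlit pixel.
        pixel_labels = kept or pixel_labels
    # The most frequent label wins; ties go to the label seen first.
    return Counter(pixel_labels).most_common(1)[0][0]
```

Restricting the vote to sunlit pixels can change the crown label when shaded pixels are preferentially confused with another species.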

explained by the large variability existing among ITCs of the same species, and the inability of the classifiers to correctly estimate this variation from a limited number of individual trees. Moreover, the best classifier is LDA when the number of ITCs available for training is low. These observations are very important, as the number of ITCs delineated for each species is a highly limiting factor during field data collection. They suggest that semisupervised classification methods [60], [61] should be used to overcome problems due to low training sample size and lack of representativeness of the delineated tree crowns and thus to improve classification accuracy. Some domain adaptation techniques may also help decrease the sample selection bias [62].

B. Combining Spectral and Spatial Information

Fig. 3. Variation of classification accuracy with the species richness level, performed with parametric classifiers (solid lines) and nonparametric classifiers (dotted lines). Only five ITCs per species are used for training; training data includes 50 samples per species and ten species are included in the experiment.

the number of species to discriminate, from 1e5 to about 1e8. However, the distribution of C suggests that the range used to optimize C should be widened to higher values; we could not do so because of the excessive computing time required. We conclude that SVMs could perform better if a faster implementation allowing larger values of C during optimization were developed. γRBF remains stable between 1e−2 and 1e−3. Table IV shows the user’s and producer’s accuracy obtained for the test data set when discriminating 17 species with the best classifier, RDA, using 50 training samples per species. Producer’s accuracy ranges from 47.4% to 98.1%, and user’s accuracy ranges from 41.1% to 98.9%. Ten species show more than 70% user’s and producer’s accuracy, and the overall accuracy is 73.2%. The same experiment done with a constraint on the number of ITCs used for each species in the training data set shows different results (Fig. 3); k-NN and MLP-ANN are not shown in this experiment as they perform poorly. The overall accuracy strongly decreases when using only five ITCs per species, and about 60% to 65% of the samples are correctly classified for a ten-class problem (to be compared with the 80% overall accuracy obtained for ten species in Fig. 2). This can be

Ten species are selected to study the combination of spectral/spatial information. Between 16 and 168 tree crowns per species (see Table I) are used to study species classification at the tree crown scale using majority-class rule and object-based approaches. The comparison between classification using all pixels and only sunlit pixels at the pixel scale showed that prior selection of sunlit pixels slightly improves classification at the pixel scale (+1.5%) (Table V), as reported by [11]. The same effect is observed at the tree crown scale: tree crown identification based on the majority rule and on an object-based approach produces slightly higher classification accuracy (+0.2% and +1.6%, respectively) with sunlit pixels only. With all pixels, tree crown identification with the majority-class rule or the object-based approach shows better results than pixel-wise classification. We conclude that the majority-class rule using only sunlit pixels gives the best classification accuracy at the crown scale.

C. Discrimination of Mixed Crowns

As when using only pure species, RDA, L-SVM, and RBF-SVM show higher performances over the whole range of species richness, followed by LDA (Fig. 4). The addition of classes corresponding to mixed crowns in the classification results in decreased overall accuracy compared to when 17 pure species are discriminated. Two reasons explain this lower accuracy: 1) higher species richness increases the risk of confusion between species; 2) some pixels may show a spectral mixture composed of a majority of one or the other species, or a balanced combination of both. It is impossible to describe this mixture accurately, and the resulting spectroscopic signature may be very similar to that of a pure species and thus increase the risk of confusion: the fine analysis of the confusion matrix (not shown)


TABLE VI. PRODUCER’S ACCURACY (%), USER’S ACCURACY (%), AND OVERALL ACCURACY (%) OBTAINED FOR THE DISCRIMINATION OF 17 PURE SPECIES AND 12 MIXED SPECIES WITH RDA TRAINED WITH 50 SAMPLES PER SPECIES AND TESTED WITH 50 SAMPLES PER SPECIES

Fig. 4. Variation of classification accuracy (±1 SD) with species richness. The classes include mixed species, and discrimination is performed with parametric classifiers (solid lines) and nonparametric classifiers (dotted lines). Training data includes 50 samples per species; 17 pure species and 12 mixed species are included in the experiment.

reveals that almost 20% of the pixels corresponding to the combination of Aleurites moluccana with Pandanus tectorius are misidentified as pure Pandanus tectorius. On the other hand, more than 70% of the crowns mixing Cecropia peltata, Mangifera indica, and Pandanus tectorius are correctly assessed, and very low systematic confusion with the individual species is observed, suggesting specific spectral signatures of these crowns. The nonlinear trend assumed previously when studying 17 species is confirmed with this 29-class discrimination. These results show the same trend as that obtained by [6], who performed species discrimination using LDA, based on leaf optical properties and canopy reflectances scaled up from leaf optical properties. Some authors also report error in classification increasing linearly with species richness in dry tropical forest [4]. Most of the mixed species studied here combine Cecropia peltata with one or two other species. User’s accuracy obtained for Cecropia peltata decreases by almost 20%, whereas producer’s accuracy decreases by 27.5% (Table VI): about 50% of the pixels identified as Cecropia peltata are in fact mixed crowns including Cecropia peltata, and more than 50% of the pixels labeled as Cecropia peltata are classified as Cecropia peltata mixed with other species (results not shown). Eleven out of 12 mixed classes show less than 70% user’s and producer’s accuracy. However, the results obtained with pure species remain stable, and the only pure species showing a decrease in producer’s accuracy to less than 70% was Trema orientalis, because of the confusion with the mixed crowns Cecropia peltata + Trema orientalis. This result confirms that mixed crowns may induce a noticeable decrease in classification accuracy when using a conventional hard classifier in which each pixel (or other spatial unit) is assigned unambiguously to a single class.
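To make the notion of a hard classifier concrete, the sketch below assigns each pixel to the single class with the highest posterior probability. The second function, which flags pixels whose two best classes are close, is purely illustrative of why mixed crowns are problematic for hard assignment: it is not part of the method used in this study, and the margin threshold of 0.2 is a hypothetical value.

```python
def hard_label(posteriors):
    """Return the single class with the highest posterior probability.

    posteriors: dict mapping class name to posterior probability.
    """
    return max(posteriors, key=posteriors.get)


def is_ambiguous(posteriors, min_margin=0.2):
    """Flag a pixel whose two most probable classes differ by less than
    `min_margin` (hypothetical threshold, for illustration only)."""
    top_two = sorted(posteriors.values(), reverse=True)[:2]
    return len(top_two) == 2 and (top_two[0] - top_two[1]) < min_margin
```

A pixel with posteriors of 0.48 and 0.44 for two species receives the same hard label as a pixel with 0.85 for the first species, even though the first case carries far less certainty.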
The overall accuracy of RDA, the best performing classifier when considering mixed crowns, applied to 29 species or mixed species classes reaches 61.7%.

D. Species Mapping With Combined Segmentation and Pixel-Wise Classification

The combination of mixed crowns and pure species results in increased error of discrimination, which leads us to discard mixed crowns from the species mapping in the hyperspectral image. Fig. 5 (left) displays the hyperspectral data in three channels (Red = 646.0 nm; Green = 560.7 nm; Blue = 447.0 nm) and shows the ITCs delineated in the field and used for training and test. The results of the segmentation shown in Fig. 5 (right) indicate that the segmentation does not delineate ITCs correctly, but that the isolation is meaningful as the segments usually do not group tree crowns corresponding to different species and provide a good basis for species discrimination. However, these segments may correspond to fractions of tree crowns showing different sun exposure. As a consequence, the differentiation between sunlit and shaded pixels would not make sense. Thus, we decide to keep shaded pixels for tree crown classification. Each species is displayed using a unique color in both Fig. 5 (left) and Fig. 5 (right).

E. Spectral Separability

Here, we analyze the spectral distance within and among species (Fig. 6). The number of species for which within-species distance is smaller than among-species distance depends on the metric used: nine species show lower within-species variability than among-species variability with D, compared with 12 with θ and Div, and 14 with Bhatt. These results also show that within-species distance for Cananga odorata and Zingiber zerumbet is much lower than among-species distance, which is explained by the fact that the spectra compared come from the same ITC. Finally, a clear


Fig. 5. (Left) Image taken from the Nanawale Forest Reserve (HI). The three channels used to display the image are (R = 646 nm; G = 560.7 nm; B = 447 nm). The colored polygons correspond to the ITCs delineated after the field survey. (Right) Map of the 17 pure species (see Table I) obtained after applying species classification based on RDA trained with 50 samples per species. The colors correspond to the same species in each image.

Fig. 6. Comparison of different metrics to assess within-species (circles) and among-species (dots) spectral distance. (a) Difference in amplitude (D). (b) Spectral angle (θ). (c) Bhattacharyya distance (Bhatt). (d) Divergence (Div).
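Two of the metrics compared in Fig. 6, the difference in amplitude (D) and the spectral angle (θ), can be sketched as follows. The sketch assumes that D is the root-mean-square difference between two spectra sampled on the same bands and that θ is the angle between the two spectra viewed as vectors. The formal definitions are those given in the methods, and the function names are illustrative.

```python
import math


def amplitude_difference(s1, s2):
    """Root-mean-square difference between two spectra (assumed form of D)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s1, s2)) / len(s1))


def spectral_angle(s1, s2):
    """Angle in radians between two spectra treated as vectors (theta)."""
    dot = sum(a * b for a, b in zip(s1, s2))
    norm = math.sqrt(sum(a * a for a in s1)) * math.sqrt(sum(b * b for b in s2))
    # Clamp the cosine to [-1, 1] to guard against rounding before acos.
    return math.acos(max(-1.0, min(1.0, dot / norm)))
```

Multiplying a spectrum by a constant leaves θ at zero but changes D, which illustrates how the two metrics can respond differently to brightness differences between spectra of similar shape.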

difference between within- and among-species distances is visible for three species only with D, seven species with θ, eight species with Div and 12 species with Bhatt. The latter is the only metric showing lower within-species distance for Trema orientalis and Pandanus tectorius. These results imply that Bhatt should be preferred to the other methods included in our

comparison to assess species separability. However, the close values obtained for within- and among-species separability of certain species (e.g., Cecropia peltata, Pithecellobium saman, Psidium cattleianum) suggest possible difficulties, confirmed by the low classification accuracy (Table IV). No significant correlation could be found between the confusion measured


between species after classification and the values obtained for the different metrics used for separability.

V. DISCUSSION

A. Comparison of the Classifiers

The results obtained when comparing the different classifiers show that L-SVM and RBF-SVM outperform the other classifiers when enough training samples are available. Moreover, the distribution observed for the optimal free parameters suggests that the overall accuracy can be improved for these two classifiers. The choice of RDA as the best classifier in the standard situation of our classification scheme (50 training samples per species) is based on our results, but we expect that a more accurate selection of the free parameters used by SVMs would lead to even better results. In the same way, the poor result obtained when using MLP-ANN for classification can be partly explained by the suboptimal values selected during the parameterization of the training stage. The relative performance of SVMs and ANNs is case dependent [63], [64], and more work needs to be done to compare these two classifiers under optimal conditions and in different vegetation types.

B. Improving the Spectral/Spatial Classification

Our study confirms the results of previous studies, showing that spatial information improves spectral classification. However, this improvement with spatial information remains limited due to the modest accuracy of segmentation applied to tropical forests: oversegmentation is preferred in order to avoid the inclusion of different species in one unique region. In this case, ITCs are segmented into several regions, and it becomes impossible, for example, to automatically apply a method to select sunlit pixels. Added to the difficulty of accurately segmenting overlapping tree crowns, the increasing confusion induced by mixed pixels shown in this study highlights the importance of improving ITC segmentation by taking into account the structure of the canopy.
Information about canopy structure cannot be retrieved easily from spectroscopic measurements; however, multisensor systems combining hyperspectral imagery with LiDAR acquisition allow measurements of canopy structure and may be of great interest to improve tree crown segmentation and mask pixels corresponding to mixed tree crowns. Recent advances in 3-D segmentation using LiDAR show very promising results for tree species classification [65].

C. Assessing Spectral Separability and Confusion

Overlapping among- and within-species spectral variability based on D and θ was reported by [4] and [6], raising rather pessimistic opinions about the feasibility of species identification in tropical forests. However, they observed good classification accuracy at the leaf level despite this high degree of overlap. This exaggerated similarity between within- and among-species variability is explained by the method used to assess within-species variability: within-species difference in spectral amplitude and spectral angle correspond to the mean value obtained when assessing all possible pairs of leaf optics corresponding to one species. For both D and θ, this method applied to our data leads to very similar within- and among-species variability measured on 17 species with 50 samples per species, which would not lead one to expect the overall accuracy of 73% that was obtained. We chose to compute within-species variability using only the mean spectra from the reference and test data sets, as it was the only way to perform a fair comparison with the other methods. Finally, measuring species separability based on D or θ alone does not represent the more complex spectral variability existing among species, which is explained by subtle differences in chemistry and structure that unevenly influence different parts of the spectral domain. Including covariance between spectral features, as done with Bhatt and Div, greatly improves separability. The measure of separability also has to be compared to the relative quantity of samples available for each species: the systematic confusion observed between Cecropia peltata and Psidium guajava, or Pithecellobium saman and Psidium cattleianum (data not shown), suggests a very low spectral separability, confirmed by the similar Bhatt obtained within each species and between the two species. On the other hand, the large number of labeled pixels and individual trees available for Trema orientalis, Mangifera indica, Eucalyptus robusta, and Syzygium jambos and their excellent classification accuracy suggest good spectral separability for these species, which is confirmed by the Bhattacharyya distance, even when using only five ITCs per species. It is difficult to draw clear conclusions on the possibility of identifying Cocos nucifera or Flindersia brayleyana based on their classification accuracy, as the number of labeled samples and individuals identified on the ground is very low. In that situation, the risk of incorrectly estimating within-species variability and the error associated with field location increase, leading either to an overestimation or an underestimation of spectral variability for a given species.

VI. CONCLUSION

We performed species discrimination on airborne imaging spectrometer data collected over an area of high canopy diversity, demonstrating good performance of LDA, RDA, L-SVM, and RBF-SVM. We found no evidence that nonparametric classifiers perform better than parametric classifiers or vice versa. RDA provides many advantages including high classification accuracy relative to a small number of training samples, with a fast optimization of the model used for training, and it should be preferred when the training data includes enough ITCs for each studied species. However, if the number of ITCs is not sufficient to describe within-species variability accurately, LDA may be a better choice. L-SVM and RBF-SVM show similar performances to LDA and RDA, but we suggest that the overall accuracy obtained in our study could be improved by a better optimization of the parameterization used during the training stage. Moreover, we noticed that the increase of overall accuracy with training sample size was not asymptotic for these two types of SVM, contrary to LDA and RDA, highlighting once again the possibility of further improving the performance of SVMs. QDA should be avoided, as it never has the advantage over LDA and RDA. The simple implementation and use of k-NN were offset by the relatively poor results, and ANN suffered from suboptimal


parameterization, leading to poor classification accuracy. The confusion during classification increased with species richness (biodiversity), and high classification accuracy seems unlikely when more than 30 canopy species are discriminated at a time. These results suggest that alternative methods for classification, including semisupervised and unsupervised methods, should be adapted in order to handle the extreme species richness of tropical forests, and to compensate for the lack of training samples. Domain adaptation techniques should also be included in studies focusing on the temporal evolution of species distribution. Finally, several methods including measures of spectral similarity and scaling-up of leaf optical properties combined with a comprehensive leaf spectral library should be considered to improve studies on the biodiversity of tropical forests using remote sensing. From this perspective, our comparison of different metrics used to assess the distance within and among species also showed that within- and among-species distances are usually low, and we recommend the Bhattacharyya distance to assess spectral distances. This study shows the potential of VIS/NIR hyperspectral imagery to identify species in tropical forests, despite the relatively low spectral resolution of our data. The next generation of sensors measuring into the shortwave infrared, and potentially coupling measurements with LiDAR, will provide an opportunity to greatly improve classification accuracies at the species level with much expanded spectral and spatial information [66].

ACKNOWLEDGMENT

The authors thank R. Martin, J. Asner, and K. Kinney for field data collection. The authors also thank the two anonymous reviewers for their valuable comments and advice to improve the manuscript. The Carnegie Spectranomics Project (http://spectranomics.ciw.edu) and this work are supported by the John D. and Catherine T.
MacArthur Foundation, the Gordon and Betty Moore Foundation, the Grantham Foundation for the Protection of the Environment, Brigitte Berthelemot, and William Hearst III.

REFERENCES

[1] J. C. Price, “How unique are spectral signatures,” Remote Sens. Environ., vol. 49, no. 3, pp. 181–186, Sep. 1994.
[2] M. A. Cochrane, “Using vegetation reflectance variability for species level classification of hyperspectral data,” Int. J. Remote Sens., vol. 21, no. 10, pp. 2075–2087, Jul. 10, 2000.
[3] J. K. Zhang, B. Rivard, A. Sanchez-Azofeifa, and K. Castro-Esau, “Intra and inter-class spectral variability of tropical tree species at La Selva, Costa Rica: Implications for species identification using HYDICE imagery,” Remote Sens. Environ., vol. 105, no. 2, pp. 129–141, Nov. 30, 2006.
[4] K. L. Castro-Esau, G. A. Sanchez-Azofeifa, B. Rivard, S. J. Wright, and M. Quesada, “Variability in leaf optical properties of Mesoamerican trees and the potential for species classification,” Amer. J. Botany, vol. 93, no. 4, pp. 517–530, Apr. 2006.
[5] B. Rivard, A. Sanchez-Azofeifa, S. Foley, and J. Calvo-Alvarado, “Species Classification of Tropical Tree Leaf Reflectance and Dependence on Selection of Spectral Bands,” in Hyperspectral Remote Sensing of Tropical and Sub-Tropical Forests. Boca Raton, FL: CRC Press, 2008, pp. 141–159.
[6] J. B. Feret and G. P. Asner, “Spectroscopic classification of tropical forest species using radiative transfer modeling,” Remote Sens. Environ., vol. 115, no. 9, pp. 2415–2422, Sep. 15, 2008.
[7] G. P. Asner and R. E. Martin, “Airborne spectranomics: Mapping canopy chemical and taxonomic diversity in tropical forests,” Frontier Ecol. Environ., vol. 7, pp. 269–276, Jun. 2009.


[8] K. M. Carlson, G. P. Asner, R. F. Hughes, R. Ostertag, and R. E. Martin, “Hyperspectral remote sensing of canopy biodiversity in Hawaiian lowland rainforests,” Ecosystems, vol. 10, no. 4, pp. 536–549, Jun. 2007.
[9] G. M. Foody and M. E. J. Cutler, “Tree biodiversity in protected and logged Bornean tropical rain forests and its measurement by satellite remote sensing,” J. Biogeogr., vol. 30, no. 7, pp. 1053–1066, Jul. 2003.
[10] G. M. Foody and M. E. J. Cutler, “Mapping the species richness and composition of tropical forests from remotely sensed data with neural networks,” Ecol. Model., vol. 195, no. 1/2, pp. 37–42, May 15, 2006.
[11] M. L. Clark, D. A. Roberts, and D. B. Clark, “Hyperspectral discrimination of tropical rain forest tree species at leaf to crown scales,” Remote Sens. Environ., vol. 96, no. 3/4, pp. 375–398, Jun. 30, 2005.
[12] H. Z. M. Shafri, A. Suhaili, and S. Mansor, “The performance of maximum likelihood, spectral angle mapper, neural network and decision tree classifiers in hyperspectral image analysis,” J. Comput. Sci., vol. 3, no. 6, pp. 419–423, 2007.
[13] M. Papes, R. Tupayachi, P. Martinez, A. T. Peterson, and G. V. N. Powell, “Using hyperspectral satellite imagery for regional inventories: A test with tropical emergent trees in the Amazon Basin,” J. Vegetation Sci., vol. 21, no. 2, pp. 342–354, Apr. 2010.
[14] K. L. Castro-Esau, G. A. Sanchez-Azofeifa, and T. Caelli, “Discrimination of lianas and trees with leaf-level hyperspectral data,” Remote Sens. Environ., vol. 90, no. 3, pp. 353–372, Apr. 15, 2004.
[15] P. Gong, R. L. Pu, and B. Yu, “Conifer species recognition: An exploratory analysis of in situ hyperspectral data,” Remote Sens. Environ., vol. 62, no. 2, pp. 189–200, Nov. 1997.
[16] R. Lucas, P. Bunting, M. Paterson, and L. Chisholm, “Classification of Australian forest communities using aerial photography, CASI and HyMap data,” Remote Sens. Environ., vol. 112, no. 5, pp. 2088–2103, May 15, 2008.
[17] P. S. Thenkabail, E. A. Enclona, M. S. Ashton, C. Legg, and M. J. De Dieu, “Hyperion, IKONOS, ALI, ETM plus sensors in the study of African rainforests,” Remote Sens. Environ., vol. 90, no. 1, pp. 23–43, Mar. 15, 2004.
[18] J. A. N. van Aardt and R. H. Wynne, “Spectral separability among six southern tree species,” Photogramm. Eng. Remote Sens., vol. 67, no. 12, pp. 1367–1375, Dec. 2001.
[19] R. Pu, P. Gong, Y. Tian, X. Miao, R. I. Carruthers, and G. L. Anderson, “Invasive species change detection using artificial neural networks and CASI hyperspectral imagery,” Environ. Monit. Assessment, vol. 140, no. 1–3, pp. 15–32, May 2008.
[20] M. Kalacska, S. Bohman, G. A. Sanchez-Azofeifa, K. Castro-Esau, and T. Caelli, “Hyperspectral discrimination of tropical dry forest lianas and trees: Comparative data reduction approaches at the leaf and canopy levels,” Remote Sens. Environ., vol. 109, no. 4, pp. 406–415, Aug. 30, 2007.
[21] T. Hastie, A. Buja, and R. Tibshirani, “Penalized discriminant-analysis,” Ann. Stat., vol. 23, no. 1, pp. 73–102, Feb. 1995.
[22] J. R. Rausch and K. Kelley, “A comparison of linear and mixture models for discriminant analysis under nonnormality,” Behav. Res. Methods, vol. 41, no. 1, pp. 85–98, Feb. 2009.
[23] Y. Tarabalka, J. A. Benediktsson, and J. Chanussot, “Spectral-spatial classification of hyperspectral imagery based on partitional clustering techniques,” IEEE Trans. Geosci. Remote Sens., vol. 47, no. 8, pp. 2973–2987, Aug. 2009.
[24] Y. Tarabalka, J. Chanussot, and J. A. Benediktsson, “Segmentation and classification of hyperspectral images using watershed transformation,” Pattern Recog., vol. 43, no. 7, pp. 2367–2379, Jul. 2010.
[25] F. A. Gougeon, “Comparison of possible multispectral classification schemes for tree crowns individually delineated on high spatial resolution meis images,” Can. J. Remote Sens., vol. 21, no. 1, pp. 1–9, 1995.
[26] P. Bunting, W. He, R. Zwiggelaar, and R. Lucas, “Combining texture and hyperspectral information for the classification of tree species in australian savanna woodlands,” in Innovations in Remote Sensing and Photogrammetry, S. Jones and K. Reinke, Eds. Berlin, Germany: Springer-Verlag, 2009, pp. 19–26.
[27] D. G. Leckie, F. A. Gougeon, N. Walsworth, and D. Paradine, “Stand delineation and composition estimation using semi-automated individual tree crown analysis,” Remote Sens. Environ., vol. 85, no. 3, pp. 355–369, May 30, 2003.
[28] T. A. Warner, J. B. McGraw, and R. Landenberger, “Segmentation and classification of high resolution imagery for mapping individual species in a closed canopy, deciduous forest,” Sci. China. Ser. E., vol. 49, no. 1, pp. 128–139, Jun. 2006.
[29] D. Comaniciu and P. Meer, “Mean shift: A robust approach toward feature space analysis,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 24, no. 5, pp. 603–619, May 2002.


[30] G. P. Asner, D. E. Knapp, T. Kennedy-Bowdoin, M. O. Jones, R. E. Martin, J. Boardman, and C. B. Field, “Carnegie airborne observatory: In-flight fusion of hyperspectral imaging and waveform light detection and ranging (wLiDAR) for three-dimensional studies of ecosystems,” J. Appl. Remote Sens., vol. 1, no. 1, pp. 013536-1–013536-21, 2007.
[31] R. O. Green, B. E. Pavri, and T. G. Chrien, “On-orbit radiometric and spectral calibration characteristics of EO-1 Hyperion derived with an underflight of AVIRIS and in situ measurements at Salar de Arizaro, Argentina,” IEEE Trans. Geosci. Remote Sens., vol. 41, no. 6, pp. 1194–1203, Jun. 2003.
[32] R. A. Fisher, “The use of multiple measurements in taxonomic problems,” Ann. Eugenics, vol. 7, no. 2, pp. 179–188, Sep. 1936.
[33] J. A. N. Van Aardt and R. H. Wynne, “Examining pine spectral separability using hyperspectral data from an airborne sensor: An extension of field-based results,” Int. J. Remote Sens., vol. 28, no. 2, pp. 431–436, Jan. 2007.
[34] T. Fung, F. Y. Ma, and W. L. Siu, “Hyperspectral data analysis for subtropical tree species recognition,” in Proc. IEEE IGARSS, 1998, vol. 3, pp. 1298–1300.
[35] T. V. Bandos, L. Bruzzone, and G. Camps-Valls, “Classification of hyperspectral images with regularized linear discriminant analysis,” IEEE Trans. Geosci. Remote Sens., vol. 47, no. 3, pp. 862–873, Mar. 2009.
[36] J. H. Friedman, “Regularized discriminant-analysis,” J. Amer. Stat. Assoc., vol. 84, no. 405, pp. 165–175, Mar. 1989.
[37] E. Fix and J. L. Hodges, “Discriminatory analysis—nonparametric discrimination—consistency properties,” Int. Stat. Rev., vol. 57, no. 3, pp. 238–247, Dec. 1989.
[38] R. P. W. Duin, P. Juszczak, P. Paclik, E. Pekalska, D. de Ridder, D. M. J. Tax, and S. Verzakov, “PRTools4.1, A Matlab toolbox for pattern recognition,” Delft Univ. Technol., Delft, The Netherlands, 2007.
[39] S. Haykin, Ed., Neural Networks: A Comprehensive Foundation, 2nd ed. Englewood Cliffs, NJ: Prentice-Hall, 1999, 842 Pages.
[40] L. Bruzzone and D. F. Prieto, “A technique for the selection of kernel-function parameters in RBF neural networks for classification of remote-sensing images,” IEEE Trans. Geosci. Remote Sens., vol. 37, no. 2, pp. 1179–1184, Mar. 1999.
[41] J. F. Mas and J. J. Flores, “The application of artificial neural networks to the analysis of remotely sensed data,” Int. J. Remote Sens., vol. 29, no. 3, pp. 617–663, Feb. 2008.
[42] H. Yang, F. van der Meer, W. Bakker, and Z. J. Tan, “A back-propagation neural network for mineralogical mapping from AVIRIS data,” Int. J. Remote Sens., vol. 20, no. 1, pp. 97–110, Jan. 10, 1999.
[43] Y. H. Hu and J.-N. Hwang, “Introduction to neural networks for signal processing,” in Handbook of Neural Network Signal Processing. Boca Raton, FL: CRC Press, 2001.
[44] T. Kavzoglu and P. M. Mather, “The use of backpropagating artificial neural networks in land cover classification,” Int. J. Remote Sens., vol. 24, no. 3, pp. 4907–4938, Dec. 2003.
[45] J. D. Paola and R. A. Schowengerdt, “The effect of neural-network structure on a multispectral land-use/land-cover classification,” Photogramm. Eng. Remote Sens., vol. 63, no. 5, pp. 535–544, May 1997.
[46] V. N. Vapnik, The Nature of Statistical Learning Theory. New York: Springer-Verlag, 1995.
[47] C. Cortes and V. Vapnik, “Support-vector networks,” Mach. Learn., vol. 20, no. 3, pp. 273–297, Sep. 1995.
[48] C.-C. Chang and C.-J. Lin, “LIBSVM: A library for support vector machines,” ACM Trans. Intell. Syst. Technol., vol. 2, no. 3, pp. 1–39, Apr. 2011.
[49] C.-W. Hsu, C.-C. Chang, and C.-J. Lin, “A practical guide to support vector classification,” Bioinformatics, vol. 1, no. 1, pp. 1–16, 2010.
[50] C. J. C. Burges, “A tutorial on support vector machines for pattern recognition,” Data Mining Knowl. Discovery, vol. 2, no. 2, pp. 121–167, Jun. 1998.
[51] F. Melgani and L. Bruzzone, “Classification of hyperspectral remote sensing images with support vector machines,” IEEE Trans. Geosci. Remote Sens., vol. 42, no. 8, pp. 1778–1790, Aug. 2004.
[52] F. Provost and T. Fawcett, “Analysis and visualization of classifier performance: Comparison under imprecise class and cost distributions,” in Proc. KDD, Newport Beach, CA, 1997, pp. 43–48.
[53] H. Guo and H. L. Viktor, “Learning from imbalanced data sets with boosting and data generation: The DataBoost-IM approach,” SIGKDD Explor. Newsl., vol. 6, no. 1, pp. 30–39, Jun. 2004.
[54] G. M. Foody, “Hard and soft classifications by a neural network with a non-exhaustively defined set of classes,” Int. J. Remote Sens., vol. 23, no. 18, pp. 3853–3864, Sep. 2002.
[55] D. G. Leckie, S. Tinis, T. Nelson, C. Burnett, F. A. Gougeon, E. Cloney, and D. Paradine, “Issues in species classification of trees in old growth conifer stands,” Can. J. Remote Sens., vol. 31, no. 2, pp. 175–190, Apr. 2005.
[56] N. Gorretta, J. M. Roger, G. Rabatel, V. Bellon-Maurel, C. Fiorio, and C. Lelong, “Hyperspectral image segmentation: The butterfly approach,” in Proc. 1st WHISPERS, 2009, pp. 1–4.
[57] C. M. Christoudias, B. Georgescu, and P. Meer, “Synergism in low level vision,” in Proc. 16th Int. Conf. Pattern Recog., 2002, vol. 4, pp. 150–155.
[58] T. Kailath, “The divergence and Bhattacharyya distance measures in signal selection,” IEEE Trans. Commun. Technol., vol. 15, no. 1, pp. 52–60, Feb. 1967.
[59] J. A. Richards and X. Jia, Remote Sensing Digital Image Analysis, vol. 46. New York: Springer-Verlag, 2006.
[60] L. Bruzzone, C. Mingmin, and M. Marconcini, “A novel transductive SVM for semisupervised classification of remote-sensing images,” IEEE Trans. Geosci. Remote Sens., vol. 44, no. 11, pp. 3363–3373, Nov. 2006.
[61] B. M. Shahshahani and D. A. Landgrebe, “The effect of unlabeled samples in reducing the small sample size problem and mitigating the Hughes phenomenon,” IEEE Trans. Geosci. Remote Sens., vol. 32, no. 5, pp. 1087–1095, Sep. 1994.
[62] L. Bruzzone and M. Marconcini, “Toward the automatic updating of land-cover maps by a domain-adaptation SVM classifier and a circular validation strategy,” IEEE Trans. Geosci. Remote Sens., vol. 47, no. 4, pp. 1108–1122, Apr. 2009.
[63] M. Pal and P. M. Mather, “Support vector machines for classification in remote sensing,” Int. J. Remote Sens., vol. 26, no. 5, pp. 1007–1011, Mar. 10, 2005.
[64] B. Dixon and N. Candade, “Multispectral landuse classification using neural networks and support vector machines: One or the other, or both?” Int. J. Remote Sens., vol. 29, no. 4, pp. 1185–1206, 2008.
[65] A. Ferraz, F. Bretar, S. Jacquemoud, G. Gonçalves, L. Pereira, M. Tomé, and P. Soares, “3-D mapping of a multi-layered Mediterranean forest using ALS data,” Remote Sens. Environ., vol. 121, pp. 210–223, Jun. 2012.
[66] G. P. Asner, D. E. Knapp, T. Kennedy-Bowdoin, M. O. Jones, R. E. Martin, J.
Boardman, and R. F. Hughes, “Invasive species detection in Hawaiian rainforests using airborne imaging spectroscopy and LiDAR,” Remote Sens. Environ., vol. 112, no. 5, pp. 1942–1955, May 15, 2008.

Jean-Baptiste Féret received the M.Sc. degree in AgroTIC from the International Center for Higher Education in Agricultural Sciences, Montpellier, France, in 2005, and the Ph.D. degree in physics fundamentals from the Université Pierre et Marie Curie (Paris 6), Paris, France, in 2009. Currently, he is a Postdoctoral Fellow at the Department of Global Ecology, Carnegie Institution for Science, Stanford, CA. His scientific research focuses on hyperspectral remote sensing of vegetation, including assessment of biophysical properties using radiative transfer modeling and statistical methods, species identification, and image classification using high spectral and spatial resolution imagery.

Gregory P. Asner received the B.S. degree in civil and environmental engineering, the M.A. degree in biogeography, and the Ph.D. degree in environmental, organismic, and population biology from the University of Colorado, Boulder, in 1991, 1995, and 1997, respectively. Currently, he is a Faculty Member in the Department of Global Ecology, Carnegie Institution for Science, Stanford, CA. He also holds a faculty position in the Department of Environmental Earth System Science, Stanford University, Stanford. His scientific research centers on how human activities alter the composition and functioning of ecosystems at regional scales. He combines field work, airborne and satellite mapping, and computer simulation modeling to understand the response of ecosystems to land use and climate change.