Land Cover Classification from MODIS Satellite Data Using Probabilistically Optimal Ensemble of Artificial Neural Networks

Kenneth J. Mackin1, Eiji Nunohiro1, Masanori Ohshiro2, and Kazuko Yamasaki2

1 Department of Information Systems, Tokyo University of Information Sciences, 1200-2 Yatoh-cho, Wakaba-ku, Chiba, Japan
{mackin, nunohiro}@rsch.tuis.ac.jp
2 Department of Environmental Information, Tokyo University of Information Sciences, 1200-2 Yatoh-cho, Wakaba-ku, Chiba, Japan
{ohshiro, yamasaki}@rsch.tuis.ac.jp
Abstract. Terra and Aqua, two satellites launched by the NASA-centered international Earth Observing System project, carry MODIS (Moderate Resolution Imaging Spectroradiometer) sensors. Moderate resolution remote sensing allows land surface type and extent to be quantified, which can be used to monitor changes in land cover and land use over extended periods of time. In this paper, we propose applying a probabilistically optimal ensemble technique, based on fault masking among individual classifiers in N-version programming. We create an optimal ensemble of artificial neural networks and use the majority voting result to predict land surface cover from MODIS data. We show that an optimal ensemble of neural networks reduces the land cover classification error rate on untrained data from 0.19 for a single network to 0.087.
1 Introduction

With the increased interest in monitoring global ecological changes, the demand for satellite remote sensing has increased. The NASA-centered international Earth Observing System project has launched many satellites to monitor the earth for scientific purposes. Two such satellites, Terra and Aqua, carry MODIS (Moderate Resolution Imaging Spectroradiometer) sensors. Moderate resolution remote sensing allows land surface type and extent to be quantified, which can be used to monitor changes in land cover and land use over extended periods of time.

Past research on intelligent classification of remote sensor data using neural networks [1][2][3] has given promising results. Kushardono et al. [3] compared different neural network structures in order to use data from several different bands as input for a target classification cell. In this research, we investigate the effect of applying a probabilistically optimal ensemble of neural networks to the classification of land cover type from MODIS satellite remote sensing data.

Neural networks, as with other training based classifiers, inherently carry a risk that, when classifying an untrained data set, the classification error rate may be much worse than the training result. In order to overcome this risk, we applied a probabilistically optimal ensemble technique proposed by Imamura et al. [4] to N-version programming of neural networks. Our purpose is to examine the validity of using a training based classifier for land cover type, compared against the results of previous statistical methods. To test the validity of the method, we applied it to MODIS data to classify land cover for Chiba prefecture, Japan.

B. Gabrys, R.J. Howlett, and L.C. Jain (Eds.): KES 2006, Part III, LNAI 4253, pp. 820 – 826, 2006. © Springer-Verlag Berlin Heidelberg 2006
2 MODIS Data

With the increased interest in monitoring global ecological changes, the demand for satellite remote sensing has increased. The NASA-centered international Earth Observing System project has launched many satellites to monitor the earth for scientific purposes, including Terra and Aqua. A key instrument aboard both satellites is MODIS (Moderate Resolution Imaging Spectroradiometer). Terra's orbit around the Earth is timed so that it passes from north to south across the equator in the morning, while Aqua passes south to north over the equator in the afternoon. Together, Terra MODIS and Aqua MODIS view the entire Earth's surface every 1 to 2 days. MODIS captures data in 36 spectral bands, or groups of wavelengths. Moderate resolution remote sensing allows land surface type and extent to be quantified, which can be used to monitor changes in land cover and land use over extended periods of time. These data are used to monitor and understand global dynamics and processes occurring on the land, in the oceans, and in the lower atmosphere.
Fig. 1. Terra and Aqua satellites (c)NASA
Monitoring of land cover and land use is an important element of global monitoring. Moderate resolution remote sensing enables the quantifying of land surface characteristics such as land cover type and extent, snow cover extent, surface temperature, leaf area index, and fire occurrence. Satellite measurements of leaf area, leaf duration and net primary productivity provide important inputs to capture or model ecosystem processes. High quality, consistent and well-calibrated satellite measurements are needed to detect and monitor changes and trends in these variables.

For this paper, we used MODIS data collected at Tokyo University of Information Sciences, Japan. Tokyo University of Information Sciences receives satellite MODIS data over eastern Asia and provides this data for open research use, as part of the research output of the Japanese government funded Frontier project. Table 1 describes the 36 spectral bands of MODIS sensor data.
Table 1. MODIS sensor band specifications

| Primary Use | Band | Bandwidth | Spectral Radiance | Spatial Resolution |
|---|---|---|---|---|
| Land/Cloud/Aerosols Boundaries | 1 | 620 - 670 nm | 21.8 | 250 m |
| | 2 | 841 - 876 nm | 24.7 | 250 m |
| Land/Cloud/Aerosols Properties | 3 | 459 - 479 nm | 35.3 | 500 m |
| | 4 | 545 - 565 nm | 29 | 500 m |
| | 5 | 1230 - 1250 nm | 5.4 | 500 m |
| | 6 | 1628 - 1652 nm | 7.3 | 500 m |
| | 7 | 2105 - 2155 nm | 1 | 500 m |
| Ocean Color/Phytoplankton/Biogeochemistry | 8 | 405 - 420 nm | 44.9 | 1000 m |
| | 9 | 438 - 448 nm | 41.9 | 1000 m |
| | 10 | 483 - 493 nm | 32.1 | 1000 m |
| | 11 | 526 - 536 nm | 27.9 | 1000 m |
| | 12 | 546 - 556 nm | 21 | 1000 m |
| | 13 | 662 - 672 nm | 9.5 | 1000 m |
| | 14 | 673 - 683 nm | 8.7 | 1000 m |
| | 15 | 743 - 753 nm | 10.2 | 1000 m |
| | 16 | 862 - 877 nm | 6.2 | 1000 m |
| Atmospheric Water Vapor | 17 | 890 - 920 nm | 10 | 1000 m |
| | 18 | 931 - 941 nm | 3.6 | 1000 m |
| | 19 | 915 - 965 nm | 15 | 1000 m |
| Surface/Cloud Temperature | 20 | 3.660 - 3.840 µm | 0.45(300K) | 1000 m |
| | 21 | 3.929 - 3.989 µm | 2.38(335K) | 1000 m |
| | 22 | 3.929 - 3.989 µm | 0.67(300K) | 1000 m |
| | 23 | 4.020 - 4.080 µm | 0.79(300K) | 1000 m |
| Atmospheric Temperature | 24 | 4.433 - 4.498 µm | 0.17(250K) | 1000 m |
| | 25 | 4.482 - 4.549 µm | 0.59(275K) | 1000 m |
| Cirrus Clouds/Water Vapor | 26 | 1.360 - 1.390 µm | 6 | 1000 m |
| | 27 | 6.535 - 6.895 µm | 1.16(240K) | 1000 m |
| | 28 | 7.175 - 7.475 µm | 2.18(250K) | 1000 m |
| Cloud Properties | 29 | 8.400 - 8.700 µm | 9.58(300K) | 1000 m |
| Ozone | 30 | 9.580 - 9.880 µm | 3.69(250K) | 1000 m |
| Surface/Cloud Temperature | 31 | 10.780 - 11.280 µm | 9.55(300K) | 1000 m |
| | 32 | 11.770 - 12.270 µm | 8.94(300K) | 1000 m |
| Cloud Top Altitude | 33 | 13.185 - 13.485 µm | 4.52(260K) | 1000 m |
| | 34 | 13.485 - 13.785 µm | 3.76(250K) | 1000 m |
| | 35 | 13.785 - 14.085 µm | 3.11(240K) | 1000 m |
| | 36 | 14.085 - 14.385 µm | 2.08(220K) | 1000 m |
3 Land Cover Classification with Neural Networks

Artificial neural networks (ANN) are characterized by their "black box" approach to learning and classifying complex data patterns. In this paper, we apply N-version programming of neural networks to construct the land cover classifier. We first describe the basic artificial neural network for land cover classification.

We use a 3-layer artificial neural network (1 input layer, 1 hidden layer, 1 output layer) as the basic training classifier, which learns the complex relationship between MODIS sensor data and land cover type. Each neuron uses a sigmoid function as its synapse function, and the network is trained on the MODIS sensor data by back propagation (BP). The number of hidden neurons was decided from the results of preliminary experiments. For the network training we used the database of collected MODIS sensor data, and applied BP training based on the difference between the classified land cover and the land-truth data provided by the Japanese Ministry of the Environment.

As for the neural network input, we considered the possibility that a large number of input nodes enlarges the problem domain and complicates the classification, with an adverse effect on the network training efficiency. Under this assumption, we decided to minimize the number of input nodes in order to first achieve a workable learning curve and classification accuracy. It has been previously shown that bands 1 and 2 (visible red and near infra-red) can be used to classify land cover type, and bands 1 and 2 also have the best spatial resolution (250 m) among the MODIS bands. For these reasons, we use only bands 1 and 2 as input to the classifiers.

For output classes, we use the same 5 major classifications used by Kushardono et al. [3]. The 5 major classifications are paddy, trees, urban, water, and other.
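The 3-layer structure described above can be written compactly as follows. This is only a sketch under assumed notation that does not appear in the paper: x_1 and x_2 stand for the band 1 and band 2 readings, w_{ji} and b_j for the hidden-layer weights and offsets, v_j for the output weights, H for the number of hidden neurons, and y for the network output.

```latex
\sigma(z) = \frac{1}{1 + e^{-z}}, \qquad
h_j = \sigma\!\left(w_{j1} x_1 + w_{j2} x_2 + b_j\right), \quad j = 1, \dots, H, \qquad
y = \sigma\!\left(\sum_{j=1}^{H} v_j h_j\right)
```

How the output y is mapped onto the 5 classes is not specified in this section, so the sketch stops at the raw network output.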
Fig. 2. Example spectral readings for different land cover types
4 Training of Single Neural Network

We trained the neural network described above using BP and evaluated the classification accuracy. A standard sigmoid function was used as the neuron's base synapse function. The network consisted of 3 input neurons (band 1 input, band 2 input, and 1 fixed input), 6 hidden layer neurons, and 1 output neuron. For the training data, we used 4 MODIS data images of the same location (Chiba prefecture), 1 image for each season of spring, summer, autumn, and winter. For the untrained data used to plot the training curve of network accuracy, we similarly used 4 MODIS data images of the same location, 1 image for each season.
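The training setup above might be sketched as follows. This is a minimal illustration, not the authors' implementation: the squared-error loss, per-sample weight updates, learning rate, epoch count, initial weight range, and the toy samples (band values scaled to [0, 1] with a binary target) are all assumptions introduced here. Only the 3-6-1 layout, the fixed input, and the sigmoid come from the text.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_network(n_in=3, n_hidden=6, seed=0):
    """3 inputs (band 1, band 2, fixed input), 6 hidden neurons, 1 output."""
    rng = random.Random(seed)  # a different seed gives different initial weights
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    return w_hidden, w_out


def forward(net, band1, band2):
    w_hidden, w_out = net
    x = [band1, band2, 1.0]  # the third input is the fixed input
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    y = sigmoid(sum(w * hj for w, hj in zip(w_out, h)))
    return x, h, y


def train_step(net, band1, band2, target, lr=0.5):
    """One back-propagation update for one sample (assumed squared-error loss)."""
    w_hidden, w_out = net
    x, h, y = forward(net, band1, band2)
    delta_out = (y - target) * y * (1.0 - y)
    for j, hj in enumerate(h):
        # Hidden delta is computed with the output weight before it is updated.
        delta_h = delta_out * w_out[j] * hj * (1.0 - hj)
        w_out[j] -= lr * delta_out * hj
        for i, xi in enumerate(x):
            w_hidden[j][i] -= lr * delta_h * xi
    return 0.5 * (y - target) ** 2


def train(net, samples, epochs=2000, lr=0.5):
    """Repeat per-sample updates; returns the summed loss of the last epoch."""
    loss = 0.0
    for _ in range(epochs):
        loss = sum(train_step(net, b1, b2, t, lr) for b1, b2, t in samples)
    return loss


# Hypothetical toy samples: (band 1, band 2, target). Not real MODIS values.
TOY_SAMPLES = [(0.1, 0.1, 1.0), (0.2, 0.1, 1.0), (0.1, 0.9, 0.0), (0.2, 0.8, 0.0)]
```

Calling `init_network` with different `seed` values gives networks with different initial weights, which is one way to obtain the several independently initialized members an ensemble needs.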
Fig. 3. Land cover data for Chiba prefecture, Japan
5 Probabilistically Optimal Ensemble of Neural Networks

Neural networks, as with other training based classifiers, inherently carry a risk that, when classifying an untrained data set, the classification error rate may be much larger than the training result. In order to overcome this risk, we applied a probabilistically optimal ensemble technique proposed by Imamura et al. [4] to N-version programming of neural networks.
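The voting step that N-version programming relies on can be illustrated with a short sketch. The function name `majority_vote` and the tie handling are assumptions made here for illustration; the paper only states that the majority voting result of the ensemble is used.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label predicted by the most ensemble members.

    Ties fall back to the label seen first, which is an arbitrary
    choice made for this sketch.
    """
    return Counter(labels).most_common(1)[0][0]
```

For example, if three networks predict "paddy", "urban", and "paddy" for the same cell, the vote returns "paddy", so a single faulty member is masked by the other two.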
Fault masking in N-version programming assumes that the individual members give completely independent results. If certain members give similar output, then correct fault masking will not occur. In a probabilistically optimal ensemble, the members of the ensemble are chosen so that they are as independent of each other as possible. This is realized by selecting members so that the measured error rate of the ensemble comes closest to the expected error rate of the ensemble. If the members are truly independent of each other, then proper fault masking should bring the measured error rate very close to the expected error rate. The expected failure rate f of the probabilistically optimal ensemble can be calculated by the following equation [4]
$$ f = \sum_{k=m}^{n} \binom{n}{k} (1-p)^{n-k} p^{k} \qquad (1) $$
where p is the failure rate of each individual, n is the size of the ensemble, and m is the minimum number of faulty outputs for the ensemble to fail. For simplicity, we assume the same failure rate p for all individuals. In the case where the ensemble has 3 members, uses majority vote (2 votes) for its output, and p = 0.19, then f = 0.086.

For our research we trained 6 artificial neural networks using different initial weights and the same training set. Using the same training set, we compared the measured ensemble failure rate for ensemble size 3, for all combinations of neural networks. Table 2 shows the measured ensemble failure rates. The ensemble whose measured failure rate (0.085) was closest to the expected rate f = 0.086 was selected as the optimal ensemble. The selected optimal ensemble was then evaluated using untrained data. The resulting ensemble error rate was f = 0.087, indeed very close to the training ensemble error rate, and much lower than the error rate of 0.19 for the single neural network.

Table 2. Measured ensemble failure rate
| ANN no. | Ensemble failure rate |
|---|---|
| 0,1,2 | 0.11 |
| 0,1,3 | 0.115 |
| 0,1,4 | 0.095 |
| 0,1,5 | 0.085 |
| 0,2,3 | 0.115 |
| 0,2,4 | 0.12 |
| 0,2,5 | 0.1 |
| 0,3,4 | 0.13 |
| 0,3,5 | 0.12 |
| 1,2,3 | 0.13 |
| 1,2,4 | 0.135 |
| 1,2,5 | 0.115 |
| 1,3,4 | 0.14 |
| 1,3,5 | 0.13 |
| 1,4,5 | 0.115 |
| 2,3,4 | 0.145 |
| 2,3,5 | 0.135 |
| 2,4,5 | 0.125 |
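The selection procedure can be sketched as below. All function names are hypothetical, and `measured_failure_rate` makes the simplifying assumption that the ensemble fails on a sample whenever a majority of the chosen members are wrong, mirroring the threshold m in equation (1). The `TABLE2` values are copied from Table 2.

```python
from math import comb


def expected_failure_rate(p, n, m):
    """Equation (1): probability that at least m of n independent members fail."""
    return sum(comb(n, k) * (1 - p) ** (n - k) * p ** k for k in range(m, n + 1))


def measured_failure_rate(member_errors, members):
    """Fraction of samples on which a majority of the chosen members are wrong.

    member_errors[i][s] is True when network i misclassifies sample s.
    """
    n_samples = len(member_errors[0])
    needed = len(members) // 2 + 1
    failures = sum(
        1 for s in range(n_samples)
        if sum(member_errors[i][s] for i in members) >= needed
    )
    return failures / n_samples


def closest_ensemble(measured, target):
    """Pick the combination whose measured rate is nearest the target rate."""
    return min(measured, key=lambda combo: abs(measured[combo] - target))


# Measured ensemble failure rates as listed in Table 2.
TABLE2 = {
    (0, 1, 2): 0.11, (0, 1, 3): 0.115, (0, 1, 4): 0.095, (0, 1, 5): 0.085,
    (0, 2, 3): 0.115, (0, 2, 4): 0.12, (0, 2, 5): 0.1, (0, 3, 4): 0.13,
    (0, 3, 5): 0.12, (1, 2, 3): 0.13, (1, 2, 4): 0.135, (1, 2, 5): 0.115,
    (1, 3, 4): 0.14, (1, 3, 5): 0.13, (1, 4, 5): 0.115, (2, 3, 4): 0.145,
    (2, 3, 5): 0.135, (2, 4, 5): 0.125,
}
```

With the target of 0.086 quoted in the text, `closest_ensemble(TABLE2, 0.086)` selects networks 0, 1 and 5, matching the ensemble reported above. Note that evaluating equation (1) as written, `expected_failure_rate(0.19, 3, 2)`, gives roughly 0.0946 rather than 0.086; a value near 0.086 corresponds to p of about 0.18, so the quoted figures may reflect rounding of p. For that reason the sketch takes the target rate as a parameter instead of fixing it.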
6 Conclusion

In this research we applied a probabilistically optimal ensemble technique [4] to create an N-version programming classifier system using 3-layer artificial neural networks to classify land cover from MODIS satellite data. We confirmed that using an optimal ensemble of neural networks greatly reduces the classification error rate, from 0.19 for a single neural network to 0.087 for the ensemble on untrained data. For future work, we will consider methods to improve the classification accuracy of the individual neural network, including increasing the types of sensor input data, reevaluating the neural network structure, combining fuzzy rules to treat input data, and examining the effect of using different base synapse functions for neurons.
References

1. Y. Shkvarko, J. Montiel, L. Rizo, J. Salas, "Neural Network-Based Signal Processing for Enhancing the Multi-Sensor Remote Sensing Imagery", 14th International Conference on Electronics, Communications and Computers, pp. 168-172, 2004
2. F. Roli, S.B. Serpico, L. Bruzzone, "Classification of Multisensor Remote-Sensing Images by Multiple Structured Neural Networks", 13th International Conference on Pattern Recognition (ICPR'96), Volume 4, pp. 180-184, 1996
3. D. Kushardono, K. Fukue, H. Shimoda, T. Sakata, "A Study on Neural Network Landcover Classification Models with the aid of Co-occurence Matrix for Multiband Images", Journal of The Remote Sensing Society of Japan, Vol. 16, No. 1, pp. 36-49, 1996 (in Japanese)
4. K. Imamura, K. Smith, "A Probabilistically Optimal Ensemble Technique for Training Based Classifiers", Proceedings of Joint 2nd International Conference on Soft Computing and Intelligent Systems and 5th International Symposium on Advanced Intelligent Systems, Japan, 2004