Neurocomputing 51 (2003) 387 – 400 www.elsevier.com/locate/neucom
Application of evolutionary neural network method in predicting pollutant levels in downtown area of Hong Kong

W.Z. Lu^a,*, H.Y. Fan^b,1, S.M. Lo^a

a Department of Building & Construction, City University of Hong Kong, Kowloon Tong, Kowloon, HKSAR, Hong Kong
b SER Turbomachinery Research Centre, School of Energy & Power Engineering, Xi'an Jiaotong University, Xi'an, 710049, People's Republic of China

Received 6 June 2001; accepted 8 May 2002
Abstract

Air pollution has emerged as an imminent issue in metropolitan cities like Hong Kong and has attracted much attention in recent years. Predicting pollutant levels and their trends is an important topic in environmental science today. For such prediction tasks, the use of neural networks (NNs), in particular the multi-layer perceptron, is regarded as a cost-effective technique superior to traditional statistical methods. However, training the multi-layer perceptron, normally performed with the back-propagation (BP) algorithm or other gradient algorithms, still suffers from certain drawbacks, e.g., very slow convergence and a tendency to get stuck in local minima. In this paper, a newly developed method, the particle swarm optimization (PSO) model, is adopted to train the perceptron and to predict pollutant levels. As a result, a new neural network model, a PSO-based approach, is established. The approach is shown to be feasible and effective by applying it to real air-quality problems and by comparing it with the simple BP algorithm.
(c) 2002 Elsevier Science B.V. All rights reserved.

Keywords: Environmental modelling; Neural networks; Particle swarm optimization; Pollutant
* Corresponding author. Tel.: +852-2784-4316; fax: +852-2788-7612. E-mail address: [email protected] (W.Z. Lu).
1 Visiting scholar of Building and Construction Department, City University of Hong Kong.
Nomenclature

c1, c2       positive constants
f_sk         predicted output of the network
f(Wi)        fitness value of the ith particle
Kj           row size of the matrix [.]i[j]
Lj           column size of the matrix [.]i[j]
m            number of output neurons
nS           number of training set samples
Pi[j]        best previous position of particle i
Pg[j]        best particle among all particles in the population
t_sk         expected output of the network
Vi[j]        rate of position change (velocity) of particle i
Wi[j]        connecting weight matrix between two successive layers of the network
φ1, φ2       random numbers in the range [0, 1]
i, j, k, l   index numbers of matrices during network calculation
1. Introduction

Hong Kong is recognized as one of the most developed metropolises in the Asian region and has probably the highest population density in the world (approximately 6000 persons/km²). With continuous economic development and population growth, a series of severe environmental problems has attracted more attention than ever before, e.g., air pollution, noise pollution, shortage of land resources, and waste and sewage disposal. Among these, air pollution has a direct effect on human health through exposure to pollutants at high concentrations in the ambient air. The main sources of air pollution in Hong Kong are vehicle emissions, especially from diesel vehicles. Diesel emissions contain three major pollutants: respirable suspended particulate (RSP), nitrogen oxides, and hydrocarbons. In Hong Kong, almost all commercial vehicles, such as public buses, goods vehicles, and taxis, run on diesel fuel and emit high levels of RSP and other pollutants. Statistically, diesel vehicles contribute 98% of all emitted particulates. With economic growth, the rapid development of road networks, and the increasing need for individual mobility, traffic flow will increase continuously. Air pollution due to vehicle exhausts will be the most severe social problem in Hong Kong within the next few years [8,9].

A variety of approaches has been used to model ambient pollutant levels. The most conventional method is to use the computational fluid dynamics (CFD) approach to
simulate the airflow pattern and pollutant concentration by solving a highly coupled, non-linear, partial differential equation set. Such a method demands a huge computing cost, which sometimes causes difficulties in computational convergence, especially for large spaces; experimental validation is even more expensive and difficult to achieve owing to scaling inconsistencies. In recent years, the use of neural networks, in particular multi-layer perceptrons, which can be trained to approximate virtually any smooth, measurable function, has shown potential and become popular in practice. Unlike other statistical techniques, the multi-layer perceptron makes no prior assumptions concerning the data distribution. It can model highly non-linear functions and can be trained to generalize/forecast accurately when presented with new, unseen data. These features make the multi-layer perceptron very attractive for use in environmental science, e.g., for modelling pollutant trends in the ambient air. The neural network method based on multi-layer perceptrons has been applied to air-quality prediction in recent years, and some promising results have been reported [2,3,6,7,13-17].

Like all neural network models, the multi-layer perceptron must be trained with sample solutions, i.e., it acquires its prediction ability from a training set. For this training, the back-propagation (BP) algorithm is the most commonly used method and has been adopted by several researchers in their studies [7,5,17]. BP is a gradient-based method, and hence some inherent problems (or difficulties) are frequently encountered in its use, e.g., very slow convergence during training and a tendency to get stuck in local minima. Several techniques have been introduced in an attempt to resolve these drawbacks, but, to date, all of them remain far from satisfactory [6].

The particle swarm optimization (PSO) algorithm, developed by Eberhart and Kennedy in recent years [12], is a method for optimizing hard numerical functions based on a metaphor of human social interaction [4,10,11]. Originally developed as a tool for simulating social behaviour, the PSO algorithm has been accepted as a computational intelligence technique closely related to evolutionary algorithms [1,5,13]. In this study, PSO is adopted to train the multi-layer perceptrons and to predict air-quality parameters. As a result, a PSO-based neural network approach is developed. The approach is validated with four practical cases of predicting air-quality parameters in the downtown area of Hong Kong, based on the original pollutant data supplied by the Hong Kong Environmental Protection Department (HKEPD).

2. Mathematical basis of multi-layer perceptrons

A multi-layer perceptron consists of a system of simple interconnected neurons, or nodes, as illustrated in Fig. 1. It is a model representing a non-linear mapping between input and output vectors. The nodes are connected by weights and output signals, each of which is a function of the sum of the inputs to the node modified by a simple non-linear transfer, or activation, function. It is the superposition of many simple non-linear transfer functions that enables the multi-layer perceptron to approximate extremely non-linear functions. The output of a node is scaled by the connecting weight and fed forward to be an input to the nodes in the next layer of the network.
Fig. 1. A typical multi-layer perceptron (input layer, hidden layer 1, hidden layer 2, output layer). X = [x1, x2, ..., xl] = input vector and Y = [y1, y2, ..., ym] = output vector.
This implies a direction of information processing; hence the multi-layer perceptron is also called a feed-forward neural network. The architecture of a multi-layer perceptron is variable but, in general, consists of several layers of neurons. The input layer plays no computational role but merely serves to pass the input vector to the network. The terms input and output vectors refer to the inputs and outputs of the multi-layer perceptron and can be represented as single vectors, as shown in Fig. 1. A multi-layer perceptron may have one or more hidden layers and, finally, an output layer. Multi-layer perceptrons are described as fully connected, with each node connected to every node in the next and previous layers. By selecting a suitable set of weights and transfer functions, a multi-layer perceptron can approximate any smooth, measurable function between the input and output vectors [4].

Multi-layer perceptrons have the ability to learn through training. Training requires a set of training data, i.e., a series of input vectors and associated output vectors. During training, the multi-layer perceptron is repeatedly presented with the training data, and the weights in the network are adjusted until the desired input-output mapping occurs. This procedure results in the "encoding" of the properties of the system to be mapped in the different parts of the neural network. If, after training, the multi-layer perceptron is presented with an input vector not belonging to the training pairs, it simulates the system and produces the corresponding output vector. The error between the actual and the predicted function values indicates how successful the training is. In this study, the output mean squared error (MSE) of the neural network is used and defined as

MSE = \frac{1}{n_S} \sum_{s=1}^{n_S} \sum_{k=1}^{m} (t_{sk} - f_{sk})^2,   (1)

where t_sk is the expected output, f_sk is the predicted output, m is the number of output neurons, and nS is the number of training set samples.
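To make the mapping and the error measure of Eq. (1) concrete, a minimal Python/NumPy sketch of a four-layer perceptron forward pass and the MSE fitness is given below. The sigmoid activation, the omission of bias terms, and the array shapes are our own illustrative assumptions; the original study does not specify them.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def forward(x, weights):
        """Forward pass of a fully connected four-layer perceptron (sketch).

        x       : input vector, shape (n_inputs,)
        weights : [W1, W2, W3], the connection weight matrices between
                  successive layers (input -> hidden 1 -> hidden 2 -> output),
                  i.e. W = {W[1], W[2], W[3]} in the paper's notation.
        Bias terms are omitted and a sigmoid transfer function is assumed,
        purely for illustration.
        """
        a = np.asarray(x, dtype=float)
        for W in weights:
            a = sigmoid(W @ a)      # weighted sum followed by the non-linear transfer
        return a                    # network output vector y

    def mse_fitness(weights, X, T):
        """Mean squared error of Eq. (1) over a training set.

        X : input vectors,    shape (n_S, n_inputs)
        T : expected outputs, shape (n_S, m)
        """
        F = np.array([forward(x, weights) for x in X])   # predicted outputs f_sk
        return np.sum((T - F) ** 2) / X.shape[0]         # (1/n_S) * double sum of squares

Because the source data are normalized to [0, 1] in this study (see Section 4), a sigmoid output layer keeps the predictions in the same range.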
3. PSO algorithm and its adaptation to training the multi-layer perceptrons

Particle swarm optimization is a population-based search method that resembles a flock of flying birds. Particles, also called individuals in evolutionary algorithms (EAs), are candidate solutions to the problem to be solved. In a PSO system, instead of using genetic operators as in EAs, a population of these individuals is "evolved" through generations by cooperation and competition among the individuals themselves. Each particle adjusts its flying according to its own flying experience and its companions' flying experiences. Existing versions of PSO usually take vectors as the representation of particles, since most optimization problems lend themselves to such variable representations. In this study, PSO is adopted to train the multi-layer perceptrons, in which matrix learning problems are involved. To date, several versions of PSO are available. Among these, the GBEST model, originally proposed by Eberhart and Kennedy (1995) and the most frequently used one, is capable of and suitable for matrix particle representations.

The PSO process for training the multi-layer perceptrons can be described as follows. It is assumed that four-layered perceptrons are chosen for all application cases in this study. Without loss of generality, let W[1] denote the connection weight matrix between the input layer and the first hidden layer, W[2] the one between the two hidden layers, and W[3] the one between the second hidden layer and the output layer, for all perceptron structures established in the study; further denote W = {W[1], W[2], W[3]}. When PSO is used to train the multi-layer perceptrons, the ith particle is represented as Wi = {Wi[1], Wi[2], Wi[3]}. The best previous position (i.e., the position giving the best fitness value) of any particle is recorded and represented as Pi = {Pi[1], Pi[2], Pi[3]}. The index of the best particle among all particles in the population is represented by the symbol g; hence Pg = {Pg[1], Pg[2], Pg[3]} is the best matrix set found in the population. The rate of position change (velocity) of particle i is represented as Vi = {Vi[1], Vi[2], Vi[3]}. Let k and l denote the indices of the matrix row and column, respectively. The particles are manipulated according to the following equations:

V_i^{[j]}(k,l) = V_i^{[j]}(k,l) + c_1 \varphi_1 (P_i^{[j]}(k,l) - W_i^{[j]}(k,l)) + c_2 \varphi_2 (P_g^{[j]}(k,l) - W_i^{[j]}(k,l)),   (2a)

W_i^{[j]} = W_i^{[j]} + V_i^{[j]},   (2b)

where j = 1, 2, 3, k = 1, ..., Kj, l = 1, ..., Lj; Kj and Lj denote the row and column sizes of the matrix [.]i[j] (i.e., Wi[j], Pi[j], or Vi[j]); c1 and c2 are two positive constants; and φ1 and φ2 are two random numbers in the range [0, 1]. The second term on the right-hand side of Eq. (2a) is the "cognition" part, which represents the private thinking of the particle itself. The third term on the right-hand side of Eq. (2a) is the "social" part, which represents the collaboration among the particles. Eq. (2a) calculates the particle's new velocity according to its previous velocity and the distances of its current position from its own best experience (position) and from the group's best experience; the particle then flies toward a new position according to Eq. (2b).
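A minimal Python/NumPy sketch of the update rules (2a) and (2b) for one matrix-valued particle follows. The function and variable names are our own, φ1 and φ2 are drawn independently for every matrix entry (consistent with the element-wise pseudocode below), and c1, c2 are left as parameters since the paper does not report the values used.

    import numpy as np

    def pso_step(W_i, V_i, P_i, P_g, c1, c2, rng=None):
        """One PSO move for a particle consisting of several weight matrices.

        W_i, V_i, P_i, P_g : lists of matrices [.[1], .[2], .[3]] holding the
        current position (weights), velocity, personal best and global best.
        Implements Eqs. (2a) and (2b); phi1 and phi2 are uniform random
        numbers in [0, 1], redrawn for every matrix entry.
        """
        if rng is None:
            rng = np.random.default_rng()
        for j in range(len(W_i)):
            phi1 = rng.random(W_i[j].shape)
            phi2 = rng.random(W_i[j].shape)
            V_i[j] = (V_i[j]
                      + c1 * phi1 * (P_i[j] - W_i[j])    # "cognition" part
                      + c2 * phi2 * (P_g[j] - W_i[j]))   # "social" part
            W_i[j] = W_i[j] + V_i[j]                     # Eq. (2b): fly to the new position
        return W_i, V_i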
The performance of each particle is measured according to a pre-defined fitness function, which is related to the problem to be solved. In this study, the fitness function takes the form of Eq. (1), i.e., for the ith particle Wi, its fitness value is calculated by

f(W_i) = MSE(W_i).   (3)

The pseudocode for the particle swarm algorithm described above is outlined as follows:

    for j = 1 to 3
      for i = 1 to population size
        for l = 1 to Lj
          for k = 1 to Kj
            if f(Wi) < f(Pi) then Pi[j](k, l) = Wi[j](k, l)
            Vi[j](k, l) = Vi[j](k, l) + φ1 (Pi[j](k, l) - Wi[j](k, l)) + φ2 (Pg[j](k, l) - Wi[j](k, l))
            if Vi[j](k, l) > Vmax then Vi[j](k, l) = Vmax
            else if Vi[j](k, l) < Vmin then Vi[j](k, l) = Vmin
            Wi[j](k, l) = Wi[j](k, l) + Vi[j](k, l)
          next k
        next l
      next i
    next j

where Vmax and Vmin are parameters limiting the velocity values and can be regarded as system parameters of PSO.

4. Results and discussion

The original air-quality data used in this study come from the Hong Kong Environmental Protection Department (HKEPD), which operates a total of 14 air-pollutant gaseous monitoring stations distributed over the whole territory of Hong Kong. In this study, only the 1999 data from the Causeway Bay Roadside Gaseous Monitoring Station are available and are chosen for the numerical simulation. The proposed PSO-based multi-layer perceptrons are examined in comparison with BP-based ones in this section. Four test cases are considered.

Case 1 is a 'data fitting' problem. This kind of problem is classical but not easily solved, and it forms a direct test of the function-approximating ability of neural networks. The concentration data of carbon monoxide (CO), measured hourly in two time epochs representing the first day and the first three days of January (Fig. 2a), are chosen as the training sets to train the multi-layer perceptrons. The trained perceptrons can then be used to predict CO concentration levels at any moment during similar time epochs. The first perceptron in Fig. 2a is used for the first time epoch, i.e., the first day of January.
Fig. 2. Time epochs of the numerical simulations with the two perceptrons: (a) Case 1: data fitting of CO levels (one-day and three-day epochs, 1st-4th January); (b) Case 2: forecasting of RSP levels (hourly RSP prediction during a one-day period, training on the 1st-3rd and testing on the 4th of January; daily mean RSP prediction over a monthly period, training on January-March and testing on April); (c) Cases 3 and 4: prediction of NOx and NO2 levels during a three-day period (training set 1st-3rd January, test set 4th-6th January).
Its structure takes the form '1-4-4-1', i.e., one neuron in the input layer, four neurons in each of the two hidden layers, and one neuron in the output layer. In this perceptron, the input neuron represents the time, while the output neuron represents the CO concentration. The training set for the perceptron consists of 24 sample pairs, in which the time (hours) is the input and the corresponding CO concentration is the output. The second perceptron shown in Fig. 2a covers the first three days of January and is structured as '1-6-6-1'. Like the first perceptron, it has one input neuron representing the time and one output neuron representing the CO concentration; a total of 72 sample pairs are selected for this perceptron. For the fitting tasks in Case 1, the test sets used to examine the perceptrons have the same sizes as their corresponding training sets.

In Case 2, the multi-layer perceptrons are established to test the forecasting ability for air-quality parameters. It is assumed, in this case, that the concentrations of six pollutants, i.e., carbon monoxide (CO), nitric oxide (NO), nitrogen dioxide (NO2), nitrogen oxides (NOx), sulphur dioxide (SO2), and respirable suspended particulates (RSP), and the temperature are related to each other. This means that the concentration
of one pollutant 1 h later, for example, is determined by the hourly concentrations of the other pollutants, the pollutant itself, and the hourly temperature at the current moment. The RSP concentration is taken as the prediction objective, i.e., the output vector, in this case. The case includes two parts. The first multi-layer perceptron is constructed and trained to forecast the RSP concentration 1 h ahead from the current hourly data (Fig. 2b). The second one is constructed and trained to predict the daily mean RSP concentration based on the daily mean data in the first 3-month period. The perceptrons in both parts are structured as '7-8-8-1'. For the first perceptron, the seven input neurons represent seven parameters, namely the concentrations of CO, NO, NO2, NOx, SO2, RSP, and the temperature at the current hour; the output neuron represents the corresponding RSP level at the next hour. A time epoch of the first four days in January, shown in Fig. 2b, is chosen for this purpose. After eliminating several invalid data, 88 samples are available during that period. From the selected samples, 70 samples in the first three days are specified as the training set, while the remaining 18, representing hourly data for 18 h on the fourth day, are specified as the testing set. For the second perceptron, the seven input neurons represent the current daily mean concentrations of CO, NO, NO2, NOx, SO2, RSP, and the current daily mean indoor temperature; the output neuron represents the corresponding daily mean RSP concentration on the next day. The time epoch of the first four months, i.e., January, February, March and April, is chosen for this purpose (see also Fig. 2b). The data in January, February and March are taken as the training set, and those in April as the testing set. After deleting several invalid data, 85 sample pairs in the first three months are selected to form the training set, and 30 sample pairs in April to form the testing set.

Cases 3 and 4 contain the predictions of hourly NOx and NO2 levels during 72-h periods using both PSO- and BP-based perceptrons. Both perceptrons are first trained with the corresponding selected training sets, i.e., the data in the first three days of January. The trained perceptrons are then used to forecast the hourly concentrations of NOx and NO2 within the next three-day time epochs. Fig. 2c presents the time epochs specified for Cases 3 and 4. The perceptron structures for Cases 3 and 4 are similar to those used in Case 2, but the prediction objectives become NOx for Case 3 and NO2 for Case 4, respectively, with the other parameters regarded as affecting factors, i.e., input parameters. From the selected samples, 72 samples in the first three days are used as the training set and another 72 samples during the next 72 h are used as the testing set for each case.

It should be noted that using the data of the first three days of a month (Cases 3 and 4), or of the first three months of a year (Case 2), to train the perceptrons and then using the trained perceptrons to simulate the data of the next three days or the next month is not the ideal plan. As a testing set, it would be more appropriate to adopt data from the same time periods in different years, e.g., the next year, so that similar seasonal patterns in different years could help the established model achieve accurate predictions.
However, due to the limited data sources, there is no better choice than to take as testing data April of the same year, i.e., the month closest to January, February and March, in Case 2, and the next 72-h periods, i.e., the next three days of the same month, in Cases 3 and 4, as described above.
Fig. 3. Training histories of the perceptrons for CO data fitting (Case 1), mean square error versus fitness evaluating times (x 10000) for the PSO- and BP-based perceptrons: (a) one-day epoch and (b) three-day epoch.
In order to suit the consistency of the multi-layer perceptrons, all source data are first normalized to the range [0.0, 1.0] using xnorm = (x - xmin)/(xmax - xmin), and are returned to their original values after the simulation. The multi-layer perceptrons established above for all cases are trained with both the PSO algorithm and the BP algorithm, respectively, on the selected training sets. In order to establish a fair "start-up" state for the BP-based perceptrons, their training processes always start from the best solution in the initial population used by their PSO-based counterparts. Once training is finished, the perceptrons are examined with the corresponding testing sets. All training and predicting processes are performed with Matlab on a Pentium III PC. The maximum and minimum velocity values are taken as Vmax = -Vmin = 0.2. Training processes terminate after a given number of fitness evaluations. For the BP-based perceptrons, this number directly equals the number of update iterations, while for the PSO-based perceptrons it equals the product of the population size and the number of generations. In order to compare the performances of the PSO- and BP-based perceptrons, the training histories are recorded over the same number of fitness evaluations for both.

The training histories of the multi-layer perceptrons for Case 1 are shown in Fig. 3, and the recovery (i.e., predicting) performances of the trained perceptrons are presented in Fig. 4. From these two figures, it can be seen that the PSO- and BP-based perceptrons have almost the same convergence performance for the carbon monoxide data fitting over the one-day epoch (Fig. 3a). Their fitting abilities are also similar, especially in the range of lower CO concentration values (Fig. 4a). For the data fitting of carbon monoxide over the three-day epoch, however, the PSO-based perceptron shows much better convergence performance as well as fitting ability than the BP-based perceptron. The results produced by the PSO-based perceptron are much closer to the original data than the outputs of the BP-based one, and the predictions of the PSO-based model reflect the real distribution of the original data. In these data fitting tasks, a shorter time epoch means a simpler and smaller data set to be fitted, and hence a simpler fitting task, and vice versa.
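For reference, the min-max scaling described above and its inverse can be sketched as follows; the helper names are our own, and computing xmin and xmax column-wise from the data is an assumption for illustration.

    import numpy as np

    def normalize(data):
        """Scale each column of `data` to [0.0, 1.0] with x_norm = (x - x_min) / (x_max - x_min)."""
        data = np.asarray(data, dtype=float)
        x_min = data.min(axis=0)
        x_max = data.max(axis=0)
        return (data - x_min) / (x_max - x_min), x_min, x_max

    def denormalize(norm, x_min, x_max):
        """Map normalized values back to the original range after the simulation."""
        return norm * (x_max - x_min) + x_min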
Fig. 4. Recovery performances of both perceptrons via CO data fitting (Case 1): (a) one-day epoch and (b) three-day epoch.

Fig. 5. Training histories of the perceptrons for pollutant concentration forecasting (Case 2), mean square error versus fitness evaluation times (x 10000) for the PSO- and BP-based perceptrons: (a) hourly RSP level prediction and (b) daily mean RSP level prediction.
The results shown in Figs. 3 and 4 indicate that the PSO-based perceptron is superior to the BP-based perceptron in tackling the more complex air-pollutant data fitting tasks.

The training histories of both perceptrons for Case 2 are shown in Fig. 5, and Fig. 6 presents the prediction results of the trained perceptrons. The results show that, for Case 2, the PSO-based perceptron still demonstrates a few advantages over the BP-based one. Not many differences in forecasting ability between the two models can be seen for hourly and daily RSP concentration forecasting. For predicting the hourly RSP concentration, the PSO-based perceptron has a better convergence performance than the BP-based one, while for forecasting the daily mean RSP level, the PSO-based perceptron possesses a faster convergence rate in the early period of the training process than the BP-based one does.

Figs. 7 and 8 demonstrate the training histories and the predicting performance for the NO2 level (Case 3) and the NOx level (Case 4) using both perceptrons.
Fig. 6. Forecast of RSP concentration for different time periods (Case 2). (a) Hourly RSP level prediction and (b) daily mean RSP level prediction.
Fig. 7. Forecasting of NO2 level in three-day time epoch (Case 3). (a) Training history for NO2 prediction and (b) hourly NO2 level prediction.
Fig. 8. Forecasting of NOx level in three-day time epoch (Case 4). (a) Training history for NOx prediction and (b) hourly NOx level prediction.
From the comparison of the training histories in Cases 3 and 4, it can be seen that, for the given errors, the PSO-based perceptron again presents a better convergence performance than the BP-based perceptron. With regard to prediction accuracy, the results produced by the PSO-based model are closer to the original data than those of the BP-based one, especially for Case 3. In general, the PSO-based perceptron generates more accurate predictions and performs better than the simple BP-based one.

Figs. 3-8 indicate that the PSO-based perceptrons perform better than the BP-based perceptrons in general. The training of neural network perceptrons is normally a complex, high-dimensional, multi-modal optimization problem. The PSO algorithm is a stochastic, population-based search method; it can find the globally optimal weights with a high probability and a fast convergence rate during the training of a perceptron. The BP algorithm, in contrast, is a local search method; it easily falls into local minima and fails to find the global optimum when used to train a perceptron.
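To summarize how the pieces described in Section 3 fit together, the following self-contained Python/NumPy sketch assembles a GBEST-style PSO training loop for the perceptron weight matrices. The population size, c1, c2, the number of generations and the weight initialization range are illustrative assumptions (the paper reports only the velocity limits Vmax = -Vmin = 0.2), and the personal/global best bookkeeping follows the common GBEST formulation rather than reproducing the element-wise pseudocode of Section 3 literally.

    import numpy as np

    def train_pso(fitness, layer_sizes, pop_size=30, generations=1000,
                  c1=2.0, c2=2.0, v_max=0.2, seed=0):
        """Train MLP weight matrices with a GBEST-style PSO (sketch).

        fitness     : callable mapping a list of weight matrices to an MSE value
                      (lower is better), e.g. Eq. (1) over the training set.
        layer_sizes : e.g. (1, 4, 4, 1) for the '1-4-4-1' perceptron of Case 1.
        All numeric defaults except v_max are illustrative assumptions.
        """
        rng = np.random.default_rng(seed)
        shapes = [(layer_sizes[j + 1], layer_sizes[j]) for j in range(len(layer_sizes) - 1)]

        # Initialise particle positions (weights), velocities and personal bests.
        W = [[rng.uniform(-1.0, 1.0, s) for s in shapes] for _ in range(pop_size)]
        V = [[np.zeros(s) for s in shapes] for _ in range(pop_size)]
        P = [[w.copy() for w in Wi] for Wi in W]          # personal best positions
        P_fit = [fitness(Wi) for Wi in W]                 # personal best fitness values
        g = int(np.argmin(P_fit))                         # index of the global best particle

        for _ in range(generations):
            for i in range(pop_size):
                for j, s in enumerate(shapes):
                    phi1, phi2 = rng.random(s), rng.random(s)
                    V[i][j] = (V[i][j]
                               + c1 * phi1 * (P[i][j] - W[i][j])     # cognition part
                               + c2 * phi2 * (P[g][j] - W[i][j]))    # social part
                    V[i][j] = np.clip(V[i][j], -v_max, v_max)        # velocity limits Vmin/Vmax
                    W[i][j] = W[i][j] + V[i][j]                      # Eq. (2b)
                f_i = fitness(W[i])
                if f_i < P_fit[i]:                                   # update personal best
                    P[i] = [w.copy() for w in W[i]]
                    P_fit[i] = f_i
                    if f_i < P_fit[g]:                               # update global best index
                        g = i
        return P[g], P_fit[g]

Combined with the forward/mse_fitness sketch given after Eq. (1), the fitness argument could be, e.g., lambda Wi: mse_fitness(Wi, X_train, T_train) on the normalized training data.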
5. Conclusion

A PSO-based neural network approach is developed for modelling air-pollutant parameters. The approach uses a novel optimization algorithm, the particle swarm optimization algorithm, to train the multi-layer perceptrons. The feasibility and effectiveness of this new approach are validated and illustrated by four practical cases of modelling air-pollutant levels. The data measured at a roadside gaseous monitoring station by the HKEPD are chosen as the original data to test the performance of the proposed model and to validate the numerical simulations. Case 1 tests the recovery performance of both the PSO- and BP-based neural network models by fitting the pollutant concentration levels during different time epochs. Case 2 predicts the air-pollutant parameters of the next month based on the known pollutant data of previous months using both models, emphasizing a performance comparison of the PSO-based perceptron with the most commonly used BP-based perceptron. Cases 3 and 4 present the training histories and the prediction results for NO2 and NOx levels in three-day epochs, respectively. The numerical results show that, for the selected cases, the PSO-based perceptrons have better training performance, a faster convergence rate, and better predicting ability than the BP-based perceptrons.

It is worth mentioning that the current study is a very preliminary application of the PSO-based neural network approach to air pollution studies. Further work is required to address certain difficulties, e.g., improving the current approach and applying it to more complex cases as more pollutant data are accumulated. Nevertheless, the proposed approach is expected to have the potential to overcome these difficulties after further improvement. It would also be interesting to apply the model to similar problems in other metropolitan cities such as Shanghai, Beijing, and Tokyo, if the air pollution information is available. Much work remains to be carried out.
Acknowledgements

The work described in this paper was supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China [Project No. CityU 1013/02E] and a Strategic Research Grant #7001086(BC) from City University of Hong Kong, HKSAR. The provision of the original data by the Environmental Protection Department, HKSAR, is also appreciated.

References

[1] P.J. Angeline, Evolutionary optimization versus particle swarm optimization: philosophy and performance differences, Evolutionary Programming, Vol. VII, Springer, Berlin, 1998, pp. 601-610.
[2] M. Boznar, M. Lesjak, P. Mlakar, A neural network-based method for short-term predictions of ambient SO2 concentrations in highly polluted industrial areas of complex terrain, Atmos. Environ. B 27 (2) (1993) 221-230.
[3] A.C. Comrie, Comparing neural networks and regression models for ozone forecasting, J. Air Waste Manage. 47 (1997) 653-663.
[4] R. Eberhart, J. Kennedy, A new optimizer using particle swarm theory, Sixth International Symposium on Micro Machine and Human Science, 1995, pp. 39-43.
[5] H.Y. Fan, W.Z. Lu, Z.B. Xu, An empirical comparison of three novel genetic algorithms, Eng. Comput. 17 (8) (2000) 981-1001.
[6] M.W. Gardner, S.R. Dorling, Artificial neural networks (the multi-layer perceptron) - a review of applications in the atmospheric science, Atmos. Environ. 30 (14/15) (1998) 2627-2636.
[7] L. Hadjiiski, P.K. Hopke, Application of artificial neural network to modeling and prediction of ambient ozone concentrations, J. Air Waste Manage. Assoc. 50 (2000) 894-901.
[8] Hong Kong Environmental Protection Department, Air quality in Hong Kong, Air Service Group, 2000.
[9] Hong Kong Environmental Protection Department, Environment Hong Kong - A review, 2000.
[10] K. Hornik, M. Stinchcombe, H. White, Multi-layer feed-forward networks are universal approximators, Neural Networks 2 (1989) 359-366.
[11] J. Kennedy, The particle swarm: social adaptation of knowledge, Proceedings of the 1997 International Conference on Evolutionary Computation, Indianapolis, Indiana, IEEE Service Center, Piscataway, NJ, 1997, pp. 303-308.
[12] J. Kennedy, R.C. Eberhart, Particle swarm optimization, Proceedings of the 1995 IEEE International Conference on Neural Networks, IEEE Service Center, Piscataway, NJ, 1995, pp. 1942-1948.
[13] W.Z. Lu, H.Y. Fan, A.Y.T. Leung, W.J. Wang, S.M. Lo, J.C.K. Wong, Pollution modeling for Hong Kong downtown area using principal component analysis and artificial neural networks, Proceedings of Civil-Comp 2001, Eisenstadt, Vienna, Austria, 18-21 September 2001.
[14] P. Perez, A. Trier, J. Reyes, Prediction of PM2.5 concentrations several hours in advance using neural networks in Santiago, Chile, Atmos. Environ. 34 (2000) 1189-1196.
[15] S.L. Reich, D.R. Gomez, L.E. Dawidowski, Artificial neural network for the identification of unknown air pollution sources, Atmos. Environ. 33 (1999) 3045-3052.
[16] C.M. Roadknight, G.R. Balls, G.E. Mills, D. Palmer Brown, Modelling complex environmental data, IEEE Trans. Neural Networks 8 (4) (1997) 852-861.
[17] J. Yi, R. Prybutok, A neural network model forecasting for prediction of daily maximum ozone concentration in an industrialized urban area, Environ. Pollut. 92 (3) (1996) 349-357.
Dr. Jane Weizhen Lu is an Assistant Professor in the Department of Building and Construction, City University of Hong Kong, Kowloon Tong, Kowloon, HK. Tel.: (852) 2784 4316; Fax: (852) 2788 7612; E-mail: [email protected]. Dr. Lu has been working as an assistant professor at City University of Hong Kong since October 1996. She obtained her BSc and MEng from Xi'an Jiaotong University in 1982 and 1985, and was appointed Assistant Lecturer and then Lecturer at the same university thereafter. From 1990 to 1993, she worked as a research scientist at the National Engineering Laboratory, UK, where she joined a consultancy project on multi-phase flow for UK oil companies. In 1993, she became a Research Officer at De Montfort University, UK, working on the analysis of airflow and aerosol particle distribution in buildings, a project sponsored by the UK Engineering & Physical Sciences Research Council (EPSRC). She finished her Ph.D. project in October 1995, took a Lecturer post in the Department of Building Services Engineering, Hong Kong Polytechnic University, in November 1995, and worked actively in teaching and research, including an investigation of aerosol particle distribution in indoor and outdoor environments for The Consumer Council (HK). She currently teaches Building Services Engineering, Thermal Fluids, Environmental Sciences, etc. She has many years of research experience in air quality, covering both numerical simulation and experimental study. Her main research interests include air quality, air infiltration, HVAC systems, wind effects on high-rise buildings, and the application of computational fluid dynamics and computational intelligence in various engineering disciplines, including building engineering, chemical and environmental engineering, mechanical engineering, power engineering, and wind engineering.
Dr. Fan obtained his BEng and Ph.D. from Xi'an Jiaotong University in 1983 and 2000, respectively. He joined the Yangzi Petroleum and Chemical Corporation of SINOPEC after obtaining his BEng and worked there for ten years, where he was involved in major production projects and was responsible for the technical inspection of production operations. Dr. Fan returned to Xi'an Jiaotong University to carry out a Ph.D. study after gaining ten years of practical experience. His Ph.D. research covered CFD analysis, artificial intelligence computation, and its application in fluid machinery design. He then joined the Department of Building and Construction, City University of Hong Kong, as a researcher. His work at CityU mainly involves soft computing methods and their application in atmospheric and environmental engineering. Dr. Fan is currently a research staff member in the Department of Information Technology, Lappeenranta University of Technology, Finland.
Dr. S.M. Lo is an Associate Professor in the Department of Building and Construction, City University of Hong Kong. He obtained his Ph.D. in Architecture from the University of Hong Kong. Before starting his teaching career, he worked for the Buildings Department of the Hong Kong Government for many years and has extensive experience in building construction and in drafting legislative proposals and codes of practice. He is currently a member of the Contractors' Registration Committee. His main research interests include building design and environment, fire safety engineering, and computer-aided design, and he holds many research grants supported by the Hong Kong Research Grants Council for studying evacuation, fire risk analysis, wayfinding modelling, intelligent understanding of CAD plans, etc.