
JOURNAL OF COMPUTERS, VOL. 6, NO. 7, JULY 2011

Factor Sensitivity Analysis with Neural Network Simulation based on Perturbation System

Runbo Bai
College of Water-Conservancy and Civil Engineering, Shandong Agricultural University, Tai'an, China
Email: [email protected]

Daozhen Zhang
Shandong Water Polytechnic, Rizhao, China
Email: [email protected]

Hailei Jia
College of Civil and Traffic Engineering, Hohai University, Nanjing, China
Email: [email protected]

Abstract—Perturbation systems are often used for factor sensitivity analysis in neural network design. Two major problems, the definition of sensitivity and the input perturbation ratio, are investigated in this study, and four models of sensitivity analysis are considered. Comparison and analysis show that the definition of sensitivity derived from partial derivatives is more rational than the others, and that the optimum range of the input perturbation ratio is [-20%, 20%] for a general case. Additionally, the effect of model quality on the prediction accuracy of the sensitivity is discussed, and their correlation is revealed.

Index Terms—perturbation system, input perturbation ratio, artificial neural network, sensitivity analysis

I. INTRODUCTION

Sensitivity analysis is a fundamental issue in neural network design: it evaluates, through a collection of methods, the degree to which the model output responds to changes in parameter values [1]. Several approaches have been developed for the quantitative measurement of sensitivity, for instance the 'partial derivatives' method [2], the 'weights' method [3], the 'perturbation' method [4] and the 'profile' method [5]. Of these, the perturbation method is widely used; it assesses the sensitivity of a neural network model by observing the effect on the overall error when each input is perturbed. The perturbation is realized by adding noise to a given input. Although this method is widely applied, some points remain in dispute in the literature. First, the cost function used in the sensitivity of a network is inconsistent across studies, which means the definition of the sensitivity is not uniform [4] [6] [7] [8] [9]; the major problem is how to reflect the response of the network output to the perturbation of its inputs. Second, one has to make a subjective choice about how much noise to add, and many studies show that, for the perturbation method, the resulting sensitivity is related to the mode of input perturbation [10] [11] [12] [13]. Therefore, the input perturbation ratio is another important problem deserving study. The goals of this study are twofold: (1) to explore a rational definition of sensitivity, especially for neural network simulation, and (2) to determine the optimum range of the input perturbation ratio. The factors affecting the assessment of the accuracy of the sensitivity are also elaborated.

© 2011 ACADEMY PUBLISHER. doi:10.4304/jcp.6.7.1402-1407

II. DEFINITION OF SENSITIVITY

For the neural network-based perturbation method, the sensitivity of a system is estimated from the change of the output with respect to the input perturbation. The basic idea is that the inputs to the network are shifted slightly and the corresponding change in the output is reported either as a percentage or as an absolute difference [14]. The key is to select a reasonable index to measure the change in the outputs. Various indices are available in present studies. Scardi et al. [4] adopt the mean square error to represent the change of the output, so that the sensitivity of the kth output with respect to the jth input is defined as

s_jk = (D(o_k) / D*) / (Δu_j / u_j*) = (D(o_k) / D*) / δ    (1)

where D(o_k) is the variance of the kth output under the perturbation of the jth input; D* is the ideal variance without input perturbation; Δu_j is a small incremental change in the jth input variable; u_j* is the original value of the jth input variable; and δ is the input perturbation ratio. Choi et al. [6] define the sensitivity to input perturbations as the ratio of the standard deviation of the output errors to the standard deviation of the input errors, as the latter tends to zero. The formula is

s_jk = σ(o_k) / σ(u_j),    σ(u_j) → 0    (2)

where σ(o_k) and σ(u_j) are the standard deviations of the output errors and the input errors, respectively. Jiang et al. [7] use the ratio of the output relative error to the input relative error to represent the sensitivity, expressed as

s_jk = (Δo_k / o*) / (Δu_j / u_j*) = (Δo_k / o*) / δ    (3)

in which Δo_k is the corresponding change in the value of the kth output variable due to Δu_j, and o* is the original value of the kth output variable. Mathematically, for a cause-and-effect system, the sensitivity of a dependent variable to a certain independent variable is given by the partial derivative of the dependent variable with respect to the independent variable. The partial derivatives method in neural network sensitivity analysis originates from exactly this idea. Nevertheless, the partial derivatives method has two major weaknesses: first, it cannot be applied to neural networks with non-differentiable activation functions; second, it does not adequately account for the magnitude of the parameter in the sensitivity assessment [8]. Therefore some researchers use numerical computation instead of an analytical expression to obtain the sensitivity of models. Reddy et al. [9] give a definition of the sensitivity in the form

s_jk = Δo_k / Δu_j    (4)

where Δo_k reflects the network's output error more directly and precisely than the variance does. This definition is the closest to that of the partial derivatives among the definitions above, so it is expected to be the most reasonable.

In the above formulas, Δo_k = Σ_{i=1}^{N} (ŷ_ik − y_ik), where N is the number of training objects, y_ik is the target output for the ith object and kth output, and ŷ_ik is the corresponding network estimate of the target value under the various input perturbations. In fact, y_ik should not be the original target value of the output but the network estimate of the target value without the input perturbation. The reason is simple: the comparison must be like with like, that is, network output against network output, not against the real target value. For convenience, we keep the form of formulas (1) ~ (4) above, while the meaning of y_ik differs in the following calculation.

III. INPUT PERTURBATION RATIO OF INFLUENTIAL FACTORS

The perturbation method adjusts one input variable at a time by adding noise while keeping all the others untouched. The change ratio of the output variable with respect to the perturbation of the input variable is then evaluated; the input variable with the most significant change ratio is the one with the strongest effect on the analyzed system. To obtain an objective assessment of the sensitivity of the input variables, the optimum range of the input perturbation ratio should first be determined. If the perturbation is too large, the sensitivity spectra may appear clipped; generally, the farther a perturbation moves from the base-case value, the less reliable the results become. If the perturbation is too small, however, the sensitivity spectra may contain no noise and sometimes no signal. The input perturbation ratios are varied from -30% to 30% in steps of 15% in [7], from -40% to 40% in steps of 10% in [10], and from 10% to 50% in steps of 10% in [11]. In [4], the range of input perturbation magnitudes is [20%, 100%], and the corresponding range is [0.1%, 10%] in [12]. In [13], the input perturbation ratio used is 5%. In the cases of this paper, the input perturbation ratios cover the wide range [-100%, 100%] so as to explore the appropriate range of input perturbation ratio in the sensitivity analysis of neural network design. The input perturbation is added as in the following example (MATLAB; note the perturbation values are computed from the original training inputs):

    delta = 0.1;                          % input perturbation ratio (10%)
    pv = P_train(k,:) * delta;            % perturbation values for the kth input
    P_input(k,:) = P_train(k,:) + pv;     % perturbed kth input factor

This adds a 10% perturbation to the values of the kth input factor. Setting delta to, for example, -0.01 adds a perturbation ratio of -1%, and changing k to k+1 perturbs the (k+1)th factor instead. Assuming there are n input factors to be studied, varying δ from -100% to 100% and k from 1 to n, combined with the above definitions of sensitivity, yields the sensitivities of all input factors with respect to the output.

IV. NUMERICAL CASE STUDY

Two numerical examples are run in this section to evaluate the sensitivity formulas presented in Section II and to judge the appropriate range of the input perturbation ratio.
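As a concrete reference for the examples that follow, the perturbation loop can be sketched in Python. This is a hedged illustration, not the paper's implementation: the NumPy layout (samples in rows), the callable `model` standing in for a trained network, and averaging the per-sample ratios for formula (4) are all our assumptions.

```python
import numpy as np

def perturb_input(X, k, delta):
    """Perturb the kth input factor by the ratio delta, mirroring the
    MATLAB snippet: P_input(k,:) = P_train(k,:) + P_train(k,:)*delta."""
    Xp = X.copy()
    Xp[:, k] = X[:, k] * (1.0 + delta)
    return Xp

def sensitivity_spectrum(model, X, deltas):
    """For every input factor k and every nonzero perturbation ratio in
    deltas, report a formula-(4)-style sensitivity: the output change
    divided by the input change, averaged over the samples.
    The baseline is the unperturbed model output (network vs. network)."""
    baseline = model(X)
    spectrum = {}
    for k in range(X.shape[1]):
        for delta in deltas:
            Xp = perturb_input(X, k, delta)
            d_o = model(Xp) - baseline      # output change per sample
            d_u = Xp[:, k] - X[:, k]        # input change per sample
            spectrum[(k, delta)] = np.mean(d_o / d_u)
    return spectrum
```

For a linear model this spectrum is flat across δ; for a trained network, plotting these values against δ is what Figs. 1 and 4 show.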
(1) Case 1

The first example employs the following equation:

y = a1*x1 + a2*x2 + a3*x3 + a4*x4    (5)

with a1 = 1, a2 = 2, a3 = -4 and a4 = 8. This is a simple linear function whose value is the sum of four terms. The sensitivities of x1, x2, x3 and x4 can be calculated as ∂y/∂x1, ∂y/∂x2, ∂y/∂x3 and ∂y/∂x4, giving s1 = 1, s2 = 2, s3 = -4 and s4 = 8. These results serve as a standard against which the following neural network results are compared. The network used in this example contains three layers with 4 input neurons, 6 hidden neurons and 1 output neuron. The back-propagation algorithm with Bayesian regularization is adopted, which offers an efficient tool to avoid overfitting and so improve the generalization of the network [15]. The performance goal was set to the very small value 1E-5.
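The standard values quoted above can be checked mechanically; a minimal sketch in Python (the evaluation point is arbitrary, since the partial derivatives of the linear function (5) are constant):

```python
def y_case1(x1, x2, x3, x4):
    # Equation (5) with a1=1, a2=2, a3=-4, a4=8
    return 1*x1 + 2*x2 - 4*x3 + 8*x4

def central_diff(f, args, j, h=1e-6):
    """Central finite-difference estimate of the jth partial derivative."""
    lo, hi = list(args), list(args)
    lo[j] -= h
    hi[j] += h
    return (f(*hi) - f(*lo)) / (2 * h)

point = (0.3, 0.7, 1.1, 0.5)   # arbitrary evaluation point
standard = [central_diff(y_case1, point, j) for j in range(4)]
# standard ≈ [1.0, 2.0, -4.0, 8.0], matching s1..s4 above
```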


One hundred and fifty groups of data based on (5) were randomly generated as the ideal training data. To calculate the sensitivity spectra, each input variable was perturbed by ±0.01%, ±0.1%, ±1%, ±5%, ±10%, ±20%, ±50%, ±80% and ±100% of its original value. After the network model is well trained, the sensitivity values of case 1 at increasing levels of input perturbation ratio from -100% to 100% are calculated according to formulas (1) ~ (4) and are shown in Fig. 1(a) ~ (d), respectively. Fig. 1 shows that formula (1) can reflect the relative order of the variable sensitivities but cannot accurately measure the size relationship among the factors. Formula (2) can correctly calculate the magnitude of the sensitivities but cannot reflect their direction: the sensitivities according to formula (2) are all positive and cannot reveal the possible inhibitory effects of variables. Formula (3) can reflect the relative magnitude and the direction of the sensitivities, but the values are not directly equal to the aforesaid standard values; a conversion by calculating the ratios among the variables is needed. Formula (4) can correctly calculate both the magnitude and the direction of the sensitivities, so it is the best for calculating the sensitivity in this case. Fig. 1 also shows that the accuracy of the sensitivity decreases as the noise level increases. The reasonable range of the input perturbation ratio is [-20%, 20%]; within this range there is little difference in the sensitivity measurement.

Fig. 1. The sensitivities of case 1 according to formulas (1) ~ (4) at increasing levels of input perturbation ratio: (a) formula (1); (b) formula (2); (c) formula (3); (d) formula (4). [figure omitted]
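The sign loss of formula (2) can be reproduced with a toy computation in Python (a sketch; the exact linear response standing in for the trained network is our assumption). The inhibitory input x3 of case 1 has true sensitivity -4, yet the standard-deviation ratio of definition (2) is nonnegative by construction:

```python
import numpy as np

rng = np.random.default_rng(1)
u = rng.uniform(1.0, 2.0, 150)        # unperturbed values of x3
u_pert = u * 1.1                      # +10% input perturbation
o = -4.0 * u                          # x3's (inhibitory) output contribution
o_pert = -4.0 * u_pert

# Definition (2): std-deviation ratio, always nonnegative -> reports +4.0
s_std_ratio = np.std(o_pert - o) / np.std(u_pert - u)

# Definition (4): difference ratio, keeps the sign -> reports -4.0
s_diff = np.mean((o_pert - o) / (u_pert - u))
```

This is exactly the behavior visible in Fig. 1(b) versus Fig. 1(d).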

To show how the calculated sensitivity differs when y_ik in formulas (1) ~ (4) denotes the target output of the model rather than the network estimate of the target value without input perturbation, Fig. 2 gives the sensitivities of case 1 according to formula (4) with y_ik denoting the target output of the model. Clearly, the accuracy of the sensitivities in Fig. 2 does not always increase as the noise level decreases: there is a large deviation when the input perturbation ratio is large (for example ±100%) or small (for example ±0.01%), and the trend of the sensitivity is irregular even though the input perturbation ratio changes in a regular way. It is therefore not suitable to calculate the sensitivity with y_ik denoting the target output of the model.

Herein lies another problem: the accuracy of the neural network simulation itself. Fig. 3 shows the sensitivities of case 1 according to formula (4) when the performance goal is 1E-2. Compared with Fig. 1(d), the accuracy of the sensitivities with the 1E-2 performance goal is lower than with the 1E-5 goal. The trend of the accuracy of the sensitivities calculated by formula (4) is consistent with the accuracy of the neural network simulation.

Fig. 2. The sensitivities of case 1 according to formula (4) with y_ik denoting the target output of the model. [figure omitted]

Fig. 3. The sensitivities of case 1 according to formula (4) with the performance goal being 1E-2. [figure omitted]

(2) Case 2

The second example employs the following equation:

y = 2*x1 + 5*x2^2 + 10*sin(x2)*cos(x3) + 5*e^((x4 - x3)/2)    (6)

The sensitivities of x1, x2, x3 and x4 can be calculated as ∂y/∂x1, ∂y/∂x2, ∂y/∂x3 and ∂y/∂x4. All the variable sensitivities are functions of the input data except s1. The average sensitivity of x2 over the input domain is 12.21; that of x3 is -4.69 and that of x4 is 2.51. So s1 = 2, s2 = 12.21, s3 = -4.69 and s4 = 2.51 can be regarded as a standard against which the following neural network results are compared. The neural network used for this function approximation is a three-layer system with 4 input neurons, 8 hidden neurons and 1 output neuron. The back-propagation algorithm with Bayesian regularization was adopted, and the performance goal was set to 1E-5. One hundred and fifty groups of data based on equation (6) were randomly generated as the ideal training data. To calculate the sensitivity spectra, each input variable was perturbed by ±0.01%, ±0.1%, ±1%, ±5%, ±10%, ±20%, ±50%, ±80% and ±100% of its original value.

Fig. 4 shows the sensitivity values of case 2 at increasing levels of input perturbation ratio from -100% to 100% according to formulas (1) ~ (4). As in case 1, the same conclusions can be drawn: formula (4) is the best, and the reasonable range of the input perturbation ratio is [-20%, 20%]. In fact, the excellence of formula (4) is not accidental; it follows from the consistency of its definition with that of the partial derivatives. Fig. 5 shows the sensitivities of case 2 according to formula (4) with y_ik denoting the target output of the model; the results again show that it is unsuitable to calculate the sensitivity with y_ik denoting the target output of the model. The conclusion drawn from Fig. 6, which shows the sensitivities of case 2 according to formula (4) with the performance goal being 1E-2, is the same as for case 1: the accuracy of the neural network simulation has an important influence on the accuracy of the sensitivity calculated by formula (4).
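The case 2 standard can be spot-checked numerically; a minimal sketch in Python (reading the exponent in equation (6) as (x4 - x3)/2 is our reconstruction of the paper's layout, and the evaluation point is arbitrary):

```python
import math

def y_case2(x1, x2, x3, x4):
    # Equation (6) as reconstructed in the text
    return (2*x1 + 5*x2**2 + 10*math.sin(x2)*math.cos(x3)
            + 5*math.exp((x4 - x3) / 2))

def central_diff(f, args, j, h=1e-6):
    """Central finite-difference estimate of the jth partial derivative."""
    lo, hi = list(args), list(args)
    lo[j] -= h
    hi[j] += h
    return (f(*hi) - f(*lo)) / (2 * h)

p = (0.5, 1.0, 0.3, 0.2)   # arbitrary point in the input domain

# s1 = dy/dx1 = 2 everywhere: the one input-independent sensitivity
s1 = central_diff(y_case2, p, 0)

# dy/dx3 = -10*sin(x2)*sin(x3) - 2.5*exp((x4 - x3)/2), checked numerically
s3_analytic = -10*math.sin(p[1])*math.sin(p[2]) - 2.5*math.exp((p[3] - p[2]) / 2)
s3_numeric = central_diff(y_case2, p, 2)
```

The quoted averages (12.21, -4.69, 2.51) additionally depend on the input domain, which the paper does not state, so they are not reproduced here.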
V. CONCLUSION

The 'perturbation' method of sensitivity analysis was discussed in this paper, in particular the definition of sensitivity and the choice of input perturbation ratio. Our findings are as follows.

(1) Formula (4) is the best of the formulas used in the published literature for calculating the sensitivity: it correctly captures both the magnitude and the direction of the sensitivity, while none of the other formulas satisfies both points. It should be noted that y_ik in these formulas is not the target output of the model but the network estimate of the target value without the input perturbation, which differs from the other literature.

(2) The input perturbation ratio is an important factor affecting the sensitivity of the model. According to the two examples in this paper, the reasonable range of the input perturbation ratio is [-20%, 20%]; within this range there is little difference in the sensitivity measurement and the right answers can be obtained.

(3) The accuracy of the neural network simulation has an important influence on the accuracy of the sensitivity calculated by formula (4). The more accurate the neural network simulation is, the more accurate the sensitivity of the model is. Accordingly, the performance goal of the neural network model should be set as small as possible.

Fig. 4. The sensitivities of case 2 according to formulas (1) ~ (4) at increasing levels of input perturbation ratio: (a) formula (1); (b) formula (2); (c) formula (3); (d) formula (4). [figure omitted]

Fig. 5. The sensitivities of case 2 according to formula (4) with y_ik denoting the target output of the model. [figure omitted]

Fig. 6. The sensitivities of case 2 according to formula (4) with the performance goal being 1E-2. [figure omitted]


ACKNOWLEDGMENT

This work is supported in part by a grant from the National Natural Science Foundation of China (Grant No. 50979031) and by the China Postdoctoral Science Foundation (Grant No. 200801359).

REFERENCES

[1] G.L. Swartzman and S.P. Kaluzny, Ecological Simulation Primer. Macmillan Publishing Company, New York, 370 pp., 1987.
[2] I. Dimopoulos, J. Chronopoulos, A. Chronopoulou-Sereli, and S. Lek, "Neural network models to study relationships between lead concentration in grasses and permanent urban descriptors in Athens city (Greece)," Ecol. Model., vol. 120, pp. 157-165, 1999.
[3] G.D. Garson, "Interpreting neural network connection weights," Artif. Intell. Expert, vol. 6, pp. 47-51, 1991.
[4] M. Scardi and L.W. Harding, "Developing an empirical model of phytoplankton primary production: a neural network case study," Ecol. Model., vol. 120, pp. 213-223, 1999.
[5] S. Lek, M. Delacoste, P. Baran, I. Dimopoulos, J. Lauga, and S. Aulagnier, "Application of neural networks to modelling nonlinear relationships in ecology," Ecol. Model., vol. 90, pp. 39-52, 1996.
[6] J.Y. Choi and C.-H. Choi, "Sensitivity analysis of multilayer perceptron with differentiable activation functions," IEEE Transactions on Neural Networks, vol. 3, pp. 101-107, Jan. 1992.
[7] Jiang Quan, Feng Xiating, Su Guoshao, and Chen Guoqing, "Intelligent back analysis of rock mass parameters for large underground caverns under high earth stress based on EDZ and increment displacement," Chinese Journal of Rock Mechanics and Engineering, vol. 26, supp. 1, pp. 2654-2662, 2007. (in Chinese)
[8] Antony Y. Cheng and Daniel S. Yeung, "Sensitivity analysis of Neocognitron," IEEE Transactions on Systems, Man and Cybernetics, Part C: Applications and Reviews, vol. 29, no. 2, pp. 238-249, 1999.
[9] N.S. Reddy, C.S. Lee, J.H. Kim, and S.L. Semiatin, "Determination of the beta-approach curve and beta-transus temperature for titanium alloys using sensitivity analysis of a trained neural network," Materials Science and Engineering A, vol. 434, no. 1-2, pp. 218-226, Oct. 2006.
[10] S. Jaiswal, E.R. Benson, J.C. Bernard, and G.L. Van Wicklen, "Neural network modelling and sensitivity analysis of a mechanical poultry catching system," Biosystems Engineering, vol. 92, no. 1, pp. 59-68, 2005.
[11] Muriel Gevrey, Ioannis Dimopoulos, and Sovan Lek, "Review and comparison of methods to study the contribution of variables in artificial neural network models," Ecological Modelling, vol. 160, pp. 249-264, 2003.
[12] Peter de B. Harrington and Chuanhao Wan, "Sensitivity analysis applied to artificial neural networks: what has my neural network actually learned?" IJIMS, vol. 5, no. 2, pp. 1-18, 2002.
[13] Michele Scardi, "Artificial neural networks as empirical models for estimating phytoplankton production," Marine Ecology Progress Series, vol. 139, pp. 289-299, 1996.
[14] D. Lamy, "Modeling and sensitivity analysis of neural networks," Mathematics and Computers in Simulation, vol. 40, pp. 535-548, 1996.
[15] Yuanchang Xie, Dominique Lord, and Yunlong Zhang, "Predicting motor vehicle collisions using Bayesian neural network models: an empirical analysis," Accident Analysis and Prevention, vol. 39, no. 5, pp. 922-933, Sept. 2007.

Runbo Bai was born in Tai'an, China, in 1982. He received his Ph.D. from the College of Civil Engineering, Hohai University, China, in 2009. Dr. Bai joined Shandong Agricultural University as a lecturer after his Ph.D. He has been actively engaged in collaborative research projects in the areas of artificial neural networks and hydraulic and civil engineering.

Daozhen Zhang was born in Zibo, China, in 1979. He graduated from the College of Water-Conservancy and Civil Engineering, Shandong Agricultural University, Tai'an, China, in 2004, and is presently a lecturer at Shandong Water Polytechnic, Rizhao, China. His main interests are dynamic models, nonlinear systems and automata.

Hailei Jia was born in Zibo, China, in 1985. He graduated from the College of Water-Conservancy and Civil Engineering, Shandong Agricultural University, Tai'an, China, in 2007, and is presently a postgraduate student in the College of Civil and Traffic Engineering, Hohai University, Nanjing, China. His main interests are artificial neural networks, biofilms and biofouling.