
Expert Systems with Applications 38 (2011) 5958–5966


Expert Systems with Applications journal homepage: www.elsevier.com/locate/eswa

Multiple regression, ANN (RBF, MLP) and ANFIS models for prediction of swell potential of clayey soils

Işık Yilmaz a,*, Oguz Kaynar b

a Cumhuriyet University, Faculty of Engineering, Department of Geological Engineering, 58140 Sivas, Turkey
b Cumhuriyet University, Faculty of Economics and Administrative Sciences, Department of Management Information Systems, 58140 Sivas, Turkey

Keywords: ANN; ANFIS; Multiple regression; Soft computing; Clayey soil; Swell potential

Abstract

In recent years, new techniques such as artificial neural networks and fuzzy inference systems have been employed to develop predictive models for estimating needed parameters, and soft computing techniques are now being used as alternative statistical tools. Determination of the swell potential of soil is difficult, expensive and time consuming, and involves destructive tests. In this paper, the use of the MLP and RBF functions of ANN (artificial neural networks) and of ANFIS (adaptive neuro-fuzzy inference system) for prediction of S% (swell percent) of soil is described and compared with the traditional statistical model of MR (multiple regression). The accuracies of the ANN and ANFIS models may be regarded as relatively similar; however, the constructed RBF exhibited higher performance than MLP, ANFIS and MR for predicting S%. The performance comparison showed that soft computing is a good tool for minimizing the uncertainties in soil engineering projects. The use of soft computing may also provide new approaches and methodologies, and minimize the potential inconsistency of correlations. © 2010 Elsevier Ltd. All rights reserved.

1. Introduction

Many buildings are constructed with foundations that are inadequate for the existing soil conditions. Because suitable land is lacking, homes are often built on marginal land that has insufficient bearing capacity to support the substantial weight of a structure. Land becomes scarce with city growth, and it often becomes necessary to construct buildings and other structures on sites with unfavorable conditions. The most important characteristic of clayey soils is their susceptibility to volume change from swelling and shrinkage. Such volume changes can give rise to ground movements which may result in damage to buildings (Bell & Jermy, 1994; Bell & Maud, 1995). The clays most prone to swelling and shrinkage are over-consolidated clays (Dhowian, Ruwiah, & Erol, 1985) and Tertiary and Quaternary alluvial/colluvial soils (Donaldson, 1969). The swelling potential of expansive clayey soils is due to reductions of overburden stress, unloading conditions, or exposure to water and increase in moisture content. Bell and Maud (1995) suggest that low-rise buildings are particularly vulnerable to ground movements, as they generally do not have sufficient weight or strength to resist such movement. Geotechnical engineers have long recognized that swelling of expansive soils caused by moisture variation may result in considerable distress and consequently in severe damage to the overlying structures (Basma, 1991). If the substrata are not heavily loaded, structures on the surface will be affected by heave. As reported by Bell, Cripps, Culshaw, and Entwisle (1993), based on the catalogue of Burland (1984), the annual cost of the problem in the USA and Sudan in the mid-1980s was $6–$8 billion and $6 million, respectively (Yilmaz, 2008).

A great deal of structural movement has been unduly blamed on expansive soils. Many floor slabs constructed in an expansive soil area crack and sometimes heave because of improperly designed concrete. It is well known that improper curing of concrete, in addition to the lack of expansion joints, will cause cracking (Chen, 1975).

The swell potential of a soil is of vital importance for classifying swelling soils and for designing structures either upon or inside a clayey soil. It is mainly used in numerical and analytical design methods for estimating surface heave and the swelling pressure acting on a building. Correlations have been a significant part of soil mechanics from the earliest days. In some cases they are essential, because it is difficult to measure the quantity directly; in other cases they are desirable as a means of checking the results against other tests. The correlations are generally either semi-empirical, based on mechanics, or purely empirical, based on statistical analysis. However, determination of the swell potential of a soil material is time consuming, expensive and involves destructive tests. If reliable predictive models could be obtained to correlate swell percent (S%) with quick, cheap and nondestructive test results, they would be very valuable for at least the preliminary stage of designing a structure. Empirically obtained parameters from index test results may not be reliable for engineering projects on their own. However, these data are very valuable for at least the preliminary stage of designing a structure when they are combined with interpretation based on engineering experience. The literature accordingly contains a considerable number of empirical equations, obtained from conventional statistical techniques, for assessing the swell potential of soils.

In recent years, new soft computing techniques such as artificial neural networks, fuzzy inference systems, evolutionary computation and their hybrids have been successfully employed for developing predictive models to estimate the needed parameters. These techniques attract attention in many research fields because they can tolerate a wide range of uncertainty, and soft computing techniques are now being used as alternative statistical tools.

This study aims to determine empirical relationships for estimating the swell percent of soils by using multiple regression (MR), the MLP and RBF functions of ANN (artificial neural network) and ANFIS (adaptive neuro-fuzzy inference system) models, and to compare the prediction capabilities of the models. Soil samples (215) were tested for determination of swell percent (S%), liquid limit (LL), activity (A) and cation exchange capacity (CEC). To establish predictive models, statistical and soft computing techniques (multiple regression, artificial neural networks by means of MLP and RBF, and the adaptive neuro-fuzzy inference system) were used, and their prediction performances were then analyzed. It was found that the relationships developed in this study allow LL, A and CEC to be used as a rapid, easy-to-determine, low-cost means of estimating the swelling potential with sufficient accuracy for adequate foundation design in situations where urgency and/or lack of money prevents a thorough geotechnical investigation from being conducted. Moreover, the comparison of performance indices and coefficients of correlation for predicting swell percent revealed that the prediction performance of the RBF function of the artificial neural network model is higher than those of the multiple regression equations, the MLP function of artificial neural networks and the adaptive neuro-fuzzy inference system.

* Corresponding author. Tel.: +90 346 219 1010x1305; fax: +90 346 219 1171. E-mail address: [email protected] (I. Yilmaz).
0957-4174/$ - see front matter © 2010 Elsevier Ltd. All rights reserved. doi:10.1016/j.eswa.2010.11.027

2. Experimental framework

In this study, the data were obtained from extensive field studies, and our database was constructed over 15 years. A sufficient number of high-quality data is required to construct a reliable predictive model; for this reason 215 samples were used in the analyses, although data for 350 or more soil samples were available. Soils were tested for determination of swell percent, Atterberg limits, cation exchange capacity and grain size distribution according to the procedures suggested by international standards.

To determine the swelling percent of the soil samples, swelling tests were carried out in accordance with ASTM D-4546 (1994). A pre-loading pressure of 0.07 kgf/cm² and samples with a radius of 5.0 cm were used in our tests.

When clay minerals are present in fine-grained soil, it can be remoulded in the presence of some moisture without crumbling. This cohesive nature is caused by the adsorbed water surrounding the clay particles. LL increases with increasing quantity of expansive clay minerals such as montmorillonite. The liquid limit and plastic limit values of the samples were determined according to the procedure outlined in British Standard (BS) 1377 (BS, 1975).

The swelling properties of soils are affected by CEC; in other words, the swelling capacity is closely related to the CEC. The amount of swelling increases with increasing CEC (Christidis, 1998). Al-Rawas (1998) has also reported that the cations are the factors controlling the expansive nature of soils. One of the fundamental differences between clay minerals lies in the amount and kind of exchangeable cations present on their surfaces and the excess negative charge of the crystal lattice which these cations neutralize. The property of ion exchange is of great fundamental and practical importance in the investigation of clay minerals. The CEC of a soil is the number of moles of adsorbed cation charge that can be desorbed from unit mass of soil under given conditions of temperature, pressure, soil solution composition and soil-solution mass ratio (Sposito, 1989). For soils in which the readily exchangeable cations are solely monovalent or bivalent, the "index" cation can be Na⁺, whereas for soils also bearing trivalent readily exchangeable cations, Ba²⁺ is the "index" cation of choice. NH₄⁺ has also often been used as an "index" cation, and it was used as the index cation in this study (Yilmaz, 2006).

In the last stage of the laboratory experiments, the CEC of the soils was measured by the ammonium acetate (NH₄OAc) method. The basis of this method is the replacement of sodium (Na⁺) ions with ammonium (NH₄⁺) ions. In the tests, the soils were first saturated with sodium ions, and the sodium ions were then replaced with ammonium ions by adding a solution containing ammonium at a pH of 7 (Bache, 1976). At the end of the CEC tests, the amount of sodium in the solution was determined by the atomic absorption method.

The results obtained and their basic test statistics are tabulated in Table 1. The swell percent of the soils ranged between 1.1 and 15.2, with an average value of 6.75. The average value of the liquid limit was 56.5%, with values varying from 4% to 112%. The respective average values of activity and cation exchange capacity were 0.85% (0.11–1.84%) and 47.1 meq/100 g (5.1–94.9 meq/100 g).

Particular attention was paid to selecting a data set with a normal distribution. To characterize the variation of S%, the dependent variable, descriptive statistics such as minimum, maximum, mean, mode, median, variance, standard deviation, skewness and kurtosis were calculated using the SPSS Version 10.0.1 (1999) package. Table 2 shows that S% has an almost normal distribution. Although it is close to the normal distribution, the data are slightly skewed to the left and show some kurtosis (Fig. 1). The respective skewness and kurtosis values of 0.207 and 0.471 are very low. It was therefore concluded that the analyses would work well in this case.

3. Data processing and analyses

To establish predictive models among the parameters obtained in this study, simple regression analysis was performed in the first stage of the analysis. The relations between S and the other parameters were analyzed employing linear, power, logarithmic and exponential functions. The statistically significant and strong correlations were found to be linear, and regression equations were established between the index parameters and S (Table 3). All obtained relationships were found to be statistically significant according to the Student's t-test at the 95% level of confidence.
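The single-variable linear fits described above can be sketched in a few lines of Python. This is a minimal illustration, not the SPSS procedure used in the study: the function name `fit_line` and the sample values are hypothetical and are not the study's data.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit of y = a*x + b; returns (a, b, r2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    a, b = np.polyfit(x, y, 1)                 # slope and intercept
    residual = y - (a * x + b)
    ss_res = np.sum(residual ** 2)             # unexplained variation
    ss_tot = np.sum((y - y.mean()) ** 2)       # total variation
    return float(a), float(b), float(1.0 - ss_res / ss_tot)

# Hypothetical liquid limit (%) and swell percent (%) pairs, for illustration only.
ll = [20.0, 35.0, 50.0, 65.0, 80.0, 95.0]
s = [2.3, 4.1, 6.4, 8.2, 9.9, 12.1]
a, b, r2 = fit_line(ll, s)
print(f"S = {a:.4f} LL + {b:.4f}, R2 = {r2:.3f}")
```

The same function can be applied to activity or CEC as the predictor; the power, logarithmic and exponential forms mentioned above would need a transformed variable before the fit.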

Table 1
Basic statistics of the results obtained from tests.

             S (%)     LL (%)    A (%)    CEC (meq/100 g)
Minimum      1.1       4         0.11     5.1
Maximum      15.2      112       1.84     94.9
Average      6.75      56.5      0.85     47.1
Std. Dev.    3.435     26.576    0.356    24.312

S, swell percent; LL, liquid limit; A, activity; CEC, cation exchange capacity.


Table 2
Descriptive statistics for S% as the dependent variable.

N (valid)                  215
N (missing)                0
Mean                       6.7474
Std. error of mean         0.2343
Median                     6.5000
Mode                       6.00
Std. deviation             3.4351
Variance                   11.8002
Skewness                   0.207
Std. error of skewness     0.106
Kurtosis                   0.471
Std. error of kurtosis     0.230
Range                      14.10
Minimum                    1.10
Maximum                    15.20
Sum                        1450.70

Fig. 1. Frequency distribution of S% values of samples used in analyses.

Table 3
Predictive models for assessing the S%.

Relation    Predictive model           R2
S–LL        S = 0.1239 LL + 0.2584     0.91
S–A         S = 7.6766 A + 0.2174      0.68
S–CEC       S = 0.1335 CEC + 0.457     0.89

Fig. 2 shows the plot of the swell percent versus liquid limit, activity and cation exchange capacity.

3.1. Multiple regression model

Multiple regression, a time-honored technique going back to Pearson's use of it in 1908, is employed to account for (predict) the variance in an interval dependent variable, based on linear combinations of interval, dichotomous or dummy independent variables. The general purpose of multiple regression is to learn more about the relationship between several independent or predictor variables and a dependent or criterion variable.

The multiple regression equation takes the form y = b1x1 + b2x2 + ... + bnxn + c. Here b1, b2, ..., bn are the regression coefficients, each representing the amount by which the dependent variable y changes when the corresponding independent variable changes by 1 unit. The constant c is the point where the regression line intercepts the y axis, representing the value of the dependent variable y when all the independent variables are 0. The standardized versions of the b coefficients are the beta weights, and the ratio of the beta coefficients is the ratio of the relative predictive power of the independent variables. The major conceptual limitation of all regression techniques is that one can only ascertain relationships, but never be sure about the underlying causal mechanism.

Multiple regression analysis was carried out to correlate the measured swell percent with three soil index properties, namely liquid limit, activity and cation exchange capacity (Table 4). The multiple regression model to predict swell percent is given below.

$$ S\% = (9.223 \times 10^{-2})\,LL + (2.401 \times 10^{-2})\,A + (5.535 \times 10^{-2})\,CEC - 0.153 \qquad (1) $$

The coefficient of correlation between the measured and predicted values is a good indicator of the prediction performance of the model. Fig. 3 shows the relationship between measured and predicted values obtained from the MR model for S%, with a good correlation coefficient.

In this study, the variance accounted for (VAF) (Eq. (2)) and root mean square error (RMSE) (Eq. (3)) indices were calculated to assess the prediction capacity of the predictive models developed in the study, as employed by Alvarez and Babuska (1999), Finol, Guo, and Jing (2001), Gokceoglu (2002), Yilmaz and Yüksek (2008, 2009):

$$ \mathrm{VAF} = \left[ 1 - \frac{\mathrm{var}(y - y')}{\mathrm{var}(y)} \right] \times 100 \qquad (2) $$

$$ \mathrm{RMSE} = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} (y - y')^2 } \qquad (3) $$

where y and y' are the measured and predicted values, respectively. The calculated indices are given in Table 5. If the VAF is 100 and the RMSE is 0, the model is excellent. The mean absolute percentage error (MAPE), a measure of the accuracy of a fitted series in statistics, was also used to compare the prediction performances of the models. MAPE usually expresses accuracy as a percentage (Eq. (4)).

$$ \mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{A_i - P_i}{A_i} \right| \times 100 \qquad (4) $$
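The three performance indices of Eqs. (2)–(4) can be computed directly. The following is a minimal sketch; the function names and the measured/predicted values are hypothetical and serve only to show the arithmetic.

```python
import numpy as np

def vaf(y, y_pred):
    """Eq. (2): [1 - var(y - y') / var(y)] * 100."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float((1.0 - np.var(y - y_pred) / np.var(y)) * 100.0)

def rmse(y, y_pred):
    """Eq. (3): square root of the mean squared difference."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y - y_pred) ** 2)))

def mape(actual, predicted):
    """Eq. (4): mean of |(A_i - P_i) / A_i|, as a percentage."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean(np.abs((actual - predicted) / actual)) * 100.0)

# Hypothetical measured and predicted swell percents, for illustration only.
measured = [2.0, 4.0, 6.0, 8.0, 10.0]
predicted = [2.5, 3.5, 6.0, 8.5, 9.5]
print(vaf(measured, predicted), rmse(measured, predicted), mape(measured, predicted))
```

A perfect model gives VAF = 100 and RMSE = MAPE = 0. MAPE is undefined when any actual value is zero, which does not arise for the swell percents reported here (minimum 1.1).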

In Eq. (4), $A_i$ is the actual value and $P_i$ is the predicted value. The obtained values of RMSE, VAF and MAPE, given in Table 5, indicated high prediction performances.

Fig. 2. Swell percent versus liquid limit, activity and cation exchange capacity.

Table 4
Model summaries of multiple regressions for prediction of S%.

Independent variables    Coefficient      Std. error    t-Value    Sig. level
Constant                 −0.153           0.178         −0.859     0.392
LL                       9.223 × 10⁻²     0.011         8.727      0.000
A                        2.401 × 10⁻²     0.331         0.073      0.942
CEC                      5.535 × 10⁻²     0.012         3.064      0.002

Fig. 3. Cross-correlation of predicted and observed values of S% for multiple regression model.

Table 5
Performance indices (RMSE, VAF and R2) for models.

RMSE, root mean square error; VAF, variance accounted for; MAPE, mean absolute percentage error.

3.2. ANN (artificial neural networks) models – MLP and RBF

Neural networks may be used as a direct substitute for auto correlation, multivariable regression, linear regression, trigonometric and other statistical analysis techniques (Singh, Kanchan, Verma, & Singh, 2003). Neural networks, with their remarkable ability to derive a general solution from complicated or imprecise data, can be used to extract patterns and detect trends that are too complex to be noticed by either humans or other computer techniques. A trained neural network can be thought of as an "expert" in the category of information it has been given to analyze. This expert can then be used to provide projections for new situations of interest and to answer "what if" questions. When a data stream is analyzed using a neural network, it is possible to detect important predictive patterns that were not previously apparent to a non-expert.

A particular network can be defined by three fundamental components: transfer function, network architecture and learning law (Simpson, 1990). It is essential to define these components in order to solve the problem satisfactorily.

Neural networks comprise a large class of different architectures. Multi Layer Perceptron (MLP) and Radial Basis Function (RBF) are two of the most widely used neural network architectures in the literature for classification or regression problems (Cohen & Intrator, 2002, 2003; Kenneth, Wernter, & MacInyre, 2001; Loh & Tim, 2000). Both types of neural network structure are good in pattern classification problems. They are robust classifiers with the ability to generalize for imprecise input data. The general difference between MLP and RBF is that RBF is a localist type of learning, responsive only to a limited section of the input space, whereas MLP is a more distributed approach. The output of an MLP is produced by linear combinations of the outputs of hidden layer nodes, in which every neuron maps a weighted average of the inputs through a sigmoid function. In a one-hidden-layer RBF network, hidden nodes map distances between input vectors and center vectors to outputs through a nonlinear kernel or radial function.

In this study, the two different architectures of ANN (MLP and RBF) were also used to estimate the swelling percent of the soils. All data were first normalized and divided into three data sets: training (60% of all data), test (20% of all data) and verification (20% of all data). Matlab 7.1 (2005) software was used in the neural network analyses, with a three-layer feed-forward network consisting of an input layer (3 neurons), one hidden layer (2 neurons for MLP, 16 neurons for RBF) and one output layer (Fig. 4). The number of neurons in the hidden layer was selected from a series of trial runs of networks having 1 to 20 neurons, in order to find the network with the minimum error. In the analyses, the network parameters of learning rate and momentum were set to 0.01 and 0.9, respectively. Variable learning rate with momentum (trainLm) was used as the network training function, and tansig as the activation (transfer) function for all layers.

Fig. 4. MLP and RBF neural network structure used in the study.

3.3. Multi Layer Perceptron (MLP) model

Multi Layer Perceptron (MLP) network models are popular network architectures used in most research applications in medicine, engineering, mathematical modeling, etc. In an MLP, the weighted sum of the inputs and the bias term is passed to the activation level through a transfer function to produce the output, and the units are arranged in a layered feed-forward topology called a Feed Forward Neural Network (Venkatesan & Anitha, 2006). MLP networks consist of an input layer, one or more hidden layers and an output layer. Each layer has a number of processing units, and each unit is fully interconnected with weighted connections to units in the subsequent layer. The MLP transforms n inputs to l outputs through some nonlinear functions. The output of the network is determined by the activation of the units in the output layer as follows:

$$ x_o = f\left( \sum_{h} x_h w_{ho} \right) \qquad (5) $$

where $f(\cdot)$ is the activation function, $x_h$ is the activation of the hth hidden layer node and $w_{ho}$ is the interconnection between the hth hidden layer node and the oth output layer node. The most widely used activation function is the sigmoid, given as follows:

$$ x_o = \frac{1}{1 + \exp\left( -\sum_{h} x_h w_{ho} \right)} \qquad (6) $$

The activation level of the nodes in the hidden layer is determined in a similar fashion. Based on the differences between the calculated output and the target value, an error is defined as follows:

$$ E = \frac{1}{2} \sum_{s}^{N} \sum_{o}^{L} \left( t_o^{(s)} - x_o^{(s)} \right)^2 \qquad (7) $$

In Eq. (7), N is the number of patterns in the data set and L is the number of output nodes. The aim is to reduce the error by adjusting the interconnections between layers. The weights are adjusted using the gradient descent back propagation (BP) algorithm. The algorithm requires training data consisting of a set of corresponding input and target pattern values $t_o$. During the training process, the MLP starts with a random set of initial weights, and training continues until the set of $w_{ih}$ and that of $w_{ho}$ are optimized so that a predefined error threshold is met between $x_o$ and $t_o$ (after Altun & Gelen, 2004). According to the BP algorithm, each interconnection between the nodes is adjusted by the amount of the weight update value as follows:

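$$ \Delta w_{ho} = -\eta \frac{\partial E}{\partial w_{ho}} = \eta\, \delta_o x_h \qquad (8) $$

$$ \Delta w_{ih} = -\eta \frac{\partial E}{\partial w_{ih}} = \eta\, \delta_h x_i \qquad (9) $$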
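The forward pass, error and weight updates of Eqs. (5)–(9) can be sketched as a small training loop. This is an illustrative sketch under stated assumptions, not the Matlab configuration used in the study: it uses the sigmoid of Eq. (6) rather than tansig, plain batch gradient descent without momentum, a constant input of 1 as the bias term, updates averaged over the patterns, and hypothetical synthetic data already scaled to [0, 1]. The 60/20/20 split follows Section 3.2; all function and variable names are ours.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_mlp(X, t, n_hidden=2, eta=0.5, epochs=2000, seed=0):
    """Batch back propagation for one hidden layer; returns weights and E per epoch."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    Xb = np.hstack([X, np.ones((n, 1))])               # constant input acts as bias (assumption)
    w_ih = rng.normal(0.0, 0.5, size=(Xb.shape[1], n_hidden))
    w_ho = rng.normal(0.0, 0.5, size=(n_hidden + 1, 1))
    t = np.asarray(t, dtype=float).reshape(-1, 1)
    history = []
    for _ in range(epochs):
        x_h = sigmoid(Xb @ w_ih)                       # hidden activations
        x_hb = np.hstack([x_h, np.ones((n, 1))])
        x_o = sigmoid(x_hb @ w_ho)                     # Eqs. (5)-(6)
        history.append(0.5 * float(np.sum((t - x_o) ** 2)))   # Eq. (7)
        delta_o = x_o * (1.0 - x_o) * (t - x_o)        # output-layer delta
        delta_h = x_h * (1.0 - x_h) * (delta_o @ w_ho[:-1].T)  # hidden-layer delta
        w_ho += eta * (x_hb.T @ delta_o) / n           # Eq. (8), averaged over patterns
        w_ih += eta * (Xb.T @ delta_h) / n             # Eq. (9), averaged over patterns
    return w_ih, w_ho, history

def predict_mlp(X, w_ih, w_ho):
    ones = np.ones((X.shape[0], 1))
    x_h = sigmoid(np.hstack([X, ones]) @ w_ih)
    return sigmoid(np.hstack([x_h, ones]) @ w_ho).ravel()

# Hypothetical normalized inputs (LL, A, CEC) and target (S%), for illustration only.
rng = np.random.default_rng(1)
X = rng.uniform(size=(60, 3))
t = 0.2 + 0.6 * X.mean(axis=1)
idx = rng.permutation(len(X))
train_idx, test_idx, verify_idx = np.split(idx, [36, 48])   # 60% / 20% / 20%
w_ih, w_ho, history = train_mlp(X[train_idx], t[train_idx])
test_pred = predict_mlp(X[test_idx], w_ih, w_ho)
```

The `history` list holds the error of Eq. (7) at each epoch, which makes it easy to check that training reduces the error before the held-out test and verification sets are evaluated.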

In Eqs. (8) and (9), E is the error cost function given in Eq. (7), $\delta_o = x'_o (t_o - x_o)$ and $\delta_h = x'_h \sum_o \delta_o w_{ho}$, where $x'_o = x_o (1 - x_o)$ and $x'_h = x_h (1 - x_h)$ when a sigmoid activation function is used (Altun & Gelen, 2004).

The cross-correlation between predicted and observed values (Fig. 5) indicated that the MLP type of ANN model is highly acceptable for prediction of S%. RMSE, VAF, MAPE and R2 values are tabulated in Table 5.

Fig. 5. Cross-correlation of predicted and observed values of S% for ANN (MLP and RBF) models.

3.4. Radial Basis Function (RBF) model

The Radial Basis Function (RBF) neural network is based on supervised learning. RBF networks were independently proposed by many researchers and are a popular alternative to the MLP. RBF networks are also good at modeling nonlinear data; they can be trained in one stage rather than by an iterative process as in MLP, and they learn the given application quickly (Venkatesan & Anitha, 2006). The structure of an RBF neural network is similar to that of an MLP in that it consists of layers of neurons. The main distinction is that RBF has a hidden layer which contains nodes called RBF units. Each RBF has two key parameters that describe the location of the function's center and its deviation or width. The hidden unit measures the distance between an input data vector and the center of its RBF. The RBF has its peak when the distance between its center and the input data vector is zero, and declines gradually as this distance increases. Because there is only a single hidden layer in an RBF network, there are only two sets of weights, one connecting the hidden layer to the input layer and the other connecting the hidden layer to the output layer. The weights connecting to the input layer contain the parameters of the basis functions. The weights connecting the hidden layer to the output layer are used to form linear combinations of the activations of the basis functions (hidden units) to generate the network outputs. Since the hidden units are nonlinear, the outputs of the hidden layer may be combined linearly, and so processing is rapid. The output of the network is given by (Foody, 2004):

$$ y_k(x) = \sum_{j=1}^{M} w_{kj} \phi_j(x) + w_{k0} \qquad (10) $$

where M is the number of basis functions, x is the input data vector, $w_{kj}$ represents a weighted connection between the basis function and the output layer, and $\phi_j$ is the nonlinear function of unit j, which is typically a Gaussian of the form

$$ \phi_j(x) = \exp\left( -\frac{\lVert x - \mu_j \rVert^2}{2\sigma_j^2} \right) \qquad (11) $$
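Eqs. (10) and (11) translate into a short sketch of an RBF network with one output. The assumptions here are ours: a single shared spread `sigma` is used for every unit instead of a separate $\sigma_j$, the centers are drawn at random from the data, the output weights are obtained by batch linear least squares (a stand-in for an iterative least-mean-square rule), and the data are hypothetical.

```python
import numpy as np

def rbf_design(X, centers, sigma):
    """Gaussian activations of Eq. (11), with a leading column of ones for w_k0."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)   # squared distances
    phi = np.exp(-d2 / (2.0 * sigma ** 2))
    return np.hstack([np.ones((X.shape[0], 1)), phi])

def fit_rbf(X, y, n_centers=16, sigma=0.5, seed=0):
    """Pick random centers from the data, then solve for the output weights."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=n_centers, replace=False)]
    w, *_ = np.linalg.lstsq(rbf_design(X, centers, sigma), y, rcond=None)
    return centers, w

def predict_rbf(X, centers, w, sigma=0.5):
    return rbf_design(X, centers, sigma) @ w           # Eq. (10)

# Hypothetical normalized inputs and target, for illustration only.
rng = np.random.default_rng(2)
X = rng.uniform(size=(80, 3))
y = np.sin(2.0 * X[:, 0]) + X[:, 1] * X[:, 2]
centers, w = fit_rbf(X, y)
train_rmse = float(np.sqrt(np.mean((y - predict_rbf(X, centers, w)) ** 2)))
```

Because the output layer is linear in the weights once the centers and spread are fixed, the fit needs no iteration, which is the one-stage training property noted in the text.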

In Eq. (11), x and $\mu_j$ are the input and the center of the RBF unit, respectively, and $\sigma_j$ is the spread of the Gaussian basis function (Foody, 2004). The weights are optimized using the least mean square (LMS) algorithm once the centers of the RBF units are determined. The centers can be chosen randomly or by using clustering algorithms. In this study, the centers were randomly selected from the data set.

As seen from Table 5 and the cross-correlation between predicted and observed values (Fig. 5), the RBF type of ANN model is highly acceptable for prediction of S%. RMSE, VAF, MAPE and R2 values are also tabulated in Table 5.

3.5. Adaptive neuro-fuzzy inference system (ANFIS) model

In ANFIS, the learning capabilities of a neural network and the reasoning capabilities of fuzzy logic are combined in order to give enhanced prediction capabilities compared with using a single methodology alone. The goal of ANFIS is to find a model or mapping that will correctly associate the input values with the target values. The fuzzy inference system (FIS) is a knowledge representation in which each fuzzy rule describes a local behavior of the system. The network structure that implements FIS and employs hybrid-learning rules for training is called ANFIS.

Let X be a space of objects and x be a generic element of X. A classical set A ⊆ X is defined as a collection of elements or objects x ∈ X such that each x can either belong or not belong to the set A. By defining a characteristic function for each element x in X, we can represent a classical set A by a set of ordered pairs (x, 0) or (x, 1), which indicate x ∉ A or x ∈ A, respectively. A fuzzy set, on the other hand, expresses the degree to which an element belongs to a set. Hence the characteristic function of a fuzzy set is allowed to have values between 0 and 1, which denote the degree of membership of an element in a given set. A fuzzy set A in X is therefore defined as a set of ordered pairs:

$$ A = \{ (x, \mu_A(x)) \mid x \in X \} \qquad (12) $$

where $\mu_A(x)$ is called the membership function (MF) for the fuzzy set A. The MF maps each element of X to a membership grade (or value) between 0 and 1. Usually X is referred to as the universe of discourse, or simply the universe. The most widely used MF is the generalized bell MF (or the bell MF), which is specified by three parameters {a, b, c} and defined as (Jang & Chuen-Tsai, 1995):

$$ \mathrm{bell}(x; a, b, c) = \frac{1}{1 + \left| \dfrac{x - c}{a} \right|^{2b}} \qquad (13) $$
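As a small illustration of Eq. (13), the bell MF can be written as a one-line function. The parameter values below are hypothetical and chosen only to show the shape.

```python
import numpy as np

def bell_mf(x, a, b, c):
    """Generalized bell membership function of Eq. (13)."""
    x = np.asarray(x, dtype=float)
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2.0 * b))

# Hypothetical parameters: center c = 50, half-width a = 20, slope parameter b = 2.
grades = bell_mf([30.0, 50.0, 70.0], a=20.0, b=2.0, c=50.0)
print(grades)
```

The grade is 1 at x = c and falls to 0.5 at x = c ± a, so c sets the center and a the width, while b controls how steeply the curve falls off around those crossover points.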

Parameter b is usually positive. A desired bell MF can be obtained by a proper selection of the parameter set {a, b, c}. During the learning phase of ANFIS, these parameters change continuously in order to minimize the error function between the target output values and the calculated ones (Lee, 1990a,b). The proposed neuro-fuzzy model of ANFIS is a multilayer neural network-based fuzzy system. Its topology is shown in Fig. 6, and the system has a total of five layers. In this connected structure, the input and output nodes represent the training values and the predicted values, respectively, and the hidden layers contain nodes functioning as membership functions (MFs) and rules. This architecture has the benefit that it removes the disadvantage of a normal feed-forward multilayer network, which is difficult for an observer to understand or modify. For simplicity, we assume that the examined fuzzy inference system has two inputs x and y and one output. For a first-order Sugeno fuzzy model, a common rule set with two fuzzy if–then rules is defined as

Rule 1: If x is $A_1$ and y is $B_1$, then $f_1 = p_1 x + q_1 y + r_1$,   (14)

Rule 2: If x is $A_2$ and y is $B_2$, then $f_2 = p_2 x + q_2 y + r_2$.   (15)
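The two rules above can be evaluated with a short sketch of first-order Sugeno inference: each rule's firing strength is the product of its membership grades, the strengths are normalized, and the output is the weighted sum of the rule outputs $f_1$ and $f_2$. Gaussian MFs are used here as an assumption (the study's Table 7 lists a Gauss function as the MF type); all parameter values and names are hypothetical.

```python
import numpy as np

def gauss_mf(x, center, sigma):
    """Gaussian membership grade (assumed MF shape)."""
    return float(np.exp(-0.5 * ((x - center) / sigma) ** 2))

def sugeno_two_rules(x, y, premise, consequent):
    """premise: [((cA, sA), (cB, sB)), ...]; consequent: [(p, q, r), ...]."""
    # Firing strength of each rule: product of the two membership grades.
    w = np.array([gauss_mf(x, *a) * gauss_mf(y, *b) for a, b in premise])
    w_norm = w / w.sum()                                   # normalized firing strengths
    # Linear rule outputs of Eqs. (14) and (15).
    f = np.array([p * x + q * y + r for p, q, r in consequent])
    return float(np.dot(w_norm, f))                        # overall output

# Hypothetical premise (center, sigma) and consequent (p, q, r) parameters.
premise = [((0.0, 1.0), (0.0, 1.0)), ((2.0, 1.0), (2.0, 1.0))]
consequent = [(1.0, 1.0, 0.0), (2.0, 0.0, 1.0)]
print(sugeno_two_rules(1.0, 1.0, premise, consequent))
```

At the point midway between the two rule centers both rules fire equally, so the output is the plain average of $f_1$ and $f_2$; closer to one center, that rule dominates.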


Fig. 6. Type-3 fuzzy reasoning (a) and equivalent ANFIS (b) (after Jang, 1993).

As seen from Fig. 6b, different layers of ANFIS have different nodes. Each node in a layer is either fixed or adaptive (Jang, 1993). The layers and their associated nodes are described below:

Layer 1. Every node i in this layer is an adaptive node. Parameters in this layer are called premise parameters.
Layer 2. Every node in this layer is a fixed node labeled Π, whose output is the product of all the incoming signals. Each node output represents the firing strength of a rule.
Layer 3. Every node in this layer is a fixed node labeled N. The ith node calculates the ratio of the ith rule's firing strength to the sum of all rules' firing strengths. The outputs of this layer are therefore called normalized firing strengths.
Layer 4. Every node i in this layer is an adaptive node. Parameters in this layer are referred to as consequent parameters.
Layer 5. The single node in this layer is a fixed node labeled Σ, which computes the overall output as the summation of all incoming signals.

The learning algorithm for ANFIS is a hybrid algorithm, which is a combination of gradient descent and the least-squares method. More specifically, in the forward pass of the hybrid learning algorithm, node outputs go forward until layer 4 and the consequent parameters are identified by the least-squares method (Jang, 1993). In the backward pass, the error signals propagate backwards and the premise parameters are updated by gradient descent. Table 6 summarizes the activities in each pass. The consequent parameters are optimized under the condition that the premise parameters are fixed. The main benefit of the hybrid approach is that it converges much faster, since it reduces the search space dimensions of the original pure back propagation method used in neural networks. The overall output can be expressed as a linear combination of the consequent parameters.

Table 6
Forward and backward pass for ANFIS.

                         Forward pass               Backward pass
Premise parameters       Fixed                      Gradient descent
Consequent parameters    Least-squares estimator    Fixed
Signals                  Node outputs               Error signals

The error measure used to train the above-mentioned ANFIS is defined as (Loukas, 2001):

$$ E = \sum_{k=1}^{n} (f_k - f'_k)^2 \qquad (16) $$
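Because the overall output is linear in the consequent parameters once the premise parameters are fixed, the forward-pass step of the hybrid algorithm reduces to an ordinary least-squares problem that minimizes Eq. (16). The sketch below illustrates only that step, under our own assumptions: the normalized firing strengths are taken as given, the data and "true" parameters are hypothetical, and the gradient-descent update of the premise parameters is not shown.

```python
import numpy as np

def anfis_output(inputs, w_norm, theta):
    """Sum over rules of normalized strength times the linear rule output."""
    ext = np.hstack([inputs, np.ones((inputs.shape[0], 1))])    # [x, y, 1]
    return np.sum(w_norm * (ext @ theta.T), axis=1)

def fit_consequents(inputs, targets, w_norm):
    """Least-squares estimate of the (rules x (inputs + 1)) consequent parameters."""
    n, d = inputs.shape
    ext = np.hstack([inputs, np.ones((n, 1))])
    # One block of columns per rule: w_i * [x, y, 1].
    design = (w_norm[:, :, None] * ext[:, None, :]).reshape(n, -1)
    theta, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return theta.reshape(w_norm.shape[1], d + 1)

# Hypothetical two-input, two-rule example with fixed Gaussian premises.
rng = np.random.default_rng(3)
inputs = rng.uniform(size=(50, 2))
rule_centers = np.array([[0.2, 0.2], [0.8, 0.8]])
d2 = ((inputs[:, None, :] - rule_centers[None, :, :]) ** 2).sum(axis=2)
w = np.exp(-d2 / (2.0 * 0.3 ** 2))                 # firing strengths (product of Gaussians)
w_norm = w / w.sum(axis=1, keepdims=True)          # normalized firing strengths
theta_true = np.array([[1.0, 2.0, 0.5], [-1.0, 0.3, 2.0]])
targets = anfis_output(inputs, w_norm, theta_true)
theta_hat = fit_consequents(inputs, targets, w_norm)
```

With noise-free targets generated from known parameters, the estimate recovers those parameters, which is a convenient sanity check before applying the same step to measured data.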

Table 7
Different parameter types and their values used for training ANFIS.

ANFIS parameter type               Value
MF type                            Gauss function
Number of MFs                      9
Output function                    Linear
Number of linear parameters        12
Number of nonlinear parameters     18
Total number of parameters         30
Number of training data pairs      129
Number of checking data pairs      43
Number of testing data pairs       43
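The counts in Table 7 can be cross-checked with simple arithmetic. The check below assumes that the nine Gaussian MFs carry two parameters each (center and width), and that each first-order Sugeno rule has one coefficient per input (LL, A, CEC) plus a constant; under these assumptions the 12 linear parameters correspond to three rules. The rule count is our inference and is not stated in the paper.

```latex
\begin{aligned}
\text{nonlinear (premise) parameters} &= 9 \times 2 = 18,\\
\text{linear (consequent) parameters} &= 3 \times (3 + 1) = 12,\\
\text{total number of parameters} &= 18 + 12 = 30,\\
\text{data pairs} &= 129 + 43 + 43 = 215,\qquad 129/215 = 60\%,\quad 43/215 = 20\%.
\end{aligned}
```

The data split matches the 60/20/20 division of the 215 samples described for the neural network models.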


Fig. 7. Cross-correlation of predicted and observed values of S% for ANFIS model.

Fig. 8. The variation of the values predicted by MR, MLP, RBF and ANFIS models, from the observed values.

where $f_k$ and $f'_k$ are the kth desired and estimated outputs, respectively, and n is the total number of input–output data pairs in the training set.

In this study, a hybrid intelligent system, the adaptive neuro-fuzzy inference system (ANFIS), was also applied to predict S%. ANFIS was trained using Matlab version 7.1 (2005), and the SPSS 10.0.1 (1999) package was used for the RMSE and other statistical calculations. The parameter types and their values used in the ANFIS model are given in Table 7. According to the RMSE, VAF, MAPE and R² values (Table 5) and the cross-correlation between predicted and observed values (Fig. 7), the ANFIS model constructed to predict S% has a high prediction performance.

4. Results

In this paper, the use of multiple regression (MR), artificial neural network (ANN) and adaptive neuro-fuzzy inference system (ANFIS) models for the prediction of the swell percent of soils was described and compared. According to the results of the simple regression analyses, there are statistically meaningful relationships between swell percent and liquid limit, activity and cation exchange capacity. Multiple regression, ANN (MLP and RBF types) and ANFIS models for the prediction of the swell percent were then constructed using three inputs and one output. The results of the present paper can be summarized as follows:

a. The equation obtained from the multiple regression model has a high prediction performance for the swell percent.

b. The ANFIS model gave more reliable predictions of the swell percent than the multiple regression model.

c. The ANN models (MLP and RBF), each with three inputs and one output, were applied successfully to predict the swell percent; they, and the RBF model in particular, gave more reliable predictions than the regression and ANFIS models.

Comparison of the VAF, RMSE and MAPE indices and the coefficient of correlation (R²) for predicting S% showed that the prediction performance of the ANN-RBF model is higher than those of the ANN-MLP, ANFIS and multiple regression models. To show the deviations from the observed values of S%, the differences between the values predicted by each model and the observed values were also calculated and plotted (Fig. 8). These plots indicate that the deviation interval of the values predicted by ANN-RBF (−1.240 to +1.304) is narrower than those of ANN-MLP (−1.535 to +2.117), ANFIS (−2.137 to +2.107) and multiple regression (−2.721 to +1.754).
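The indices used in this comparison can be illustrated with a short sketch. It assumes the commonly used definitions (RMSE as the root of the mean squared residual, VAF as 100 × [1 − var(y − y′)/var(y)], MAPE as the mean absolute relative error in percent, and R² as the squared Pearson correlation between observed and predicted values); these definitions are assumed here and may differ in detail from the ones the authors used. The function name `performance_indices` and the sample numbers are hypothetical and are not data from this study.

```python
import math
from statistics import mean, pvariance


def performance_indices(observed, predicted):
    """Return RMSE, VAF (%), MAPE (%), R^2 and the deviation interval.

    Assumed definitions (not taken from the paper's text):
      RMSE = sqrt(mean((y - y')^2))
      VAF  = (1 - var(y - y') / var(y)) * 100
      MAPE = mean(|y - y'| / |y|) * 100   (observed values must be non-zero)
      R^2  = squared Pearson correlation of y and y'
    The deviation interval is (min, max) of predicted minus observed.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long series with at least 2 values")

    residuals = [o - p for o, p in zip(observed, predicted)]
    n = len(observed)

    rmse = math.sqrt(sum(r * r for r in residuals) / n)
    vaf = (1.0 - pvariance(residuals) / pvariance(observed)) * 100.0
    mape = 100.0 / n * sum(abs(r / o) for r, o in zip(residuals, observed))

    # Squared Pearson correlation, computed from centred sums.
    mo, mp = mean(observed), mean(predicted)
    sxy = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    sxx = sum((o - mo) ** 2 for o in observed)
    syy = sum((p - mp) ** 2 for p in predicted)
    r2 = (sxy * sxy) / (sxx * syy)

    deviations = [p - o for o, p in zip(observed, predicted)]
    return {
        "rmse": rmse,
        "vaf": vaf,
        "mape": mape,
        "r2": r2,
        "deviation_interval": (min(deviations), max(deviations)),
    }


if __name__ == "__main__":
    # Made-up swell percent values, for illustration only.
    s_observed = [2.0, 4.0, 6.0, 8.0]
    s_predicted = [2.5, 3.5, 6.5, 7.5]
    for name, value in performance_indices(s_observed, s_predicted).items():
        print(name, value)
```

Under these definitions a better model has lower RMSE and MAPE, VAF and R² closer to 100% and 1, and a narrower deviation interval, which is the sense in which the models are ranked above.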

5. Conclusions

The accuracies of the ANN and ANFIS models may be regarded as relatively similar. Nevertheless, the constructed ANN models (RBF and MLP) exhibited higher performance than ANFIS and multiple regression for predicting S%. The performance comparison showed that soft computing is a good tool for minimizing the uncertainties in soil engineering projects. The use of soft computing may also provide new approaches and methodologies, and minimize the potential inconsistency of correlations.


As is known, the potential benefits of soft computing models extend beyond high computation rates. The higher performance of the soft computing models stems from their greater robustness and fault tolerance compared with traditional statistical models, because they contain many more processing neurons, each with primarily local connections. The comparison of the RBF and MLP network models indicates the good predictive capability of the RBF model, although the accuracies of the two networks are almost the same. The time taken by RBF was found to be less than that of MLP in this study. A limitation of the RBF model, however, is that it is more sensitive to dimensionality and has greater difficulties when the number of units is large. It appears that the swell percent of soils can be estimated using the proposed empirical relationships and soft computing models. The population of analyzed data in this study is relatively limited. Therefore, the proposed equations and models can be used, with acceptable accuracy, at the preliminary stage of design.

References

Al-Rawas, A. A. (1998). The factors controlling the expansive nature of the soils and rocks of northern Oman. Engineering Geology, 53(3–4), 327–350.
Altun, H., & Gelen, G. (2004). Enhancing performance of MLP/RBF neural classifiers via an multivariate data distribution scheme. In International conference on computational intelligence (ICCI2004), Nicosia, North Cyprus, 24–29, May 2004 (pp. 1–6).
Alvarez, G. M., & Babuska, R. (1999). Fuzzy model for the prediction of unconfined compressive strength of rock samples. International Journal of Rock Mechanics and Mining Sciences, 36, 339–349.
ASTM (1994). Annual book of ASTM standards (ASTM, D-4546), Soil and rock (I): D420–D4914, V. 04.08 (pp. 693–699).
Bache, B. W. (1976). The measurement of cation exchange capacity of soils. Journal of Science Food Agriculture, 27(3), 273–280.
Basma, A. A. (1991). Estimating uplift of foundations due to expansion: A case history. Geotechnical Engineering, 22, 217–231.
Bell, F. G., Cripps, J. C., Culshaw, M. G., & Entwisle, D. (1993). Volume changes in weak rocks: predictions and measurement. In Anagnostopoulos et al. (Eds.), Geotechnical engineering of hard soils-soft rocks, Balkema, Rotterdam (pp. 925–932).
Bell, F. G., & Jermy, C. A. (1994). Building on clay soils which undergo volume changes. Architectural Science Review, 37, 35–43.
BS 1377. (1975). Methods of test for soils for civil engineering purposes. London: British Standards Institution.
Bell, F. G., & Maud, R. R. (1995). Expansive clays and construction, especially of low-rise structures: A viewpoint from Natal, South Africa. Environmental and Engineering Geoscience, 1(1), 41–59.
Burland, J. B. (1984). Building on expansive soils. In 1st national conf. on the science and technology of buildings with special reference to buildings in hot climates, Khartoum, Sudan, Theme lecture (pp. 925–931).
Chen, F. H. (1975). Foundations of expansive soils (280 p.). Amsterdam, The Netherlands: Elsevier.
Christidis, G. E. (1998). Physical and chemical properties of some bentonite deposits of Kimolos Island, Greece. Applied Clay Science, 13(2), 79–98.
Cohen, S., & Intrator, N. (2002). Automatic model selection in a hybrid perceptron/radial network. Information Fusion: Special Issue on Multiple Experts, 3(4), 259–266.
Cohen, S., & Intrator, N. (2003). A study of ensemble of hybrid networks with strong regularization. Multiple Classifier Systems, 227–235.
Dhowian, A., Ruwiah, I., & Erol, A. (1985). The distribution and evaluation of expansive soils in Saudi Arabia. Proceedings of second Saudi engineering conference (Vol. 4, pp. 1969–1990). Dhahran: King Fahd University of Petroleum and Minerals.
Donaldson, G. W. (1969). The occurrence of problem heave and the factors affecting its nature. Proceedings of second international research and engineering conference on expansive clay soils, Texas (pp. 1969). College Station, TX: A&M Press.
Finol, J., Guo, Y. K., & Jing, X. D. (2001). A rule based fuzzy model for the prediction of petrophysical rock parameters. Journal of Petroleum Science and Engineering, 29, 97–113.
Foody, G. M. (2004). Supervised image classification by MLP and RBF neural networks with and without an exhaustively defined set of classes. International Journal of Remote Sensing, 25(15), 3091–3104.
Gokceoglu, C. (2002). A fuzzy triangular chart to predict the uniaxial compressive strength of Ankara agglomerates from their petrographic composition. Engineering Geology, 66, 39–51.
Jang, J. R. (1993). ANFIS: Adaptive-Network-Based Fuzzy Inference System. IEEE Transactions on Systems, Man, and Cybernetics, 23, 665–685.
Jang, J. S. R., & Chuen-Tsai, S. (1995). Neuro-fuzzy modeling and control. Proceeding of IEEE, 83, 378–406.
Kenneth, J., Wernter, S., & MacInyre, J. (2001). Knowledge extraction from Radial Basis Function networks and Multi Layer Perceptrons. International Journal of Computational Intelligence and Applications, 1(3), 369–382.
Lee, C. C. (1990a). Fuzzy logic in control systems: Fuzzy logic controller. I. IEEE Transactions on Systems, Man, and Cybernetics, 20, 404–418.
Lee, C. C. (1990b). Fuzzy logic in control systems: Fuzzy logic controller. II. IEEE Transactions on Systems, Man, and Cybernetics, 20, 419–435.
Loh, W., & Tim, L. (2000). A comparison of prediction accuracy, complexity, and training time of thirty three old and new classification algorithm. Machine Learning, 40(3), 203–238.
Loukas, Y. L. (2001). Adaptive neuro-fuzzy inference system: an instant and architecture-free predictor for improved QSAR studies. Journal of Medical Chemistry, 44, 2772–2783.
Matlab 7.1 (2005). Software for technical computing and Model-Based Design. The MathWorks Inc.
Simpson, P. K. (1990). Artificial neural system-foundation, paradigm, application and implementation. New York: Pergamon Press.
Singh, T. N., Kanchan, R., Verma, A. K., & Singh, S. (2003). An intelligent approach for prediction of triaxial properties using unconfined uniaxial strength. Mining Engineering Journal, 5, 12–16.
Sposito, G. (1989). The chemistry of soils (277 p.). Oxford University Press.
SPSS 10.0.1 (1999). Statistical analysis software (Standard Version). SPSS Inc.
Venkatesan, P., & Anitha, S. (2006). Application of a radial basis function neural network for diagnosis of diabetes mellitus. Current Science, 91(9), 1195–1199.
Yilmaz, I. (2006). Indirect estimation of the swelling percent and a new classification of soils depending on liquid limit and cation exchange capacity. Engineering Geology, 85(3–4), 295–301.
Yilmaz, I. (2008). A case study for mapping of spatial distribution of free surface heave in alluvial soils (Yalova, Turkey) by using GIS software. Computers and Geosciences, 34(8), 993–1004.
Yilmaz, I., & Yüksek, A. G. (2008). An example of artificial neural network application for indirect estimation of rock parameters. Rock Mechanics and Rock Engineering, 41(5), 781–795.
Yilmaz, I., & Yüksek, A. G. (2009). Prediction of the strength and elasticity modulus of gypsum using multiple regression, ANN, ANFIS models and their comparison. International Journal of Rock Mechanics and Mining Sciences, 46(4), 803–810.
