2006 International Joint Conference on Neural Networks Sheraton Vancouver Wall Centre Hotel, Vancouver, BC, Canada July 16-21, 2006

Modified Fuzzy-CMAC Networks with Clustering-based Structure

Geraldo Souza Reis Júnior and Paulo E. M. Almeida

Abstract — This work proposes a modified structure for the Fuzzy-CMAC network to address the curse of dimensionality, observed when systems with a high number of variables are modeled with the aid of computational intelligence techniques. The approach is based on the fuzzy C-means clustering algorithm, which is used here to initialize the CMAC fuzzy input partitions. In addition, the number of Fuzzy-CMAC internal memories is drastically decreased by using the clusters defined by the fuzzy C-means algorithm. During the network training phase, those fuzzy input partitions are adjusted together with the output layer weights of the network by means of the backpropagation algorithm. To assess the effectiveness of the modified Fuzzy-CMAC network structure, the thermal modeling of a real-world tube laminating process with nine input variables was accomplished. The obtained results are satisfactory when compared with those of classical artificial neural network algorithms such as an Adaptive Neuro-Fuzzy Inference System (ANFIS) network and a multilayer perceptron (MLP) network.


I. INTRODUCTION

CMAC (Cerebellar Model Articulation Controller) networks were originally developed by James Albus [1], [2], based on the functioning of the cerebellum. The original CMAC network can be seen as a large set of multidimensional, interlaced receptive fields with abrupt and finite edges. Any input vector always excites some of the receptive fields existing in the CMAC input layer. The CMAC output for an input vector is the weighted average of the activations of the receptive fields excited by that vector. The weighting of these receptive fields occurs through connection weights inserted between these so-called "input neurons" and linear output neurons [3]. During the training phase of a CMAC network, as the output of the CMAC memories for non-active fields is zero, corrections are applied only to the weights that actually contributed to the current network output. This training algorithm has a local nature, in contrast to the global training performed in multilayer perceptron (MLP) networks, where all weights of every neuron in the network are

This work was developed at the Intelligent Systems Laboratory (LSI) of the Federal Center for Technological Education of Minas Gerais State, Brazil (CEFET-MG). The authors would like to thank the Intelligent Systems Research Group (GPSI) from CEFET-MG for the financial support and technical contributions. G. S. Reis Júnior is with the Serviço Nacional de Aprendizagem Industrial (SENAI/FIEMG), Sete Lagoas, Brazil (e-mail: [email protected]). P. E. M. Almeida has been with the Electrical Engineering Department at CEFET-MG since 1996, where he is an Associate Professor (e-mail: [email protected]).

0-7803-9490-9/06/$20.00/©2006 IEEE

corrected at each training step. This local training is frequently mentioned as a highly advantageous feature, because it allows a very fast rate of error convergence [4], [5]. The algorithm was initially developed for adaptive control of robotic manipulators but, due to its potential, it is used today in several applications, e.g., nonlinear systems modeling. An important drawback of the original CMAC network is that the output of the activation functions that compose the input receptive fields is binary. This implies that a given memory position inside the CMAC either is associated with an input vector, fully contributing to the network output, or is totally disconnected from that input vector. Because of this, the CMAC output often exhibits discontinuities, even when a continuous signal is applied to its inputs [6]. To solve this problem, modified structures for the CMAC network have been proposed, most of them using fuzzy receptive fields in the input layer of the network [5], [6], [7], [8], [9]. Another problem with the original CMAC network is that the number of CMAC memories can become very high, mainly when solving problems with a large number of involved variables. In this work, the use of a clustering algorithm is proposed for the definition of the association rules for the input functions. This technique drastically reduces the number of CMAC memories inside a Fuzzy-CMAC network structure.

This paper is organized as follows. In Section II, the structure of a conventional CMAC network is discussed in detail. In Section III, the Fuzzy-CMAC network structure is presented and the original proposal of this work is discussed. Section IV shows results obtained by the proposed structure for the thermal modeling of a seamless tube laminating process, together with a comparative study between the clustering-based Fuzzy-CMAC network, an ANFIS network and an MLP network. In Section V, final discussions are made.

II. CMAC NEURAL NETWORKS

A CMAC neural network can be described as three mappings from its input to its output: an activation mapping at the input space, which determines the nodes (neurons) activated by the input vector; a second mapping, which generates the output of the CMAC memories based on the activation of the input neurons and on the parameters of linear equations; and a third mapping, which performs a weighted sum of the outputs of the active CMAC memories and generates the network output. The data flow through a CMAC network, as represented in Fig. 1, is described by (1) to (5).


Fig. 1. CMAC neural network structure

As an example, consider an input vector S with real components:

S = [s_1, s_2, \ldots, s_n].  (1)

The first step to calculate the CMAC output is to create a normalized input vector, called S'. This ensures that the internal processing in the network is independent of the range of the input vector components. Equation (2) shows how this normalization is accomplished:

S' = [x_1, \ldots, x_n] = \left[ \mathrm{int}\!\left( r_1 \frac{s_1 - s_{1\min}}{s_{1\max} - s_{1\min}} \right), \ldots, \mathrm{int}\!\left( r_n \frac{s_n - s_{n\min}}{s_{n\max} - s_{n\min}} \right) \right].  (2)

In (2), x_i is the normalized component of S, int(.) executes the floor operation, r_i is the resolution defined for each input, and s_{i\min} and s_{i\max} are respectively the minimum and maximum values found in the dataset for each input s_i. The second step is to map the normalized input to the CMAC memory addresses A. This mapping is done by associating the outputs of the activation functions f_i(x_j) in the input space. In the algorithm, a parameter C determines how many functions f_i(x_j) are simultaneously activated in each dimension of the input space. The output of these functions is equal to 1 for an active receptive field and 0 for a non-active one. The association of these functions is carried out by "AND" operations between receptive fields of different inputs. The active CMAC memories are only those associated with active AND gates, i.e., with input functions whose outputs are unitary. Equation (3) shows how each binary component a_m of the memory vector A is calculated:

a_m = \prod f_i(x_j), \quad i \in [1, k], \; j \in [1, n], \; m = 1, 2, \ldots, k^n.  (3)

In (3), the "AND" operation is implemented by means of a product, n is the number of inputs of the system, k is the number of input functions, and the index m runs from 1 up to the total number of CMAC memories.

During the development of the original algorithm, as the total number of receptive fields in this n-dimensional hyperspace could be very high, the memory addresses represented by A were considered as virtual memories. Albus [1] used a mapping from these virtual memories onto the available physical memory: a hashing function that operates on the components a_m of the virtual memory addresses and on the corresponding receptive fields in the input space, producing uniformly distributed addresses in the physical memory. This mapping between virtual and physical memories is given by:

A' = h(A, M).  (4)

In (4), h(.) represents a hashing function that receives the components a_m and the size M of the physical memory and returns the address A' of the corresponding CMAC memory in the actual memory. Finally, the output y(A') of the CMAC network is obtained by (5):

y(A') = \sum_{i=1}^{M} w_i \cdot a'_i.  (5)
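As an illustration, the mapping (1)-(5) and the delta-rule correction (6), presented next, can be sketched in code. This is a minimal sketch and not the authors' implementation: the toy input ranges, the resolution value and the multiplicative hash standing in for h(.) in (4) are assumptions.

```python
import numpy as np
from itertools import product

def normalize(s, s_min, s_max, r):
    """Eq. (2): quantize each real input into one of r integer cells."""
    return np.floor(r * (s - s_min) / (s_max - s_min)).astype(int)

def binary_activations(x, k):
    """f_i(x_j): 1 for the receptive field that contains the input, 0 otherwise."""
    f = np.zeros((len(x), k))
    f[np.arange(len(x)), np.clip(x, 0, k - 1)] = 1.0
    return f

def virtual_memory(f):
    """Eq. (3): 'AND' (product) of one receptive field per input dimension,
    enumerated over all k^n combinations."""
    n, k = f.shape
    return np.array([np.prod([f[j, idx[j]] for j in range(n)])
                     for idx in product(range(k), repeat=n)])

def cmac_output(a_virtual, w, M):
    """Eqs. (4)-(5): hash each active virtual address into one of M physical
    cells and accumulate the corresponding weights."""
    y, active = 0.0, []
    for m, a in enumerate(a_virtual):
        if a > 0:
            i = (m * 2654435761) % M      # simple stand-in for the hashing h(.)
            y += w[i] * a
            active.append(i)
    return y, active

def albus_update(w, active, y_t, y, alpha=0.1):
    """Eq. (6): correction applied only to the weights of the active memories."""
    for i in active:
        w[i] += alpha * (y_t - y)

# toy usage: two inputs in [0, 1], resolution 4, 32 physical memory cells
s, s_min, s_max = np.array([0.3, 0.7]), np.zeros(2), np.ones(2)
r, k, M = 4, 4, 32
w = np.zeros(M)
x = normalize(s, s_min, s_max, r)
y, active = cmac_output(virtual_memory(binary_activations(x, k)), w, M)
albus_update(w, active, y_t=1.0, y=y)
```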

The training algorithm proposed by Albus was based on the Delta Rule, assuming that network parameter corrections, initiated by an error signal at the output, are equally distributed among all the weights that contributed to the output. The correction equation is shown in (6):

\Delta w_i = \alpha \cdot a'_i \cdot (y_t - y) = \alpha \cdot a'_i \cdot e.  (6)

In (6), α is a learning coefficient that must lie inside the [0, 1] interval, e represents the output error of the network, y_t is the target (i.e., desired) network output, y is the actual network output and a'_i is the activation of the ith CMAC memory for the respective input vector.

III. FUZZY-CMAC NETWORKS

A. Fuzzy-CMAC Mapping

As an alternative to the binary receptive fields of the conventional CMAC network, the use of fuzzy membership functions (MF) to create fuzzy activation functions has been widely adopted by many researchers [7], [8]. In this approach, memory positions activated by associations of input vectors are associated with a real value in the interval [0, 1], coming from each corresponding MF defined in the input space. In the Fuzzy-CMAC structure, the internal layer neurons (or Fuzzy-CMAC memories) can be partially activated, allowing the network output to be smooth, without the undesired discontinuities observed in the conventional CMAC network [6]. The Fuzzy-CMAC network structure is shown in Fig. 2.


Fig. 2. Fuzzy-CMAC neural network structure

In the input layer, a fuzzy membership value is calculated for each membership function (MF) in the input partitions. The function most often used to implement this fuzzy input layer is the Gaussian function:

\mu_{ij} = \exp\left[ -\frac{1}{2} \left( \frac{x_i - c_{ij}}{\sigma_{ij}} \right)^2 \right]; \quad i = 1, 2, \ldots, n \text{ and } j = 1, 2, \ldots, k.  (7)

In (7), c_{ij} and \sigma_{ij} represent respectively the center and the width of the jth activation function (or MF) for the ith input x_i, n is the total number of inputs (i.e., the dimension of the input vector) and k is the total number of activation functions per input. In this structure, the vector A that represents the Fuzzy-CMAC memories no longer receives binary values, but the fuzzy aggregation of the MF values coming from the input layer. This aggregation is carried out through a conjunctive AND operator, which can be implemented through a T-norm combining the MF output values. Equation (8) shows the calculation of the mth component of the memory vector A:

a_m = \prod \mu_{ij}, \quad i \in [1, k], \; j \in [1, n], \; m = 1, 2, \ldots, k^n.  (8)

Now, the network output is calculated as the average of the activation values of the Fuzzy-CMAC memories, weighted by the output connection weights. The network output with respect to the memory vector A is given by:

y(A) = \frac{\sum_{i=1}^{M} a_i w_i}{\sum_{i=1}^{M} a_i}.  (9)

In (9), w_i represents the connection weight between the ith memory and the output y, and M is the total number of Fuzzy-CMAC memories.

B. Fuzzy-CMAC Training Algorithm

Fuzzy-CMAC network parameters are adjusted by means of a gradient descent method, where weights are corrected in the direction opposite to the error gradient. The error gradient is calculated as a function of the quadratic error E, given by:

E = \frac{1}{2} (y_t - y)^2 = \frac{1}{2} e^2.  (10)

The adjustment of each output weight takes into account its actual contribution to the network output, with respect to the activation strength of each memory. The output layer weights are adjusted by:

w_i(p+1) = w_i(p) - \alpha \frac{\partial E(p)}{\partial w_i(p)},  (11)

\frac{\partial E(p)}{\partial w_i(p)} = -(y_t - y) \frac{a_i}{\sum_{j=1}^{M} a_j} = -e \cdot \frac{a_i}{\sum_{j=1}^{M} a_j}.  (12)

In (12), e represents the error between the target output y_t and the actual network output y. The indexes i and j run from 1 to the number M of CMAC memories in the network and p varies from 1 to the number of input/output pairs in the training dataset. For the implementation of a Fuzzy-CMAC network, the designer can also choose to allow adaptation of the input layer parameters, to improve the network performance. In this case, the backpropagation method [10] is used to adjust the MF [5], [8]. For the MF shape given by (7), the Gaussian center c and width σ can be adjusted. Equations (13)-(18) show the calculation of the error gradient with respect to the center c, while equations (19)-(21) show the calculation of the error gradient with respect to the width σ.

\frac{\partial E(p)}{\partial c_{ij}(p)} = \frac{\partial E(p)}{\partial \mu_{ij}(p)} \cdot \frac{\partial \mu_{ij}(p)}{\partial c_{ij}(p)}.  (13)

The second term of the product in (13) is the derivative of the MF with respect to the center c, shown in (14):

\frac{\partial \mu_{ij}(p)}{\partial c_{ij}(p)} = \frac{x_i(p) - c_{ij}(p)}{\sigma_{ij}^2(p)} \cdot \mu_{ij}(p).  (14)

The calculation of the quadratic error derivative with respect to μ is shown in (15) and (16):

\frac{\partial E(p)}{\partial \mu_{ij}(p)} = \frac{\partial E(p)}{\partial y(p)} \cdot \frac{\partial y(p)}{\partial \mu_{ij}(p)},  (15)

\frac{\partial E(p)}{\partial y(p)} = -(y_t(p) - y(p)) = -e(p).  (16)

Multiplying (14) by (15) to obtain (13) brings out the term \mu_{ij} multiplied by the derivative of the output with respect to this same term. For simplification, Equation (17) can be used instead:

\mu_{ij}(p) \cdot \frac{\partial y(p)}{\partial \mu_{ij}(p)} = \frac{1}{\sum_{m=1}^{M} a_m(p)} \cdot \sum_{z} a_z(p) \cdot (w_z(p) - y(p)).  (17)

In (17), z corresponds to the indexes of the CMAC memories to which the MF actually contributes. Substituting the results obtained in (14) through (17) into (13), the final expression for the error gradient with respect to the center c is reached:

\frac{\partial E(p)}{\partial c_{ij}(p)} = -e(p) \cdot \frac{x_i(p) - c_{ij}(p)}{\sigma_{ij}^2(p)} \cdot \frac{1}{\sum_{m=1}^{M} a_m(p)} \cdot \sum_{z} a_z(p) \cdot (w_z(p) - y(p)).  (18)

The calculation of the error gradient with respect to the width σ is shown in the following. Equation (19) shows the first step:

\frac{\partial E(p)}{\partial \sigma_{ij}(p)} = \frac{\partial E(p)}{\partial \mu_{ij}(p)} \cdot \frac{\partial \mu_{ij}(p)}{\partial \sigma_{ij}(p)}.  (19)

In (19), the first term of the product is calculated in the same way as for the error derivative with respect to the center c, because the derivative of the MF with respect to the width σ also has \mu_{ij} as a factor, as can be observed in (20):

\frac{\partial \mu_{ij}(p)}{\partial \sigma_{ij}(p)} = \frac{(x_i(p) - c_{ij}(p))^2}{\sigma_{ij}^3(p)} \cdot \mu_{ij}(p).  (20)

The error gradient with respect to the width σ can then be calculated by:

\frac{\partial E(p)}{\partial \sigma_{ij}(p)} = -e(p) \cdot \frac{(x_i(p) - c_{ij}(p))^2}{\sigma_{ij}^3(p)} \cdot \frac{1}{\sum_{m=1}^{M} a_m(p)} \cdot \sum_{z} a_z(p) \cdot (w_z(p) - y(p)).  (21)

This training algorithm, based on the gradient descent method, exhibits convergence problems with higher order error surfaces, which may contain local minima. To prevent the algorithm from getting stuck at a local minimum, a momentum term is used. The momentum term accumulates the history of weight adjustments over the iterations, resulting in a residual term when the current gradient is null. This residual term forces the weights to jump over existing local minima when the error gradient is null, depending on the momentum constant β being used. Equations (22) and (23) show the output layer weight correction rule using the momentum term:

\Delta w_i(p) = -\eta \cdot \frac{\partial E(p)}{\partial w_i(p)} \cdot (1 - \beta) + \beta \cdot \Delta w_i(p-1),  (22)

w_i(p+1) = w_i(p) + \Delta w_i(p).  (23)

Equations (24) through (27) show the parameter correction algorithm for the input layer using the momentum term:

\Delta c_{ij}(p) = -\alpha \cdot \frac{\partial E(p)}{\partial c_{ij}(p)} \cdot (1 - \beta) + \beta \cdot \Delta c_{ij}(p-1),  (24)

c_{ij}(p+1) = c_{ij}(p) + \Delta c_{ij}(p),  (25)

\Delta \sigma_{ij}(p) = -\alpha \cdot \frac{\partial E(p)}{\partial \sigma_{ij}(p)} \cdot (1 - \beta) + \beta \cdot \Delta \sigma_{ij}(p-1),  (26)

\sigma_{ij}(p+1) = \sigma_{ij}(p) + \Delta \sigma_{ij}(p).  (27)
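To make the forward pass (7)-(9) and the parameter corrections (11)-(21) concrete, a minimal sketch is given below. It is an illustrative reconstruction under stated assumptions (Gaussian MF, product T-norm, momentum terms (22)-(27) omitted); class and variable names are not from the paper.

```python
import numpy as np
from itertools import product

class FuzzyCMAC:
    def __init__(self, n_inputs, k_mf, seed=0):
        rng = np.random.default_rng(seed)
        self.c = rng.uniform(0.0, 1.0, (n_inputs, k_mf))   # MF centers c_ij
        self.s = np.full((n_inputs, k_mf), 0.2)             # MF widths sigma_ij (assumed)
        self.assoc = list(product(range(k_mf), repeat=n_inputs))  # k^n fuzzy associations
        self.w = np.zeros(len(self.assoc))                  # output weights

    def memberships(self, x):
        """Eq. (7): Gaussian membership of each input in each MF."""
        return np.exp(-0.5 * ((x[:, None] - self.c) / self.s) ** 2)

    def activations(self, mu):
        """Eq. (8): product T-norm of one MF per input for every association."""
        return np.array([np.prod([mu[i, j] for i, j in enumerate(idx)])
                         for idx in self.assoc])

    def output(self, a):
        """Eq. (9): activation-weighted average of the output weights."""
        return np.dot(a, self.w) / np.sum(a)

    def train_step(self, x, y_t, alpha=0.05):
        mu = self.memberships(x)
        a = self.activations(mu)
        a_sum = np.sum(a)
        y = np.dot(a, self.w) / a_sum
        e = y_t - y
        # Eqs. (11)-(12): output weight correction
        self.w += alpha * e * a / a_sum
        # Eqs. (13)-(21): center and width corrections by gradient descent
        for i in range(self.c.shape[0]):
            for j in range(self.c.shape[1]):
                # memories to which MF (i, j) contributes (index z in (17)-(21))
                z = [m for m, idx in enumerate(self.assoc) if idx[i] == j]
                common = e * np.sum(a[z] * (self.w[z] - y)) / a_sum
                self.c[i, j] += alpha * common * (x[i] - self.c[i, j]) / self.s[i, j] ** 2
                self.s[i, j] += alpha * common * (x[i] - self.c[i, j]) ** 2 / self.s[i, j] ** 3
        return e

# toy usage: two inputs, three MF per input (3^2 = 9 memories)
net = FuzzyCMAC(n_inputs=2, k_mf=3)
for _ in range(100):
    net.train_step(np.array([0.2, 0.8]), y_t=0.5)
```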

C. Fuzzy-CMAC Network with Clustering-based Structure

The implementation of a Fuzzy-CMAC network using all possible combinations for the input mapping, as suggested by (8), restricts the use of this network to applications with a small number of inputs. This occurs because the number of fuzzy associations between the MF defined inside the fuzzy input partitions grows exponentially with the number of input variables. This problem is known as the "curse of dimensionality" and has been discussed by many researchers as the most limiting factor to the dissemination of computational intelligence techniques for solving high-dimensional practical problems in the industrial and production areas. To make it possible to employ Fuzzy-CMAC networks in practical problems with a high number of involved variables, a clustering algorithm is used here to define the important associations between the input variables. Each cluster has the same dimension as an input vector. Only the associations corresponding to the clusters generated by this algorithm are actually implemented as fuzzy associations in the resulting Fuzzy-CMAC network. As will be shown later in Section IV, after applying this technique the number of fuzzy associations necessary to generate the Fuzzy-CMAC memory contents changes from an exponential to an arithmetical scale. This same process has been used before to reduce the number of rules in fuzzy systems and ANFIS networks, as reported in [11] and [12]. Here, it is used for the first time to define the cluster centers of the fuzzy input MF in Fuzzy-CMAC neural networks, by means of the Fuzzy C-Means (FCM) algorithm [13].


D. Fuzzy C-Means Algorithm

The FCM algorithm divides a group of objects, represented by a vector X, into fuzzy groups, in such a way that each object has a membership value associated with its distance to the centers c_i of the existing groups. A matrix U of dimension k x n, where k is the total number of groups and n is the total number of objects to be clustered, defines the relationships between groups and objects. Each element u_ij of the matrix U corresponds to the degree of relevancy of an object in relation to a specific center c_i, and lies inside the real interval [0, 1]. To determine the cluster centers, a cost function J, defined by (28), must be minimized [11]:

J(U, c_1, \ldots, c_c) = \sum_{i=1}^{c} \sum_{j=1}^{n} u_{ij}^{m} d_{ij}^{2}.  (28)

In (28), d_{ij} is the Euclidean distance between the center c_i and the object j, and m is a weighting exponent with value larger than 1. The necessary conditions to minimize the cost function are given in (29) and (30):

c_i = \frac{\sum_{j=1}^{n} u_{ij}^{m} x_j}{\sum_{j=1}^{n} u_{ij}^{m}},  (29)

u_{ij} = \left[ \sum_{k=1}^{c} \left( \frac{d_{ij}}{d_{kj}} \right)^{2/(m-1)} \right]^{-1}.  (30)

This algorithm was implemented in successive iterations by the following steps:
1. The matrix U is initialized with random values between 0.0 and 1.0, in such a way that the sum of the membership values for each object equals 1.0;
2. All cluster centers c_i, i = 1, 2, ..., c, are calculated using (29);
3. The cost function J is calculated according to (28);
4. The matrix U is updated using (30);
5. Return to step 2.

If, for two consecutive iterations, the improvement in the objective function is smaller than a specified minimum, or the maximum number of iterations is reached, the algorithm stops. Consider that the cluster centers produced by the above algorithm are organized in a matrix C, as shown in (31):

C = \begin{bmatrix} c_{11} & c_{12} & \cdots & c_{1n} \\ c_{21} & & & \vdots \\ \vdots & & & \vdots \\ c_{k1} & \cdots & \cdots & c_{kn} \end{bmatrix}.  (31)

In the matrix C, each row corresponds to a cluster center and each column corresponds to an input variable (i.e., an input vector component) used in the definition of the clusters. Returning to the Fuzzy-CMAC neural network structure, the input variable partitions are formed by fuzzy membership functions, whose centers are defined by the values associated with the corresponding variable in each cluster. This way, each column of C is used to define the partition of the variable corresponding to the column position in the matrix. For example, each component of the second column of C is used as the center c of an MF for the second input variable, as defined in (7). The number of functions for each input is the same as the number of clusters. To calculate the Fuzzy-CMAC memories, the MF values of each input partition that belong to the same cluster are associated together by a fuzzy AND operator. In other words, the outputs of the functions located in the same row of C are associated together to generate a Fuzzy-CMAC memory. This way, the number of memories in the network is equal to the number of defined clusters. Equation (8) has its indexes modified as shown in (32):

a_m = \prod \mu_{ij}, \quad i \in [1, k], \; j \in [1, n], \; m = k^n \;\; \Rightarrow \;\; a_i = \prod_{j=1}^{n} \mu_{ij}, \quad i = 1, 2, \ldots, k.  (32)
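A compact sketch of this clustering procedure is given below. It follows (28)-(30) and steps 1-5 above; the tolerance, iteration cap and variable names are assumptions rather than values from the paper.

```python
import numpy as np

def fcm(X, k, m=2.0, max_iter=100, tol=1e-5, seed=0):
    """X: (n_objects, n_features). Returns centers C (k, n_features) and memberships U (k, n_objects)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((k, n))
    U /= U.sum(axis=0, keepdims=True)                      # step 1: memberships sum to 1 per object
    J_prev = np.inf
    for _ in range(max_iter):
        Um = U ** m
        C = (Um @ X) / Um.sum(axis=1, keepdims=True)       # step 2, eq. (29)
        d = np.linalg.norm(X[None, :, :] - C[:, None, :], axis=2)  # Euclidean distances d_ij
        d = np.fmax(d, 1e-12)                              # avoid division by zero
        J = np.sum(Um * d ** 2)                            # step 3, eq. (28)
        U = 1.0 / np.sum((d[:, None, :] / d[None, :, :]) ** (2.0 / (m - 1.0)), axis=1)  # step 4, eq. (30)
        if abs(J_prev - J) < tol:                          # stopping criterion
            break
        J_prev = J
    return C, U

# example call (training_inputs is a hypothetical (n_objects, n_inputs) array):
# C, U = fcm(training_inputs, k=5)
```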

To use the FCM algorithm in the modified Fuzzy-CMAC structure in a reliable way, the training dataset must be significantly large and should cover most of the input universe, so that the network can exhibit a good generalization capacity. Convergence of the FCM algorithm to the best cluster grouping cannot be guaranteed, due to the random initialization of the membership values inside the matrix U. However, when it is used together with an adaptive neural network algorithm, an optimal adjustment is not necessary, as those same parameters are adjusted again during the network training phase. Equations (18) and (21) of the backpropagation algorithm have the index z equal to one, as each fuzzy MF contributes to only one Fuzzy-CMAC memory. This way, the clustering-based structure proposed here simplifies the overall implementation of the training algorithm and drastically reduces the number of Fuzzy-CMAC memories in the network. For example, for a system with nine inputs and five MF per input, a conventional Fuzzy-CMAC network would have 5^9 combinations, resulting in 1,953,125 memories and making the computational effort to process the network very high. Using the clustering-based structure, the number of memories is the same as the number of MF, resulting in only 5 memories.
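The sketch below illustrates how the FCM centers initialize the clustering-based structure: column j of C gives the MF centers for input j, and each row (cluster) defines one fuzzy association, so the memory count drops from k^n to k as in (32). The width initialization value is an assumption.

```python
import numpy as np

n_inputs, k_clusters = 9, 5
print(k_clusters ** n_inputs)      # 1,953,125 memories in the conventional structure
print(k_clusters)                  # 5 memories in the clustering-based structure

def build_cbf_cmac(C, sigma0=0.1):
    """C: (k, n) FCM cluster centers. Returns per-input MF centers/widths and k output weights."""
    centers = C.T                              # centers[i, j] = center of MF j for input i
    widths = np.full_like(centers, sigma0)     # assumed initial widths, adjusted later by training
    weights = np.zeros(C.shape[0])             # one output weight per cluster/memory
    return centers, widths, weights

def cbf_activations(x, centers, widths):
    """Eq. (32): one memory per cluster, obtained by the product T-norm along the cluster's row."""
    mu = np.exp(-0.5 * ((x[:, None] - centers) / widths) ** 2)  # eq. (7)
    return np.prod(mu, axis=0)                 # product over the n inputs, one activation per cluster
```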


IV. APPLICATION OF THE CLUSTERING-BASED FUZZY-CMAC TO THERMAL MODELING OF A TUBE LAMINATING PROCESS

In a laminating process, seamless steel tubes are manufactured from compact steel blocks. These tubes are used by the automotive and petroliferous industries, among others. For a better understanding of this productive process, a flowchart of the continuous tube laminating process at Vallourec & Mannesmann Brazil Corporation (VMB) is presented in Fig. 3. The extending-reducing rolling mill, which was modeled using the Clustering-Based Fuzzy-CMAC (CBF-CMAC) algorithm proposed here, is marked with an ellipse in the diagram.

Fig. 3. Laminating process flowchart (adapted from [14]).

As raw material for the lamination process, bulk steel blocks are used. These blocks are heated inside a rotating oven to a high temperature. After the heating phase, the warm block is perforated by an oblique rolling mill. After perforation, the resulting material is processed by the reducing rolling mill and the continuous rolling mill. In the next stage, the material is heated again in the reheating oven and, after that, it is finally reprocessed in the extending-reducing rolling mill. After this last rolling mill, the tube acquires its final dimensions and proceeds to a cooling bed and a cutting phase. In the production mix at VMB, various types of tubes with diverse chemical and physical characteristics are produced. The thermal variation of the tubes throughout this rolling mill exerts a fundamental influence on the permanent mechanical properties of the tubes. The comprehension of the thermal phenomena involved in this productive system is extremely important for the determination of operational strategies, which improve the efficiency of the process as well as the quality of the final products. That said, it was very important for VMB to be able to predict the output temperature of a tube coming out of the extending-reducing rolling mill. With this information in hand, it would be possible to calibrate set-points at the reheating oven and to predict the mechanical properties of the tubes being produced. In other words, a thermal model of the process would be very useful to ensure final product quality control and an adequate use of resources and energy.

The CBF-CMAC network was proposed as a mathematical tool to function as a unique thermal model covering the whole production mix of VMB and to perform such output temperature predictions, based on process input variables and intrinsic tube properties. For the definition of the variables that could influence the tube output temperature after the rolling mill, a huge historical database of the production process at VMB was used. By means of the statistical method of stepwise regression [15]-[17], nine variables from the dataset were chosen as the most important for the determination of the output temperature. These variables were selected from a set of sixteen variables existing in the production dataset, and are related to tube dimensions, tube chemical composition and rolling mill operation modes, among others. The resulting variables are listed in Table I.

TABLE I: SELECTED INPUT VARIABLES FOR THE THERMAL MODEL

Nº | Variable Description
1  | Planned total length for the tube
2  | Planned thickness for the tube internal wall
3  | Calculated tube internal wall after the rolling mill
4  | Length of the pipe before the rolling mill
5  | Planned external diameter for the tube
6  | Number of chairs used in the rolling mill
7  | Tube input temperature before the rolling mill
8  | Pipe external diameter before the rolling mill
9  | Average chromium percentage of the tube

A. Experimental Results

The performance index used for the verification of the resulting models is the mean square error (MSE) between the actual tube output temperature, acquired at the output of the rolling mill, and the output temperature predicted by the thermal models. The MSE is obtained by:

MSE = \frac{1}{N} \sum_{p=1}^{N} e(p)^2.  (33)

In (33), e represents the error between the actual tube output temperature and the output temperature predicted by the model, and N is the number of input/output pairs used in the calculation. Table II shows the average MSE and standard deviation (SD) for ten training sessions of 5,000 epochs each, collected for seven different structures of the CBF-CMAC network. As the network output layer weights were randomly initialized inside the real interval [-1.0, +1.0], each neural network structure training phase was carried out more than ten times to ensure the statistical validity of the experimental results. Network adaptation was accomplished with the employment of cross-validation techniques. At each training session, a dataset of about 15,000 input/output pairs was applied for network parameter adaptation (the training dataset) and another dataset of the


same size was used for network validation (the validation dataset), i.e., for the verification of the generalization properties of the trained network. Using this methodology, an adaptation session terminates when the generalization error increases consecutively for a certain number of training epochs. This means that, from that point on, the network is no longer capturing reliable relationships present in the whole dataset, but has started to memorize specific relations of the training dataset. As can be seen in Table II, the best result was obtained by the structure configured with thirty MF for each input variable. This network presented an average MSE of 30.38 °C² with the training data and an average MSE of 30.84 °C² with the generalization data. The output temperature of the rolling mill has an average value of 841.1 °C, calculated for the whole production mix at VMB. The MSE values obtained by the models are equivalent to 3.61% and 3.67% of this average temperature. This structure had 570 parameters to be adjusted during the training phase.
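The cross-validation stopping rule described above can be sketched as follows. The patience value and the interface names (train_step, predict, snapshot) are assumptions used only for illustration, not the authors' implementation.

```python
def train_with_early_stopping(net, train_set, val_set, max_epochs=5000, patience=20):
    """Stop when the validation MSE has increased for `patience` consecutive epochs."""
    best_mse, best_state, bad_epochs = float("inf"), None, 0
    for epoch in range(max_epochs):
        for x, y_t in train_set:
            net.train_step(x, y_t)                      # one pass over the training dataset
        val_mse = sum((y_t - net.predict(x)) ** 2 for x, y_t in val_set) / len(val_set)  # eq. (33)
        if val_mse < best_mse:
            best_mse, best_state, bad_epochs = val_mse, net.snapshot(), 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:                  # generalization error rising consecutively
                break
    return best_state, best_mse
```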

TABLE II: MSE FOR TRAINING SESSIONS PERFORMED WITH CBF-CMAC NETWORKS

Nº of MF per Input Variable | Mean MSE (°C²), Training Data | SD of MSE, Training Data | Mean MSE (°C²), Validation Data | SD of MSE, Validation Data
5  | 56.17 | 2.54 | 56.89 | 2.52
10 | 42.83 | 4.27 | 43.54 | 4.72
15 | 36.84 | 3.08 | 37.28 | 3.00
20 | 33.76 | 2.58 | 34.21 | 2.60
25 | 32.95 | 1.03 | 33.32 | 0.94
30 | 30.38 | 1.40 | 30.84 | 1.30
35 | 30.52 | 0.71 | 30.91 | 0.60

Fig. 4 shows the performance of a CBF-CMAC network thermal model applied to a small validation (i.e., unknown) dataset of five hundred tubes. This model had a structure of thirty MF per input variable and presented an MSE of 30.63 °C², very close to the average MSE value obtained for this kind of structure in Table II.

Fig. 4. Performance of a CBF-CMAC network trained as a thermal model of the VMB tube laminating process. (Solid line: measured output temperature; circles: network predicted output temperature)

Table III shows the results obtained by the same CBF-CMAC network structures as used for the results of Table II, but applied to a third dataset used as a generalization test. It is easy to perceive that the final performance is similar to that observed with the training and validation datasets.

TABLE III: MSE FOR TRAINING SESSIONS PERFORMED WITH CBF-CMAC NETWORKS FOR TEST DATA

Nº of MF per Input Variable | Mean MSE (°C²), Test Data
5  | 57.89
10 | 43.78
15 | 37.50
20 | 34.65
25 | 33.62
30 | 31.14
35 | 31.25

To confirm the promising results just presented, a comparative study was accomplished to evaluate the results obtained by the CBF-CMAC network, by an MLP network and by an ANFIS network while modeling the exact same tube laminating process. The MLP networks were configured with ninety neurons in the hidden layer and the chosen training algorithm was backpropagation with momentum. The ANFIS networks had twenty-five MF per input variable and the chosen training algorithm was also backpropagation with momentum, for the adjustment of both the antecedent and the consequent parameters. The membership function shape used for the input mapping in the ANFIS network was the same one used for the Fuzzy-CMAC networks, and the ANFIS MF were also initialized with the FCM algorithm. Table IV presents the results of these three classes of models in terms of the average MSE obtained by the networks and the number of parameters adjusted in each network during the training phases. These results correspond to ten sessions of 5,000-epoch training phases for each network, for statistical validation purposes.

TABLE IV: COMPARATIVE RESULTS BETWEEN CBF-CMAC, MLP AND ANFIS NETWORKS

ANN Topology | Number of adjusted parameters | Mean MSE (°C²), Training Data | Mean MSE (°C²), Validation Data
CBF-CMAC | 570 | 30.38 | 30.84
MLP      | 900 | 62.95 | 67.80
ANFIS    | 700 | 32.57 | 33.01

In Table IV, the mean MSE for ten training sessions is shown. Among these results, the best average MSE (about 31 °C²) was obtained by the CBF-CMAC network, with similar results obtained by the ANFIS network. The MLP network had an inferior performance compared to the other two networks. The best MSE results over ten simulations with the training and generalization datasets were again obtained by the CBF-CMAC network: 27.49 °C² and 28.37 °C², respectively. For the ANFIS network, the best observed results were 31.29 °C² for the training set and 31.78 °C² for the generalization dataset. Finally, the best MLP network results in the experiments were 58.95 °C² for the training set and 63.32 °C² for the generalization dataset.

V. CONCLUSION

The implementation of conventional Fuzzy-CMAC networks using all possible combinations of fuzzy input associations is limited to problems with a small number of variables. This occurs because the number of such combinations grows exponentially with the number of input variables and fuzzy MF per input variable. However, the use of clustering algorithms to reduce the number of fuzzy input associations allows the definition of a much higher number of fuzzy input MF, without an excessive increase in the computational cost to train and to process the resulting network. That is exactly what the CBF-CMAC network approach proposed in this article does. The use of this modified Fuzzy-CMAC network to model the thermal behavior of a tube laminating process was shown to be quite viable from the computational effort viewpoint, while presenting very satisfactory results in predicting output tube temperatures based on input characteristics. The obtained results showed good generalization properties for the experimented datasets. From those experimental results it is clear that the CBF-CMAC network approach has good potential for the modeling of similar static processes. The process used in the experiments had a high number of involved variables, showing that this approach minimizes the problems and limitations related to the curse of dimensionality. The performance of the CBF-CMAC networks showed to be superior to the MLP and very similar to the ANFIS networks while modeling the tube laminating process at VMB. Nevertheless, the number of adjusted parameters in the case of the ANFIS networks was 23% higher than that of the CBF-CMAC networks, indicating better representation capabilities for the latter approach. The CBF-CMAC network seems to be a very attractive option for systems engineers searching for reliable and effective tools for system modeling tasks.

ACKNOWLEDGMENTS

The authors would like to thank Vallourec & Mannesmann Brazil Corporation for the cession of the operational data used in the experiments and for describing the production techniques, industrial equipment and operating modes, which made possible the development of this research.

REFERENCES

[1] J. S. Albus, "A new approach to manipulator control: The cerebellar model articulation controller," Trans. ASME, J. Dyn. Syst., Meas., Control, vol. 97, pp. 220-227, Sept. 1975.
[2] J. S. Albus, "Data storage in the cerebellar model articulation controller," Trans. ASME, J. Dyn. Syst., Meas., Control, vol. 97, pp. 228-233, Sept. 1975.
[3] A. G. Evsukoff, P. E. M. Almeida, "Sistemas Neuro Fuzzy," in S. O. Rezende (Ed.), Sistemas Inteligentes: Fundamentos e Aplicações, Barueri: Manole, pp. 203-224, 2003.
[4] M. R. G. Meireles, P. E. M. Almeida, M. G. Simões, "A comprehensive review for industrial applicability of artificial neural networks," IEEE Transactions on Industrial Electronics, vol. 50, no. 3, pp. 585-601, Jun. 2003.
[5] D. Kim, S. Kang, "A design of CMAC-based FLC with fast learning and accurate approximation," Proc. IEEE International Fuzzy Systems Conference, Seoul, vol. 3, pp. 1476-1481, 1999.
[6] P. E. M. Almeida, M. G. Simões, "Parametric CMAC networks: Fundamentals and applications of a fast convergence neural structure," IEEE Transactions on Industrial Electronics, vol. 39, no. 5, pp. 1551-1557, Oct. 2003.
[7] H.-R. Lai, C.-C. Wong, "A fuzzy CMAC structure and learning method for function approximation," Proc. IEEE International Fuzzy Systems Conference, Melbourne, pp. 436-439, 2001.
[8] K. Zhang, F. Qian, "Fuzzy CMAC and its applications," in Proc. 3rd World Congress on Intelligent Control and Automation, Hefei, pp. 944-947, 2000.
[9] W. Shitong, L. Hongjun, "Fuzzy system and CMAC network with B-spline membership/basis functions can approximate a smooth function and its derivatives," International Journal of Computational Intelligence and Applications, no. 3, pp. 265-279, 2003.
[10] D. E. Rumelhart, G. E. Hinton, R. J. Williams, "Learning representations by back-propagating errors," Nature, vol. 323, pp. 533-536, 1986.
[11] J.-S. R. Jang, C.-T. Sun, E. Mizutani, Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence, Prentice Hall, 1997.
[12] C. C. Berni, "Implementação em Hardware/Firmware de um Sensor Virtual Utilizando um Algoritmo de Identificação Nebulosa," M.Sc. dissertation, Dept. Eng., University of São Paulo, São Paulo, 2003.
[13] J. C. Bezdek, "Fuzzy Mathematics in Pattern Classification," Ph.D. thesis, Applied Math. Center, Cornell University, Ithaca, 1973.
[14] V & M do Brasil S.A., Technical Catalogue from the Corporative Management Section, VMB, 2001.
[15] G. M. Almeida, S. W. Park, M. Cardoso, "Modelagem Neural do Vapor Gerado pela Caldeira de Recuperação Assistida por Técnicas de Seleção de Variáveis," in Proc. 8th Brazilian Symposium on Neural Networks (SBRN 2004), São Luís, 2004.
[16] D. R. Anderson, D. J. Sweeney, T. A. Williams, Modern Business Statistics with Microsoft Excel, Mason: South-Western, 2003.
[17] StatSoft, Electronic Statistics Textbook, Tulsa, OK: StatSoft, 2004. Available: http://www.statsoft.com/textbook/stathome.html