Neurocomputing 145 (2014) 381–391
An evolving fuzzy neural predictor for multi-dimensional system state forecasting

De Z. Li a, Wilson Wang b,*, Fathy Ismail a

a Department of Mechanical and Mechatronics Engineering, University of Waterloo, Waterloo, ON, Canada N2L 3G1
b Department of Mechanical Engineering, Lakehead University, Thunder Bay, ON, Canada P7B 5E1

* Corresponding author. Tel.: +1 807 766 7174. E-mail addresses: [email protected] (D.Z. Li), [email protected] (W. Wang), [email protected] (F. Ismail).

http://dx.doi.org/10.1016/j.neucom.2014.05.014
0925-2312/© 2014 Elsevier B.V. All rights reserved.
Article info

Article history: Received 20 June 2013; received in revised form 6 May 2014; accepted 9 May 2014; available online 2 June 2014. Communicated by Wei Chiang Hong.

Abstract
In many applications of system state forecasting, the prediction is performed using multi-dimensional data sets. The traditional methods for dealing with multi-dimensional data sets have some shortcomings, such as a lack of nonlinear correlation modeling capability (e.g., for vector autoregressive moving average (VARMA) models), and an inefficient linear correlation modeling mechanism (e.g., for generic neural fuzzy systems). To tackle these problems, an evolving fuzzy neural network (eFNN) predictor is proposed in this paper to extract representative information from multi-dimensional data sets for more accurate system state forecasting. In the proposed eFNN predictor, linear correlations among multi-dimensional data sets are captured by a VARMA filter, while nonlinear correlations of the data sets are modeled by a fuzzy network scheme, whose fuzzy rules are generated adaptively using a novel evolving algorithm. The proposed predictor possesses online learning capability and can address non-stationary properties of data sets. The effectiveness of the proposed eFNN predictor is verified by simulation tests. It is also implemented for induction motor system state prognosis. Test results show that the proposed eFNN predictor can capture the dynamic properties involved in the multi-dimensional data sets effectively, and track system characteristics accurately. © 2014 Elsevier B.V. All rights reserved.
Keywords: Multi-dimensional data sets; Evolving fuzzy neural network; System state prognosis; Multiple-step-ahead forecasting; Induction motors
1. Introduction

Multi-dimensional system state forecasting is a complex and important research and development area, which aims to predict future states of a dynamic system based on past observations from multiple sources (e.g., sensors). The classical approaches to multi-dimensional data set forecasting are mainly based on analytical modeling, such as vector autoregressive (VAR) models and vector autoregressive moving average (VARMA) models [1]. These classical analytical models can describe the underlying relationships among multi-dimensional data sets to extrapolate future states of a dynamic system, and have been used in forecasting applications such as electricity load demand [2,3] and economic indicators [4,5]. The VAR/VARMA models, however, can only predict the linear correlations among multi-dimensional data sets; they are unable to efficiently characterize nonlinear correlations (e.g., those related to impulses and transients). Comparing VAR with VARMA, the former estimates future states of a system based only on its past multi-dimensional observations, while the latter deploys both past
multi-dimensional observations and past multi-dimensional innovations for system state prediction; thus VARMA can provide more comprehensive linear modeling than VAR. Since a long autoregressive (AR) process can be represented by a compact moving average (MA) process, the dimension of the parameter space may be further reduced by VARMA, especially for data sets involving long AR characteristics [6]. Accordingly, VARMA will be used in this work to filter out linear correlations among multi-dimensional data sets.

An alternative approach to multi-dimensional data set modeling is the use of soft-computing tools, such as neural networks (NNs) [7,8,17,18,26–29] and neural fuzzy (NF) systems [9,10]. An NF scheme is usually superior to NNs in mimicking human reasoning processes and extracting knowledge as interpretable IF–THEN rules. Although an NF scheme can track the nonlinear correlations among multi-dimensional data sets, it may not capture linear correlations efficiently because of its complex modeling nature. Moreover, the performance of an NF predictor with a fixed network architecture cannot be guaranteed when system properties vary significantly in applications (e.g., equipment just after repair and maintenance), and/or when new information is provided (e.g., from a new sensor) [11,12]. In recent years, more work has been focused on evolving NF paradigms that can adaptively adjust their network structures in response to new system conditions (i.e., data sets).
Kasabov et al. proposed evolving fuzzy NN models (EFuNN) [13–15] and the dynamic evolving NF inference system (DENFIS) technique [16] for applications such as learning, knowledge acquisition, and time-series forecasting. These evolving paradigms employ clustering algorithms to adaptively tune the model structure and parameters. Because they treat both the linear and nonlinear correlations in a data set with the fuzzy NN (nonlinear modeling), however, they may increase the computational burden, especially when nonlinear models are used to capture linear correlations in data of large size and dimension. Another way to model both linear and nonlinear characteristics of data sets for system state forecasting is hybrid modeling. For example, Medeiros et al. proposed a neural coefficient smooth transition autoregressive model for time series forecasting [24]. Khashei et al. integrated NNs and ARMA models to conduct time series prediction [25]. One remarkable merit of the hybrid modeling strategy is its capacity for dealing with non-stationary data sets: the linear, non-stationary components can be captured by a linear modeling method, and the nonlinear components characterized by a nonlinear modeling technique [19]. However, these hybrid methods lack the ability to characterize multi-dimensional data sets; moreover, they cannot adapt their reasoning structures to new system conditions in real time (online), and consequently the model structure may be suboptimal.

To tackle the aforementioned challenges, the objective of this work is to develop a new evolving fuzzy neural network (eFNN) technique for the prognosis of complex dynamic systems with multi-dimensional data sets. Compared with EFuNN and DENFIS, the proposed eFNN applies a different approach to processing the linear and nonlinear correlations in a data set: a compact VARMA filter models the linear properties, while an evolving fuzzy network models the nonlinear correlations. With this approach the system structure becomes more transparent, which facilitates system training for optimization and error tracking. The method's novelty lies in the following aspects: (1) the developed eFNN predictor applies both linear and nonlinear modeling strategies to characterize properties of multi-dimensional data sets; (2) a novel cumulative clustering algorithm is proposed to evolve fuzzy reasoning rules for nonlinear modeling; and (3) the developed eFNN predictor is implemented for real-world applications such as the forecasting of currency exchange rates, as well as induction motor (IM) system state prognosis.

The remainder of this paper is organized as follows: The proposed eFNN predictor and the proposed adaptive clustering algorithm are discussed in Section 2. In Section 3, the effectiveness
of the proposed eFNN predictor is examined by simulation tests; the new predictor is also implemented for IM system state prognosis. Some concluding remarks are summarized in Section 4.
2. The evolving fuzzy neural network

As stated in Section 1, although both VAR and VARMA models can catch the linear (but not nonlinear) correlations among multi-dimensional data sets, the VARMA is selected in this work for its more efficient generalization and compactness in modeling multi-dimensional data sets. Although NNs can be pre-trained by the available multi-dimensional data sets to track system characteristics (mainly nonlinear correlations), they are inefficient in tracking linear correlations among multi-dimensional data sets. To properly tackle these modeling problems, an eFNN predictor is proposed in this section to provide a more efficient tool for the prognosis of complex systems with multi-dimensional data sets involving both linear and nonlinear correlations.

2.1. Architecture of the proposed eFNN predictor

Fig. 1 describes the network architecture of the proposed eFNN predictor. It is a six-layer feed-forward network.

Fig. 1. Architecture of the eFNN predictor. (Layers 1–6: the input vectors [x_{1,t}, x_{1,t-s}, …, x_{1,t-(p-1)s}] through [x_{m,t}, x_{m,t-s}, …, x_{m,t-(p-1)s}] feed a VARMA block that produces the linear prediction Y_L and the errors φ_1, …, φ_m; these pass through the MF nodes B_{1,1}, …, B_{n,m} and the TS nodes L_1, …, L_r to the output Y, with Y_D the desired output.)

Layer 1 is the input layer. Each input [x_{i,t}, x_{i,t-s}, …, x_{i,t-(p-1)s}] represents a vector from the ith data set with time lags 0 to p-1 (in units of the time-step s); p is the dimension of the input data vector; i = 1, 2, …, m; and m is the dimension of the multiple data sets, i.e., the number of inputs to the eFNN predictor. In forecasting applications, there exist both linear and nonlinear correlations between the target data set and these available data sets.

Layer 2 performs VARMA filtering to model the linear correlations among the data sets in each dimension. The classic VARMA model generates an output vector which contains m entries; the ith entry corresponds to the predicted value of the ith dimensional data set, i = 1, 2, …, m. In terms of the ith entry in the output vector, the VARMA filter can be expressed as

$$Y_L = \Theta_1 \begin{bmatrix} x_{1,t} \\ x_{2,t} \\ \vdots \\ x_{m,t} \end{bmatrix} + \Theta_2 \begin{bmatrix} x_{1,t-s} \\ x_{2,t-s} \\ \vdots \\ x_{m,t-s} \end{bmatrix} + \cdots + \Theta_p \begin{bmatrix} x_{1,t-(p-1)s} \\ x_{2,t-(p-1)s} \\ \vdots \\ x_{m,t-(p-1)s} \end{bmatrix} + \Lambda_1 \begin{bmatrix} \varphi_{1,t} \\ \varphi_{2,t} \\ \vdots \\ \varphi_{m,t} \end{bmatrix} + \Lambda_2 \begin{bmatrix} \varphi_{1,t-s} \\ \varphi_{2,t-s} \\ \vdots \\ \varphi_{m,t-s} \end{bmatrix} + \cdots + \Lambda_q \begin{bmatrix} \varphi_{1,t-(q-1)s} \\ \varphi_{2,t-(q-1)s} \\ \vdots \\ \varphi_{m,t-(q-1)s} \end{bmatrix} \qquad (1)$$
where $\Theta_k = [\theta_{k,1}\; \theta_{k,2}\; \cdots\; \theta_{k,m}]$ are the linear AR parameters (k = 1, 2, …, p), and $\Lambda_l = [\lambda_{l,1}\; \lambda_{l,2}\; \cdots\; \lambda_{l,m}]$ are the linear MA parameters (l = 1, 2, …, q). $Y_L$ is the predicted value of the ith dimensional data set using VARMA. The linear filtering output $Y_L$ is forwarded to the output node in Layer 6. $\varphi_{i,j}$ is the linear filtering error of the ith dimensional data set at time instance j. To conduct an s-step-ahead forecasting, $\varphi_{i,j}$ can be determined as $\varphi_{i,j} = Y_D - Y_L$, where $Y_D$ and $Y_L$ are the desired and estimated values of the VARMA filter in the ith dimensional data set at time instant j, respectively. To simplify the representation, the linear estimation (or filtering) errors $\{\varphi_{1,t}, \varphi_{2,t}, \ldots, \varphi_{m,t}\}$ are written as $\{\varphi_1, \varphi_2, \ldots, \varphi_m\}$; these are the inputs in Layer 3.

Layer 4 is the fuzzy rule layer. Gaussian functions are selected as membership functions (MFs), $B_{j,i}$, to formulate the fuzzy operation. Given the input $\Phi = [\varphi_1, \varphi_2, \ldots, \varphi_m]$, the firing strengths $\eta_j$ can be derived by using the fuzzy product T-norm

$$\eta_j = \exp\left(-\frac{1}{2}\frac{\|\Phi - \mu_j\|^2}{s_j^2}\right) = \exp\left(-\frac{1}{2}\frac{(\varphi_1-\mu_{j,1})^2 + (\varphi_2-\mu_{j,2})^2 + \cdots + (\varphi_m-\mu_{j,m})^2}{s_j^2}\right) = \prod_{i=1}^{m} B_{j,i}(\varphi_i) \qquad (2)$$

and

$$B_{j,i}(\varphi_i) = \exp\left(-\frac{1}{2}\frac{(\varphi_i-\mu_{j,i})^2}{s_j^2}\right) \qquad (3)$$
where $\mu_j = [\mu_{j,1}, \mu_{j,2}, \ldots, \mu_{j,m}]$ is the center of the jth cluster, which will be derived using a clustering algorithm; the clustering technique will be introduced in Section 2.2. $\mu_{j,i}$ (j = 1, 2, …, n, and i = 1, 2, …, m) and $s_j$ are the respective center and spread of the Gaussian MF $B_{j,i}$. n is the number of nodes for each input $\varphi_i$.

Layer 5 is the rule layer. Each node in this layer is formulated by

$$L_j = a_{j,1}x_1 + a_{j,2}x_2 + \cdots + a_{j,m}x_m + b_j \qquad (4)$$
where $L_j$ denotes a first-order TS model, in which $a_{j,i}$ (j = 1, 2, …, n; i = 1, 2, …, m) are the linear parameters, and $b_j$ is the bias in $L_j$.

Layer 6 is the output layer. The eFNN output Y is formulated as

$$Y = Y_L + \frac{\sum_{j=1}^{n}\eta_j L_j}{\sum_{j=1}^{n}\eta_j} \qquad (5)$$
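To make the layer-wise computation concrete, the following minimal Python/NumPy sketch evaluates Eqs. (2)–(5) for a single input. All names are illustrative rather than taken from the paper, and the TS consequents are evaluated on the Layer-3 errors φ here (an assumption; Eq. (4) is written over the inputs x).

import numpy as np

def efnn_forward(Y_L, phi, centers, spreads, A, b):
    # Y_L: scalar VARMA prediction from Layer 2 (Eq. (1))
    # phi: (m,) linear filtering errors, the Layer-3 inputs
    # centers: (n, m) cluster centers mu_j; spreads: (n,) spreads s_j
    # A: (n, m) TS coefficients a_{j,i}; b: (n,) TS biases b_j
    eta = np.exp(-0.5 * np.sum((phi - centers) ** 2, axis=1) / spreads ** 2)  # Eq. (2)
    L = A @ phi + b                      # Eq. (4): first-order TS consequents
    return Y_L + eta @ L / np.sum(eta)   # Eq. (5): linear part plus fuzzy correction

For the IM application in Section 3.3, for example, this would run with m = 2 inputs and n = 3 clusters.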
The fuzzy rules in Layers 4–5 are generated by the use of an evolving clustering paradigm that will be discussed in Section 2.2.

2.2. The adaptive clustering algorithm

A cumulative evolving clustering (CEC) algorithm is proposed in this work to adaptively evolve the fuzzy reasoning rules (clusters) represented in Layers 4–5 in Fig. 1. The inputs to the evolving fuzzy (EF) network are the linear estimation error vectors $\Phi = [\varphi_1, \varphi_2, \ldots, \varphi_m]$, and the jth fuzzy rule can be formulated as

$$R_j: \text{IF } (\varphi_1 \text{ is } B_{j,1}) \text{ and } (\varphi_2 \text{ is } B_{j,2}) \text{ and} \ldots \text{and } (\varphi_m \text{ is } B_{j,m}), \text{ THEN } (Z \text{ is } L_h) \qquad (6)$$

where j = 1, 2, …, n; n is the number of fuzzy rules generated; Z is the clustering index; h = 1, 2, …, r; and r is the number of first-order TS models ($r \le n$). The generated jth cluster is an m-dimensional cluster with center $C_j = [C_{j,1}, C_{j,2}, \ldots, C_{j,m}]$ and radius $R_j$. Fig. 2 schematically illustrates the clustering process of the CEC algorithm. The normalized Euclidean distance between the new input Φ and the center of the jth cluster, $C_j$, is defined as

$$d_j = \frac{1}{\sqrt{m}}\|\Phi - C_j\| \qquad (7)$$
Fig. 2. Schematic representation of a cluster j. $C_j$ is the center and $R_j$ is the radius of the jth cluster. $R'_j$ and $R''_j$ are the intermediate and extended radii of the jth cluster, respectively. A, B, D and E represent different states of clustering.
$R'_j$ is the intermediate spread, with $R'_j = \left[\frac{1}{2} + \frac{M - N_j}{2M}\right] R_j$, where M is the total number of input values for clustering, and $N_j$ is the number of input values in cluster j. The CEC clustering processes are listed as follows (a code sketch of these rules is given at the end of this subsection):

Step 1. Initialization: When the first input Φ (i.e., linear estimation error vector) is formulated after VARMA filtering, it becomes the center of the first cluster, $C_1$. The initial radius of this cluster is set as $R_0$ ($R_0$ = 0.01, in this case). The upper boundary of the radius is denoted by $R_U$. If $R_U$ is small, more clusters will be generated, and vice versa.

Step 2. Cluster formulation: If a new input Φ falls in more than one cluster, only the center and radius of the closest cluster will be updated, using the following rules:

(a) If $d_j \le R'_j$ (e.g., state A in Fig. 2), the center of the cluster is updated as

$$C_{j,\mathrm{new}} = \frac{C_{j,\mathrm{old}} N_j + \Phi}{N_j + 1} \qquad (8)$$

where $N_j$ is the number of input values in cluster j.

(b) If $R'_j < d_j \le R_j$ (e.g., state B in Fig. 2), the center and radius of this cluster remain unchanged.

(c) Otherwise, if $R_j < d_j \le (2R_U - 2R'_j)$ (e.g., state D in Fig. 2), the center of the cluster remains unchanged, but the radius $R_j$ is updated as $R_j = R''_j = \frac{1}{2}d_j + R_0$.

(d) If $R_j \le (2R_U - 2R'_j) < d_j$ (e.g., state E in Fig. 2), a new cluster is created. The new input value Φ becomes the center of the new cluster, and the radius is initialized as $R_0$.

(e) If $(2R_U - 2R'_j) < R_j < d_j$, a new cluster is created with the same setting as in (d).

In general, the data with $d_j \le R'_j$ have a higher MF degree and are used to determine the cluster center. The data with $R_j < d_j \le (2R_U - 2R'_j)$ represent the potential spread of the cluster and are used to update the radius $R_j$ of the cluster. If the input data satisfy $R'_j < d_j \le R_j$, the parameters of the cluster remain unchanged.

Step 3. Structure recognition: The center of the jth cluster, $C_j$, will be the center of the jth firing strength (i.e., $\mu_j$ in Eq. (2)). Assume that the center $C_j$ corresponds to the MF degree 100% (i.e., $\eta_j = 1$), and that the input values $\Phi_R$ with $d_j = R_j$ are assigned a MF degree of 1% (i.e., $\eta_j = 0.01$). Substituting $\eta_j = 0.01$ in Eq. (2) yields

$$\exp\left(-\frac{1}{2}\frac{\|\Phi_R - \mu_j\|^2}{s_j^2}\right) = 0.01 \qquad (9)$$
Then inserting Eq. (7) into Eq. (9) gives the following equation:

$$\exp\left(-\frac{1}{2}\frac{m R_j^2}{s_j^2}\right) = 0.01 \qquad (10)$$

The spread $s_j$ can be derived by rearranging Eq. (10):

$$s_j = \frac{\sqrt{m}\, R_j}{\sqrt{-2\ln(0.01)}} \qquad (11)$$
For an input value with $d_j \le R_j$, its MF degree can be derived as

$$\eta_j = \exp\left(-\frac{1}{2}\frac{\|\Phi - \mu_j\|^2}{s_j^2}\right) = \exp\left(-\frac{1}{2}\frac{m d_j^2}{s_j^2}\right) = \exp\left[\left(\frac{d_j}{R_j}\right)^2 \ln(0.01)\right] \qquad (12)$$

Since $d_j/R_j \le 1$ and $\ln(0.01)$ is negative, the MF degree $\eta_j = \exp[(d_j/R_j)^2 \ln(0.01)] \ge 0.01$. When $d_j$ takes the extremely small value (i.e., $d_j = 0$), the MF degree $\eta_j = 1$. Therefore the MF degree $\eta_j$ lies within the range [0.01, 1]. When projecting the m-dimensional cluster onto each dimension, the corresponding MF center and spread are the center $\mu_{j,i}$ and spread $s_j$ of $B_{j,i}$ in Eq. (3), respectively.

The proposed CEC is a constrained evolving algorithm that depends on two factors: the input Φ and the upper boundary $R_U$. If $R_U$ remains constant (the general case), the clusters evolve based only on the input information Φ. The proposed CEC updates the center of a cluster by considering both the previous cluster center position (i.e., Eq. (8)) and the newest input sample; the updated cluster center is thus less sensitive to outliers. The CEC adjusts the cluster center using the evolving mechanism $R'_j = \left[\frac{1}{2} + \frac{M - N_j}{2M}\right] R_j$: new clusters (containing only a few samples) have more opportunities to update their centers using newly accommodated samples, so as to optimize the cluster center position, whereas the centers of clusters with many samples are less sensitive to newly accommodated samples, especially new samples that belong to the cluster but are far from its center. Therefore the cluster center is less affected by outliers.
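The following Python sketch condenses Steps 1–2 (Eqs. (7) and (8) and rules (a)–(e)) into code. It is a minimal illustration; details the paper leaves open, such as whether $N_j$ grows in cases (b)–(c) or how membership in several clusters is resolved beyond picking the closest one, are assumptions noted in the comments.

import numpy as np

R0 = 0.01  # initial cluster radius (Step 1)

def cec_update(Phi, clusters, M, RU):
    # Phi: (m,) new input; clusters: list of dicts {'C': center, 'R': radius, 'N': count}
    # M: total number of inputs seen so far; RU: upper radius boundary
    if not clusters:                      # Step 1: the first input seeds the first cluster
        clusters.append({'C': Phi.copy(), 'R': R0, 'N': 1})
        return clusters
    m = Phi.size
    d = [np.linalg.norm(Phi - c['C']) / np.sqrt(m) for c in clusters]  # Eq. (7)
    j = int(np.argmin(d))                 # only the closest cluster is updated
    c, dj = clusters[j], d[j]
    Rp = (0.5 + (M - c['N']) / (2.0 * M)) * c['R']   # intermediate radius R'_j
    if dj <= Rp:                          # rule (a): move the center, Eq. (8)
        c['C'] = (c['C'] * c['N'] + Phi) / (c['N'] + 1)
        c['N'] += 1                       # assumed: the sample joins the cluster
    elif dj <= c['R']:                    # rule (b): cluster unchanged
        pass
    elif dj <= 2 * RU - 2 * Rp:           # rule (c): extend the radius
        c['R'] = 0.5 * dj + R0
    else:                                 # rules (d)-(e): spawn a new cluster
        clusters.append({'C': Phi.copy(), 'R': R0, 'N': 1})
    return clusters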
2.3. Training strategy

The parameters of the developed eFNN predictor will be optimized by appropriate training, as illustrated in Fig. 3.

Fig. 3. Flowchart of the hybrid training process of the developed eFNN predictor (Initialization → RLSE: VARMA filter parameters → GD & RLSE: EF network parameters → Stop).

To catch linear correlations of the m-dimensional data sets, the parameters in the
VARMA filter are optimized online by the use of the recursive least squares estimate (RLSE) suggested by the authors [21]. A hybrid training strategy is used to optimize the parameters in the EF network (Layers 3–5 in Fig. 1): the gradient descent (GD) algorithm is employed to update the nonlinear parameters in the nodes $B_{j,i}$ in Layer 4, whereas the RLSE is utilized to adaptively tune the linear parameters in $L_j$ in Eq. (4). According to our previous research in system training [20], a hybrid training strategy can reduce the search dimension when compared with a single training method (e.g., the GD), prevent the training from becoming trapped in local optima, and improve its convergence. The specific training processes are summarized as follows:

(1) The parameters in the nodes $L_j$ (j = 1, 2, …, n) are initialized over the interval [0, 1].
(2) The parameters $\theta_{k,i}$ and $\lambda_{l,i}$ (k = 1, 2, …, p; l = 1, 2, …, q; i = 1, 2, …, m) in the VARMA filter are optimized online by using the RLSE.
(3) After training of the VARMA filter parameters, the nonlinear parameters in the nodes $B_{j,i}$ (j = 1, 2, …, n; i = 1, 2, …, m) are optimized by using a GD algorithm, and the linear parameters in $L_j$ are updated adaptively by the RLSE.

In the training process, only one training epoch is needed to update the VARMA filter parameters. After the linear correlation information is filtered out, the estimation error data set $\{\varphi_1, \varphi_2, \ldots, \varphi_m\}$ retains less regular information. This means that fewer clusters are formulated, which simplifies the eFNN predictor structure and speeds up training convergence, as discussed in Section 3.
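As an illustration of the linear-parameter updates in steps (2) and (3), below is the standard recursive least squares recursion in Python. The exact RLSE formulation of [21] (e.g., its initialization or any forgetting factor) may differ, so treat this as a generic sketch rather than the authors' implementation.

import numpy as np

def rlse_step(theta, P, x, y):
    # theta: current linear parameter estimate; P: inverse correlation matrix
    # x: regressor vector (e.g., lagged observations and innovations); y: target
    Px = P @ x
    k = Px / (1.0 + x @ Px)              # gain vector
    theta = theta + k * (y - x @ theta)  # correct by the a priori prediction error
    P = P - np.outer(k, Px)              # rank-one update of the inverse correlation
    return theta, P

In practice, theta can be initialized to zeros and P to a large multiple of the identity (e.g., 10^4 I).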
3. Performance evaluation and applications

3.1. Overview

The effectiveness of the proposed eFNN predictor is verified in this section, first on a forecasting example with a multi-dimensional financial data set, and then by implementation for IM system state prognosis. To simplify the discussion, the developed eFNN predictor using the proposed CEC algorithm is designated as eFNN-CEC. For comparison, the related predictors based on an enhanced fuzzy filtered neural network (EFFNN) [20], an eNF scheme [21], and the DENFIS [13] are employed for testing. The EFFNN predictor is a four-layer feed-forward NN with 20 nodes in each of Layers 1–3 and one output node (i.e., 20–20–20–1). The eNF predictor is an evolving NN with three input nodes. The DENFIS is a data-driven NN using a clustering technique. To evaluate the effectiveness of the proposed evolving CEC algorithm in the eFNN predictor, an evolving clustering method (ECM) suggested in [13] is implemented in the eFNN predictor to replace the CEC algorithm, designated here as eFNN-ECM; that is, the only difference between the eFNN-CEC and the eFNN-ECM is the evolving clustering algorithm. The maximum number of training epochs of the predictors (i.e., EFFNN, eNF, eFNN-ECM, and eFNN-CEC) is set at 1000.

3.2. Simulation example: exchange rate forecasting

In this test, a six-dimensional currency exchange rate data set is used to examine the performance of the proposed eFNN-CEC predictor. The data set consists of daily exchange rates of the Canadian dollar versus the US dollar (the first dimensional data set), European euro versus US dollar (the second dimensional data set), British pound versus US dollar (the third dimensional data set), Australian
dollar versus US dollar (the fourth dimensional data set), Hong Kong dollar versus US dollar (the fifth dimensional data set) and New Zealand dollar versus US dollar (the sixth dimensional data set). All of them were collected simultaneously over the period
from January 1, 2010 to May 31, 2012 [22]. The tests are performed on each dimensional data set. The first one-third of the data set in each dimension is used for training, and the remainder is used for testing.
Fig. 4. Comparison of three-step-ahead forecasting results of daily Canadian dollar/US dollar exchange rate data (exchange rate vs. day). The blue solid line is the real data to estimate; the red dotted line is the forecasting results using different predictors: (a) EFFNN; (b) eNF; (c) DENFIS; (d) eFNN-ECM; and (e) eFNN-CEC.

Fig. 5. Comparison of three-step-ahead forecasting results of daily European euro/US dollar exchange rate data (exchange rate vs. day). The blue solid line is the real data to estimate; the red dotted line is the forecasting results using different predictors: (a) EFFNN; (b) eNF; (c) DENFIS; (d) eFNN-ECM; and (e) eFNN-CEC.
Fig. 6. Comparison of three-step-ahead forecasting results of daily British pound/US dollar exchange rate data (exchange rate vs. day). The blue solid line is the real data to estimate; the red dotted line is the forecasting results using different predictors: (a) EFFNN; (b) eNF; (c) DENFIS; (d) eFNN-ECM; and (e) eFNN-CEC.

Fig. 7. Comparison of three-step-ahead forecasting results of daily Australian dollar/US dollar exchange rate data (exchange rate vs. day). The blue solid line is the real data to estimate; the red dotted line is the forecasting results using different predictors: (a) EFFNN; (b) eNF; (c) DENFIS; (d) eFNN-ECM; and (e) eFNN-CEC.
Figs. 4–9 illustrate the three-step-ahead forecasting performance of these six data sets using the related predictors, and the related results are summarized in Tables 1–6. It is seen that the eNF predictor is superior to the EFFNN predictor in terms of both forecasting accuracy and the running time, because of its evolving
reasoning mechanism. The DENFIS achieves less prediction error than the eNF, as indicated in Tables 1, 3, 4 and 6, but with more clusters generated and hence more running time, because more fuzzy rules are used to capture the data characteristics. In Tables 2 and 5, the eNF outperforms the DENFIS in terms of the number of clusters generated, prediction errors, and running time, because of its more advanced evolving mechanism.
Fig. 8. Comparison of three-step-ahead forecasting results of daily Hong Kong dollar/US dollar exchange rate data (exchange rate vs. day). The blue solid line is the real data to estimate; the red dotted line is the forecasting results using different predictors: (a) EFFNN; (b) eNF; (c) DENFIS; (d) eFNN-ECM; and (e) eFNN-CEC.
The eFNN-related predictors (i.e., eFNN-CEC and eFNN-ECM) yield less forecasting error than those based on the DENFIS, the eNF and the EFFNN, because the eFNN undertakes more efficient linear and nonlinear correlation modeling (e.g., the eFNN-ECM generates 75% and 67% less error than the eNF and the DENFIS, respectively, as demonstrated in Table 1). The error in percentage is calculated as $E(\%) = 100 \times (E_B - E_A)/E_B$, where $E_A$ and $E_B$ represent the MSEs of predictors A and B, respectively.
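As a worked instance of this formula, using the MSE values of Table 1: with the eNF as predictor B ($E_B = 3.723 \times 10^{-4}$) and the eFNN-ECM as predictor A ($E_A = 0.918 \times 10^{-4}$), $E(\%) = 100 \times (3.723 - 0.918)/3.723 \approx 75\%$, matching the figure quoted above.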
Fig. 9. Comparison of three-step-ahead forecasting results of daily New Zealand dollar/US dollar exchange rate data (exchange rate vs. day). The blue solid line is the real data to estimate; the red dotted line is the forecasting results using different predictors: (a) EFFNN; (b) eNF; (c) DENFIS; (d) eFNN-ECM; and (e) eFNN-CEC.
The developed CEC evolving method in the eFNN-CEC is more efficient than the classical ECM in the eFNN-ECM (e.g., the eFNN-CEC generates 42% less error than the eFNN-ECM in Table 4).
Table 1
Forecasting results (three-step-ahead) of Canadian dollar/US dollar exchange rate.

Forecasting schemes   No. of clusters   MSE (×10⁻⁴)   Running time (s)
EFFNN                 20                6.072         27.211
eNF                   6                 3.723         22.524
DENFIS                10                2.797         25.979
eFNN-ECM              3                 0.918         17.872
eFNN-CEC              2                 0.421         16.013
Table 2
Forecasting results (three-step-ahead) of European euro/US dollar exchange rate.

Forecasting schemes   No. of clusters   MSE (×10⁻⁴)   Running time (s)
EFFNN                 20                15.000        27.422
eNF                   8                 4.836         25.835
DENFIS                10                8.843         26.634
eFNN-ECM              4                 3.990         21.261
eFNN-CEC              3                 2.689         18.813
Table 3
Forecasting results (three-step-ahead) of British pound/US dollar exchange rate.

Forecasting schemes   No. of clusters   MSE (×10⁻⁴)   Running time (s)
EFFNN                 20                9.281         26.113
eNF                   8                 4.579         22.492
DENFIS                12                3.656         24.956
eFNN-ECM              3                 3.382         17.319
eFNN-CEC              2                 3.141         16.593
Table 4
Forecasting results (three-step-ahead) of Australian dollar/US dollar exchange rate.

Forecasting schemes   No. of clusters   MSE (×10⁻⁴)   Running time (s)
EFFNN                 20                14.000        26.362
eNF                   7                 6.448         24.804
DENFIS                10                3.502         25.462
eFNN-ECM              3                 2.201         16.973
eFNN-CEC              2                 1.279         16.109
Table 5
Forecasting results (three-step-ahead) of Hong Kong dollar/US dollar exchange rate.

Forecasting schemes   No. of clusters   MSE (×10⁻⁸)   Running time (s)
EFFNN                 20                4.436         27.594
eNF                   8                 0.926         25.481
DENFIS                10                2.321         26.687
eFNN-ECM              4                 0.883         17.521
eFNN-CEC              2                 0.738         15.596
Table 6
Forecasting results (three-step-ahead) of New Zealand dollar/US dollar exchange rate.

Forecasting schemes   No. of clusters   MSE (×10⁻⁴)   Running time (s)
EFFNN                 20                8.463         26.524
eNF                   7                 3.951         24.479
DENFIS                8                 3.332         24.941
eFNN-ECM              4                 1.411         17.893
eFNN-CEC              2                 1.141         16.416
Fig. 10. The IM experimental setup: (1) tested IM; (2) speed controller; (3) gearbox; (4) load system; (5) current sensors; (6) data acquisition system; and (7) computer.
On the other hand, the eFNN-CEC formulates the fewest clusters in these tests because of its linear filtering operation and effective clustering algorithm, and therefore takes the least running time. The eFNN-CEC predictor thus outperforms the other predictors in capturing and tracking the dynamic behaviors of the underlying systems in this test.

3.3. Application example: induction motor system state prognosis

The developed eFNN-CEC predictor is implemented for IM system state prognosis. System state prognosis is an important strategy for equipment health condition monitoring [23]. An efficient predictor is very helpful in estimating an IM's dynamic characteristics for system state prognosis and performance control. Fig. 10 shows the experimental setup used in this test. The speed of the tested 3-phase IM is controlled by a speed controller (VFD-B from Delta Electronics) with output frequency 0.1–400 Hz. A magnetic particle clutch (PHC-50 from Placid Industries) is used as a dynamometer for external loading; its torque range is 1–41 N·m. The IM used for the tests is made by Marathon Electric. A gearbox (Boston Gear 800) is used to adjust the speed ratio of the dynamometer. A Quanser Q4 data acquisition board is used for data acquisition.

This test forecasts future states of phase current signals that will be used for IM health condition monitoring. During the test, phase current signals are collected at a sampling frequency of 10 kHz. Two current signals are used to form a two-dimensional data set for IM system state forecasting in this case. The first dimensional data set is a current residual signal from phase 1, obtained by filtering out supply frequency components, and the second dimensional data set is a current signal from phase 2. The first 300 data points are used for training, and the remaining 600 data points are used for testing.

Figs. 11 and 12 show the four-step-ahead forecasting results of the signal residual and the current signal, respectively, using the related predictors; the results are summarized in Tables 7 and 8. The unit of the induction motor current signal is the Ampere (A); after calculating the MSE of the predicted values, the unit becomes A², which is used in Tables 7 and 8. It can be seen that the eNF outperforms the EFFNN, with 29% less error in Table 7. The DENFIS generates more clusters and takes longer running time than the eNF, but the DENFIS is more accurate because more fuzzy rules are used to model the data characteristics. The eFNN predictors outperform the predictors based on the EFFNN, the eNF and the DENFIS because the eFNN employs both linear and nonlinear modeling mechanisms. Moreover, the eFNN-CEC predictor creates the fewest clusters (only three in this case) compared to the eFNN-ECM (four to five clusters), the eNF (six to eight clusters) and the DENFIS (12–13 clusters).
Fig. 11. Comparison of four-step-ahead forecasting results of the signal residual of an IM (IM current vs. time sample step). The blue solid line is the real data to estimate; the red dotted line is the forecasting results using different predictors: (a) EFFNN; (b) eNF; (c) DENFIS; (d) eFNN-ECM; and (e) eFNN-CEC.
From Tables 7 and 8, it is clear that the proposed eFNN-CEC predictor provides the highest forecasting accuracy (in terms of MSE) when compared with the eFNN-ECM, eNF, DENFIS and EFFNN predictors. It can be seen from Table 8 that the eFNN-CEC generates 43% less error than the second-best predictor, the eFNN-ECM. The eFNN-CEC predictor can catch the dynamic behavior of the tested IM system quickly and accurately.
Fig. 12. Comparison of four-step-ahead forecasting results of the current signal of an IM (IM current vs. time sample step). The blue solid line is the real data to estimate; the red dotted line is the forecasting results using different predictors: (a) EFFNN; (b) eNF; (c) DENFIS; (d) eFNN-ECM; and (e) eFNN-CEC.
The IM data form a two-dimensional data set. When the proposed eFNN-CEC technique is used, the residual IM data after VARMA filtering can be represented as $\Phi = [\varphi_1, \varphi_2]$ in Eq. (2), which is then fed to the EF network for nonlinear modeling. The proposed CEC technique is used to generate clusters from the input patterns Φ, so as to adaptively construct the structure of the EF network as well as the Gaussian MFs $B_{j,i}$ in Eq. (3).
Table 7
Forecasting results (four-step-ahead) of the signal residual of an IM (from phase 1).

Forecasting schemes   No. of clusters   MSE (A²)   Running time (s)
EFFNN                 20                0.021      41.593
eNF                   8                 0.015      38.988
DENFIS                13                0.011      40.124
eFNN-ECM              5                 0.008      25.553
eFNN-CEC              3                 0.002      20.704
Table 8
Forecasting results (four-step-ahead) of the current signal of an IM (from phase 2).

Forecasting schemes   No. of clusters   MSE (A²)   Running time (s)
EFFNN                 20                0.018      39.051
eNF                   6                 0.017      34.129
DENFIS                12                0.008      37.478
eFNN-ECM              4                 0.007      20.732
eFNN-CEC              3                 0.004      16.046
Fig. 13. The distribution of (a) input patterns φ_1 and (b) input patterns φ_2, associated with the corresponding Gaussian MFs, for the signal residual of an IM (membership degree vs. φ). The blue solid line represents the distribution of the input patterns; the red dotted line represents the Gaussian MFs.

Fig. 14. The distribution of (a) input patterns φ_1 and (b) input patterns φ_2, associated with the corresponding Gaussian MFs, for the current signal of an IM (membership degree vs. φ). The blue solid line represents the distribution of the input patterns; the red dotted line represents the Gaussian MFs.
Table 9
The parameters of the trained Gaussian MFs for the signal residual of an IM.

Parameters          μ_{j,1} (Fig. 13a)   μ_{j,2} (Fig. 13b)   s_j
Cluster 1 (j = 1)   0.254                0.048                0.124
Cluster 2 (j = 2)   0.734                0.082                0.139
Cluster 3 (j = 3)   0.303                0.016                0.197
Table 10
The parameters of the trained Gaussian MFs for the current signal of an IM.

Parameters          μ_{j,1} (Fig. 14a)   μ_{j,2} (Fig. 14b)   s_j
Cluster 1 (j = 1)   0.288                0.028                0.107
Cluster 2 (j = 2)   0.333                0.001                0.111
Cluster 3 (j = 3)   0.077                0.211                0.117
The distributions of the input patterns Φ and the corresponding trained Gaussian MFs are shown in Figs. 13 and 14. The parameters $\mu_{j,1}$, $\mu_{j,2}$ and $s_j$ of the Gaussian MFs in Eq. (3), corresponding to Figs. 13 and 14, are given in Tables 9 and 10, respectively. From Figs. 13 and 14, it is seen that the three derived Gaussian MFs can capture the distribution of the input patterns effectively. Compared to other predictors that generate more clusters (or Gaussian MFs), the proposed eFNN-CEC can conduct accurate prediction with less running time, as demonstrated in Tables 7 and 8.
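As a check, substituting the cluster-1 parameters of Table 9 into Eq. (3) gives the first MF for the signal residual, $B_{1,1}(\varphi_1) = \exp\left(-(\varphi_1 - 0.254)^2/(2 \times 0.124^2)\right)$, which peaks at 1 when $\varphi_1$ equals the trained center 0.254 and decays with distance from it; the red dotted curves in Figs. 13 and 14 are such Gaussians.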
4. Conclusion

An evolving fuzzy neural network (eFNN) predictor has been developed in this work for multi-dimensional system state forecasting. It integrates the advantages of both the VARMA filter and nonlinear network modeling approaches in dealing with multi-dimensional data sets. A novel evolving clustering algorithm, the CEC, is proposed to adaptively generate fuzzy reasoning clusters and adjust the eFNN network structure. The effectiveness of the proposed eFNN predictor and the new clustering algorithm has been verified by simulation using a multi-dimensional financial data set. The new predictor has also been implemented for induction motor (IM) system state prognosis. Test results have shown that the developed eFNN predictor is an accurate forecasting tool, and can capture the dynamic behavior of the tested system quickly and accurately. The CEC is also an effective evolving technique for network structure formulation and improvement.

Acknowledgment

This work is supported in part by the Natural Sciences and Engineering Research Council of Canada (NSERC) and eMech Systems Inc.
References

[1] H. Lütkepohl, New Introduction to Multiple Time Series Analysis, Springer, New York, 2005.
[2] N. Haldrup, F.S. Nielsen, M. Nielsen, A vector autoregressive model for electricity prices subject to long memory and regime switching, Energy Econ. 32 (5) (2010) 1044–1058.
[3] F.L. Joutz, G.S. Maddala, R.P. Trost, An integrated Bayesian vector autoregression and error correction model for forecasting electricity consumption and prices, J. Forecast. 14 (3) (1995) 287–310.
[4] C. Vargas-Silva, The effect of monetary policy on housing: a factor-augmented vector autoregression (FAVAR) approach, Appl. Econ. Lett. 15 (10) (2008) 749–752.
[5] C. Kascha, K. Mertens, Business cycle analysis and VARMA models, J. Econ. Dyn. Control 33 (2) (2009) 267–282.
[6] K.W. Hipel, Time Series Modeling of Water Resources and Environmental Systems, Elsevier, Netherlands, 1994.
[7] W. Yan, Toward automatic time-series forecasting using neural networks, IEEE Trans. Neural Netw. Learn. Syst. 23 (7) (2012) 1028–1039.
[8] N.K. Roy, W.D. Potter, D.P. Landau, Polymer property prediction and optimization using neural networks, IEEE Trans. Neural Netw. 17 (4) (2006) 1001–1014.
[9] Y. Lin, J. Chang, C. Lin, Identification and prediction of dynamic systems using an interactively recurrent self-evolving fuzzy neural network, IEEE Trans. Neural Netw. Learn. Syst. 24 (2) (2013) 310–321.
[10] S. Yilmaz, Y. Oysal, Fuzzy wavelet neural network models for prediction and identification of dynamical systems, IEEE Trans. Neural Netw. 21 (10) (2010) 1599–1609.
[11] P.A. Fishwick, Neural network models in simulation: a comparison with traditional modeling approaches, in: Proceedings of the Winter Simulation Conference, 1989, pp. 702–710.
[12] Z. Tang, C. Almeida, P.A. Fishwick, Time series forecasting using neural networks vs. Box–Jenkins methodology, Simulation 57 (5) (1991) 303–310.
[13] N. Kasabov, Evolving fuzzy neural networks for supervised/unsupervised online knowledge-based learning, IEEE Trans. Syst. Man Cybern. B 31 (6) (2001) 902–918.
[14] N. Kasabov, Evolving Connectionist Systems, Springer, London, 2007.
[15] N. Kasabov, J. Kim, M. Watts, A. Gray, FuNN/2 – a fuzzy neural network architecture for adaptive learning and knowledge acquisition in multi-modular distributed environments, Inf. Sci. Appl. 101 (3–4) (1997) 155–175.
[16] N.K. Kasabov, DENFIS: dynamic evolving neural-fuzzy inference system and its application for time-series prediction, IEEE Trans. Fuzzy Syst. 10 (2) (2002) 144–154.
[17] X. Cai, N. Zhang, G.K. Venayagamoorthy, D.C. Wunsch II, Time series prediction with recurrent neural networks trained by a hybrid PSO-EA algorithm, Neurocomputing 70 (2007) 2342–2353.
[18] G. Inoussa, H. Peng, J. Wu, Nonlinear time series modeling and prediction using functional weights wavelet neural network-based state-dependent AR model, Neurocomputing 86 (2012) 59–74.
[19] T. Taskaya-Temizel, K. Ahmad, Are ARIMA neural network hybrids better than single models?, in: Proceedings of the International Joint Conference on Neural Networks, 2005, pp. 3192–3197.
[20] D. Li, W. Wang, F. Ismail, Enhanced fuzzy-filtered neural networks for material fatigue prognosis, Appl. Soft Comput. 13 (1) (2013) 283–291.
[21] W. Wang, D. Li, J. Vrbanek, An evolving neuro-fuzzy technique for system state forecasting, Neurocomputing 87 (2012) 111–119.
[22] W. Antweiler, Database Retrieval System (v2.15), University of British Columbia, February 2, 1996. 〈http://fx.sauder.ubc.ca/data.html〉 (accessed 02.10.12).
[23] W. Wang, An enhanced diagnostic system for gear system monitoring, IEEE Trans. Syst. Man Cybern. B 38 (1) (2008) 102–112.
[24] M.C. Medeiros, A. Veiga, A hybrid linear–neural model for time series forecasting, IEEE Trans. Neural Netw. 11 (6) (2000) 1402–1412.
[25] M. Khashei, M. Bijari, A novel hybridization of artificial neural networks and ARIMA models for time series forecasting, Appl. Soft Comput. 11 (2) (2011) 2664–2675.
[26] X. Wang, L. Ma, B. Wang, T. Wang, A hybrid optimization-based recurrent neural network for real-time data prediction, Neurocomputing 120 (2013) 547–559.
[27] R. Chandra, M. Zhang, Cooperative coevolution of Elman recurrent neural networks for chaotic time series prediction, Neurocomputing 86 (2012) 116–123.
[28] J. Zhao, X. Zhu, W. Wang, Y. Liu, Extended Kalman filter-based Elman networks for industrial time series prediction with GPU acceleration, Neurocomputing 118 (2013) 215–224.
[29] F. Liu, J. Wang, Fluctuation prediction of stock market index by Legendre neural network with random time strength function, Neurocomputing 83 (2012) 12–21.
De Z. Li received his B.Sc. degree in Electrical Engineering from Shandong University, Jinan, China, in 2008, and his M.Sc. degree in Control Engineering from Lakehead University, Thunder Bay, ON, Canada, in 2010. From 2010 to 2011, he was a Research Associate at Lakehead University. He is currently a Ph.D. candidate in the Department of Mechanical and Mechatronics Engineering at the University of Waterloo. His research interests include signal processing, machinery condition monitoring, mechatronic systems, linear/nonlinear system control and artificial intelligence.
Wilson Wang received his M.Eng. in Industrial Engineering from the University of Toronto, Toronto, ON, Canada, in 1998 and the Ph.D. in Mechatronics Engineering from the University of Waterloo, Waterloo, ON, Canada, in 2002. From 2002 to 2004, he was a Senior Scientist with Mechworks Systems Inc. He joined the faculty of Lakehead University, Thunder Bay, ON, Canada, in 2004, where he is currently a Professor with the Department of Mechanical Engineering. His research interests include signal processing, artificial intelligence, machinery condition monitoring, intelligent control and mechatronics.
Fathy Ismail received the B.Sc. and M.Sc. degrees in Mechanical and Production Engineering, in 1970 and 1974, respectively, from the Alexandria University, Egypt, and the Ph.D. degree from the McMaster University, Hamilton, Ontario, Canada, in 1983. He joined the University of Waterloo, Waterloo, Ontario, Canada, in 1983, and is currently a Professor in the Department of Mechanical and Mechatronics Engineering. He has served as the Chair of the Department and the Associate Dean of the Faculty of Engineering for Graduate Studies. His research interests include machining dynamics, high-speed machining, modeling structures from modal analysis testing, and machinery health condition monitoring and diagnosis.