Reconstruction of chaotic time series by neural models: a case study

Stefania Tronci (a), Massimiliano Giona (b), Roberto Baratti (a,*)

(a) Dipartimento di Ingegneria Chimica e Materiali, Università di Cagliari, Piazza D'Armi, I-09123 Cagliari, Italy
(b) Centro Interuniversitario di Ricerca su Sistemi Disordinati e Frattali nell'Ingegneria Chimica, c/o Dipartimento di Ingegneria Chimica, Università di Roma 'La Sapienza', Via Eudossiana 18, I-00184 Roma, Italy

* Corresponding author. Tel.: +39-0706-7550-56; fax: +39-0706-7550-67. E-mail address: [email protected] (R. Baratti).

Accepted 11 March 2003
Abstract

This work analyses the problems related to the reconstruction of a dynamical system exhibiting chaotic behaviour from the time series associated with a single observable of the system, using a feedforward neural network model. The starting network architecture is obtained by setting the number of input neurons according to Takens' theorem, and is then improved by slightly increasing the number of inputs. The number of hidden neurons is chosen on the basis of the results obtained by testing different net structures. The effectiveness of the method is demonstrated by applying it to the Brusselator system (Phys. Lett. 91 (1982) 263).

© 2003 Elsevier Science B.V. All rights reserved.

Keywords: Chaos; Neural networks; Time series; Reconstruction
1. Introduction

Chaotic phenomena frequently appear in economics, meteorology, chemical processes, biology, hydrodynamics and many other situations. In order to predict and/or control the underlying systems, mathematical models are generally investigated by analyzing the evolution properties of the corresponding equations of motion, and by integrating them in order to predict the future state of the system. Often, our knowledge of the
Nomenclature

C(l)        Grassberger–Procaccia correlation function
C_n(τ)      autocorrelation function
D           attractor dimension
D_2         correlation dimension
d_e         Euclidean distance
d_r         embedding dimension
G^(d_r)     function which generates X_{t+τ} from past measurements
G_1^(d_r)   component of function G^(d_r)
n_1         number of input neurons
n_2         number of hidden neurons
l           distance between two points
T_p         pseudo-period of oscillation
{x_i}       time series of one observable variable
y_1         concentration value of species 1
y_2         concentration value of species 2
y_nn        concentration value of observable species predicted by the neural model
y_b         concentration value of observable species calculated by the original system
ε           mean square error
τ           sampling time
τ_opt       optimal sampling time
system is limited to a recorded numerical time series associated with a single observable (state variable). As a consequence, the future evolution of the system must be inferred from experimental measurements of a single time series.

There are several techniques for modeling and forecasting nonlinear time series, such as the threshold model [26], the exponential model [21], the local linear model [5,6], nearest neighbor regression [24,27], and the feedforward network model [8,16,17,19]. In addition, Taylor series expansion, the radial basis functions of Casdagli [3] and nonparametric kernel regression are other methods for forecasting nonlinear time series. Among these techniques, artificial neural networks are the most recent. This is partly due to some modeling problems encountered in the early stage of their development [16,17]. Recent literature, however, has demonstrated the theoretical foundations of the universality of feedforward networks as function approximators [4,7,12–14]. The advantage of network models over the other methods of time series forecasting is that the number of parameters a feedforward neural model needs grows only linearly with the problem size, whereas traditional polynomial, spline, and trigonometric expansions need exponentially many parameters to achieve the same approximation rate [2].

In this work a chaotic time series reconstruction technique based on a feedforward neural network model is developed. The neural network architecture is made of three
layers: the input layer, the hidden layer, and the output layer. The dynamics of the system is accounted for by enforcing an externally recurrent network: the output of the net becomes the new input for the following integration step. In order to show the practical application of the proposed method, it is applied to the reconstruction of the Brusselator kinetics [23].

2. Feedforward neural network for time series prediction

The following chaotic system is considered:

$$Y_{t+\tau} = g(Y_t), \qquad Y_t \in \mathbb{R}^m, \tag{1}$$
where g: R^m → R^m is a continuous smooth function. Suppose that the system can be observed through a viewer function h: R^m → R^p, p ≤ m, so that what is actually observed is x_t = h(Y_t), x_t ∈ R^p. In the following we assume that p = 1 and X_t = (x_t, x_{t-τ}, ..., x_{t-(d_r-1)τ}), X_t being the state in a d_r-dimensional space. The theorem of Takens [25] states that there exists a d_r-dimensional deterministic map of at most d_r past measurements, sampled at an arbitrary time delay τ, of a temporal series, that allows the correct prediction of its future value, and the prediction is just as good as the one that would be obtained if one had been able to solve the complete system with all its degrees of freedom [11,25]. The d_r-dimensional map governs the evolution of the state X_t, whose trajectory is diffeomorphic to that described by Y_t. In particular, since a diffeomorphism is a one-to-one relationship that preserves topological properties, this means that there exists a map G^(d_r) such that

$$X_{t+\tau} = G^{(d_r)}(X_t). \tag{2}$$
Since x_{t+τ} is the only unknown component of the vector X_{t+τ} = (x_{t+τ}, x_t, ..., x_{t-(d_r-2)τ}), the evolution of the state X_t can be written as

$$x_{t+\tau} = G_1^{(d_r)}(x_t, x_{t-\tau}, \ldots, x_{t-(d_r-1)\tau}), \tag{3}$$

where the function G_1^(d_r) is the first component of the map G^(d_r). The space in which G_1^(d_r), and thus G^(d_r), are defined is the reconstructed space, and its dimension d_r is called the embedding dimension. In Takens' theorem [25] d_r is given by d_r = 2D + 1, D being the dimension of the attractor of the dynamical system (1) from which X_t = (x_t, x_{t-τ}, ..., x_{t-(d_r-1)τ}) is generated. Since G^(d_r) is deterministic, the problem of forecasting the component x_{t+τ} reduces to that of estimating the function G_1^(d_r) (cf. [3,6]).

In this work a feedforward neural network is used to predict the time series. The dynamics is accounted for by including past inputs and target values in an augmented set of inputs. The neural network architecture is made of three layers: the input layer, the hidden layer, and the output layer. The number of neurons of the input layer, n_1, is set equal to the dimension of the reconstructed space d_r = 2D + 1, according to Takens' theorem [25]. This should be the minimum number of components necessary for the time series reconstruction.
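As an illustration of Eq. (3), the input–target pairs used to estimate G_1^(d_r) can be assembled directly from the scalar series. The following is a minimal Python sketch; the function name and the convention of expressing the delay τ as an integer number of sampling intervals are our own, not taken from the paper.

```python
import numpy as np

def delay_embed(x, d_r, tau_steps):
    """Build inputs X_t = (x_t, x_{t-tau}, ..., x_{t-(d_r-1)tau}) and
    one-step-ahead targets x_{t+tau} from a scalar time series.
    tau_steps is the delay tau expressed in sampling intervals."""
    x = np.asarray(x)
    first = (d_r - 1) * tau_steps          # earliest index with a full history
    inputs, targets = [], []
    for t in range(first, len(x) - tau_steps):
        inputs.append([x[t - k * tau_steps] for k in range(d_r)])
        targets.append(x[t + tau_steps])
    return np.array(inputs), np.array(targets)
```

Each row of the input matrix is a reconstructed state X_t, and the corresponding target is the value the network has to predict.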
The embedding theorem does not give any information about the time delay. Indeed, it allows any time delay, as long as one has an infinite amount of infinitely accurate data. This condition is practically unattainable, and the reconstruction heavily depends on the time delay. The proper sampling time is a piece of information that can be extracted from the time series data [1]. Giona [9] reported that the optimal time delay can be calculated as a function of the pseudo-period of oscillation T_p, defined as the average value of the distance between the minima of the autocorrelation function C_n(τ), according to the relation

$$\tau_{\mathrm{opt}} = \frac{T_p}{4(d_r - 1)}. \tag{4}$$
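A possible numerical reading of Eq. (4) is sketched below: the pseudo-period T_p is taken as the mean spacing between successive minima of the sample autocorrelation function. The minima search is our own crude choice, and the result is in sampling intervals, so it must still be multiplied by the sampling time.

```python
import numpy as np

def optimal_delay(x, d_r):
    """Estimate tau_opt = T_p / [4 (d_r - 1)] of Eq. (4), with T_p taken as
    the mean distance between the minima of the autocorrelation function."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    acf = np.correlate(x, x, mode='full')[len(x) - 1:]
    acf /= acf[0]                                   # normalized C_n(tau)
    # crude local-minima search on the sample autocorrelation
    minima = [i for i in range(1, len(acf) - 1)
              if acf[i] < acf[i - 1] and acf[i] < acf[i + 1]]
    T_p = float(np.mean(np.diff(minima)))           # pseudo-period (in samples)
    return T_p / (4.0 * (d_r - 1))
```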
In order to optimize the prediction performance of the net, τ_opt is used as the delay time among the d_r inputs of the network. The neurons of the hidden layer, whose number n_2 depends on the results obtained by testing different net structures, are activated by a sigmoidal function. The same function is used to process the signal going from the hidden layer to the output one.

3. Simulation results

In order to investigate the capability of neural networks to forecast the future state of a chaotic system, the Brusselator kinetics was chosen. The Brusselator model has been studied by several authors [18,23] and is a classical example of chaotic kinetics. It is described by the following equations:

$$\frac{dy_1}{dt} = p_1 - (1 + p_2)\,y_1 + y_1^2 y_2 + p_3\cos(\omega t),$$
$$\frac{dy_2}{dt} = p_2\,y_1 - y_1^2 y_2, \tag{5}$$

with p_1 = 2.00, p_2 = 5.4, p_3 = 0.265, ω = 2.85. The time entering Eq. (5) is dimensionless, and so are all the other parameters. For this system the pseudo-period of oscillation is equal to 4.0.

The correlation dimension D_2 is used here as a practical estimate of the attractor dimension D to calculate d_r [9]. The value of D_2 has been estimated according to the Grassberger and Procaccia algorithm [10] from the scaling of the Grassberger–Procaccia correlation function C_2(l),

$$C_2(l) \approx \lim_{N\to\infty} \frac{1}{N(N-1)} \sum_{i=1}^{N} \sum_{j=i+1}^{N} \Theta(l - |x_i - x_j|) \approx l^{D_2}, \tag{6}$$

where Θ is the Heaviside step function and x_i and x_j are two points of the set whose distance is less than l. D_2 is calculated from Eq. (6) by performing the limit

$$D_2 = \lim_{l\to 0} \frac{\log C_2(l)}{\log l} \tag{7}$$

and is equal to 1.83 for system (5). This leads to d_r = 5 as the dimension of the reconstructed vector field, and τ_opt = 0.3 from Eq. (4). It is important to observe that D_2 ≤ D, and a more accurate estimate of the attractor dimension can be achieved, without performing box counting, through the Lyapunov dimension.
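The computation described above can be reproduced along the following lines: the forced Brusselator (5) is integrated numerically, the observable y_1 is sampled, and the slope of log C_2(l) versus log l is fitted over a scaling region (Eqs. (6)–(7)). The integration settings, initial condition, thinning and scale range below are our own illustrative choices, not values from the paper; the paper reports D_2 = 1.83.

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.spatial.distance import pdist

p1, p2, p3, omega = 2.00, 5.4, 0.265, 2.85    # parameters of Eq. (5)

def brusselator(t, y):
    y1, y2 = y
    return [p1 - (1.0 + p2) * y1 + y1**2 * y2 + p3 * np.cos(omega * t),
            p2 * y1 - y1**2 * y2]

t = np.arange(0.0, 600.0, 0.05)
sol = solve_ivp(brusselator, (t[0], t[-1]), [1.0, 1.0],   # arbitrary start
                t_eval=t, rtol=1e-8, atol=1e-10)
x = sol.y[0]                                   # observable series {x_i} = y1

# delay vectors (delay_embed from the sketch in Section 2), thinned so that
# the pairwise-distance computation stays affordable
X, _ = delay_embed(x[::4], d_r=5, tau_steps=6)
X = X[:2000]

dists = pdist(X)                               # all |x_i - x_j|, i < j
# candidate scaling region; adjust so that all C2 values stay positive
scales = np.logspace(-1.2, 0.0, 12)
C2 = np.array([np.mean(dists < l) for l in scales])   # correlation sum, Eq. (6)
D2 = np.polyfit(np.log(scales), np.log(C2), 1)[0]     # slope, Eq. (7)
print("estimated correlation dimension D2 ~", D2)
```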
Fig. 1. Poincaré maps (y_{1,k+1} vs. y_{1,k}): (a) original system and (b) neural network reconstruction.
The following procedure for the identification of the neural network forecasting the Brusselator kinetics was used. A time series {x_i}, consisting of samples of the observable variable y_1, was generated by a dynamic simulator. The time series was divided into a training subset and a test subset, each consisting of 1000 time entries. In this way, during the validation tests of the neural model, the net processes data that it did not use during training. To train the networks, the Levenberg–Marquardt algorithm was used.

It was verified that the number of hidden neurons has to be greater than, or at least equal to, the number of inputs in order to obtain good prediction performance. The number of hidden neurons was obtained by testing different structures of the net in the range n_1 ≤ n_2 ≤ 2n_1. As a result the net structure (5,8,1) was obtained. The number of hidden neurons has also been verified using the SVD (singular value decomposition) approach suggested by Kanjilal and Banerjee [15]. We considered a (5,18,1) network, which is overparameterized with respect to the model obtained by the heuristic method. Throughout the training, the SVD is performed on the 500 × 18 matrix A, which is formed by 500 data sets of the 18 outputs from the hidden layer stage. The SVD has been performed for different numbers of epochs, and the number of dominant singular values has then been selected by considering the energy content of matrix A, which is equal to the sum of the squared singular values. We verified that a number of hidden neurons equal to 8 leads to a percentage of rejected energy of order 10^-5. Following the guidelines suggested by Kanjilal and Banerjee [15], this result means that 10 units can be eliminated from the candidate network because their contribution to the information is not significant. Of course, in selecting the threshold for the percentage of rejected energy a certain degree of heuristics is retained. It is worth pointing out that the heuristic procedure used for determining the number of hidden nodes gives the same results as those obtained using the other methodology [15]; nevertheless, the proposed procedure is rather simple and does not require excessive computational time.

The obtained network architecture, which is quite simple compared to others proposed in the literature [19], is capable of successfully predicting the future state of the chaotic system, as shown in Fig. 1, where the original attractor (Fig. 1a) is compared with the reconstructed one (Fig. 1b).
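The rejected-energy criterion used above can be sketched as follows; this is our own minimal rendering of the orthogonal-transformation idea of Kanjilal and Banerjee [15], and the matrix A of hidden-layer outputs is assumed to be available from the trained candidate network.

```python
import numpy as np

def effective_hidden_units(A, energy_tol=1e-5):
    """A is the (n_samples x n_hidden) matrix of hidden-layer outputs
    (500 x 18 in the text). Returns the smallest rank r whose rejected
    energy fraction, i.e. the share of sum(s_k**2) left out by keeping
    only the first r singular values, falls below energy_tol."""
    s = np.linalg.svd(A, compute_uv=False)
    energy = s**2                       # energy content = squared singular values
    rejected = 1.0 - np.cumsum(energy) / energy.sum()
    for r, rej in enumerate(rejected, start=1):
        if rej < energy_tol:
            return r
    return len(s)                       # tolerance never reached: keep all units

# e.g. a rank of 8 for the 500 x 18 matrix suggests that 10 of the 18
# candidate hidden units can be eliminated, as found in the text.
```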
Fig. 2. Prediction of the Brusselator time series (y_1 vs. time): (dotted line) original system; (continuous line) neural network reconstruction (5,8,1).
The reconstructed attractor is obtained by evolving the dynamical system over a time interval corresponding to the mean pseudo-period of oscillation, and then assuming the corresponding value of the original dynamics as a new starting point. The results of the forecasting process for the test data set are practically perfect when compared to the real data: the neural model is able to predict the chaotic system up to three cycles, as shown in Fig. 2, where from a time equal to 24 units onwards the neural model is in pure predictive mode, i.e. it does not receive any direct information on the system states. The data used for testing the performance of the neural network model and shown in Fig. 2 had not been processed during the training of the network. This means that the neural model is able to predict the chaotic time series up to three cycles even when processing data it had never seen before. Because the aim of the present study is to model a chaotic system in order to use it in a process model control framework, this result can be considered fairly promising. Anyway, depending on the measurement delay, a wider prediction horizon could be necessary. Fig. 2 shows that, when the prediction time is increased beyond three cycles, the neural network model is no longer capable of tracking the real system. This result could indicate that the dimension D_2 underestimates the attractor dimension D, so that a bigger embedding dimension is necessary for accurately reconstructing the system dynamics.

Starting from the number of inputs calculated by Takens' theorem, a new network structure has been determined. Because only a slight change in the number of delayed measurements is expected, a heuristic approach is again preferred to determine the input structure of the network. It is worth pointing out that other methodologies, like network size reduction by singular value decomposition [15], converge to a number of inputs equal to the embedding dimension. The proposed heuristic strategy consists of solving the optimization problem for different net structures, that is, for different numbers of inputs (n_1 = 1, 2, ..., 10) and hidden neurons (n_1 ≤ n_2 ≤ 2n_1); a sketch of this sweep is given below.
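A possible rendering of the sweep is the following. Since a standard Python stack is assumed, scikit-learn's MLPRegressor stands in for the single-hidden-layer sigmoidal network trained by Levenberg–Marquardt in the paper, and the number of restarts is reduced from the one hundred used in the text; delay_embed is the sketch from Section 2.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def structure_sweep(x, tau_steps, n_restarts=10):
    """For each number of inputs n1 = 1..10 and hidden sizes
    n1 <= n2 <= 2*n1, train from several random starting points and
    record the lowest test mean square error, Eq. (8)."""
    best = {}
    for n1 in range(1, 11):
        X, y = delay_embed(x, d_r=n1, tau_steps=tau_steps)
        n_train = len(X) // 2                   # train/test split as in the text
        errors = []
        for n2 in range(n1, 2 * n1 + 1):
            for seed in range(n_restarts):      # different starting points
                net = MLPRegressor(hidden_layer_sizes=(n2,),
                                   activation='logistic', solver='lbfgs',
                                   max_iter=2000, random_state=seed)
                net.fit(X[:n_train], y[:n_train])
                pred = net.predict(X[n_train:])
                errors.append(np.mean((pred - y[n_train:]) ** 2))   # Eq. (8)
        best[n1] = min(errors)                  # minimum error for fixed n1
    return best
```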
Fig. 3. Error as a function of the number of time-delayed inputs.
The optimization problem was solved for every network architecture, changing the starting point one hundred times, and the resulting neural models were compared by calculating the mean square error

$$\varepsilon = \frac{\sum (y_{nn} - y_b)^2}{N}, \tag{8}$$

where y_nn is the value predicted by the network, y_b is the actual value calculated by the simulator and N is the number of points considered for the test. Fig. 3 shows that the error is quite low for the network architecture (8,13,1). The network structures compared have different numbers of hidden neurons: every point represents the minimum error calculated for a fixed n_1 (n_1 = 1, 2, ..., 10), varying n_2 (n_1 ≤ n_2 ≤ 2n_1). Again, the number of hidden units was verified by singular value decomposition [15], assuming the same rejected-energy percentage of order 10^-5. Starting from an (8,20,1) network, that is an overparameterized neural model with respect to the (8,13,1) network, we again arrived at an (8,13,1) architecture, the one obtained by the heuristic approach.

Fig. 4 shows the forecasting results obtained with the neural architecture (8,13,1) when the starting point is exactly known, that is, when the array y = (y_t, y_{t-τ}, ..., y_{t-(d_r-1)τ}) calculated at t = t_0 is available and is not corrupted by noise. In this case the neural network (8,13,1), starting at the same point as the real system, is able to give a good prediction of the chaotic attractor.

4. Remarks

The results shown in the previous section are good as long as the initial conditions, that is, the vector of the delayed measurements, are perfectly known. This is not a common situation in practical engineering applications, where generally the original system and the reconstructed one have to be synchronized. Fig. 5 shows the behaviour of the chaotic neural model when only the measurement at time t is known. Because the delayed measurements at times t − τ, t − 2τ, ..., t − 7τ, where τ = 0.3 as defined above, are not available, the components of the starting point are all set equal to the value of y at time t.
Fig. 4. Prediction of the Brusselator time series (y_1 vs. time): (dotted line) original system; (continuous line) neural network reconstruction (8,13,1).
Fig. 5. Synchronization of the Brusselator system and the neural network (8,13,1) (y_1 vs. time): (dotted line) original system; (continuous line) neural network reconstruction.
Fig. 5 shows that the synchronization of the two systems is obtained in eight time steps, by reading at every τ the exact value of the concentration from the test data set; a sketch of this procedure is given below.

Up to this point we have not considered the presence of noise. Indeed, we trained the network assuming that the data set was not contaminated by other sources, following the procedure generally proposed by other authors [9,15,19]. We are aware that this is an idealized situation, because measurements are always corrupted by noise. Anyway, the aim of this work is to assess the capabilities of neural networks for forecasting chaotic time series and to find a strategy for the selection of the optimal network structure, avoiding the complicated question, for which a general answer is not available [22], of how to distinguish the deterministic component of a noisy time series.
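The synchronization experiment can be rendered schematically as follows. This is a sketch under our own assumptions: net is any trained model with a scikit-learn-style predict method, and the measured values fed in during synchronization come from the test data set.

```python
import numpy as np

def synchronize_and_predict(net, y_t, d_r, measurements, n_sync=8, n_free=50):
    """All d_r components of the starting point are set to the single
    available measurement y(t). For the first n_sync steps the network
    output is replaced by the exact measured value (synchronization);
    afterwards the net runs in pure predictive, externally recurrent mode."""
    state = np.full(d_r, y_t)            # delayed measurements are unknown
    trajectory = []
    for k in range(n_sync + n_free):
        y_next = float(net.predict(state.reshape(1, -1))[0])
        if k < n_sync:
            y_next = measurements[k]     # read the exact value at every tau
        trajectory.append(y_next)
        state = np.roll(state, 1)        # shift the delay line by one tau
        state[0] = y_next                # the output becomes the newest input
    return np.array(trajectory)
```

Noise-robustness tests of the kind shown in Fig. 6 then amount to corrupting the entries of measurements with zero-mean noise scaled to the prescribed dB level before feeding them in.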
Fig. 6. Synchronization of the Brusselator system and the neural network (8,13,1) (y_1 vs. time): (dotted line) original system; (continuous line) neural network reconstruction with an average noise level equal to −20 dB; (dotted-dashed line) neural network reconstruction with an average noise level equal to −16 dB.
On the other hand, we cannot neglect the stochastic component of the measured input when the neural network is used in an online application. For this reason, we have tested the robustness of the neural model when the measurements are corrupted by noise. In this case the reconstructed system is capable of reproducing the original one well, up to an average noise level equal to −16 dB. In order to give a clearer representation of the attractors, the results are shown in Fig. 6 for a small time interval.

5. Conclusion

In this work a feedforward neural network with only one hidden layer is proposed to forecast chaotic time series. The performance of the neural model was tested on the reconstruction of the Brusselator kinetics, which exhibits chaotic behaviour for the set of parameter values we used. The neural network model was trained using the time series generated by the dynamic simulator of the Brusselator system. The structure of the inputs, that is the number of neurons in the input layer and the time delay among the inputs, was derived from the characteristic parameters of the chaotic attractor, i.e. the correlation dimension and the pseudo-period of oscillation. This structure exhibited good capabilities in forecasting the nonlinear system up to a pseudo-period of oscillation, provided the number of hidden neurons is greater than the number of inputs. In the range examined (n_1 ≤ n_2 ≤ 2n_1) the best results were obtained with eight neurons in the hidden layer. In a control framework the prediction time of the proposed neural network, which has a very simple structure, can be considered a good result. In fact, this reconstruction technique allows the application of model-based control algorithms, or map-based algorithms like OGY [20], when on-line continuous measurements of the controlled variables and detailed models are not available. Anyway, because a wider
prediction horizon could be necessary, the results obtained with the network (5,8,1) were improved by giving more past information to the network. Optimal results were obtained for the architecture (8,13,1) and, when the starting point is known, the chaotic neural network is capable of coping with the real data. The proposed procedure allows the development of simpler neural networks than others proposed in the literature [19], without requiring excessive computational effort. To find out the limits of the neural approach, a synchronization problem has been considered in which the measurements are corrupted by noise. The forecasting capabilities of the network are satisfactory up to an average noise level equal to −16 dB.

Acknowledgements

S.T. gratefully acknowledges the financial support given by CNR (CNRG0024E5).

References

[1] H.D.I. Abarbanel, Analysis of Observed Chaotic Data, Springer-Verlag, New York, 1996.
[2] A. Barron, Approximation and estimation bounds for artificial neural networks, Technical Report 59, Department of Statistics, University of Illinois at Urbana-Champaign, 1991.
[3] M. Casdagli, Nonlinear prediction of chaotic time series, Physica D 35 (1989) 335.
[4] G. Cybenko, Approximation by superpositions of a sigmoidal function, Math. Control Signals Systems 2 (1989) 303.
[5] J.D. Farmer, J.J. Sidorowich, Predicting chaotic time series, Phys. Rev. Lett. 59 (1987) 845.
[6] J.D. Farmer, J.J. Sidorowich, Exploiting chaos to predict the future and reduce noise, in: Y.C. Lee (Ed.), Evolution, Learning and Cognition, World Scientific, Singapore, 1988.
[7] K.I. Funahashi, On the approximate realization of continuous mappings by neural networks, Neural Networks 2 (1989) 183.
[8] R. Gençay, A statistical framework for testing chaotic dynamics via Lyapunov exponents, Physica D 89 (1996) 261.
[9] M. Giona, Functional reconstruction of oscillating reaction: prediction and control of chaotic kinetics, Chem. Eng. Sci. 47 (1992) 2469.
[10] P. Grassberger, T. Schreiber, C. Schaffrath, Nonlinear time sequence analysis, Int. J. Bif. Chaos 1 (1991) 521.
[11] J. Hertz, A. Krogh, R.G. Palmer, Introduction to the Theory of Neural Computation, Addison-Wesley, Redwood City, CA, 1991.
[12] K. Hornik, Approximation capabilities of multilayer feedforward networks, Neural Networks 4 (1991) 251.
[13] K. Hornik, M. Stinchcombe, H. White, Multilayer feedforward networks are universal approximators, Neural Networks 2 (1989) 359.
[14] K. Hornik, M. Stinchcombe, H. White, Universal approximation of an unknown mapping and its derivatives using multilayer feedforward networks, Neural Networks 3 (1990) 551.
[15] P.P. Kanjilal, D.N. Banerjee, On the application of orthogonal transformation for the design and analysis of feedforward networks, IEEE Trans. Neural Networks 6 (5) (1995) 1061.
[16] A. Lapedes, R. Farber, How neural nets work, in: D.Z. Anderson (Ed.), Neural Information Processing Systems, AIP, New York, 1987, p. 442.
[17] A. Lapedes, R. Farber, Nonlinear signal processing using neural networks, Los Alamos National Laboratory, LA-UR-87-2662, 1987.
[18] P.J. Nandapurkar, V. Hlavacek, V. Van Rompay, Chaotic behaviour of a diffusion-reaction system, Chem. Eng. Sci. 36 (1986) 2747.
[19] K.A. de Oliveira, A. Vannucci, E.C. da Silva, Using artificial neural networks to forecast chaotic time series, Physica A 284 (2000) 393.
[20] E. Ott, C. Grebogi, J.A. Yorke, Controlling chaos, Phys. Rev. Lett. 64 (1990) 1196.
[21] T. Ozaki, The statistical analysis of perturbed limit cycle processes using nonlinear time series models, J. Time Series Anal. 3 (1982) 29.
[22] T. Schreiber, Interdisciplinary application of nonlinear time series methods, Phys. Rep. 308 (1999) 1.
[23] I. Schreiber, M. Marek, Transition to chaos via two-torus in coupled reaction-diffusion cells, Phys. Lett. 91 (1982) 263.
[24] T. Stengos, Nonparametric forecasts of gold rates of return, Department of Economics, No. 1995-1, University of Guelph, 1995.
[25] F. Takens, Distinguishing deterministic and random systems, in: B.L. Braaksma, et al. (Eds.), Dynamical Systems and Bifurcations, Springer-Verlag, Berlin, 1981.
[26] H. Tong, Threshold models in nonlinear time series analysis, in: Lecture Notes in Statistics, Vol. 21, Springer-Verlag, New York, 1983.
[27] S. Yakowitz, Nearest-neighbor methods for time series analysis, J. Time Series Anal. 8 (1987) 235.

Stefania Tronci received the Laurea degree (summa cum laude) from the University of Cagliari in 1994 and the Dottorato di Ricerca (Ph.D.) from the University of Pisa in 2000. She is Assistant Professor of Process Dynamics and Control Design at the University of Cagliari in the Chemical Engineering and Material Science Department. Her research interests include the development of process models for control, monitoring and optimization purposes, the development of control strategies, and the reconstruction of chaotic systems using neural network models.
Massimiliano Giona graduated cum laude in Electrical Engineering at "La Sapienza" University of Rome in 1989. In 1993 he received his Ph.D. degree in Chemical Engineering from the same university. From 1993 to 1998 he served as assistant professor at the Department of Chemical Engineering of the University of Cagliari, Italy. He has been Associate Professor of Chemical Engineering at "La Sapienza" University of Rome since 1998. He has co-authored more than 150 scientific papers. His main research interests are: transport phenomena, dynamics of fractal lattices, dynamical systems and fluid mixing, and mathematical modeling of chemical engineering processes.
Roberto Baratti received the Laurea degree (summa cum laude) from the University of Cagliari in 1982 and the Dottorato di Ricerca (Ph.D.) from the University of Pisa in 1986. He is an Associate Professor of Process Dynamics and Control Design at the University of Cagliari in the Chemical Engineering and Material Science Department. His research interests include the development of process models for control, monitoring and optimization purposes, and the development of control strategies using neural network models. He has been a member of the Scientific Committee of the International Conference EANN since 1998, and he is a member of IFAC, AIDIC and AIChE.