A Dynamical Systems Approach to Modeling Input-Output Systems


SFI WORKING PAPER: 1991-05-023

SFI Working Papers contain accounts of scientific work of the author(s) and do not necessarily represent the views of the Santa Fe Institute. We accept papers intended for publication in peer-reviewed journals or proceedings volumes, but not papers that have already appeared in print. Except for papers by our external faculty, papers must be based on work done at SFI, inspired by an invited visit to or collaboration at SFI, or funded by an SFI grant. ©NOTICE: This working paper is included by permission of the contributing author(s) as a means to ensure timely distribution of the scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the author(s). It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may be reposted only with the explicit permission of the copyright holder. www.santafe.edu

SANTA FE INSTITUTE

A Dynamical Systems Approach to Modeling Input-Output Systems

Martin Casdagli
Santa Fe Institute, 1120 Canyon Road, Santa Fe, New Mexico 87501

Abstract

Motivated by practical applications, we generalize theoretical results on the nonlinear modeling of autonomous dynamical systems to input-output systems. The inputs driving the system are assumed to be observed, as well as the outputs. The underlying dynamics coupling inputs to outputs is assumed to be deterministic. We give a definition of chaos for input-output systems, and develop a theoretical framework for state space reconstruction and modeling. Most of the results for autonomous deterministic systems are found to generalize to input-output systems with some modifications, even if the inputs are stochastic.

1 Introduction

There has been much recent interest in the nonlinear modeling and prediction of time series data, as evidenced by this conference proceedings volume. For approaches motivated by deterministic chaos, see [2, 5, 7, 8] and references therein. For approaches motivated by nonlinear stochastic models, see [15] and references therein. In both these approaches, it is assumed that a time series x(t) is obtained from observations of an autonomous dynamical system, possibly perturbed by unobserved forces or noise. By contrast, when modeling input-output systems, in addition to an observed output time series x(t), there is also available an observed input time series u(t); see Figure 1. Modeling input-output systems is appropriate in many applications. For example, in vibration testing, a mechanical device may be subjected to a controlled random forcing to test its robustness in a simulated environment. Also, in scientific experiments, one may be interested in the response of a system to various forms of stimulus. There are many other disciplines, for example meteorology and economics, in which pairs of input-output time series are available for analysis. In this paper we will develop a theory for the nonlinear modeling of input-output systems based on deterministic dynamics. This deterministic approach assumes that the input-output time series arises from a finite dimensional dynamical system

$$\frac{ds}{dt} = f(s(t), u(t)) \qquad (1)$$

$$x(t) = h(s(t)) \qquad (2)$$


Figure 1: Conceptual model of a single-input, single-output system: an input u(t) drives an unknown system, which produces an output x(t).

where $s(t) \in \mathbb{R}^d$ denotes a $d$-dimensional state, and for simplicity we take $u(t) \in \mathbb{R}$ to be a scalar input, $f : \mathbb{R}^d \times \mathbb{R} \to \mathbb{R}^d$ to be a smooth flow, and $h : \mathbb{R}^d \to \mathbb{R}$ to be a scalar measurement function. It is then natural to attempt to model and forecast the behavior of the input-output system with a nonlinear deterministic model of the form

$$x(t) = P\bigl(x(t-\tau), x(t-2\tau), \ldots, x(t-m\tau),\; u(t), u(t-\tau), \ldots, u(t-(l-1)\tau)\bigr) \qquad (3)$$

where P is a nonlinear function fitted to the input-output time series data. This deterministic approach has been applied by Hunter to a variety of practical examples

[11]. Of course, the deterministic system (1,2) is only an approximation to reality. Firstly, it is assumed that only $d$ independent modes of the system are excited to a significant amplitude, where in practice $d$ is reasonably small. Secondly, it is assumed that effects due to unobserved sources of noise and measurement errors are small enough to be ignored. If these assumptions are violated, a stochastic approach to nonlinear modeling may be more appropriate, and it is natural to include noise terms in the above equations. For a stochastic approach to the nonlinear modeling of input-output systems, see Billings [1].

Under the above deterministic assumptions, we are interested in the following theoretical questions. Firstly, how should chaos be defined and quantified for the system (1)? If the input time series u(t) is periodic, this question trivially reduces to that of autonomous systems by considering the appropriate time-$T$ Poincaré map. However, we are mostly interested in the case where u(t) is a random time series. If u(t) is random, then x(t) is random, so neither time series by itself is chaotic. Hence this must be a question about the structure of the pair of time series u(t), x(t). We will address this question in Section 2. Secondly, if we wish to construct a nonlinear deterministic model of the form (3), how many lags $m$ and $l$ should be chosen for a system of dimension $d$? In the case of autonomous systems, this question has been addressed by Takens [14]; see also Sauer et al. [13]. We are also interested in how accurate such a model is likely to be as a function of the length of the time series available to construct it, and the dimension $d$ of the underlying dynamical system. In the case of autonomous systems, this question has been addressed by Farmer and Sidorowich [7]; see also Casdagli [2]. We will investigate these questions for input-output systems in Section 3. Finally, we summarize the conclusions in Section 4.
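To make the form of the model (3) concrete, the following minimal sketch (Python with NumPy; an illustration, not code from the paper) assembles the lagged output and input regressors with delay $\tau$ equal to one sample and fits a simple global approximation to $P$ by least squares. The toy driven map used to generate data, the quadratic feature basis, and the lag choices are assumptions made purely for illustration.

```python
import numpy as np

def lagged_design(x, u, m, l):
    """Regressors for a model of the form (3) with delay tau = 1 sample:
    the target x[t] is paired with the m lagged outputs x[t-1], ..., x[t-m]
    and the l inputs u[t], u[t-1], ..., u[t-l+1]."""
    p = max(m, l - 1)                     # earliest time with all lags available
    rows = [[x[t - k] for k in range(1, m + 1)] + [u[t - k] for k in range(l)]
            for t in range(p, len(x))]
    return np.array(rows), np.array(x[p:])

def quadratic_features(Z):
    """Constant, linear, and quadratic monomials of the regressors (a simple global basis)."""
    cross = np.einsum('ni,nj->nij', Z, Z).reshape(len(Z), -1)
    return np.hstack([np.ones((len(Z), 1)), Z, cross])

# Toy driven system standing in for the unknown dynamics (1,2).
rng = np.random.default_rng(0)
u = rng.standard_normal(2000)
x = np.zeros(2000)
for t in range(1, 2000):
    x[t] = 0.9 * np.tanh(x[t - 1]) + 0.5 * u[t - 1]

Z, y = lagged_design(x, u, m=2, l=2)
coef, *_ = np.linalg.lstsq(quadratic_features(Z), y, rcond=None)
fit_rms = np.sqrt(np.mean((quadratic_features(Z) @ coef - y) ** 2))
print("in-sample RMS residual of the fitted P:", fit_rms)
```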

2 Chaos in Input-Output Systems

In this section we give a definition of chaos for input-output systems. We also investigate the usefulness of this definition in quantifying predictability for a numerical example. For simplicity we will assume that time is discrete, so that Equation (1) is replaced by

$$s_{n+1} = f(s_n, u_n) \qquad (4)$$

$$x_{n+1} = h(s_{n+1}) \qquad (5)$$

All of the results in this section generalize naturally to continuous time input-output systems by using results about Liapunov exponents in continuous time autonomous systems; see [6, 16].

2.1 Definition of the largest Liapunov exponent

Suppose the system (4) is initialized at two slightly different states, and in both cases is subjected to the same sequence of inputs $u_n$. Then if the system states diverge exponentially in time, we say the input-output system is chaotic, and the rate of divergence is given by the largest Liapunov exponent. The largest Liapunov exponent quantifies the degree to which the system is predictable in the long term, assuming that the input time series is always observed. This observability assumption is satisfied in many of the applications mentioned in the introduction. For example, in vibration testing, one may only be able to observe the state of the system at rare intervals in the past due to measurement problems, and desire to predict the present state of the system given a sequence of inputs to the system. The assumption of having the input sequence available is also relevant for problems of reducing noise on output sequences observed in the past. Note that if the input time series is random and unobserved, the system is unpredictable even in the short term. We now make the above notions more precise. We will only be concerned with the largest Liapunov exponent, which is defined as follows. Let $Df(s, u)$ denote the derivative of $f$ at $s$ with $u$ held constant. Let $Df^T$ denote the matrix product

$$Df^T(s_0, u_0, \ldots, u_{T-1}) = \prod_{i=0}^{T-1} Df(s_i, u_i) \qquad (6)$$

where the $s_i$ are generated from (4) starting from a given initial state $s_0$. Then the largest Liapunov exponent $\lambda_1$ is defined by (7), where $ds$ is an arbitrary initial vector, and $\| \cdot \|$ denotes the Euclidean norm.

$$\lambda_1 = \lim_{T \to \infty} \frac{1}{T} \log\bigl( \| Df^T ds \| / \| ds \| \bigr) \qquad (7)$$

If $\lambda_1 > 0$ we say the system is chaotic.

The above definition of the Liapunov exponent $\lambda_1$ at first sight depends on the initial state $s_0$, the sequence of inputs $u_0, u_1, \ldots$ and the tangent vector $ds$. However, suppose that the sequence of inputs is drawn from a realization of a stationary random or deterministic process. Then in the case of random inputs, by multiplicative ergodic theorems (see [6]), the limit (7) exists with probability one, and depends only on the ergodic invariant measure to which the initial state $s_0$ is attracted. This invariant measure is often unique, for example in the case of Gaussian inputs. In the case of deterministic inputs, although there are no general theorems, it is observed numerically that the limit (7) exists, and depends only on the basin of attraction in which $s_0$ lies. The above definition coincides with the definition of Liapunov exponents for randomly driven dynamical systems with unobserved inputs [6]. In that case, the input is assumed to be small, and models noise perturbations. However, if the inputs are observed, the largest Liapunov exponent may be used to describe the divergence of trajectories even for large amplitude inputs, as follows. Let $s_0$ and $s_0'$ denote two close initial conditions, subjected to the same sequence $u_0, \ldots, u_{T-1}$ of inputs. Then the divergence of trajectories will be described for moderate $T$ by

$$\| s_T - s_T' \| \approx \| Df^T(s_0, u_0, \ldots, u_{T-1}) \| \; \| s_0 - s_0' \| \qquad (8)$$

Hence using (7) we obtain

$$\| s_T - s_T' \| \approx e^{\lambda_1 T} \| s_0 - s_0' \| \qquad (9)$$

We now illustrate the above ideas with the randomly driven Ikeda map, where $f$ is taken to be

$$f(x, y, u) = \bigl( 1 + a(x \cos t - y \sin t) + u, \; a(x \sin t + y \cos t) \bigr) \qquad (10)$$

where $t = 0.4 - 6.0/(1 + x^2 + y^2)$, $a = 0.7$, and the inputs $u_n$ are independently identically distributed (IID) Gaussians with variance $\eta^2$. Figure 2 illustrates the dependence of the Liapunov exponent $\lambda_1$ on the noise level $\eta$. The Liapunov exponent was computed numerically using $10^5$ iterates for each value of $\eta$, with the QR algorithm described in [6] to avoid overflow. Observe that a smooth transition from chaotic to non-chaotic behavior occurs at $\eta \approx 0.95$. Unlike autonomous systems, it is impossible to locate this transition by inspection of the invariant measure, which in randomly driven systems is always smooth. In the above example, even if $\eta$ lies in the non-chaotic region, the time series $s_i$ and $u_i$ appear irregular. This is illustrated in Figure 3 for the non-chaotic case $\eta = 1.2$. Figure 3a illustrates a realization $u_1, \ldots, u_{100}$ of a time series of IID Gaussian inputs. Figure 3b illustrates how the invariant measure for the input-output system is filled out by the iterates $s_i = (x_i, y_i)$, for $i = 1, \ldots, 10000$. Also shown in Figure 3b is the fractal invariant measure in the autonomous case $\eta = 0$. The random inputs perturb the system so that the states $s_i$ fill out a more disperse, smooth invariant measure. Figure 3c illustrates the output time series $x_1, \ldots, x_{100}$.
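As a concrete illustration of the computation just described, the following minimal sketch (Python with NumPy; not code from the paper) iterates the driven Ikeda map (10) and estimates $\lambda_1$ from the Jacobian product (6), accumulating QR factors so that nothing overflows, in the spirit of the procedure mentioned above. The parameter $a = 0.7$ and the form of the map follow the text; the number of iterates, the burn-in, and the random seed are illustrative choices.

```python
import numpy as np

A_PAR = 0.7   # Ikeda parameter a, as in the text

def ikeda_driven(x, y, u):
    """One step of the randomly driven Ikeda map, Equation (10)."""
    t = 0.4 - 6.0 / (1.0 + x * x + y * y)
    return (1.0 + A_PAR * (x * np.cos(t) - y * np.sin(t)) + u,
            A_PAR * (x * np.sin(t) + y * np.cos(t)))

def ikeda_jacobian(x, y):
    """Df(s, u) of Equation (10) with respect to the state s = (x, y);
    u enters additively, so it drops out of the Jacobian."""
    r2 = 1.0 + x * x + y * y
    t = 0.4 - 6.0 / r2
    tx, ty = 12.0 * x / r2 ** 2, 12.0 * y / r2 ** 2   # dt/dx, dt/dy
    c, s = np.cos(t), np.sin(t)
    p, q = x * c - y * s, x * s + y * c
    return A_PAR * np.array([[c - q * tx, -s - q * ty],
                             [s + p * tx,  c + p * ty]])

def largest_liapunov(eta, n_iter=20_000, seed=0):
    """Estimate lambda_1 of (6)-(7): accumulate QR factors of the Jacobian
    product along a trajectory (the text used 10^5 iterates)."""
    rng = np.random.default_rng(seed)
    x, y = 0.1, 0.1
    for _ in range(100):                               # discard a short transient
        x, y = ikeda_driven(x, y, eta * rng.standard_normal())
    Q, log_sum = np.eye(2), 0.0
    for _ in range(n_iter):
        Q, R = np.linalg.qr(ikeda_jacobian(x, y) @ Q)  # Df evaluated at s_i
        log_sum += np.log(abs(R[0, 0]))
        x, y = ikeda_driven(x, y, eta * rng.standard_normal())
    return log_sum / n_iter

for eta in (0.0, 0.5, 1.2):
    print(f"eta = {eta:.2f}   lambda_1 estimate = {largest_liapunov(eta):.3f}")
```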

Figure 2: Dependence of the largest Liapunov exponent $\lambda_1$ of the randomly driven Ikeda map on the noise level $\eta$.
$\lambda_1 \approx 0.164$. Three different pairs of initial conditions $s_0$, $s_0'$ were chosen with $\| s_0 - s_0' \| = \epsilon = 1.4 \times 10^{-}$, and subjected to five different sequences of random inputs. The solid lines are plots of $\| s_T - s_T' \|$ against $T$. The dashed lines are plots of $\| Df^T(s_0', u_0, \ldots, u_{T-1}) \|\,\epsilon$¹ against $T$. The line AB represents the anticipated rate of divergence corresponding to a largest Liapunov exponent $\lambda_1 \approx 0.164$. Observe that there are considerable fluctuations between the divergence of the three pairs. In fact one of the pairs appears to be following a non-chaotic path over the times $T$ considered. Observe that the dashed lines give an excellent approximation to the divergence of trajectories, but that the line AB gives only a crude approximation. This shows that the Liapunov exponent $\lambda_1$ gives only an average² rate of divergence of trajectories; a more accurate analysis requires computation of the state-dependent matrices $Df^T(s_0', u_0, \ldots, u_{T-1})$.

Finally, we consider a non-chaotic case with $\eta = 1.2$. Since the largest Liapunov exponent is negative, $\| s_T - s_T' \|$ converges to zero as $T$ increases with probability one (but note that $s_T$ does not converge to a fixed point or periodic orbit, as would be the case for an autonomous system). To make the problem more complicated,

¹The largest singular value of a matrix $M$ is equal to the square root of the largest eigenvalue of the matrix $M^\dagger M$, where $\dagger$ denotes the transpose of a matrix.

²We found numerically that $\lambda_1$ approximately describes the geometric average of the divergence of trajectories at short times $T$. In the case of one-dimensional dynamical systems, this is a consequence of the additive ergodic theorem, but this does not hold exactly in higher-dimensional dynamical systems.
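The trajectory-divergence comparison described above can be reproduced along the following lines. This is a minimal sketch (Python/NumPy) that re-declares the driven Ikeda map so it runs on its own; the noise level paired with $\lambda_1 \approx 0.164$, the separation $\epsilon = 10^{-7}$, the horizon $T$, and the nominal negative exponent used for the non-chaotic run are illustrative assumptions, since the corresponding values are not all given in the surviving text.

```python
import numpy as np

a = 0.7

def step(x, y, u):
    t = 0.4 - 6.0 / (1 + x * x + y * y)
    return 1 + a * (x * np.cos(t) - y * np.sin(t)) + u, a * (x * np.sin(t) + y * np.cos(t))

def jac(x, y):
    r2 = 1 + x * x + y * y
    t = 0.4 - 6.0 / r2
    tx, ty = 12 * x / r2 ** 2, 12 * y / r2 ** 2
    c, s = np.cos(t), np.sin(t)
    p, q = x * c - y * s, x * s + y * c
    return a * np.array([[c - q * tx, -s - q * ty], [s + p * tx, c + p * ty]])

def compare_divergence(eta, lam1, eps=1e-7, T=60, seed=1):
    """Track ||s_T - s_T'|| for two states eps apart driven by the same inputs,
    against the state-dependent estimate ||Df^T|| eps of (8) and the average
    law exp(lam1 T) eps of (9)."""
    rng = np.random.default_rng(seed)
    s = np.array([0.1, 0.1])
    sp = s + np.array([eps, 0.0])
    M = np.eye(2)
    for n in range(1, T + 1):
        u = eta * rng.standard_normal()
        M = jac(*s) @ M                                  # accumulate Df^n along the trajectory
        s, sp = np.array(step(*s, u)), np.array(step(*sp, u))
        if n % 10 == 0:
            sigma1 = np.linalg.svd(M, compute_uv=False)[0]
            print(f"T={n:3d}  observed={np.linalg.norm(s - sp):.3e}"
                  f"   (8): {sigma1 * eps:.3e}   (9): {np.exp(lam1 * n) * eps:.3e}")

compare_divergence(eta=0.5, lam1=0.164)    # chaotic regime (this eta value is a guess)
compare_divergence(eta=1.2, lam1=-0.1)     # non-chaotic regime (nominal negative lambda_1)
```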


,,"=I~ o

20

40

60

80

a

100

b .".

.

,.

.

~

, ....:

".

2

o

- 2

L-L....JL-I--l-..L-l...-.L---.L---.L---L---L-L-..L...L...J

o

-5

5

10

x·1

~"J~c o

20

40

60

80

100

Figure 3: Input-output time series for a non-chaotic randomly driven Ikeda map in the case $\eta = 1.2$. (a) Input time series $u_i$. (b) Invariant measures for $\eta = 1.2$ and $\eta = 0$. (c) Output time series $x_i$.


If $m > 2d$, then such a smooth function $P$ exists, for a generic set of functions $f$ and $h$ defining the underlying dynamics and measurement function [14]. This result has recently been strengthened by Sauer et al. so that "generically" can be replaced by "prevalent" (which essentially means full measure), and the state space dimension $d$ can be replaced by the attractor's box-counting dimension; investigations are also made into what happens when $m \le 2d$ and when other more general forms of reconstruction are used [13].


In the case of input-output systems, we will argue below that, subject to genericity conditions on $f$ and $h$, if $m > 2d$ and $l > 2d$ then a globally smooth function $P$ exists satisfying (13) for almost all input sequences. Moreover, if $m = l = d + 1$, an almost everywhere smooth function $P$ exists satisfying (13). As a corollary, it follows that for non-chaotic input-output systems with $\lambda_1 < 0$, the input time series alone can be used to determine the future outputs arbitrarily accurately. This result is obtained by iterating the model (13), and observing that the outputs $x_i$ for $i > n + \lambda_1^{-1} \log \epsilon$ become independent of $x_n, \ldots, x_{n-m+1}$ to accuracy proportional to $\epsilon$, and thus essentially depend only on the input sequence. By contrast, the output time series alone can be used to determine the future outputs only if the inputs come from a deterministic process. Rather than giving a rigorous mathematical treatment, we will give a heuristic argument here. We believe that this argument may be made rigorous by straightforward generalizations of the theorems for the autonomous case. Define the map $\Phi : \mathbb{R}^d \times \mathbb{R}^{m-1} \to \mathbb{R}^m$ by

$$\Phi(s, u_{n-m+1}, \ldots, u_{n-1}) = (x_{n-m+1}, x_{n-m+2}, \ldots, x_n), \qquad (15)$$

where the outputs on the right-hand side are generated from (4) and (5) with initial state $s_{n-m+1} = s$ and the given inputs.

The map $\Phi$ is well defined and smooth. To obtain a good state space reconstruction, $m$ must be chosen large enough so that there is a unique solution for $s$ to the nonlinear equation (15) in terms of the $x_i$ and $u_i$, which depends smoothly on the $x_i$ and $u_i$. The solution for $s$ identifies the unobserved state $s_{n-m+1}$. If this can be achieved for all integers $n$, it follows that a smooth function $P$ exists satisfying (13), by substituting $s = s_{n-m+1}$ into (4) and iterating. It is clear that for (15) to have a unique solution for $s$, we must in general have $m > d$. This is because if $m = d$, then Equation (15) constitutes $d$ simultaneous nonlinear equations for the $d$ unknown components of $s$, and in general there are several different solutions for $s$. To break this degeneracy, suppose we take $m = d + 1$. Then for generic $f$ and $h$, the extra simultaneous equation for $s$ is expected to pick out the unique solution, unless the solution lies on a "bad" subset $\Sigma_{d-1}$ of $\mathbb{R}^d$ of dimension $d - 1$. The location of $\Sigma_{d-1}$ depends on the inputs $u_i$ as well as the functions $f$ and $h$. The situation is illustrated geometrically for $d = 2$ in Figure 6. As $m$ is increased by one, the dimension of the bad set $\Sigma$ is generically decreased by one, until when $m > 2d$ there is no bad set at all. So far the argument has paralleled that for the autonomous case. However, there is an additional complication that arises with the inputs when $m > 2d$. In the case that $m = 2d + 1$, it is expected that if the inputs lie on a "bad" subset $U$ of $\mathbb{R}^{2d}$ of dimension $2d - 1$, then a bad subset $\Sigma$ of dimension zero will be induced on $\mathbb{R}^d$, so that (15) will not have a unique solution for $s$. Since the set $U$ generically has measure zero, we will ignore it. If $m$ is increased further, the bad sets $U$ will still have dimension $2d - 1$ in $\mathbb{R}^{m-1}$, and do not in general disappear.
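As a quick numerical check of the corollary stated above (when $\lambda_1 < 0$ the inputs alone determine the outputs after a transient), the following sketch (Python/NumPy; an illustration, not code from the paper) drives the non-chaotic Ikeda map with $\eta = 1.2$ from two very different initial states using the same input sequence and watches the outputs converge.

```python
import numpy as np

a, eta, N = 0.7, 1.2, 200
rng = np.random.default_rng(3)

def step(x, y, u):
    t = 0.4 - 6.0 / (1 + x * x + y * y)
    return 1 + a * (x * np.cos(t) - y * np.sin(t)) + u, a * (x * np.sin(t) + y * np.cos(t))

u = eta * rng.standard_normal(N)
s = np.array([0.1, 0.1])          # one initial state
sp = np.array([-3.0, 2.0])        # a very different initial state
for n in range(N):
    s, sp = np.array(step(*s, u[n])), np.array(step(*sp, u[n]))
    if n % 25 == 0:
        # the outputs x = h(s) of the two runs approach each other, so after a
        # transient the observed input sequence alone determines the outputs
        print(f"n = {n:3d}   |x - x'| = {abs(s[0] - sp[0]):.2e}")
```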


Figure 6: An illustration of the map $\Phi$ with the inputs $u_i$ held fixed, for $d = 2$ and $m = 3$. There is a one-dimensional "bad" subset $\Sigma$ of the state space $\mathbb{R}^2$, which gets mapped by $\Phi$ to the one-dimensional self-intersection shown. If the vector of outputs $x_n, \ldots, x_{n-m+1}$ lies on the self-intersection $\Phi(\Sigma)$, then there is not a unique solution for $s$ to Equation (15).


The above argument has concentrated on the uniqueness of the solution $s$ to Equation (15). To address the issue of the smooth dependence of the solution $s$ on the $x_i$ and $u_i$ requires an application of the implicit function theorem. This says that there is smooth dependence if the matrix $D\Phi$ of the derivative of $\Phi$ with respect to $s$ has full rank at the solution for $s$. The matrix $D\Phi$ generically has full rank if $m$ is chosen large enough. It turns out that the conditions on $m$ derived above to ensure uniqueness of solutions are generically strong enough to also ensure full rank, hence smoothness. The arguments parallel those for the autonomous case, which can be found clearly expressed in [13].

In the case of autonomous dynamical systems, a theory of state space reconstruction has been developed which applies when there are low levels of observational noise on the scalar output time series [3, 4]. This theory can be generalized to multivariate time series and input-output time series [9]. The theory quantifies the "goodness" of a state space reconstruction in terms of formulae involving the noise level, the measurement function, the information flow between variables, and the state space reconstruction technique used. By studying a variety of examples, insights can be gained into the limitations imposed on a reconstruction technique by observational noise, and how to minimize such effects.

3.2 Modeling and scaling laws

The results of Section 3.1 show how a good choice of lags $m$ and $l$ for the model $P$ of (13) depends on the underlying state space dimension $d$. We now consider the problem of estimating $P$ non-parametrically from an input-output time series of length $N$, when the functions $f$ and $h$ are unknown. Suppose that a local approximation technique is used to construct an estimate $F_N$ for $P$. We measure the accuracy of the model $F_N$ by the RMS prediction error
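A minimal sketch of a local approximation estimate $F_N$ and an RMS prediction error computation is given below (Python/NumPy). The nearest-neighbour scheme, the toy driven map used to generate data, the one-step-ahead indexing, and the train/test split are illustrative assumptions rather than the paper's exact choices.

```python
import numpy as np

def delay_matrix(x, u, m, l):
    """Joint delay vectors (x_n,...,x_{n-m+1}, u_n,...,u_{n-l+1}) paired with
    one-step-ahead targets x_{n+1}, as regressors for a local estimate of P."""
    p = max(m, l) - 1
    Z = np.array([np.r_[x[n - m + 1:n + 1][::-1], u[n - l + 1:n + 1][::-1]]
                  for n in range(p, len(x) - 1)])
    return Z, x[p + 1:]

def knn_predict(Z_train, y_train, Z_test, k=5):
    """Locally constant approximation: average the targets of the k nearest
    training delay vectors (one simple choice of local approximation)."""
    preds = np.empty(len(Z_test))
    for i, z in enumerate(Z_test):
        nearest = np.argsort(np.sum((Z_train - z) ** 2, axis=1))[:k]
        preds[i] = y_train[nearest].mean()
    return preds

# Toy data from a driven nonlinear map, standing in for an unknown system.
rng = np.random.default_rng(4)
u = rng.standard_normal(4000)
x = np.zeros(4000)
for n in range(3999):
    x[n + 1] = 0.9 * np.tanh(x[n]) + 0.5 * u[n]

Z, y = delay_matrix(x, u, m=3, l=3)
split = len(Z) // 2
pred = knn_predict(Z[:split], y[:split], Z[split:])
rms = np.sqrt(np.mean((pred - y[split:]) ** 2))
print("RMS prediction error:", rms, "  (output standard deviation:", y.std(), ")")
```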