Numerical strategies for filtering partially observed ... - Semantic Scholar

Report 1 Downloads 28 Views
Journal of Computational Physics 230 (2011) 744–762

Contents lists available at ScienceDirect

Journal of Computational Physics journal homepage: www.elsevier.com/locate/jcp

Numerical strategies for filtering partially observed stiff stochastic differential equations John Harlim Department of Mathematics, North Carolina State University, NC 27695, United States

a r t i c l e

i n f o

Article history: Received 25 February 2010 Received in revised form 6 October 2010 Accepted 13 October 2010 Available online 21 October 2010 Keywords: Filtering multiscale systems Data assimilation Stiff SDE Heterogeneous Multiscale Methods Inverse problems

a b s t r a c t In this paper, we present a fast numerical strategy for filtering stochastic differential equations with multiscale features. This method is designed such that it does not violate the practical linear observability condition and, more importantly, it does not require the computationally expensive cross correlation statistics between multiscale variables that are typically needed in standard filtering approach. The proposed filtering algorithm comprises of a ‘‘macro-filter’’ that borrows ideas from the Heterogeneous Multiscale Methods and a ‘‘micro-filter’’ that reinitializes the fast microscopic variables to statistically reflect the unbiased slow macroscopic estimate obtained from the macro-filter and macroscopic observations at asynchronous times. We will show that the proposed micro-filter is equivalent to solving an inverse problem for parameterizing differential equations. Numerically, we will show that this microscopic reinitialization is an important novel feature for accurate filtered solutions, especially when the microscopic dynamics is not mixing at all. Ó 2010 Elsevier Inc. All rights reserved.

1. Introduction An important scientific problem in applied sciences and engineering is to model nature signals with multiscale features. In many applications ranging from modeling polymeric fluids to weather and climate dynamics, we are interested in predicting the macroscopic dynamics which is typically observed on time scales much longer than on microscopic time scales. The practical difficulty in multiscale modeling is that we typically cannot resolve the microscopic variables and thus model errors through various parameterizations are unavoidable on the macroscopic level. For example, in the climate modeling, the planetary waves of interest have spatial structure on the order of 1000–10,000 km with yearly time scale and in a long run the behavior of these waves may depend on the energy inverse cascade from the microscopic processes such as the dry and the moist convections, clouds that occur on a spatial structure between 100 m to 100 km and time scales between hours to days. On the other hand, we typically understand the dynamics of the microscopic variables but these microscopic models are numerically very expensive and we do not regularly collect observations of the microscopic variables. Therefore, an important scientific problem is to develop a numerically fast and accurate filtering algorithm to obtain real-time estimates of the underlying true signal with multiscale features through observations of only the macroscopic variables. Filtering dynamical systems with multiscale features naturally arise in the context of filtering regularly spaced sparsely observed complex turbulent systems [1,2]; therein, the prototypical filtering problem consists of an observation model that couples Fourier components with distinct timescales. In that context, it was shown that accurate filtered solutions in various turbulent regimes are achievable with systematic reduced filtering strategies such as the Strongly Damped Approximate Filter (SDAF) [2]. In this paper, we aim to introduce a systematic reduced filtering strategy for partially observed stiff stochastic differential equations with an observation model that does not couple variables with different timescales: E-mail address: [email protected] 0021-9991/$ - see front matter Ó 2010 Elsevier Inc. All rights reserved. doi:10.1016/j.jcp.2010.10.016

J. Harlim / Journal of Computational Physics 230 (2011) 744–762

dX _ X ðtÞ; ¼ f ðX; yÞ þ rX W dt dy 1 ry _ ðtÞ; ¼ gðX; yÞ þ pffiffiffi W dt   y V m ¼ X m þ rom ;

745

ð1Þ ð2Þ ð3Þ

NX

Ny

where X 2 R is the slow macroscopic variable and y 2 R is the fast microscopic variable with distinct time scales char_ denotes independent white noise in time; r denotes a constant noise strength; Vm denotes the acterized by small   1; W noisy observation of Xm  X(tm) at discrete observation time interval, Dt = tm+1  tm  , beyond the microscopic time scale; rom are time-independent Gaussian noises with mean zero and covariance Ro. There are two practical difficulties in filtering the stiff problem in (1)–(3). From the perspective of classical linear filtering theory, filtering such a stiff system practically violates the observability condition that guarantees filter stability. By filter stability, we mean there exists an asymptotic covariance estimate and the filtered solutions remain bounded [3]. To illustrate this issue, consider the following linear filtering problem with a 2  2 system of ODE,

du ¼ Au; uð0Þ ¼ u0 ; dt V m ¼ Gum ;

ð4Þ ð5Þ

where the components in the first row and second row of A are Oð1Þ and Oð Þ, respectively, with   1. To relate to the stiff SDE in (1)–(3), u = (X, y)T, noises are zero, and G = [1, 0]. Suppose that the matrix A above has two negative eigenvalues: one can show that one of the eigenvalues is on the order of jk1 j ¼ Oð1Þ with eigenvector w(1) and the second eigenvalue is on the order of jk2 j ¼ Oð1 Þ with eigenvector wð2Þ ¼ ðOðÞ; Oð1ÞÞT . Therefore, we can define a transformation matrix 1

T ¼ ½wð1Þ ; wð2Þ ;

ð6Þ

such that the solution of the initial value problem in (4) in the transformed space is given by

~0 : ~ ðtÞ  T 1 u ¼ expðKtÞT 1 u0 ¼ expðKtÞu u

ð7Þ

We can rewrite the observation model in (5) as follows

~u ~m  G ~ m ¼ ½Oð1Þ; OðÞu ~m ; V m ¼ GT u

ð8Þ

~ is a time-independent observation operator that maps the solutions in the transformed space to the observation where G space. For observation time interval Dt, each prior forecast in the filtering process solves the following discrete map

~mþ1 ¼ expðKDtÞu ~m  F u ~m ¼ u



 0 OðeDt Þ ~m : u Dt= Þ 0 Oðe

~ In this very simple situation, the linear observability condition [4,5] for the dynamical operator F and observation operator G ~T ; FT G ~ T Þ is nearly zero) when  is small for any observation times, Dt, is practically violated (i.e., the determinant of matrix ðG that is,

   OðeDt Þ  ~ T Þj ¼  Oð1Þ ~T ; FT G j detðG  OðÞ OðeDt= Þ  6 OðÞ:

ð9Þ

The classical linear filtering stability theory [6] states that the filter is stable whenever the filter model is stable (eigenvalues of A are all negative as in our case above) even when observability condition is not satisfied. However, the practical observability condition (i.e., the opposite of inequality (9)) is a necessary condition for accurate filtered solution when the signal-tonoise ratio is large [2] even when the filter model is stable. We will demonstrate this practical issue with numerical results in Section 4.1. The second difficulty in filtering stiff systems as in (1) and (2) is that it is numerically expensive to compute the cross covariances between the macroscopic and the microscopic variables that are typically needed for the update process in the filtering; an accurate estimate for the cross covariances requires direct numerical simulations (DNS) with a time step that resolves the microscopic processes. In a special case where the slow-fast systems are coupled through multiplicative noises [7,8], a Nonlinear Extended Kalman filter with exactly solvable cross covariance terms has shown very promising results. This similar technique was also used for estimating model errors on-the-fly [9,10]. However, an accurate cross covariance estimate is not attainable (with any approximate dynamical equations for the moments) even in a simple system such as the stochastically forced FitzHugh–Nagumo system [11]. Therefore, a reduced filtering strategy is needed to avoid calculating these cross correlations. Alternative approach for solving similar problem as in (1)–(3) is to fit the observations into a single-scale stochastically averaged (or homogenized) reduced dynamical system [12]. This technique, however, is superior only when the averaged (or homogenized) equation exists [13–15] and when the data is sparsely sampled at an appropriate rate. Independent from filtering, there is a vast literature on reduced numerical framework for integrating systems with multiscale features, the Heterogeneous Multiscale Method (HMM) [16,17]. The development of this reduced numerical method is also based on the

746

J. Harlim / Journal of Computational Physics 230 (2011) 744–762

stochastic averaging procedure [13–15] which assumes that for a fixed X = x, the microscopic dynamics is ergodic and has an invariant probability measure lx(dy) such that the following limits exist

f ðXÞ ¼ lim

Z

f ðX; yÞlx ðdyÞ; Z r X ðXÞr X ðXÞT ¼ lim rX ðX; yÞrX ðX; yÞT lx ðdyÞ: !0

!0

ð10Þ ð11Þ

Then, as  ? 0, solution of the macroscopic subsystem in (1) converges weakly to the solution of the following effective dynamics:

dX  _ X ðtÞ;  X ðXÞW ¼ f ðXÞ þ r dt

ð12Þ

 X ¼ rX if rX is constant. The HMM is a numerical strategy that estimates the invariant measure lx(dy) when its exwhere r plicit expression is absent. In each macro-time-step integration, the HMM algorithm approximates the invariant measure lx(dy) with stationary solutions that are obtained by integrating the microscopic dynamics with frozen X = x for a short period of time (much shorter than Dt). Of course these stationary solutions can only be obtained quickly if the microscopic dynamics with constant X = x is strongly mixing at this short period of time. The HMM algorithm is computationally more efficient than the direct numerical simulations (DNS) if the strong mixing property above holds and thus we can resolve the macroscopic variables at macro-time-step Dt   using the effective dynamics in (12). In this paper, we introduce a reduced filtering technique that utilizes some aspects of HMM for systems with multiscale features in (1) and (2). This approach does not only satisfy the practical linear observability condition, but more importantly, it avoids the expensive computation of cross covariances between multiscale variables that are typically needed in classical filtering approaches. Furthermore, we require this filter to be robust even when the microscopic dynamics is not mixing (the temporal correlation function does not decay to zero). The proposed approach consists of a ‘‘macro-filter’’ that utilizes some aspects of HMM algorithm (see the detailed in Section 2) and a ‘‘micro-filter’’ that modifies the ‘reconstruction’ step in the standard HMM. The key feature in this algorithm is the ‘‘micro-filter’’ (see Section 3) that reinitializes the microscopic variables such that they statistically reflect the macroscopic estimate obtained from the macro-filter and macroscopic observations at asynchronous times. We will show that this reinitialization step is essentially solving an inverse problem for parameterizing differential equations [18]. Subsequently, we show numerical results for single asynchronous time observations with various filtering approaches, including those that require DNS (Section 4.1) and the proposed cheaper approaches (Section 4.2), on the stochastically forced slow-fast FitzHugh–Nagumo model in the regime where the model has an excitable stable equilibrium solution with non-mixing microscopic dynamics when X is frozen. In Sections 4.3, we discuss the sensitivity of the proposed Macro–Micro-Filter approach toward variation of parameters and its limitation. In Section 4.4, we compare the predictive skill of the Macro–Micro-Filter approach with the standard classical approach. In Section 4.5, we show numerical results for multiple asynchronous times observations. We close the paper with a short summary and discussion in Section 5 and an Appendix that describes an ensemble based MMF algorithm for high dimensional application. 2. The ‘‘macro-filter’’ We begin by reviewing the standard HMM framework [17] for integrating the stiff SDE in (1) and (2) in more details. Given initial conditions Xm and ym,0 (or an ensemble of fykm;0 gk¼1;...;K ), the HMM algorithm computes the (local) stationary solutions by integrating the microscopic model with fixed X = Xm, initial conditions ym,0, and micro-time-step dt <  for N steps up to time Ndt to obtain ym,N. Here, we assume that the microscopic dynamics is strongly mixing, that is, the temporal correlation function decays to zero, hym;NT ym;0 i ¼ 0, beyond the transient time NTdt < Ndt. To gain computational efficiency, the final time is expected to be less than the macro-time step, Ndt < Dt. Thus, we have a subset of (local) stationary microscopic solutions

  AX m  ym;n : NT 6 n 6 N

ð13Þ

for a single trajectory integration or

n o AX m  ykm;N : k ¼ 1; . . . ; K

ð14Þ

for an ensemble of integrations. The second step in HMM is called the ‘compression’, in which we approximate the effective dynamics (12) with operator

f ðXÞ ¼

1

XðAX m Þ

X

f ðX; yÞ;

ð15Þ

y2AX m

where XðAÞ denotes the number of components in subset A. To repeat the whole cycle, the HMM uses the microscopic state from the previous integration as the initial condition,

~; ymþ1;0 ¼ y ~ 2 AX m is from either (13) or (14). This step is called ‘reconstruction’ in HMM [17]. where y

ð16Þ

J. Harlim / Journal of Computational Physics 230 (2011) 744–762

747

Different from the classical filtering approach discussed in the introduction in (1)–(3), we consider the following reduced Macro-Filter:

dX  _ X; ¼ f ðXÞ þ rX W dt V mþ1 ¼ X mþ1 þ romþ1 ;

ð17Þ

romþ1  N ð0; Ro Þ;

where f is the effective dynamical operator, obtained from (15), and romþ1 are Gaussian noises with mean zero and covariance Ro. Assuming that (10) and (11) hold, solutions of the effective model in (17) converge weakly (not path-wise) to the solutions of the original stiff system of SDE in (1) and (2) for finite time interval [0, T) where T < 1 when  ? 0 [19]. The reduced macro-filtering strategy with effective dynamics in (17) is not only observable (since the observation operator is identity), but it is also computationally efficient since it reduces the dimensionality of the filtering problem from NX + Ny to NX. Let us þ þ þ define X þ m and ym;0 as the best estimates for Xm and ym with their corresponding covariances Rm and r m;0 , respectively. In each filtering step, we utilize the HMM strategy described above to obtain a prior estimate X  and covariance R mþ1 mþ1 . Notice that  these prior statistical estimates depend on the microscale forecast state ym;N from the HMM. To complete a cycle of filter, we þ can employ any desirable filtering strategies [20–26] to obtain a posterior estimate X þ mþ1 with covariance Rmþ1 . To repeat the macro-filter cycle, we need to reinitialize the microscopic state at time m + 1. The standard HMM approach suggests to reinitialize as follows,

yþmþ1;0 ¼ ym;N :

ð18Þ

Probabilistically, the macro-filter in (17) estimates the following conditional (or posterior) distribution

PðX mþ1 jV mþ1 ; ym;N Þ / PðX mþ1 jym;N ÞPðV mþ1 jX mþ1 ; ym;N Þ;  where PðX mþ1 jy m;N Þ is the prior distribution that is numerically approximated by the HMM strategy and PðV mþ1 jX mþ1 ; ym;N Þ is the observation likelihood function. The macro-filter above, however, is sensitive to variation of parameters, that is, we need to keep the number of components in AX m small for numerical efficiency but we also need enough components to obtain an accurate effective dynamics in (12). This practical constraint is undesirable in practice, in particular when the microscopic dynamics is not strongly mixing (not equilibriates yet) on the macroscopic observation time scale D t. Next, we will focus our study on the regime where the microscopic dynamics for constant X is not mixing at all (the temporal correlation function does not decay to zero) such that the HMM alone is not an appropriate numerical approach [27] and we will introduce one way to resolve this issue.

3. The ‘‘micro-filter’’ For the case when the microscopic dynamics is not mixing within the macroscopic time scale, the HMM alone is not an appropriate approach [27]. As a consequence, the microscopic state at any times retains a memory of its initial condition and hence it is not appropriate to reuse the previous step microscopic state in (16) or (18) as initial condition even when the microscopic model depends on the macroscopic variables directly, i.e., g(X, y) in (2). The main idea here is to replace the standard microscopic reinitialization (or reconstruction) in (16) or (18) with initial conditions that reflect the unbiased macroscopic posterior estimate X þ mþ1 obtained from the macro-filter above. This turns out to be the standard inverse problem [28] to solve

X þmþ1 ¼ hðymþ1;0 Þ þ gmþ1 ;

ð19Þ

for ym+1,0, provided that h is a known observation operator that maps the microscopic state to the macroscopic state and gm is Gaussian noise with variance Rþ m . The standard approach for solving the inverse problem in (19) is to solve the following least squares problem

 2 min X þmþ1  hðyÞRþ ; y

mþ1

kuk2R

where we denote ¼ uT R1 u. The inverse problem in (19), even without noise, is not always a well-posed problem [29,28] since h may not be invertible when the dimensionality of X is different from the dimensionality of y as often encountered in practice. Moreover, the solutions may not be unique even when the inverse exists and the inverse function may not even be continuous. The latter may cause the inverse to be sensitive to perturbations of X. A well-known approach to avoid this illposedness is to solve a perturbed least squares problem with the hope that the perturbed problem is convex. This strategy is known as the Tikhonov regularization [30]. Motivated by this approach, we consider the following generalized Tikhonov regularization,

 2 min X þmþ1  hðyÞRþ y

mþ1

 2   þ y  ym;N   ; r m;N

ð20Þ

748

J. Harlim / Journal of Computational Physics 230 (2011) 744–762

where r  m;N is the microscopic prior error covariance obtained from integrating the microscopic dynamics with constant k; þ X ¼ Xþ m and initial covariance r m;0 (when ensemble based covariance is used, the prior ensemble, ym;N , is readily available from step 1 in the HMM discussed in Section 2). From the Kalman filter perspective, the minimization problem in (20) is equivalent to solving the following micro-filtering problem:

dy 1 ry _ ¼ gðX þm ; yÞ þ pffiffiffi W ; dt   y

ð21Þ

X þmþ1 ¼ hðymþ1;0 Þ þ gmþ1 : Here, we can employ any desirable filtering technique (not necessarily Kalman based) to obtain the best posterior estimate þ yþ mþ1;0 and error covariance r mþ1;0 . Then we can restart the macro-filter. Probabilistically, the micro-filter samples initial conditions from the following conditional posterior distribution,

Pðymþ1;0 jX þmþ1 Þ / Pðymþ1;0 ÞPðX þmþ1 jymþ1;0 Þ;

ð22Þ y m;N

r m;N

PðX þ mþ1 jymþ1;0 Þ

where P(ym+1,0) is the prior distribution with mean and covariance whereas is the likelihood function with mean h(ym+1,0) and covariance Rþ mþ1 . Suppose now that the observation operator h is unknown. Our approach follows ideas from inverse problems for finding parameters from solutions of differential equations [18]. That is, we seek for the parameter y, assuming that discrete data 0 0 ~ig points fX i¼0;1;...;I at every time step Dt where IDt < Dt are noisy solutions of a difference equation,

X i ðyÞ ¼ M½X 0 ; X 1 ; . . . ; X i1 ; y;

ð23Þ

~ 0 . In (23), M denotes any desirable finite-difference operator that approximates the solutions of with initial condition X 0 ¼ X the slow dynamics in (1). One approach to solve this inverse problem is to minimize the following nonlinear least squares problem,

min y

I X

~ i  X i ðyÞk2 o kX Q

i¼1

mþ1;i

þ ky  ym;N k2r ; m;N

ð24Þ

~ i ¼ V ðmþ1ÞþiDt0 g regularized as in (20). To solve this problem, we need to collect asynchronous times observations, fX i¼1;...;I , if estimates at asynchronous times are not available. In (24), the observation operators Xi(y) are solutions of the difference ~ 0 ¼ X þ . Therefore, depending on M, the error covariance Q o equation in (23) with initial condition X mþ1 mþ1;i may include terms associated with the uncertainty of the initial condition X þ mþ1 , as we will show below. Generally speaking, the micro-filter solves an inverse problem of a differential equation with discrete data points consisted of the unbiased estimate from the macro-filter and observations at asynchronous times when they are available. This approach assumes that y is constant within the time interval [0, IDt0 ]. For more than one asynchronous time observations, I > 1, hybrid variational and ensemble based filtering techniques [31– 34] may be used to obtain an ensemble of y with ensemble average that minimizes the cost function in (24). Later in Section 4.5, we will show numerical results for multiple asynchronous times observations, I > 1, without using any ensemble filter. Now, let us focus on a special case with I = 1 and the Euler method so that exact ensemble filtering formula can be used to understand the advantage and the weakness of the proposed scheme. Here, the discretized macroscopic equation is

pffiffiffiffiffiffiffi X mþ10 ¼ X mþ1 þ Dt 0 f ðX mþ1 ; ymþ1 Þ þ rX Dt 0 N ð0; 1Þ

ð25Þ

0

with discrete time step Dt < Dt such that tmþ10  t ðmþ1ÞþDt0 is in between macroscopic observation times, t mþ1 < t mþ10 < tmþ2 ¼ t mþ1 þ Dt. In this simple situation, we can explicitly write the observation model as follows:

V mþ10 ¼ X þmþ1 þ Dt 0 f ðX þmþ1 ; ymþ1 Þ þ gmþ1 ;

ð26Þ

where V mþ10  V mþ1þDt0 and gm+1 is Gaussian observation noise with mean zero and covariance



Q omþ1  gmþ1 gTmþ1 ;

ð27Þ

where the angle bracket denotes expectation. This strategy, of course, is only useful when asynchronous time observation V mþ10 is available. In our numerical experiments in Section 4, we will vary Dt0 and we will check the robustness of the filter toward variation of Dt0 . The computation of the covariance in (27) is not trivial in general but for a special case where f is in the following form,

f ðX; yÞ ¼ f1 ðXÞ þ f2 ðyÞ;

ð28Þ

we can rewrite the observation model in (26) in an explicit form,

V mþ10  X þmþ1  Dt 0 f1 ðX þmþ1 Þ ¼ Dt0 f2 ðymþ1 Þ þ gmþ1

ð29Þ

with error covariance obtained by substracting (25) from (29) such that

D D 2 E T E Q omþ1 ¼ Ro þ Rþmþ1 þ r2X Dt 0 þ ðDt0 Þ2 f1 X þmþ1  f1 ðX mþ1 Þ  2Dt 0 f1 X þmþ1  f1 ðX mþ1 Þ X þmþ1  X mþ1 :

ð30Þ

J. Harlim / Journal of Computational Physics 230 (2011) 744–762

749

Notice that this adjusted covariance matrix is a linear function of the observation error covariance Ro and a quadratic function of asynchronous time Dt0 . In a one-dimensional case X; y 2 R, it is easy to see the limitation of this approach. That is, when Q omþ1 is much larger than the prior error covariance r  m;N , the Kalman weight,

K y;mþ1 Dt 0 ¼

r m;N ðDt0 Þ2 r m;N ðDt 0 Þ2 þ Q 0mþ1

0;

ð31Þ

and consequently the Kalman filter posterior solution is nearly identical to the prior forecast state,

yþmþ1;0 ¼ ð1  K mþ1 Dt 0 Þym;N þ K mþ1 Hmþ1 V mþ10 ; X þmþ1 ym;N ;

ð32Þ

Hmþ1 V mþ10 ; X þmþ1  V mþ10  X þmþ1  Dt0 f1 X þmþ1

ð33Þ

where

is the modified observation (or the LHS of Eq. (29)). The condition in (31) that yields (32) suggests that when the observation model uncertainty is much larger than the prior uncertainty, the micro-filter is ineffective and we are more or less reinitializing the microscopic state as in the standard HMM in (16) or (18). In Section 4.3 below, we analytically and numerically find that the filtering problem with linear slow dynamics produces more accurate filtered solutions as Dt increases and Ro = 0. For observation model in (29), the micro-filter samples the following conditional distribution:

P ymþ1;0 jHmþ1 V mþ10 ; X þmþ1 / Pðymþ1;0 ÞP Hmþ1 V mþ10 ; X þmþ1 jymþ1;0 ; where the prior distribution P(ym+1,0) is identical to that in the Bayesian relation in (22), but the likelihood o 0 P Hmþ1 V mþ10 ; X þ mþ1 jymþ1;0 has mean Dt f2(ym+1,0) and covariance Q mþ1 in (30). In the numerical simulations below, we will test the Macro–Micro-Filter simultaneously for various regimes. We will also compare this approach with the macro-filter alone and with an intermediate approach, macro-filter + inverse. The latter approach is simply the macro-filter with microscopic states reinitialized by explicitly solving (29) for y with zero noise, gm = 0. 4. Numerical results on the stochastically forced FitzHugh–Nagumo model We consider the stochastically forced FitzHugh–Nagumo (FN) model [35,36],

dX _ X; ¼ y  bX þ a þ rX W dt   dy 1 1 ry _ ¼ y  y3  X þ pffiffiffi W ; dt  3  y

ð34Þ

as a testbed model for filtering. We choose parameters a = 0.7, b = 0.8,  = 0.01 such that the deterministic dynamical system has a single excitable stable fixed point (see the nullclines and phase space plot in Fig. 1) [37,38]. This model was introduced to describe the electrical behavior of a single neuron in the spatially homogeneous membrane; X represents the slower response (or relaxation) whereas y represents the faster excitable membrane potential. We consider white noises with fixed strengths rX = 0.1 and ry = 0.01 as the source of the chaotic excitation behavior (see [39] for detailed discussions on the noise induced chaos for this model in different parameter regimes). The dynamics in this model stays near the stable equilibrium point most of the time (see the distribution of the variables in Fig. 1) and there are occasional large excursions/spikes in the microscopic dynamics induced by the stochastic noises at random times (in the phase plane, the state jumps from the bottom left-leg to the bottom right leg of the null cline, see Fig. 1). Simultaneously, the macroscopic variable, X, increases (with a slower rate jdX/dtj relative to jdy/dtj) until it reaches a threshold near 1 in which the microscopic variable y suddenly drops to 2 whereas X relaxes, again, slowly. Filtering solutions from the FN model in (34) through observations of only the macroscopic variable is indeed a difficult task because the important ‘‘engine’’ that drives the interesting excitation behavior, the microscopic variable, is completely hidden and the microscopic dynamics with frozen X = x is not mixing at all. As a consequence, we will see below that accurate real-time prediction of X is not possible without proper reinitialization of y even if direct numerical simulation (and of course HMM) is implemented with a relatively precised X. The true trajectory shown in Fig. 1 is generated by solving the FN model with the Euler method, resolving at every microtime-step dt = 103 (this time step is sufficient to produce numerical accuracy, see [11]). Note that we can use any higher order scheme here, however, different schemes will introduce additional terms in the covariance Qo in (30) when the Euler method is utilized to approximate the linear observation model as in (29). We simulate observations at every macro-timestep Dt = 0.5 by adding a draw from a Gaussian noise with mean zero and covariance Ro = 0.01 to X (this is roughly 10% of the variance of X). In Section 4.3, we will also show results with larger noise variance. This observation time is large enough such that there are 500 micro-steps in between observation time, Dt = 500dt. To measure the filter performance, we compute the average RMS error and the average autocorrelation function between the macroscopic true state Xt and the macroscopic  ensemble mean estimate X,

750

J. Harlim / Journal of Computational Physics 230 (2011) 744–762

Fig. 1. FitzHugh–Nagumo model: the time series of the macroscopic (bold) and microscopic (thin) variables for the FitzHugh–Nagumo model (upper left); the phase portrait and its associated null clines (upper right); the distributions of each variable (bottom left); and the distribution of the amplitude of the rate of change in log scale (bottom right).

1

  X t Þ2 i2 ; RMSE ¼ hðX T   hXi  ÞðX t  hX t i Þi hðX T T T ; Corr ¼   hXi  Þ2 i hðX t  hX t i Þ2 i hðX T T T T

ð35Þ ð36Þ

In (35) and (36), the angle bracket h iT denotes a temporal average. These averages are taken over T = 5000 assimilation cycles ¼X  þ . To with a same set of true signal and observations. To quantify the filter skill (Sections 4.1, 4.2, 4.3 and 4.5), we set X  ¼X . quantify the prior forecast skill (see Sections 4.3, 4.4, 4.5), we set X 4.1. Filter skill with expensive direct numerical simulations In this section, we discuss numerical results with expensive direct numerical simulations (DNS), that is, every filtering strategy in this section uses prior statistics that are obtained by propagating the model with a micro-time-step dt = 103. The goal of this section is to show numerical evidence of the practical difficulties mentioned in Section 1: the linear practical observability and the expensive cross-correlation terms for accurate filtered solutions. We will also use the best result obtained here as a benchmark for assessing the filtering skill with cheaper strategies in the next section. In Fig. 2, we show the time series of the filtered solutions (thick solid), obtained from the standard Extended Kalman Filter (EKF) [4,5] for various observation times Dt = 50 = 0.5, Dt = , and Dt = 0.1 = dt. In the same plot, we also show the observations (circle) on the top panel and the true signal (dashes). The numerical results suggest that the filtered solutions for X (first column) are more accurate as observation time decreases. This improvement is not so surprising since the observations noise variance is rather small Ro = 0.01 such that X will trust these observations. However, as one checks the microscopic variable, y (second column), it becomes apparent that when the practical observability condition is not satisfied (see Eq. (9)), the filtered solutions do not always track the spikes (with large signal-to-noise ratio) for any observation time, Dt, even when the linearized deterministic dynamical operator is stable near the equilibrium point. The significance of practical observability for large signal-to-noise ratio was also reported in [2] in the context of filtering sparsely observed turbulent signals. We can suppress the effect of violating the linear observability condition in (9) by considering an ensemble filtering strategy that uses nonlinear prior informations. To see this, we show the filtered solution as a function of time in Fig. 3 with a deterministic ensemble Kalman filter, Ensemble Transform Kalman Filter (ETKF) [22]. In this paper, we implement ETKF with the ‘‘spherical simplex’’ solutions [40] that satisfy the mean-preserving property discussed in [25]. Here, the ETKF is

751

J. Harlim / Journal of Computational Physics 230 (2011) 744–762

X: 2D−EKF with Δ t=0.5=50ε

2

y: 2D−EKF with Δ t=0.5=50ε

3

1.5

2

1 1

0

y

X

0.5

−0.5

−1

−1

obs true estimate

−1.5 −2

0

5

15 time

20

25

−3

30

0.5

2

0

1

−0.5

0

−1

−1

−1.5

−2

−2

0

5

10

15 time

20

25

−3

30

x: 2D−EKF with Δ t=δ t = 0.1ε

1

0

1

−0.5

0

y

2

−1

−1

−1.5

−2

0

5

10

15 time

20

5

25

30

−3

10

15 time

20

25

30

25

30

y: 2D−EKF with Δ t=ε

0

5

10

15 time

20

y: 2D−EKF with Δ t=δ t = 0.1ε

3

0.5

−2

0

3

y

X

10

−2

x: 2D−EKF with Δ t=ε

1

X

0

0

5

10

15 time

20

25

30

Fig. 2. Two-dimensional Extended Kalman Filter: X (first column) and y (second column) as functions of time for observation times Dt = 50 = 0.5 (top), Dt =  (middle), and Dt = 0.1 = dt (bottom). In each panel, we plot the filtered solution or posterior mean state (thick solid line), observations on X only (circle), and the true signal (dashes).

performed on the two-dimensional FN system in (34) with 20 ensemble members. The filtered solutions track the true signal quite accurately with average RMS error 0.21 and average correlation 0.82 (see Table 1) although there are times when it generates a spurious spike (see Fig. 3 near 500 time units) and misses the true spike near 465 time units. We also include filtered solutions with 2D-EnKF with perturbed observations [20] and find that the filtered result is much improved with RMS error 0.13 and correlation 0.93 (see Fig. 4) compared to the 2D-ETKF above. Note that the accurate filtered solutions with 2D-ETKF and 2D-EnKF rely heavily on relatively accurate cross covariance estimates from the expensive direct numerically simulated ensemble of solutions. If we ignore these cross covariance estimates and perform only a one-dimensional linear Kalman filter on the linear macroscopic dynamics, the filter solution (see Fig. 5) is not accurate at all near the spikes although its average RMS error is 0.25 and correlation is 0.75 as shown in Table 1. In the same figure, we also show the failure of tracking the unobserved microscopic variable y even when the full system is solved with DNS.

4.2. Filter skill with numerically faster algorithms In this section, we present numerical results with cheaper algorithms proposed in Sections 2 and 3. First, we discuss the results with the macro-filter alone (see Section 2 for the scheme detailed). In this simulation, we implement the macro-filter

752

J. Harlim / Journal of Computational Physics 230 (2011) 744–762 2D−ETKF with DNS 1

X

0.5

0

−0.5

−1 450

455

460

465

470

475

480

485

490

495

500

455

460

465

470

475 time

480

485

490

495

500

3 2

y

1 0 −1 −2 −3 450

Fig. 3. Two-dimensional ETKF with 20 ensemble members and Direct Numerical Simulation (DNS). In each panel (the top for X and the bottom for y), we show the filtered solution or posterior mean state (thick solid line), observations (circle), and the true signal (dashes).

Table 1 The average RMS error and correlation between the posterior mean state and the true signal for various filtering schemes, averaged over 5000 assimilation cycles. Scheme

RMS

Corr

2D-EKF DNS 2D-ETKF DNS 2D-EnKF DNS 1D-KF DNS Macro-filter Macro-filter + inverse Macro–Micro-Filter

0.33 0.21 0.13 0.25 0.35 0.25 0.19

0.91 0.82 0.93 0.75 0.90 0.80 0.87

with ETKF with ensemble size 20 and we integrate the microscopic dynamics to obtain AX m in (14) for N = 250 steps (half of Dt/dt = 500, the number of micro-time-steps in between observation time Dt). We find that the filtered solution is inaccurate (see Fig. 6); this failure is expected since the microscopic dynamics is not mixing and hence the macro-filter alone with standard reinitialization as in (18) is an ill-advised approach. As an alternative intermediate approach, we implement the macro-filter and reinitialize the microscopic states by directly inverting Eq. (19) with zero noise, gm = 0. We call this approach the macro-filter + inverse. For general problems, the inverse may not exist as we discussed in Section 3 and regularization may be needed [30]. For simple scalar macroscopic dynamics with separable dynamics as in (28), the inverse is easily computed by solving (29) for ym+1 with asynchronous observation at time Dt0 = 0.35 and gm+1 = 0. In Fig. 7, we show the filtered solutions as functions of time with this approach, macro-filter + inverse, in which we use ETKF for the macro-filter (again with 20 ensemble members and N = 250 microsteps). Notice that the filtered solution, X, tracks all the excitation spikes on this time interval but the solutions look very noisy in the quiescent states (e.g., between 480 and 485 time units). Here, the corresponding average RMS error is 0.25 and the correlation is 0.80 (see Table 1). In Fig. 7, we also show the initial conditions ym,0 (thick solid line in the bottom panel), obtained by directly inverting the noiseless (29), and the short (N = 250) microscopic forecast (thin solid line) that is performed in each HMM step initiated from ym,0. The rapid change in the microscopic forecasts (see the thin solid line) provides an indication that we can, in fact, reduce the noise in the filtered solution in X by decreasing the micro-steps N; the appropriate choice of N can be found empirically. In Fig. 8, we show the main result of this section using the Macro–Micro-Filter. Here, we implement ETKF on both the macro- and the micro-filters, each with ensemble size 20. We implement the HMM by only integrating the microscopic dynamics for N = 100 micro-steps (this is ad hocly chosen). In Fig. 8, we see that the filtered solution (thick solid line) tracks the true signal (dashes) accurately with low average RMS error 0.19 and high correlation 0.87 (see Table 1). In some regimes

753

J. Harlim / Journal of Computational Physics 230 (2011) 744–762 2D−EnKF with DNS 1

X

0.5

0

−0.5

−1 450

455

460

465

470

475

480

485

490

495

500

455

460

465

470

475 time

480

485

490

495

500

3 2

y

1 0 −1 −2 −3 450

Fig. 4. Two-dimensional EnKF with 20 ensemble members and Direct Numerical Simulation (DNS). In each panel (the top for X and the bottom for y), we show the filtered solution or posterior mean state (thick solid line), observations (circle), and the true signal (dashes).

1D−KF with DNS 1

X

0.5

0

−0.5

−1 450

455

460

465

470

475

480

485

490

495

500

455

460

465

470

475 time

480

485

490

495

500

3 2

y

1 0 −1 −2 −3 450

Fig. 5. One-dimensional linear Kalman filter on the linear macroscale dynamics and DNS: observability condition is fully satisfied but the filtered solution is not accurate at all since the cross covariances between the macroscale and microscale dynamics are completely ignored. Filtered solution or posterior mean state (thick solid line), observations (circle), and the true signal (dashes). In the bottom panel, the thick solid line denotes the DNS integration since we do not filter this variable.

such as between times 478–480, the filtered solution, however, does not precisely track the true signal; this failure is partly due to the fact that the proposed observation model Hm in (33) does not even recover the y-spike correctly (see the circle in the lower panel of Fig. 8) and partly due to the large uncertainty in the observation model in (29) as discussed in Section 3. We will discuss this limitation in more detail in the next section.

754

J. Harlim / Journal of Computational Physics 230 (2011) 744–762 Macro−Filter 1

0.5

X

0

−0.5

−1 450

455

460

465

470

475

480

485

490

495

500

455

460

465

470

475 time

480

485

490

495

500

3 2

y

1 0 −1 −2 −3 450

Fig. 6. Macro-filter: implemented with ETKF, ensemble size of 20, and total micro-steps N = 250, which is one half of Dt/dt = 500. Filtered solution or posterior mean state (thick solid line), observations (circle), and true signal (dashes). In the bottom panel, the thin solid line denotes the N = 250 steps of micro-step dt integration with the standard HMM.

Macro−Filter + inverse 1

X

0.5

0

−0.5

−1 450

455

460

465

470

475

480

485

490

495

500

455

460

465

470

475 time

480

485

490

495

500

3 2

y

1 0 −1 −2 −3 450

Fig. 7. Macro-filter + inverse: implemented with ETKF, ensemble size of 20, asynchronous observation time Dt0 = 0.35, and total micro-steps N = 250 which is one half of Dt/dt = 500. Filtered solution or posterior mean state (thick solid line), observations (circle), and true signal (dashes). In the bottom panel, the thin solid line denotes the N = 250 steps microscopic forecast and the thick solid line denotes the initial ym,0 = Hm/Dt0 , obtained by directly inverting the observation model in (29) with zero noise and Hm defined in (33).

4.3. Sensitivity to variation of parameters and limitation In this section, we study the sensitivity of the Macro–Micro-Filter toward variation of parameters. First, we should mention that we do not find significant difference in terms of filtering skill for various ensemble size and various micro-steps N

755

J. Harlim / Journal of Computational Physics 230 (2011) 744–762 Macro−Micro−Filter 1

X

0.5

0

−0.5

−1 450

455

460

465

470

475

480

485

490

495

500

455

460

465

470

475 time

480

485

490

495

500

3 2 1 y

0 −1 −2 −3 −4 450

Fig. 8. Macro–Micro-Filter: implemented with ETKF, ensemble size of 20, asynchronous observation time Dt0 = 0.35, and total micro-steps N = 100 which is one fifth of Dt/dt = 500. Filtered solution or posterior mean state (thick solid line), observations (circle), and true signal (dashes). In the bottom figure, since we do not really observe y, the circle here denotes the proposed observation Hm/Dt0 in the y-space, where Hm is defined in (33).

0.4 posterior prior

RMS error

0.35 0.3 0.25 0.2

0.05

0.1

0.15

0.2

0.1

0.15

0.2

0.25

0.3

0.35

0.4

0.45

0.5

0.25

0.3

0.35

0.4

0.45

0.5

Average correlation

0.9

0.8

0.7

0.6

0.5 0.05

Asynchronous time (Δ t’)

Fig. 9. Average RMS errors and correlations as functions of asynchronous observation time Dt0 ; the filtering skill improves as Dt0 increases (see text for detailed explanation).

except that the filter completely diverges when the ensemble size is too small (results are not shown because this is the typical divergence as often encountered in implementing ensemble filtering [41]). Our main interest here is to understand the filter sensitivity toward variation of asynchronous observation time Dt0 . In Fig. 9, we show the average RMS error and correlation for posterior (solid) and prior (dashes) for simulations with various Dt0 ranging from 0.05 to the observation time Dt = 0.5. Notice that the filter skill improves as Dt0 increases and this

756

J. Harlim / Journal of Computational Physics 230 (2011) 744–762

improvement can be explained by the following linear theory. Consider noiseless observations with Ro = 0 and assume that the macro-filter is stable with asymptotic covariance Rþ Rþ 1 m . For the scalar case, the asymptotic formulation in [1,42] suggests that

Rþ1 ¼ Ro K X;1 ¼ 0;

ð37Þ 0

where KX,1 is the asymptotic Kalman gain of the macro-filter. Substituting R ¼ tion model in (29) has errors with asymptotic covariance matrix

Rþ 1

¼ 0 into Eq. (30), our proposed observa-

Q o1 ¼ OðjDt 0 j2 Þ:

ð38Þ

If the Micro-Filter is stable, the asymptotic Kalman gain for the microscopic process is explicitly [1,42] given by

~; ~zÞDt 0 ¼ K y;1 ðy

1 ~  ~z þ ½ð1  y ~  ~zÞ2 þ 4y ~1=2 Þ; ð1  y 2

ð39Þ

where

r2y ðDt0 Þ3

¼ OðDt 0 Þ; jF y j2 Q o1 1 ~z ¼ > 1: jF y j2 ~¼ y

ð40Þ ð41Þ

Here Fy denotes a stable linearized dynamical operator for the microscopic dynamics and ry denotes the associated noise strength. It can be shown that the asymptotic Kalman gain in (39) satisfies the following conditions:

K y;1 ð0; ~zÞDt0 ¼ 0 8~z P 1; ~; ~zÞDt 0 is monotonically increasing in y ~ for fixed ~z; K y;1 ðy 0 ~; ~zÞDt ! 1 for y ~ ! 1: K y;1 ðy

ð42Þ

These conditions, together with the discussion in Section 3 (see (31) and (32)), imply that the micro-filter is more effective when Dt0 is large. In Fig. 10, we show snapshots of time series for simulations with asynchronous time Dt0 = 0.3, which is slightly shorter than that used in Fig. 8. Here, we see that there are two occasions where the filtered solutions do not capture the spikes (between times 478–480 and 488–490). If one looks carefully to the bottom panel in this figure, notice that every time the

Macro−Micro−Filter 1

0.5

X

0

−0.5

−1 470

472

474

476

478

480

482

484

486

488

490

472

474

476

478

480 time

482

484

486

488

490

3 2

y

1 0 −1 −2 −3 470

Fig. 10. Macro–Micro-Filter with Dt0 = 0.3. In each panel we show the observations (circle), true signal (dashes), filtered solutions (thick solid line), and the associated Kalman weights, KX and KyDt0 , (dot-dashes) that fluctuate between 0 and 1. In the top panel, we also plot the asynchronous time observations V mþ10 (square). In the bottom panel, the circle denotes Hm/Dt0 from (33) and the thin solid line denotes the short forecast in the HMM algorithm with N = 100.

757

J. Harlim / Journal of Computational Physics 230 (2011) 744–762

microscopic Kalman gain, KyDt0 , (dash-dotted line that fluctuates between 0 and 1) is near zero before the maximum of the spike the filter solution cannot capture the spike. This is an instance where the filter needs the Kalman weight to be closer to one but the uncertainty of the proposed observation model in (29) is too large. Finally, we also show results with larger observation covariance Ro = 0.25 (see Fig. 11). Here, there is a lag in capturing the spike near 455–460 time units but the Macro–Micro-Filter is still effective most of the time considering how noisy the observation is and how cheap this strategy is numerically. 4.4. Prior forecast skill In this section, we discuss the prior forecast skill of two DNS strategies discussed earlier in Section 4.1, the one-dimensional linear Kalman filter and the two-dimensional EnKF, and another two strategies from Section 4.2, the macroscale + inverse and the Macro–Micro-Filter. Since asynchronous time observations (with Dt0 = 0.25) are used in the last two strategies, we also show results for 2D-EnKF and 1D-KF accounting all observations (including the asynchronous times observations) at every Dt = 0.25. The main result is shown in Table 2 and Fig. 12; the RMS and Corr in Table 2 correspond to the average RMS error and correlation between the true signal and the prior mean state. We see that without appropriate initialization in the microscopic dynamics (1D-KF), the prior forecast (thick solid line in Fig. 12) is not skillful at all with correlation less than 0.5, even when asynchronous observations are accounted. On the other hand, we see some improvement on the prior forecasts with 2D-EnKF with higher correlation 0.69 when asynchronous time observations are accounted. When direct inversion is used for the microscopic initialization (macroscale + inverse), the forecast skill improves moderately with correlation near 0.6; see the noisy forecast in Fig. 12. With the Macro–Micro-Filter, we achieve a more improved forecast skill with RMS 0.32 and correlation 0.69 (see Fig. 12). In the next section, we will see a much improved prior forecast skill when multiple asynchronous times observations are used. 4.5. Multiple asynchronous times observations In this section, we consider the Macro–Micro-Filter with multiple asynchronous times observations, I > 1. In our implementation, we apply the Macro-Filter with a one-dimensional linear Kalman filter since the slow dynamics in the Fitzhugh–Nagumo system in (34) is linear and thus no ensemble is used. Subsequently, we implement the HMM with an average microscopic state obtained from a single trajectory of solutions in subset AX of (13) with NT = 2 and N = 50. In the micro-filter, we use the standard nonlinear least squares (NLSQ) solver, including the Gauss–Newton iteration and the

Macro−Micro−Filter 2 1

X

0 −1 −2 −3 450

455

460

465

470

475

480

485

490

495

500

455

460

465

470

475

480

485

490

495

500

4 2

y

0 −2 −4 −6 450

o

Fig. 11. Macro–Micro-Filter with large observation noise R = 0.25: implemented with ETKF, ensemble size of 20, asynchronous observation time Dt0 = 0.4, and total micro-steps N = 100 which is one fifth of Dt/dt = 500. Filtered solution or posterior mean state (thick solid line), observations (circle), and true signal (dashes). In the bottom figure, since we do not really observe y, the circle here denotes the proposed observation Hm/Dt0 in the y-space, where Hm is defined in (33).

758

J. Harlim / Journal of Computational Physics 230 (2011) 744–762 Table 2 The average RMS error and correlation between the prior mean state and the true signal for various filtering schemes, averaged over 5000 assimilation cycles. Scheme

RMS

Corr

1D-KF DNS 2D-EnKF DNS 1D-KF DNS with async obs 2D-EnKF DNS with async obs Macro-filter + inverse Macro–Micro-Filter

0.40 0.37 0.56 0.32 0.45 0.32

0.20 0.48 0.16 0.69 0.58 0.69

X

1D−KF

2D−EnKF

1

1

0.5

0.5

0

0

−0.5

−0.5

−1 450

460

470

480

490

500

−1 450

X

1D−KF w async obs 1

0.5

0.5

0

0

−0.5

−0.5

460

470

480

490

500

−1 450

460

X

Macro−Filter + inverse 1

0.5

0.5

0

0

−0.5

−0.5

460

470

480

490

time

480

490

500

470

480

490

500

490

500

Macro−Micro−Filter

1

−1 450

470

2D−EnKF w async obs

1

−1 450

460

500

−1 450

460

470

480 time

Fig. 12. The mean prior forecast state as a function of time for various filtering schemes: the true signal (dashes); the prior forecast mean state (thick solid line).

Levenberg–Marquardt [43]. With this setup, we do not have to produce ensemble of microscopic solutions (as discussed in the general framework in Section 3 and the Appendix) and in fact we have no information about the microscopic error covariance rþ mþ1;0 associated with the minimum,

yþmþ1;0 ¼ arg min y

I X i¼1

kV mþ1þiDt0  X i ðyÞk2Q o

mþ1;i

;

ð43Þ

759

J. Harlim / Journal of Computational Physics 230 (2011) 744–762 Table 3 The average RMS error and correlation for simulations with multiple asynchronous observations I > 1 with MMF that combines the one-dimensional linear Kalman filter and a nonlinear least squares algorithm. We also include results from the 2D-ETKF and 2D-EnKF accounting all (including the asynchronous time) observations at every Dt = 0.1. Scheme

RMS post

Corr post

RMS prior

Corr prior

2D-ETKF w async obs 2D-EnKF w async obs MMF with KF and NLSQ for I = 2 MMF with KF and NLSQ for I = 3 MMF with KF and NLSQ for I = 4

0.28 0.28 0.04 0.02 0.013

0.78 0.82 0.91 0.95 0.96

0.45 0.45 0.14 0.07 0.04

0.52 0.58 0.61 0.75 0.82

Macro−Micro−Filter with NLSQ for I=4 1

X

0.5

0

−0.5

−1 450

455

460

465

470

475

480

485

490

495

500

455

460

465

470

475 time

480

485

490

495

500

2 1

y

0 −1 −2 −3 450

Fig. 13. Macro–Micro-Filter with nonlinear least square scheme for I = 4 asynchronous times observations and Dt0 = 0.1.

when standard minimization solver is used. For this problem, we do not need any regularization since the observation model Xi(y) is a linear function of y and the least squares problem is over-determined when I > 1 = Ny = dim(y). In our numerical implementation, we choose asynchronous times observations at every Dt0 = 0.1 for I = 2, 3, 4, with noise variance ro = 0.0025. For large noise variance, we expect the least squares fit to become inaccurate as discussed in Section 3 for the single asynchronous time observations case. From our simulations, we find that the MMF with NLSQ produces more accurate solutions compared to either the 2D-ETKF or the 2D-EnKF with fully resolved dynamics (see Table 3). Here, the ETKF and EnKF are implemented with an ensemble of size K = 20 and they both account all (including the asynchronous times) observations at every Dt = 0.1. We also learn that the posterior and prior forecast skills are significantly improved as I increases (see Table 3). Notice also that the average posterior RMS error is much lower than the observation noise error ffiffiffiffi p ro ¼ 0:05 with extremely high correlation 0.9, even with only I = 2 asynchronous times observations. In Fig. 13, we show a trajectory of the filtered solutions for I = 4 and we notice that the posterior state y is more accurate (see thick solid line in the second panel) compared to when only single asynchronous time observations are utilized (Figs. 7 and 8). 5. Summary and discussion In this paper, we present a numerical strategy for filtering stochastic differential equations with multiscale features. This method is designed such that it does not violate the practical linear observability condition and, more importantly, it does not require the computationally expensive cross correlation statistics between multiscale variables that are typically needed in standard filtering approach. Our proposed filtering algorithm comprises a ‘‘macro-filter’’ that utilizes some aspects of the HMM framework [16,17] and a ‘‘micro-filter’’ that essentially solves an inverse problem for parameterizing differential equations [18]. There are, of course, subtle issues on how to perform the micro-filter when the operator that maps the microscopic to the macroscopic states is unknown. In this paper, we discuss one way to approximate such operator with an approximate slow

760

J. Harlim / Journal of Computational Physics 230 (2011) 744–762

dynamics through explicit discretization. This approach is only useful when macroscopic observations at asynchronous times are available and, so far, we only address the separable form slow dynamics in (28). We had also thought about alternative strategies such as using implicit schemes to avoid the need for asynchronous times observations. However, we found through various numerical tests (not shown in the paper) that the observation model error with implicit discretization is too large and the only way to avoid this error is to solve the filter with the corresponding implicit scheme which is undesirable in practice since it is expensive. In Section 4.3, we discuss thoroughly the sensitivity of the proposed approach toward variation of the asynchronous observation time for a single asynchronous time observations and linear dynamics; we find that the filter skill improves as the asynchronous time increases. Therefore, we also discuss the limitation of the proposed approach. In this paper, we test our proposed filtering strategy on the stochastically forced FitzHugh–Nagumo model. This simple noise induced chaotic fast-slow system turns out to be a difficult testbed when the microscopic state is hidden with nonmixing dynamics. This non-mixing condition makes the HMM approximation invalid [16,27] and as a consequence, expensive DNS is needed whenever classical filtering approach is used. In our numerical simulations in Section 4, we demonstrate the practical difficulty with DNS including the violation of practical linear observability condition and the needs for the computationally expensive cross covariances between multiscale variables. We show that our proposed approach, the Macro– Micro-Filter, implemented with the Ensemble Transform Kalman Filter [22] for single asynchronous time observations, is skillful relative to the macro-filter alone or even to the macro-filter + inverse. Moreover, it also produces more accurate prior estimates relative to those with the expensive DNS based filters. When multiple asynchronous times observations are utilized in the micro-filter (Section 4.5), we see significant improvement with the Macro–Micro-Filter on both the prior and the posterior estimates beyond the expensive DNS ensemble based filters. In many real applications, unfortunately, these asynchronous times observations are typically sparse both in time and in space. This sparsity will create an ill-posedness in the micro-filtering step, especially when the microscopic state is large, Ny > dim(V)  I. In the future work, we will address this issue with a four-dimensional ensemble filter [44,45] as the macro-filter to produce unbiased posterior estimates of the macroscopic variables at time m + 1 as well as at asynchronous times, (m + 1) + iDt0 . From the practical standpoint, the proposed approach is computationally cheaper relative to DNS only if the microscopic integration step in the HMM is short enough, N < Dt/dt, and if the macro-filter combined with the minimization solver in the micro-filter step is faster than computing, storing, and inverting the (NX + Ny)  (NX + Ny) covariance matrix of the full system. In Section 3, we mention the need for an optimization scheme that not only solves the least squares problem in (24), but also provides the uncertainty estimates associated with the minimum. In regard to this issue, we will consider various reduced based hybrid variational-ensemble techniques [31–34] in future work. On the other directions, we are pursuing the potential of using the coarse graining strategy, developed for the hybrid deterministic-stochastic lattice systems [46], relative to the HMM algorithm to improve the prior estimates and the potential of using the superparameterization strategy [27,47] that is more promising than HMM in the regime with modest values of scale separation. The interesting scientific question in regard to these alternative strategies is to whether we need the microfilter when improved prior estimates are available. Acknowledgments This work is motivated from discussions at the SAMSI working group for the interaction between deterministic and stochastic dynamics. The author thanks Peter Kramer for sharing his knowledge on the HMM, Andrew J. Majda for his comments on the manuscript, and Emily L. Kang for her thoughts after testing the MMF algorithm on different systems. The research of the author is partially supported by the NC State University startup grant and the SAMSI teaching buyout for the Fall 2009 semester. Appendix. An ensemble based Macro–Micro-Filter for high dimensional systems In this appendix, we describe an ensemble based MMF for high-dimensional systems. In each assimilation step, the goal is þ þ þ to generate a posterior ensemble of fX þ k;mþ1 ; yk;mþ1 ; k ¼ 1; . . . ; Kg, given a posterior ensemble fX k;m ; yk;m ; k ¼ 1; . . . ; Kg at preo vious time step and observations {V(m+1)+iDt0 , i = 0, . . . , I} with covariance R . Here, K denotes the ensemble size. We proceed as follows: I. Find a prior ensemble with the HMM: þ (a) Integrate the microscopic dynamics in (2) with frozen X ¼ X þ k;m for each initial condition yk;m forward in time with a  micro-time-step dt for N steps to obtain yk;mþNdt . The computational cost reduction relative to direct numerical solvers (DNS) with micro-time-step dt is attained only when N < Dt/dt where Dt is the observation time. 1 PK   (b) Compute y mþNdt ¼ K k¼1 yk;mþNdt . þ  (c) Integrate the macroscopic dynamics in (1) with constant y ¼ y mþNdt for each ensemble member, X k;m , to obtain prior ensemble estimate, X  . k;mþ1 II. Update the macroscopic dynamics: Perform an ensemble filter with any desirable filtering strategy [20–26] and use observation Vm+1 to obtain a macroscopic posterior ensemble X þ k;mþ1 . Here, the computational cost varies depending on

761

J. Harlim / Journal of Computational Physics 230 (2011) 744–762

the choice of the filtering strategy. For example, the most expensive computation with ETKF is in computing the pseudo-inverse of a K  K matrix with SVD. This completes the Macro-Filter step. III. Micro-Filter: Solve the following nonlinear least squares problem

min

I X

y

kV ðmþ1ÞþiDt0  X i ðyÞk2Q o

i¼1

mþ1;i

mþNdt k2r ; þ ky  y

ð44Þ

m;N

 where r m;N is an empirical covariance constructed from the prior ensemble yk;mþNdt from Step I. (b). Here, asynchronous observations {V(m+1)+iDt0 }i=1,. . .,I are related to y through the following observation operator

 X i ðyÞ ¼ M X þmþ1 ; X 1 ; . . . ; X i1 ; y ;

ð45Þ

where the difference operator M is obtained through discretizing the macroscopic dynamical equation in (1) with any desirable finite-difference scheme. To account the uncertainty of each asynchronous observation, we construct covariances



Q omþ1;i  gmþ1 gTmþ1 ;

ð46Þ

where

gmþ1 ¼ V ðmþ1ÞþiDt0  X i ðyÞ:

ð47Þ Q omþ1;1

Q omþ1;i

For the Euler method with separable slow dynamical operator as in (28), is given explicitly by Eq. (30) and ¼ Ro , for i = 2, . . . , I. The computational cost depends heavily on how one minimizes (44). For practical application with ensemble based filters, one may need hybrid variational-ensemble filter methods such as [31–34] that provide an estimate for the uncertainty corresponds to the minimum that solves (44). Depending on the nature of the problem, one may also consider different types of regularization with different norms. The Macro–Micro-Filter is computationally cheaper relative to DNS only if N < Dt/dt in the HMM and if the macro-filter combined with the minimization solver in step III above is faster than computing, storing, and inverting the (NX + Ny)  (NX + Ny) covariance matrix of the full system. References [1] A.J. Majda, M.J. Grote, Explicit off-line criteria for stable accurate time filtering of strongly unstable spatially extended systems, Proc. Natl. Acad. Sci. 104 (2007) 1124–1129. [2] J. Harlim, A.J. Majda, Mathematical strategies for filtering complex systems: regularly spaced sparse observations, J. Comput. Phys. 227 (10) (2008) 5304–5341. [3] M.J. Grote, A.J. Majda, Stable time filtering of strongly unstable spatially extended systems, Proc. Natl. Acad. Sci. 103 (2006) 7548–7553. [4] B.D. Anderson, J.B. Moore, Optimal Filtering, Prentice-Hall, Englewood Cliffs, NJ, 1979. [5] C. Chui, G. Chen, Kalman Filtering, Springer, New York, 1999. [6] R.E. Kalman, R. Bucy, A new results in linear prediction and filtering theory, Trans. AMSE J. Basic Eng. 83D (1961) 95–108. [7] B. Gershgorin, A.J. Majda, A nonlinear test model for filtering slow-fast systems, Commun. Math. Sci. 6 (3) (2008) 611–649. [8] B. Gershgorin, A.J. Majda, Filtering a nonlinear slow-fast system with strong fast forcing, Commun. Math. Sci. 8 (1) (2010) 67–92. [9] B. Gershgorin, J. Harlim, A.J. Majda, Test models for improving filtering with model errors through stochastic parameter estimation, J. Comput. Phys. 229 (1) (2010) 1–31. [10] B. Gershgorin, J. Harlim, A.J. Majda, Improving filtering and prediction of sparsely observed spatially extended turbulent systems with model errors through stochastic parameter estimation, J. Comput. Phys. 229 (1) (2010) 32–57. [11] H.C. Tuckwell, R. Rodriguez, Analytical and simulation results for stochastic FitzHugh–Nagumo neurons and neural networks, J. Comput. Neurosci. 5 (1998) 91–113. [12] G.A. Pavliotis, A.M. Stuart, Parameter estimation for multiscale diffusions, J. Stat. Phys. 127 (4) (2007) 741–781. [13] T.G. Kurtz, Semigroups of conditional shifts and approximations of Markov processes, Ann. Prob. 3 (1975) 618–642. [14] G. Papanicolaou, Some probabilistic problems and methods in singular perturbations, Rocky Mt. J. Math. 6 (1976) 653–673. [15] A.J. Majda, I. Timofeyev, E. Vanden-Eijnden, Models for stochastic climate prediction, Proc. Natl. Acad. Sci. 96 (1999) 15687–15691. [16] W. E, B. Engquist, The heterogeneous multi-scale methods, Commun. Math. Sci. 1 (2003) 87–133. [17] W. E, B. Engquist, X. Li, W. Ren, E. Vanden-Eijnden, The heterogeneous multi-scale method: a review, Commun. Comput. Phys 2 (2007) 367–450. [18] G. Anger, Inverse Problems in Differential Equations, Kluwer Plenum Press, New York, 1990. [19] E. Vanden-Eijnden, Numerical techniques for multi-scale dynamical systems with stochastic effects, Commun. Math. Sci. 1 (2) (2003) 385–391. [20] G. Evensen, Sequential data assimilation with a nonlinear quasi-geostrophic model using Monte Carlo methods to forecast error statistics, J. Geophys. Res. 99 (1994) 10143–10162. [21] J.L. Anderson, An ensemble adjustment Kalman filter for data assimilation, Mon. Weather Rev. 129 (2001) 2884–2903. [22] C.H. Bishop, B. Etherton, S.J. Majumdar, Adaptive sampling with the ensemble transform Kalman filter: Part I: The theoretical aspects, Mon. Weather Rev. 129 (2001) 420–436. [23] M.K. Tippett, J.L. Anderson, C.H. Bishop, T.M. Hamill, J.S. Whitaker, Ensemble square-root filters, Mon. Weather Rev. 131 (2003) 1485–1490. [24] P.F.J. Lermusiaux, Adaptive modeling, adaptive data assimilation, and adaptive sampling, Physica D 230 (2007) 172–196. [25] P. Sakov, P.R. Oke, Implications of the form of the ensemble transformation in the ensemble square root filters, Mon. Weather Rev. 136 (2008) 1042– 1053. [26] X. Yang, T. DelSole, The diffuse ensemble filter, Nonlinear Process. Geophys. 16 (2009) 475–486. [27] A.J. Majda, M.J. Grote, Mathematical test models for superparametrization in anisotropic turbulence, Proc. Natl. Acad. Sci. 2106 (14) (2009) 5470–5474. [28] C.W. Groetsch, Inverse Problems in the Mathematical Sciences, Vieweg, 1993. [29] H.W. Engl, C.W. Groetsch, Optimal parameter choice for ordinary and iterated Tikhonov regularization, in: H.W. Engl, C.W. Groetsch (Eds.), Inverse and Ill-Posed Problems, Academic Press, New York, 1987, pp. 97–125. [30] A.N. Tikhonov, Regularization of incorrectly posed problems, Sov. Math. Dokl. 4 (1963) 1624–1627. [31] M. Zupanski, Maximum likelihood ensemble filter: theoretical aspects, Mon. Weather Rev. 133 (2005) 1710–1726.

762

J. Harlim / Journal of Computational Physics 230 (2011) 744–762

[32] J. Harlim, B.R. Hunt, A non-Gaussian ensemble filter for assimilating infrequent noisy observations, Tellus A 59 (2) (2007) 225–237. [33] O. Bashir, K. Wilcox, O. Ghattas, B. van Bloemen Waanders, J. Hill, Hessian-based model reduction for large-scale systems with initial-condition inputs, Int. J. Numer. Methods Eng. 73 (2007) 844–868. [34] H. Cheng, M. Jardak, M. Alexe, A. Sandhu, A hybrid approach for estimating error covariances in variational assimilation, Tellus A 62 (2010) 288–297. [35] R. FitzHugh, Impulses and physiological states in theoretical models of nerve membrane, Biophys. J. 1 (1961) 445–466. [36] J.S. Nagumo, S. Arimoto, S. Yoshizawa, An active pulse transmission line simulating nerve axon, Proc. IRE 50 (1962) 2061–2071. [37] J. Rinzel, Models in neurobiology, in: R.H. Enns, B.L. Jones, R.M. Miura, S.S. Rangnekar (Eds.), Nonlinear Phenomena in Physics and Biology, Plenum Press, New York, 1981, pp. 345–367. [38] J.D. Murray, Mathematical Biology, Springer-Verlag, Berlin Heidelberg, 1989. [39] D.S. Goldobin, A. Pikovsky, Antireliability of noise-driven neurons, Phys. Rev. E 73 (2006) 061906. [40] X. Wang, C.H. Bishop, S.J. Julier, Which is better, an ensemble of positive-negative pairs or a centered spherical simplex ensemble?, Mon Weather Rev. 132 (7) (2004) 1590–1605. [41] J.S. Whitaker, T.M. Hamill, Ensemble data assimilation without perturbed observations, Mon. Weather Rev. 130 (2002) 1913–1924. [42] E. Castronovo, J. Harlim, A.J. Majda, Mathematical criteria for filtering complex systems: plentiful observations, J. Comput. Phys. 227 (7) (2008) 3678– 3714. [43] C.T. Kelley, Iterative Methods for Optimization, Frontiers in Applied Mathematics, vol. 18, SIAM, 1999. [44] B.R. Hunt, E. Kalnay, E.J. Kostelich, E. Ott, D.J. Patil, T. Sauer, I. Szunyogh, J.A. Yorke, A.V. Zimin, Four-dimensional ensemble Kalman filtering, Tellus A 56 (2004) 273–277. [45] J. Harlim, B.R. Hunt, Four-dimensional local ensemble transform Kalman filter: numerical experiments with a global circulation model, Tellus A 59 (5) (2007) 731–748. [46] M.A. Katsoulakis, A.J. Majda, A. Sopasakis, Intermittency, metastability and coarse graining for coupled deterministic–stochastic lattice systems, Nonlinearity 19 (5) (2006) 1021–1047. [47] Y. Xing, A.J. Majda, W.W. Grabowski, New efficient sparse space–time algorithms for superparameterization on mesoscales, Mon. Weather Rev. 137 (2009) 4307–4323.