Lyapunov-based Adaptive State Estimation for a Class of Nonlinear Stochastic Systems Li Xie∗ and Pramod P. Khargonekar† September 9, 2009

Abstract. This paper is concerned with an adaptive state estimation problem for a class of nonlinear stochastic systems with unknown constant parameters. The systems have a linear-in-parameter structure, and the nonlinearity of the systems under consideration is bounded in a Lipschitz-like manner. The design methods we propose are based on stochastic counterparts of Lyapunov stability theory. State and parameter estimators that are exponentially ultimately bounded in mean square are obtained for both continuous-time and discrete-time nonlinear stochastic systems. Sufficient conditions for the existence of such estimators are given in terms of the solvability of related linear matrix inequalities. In addition, we introduce a suboptimality criterion as an extra constraint beyond boundedness, so that the design parameters of the estimators can be obtained in a suboptimal sense; the goal of this criterion is to minimize the upper bound on the expectation of the estimation error. By a martingale method, we demonstrate that this suboptimization procedure also minimizes an upper bound on the estimation error in the almost sure sense for continuous-time systems. Numerical examples show that the proposed estimator design methods are tractable and effective for a class of nonlinear stochastic systems.
Keywords: Adaptive state estimation, boundedness, linear matrix inequalities, Lipschitz-like conditions, Lyapunov methods, nonlinear stochastic systems, parameter identification.

1 Introduction

Adaptive state estimation for linear or nonlinear stochastic systems has been an active area of research over the past three decades. If, in the presence of unknown parameters and stochastic disturbances, a system can provide estimates of states and parameters simultaneously, in some meaningful sense, based on the related on-line measurements, we say that such a system realizes adaptive state estimation. Generally speaking, an adaptive state estimator is composed of a state estimator and a parameter estimator, and there are coupling effects between them. In order to generate

L. Xie is with the Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611 USA, and also with the Department of Automatic Control, Beijing Institute of Technology, Beijing 100081, China. Email: [email protected]. P.P. Khargonekar is with the Department of Electrical and Computer Engineering, University of Florida, Gainesville, FL 32611 USA. Email: [email protected].


acceptable estimates for both, these two estimators need to exchange information with each other. This forces both of them to work in the presence of uncertainty, since the states and parameters are not exactly known, which makes the adaptive state estimation problem very challenging. In the adaptive control and estimation literature, there are a number of algorithms that deal with the unknown-parameter issue, such as self-tuning algorithms and Bayesian multi-model estimation [2]. State estimation algorithms designed for linear systems can also be used as adaptive state estimation algorithms for nonlinear systems. For example, if the unknown parameter is a constant, an efficient algorithm is the extended Kalman filter (EKF), in which, by extending the state vector with the unknown parameter, the linear Kalman filter realizes joint parameter and state estimation. This augmentation method often results in a nonlinear system even if the original system is linear. It has been shown in [11] and [23, 22] that, due to linear approximation and the lack of exact statistical knowledge about the initial state and the noise, the performance of the EKF may degrade or the filter may diverge. One possible way to obtain the statistical properties of the noise is on-line learning. Based on the maximum likelihood principle and the properties of the innovation in linear optimal filtering, adaptive filtering algorithms with unknown prior statistics have been developed for linear systems; see, e.g., [24] and [15]. However, if adaptive filters are used for nonlinear stochastic systems, theoretical analysis is still needed; otherwise the stability or boundedness properties cannot be guaranteed, due to the linear approximation. On the other hand, in view of the augmentation method, nonlinear filters such as high-order filters using Taylor series expansions can also be used as adaptive state estimators.
It should be noted that nonlinear filters, including statistical filters in which the conditional probability density is propagated, are based on linear approximation and a priori knowledge of the noise statistics. Hence, if these filters operate in uncertain environments, their stability and boundedness properties are questionable as well. In short, the uncertainty from both the environment (noise) and the parameters, together with the nonlinearity, makes the adaptive state estimation problem very difficult.

1.1 Background

In this paper, we address this problem and present adaptive state estimators for a class of nonlinear stochastic systems, in both continuous time and discrete time, without exact statistical knowledge of the noise and without linearization. More specifically, the main goal of this paper is to study a structured adaptive estimation problem arising from a class of nonlinear stochastic systems with a linear-in-parameter structure. Taking continuous-time systems as an example, we are concerned with an adaptive state estimation problem for the following nonlinear system:

dx(t)/dt = f(x(t)) + g(x(t))θ + B(θ)w(t)   (1.1)

z(t) = h(x(t)) + Dv(t)   (1.2)

where θ is a vector of unknown constant parameters and the Gaussian white noises w(t), v(t) are independent. We also assume that z(t) is continuous in t. The problem is to design an adaptive state estimator that identifies θ and estimates the state based on the continuous observation z(t). Our main motivation for investigating this adaptive state estimation problem with such a model comes from the need to simulate and identify electroencephalogram (EEG)

signals. In [26] and [9], the model (1.1)–(1.2) is used to generate EEG-like signals by adjusting θ as slowly time-varying parameters. Hence a key issue in simulating, identifying, and controlling EEG seizure signals is how to identify θ. When B is independent of θ or equal to zero, this model also arises in adaptive control, in particular in practical situations such as process control for biochemical and chemical processes and parameter identification for active automotive suspensions; see [5] and [20]. In order to conveniently use the notation and stability results of stochastic differential equations, instead of (1.1)–(1.2) we work with the following equivalent continuous-time stochastic model:

dx(t) = f(x(t))dt + g(x(t))θ dt + B(θ)dW(t)   (1.3)

dy(t) = h(x(t))dt + DdV(t)   (1.4)

where W(t) and V(t) are independent standard Brownian motions (Wiener processes); B(θ) may also be a function of x(t). The stochastic differential equation (1.3) is an Itô form of the differential equation driven by the white noise w(t). If we introduce a new random process y(t) = ∫₀ᵗ z(s)ds, y(0) = 0, then we obtain the stochastic differential equation (1.4) from (1.2), since

y(t) = ∫₀ᵗ h(x(s))ds + ∫₀ᵗ Dv(s)ds = ∫₀ᵗ h(x(s))ds + ∫₀ᵗ D dV(s).   (1.5)

Here we have replaced the white noise v(t) by the notation "v(t) = dV(t)/dt". The adaptive state estimation problem for system (1.1)–(1.2) given the continuous measurement z(t) is then equivalent to the adaptive state estimation problem for system (1.3)–(1.4) given the continuous measurement y(t); for details, see [19, p.84] and [3, p.57].
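To fix ideas, a model of the form (1.3)–(1.4) can be simulated with a simple Euler–Maruyama discretization. The concrete choices below (f(x) = −x, g(x) = sin x, h(x) = x, B = 0.1, D = 0.05, θ = 0.5) are hypothetical stand-ins for illustration only, not the EEG model of [26] and [9].

```python
import math
import random

# Illustrative Euler-Maruyama simulation of a scalar instance of (1.3)-(1.4):
#   dx = f(x)dt + g(x)*theta*dt + B dW,   dy = h(x)dt + D dV,
# with hypothetical f(x) = -x, g(x) = sin(x), h(x) = x.

random.seed(0)

def simulate(theta=0.5, x0=0.0, dt=1e-3, n_steps=5000):
    x, y = x0, 0.0
    xs, ys = [x], [y]
    for _ in range(n_steps):
        dW = random.gauss(0.0, math.sqrt(dt))  # Brownian increment of W
        dV = random.gauss(0.0, math.sqrt(dt))  # Brownian increment of V
        # state equation (1.3): dx = f(x)dt + g(x)*theta*dt + B dW
        x = x + (-x + math.sin(x) * theta) * dt + 0.1 * dW
        # observation equation (1.4): dy = h(x)dt + D dV
        y = y + x * dt + 0.05 * dV
        xs.append(x)
        ys.append(y)
    return xs, ys

xs, ys = simulate()
print(len(xs), xs[-1])
```

Note that the estimator only sees the integrated observation y(t), consistent with the equivalence between (1.2) and (1.4) discussed above.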

1.2 Related work

For nonlinear stochastic systems, due to the nonlinearity and the Gaussian noise or Brownian motion, it seems difficult, if not impossible, to directly design an adaptive state estimator (in other words, an observer) whose estimation error approaches zero as t → ∞. For the state estimator, instead of requiring convergence of the estimation error to zero, an observer that is exponentially ultimately bounded in mean square was developed in [25], using the boundedness results of [28] for solutions of stochastic differential equations and of [1] for stochastic difference equations via Lyapunov-like methods. Specially structured nonlinear observers

dx̂(t) = f(x̂(t))dt + L(dy(t) − h(x̂(t))dt)   (1.6)

for continuous-time systems and

x̂(k + 1) = f(x̂(k)) + K(y(k) − h(x̂(k)))   (1.7)

for discrete-time systems were employed, where f(·) is the deterministic term of the differential or difference equation, not involving the unknown parameter θ, and L, K are observer gains. The method has the merits of simplicity and robustness: for example, the observer only needs bounds on the noise covariances, and the nonlinearity is limited to Lipschitz functions. For the general case, a sufficient condition for the existence of the observer was given in terms of the negative definiteness of a matrix involving derivatives of the system functions. The optimization issue was also addressed in [21]. In [27], instead of Lipschitz nonlinearity, the nonlinearity was described in a Lipschitz-like manner, and the sufficient condition for the general case was presented in terms of a Riccati equation. Numerical examples were also presented showing that the performance of the observer in [27] is better than that of the most commonly used filters for a class of nonlinear stochastic systems. For nonlinear deterministic continuous-time systems with the linear-in-parameter structure described by (1.1), an adaptive observer was presented in [4]. Deterministic Lyapunov theory was used to design a convergent adaptive observer; the sufficient condition involved a linear matrix inequality, and convergence analysis of the observer was also provided. A similar adaptive observer was used to identify parameters for active automotive suspensions [20].
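As a concrete illustration of this observer structure, the discrete-time observer (1.7) can be sketched as follows; the scalar system, noise levels, and the hand-picked gain K are hypothetical choices, not a designed gain.

```python
import math
import random

# A minimal sketch of the discrete-time observer (1.7) for a hypothetical
# scalar system x(k+1) = f(x(k)) + w(k), y(k) = h(x(k)) + v(k),
# with Lipschitz f(x) = 0.8x + 0.1*sin(x) and h(x) = x.

random.seed(1)
f = lambda x: 0.8 * x + 0.1 * math.sin(x)
h = lambda x: x
K = 0.5  # observer gain, chosen by hand for illustration

x, x_hat = 2.0, 0.0
errs = []
for k in range(200):
    y = h(x) + random.gauss(0.0, 0.01)       # noisy measurement y(k)
    x_hat = f(x_hat) + K * (y - h(x_hat))    # observer update (1.7)
    x = f(x) + random.gauss(0.0, 0.01)       # true system step
    errs.append(abs(x - x_hat))

# the estimation error does not converge to zero, but settles near the
# noise floor, i.e., it is ultimately bounded
print(errs[0], errs[-1])
```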

1.3 Contributions and organization

In this paper, we use a method similar to those of [25, 27, 4] to solve the adaptive state estimation problem for system (1.3)–(1.4). In particular, we adopt the Lipschitz-like conditions of [27] to describe the nonlinearity. This allows us to give a sufficient condition for the boundedness of the estimators in terms of a linear matrix inequality (LMI). The state and parameter estimators have the same structure as (1.6) and (1.7) of [25], since this structure is easily tractable. Adaptive state estimators that are exponentially ultimately bounded in mean square are presented for both continuous-time and discrete-time nonlinear systems with linear-in-parameter structures.

By outlining the organization of the paper, we next give an overview of our approach and contributions. In Section 2, we study the boundedness properties of Lyapunov functions. We first summarize sufficient conditions for exponential stability and boundedness in mean square in terms of Lyapunov functions. In Lemma 2.1, we give a specific expression for the expectation of Lyapunov functions, which is used to establish the main result of this section, Theorem 2.1, in which, by using the supermartingale inequality, we establish an upper bound for continuous-time Lyapunov functions in the almost sure sense. We find that there exists a coefficient which characterizes the upper bounds, for both continuous-time and discrete-time systems, in the mean square and almost sure senses. Optimizing this coefficient can then be taken as an optimality criterion for the design of the nonlinear estimators in the remaining parts of the paper. In Section 3, we deal with the adaptive state estimation problem for continuous-time nonlinear stochastic systems. The method is similar to adaptive control design methods for state estimation, e.g., that of [4]: by introducing an adaptive law, a time-varying parameter estimator is derived together with a state estimator.
Compared with [25, 27], instead of a complicated matrix equation as in [25] or a Lyapunov matrix equation as in [27], the sufficient condition is proposed in terms of an LMI. A technique of [25, 27] is improved here so that an extra condition required in both [25] and [27] is incorporated into the LMI. Moreover, based on the results in Section 2, a significant improvement is that we introduce a suboptimization procedure to minimize the upper bound on the expectation of the estimation error; this procedure is also realized by an LMI optimization. In addition, compared with [4], we do not require that the linear part of the deterministic part of the observation equation have full column rank. Instead, we make a boundedness assumption on

g(x) in (1.3). The full column rank assumption is removed by introducing an extra matrix, which is also incorporated into the LMI. In Section 4, we consider the discrete-time case; an adaptive state estimator is obtained by using the augmentation method and the discrete version of stochastic Lyapunov theory. Compared with [27], the sufficient condition, together with the suboptimization for boundedness in mean square, is presented in terms of LMIs. Numerical examples are given in Section 5 to illustrate the adaptive state estimators for both continuous-time and discrete-time systems. Concluding remarks are given in Section 6.

1.4 Notation

Throughout this paper, we use ‖x‖ to denote the Euclidean norm of a vector x and ‖A‖F the Frobenius norm of a matrix A. A′ denotes the transpose of a real matrix A. Let tr(·) denote the trace of a matrix and |·| the absolute value of a number. Let λ(·), λmax(·), and λmin(·) denote any eigenvalue, the maximum eigenvalue, and the minimum eigenvalue of a square matrix, respectively. We use the notation P > 0 (P < 0) to indicate that P is a positive (negative) definite matrix. Let the triple (Ω, F, P) be the underlying probability space, and let E denote mathematical expectation with respect to the probability measure P. Let I be the identity matrix of compatible dimension; in some places the dimension is shown explicitly, for example In×n denotes the n × n identity matrix. R+, Rⁿ, and Rⁿˣᵐ denote the set of positive real numbers, the n-dimensional Euclidean space, and the set of all n × m real matrices, respectively.

2 Upper bounds for Lyapunov functions

In this section, we study boundedness properties of Lyapunov functions for a related process x(t); the upper bounds are characterized. These properties will then be used to design and optimize the nonlinear estimators in the remaining parts of the paper. We first consider continuous-time stochastic processes described by the Itô stochastic differential equation

dx(t) = f(t, x(t))dt + g(t, x(t))dW(t)   (2.1)

with initial value x(0) = x₀, where f : R+ × Rⁿ → Rⁿ, g : R+ × Rⁿ → Rⁿˣᵐ, and W(t) is an m-dimensional Brownian motion. Throughout the paper, we assume that all standard conditions, e.g., Lipschitz and linear growth conditions on f(t, x(t)) and g(t, x(t)), are satisfied so that equation (2.1) has a unique global t-continuous solution. Consider nonnegative functions V(t, x) which are twice continuously differentiable in x and once in t. Let LV denote the differential generator of (2.1) applied to V(t, x):

LV(t, x) = Vt(t, x) + Vx(t, x)f(t, x) + (1/2)tr[g′(t, x)Vxx(t, x)g(t, x)]   (2.2)

where

Vt(t, x) = ∂V(t, x)/∂t,  Vx(t, x) = (∂V(t, x)/∂x₁, …, ∂V(t, x)/∂xₙ),  Vxx(t, x) = [∂²V(t, x)/(∂xᵢ∂xⱼ)]ₙₓₙ.   (2.3)

For any x₀, we also assume that

(A) E[∫₀ᵀ ‖Vx(t, x(t))g(t, x(t))‖² dt] is finite for any T ≥ 0.
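Before proceeding, the generator formula (2.2) can be sanity-checked numerically on a scalar example: for dx = −cx dt + K dW with V(x) = x², (2.2) gives LV(x) = −2cx² + K². The Monte Carlo comparison below, against the short-time conditional increment of E[V], is purely illustrative.

```python
import math
import random

# Numerical check of (2.2) for dx = -c*x dt + K dW, V(x) = x^2:
#   LV(x) = 2x*(-c*x) + (1/2)*2*K^2 = -2c*x^2 + K^2.
# Compare with the Monte Carlo estimate (E[V(x(dt)) | x(0)=x] - V(x)) / dt.

random.seed(2)
c, K = 1.0, 0.5
V = lambda x: x * x

def generator_mc(x, dt=1e-4, n=200000):
    acc = 0.0
    for _ in range(n):
        x1 = x + (-c * x) * dt + K * random.gauss(0.0, math.sqrt(dt))
        acc += V(x1)
    return (acc / n - V(x)) / dt

x = 1.3
exact = -2 * c * x * x + K * K   # closed form from (2.2)
approx = generator_mc(x)
print(exact, approx)
```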

We denote the set of such functions V(t, x) by

C_A^{1,2} = {V(t, x) : R+ × Rⁿ → R+, subject to Condition A}.

Condition A guarantees that the expectation of the stochastic integral in the Itô formula vanishes. There are more specific conditions under which Condition A is satisfied; see [28, 25, 21, 8, 18, 17]. For example, it suffices that Vx(t, x) grow no faster than a linear function of x and that E[‖x₀‖⁴] < ∞; see [8, p.81] and [29].

Definition 2.1. Consider the stochastic differential equation (2.1). The stochastic process x(t) is said to be exponentially ultimately bounded in mean square if there exist positive constants c₁, c₂, c₃ such that

E[‖x(t)‖²] ≤ c₁ exp(−c₂t) + c₃   (2.4)

for all t ≥ 0. Obviously, if c₃ = 0, then x(t) is exponentially stable in mean square. From Definition 2.1, one can see that for a process x(t) that is exponentially ultimately bounded in mean square, the mean square of x(t) initially decreases exponentially and in steady state remains within a certain bound. Exponential boundedness in mean square has been studied in [7, 28], and further results were given in [17]; see also Theorem 1 in [25], Lemma 1 in [23], and Theorem 5.2 in [14] for the asymptotic boundedness in mean square of stochastic differential equations with Markovian switching.

The next lemma summarizes sufficient conditions for exponential stability and boundedness in mean square in terms of Lyapunov functions. The lemma is a special case of Theorem 2.1 in [13] for stability and of Theorem 1 in [28] (or Theorem 1 in [25]) for boundedness. What we give here is a specific expression for the expectation of Lyapunov functions, which will be used to establish the main result of this section.

Lemma 2.1. Consider the stochastic differential equation (2.1). Assume that there exist a function V ∈ C_A^{1,2} and constants c₁ > 0, k₁ > 0, k₂ > 0, k₃ ≥ 0, k₁ ≠ k₃ such that

c₁‖x‖² ≤ V(t, x)   (2.5)

and

LV(t, x) ≤ −k₁V(t, x) + k₂ exp(−k₃t).   (2.6)

Moreover, E[V(0, x₀)] < ∞. Then:
(i) If k₃ > 0, x(t) is exponentially stable in mean square.
(ii) If k₃ = 0, x(t) is exponentially ultimately bounded in mean square.

Proof. We use the method of [28]. Applying the Itô formula, we have

E[V(t, x(t))] − E[V(0, x₀)] = E[∫₀ᵗ LV(s, x(s))ds].   (2.7)

Differentiating this equality with respect to t and using (2.6), we have

d/dt (E[V(t, x)] exp(k₁t)) ≤ k₂ exp((k₁ − k₃)t).   (2.8)

Integrating both sides of the last inequality from 0 to t yields

E[V(t, x(t))] ≤ (E[V(0, x₀)] − k₂/(k₁ − k₃)) exp(−k₁t) + k₂/(k₁ − k₃) exp(−k₃t),   (2.9)

from which, together with (2.5), the lemma follows.

Lemma 2.2. Consider a function V(t, x) ∈ C_A^{1,2} associated with (2.1) satisfying

LV(t, x) ≤ −k₁V(t, x) + k₂,   (2.10)

where k₁, k₂ > 0. Define another function M(t, x) by

M(t, x) = (V(t, x) + k₂/k₁) exp(−k₁t).   (2.11)

Then M(t, x) is a positive supermartingale, and

E[M(t, x)] ≤ (2k₂/k₁) exp(−k₁t) + (E[M(0, x₀)] − 2k₂/k₁) exp(−2k₁t).   (2.12)

Moreover, M(t, x) converges to zero exponentially almost surely.

Proof. Using (2.10), the generator applied to M(t, x) satisfies

LM(t, x) = L[(V(t, x) + k₂/k₁) exp(−k₁t)]
= −(k₁V(t, x) + k₂) exp(−k₁t) + LV(t, x) exp(−k₁t)
≤ −(k₁V(t, x) + k₂) exp(−k₁t) + (−k₁V(t, x) + k₂) exp(−k₁t)
= −2k₁V(t, x) exp(−k₁t) ≤ 0.   (2.13)

It is obvious that M(t, x) ∈ C_A^{1,2}. By Lemma 5.2.1 and Theorem 5.7.1 in [8], it follows from (2.13) that M(t, x) is a positive supermartingale with respect to the filtration {Ft, t ∈ R+} generated by the continuous process x(t); that is, for all t, s ∈ R+ with t > s,

E[M(t, x)|Fs] ≤ M(s, x),

and hence M(t, x) converges to a finite random variable almost surely. Furthermore,

E[M(t, x)] ≤ E[M(s, x)].   (2.14)

Note that LM(t, x) can also be bounded as

LM(t, x) ≤ −2k₁(V(t, x) + k₂/k₁) exp(−k₁t) + 2k₂ exp(−k₁t) = −2k₁M(t, x) + 2k₂ exp(−k₁t).   (2.15)

Then (2.12) follows as in (2.9) in the proof of Lemma 2.1. Together with (2.14), (2.12) implies that E[M(t, x)] → 0 as t → ∞,

monotonically and exponentially. Since M(t, x) is a positive supermartingale, it is also closed; that is, there exists an integrable random variable M∞ such that

lim_{t→∞} M(t, x) = M∞ ≥ 0,
E[M∞|Ft] ≤ M(t, x) ⟹ E[M∞] ≤ E[M(t, x)] ⟹ E[M∞] ≤ lim_{t→∞} E[M(t, x)] = 0;

see, e.g., Theorem VI.6 in [16], from which M∞ = 0 follows. That is, M(t, x) converges to zero almost surely. The almost sure exponential convergence follows from Theorem 1 in [10] or Theorem 5.8.1 of [8, p.190].

Remark 2.1. It is well known that if the process x(t) is exponentially stable in mean square, then x(t) is also exponentially stable in the almost sure sense [10]. However, this implication is not valid for exponential ultimate boundedness. For example, consider the Langevin equation of statistical physics,

dx(t) = −cx(t)dt + KdW(t),

where c, K > 0 are constants and W(t) is a one-dimensional Brownian motion. An asymptotic bound has been given in Example 2.6.4 of [12]:

lim sup_{t→∞} (1/log t) sup_{0≤s≤t} |x(s)|² = K²/c,  a.s.

This implies that x(t) is not almost surely uniformly bounded with respect to t; that is, there does not exist a constant c̄ such that |x(t)| ≤ c̄ a.s. However, x(t) is exponentially bounded in mean square, since L|x(t)|² = −2c|x(t)|² + K², which only implies that x(t) is finite almost surely. In fact, the limit process of the solution x(t) is Gaussian, called the Ornstein–Uhlenbeck process. The analytical expression of x(t) is exp(−ct)(x₀ + K∫₀ᵗ exp(cs)dW(s)); see [3, p.135] for details.

The next theorem gives an almost sure upper bound on V(t, x) and its Lyapunov exponent.

Theorem 2.1. Consider the stochastic differential equation (2.1). Assume that there exist a function V ∈ C_A^{1,2} and positive numbers k₁, k₂ such that

LV(t, x) ≤ −k₁V(t, x) + k₂

(2.16)

for all (t, x) ∈ R+ × Rⁿ. Then for arbitrary constants ε ∈ (0, k₁/2) and δ > 0, there exists a random instant t₀(ω, ε, δ, k₁) such that for all t > t₀,

V(t, x) ≤ (k₂/k₁)(2 exp(k₁δ) − 1) exp(εt) + exp(2k₁δ)(E[V(0, x₀)] − k₂/k₁) exp(−(k₁ − 2ε)t)   (2.17)

almost surely. Furthermore, the Lyapunov exponent satisfies lim sup_{t→∞} log V(t, x)/t ≤ 0.

Proof. Define a function of V(t, x) by

M(t, x) = (V(t, x) + k₂/k₁) exp(−k₁t),   (2.18)

where the positive numbers k₁, k₂ are given by (2.16). By Lemma 2.2, (2.18) transforms V(t, x) into a positive supermartingale with nice convergence properties. To establish (2.17), we use the supermartingale inequality and the first part of the Borel–Cantelli lemma. Let ε ∈ (0, k₁/2) and δ > 0 be arbitrary, and define

N(t, ε) = (2k₂/k₁) exp(−(k₁ − ε)t) + (E[M(0, x₀)] − 2k₂/k₁) exp(−(2k₁ − ε)t),
Nₖ = N(kδ, ε),  k = 1, 2, . . . .


By the supermartingale inequality (see, e.g., Theorem VI.1 in [16] and Theorem 1.8 in [14]) applied to the positive supermartingale M(t, x), we have, for any c > 0 and ε ∈ (0, k₁/2),

P(ω : sup_{(k−1)δ≤t≤kδ} M(t, x) ≥ c) ≤ E[M((k−1)δ, x)]/c,

and taking c = Nₖ₋₁ yields

P(ω : sup_{(k−1)δ≤t≤kδ} M(t, x) ≥ Nₖ₋₁) ≤ exp(−εδ(k − 1)).   (2.19)

The second inequality of (2.19) follows from the first inequality and (2.12). In view of the Borel–Cantelli lemma, since Σ_{k=1}^∞ exp(−εδ(k − 1)) < ∞, for almost all ω ∈ Ω,

sup_{(k−1)δ≤t≤kδ} M(t, x) < Nₖ₋₁   (2.20)

holds for all but finitely many k. Hence there exists a k₀(ω, ε, δ) such that, outside a P-null set, inequality (2.20) holds whenever k ≥ k₀(ω, ε, δ). Since ε ∈ (0, k₁/2), N(t, ε) is monotonically decreasing in t for t > ln 3/k₁. Meanwhile, for t > t₀ = ln 3/k₁ + δ and a given δ > 0, there must exist a positive integer k such that (k − 1)δ ≤ t < kδ. Then

M(t, x) ≤ sup_{(k−1)δ≤t<kδ} M(t, x) < Nₖ₋₁ = N((k − 1)δ, ε) ≤ N(t − δ, ε),   (2.21)

since (k − 1)δ ≥ t − δ > t₀, which guarantees that N(t, ε) is monotonically decreasing there. Hence, for t ≥ max(t₀, (k₀ − 1)δ),

M(t, x) ≤ exp(k₁δ)[(2k₂/k₁) exp(−(k₁ − ε)t) + exp(k₁δ)(E[M(0, x₀)] − 2k₂/k₁) exp(−2(k₁ − ε)t)],   (2.22)

which gives an upper bound on M(t, x). (2.17) then follows directly from (2.18) and (2.22).

We now consider (2.22). Letting t → ∞ yields

lim sup_{t→∞} (1/t) log[(V(t, x) + k₂/k₁) exp(−k₁t)] ≤ ε − k₁.

Since ε is arbitrary, we have

0 = lim sup_{t→∞} log(k₂/k₁)/t ≤ lim sup_{t→∞} log(V(t, x) + k₂/k₁)/t ≤ 0,

from which, together with

lim inf_{t→∞} log(V(t, x) + k₂/k₁)/t ≥ lim inf_{t→∞} log(k₂/k₁)/t = 0,

we have lim_{t→∞} log(V(t, x) + k₂/k₁)/t = 0, and further lim sup_{t→∞} log V(t, x)/t ≤ 0.

Remark 2.2. If k₃ = 0 in (2.6), it follows from (2.9) that

E[V(t, x(t))] ≤ (E[V(0, x₀)] − k₂/k₁) exp(−k₁t) + k₂/k₁.   (2.23)

One can see that the term k₂/k₁ defined by (2.10) is an asymptotic bound on E[V(t, x(t))]. By (2.17), we have shown that there exists an exponential upper bound for V(t, x); k₂/k₁ also characterizes the amplitude of the upper bound of V(t, x) in the almost sure sense. In the next section, we will see that, by these facts, optimizing k₂/k₁ provides an optimality criterion for estimator design. If an extra condition is introduced in Theorem 2.1, tighter bounds can be obtained by using the exponential martingale inequality; see Section 2.6 in [12] for more details.

We are now in a position to consider the exponential boundedness of discrete-time stochastic processes in mean square. Replacing the exponential term in (2.4) by (1 − q)ᵏ, q < 1, the exponential ultimate boundedness of discrete-time stochastic processes can be defined similarly. Agniel and Jury [1] gave a sufficient condition for exponential boundedness in mean square of discrete-time stochastic processes. We next recall their results [1]; see also [22, 25]. In Section 4, we will use them to design a nonlinear estimator for discrete-time systems.

Theorem 2.2. [25, 1, 22] Consider a discrete-time stochastic process xₖ. Assume that there exist a stochastic function Vₖ(xₖ) and numbers 0 < k₁ ≤ 1, k₂, k₃ > 0 such that

k₃‖xₖ‖² ≤ Vₖ(xₖ)

(2.24)

and

E[Vₖ₊₁(xₖ₊₁)|xₖ] − Vₖ(xₖ) ≤ −k₁Vₖ(xₖ) + k₂,  a.s.   (2.25)

Then the Lyapunov function Vₖ(xₖ) and the stochastic process xₖ are exponentially bounded in mean square and bounded almost surely. More specifically, we have

k₃E[‖xₖ‖²] ≤ (1 − k₁)ᵏV₀ + k₂/k₁   (2.26)

and also

k₃‖xₖ‖² ≤ k₂/k₁ − W₀,   (2.27)

where W₀ is a finite random variable.

Proof. We only need to establish (2.27); see [25, 1, 22] for the other parts of this theorem. Theorem 1 in [1] shows that the submartingale −Wₖ := k₂/k₁ − Vₖ converges to a finite random variable W as k → ∞. Hence inf_{k≥0}(−Wₖ) is finite almost surely. By (2.24), we have

k₃‖xₖ‖² ≤ k₂/k₁ − inf_{k≥0}(−Wₖ).   (2.28)

Roughly speaking, k₂/k₁ characterizes the upper bound of ‖xₖ‖. Note that the rightmost term in (2.28) depends on k₂/k₁ through the definition of −Wₖ. From (2.26) and (2.27), one can see that the term k₂/k₁ characterizes the bounds for exponential boundedness and boundedness of xₖ in the mean square and almost sure senses, respectively. This fact can be regarded as theoretical support for the optimization of the discrete-time nonlinear estimator. Note that (2.28) says that xₖ is finite and has a finite random variable as an upper bound; in this sense, xₖ is also called bounded in [22].
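The role of k₂/k₁ as an asymptotic mean-square bound can be illustrated on the Langevin equation of Remark 2.1, for which V(x) = x² gives LV = −2cx² + K², i.e., k₁ = 2c and k₂ = K². The Monte Carlo sketch below, with hypothetical numerical values, compares E[x(T)²] against the right-hand side of (2.23).

```python
import math
import random

# Monte Carlo illustration of (2.23) for dx = -c*x dt + K dW:
# with V(x) = x^2, k1 = 2c and k2 = K^2, (2.23) predicts
#   E[x(T)^2] <= (x0^2 - k2/k1) e^{-k1 T} + k2/k1,
# with asymptotic bound k2/k1 = K^2 / (2c).

random.seed(3)
c, K, x0 = 1.0, 0.5, 2.0
dt, T, n_paths = 2e-3, 5.0, 500
n_steps = int(T / dt)

acc = 0.0
for _ in range(n_paths):
    x = x0
    for _ in range(n_steps):
        x += -c * x * dt + K * random.gauss(0.0, math.sqrt(dt))
    acc += x * x
ms = acc / n_paths                     # Monte Carlo estimate of E[x(T)^2]

k1, k2 = 2 * c, K * K
bound = (x0 * x0 - k2 / k1) * math.exp(-k1 * T) + k2 / k1  # right side of (2.23)
print(ms, bound, k2 / k1)
```

For this linear example the bound (2.23) is essentially tight, so the estimate and the bound nearly coincide up to Monte Carlo error.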

3 Continuous-time adaptive state and parameter estimation

The system under consideration is described by a stochastic differential equation in Itô form:

dx(t) = f(x(t))dt + g(x(t))θ dt + B(θ)dW(t)   (3.1)

dy(t) = h(x(t))dt + DdV(t)   (3.2)

where x(t) ∈ R^{n×1}, θ ∈ R^{p×1}, y(t) ∈ R^{m×1}; W(t) and V(t) are independent standard Brownian motions (Wiener processes), and B(θ) may also be a function of x(t). We assume that all involved matrices have compatible dimensions. The problem is to identify the unknown constant parameter θ based on the continuous observation y(t). We assume that the required nonlinear estimator has the form

dx̂(t) = f(x̂(t))dt + g(x̂(t))θ̂(t)dt + L[dy(t) − h(x̂(t))dt],
dθ̂(t) = Γη(x̂(t), θ̂(t))dt + K[dy(t) − h(x̂(t))dt].   (3.3)
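To make the structure of (3.3) concrete, the following sketch discretizes the estimator with an Euler scheme on a hypothetical scalar example; the constant gains L, K and the trivial adaptive law η ≡ 0 are illustrative placeholders, not the designed quantities (3.9).

```python
import math
import random

# Discretized sketch of the estimator structure (3.3) on a hypothetical scalar
# example: f(x) = -x, g(x) = sin(x), h(x) = x, true theta = 0.5, B = 0.1,
# D = 0.05. Gains L, K are hand-picked; eta = 0 for simplicity.

random.seed(4)
theta = 0.5                  # true (unknown) constant parameter
dt, n_steps = 1e-3, 20000
L, K = 1.0, 2.0              # illustrative constant gains

x = 1.0                      # true state
x_hat, theta_hat = 0.0, 0.0  # estimator states
for _ in range(n_steps):
    dW = random.gauss(0.0, math.sqrt(dt))
    dV = random.gauss(0.0, math.sqrt(dt))
    dy = x * dt + 0.05 * dV                 # observation increment: h(x)dt + D dV
    innov = dy - x_hat * dt                 # innovation dy - h(x_hat)dt
    # estimator (3.3) with eta = 0:
    x_hat += (-x_hat + math.sin(x_hat) * theta_hat) * dt + L * innov
    theta_hat += K * innov
    # true dynamics (3.1):
    x += (-x + math.sin(x) * theta) * dt + 0.1 * dW

print(x - x_hat, theta_hat)
```

The state and parameter updates are both driven by the same innovation, which is exactly the coupling the remainder of this section analyzes.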

Here the term dy(t) − h(x̂(t))dt plays a role similar to that of the innovation process in the Kalman filter. L, K are unknown adaptation gain matrices, Γ > 0 is a constant design matrix, and η(·, ·) ∈ R^{p×1} is an unknown adaptive law. Let x̃(t) = x(t) − x̂(t) and θ̃(t) = θ − θ̂(t). Then for x̃(t), θ̃(t), we have

dx̃(t) = dx(t) − dx̂(t)

= f(x(t))dt − f(x̂(t))dt + g(x(t))θ dt − g(x̂(t))θ̂(t)dt + B(θ)dW(t) − L[h(x(t))dt + DdV(t) − h(x̂(t))dt]
= f(x(t))dt − f(x̂(t))dt + g(x(t))θ dt − g(x̂(t))θ̂(t)dt − L[h(x(t))dt − h(x̂(t))dt] + [B(θ)  −LD] [dW(t)′ dV(t)′]′
:= f̃(x(t), θ, x̂(t), θ̂(t))dt + B̃(θ)dW̃(t),   (3.4)

dθ̃(t) = −Γη(x̂(t), θ̂(t))dt − K[dy(t) − h(x̂(t))dt]
= −Γη(x̂(t), θ̂(t))dt − K[h(x(t))dt − h(x̂(t))dt] + [0  −KD] [dW(t)′ dV(t)′]′
:= η̃(x(t), θ, x̂(t), θ̂(t))dt + K̃dW̃(t).   (3.5)

Let z(t) = [x̃(t)′ θ̃(t)′]′.

. The error system (3.4)–(3.5) can be written more compactly by

dz(t) = [f̃(x(t), θ, x̂(t), θ̂(t))′  η̃(x(t), θ, x̂(t), θ̂(t))′]′ dt + [B̃(θ)′  K̃′]′ dW̃(t),   (3.6)

where W̃(t) = [W(t)′ V(t)′]′ is a standard Brownian motion.

Assumption 3.1. There exist positive constants γ, γ₀, …, γ₇ and matrices A, C such that for any x, θ:

(i) Boundedness: ‖B(θ)‖F ≤ γ, ‖θ‖ ≤ γ₀, ‖g(x)‖F ≤ γ₇.

(ii) Lipschitz-like conditions: ‖g(x + δ) − g(x)‖ ≤ γ₁‖δ‖ + γ₂ and

‖f(x + δ) − f(x) − Aδ‖ ≤ γ₃‖δ‖ + γ₄,  ‖h(x + δ) − h(x) − Cδ‖ ≤ γ₅‖δ‖ + γ₆.
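These Lipschitz-like conditions can be probed numerically by sampling. Below, the function f(x) = −x + 0.1 sin(3x) with linear part A = −1 is a hypothetical example for which γ₃ = 0.3, γ₄ = 0 provably satisfy the bound, since the residual derivative is at most 0.3 in magnitude.

```python
import math
import random

# Empirical probe of Assumption 3.1(ii) for a hypothetical
# f(x) = -x + 0.1*sin(3x) with linear part A = -1, so that
#   f(x+d) - f(x) - A*d = 0.1*(sin(3(x+d)) - sin(3x)),
# which satisfies the bound with gamma3 = 0.3, gamma4 = 0.

random.seed(5)
A = -1.0
f = lambda x: A * x + 0.1 * math.sin(3 * x)
gamma3, gamma4 = 0.3, 0.0

worst = float("inf")
for _ in range(10000):
    x = random.uniform(-5.0, 5.0)
    d = random.uniform(-2.0, 2.0)
    lhs = abs(f(x + d) - f(x) - A * d)   # residual after removing the linear part
    worst = min(worst, gamma3 * abs(d) + gamma4 - lhs)

# worst >= 0 means the Lipschitz-like bound held on every sampled pair
print(worst)
```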

The matrices A, C are constant. All constants such as γ, γ₀ should be chosen so that the deviations, i.e., the upper bounds in Assumption 3.1, are as small as possible [27].

Assumption 3.2. There exist a positive definite matrix P and matrices L, M such that

Q := P(A − LC) + (A − LC)′P + (γ₀γ₁ + γ₀γ₂ + γ₃ + γ₄)PP + (γ₅ + γ₆)PLL′P
+ (I − MC)′PP(I − MC) + (γ₀γ₁ + γ₃ + 2γ₅)I < 0.   (3.7)
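In practice, P, L, M satisfying (3.7) would be searched for with an LMI/SDP solver; for a fixed candidate triple, feasibility can at least be checked directly by an eigenvalue test, as in the sketch below with hypothetical scalar data.

```python
import numpy as np

# Direct eigenvalue check of the matrix inequality (3.7) for hypothetical data.
# A real design would search for P, L, M with an LMI solver; here we only
# verify Q < 0 for one hand-picked candidate on a scalar example (n = m = 1).

A = np.array([[-3.0]])
C = np.array([[1.0]])
g0, g1, g2, g3, g4, g5, g6 = 0.1, 0.1, 0.05, 0.1, 0.05, 0.1, 0.05  # gammas

P = np.array([[1.0]])   # candidate P > 0
L = np.array([[1.0]])   # candidate observer gain
M = np.array([[1.0]])   # candidate extra matrix (so I - M C = 0 here)

Q = (P @ (A - L @ C) + (A - L @ C).T @ P
     + (g0 * g1 + g0 * g2 + g3 + g4) * P @ P
     + (g5 + g6) * P @ L @ L.T @ P
     + (np.eye(1) - M @ C).T @ P @ P @ (np.eye(1) - M @ C)
     + (g0 * g1 + g3 + 2 * g5) * np.eye(1))

feasible = np.max(np.linalg.eigvalsh(Q)) < 0   # Q < 0 iff all eigenvalues < 0
print(Q, feasible)
```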

It is obvious that inequality (3.7) implies that (A, C) must be detectable.

Theorem 3.1. Under Assumptions 3.1 and 3.2, an adaptive nonlinear estimator is given by

dx̂(t) = f(x̂(t))dt + g(x̂(t))θ̂(t)dt + L[dy(t) − h(x̂(t))dt],
dθ̂(t) = Γη(x̂(t), θ̂(t))dt + K[dy(t) − h(x̂(t))dt],   (3.8)

where

K = Γg′(x̂(t))PM,  η(x̂(t), θ̂(t)) = (λmax(Q)I − (γ₅ + γ₆)Γ⁻¹KK′Γ⁻¹)θ̂(t).   (3.9)

The estimator is exponentially ultimately bounded in mean square.

Before proving Theorem 3.1, we first present an inequality which is used in establishing the main result.

Lemma 3.1. Let M ≥ 0 be a matrix of compatible dimension. Then

θ̃′(t)Mθ̃(t) ≤ θ′Mθ − 2θ̃′(t)Mθ̂(t).   (3.10)
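Inequality (3.10) can be spot-checked numerically; the scalar sampling below is only a sanity check of the algebra.

```python
import random

# Quick numerical check of inequality (3.10) on random scalar data: with
# M >= 0 and theta_tilde = theta - theta_hat, the claim is
#   theta_tilde*M*theta_tilde <= theta*M*theta - 2*theta_tilde*M*theta_hat.

random.seed(6)
ok = True
for _ in range(1000):
    m = random.uniform(0.0, 5.0)        # scalar M >= 0
    theta = random.uniform(-3.0, 3.0)
    theta_hat = random.uniform(-3.0, 3.0)
    tt = theta - theta_hat
    # small tolerance absorbs floating-point rounding
    ok = ok and (tt * m * tt <= theta * m * theta - 2 * tt * m * theta_hat + 1e-12)
print(ok)
```

Indeed, the difference of the two sides reduces to −M-weighted θ̂², which is never positive.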

Proof. The proof is straightforward. Since θ̃ = θ − θ̂, we have

θ̃′(t)Mθ̃(t) = θ̃′(t)M(θ − θ̂(t)) = θ̃′(t)Mθ − θ̃′(t)Mθ̂(t).   (3.11)

Substituting the inequality

2θ̃′(t)Mθ ≤ θ̃′(t)Mθ̃(t) + θ′Mθ   (3.12)

into (3.11) yields

2θ̃′(t)Mθ̃(t) = 2θ̃′(t)Mθ − 2θ̃′(t)Mθ̂(t) ≤ θ̃′(t)Mθ̃(t) + θ′Mθ − 2θ̃′(t)Mθ̂(t),

from which (3.10) follows.

Proof of Theorem 3.1. Consider the stochastic Lyapunov function candidate

V(x̃(t), θ̃(t)) = x̃′(t)Px̃(t) + θ̃′(t)Γ⁻¹θ̃(t) = [x̃′(t) θ̃′(t)] diag(P, Γ⁻¹) [x̃′(t) θ̃′(t)]′ =: z′(t)P̄z(t).

(3.13)

Then the differential generator LV along (3.6) can be calculated as

LV(x̃(t), θ̃(t)) = LV(z(t))
= 2x̃′(t)P f̃(x(t), θ, x̂(t), θ̂(t)) + 2θ̃′(t)Γ⁻¹η̃(x(t), θ, x̂(t), θ̂(t)) + (1/2)tr[(B(θ)B′(θ) + LDD′L′)P + KDD′K′Γ⁻¹]
:= 2x̃′(t)P f̃(x(t), θ, x̂(t), θ̂(t)) + 2θ̃′(t)Γ⁻¹η̃(x(t), θ, x̂(t), θ̂(t)) + c₁
= 2x̃′(t)P[f(x(t)) − f(x̂(t)) + g(x(t))θ − g(x̂(t))θ̂(t) − L(h(x(t)) − h(x̂(t)))] − 2θ̃′(t)[η(x̂(t), θ̂(t)) + Γ⁻¹K(h(x(t)) − h(x̂(t)))] + c₁
= 2x̃′(t)P{Ax̃(t) + [f(x(t)) − f(x̂(t)) − Ax̃(t)] + [g(x(t)) − g(x̂(t))]θ + g(x̂(t))θ̃(t) − L[h(x(t)) − h(x̂(t)) − Cx̃(t)] − LCx̃(t)}
− 2θ̃′(t){η(x̂(t), θ̂(t)) + Γ⁻¹K[h(x(t)) − h(x̂(t)) − Cx̃(t)] + Γ⁻¹KCx̃(t)} + c₁
= x̃′(t)[P(A − LC) + (A − LC)′P]x̃(t) + 2x̃′(t)P[f(x(t)) − f(x̂(t)) − Ax̃(t)]
+ 2x̃′(t)P[g(x(t)) − g(x̂(t))]θ + 2x̃′(t)Pg(x̂(t))θ̃(t)
− 2x̃′(t)PL[h(x(t)) − h(x̂(t)) − Cx̃(t)] − 2θ̃′(t)η(x̂(t), θ̂(t)) − 2θ̃′(t)Γ⁻¹KCx̃(t)
− 2θ̃′(t)Γ⁻¹K[h(x(t)) − h(x̂(t)) − Cx̃(t)] + c₁   (3.14)

 where c1 = 1/2 tr (B(θ)B 0 (θ) + LDD 0 L0 )P + KDD 0 K 0 Γ−1 . After introducing a matrix M , we have ˜ LV (˜ x(t), θ(t)) ≤x ˜0 (t)[P (A − LC) + (A − LC)0 P ]˜ x(t) + 2˜ x0 (t)P [f (x(t)) − f (ˆ x(t)) − A˜ x(t)] x(t))]θ + 2θ˜0 (t)g 0 (ˆ x(t))P (I − M C)˜ x(t) + 2˜ x0 (t)P [g(x(t)) − g(ˆ ˆ − 2˜ x0 (t)P L[h(x(t)) − h(ˆ x(t)) − C x ˜(t)] − 2 θ˜0 (t)η(ˆ x(t), θ(t))  + 2θ˜0 (t) g 0 (ˆ x(t)P M − Γ−1 K C x ˜(t) − 2θ˜0 (t)Γ−1 K[h(x(t)) − h(ˆ x(t)) − C x ˜(t)] + c1 .

(3.15)

Furthermore, based on Assumption 3.1, inequality (3.10), and |x'y| ≤ ‖x‖‖y‖, we can bound the related terms in (3.15) as follows:

    2x̃'(t)P[f(x(t)) − f(x̂(t)) − A x̃(t)] ≤ 2‖P x̃(t)‖ ‖f(x(t)) − f(x̂(t)) − A x̃(t)‖
        ≤ 2‖P x̃(t)‖(γ₃‖x̃(t)‖ + γ₄)
        ≤ γ₃(x̃'(t)PP x̃(t) + x̃'(t)x̃(t)) + γ₄(x̃'(t)PP x̃(t) + 1)
        = x̃'(t)[(γ₃ + γ₄)PP + γ₃ I] x̃(t) + γ₄,        (3.16)

    2x̃'(t)P[g(x(t)) − g(x̂(t))]θ ≤ 2‖P x̃(t)‖ ‖g(x(t)) − g(x̂(t))‖ ‖θ‖ ≤ 2γ₀‖P x̃(t)‖(γ₁‖x̃(t)‖ + γ₂)
        ≤ γ₀γ₁(x̃'(t)PP x̃(t) + x̃'(t)x̃(t)) + γ₀γ₂(x̃'(t)PP x̃(t) + 1)
        = γ₀ x̃'(t)[(γ₁ + γ₂)PP + γ₁ I] x̃(t) + γ₀γ₂,        (3.17)

    2θ̃'(t)g'(x̂(t))P(I − MC) x̃(t) ≤ x̃'(t)(I − MC)'PP(I − MC) x̃(t) + θ̃'(t)g'(x̂(t))g(x̂(t))θ̃(t)
        ≤ x̃'(t)(I − MC)'PP(I − MC) x̃(t) + θ'g'(x̂(t))g(x̂(t))θ − 2θ̃'(t)g'(x̂(t))g(x̂(t))θ̂(t)
        ≤ x̃'(t)(I − MC)'PP(I − MC) x̃(t) + γ₀γ₇ − 2θ̃'(t)g'(x̂(t))g(x̂(t))θ̂(t),        (3.18)

    −2x̃'(t)PL[h(x(t)) − h(x̂(t)) − C x̃(t)] ≤ 2‖L'P x̃(t)‖ ‖h(x(t)) − h(x̂(t)) − C x̃(t)‖
        ≤ 2‖L'P x̃(t)‖(γ₅‖x̃(t)‖ + γ₆)
        ≤ γ₅(x̃'(t)PLL'P x̃(t) + x̃'(t)x̃(t)) + γ₆(x̃'(t)PLL'P x̃(t) + 1)
        = x̃'(t)[(γ₅ + γ₆)PLL'P + γ₅ I] x̃(t) + γ₆,        (3.19)

    −2θ̃'(t)Γ⁻¹K[h(x(t)) − h(x̂(t)) − C x̃(t)] ≤ 2‖K'Γ⁻¹θ̃(t)‖ ‖h(x(t)) − h(x̂(t)) − C x̃(t)‖
        ≤ 2‖K'Γ⁻¹θ̃(t)‖(γ₅‖x̃(t)‖ + γ₆)
        ≤ γ₅(θ̃'(t)Γ⁻¹KK'Γ⁻¹θ̃(t) + x̃'(t)x̃(t)) + γ₆ θ̃'(t)Γ⁻¹KK'Γ⁻¹θ̃(t) + γ₆
        = (γ₅ + γ₆)θ̃'(t)Γ⁻¹KK'Γ⁻¹θ̃(t) + γ₅ x̃'(t)x̃(t) + γ₆.        (3.20)

Using (3.10) again for the first term of (3.20), and then substituting inequalities (3.16)–(3.20) and (3.7) into (3.15), we have

    LV(x̃(t), θ̃(t)) ≤ x̃'(t)Q x̃(t) + 2θ̃'(t)(g'(x̂(t))PM − Γ⁻¹K)C x̃(t)
        − 2θ̃'(t)g'(x̂(t))g(x̂(t))θ̂(t) − 2θ̃'(t)η(x̂(t), θ̂(t))
        + (γ₅ + γ₆)(θ'Γ⁻¹KK'Γ⁻¹θ − 2θ̃'(t)Γ⁻¹KK'Γ⁻¹θ̂(t)) + c₂
      ≤ λ_max(Q)(x̃'(t)x̃(t) + θ̃'(t)θ̃(t)) − λ_max(Q)θ̃'(t)θ̃(t)
        + 2θ̃'(t)(g'(x̂(t))PM − Γ⁻¹K)C x̃(t) − 2θ̃'(t)g'(x̂(t))g(x̂(t))θ̂(t)
        − 2θ̃'(t)η(x̂(t), θ̂(t)) + (γ₅ + γ₆)(θ'Γ⁻¹KK'Γ⁻¹θ − 2θ̃'(t)Γ⁻¹KK'Γ⁻¹θ̂(t)) + c₂        (3.21)

where c₂ = c₁ + γ₀(γ₂ + γ₇) + γ₄ + 2γ₆. Using (3.10) once again for the second term of the last inequality in (3.21), we have

    LV(x̃(t), θ̃(t)) ≤ λ_max(Q)(x̃'(t)x̃(t) + θ̃'(t)θ̃(t)) − λ_max(Q)(θ'θ − 2θ̃'(t)θ̂(t))
        + 2θ̃'(t)(g'(x̂(t))PM − Γ⁻¹K)C x̃(t) − 2θ̃'(t)g'(x̂(t))g(x̂(t))θ̂(t)
        − 2θ̃'(t)η(x̂(t), θ̂(t)) + (γ₅ + γ₆)(θ'Γ⁻¹KK'Γ⁻¹θ − 2θ̃'(t)Γ⁻¹KK'Γ⁻¹θ̂(t)) + c₂
      ≤ [λ_max(Q)/max{λ_max(P), λ_max(Γ⁻¹)}] V(x̃(t), θ̃(t)) + 2θ̃'(t)(g'(x̂(t))PM − Γ⁻¹K)C x̃(t)
        − 2θ̃'(t)[η(x̂(t), θ̂(t)) − λ_max(Q)θ̂(t) + g'(x̂(t))g(x̂(t))θ̂(t) + (γ₅ + γ₆)Γ⁻¹KK'Γ⁻¹θ̂(t)]
        − λ_max(Q)θ'θ + (γ₅ + γ₆)θ'Γ⁻¹KK'Γ⁻¹θ + c₂.        (3.22)

If we let

    g'(x̂(t))PM − Γ⁻¹K = 0,
    η(x̂(t), θ̂(t)) − λ_max(Q)θ̂(t) + g'(x̂(t))g(x̂(t))θ̂(t) + (γ₅ + γ₆)Γ⁻¹KK'Γ⁻¹θ̂(t) = 0,

that is,

    K = Γ g'(x̂(t))PM,
    η(x̂(t), θ̂(t)) = (λ_max(Q)I − g'(x̂(t))g(x̂(t)) − (γ₅ + γ₆)Γ⁻¹KK'Γ⁻¹) θ̂(t),

then inequality (3.22) reduces to

    LV(x̃(t), θ̃(t)) ≤ −[|λ_max(Q)| / max{λ_max(P), λ_max(Γ⁻¹)}] V(x̃(t), θ̃(t)) + k₂
                    = −k₁ V(x̃(t), θ̃(t)) + k₂.        (3.23)

Here the constants k₁, k₂ > 0 are defined by

    k₁ = |λ_max(Q)| / max{λ_max(P), λ_max(Γ⁻¹)},
    k₂ = (1/2) tr((B(θ)B'(θ) + LDD'L')P + KDD'K'Γ⁻¹) + γ₀²|λ_max(Q)|
         + γ₀²(γ₅ + γ₆)‖K'Γ⁻¹‖²_F + γ₄ + γ₀(γ₂ + γ₇) + 2γ₆.        (3.24)

Hence it follows from Theorem 1 in [28] (see Lemma 2.1) that

    E[V(x̃(t), θ̃(t))] ≤ E[V(x̃(0), θ̃(0))] exp(−k₁t) + (k₂/k₁)(1 − exp(−k₁t))
                      ≤ k₃ exp(−k₁t) + k₂/k₁.        (3.25)

Since

    E[‖x̃(t)‖² + ‖θ̃(t)‖²] ≤ E[V(x̃(t), θ̃(t))] / min{λ_min(P), λ_min(Γ⁻¹)},        (3.26)

the estimator (3.8) is mean-square exponentially ultimately bounded.

Remark 3.1. The matrix M lets us avoid requiring that the matrix C have full column rank for the existence of the adaptive estimator (3.8)–(3.9); cf. [4]. Instead, we make a boundedness assumption on g(x) in Assumption 3.1. The full-column-rank assumption is removed by introducing M, which is incorporated into the LMI. If C has full column rank, we can let M = (C'C)⁻¹C'. Then I − MC = 0 and the boundedness assumption on g(x) can be dropped.

A suboptimal criterion

We now consider a suboptimal criterion which can be used to improve the performance of the estimator. Since

    V(x̃(t), θ̃(t)) = x̃'(t)P x̃(t) + θ̃'(t)Γ⁻¹θ̃(t) ≥ (λ_max(Γ))⁻¹ ‖θ̃(t)‖²,

it follows from (3.25) that

    E[‖θ̃(t)‖²] ≤ λ_max(Γ)(k₃ exp(−k₁t) + k₂/k₁).        (3.27)

Note that Γ is given. Hence the asymptotic bound k₂/k₁ provides a performance criterion for the estimation error of the parameter θ in terms of expectation: by minimizing k₂/k₁, we obtain an optimization algorithm for the nonlinear estimator. Meanwhile, from (2.17) in Theorem 2.1, we have

    λ_max(Γ)‖θ̃(t)‖² ≤ (k₂/k₁)(2 exp(k₁δ) − 1) exp(t) + exp(2k₁δ)(E[V(z(0))] − k₂/k₁) exp(−(k₁ − 2)t),

where the related constants are given in Theorem 2.1. Hence we can see that k₂/k₁ also characterizes the bound of ‖θ̃(t)‖² in the almost sure sense. By Jensen's inequality, (E[‖θ̂‖])² ≤ 2‖θ‖² + 2E[‖θ̃‖²]. Hence k₂/k₁ characterizes the bound of E[‖θ̂‖] as well.

Let α > 0 be a given number. Then it follows from Q < −αI that |λ_max(Q)| > α. By (3.24), we have

    k₂ = (1/2) tr((B(θ)B'(θ) + LDD'L')P + KDD'K'Γ⁻¹) + γ₀²|λ_max(Q)| + α₁‖K'Γ⁻¹‖²_F + α₀
       ≤ (1/2) tr(‖B(θ)‖²_F P + D'L'PLD + D'K'Γ⁻¹KD) + tr(α₁Γ⁻¹KK'Γ⁻¹) + γ₀²|λ_max(Q)| + α₀        (3.28)

where α₀ = γ₄ + γ₀(γ₂ + γ₇) + 2γ₆ and α₁ = γ₀²(γ₅ + γ₆). Here we use the facts tr(AB) = tr(BA),

‖A‖_F = √(tr(A'A)), and tr(AB) ≤ tr(A)tr(B) for positive semidefinite matrices. Substituting (3.9) into (3.28) yields

    k₂ ≤ (1/2) tr(γ²P + D'L'PLD + g(x̂(t))Γg'(x̂(t))PMDD'M'P)
         + α₁ tr(PMM'P g(x̂(t))g'(x̂(t))) + γ₀²|λ_max(Q)| + α₀
       ≤ (1/2) tr(γ²P + α₂D'M'PPMD + 2α₃PMM'P) + (1/2) tr(D'L'PLD) + γ₀²|λ_max(Q)| + α₀
       = (1/2)γ² tr(P + (2α₃/γ²)PMM'P) + (1/2) tr(α₂D'M'PPMD + D'L'PLD) + γ₀²|λ_max(Q)| + α₀
       := (1/2)γ²k̄ + (1/2)k̃ + γ₀²|λ_max(Q)|        (3.29)

where α₂ = tr(Γ)‖g(x)‖²_F, α₃ = α₁‖g(x)‖²_F, and

    k̄ = tr(P + (2α₃/γ²)PMM'P) + 2α₀/γ²,    k̃ = tr(α₂D'M'PPMD + D'L'PLD).

Hence we have

    k₂/k₁ = max{λ_max(P), λ_max(Γ⁻¹)} k₂ / |λ_max(Q)|
          = max{λ_max(P), λ_max(Γ⁻¹)} [(1/2)γ²k̄ + (1/2)k̃ + γ₀²|λ_max(Q)|] / |λ_max(Q)|
          ≤ max{λ_max(P), λ_max(Γ⁻¹)} [γ₀² + (1/α)((1/2)γ²k̄ + (1/2)k̃)]
          = (γ²/2α) max{λ_max(P), λ_max(Γ⁻¹)} [2αγ₀²/γ² + k̄ + k̃/γ²]
          ≤ (γ²/2α) max{tr(P), λ_max(Γ⁻¹)} [2αγ₀²/γ² + k̄ + k̃/γ²].        (3.30)

Let k̂ = 2αγ₀²/γ² + k̄ + k̃/γ² > 0. Since tr(P) ≤ k̄ ≤ k̂, inequality (3.30) can be rewritten as

    k₂/k₁ ≤ (γ²/2α) max{k̂, λ_max(Γ⁻¹)} k̂ = (γ²/2α) max{k̂², λ_max(Γ⁻¹)k̂}.        (3.31)

From the last inequality, by minimizing k̂, or equivalently γ²k̄ + k̃, we can obtain a suboptimal upper bound of k₂/k₁. In order to solve this optimization problem by the LMI technique, we introduce two matrices Z, W > 0 such that

    (1/2)γ²P + α₃PMM'P < Z,    α₂D'M'PPMD + D'L'PLD < W.        (3.32)

Then we can solve this optimization problem by using the solver mincx in the Matlab LMI toolbox.

Algorithm

(i) Boundedness. The matrix inequality Q < −αI can be transformed into a matrix inequality by using Schur complements:

    [ P(A − LC) + (A − LC)'P + (η₁ + α)I    η₂P    η₃PL    (I − MC)'P
      η₂P                                   −I     0       0
      η₃L'P                                 0      −I      0
      P(I − MC)                             0      0       −I         ] < 0        (3.33)

where α ≥ 0 and

    η₁ = γ₀γ₁ + γ₃ + 2γ₅,    η₂ = √(γ₀γ₁ + γ₀γ₂ + γ₃ + γ₄),    η₃ = √(γ₅ + γ₆).

Setting X = PL and Y = M'P yields an LMI as follows:

    [ PA − XC + A'P − C'X' + (η₁ + α)I    η₂P    η₃X    P − C'Y
      η₂P                                 −I     0      0
      η₃X'                                0      −I     0
      P − Y'C                             0      0      −I         ] < 0,    P > 0.        (3.34)

(ii) Suboptimality. With the matrices Z, W > 0 from (3.32), Schur complements give

    [ (1/2)γ²P − Z    √α₃ Y'
      √α₃ Y           −I      ] < 0,

    [ −W        √α₂ (YD)'    (XD)'
      √α₂ YD    −I           0
      XD        0            −P      ] < 0.        (3.35)

Solving LMIs (3.34)–(3.35) yields the (suboptimal) estimator gains L and K. We will illustrate how to use this procedure to optimize the performance of estimators by an example in Section 5.
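The paper solves (3.34)–(3.35) with the Matlab LMI toolbox. As a lightweight numerical illustration (not the full LMI machinery), one can test a trial gain L against the stabilizing (1,1) block of (3.34) by solving the associated Lyapunov equation with plain numpy; A and C below are taken from Example 5.1, while the trial gain L and right-hand side Q0 are arbitrary illustrative choices:

```python
import numpy as np

# Matrices from Example 5.1 of the paper.
A = np.array([[-0.7, 1.0], [0.5, -1.0]])
C = np.array([[4.0, 2.0]])
L = np.array([[0.1], [0.1]])           # a trial observer gain (assumption)

Acl = A - L @ C                        # closed-loop error matrix A - LC

# Solve the Lyapunov equation Acl' P + P Acl = -Q0 by vectorization:
# (I (x) Acl' + Acl' (x) I) vec(P) = -vec(Q0).
n = Acl.shape[0]
Q0 = np.eye(n)
K = np.kron(np.eye(n), Acl.T) + np.kron(Acl.T, np.eye(n))
P = np.linalg.solve(K, -Q0.flatten()).reshape(n, n)
P = 0.5 * (P + P.T)                    # symmetrize against round-off

# P > 0 certifies that the (1,1) block P(A-LC) + (A-LC)'P of (3.34)
# can be made negative definite for this choice of L.
assert np.all(np.linalg.eigvalsh(P) > 0)
assert np.all(np.linalg.eigvalsh(Acl.T @ P + P @ Acl) < 0)
```

This only checks feasibility of one block for a fixed L; a full design would optimize over all blocks of (3.34)–(3.35) simultaneously with an SDP/LMI solver.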

4

Discrete-time adaptive state and parameter estimation

Let us consider the discrete-time nonlinear stochastic system

    x_{k+1} = f(x_k) + g(x_k)θ + B(θ)w_k,
    y_k = h(x_k) + Dv_k,        (4.1)

where x ∈ R^{n×1}, θ ∈ R^{p×1}, y ∈ R^{m×1}. The Gaussian noises w_k, v_k and the initial state x₀ are mutually independent, and w_k, v_k have zero mean and bounded variances n_w, n_v. The problem is to identify the unknown constant parameter θ based on the observations y_k.

Unlike the continuous-time case in Section 3, for discrete-time nonlinear systems we use a difference-like function defined by (2.25) to design estimators, instead of the derivative of the Lyapunov function calculated by the Itô formula. This function is a quadratic function. Therefore we cannot borrow the idea from adaptive control, that is, we cannot introduce an adaptive-law term to cancel certain cross terms so that the difference of the Lyapunov function takes a form such as (3.23) or (2.10). Instead, we use the augmentation method to solve the adaptive state estimation problem for discrete-time stochastic systems. That is, we define the unknown parameter θ as a new state; then, for the augmented system, we use Theorem 2.2 to design a full state estimator, from which the estimate of the parameter θ is obtained provided that the related LMIs are feasible. The required nonlinear estimator has the form

    ẑ_{k+1} = [x̂_{k+1}; θ̂_{k+1}] = [f(x̂_k) + g(x̂_k)θ̂_k; θ̂_k] + L[y_k − h(x̂_k)].        (4.2)

Here L is an (n + p) × m adaptation gain matrix. Let x̃_k = x_k − x̂_k and θ̃_k = θ − θ̂_k. Then the error system is given by

    z̃_{k+1} = [x̃_{k+1}; θ̃_{k+1}]
             = [f(x_k) + g(x_k)θ; θ] − [f(x̂_k) + g(x̂_k)θ̂_k; θ̂_k] + [B(θ); 0]w_k − L[y_k − h(x̂_k)]
             = f̃(x_k, θ) − f̃(x̂_k, θ̂_k) + B̃(θ)w_k − L[h(x_k) − h(x̂_k)] − LDv_k.        (4.3)
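Once a gain L is fixed, the augmented estimator (4.2) is a simple recursion. A sketch in Python using the scalar system of Example 5.2 and the gain reported later in Section 5 (the noise levels here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Example 5.2 of the paper: f(x) = 0.7x, g(x) = -(1 + 0.1 sin x),
# h(x) = 4x + 0.5 sin 4x, true parameter theta = 0.8.
f = lambda x: 0.7 * x
g = lambda x: -(1.0 + 0.1 * np.sin(x))
h = lambda x: 4.0 * x + 0.5 * np.sin(4.0 * x)

theta = 0.8
L = np.array([0.3822, -0.2072])        # estimator gain reported in Section 5

x, x_hat, th_hat = 1.0, 0.0, 0.0
for k in range(300):
    y = h(x) + 0.1 * rng.standard_normal()          # measurement y_k
    innov = y - h(x_hat)
    # augmented-state update (4.2): state and parameter share one gain L
    x_hat, th_hat = (f(x_hat) + g(x_hat) * th_hat + L[0] * innov,
                     th_hat + L[1] * innov)
    x = f(x) + g(x) * theta + 0.1 * rng.standard_normal()

# errors stay bounded (not necessarily zero), in line with Theorem 4.1
assert abs(x - x_hat) < 10.0 and abs(theta - th_hat) < 10.0
```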

In place of the bound ‖g(x)‖_F ≤ γ₇ in Assumption 3.1, we here use

    ‖g(x) − B̄‖_F ≤ γ₇,        (4.4)

where B̄ is a non-zero constant matrix with compatible dimensions, chosen to make the upper bound γ₇ as small as possible. Note that here the Frobenius norm can be replaced by any matrix norm compatible with the Euclidean norm. Then we have

# " #"

¯ A B x ˜ k

˜ ˜ xk , θˆk ) −

f(xk , θ) − f(ˆ

0 I θ˜k

" #

f (x ) − f (ˆ ¯ θ˜k xk ) − A˜ xk + g(xk )θ − g(ˆ xk )θˆk − B k

=

0 ¯ θ˜k k ≤ kf (xk ) − f (ˆ xk ) − A˜ xk k + kg(xk )θ − g(ˆ xk )θˆk − B ¯ θ˜k k xk k + γ4 + k[g(xk ) − g(ˆ xk )]θ + [g(ˆ xk ) − B] ≤ γ3 k˜ ≤ γ3 k˜ xk k + γ4 + γ0 (γ1 k˜ xk k + γ2 ) + γ7 kθ˜k k ≤ (γ3 + γ0 γ1 )k˜ xk k + γ7 kθ˜k k + γ4 + γ0 γ2

" #

x

˜k ≤ γ8 zk k + γ9

+ γ9 = γ8 k˜

θ˜k 18

(4.5)

where γ8 =

√ 2 max{γ3 + γ0 γ1 , γ7 } and γ9 = γ4 + γ0 γ2 . Define " # h i ¯ A B A¯ = , C¯ = C 0 0 I

(4.6)

We will see later that it is necessary to introduce the matrix B̄ so that the matrix inequality (4.19) in Theorem 4.1 is solvable; otherwise (4.19) does not hold for any positive definite matrix P (see Remark 4.1). Then

    z̃_{k+1} = (Ā − LC̄)z̃_k + [f̃(x_k, θ) − f̃(x̂_k, θ̂_k) − Āz̃_k] − L[h(x_k) − h(x̂_k) − C̄z̃_k]
               + B̃(θ)w_k − LDv_k
             := A₀z̃_k + p_k − Lq_k + B̃(θ)w_k − LDv_k        (4.7)

where

    A₀ = Ā − LC̄,    p_k = f̃(x_k, θ) − f̃(x̂_k, θ̂_k) − Āz̃_k,    q_k = h(x_k) − h(x̂_k) − C̄z̃_k.

We now apply Theorem 2.2 to (4.7) for the design of the nonlinear estimator. Define a Lyapunov function V_k(z̃_k) as

    V_k(z̃_k) = z̃_k'P z̃_k        (4.8)

where P is an unknown positive definite matrix. Using (4.7) and calculating ΔV_{k+1} := E[V_{k+1} | z̃_k] − V_k yields

    ΔV_{k+1} = E[(A₀z̃_k + p_k − Lq_k + B̃(θ)w_k − LDv_k)'P(A₀z̃_k + p_k − Lq_k + B̃(θ)w_k − LDv_k) | z̃_k] − V_k
             = z̃_k'A₀'PA₀z̃_k + E[2z̃_k'A₀'Pp_k − 2z̃_k'A₀'PLq_k − 2p_k'PLq_k + p_k'Pp_k + q_k'L'PLq_k | z̃_k]
               + E[(B̃(θ)w_k)'PB̃(θ)w_k + (LDv_k)'PLDv_k] − V_k.        (4.9)

Here we use the statistical properties of the noises w_k and v_k. Then, using inequalities similar to (3.16) for the term 2z̃_k'A₀'Pp_k together with

    −2z̃_k'A₀'PLq_k ≤ η z̃_k'A₀'PA₀z̃_k + (1/η) q_k'L'PLq_k,    η > 0,        (4.10)

we have

    ΔV_{k+1} ≤ z̃_k'Qz̃_k + E[−2p_k'PLq_k + p_k'Pp_k + (1 + 1/η)q_k'L'PLq_k | z̃_k]
               + E[(B̃(θ)w_k)'PB̃(θ)w_k + (LDv_k)'PLDv_k] + γ₉        (4.11)

where

    Q = (1 + η)A₀'PA₀ − P + (γ₈ + γ₉)A₀'PPA₀ + γ₈I.        (4.12)

Furthermore, by using (4.5) and the Lipschitz-like conditions in Assumption 3.1, we have

    p_k'Pp_k ≤ λ_max(P)(γ₈‖z̃_k‖ + γ₉)² ≤ 2λ_max(P)(γ₈²‖z̃_k‖² + γ₉²),
    q_k'L'PLq_k ≤ λ_max(L'PL)(γ₅‖x̃_k‖ + γ₆)² ≤ 2λ_max(L'PL)(γ₅²‖x̃_k‖² + γ₆²)
                ≤ 2λ_max(L'PL)(γ₅²‖z̃_k‖² + γ₆²),
    −2p_k'PLq_k ≤ ε p_k'Pp_k + (1/ε) q_k'L'PLq_k,    ε > 0,        (4.13)

and, also by the statistical properties of w_k and v_k,

    E[(B̃(θ)w_k)'PB̃(θ)w_k] ≤ γ²λ_max(P)n_w,    E[(LDv_k)'PLDv_k] ≤ ‖D‖²_F λ_max(L'PL)n_v.        (4.14)

Substituting inequalities (4.13) and (4.14) into (4.11) yields

    ΔV_{k+1} ≤ z̃_k'Qz̃_k + 2(1 + ε)[λ_max(P)γ₈²‖z̃_k‖² + λ_max(P)γ₉²]
               + 2(1 + 1/η + 1/ε)[λ_max(L'PL)γ₅²‖z̃_k‖² + λ_max(L'PL)γ₆²]
               + γ²λ_max(P)n_w + ‖D‖²_F λ_max(L'PL)n_v + γ₉.        (4.15)

Now if the following conditions are satisfied:

    (1 + η)A₀'PA₀ − P + (γ₈ + γ₉)A₀'PPA₀ + (ξ + ζ)I + γ₈I < −αI,    α > 0,
    2(1 + ε)γ₈²P < ξI,    ξ > 0,
    2(1 + 1/η + 1/ε)γ₅²L'PL < ζI_{m×m},    ζ > 0,        (4.16)

then

    E[V_{k+1} | z̃_k] − V_k < −α‖z̃_k‖² + k₂ ≤ −(α/λ_max(P))V_k + k₂ = −k₁V_k + k₂,        (4.17)

where k₁ = α/λ_max(P) and

    k₂ = [2(1 + ε)γ₉² + γ²n_w]λ_max(P) + [2(1 + 1/η + 1/ε)γ₆² + ‖D‖²_F n_v]λ_max(L'PL) + γ₉ > 0.        (4.18)

This implies that we may apply Theorem 2.2 to (4.3) for the mean-square boundedness and almost sure boundedness. We now summarize the above arguments in the following theorem.

Theorem 4.1. Consider the discrete-time nonlinear stochastic system (4.1). If there exist positive numbers η, ε, α, ξ, ζ and matrices P > 0 and L such that the inequalities

    (1 + η)A₀'PA₀ − P + (γ₈ + γ₉)A₀'PPA₀ + (ξ + ζ)I + γ₈I < −αI,        (4.19)
    2(1 + ε)γ₈²P < ξI,        (4.20)
    2(1 + 1/η + 1/ε)γ₅²L'PL < ζI_{m×m},        (4.21)

hold, then the estimator (4.2) is exponentially bounded in mean square and bounded almost surely. Here A₀ = Ā − LC̄; see Assumption 3.1 and (4.5)–(4.18) for the related constants and matrices.

Proof. From (4.19), we have λ_max(P) > α since P − αI > 0. Hence 0 < k₁ < 1. Inequalities (4.20) and (4.21) are equivalent to

    2(1 + ε)γ₈²λ_max(P) < ξ,    2(1 + 1/η + 1/ε)γ₅²λ_max(L'PL) < ζ,        (4.22)

respectively. Hence it follows from (4.19) that

    (1 + η)A₀'PA₀ − P + (γ₈ + γ₉)A₀'PPA₀ + 2(1 + ε)γ₈²λ_max(P)I
      + 2(1 + 1/η + 1/ε)γ₅²λ_max(L'PL)I + γ₈I
    < (1 + η)A₀'PA₀ − P + (γ₈ + γ₉)A₀'PPA₀ + (ξ + ζ)I + γ₈I < −αI.        (4.23)

Hence all conditions required to apply Theorem 2.2 to the error system (4.3) are satisfied, and the estimator (4.2) is exponentially bounded in mean square and bounded in the almost sure sense.
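The drift condition (4.17) drives a scalar recursion for E[V_k]: iterating E[V_{k+1}] ≤ (1 − k₁)E[V_k] + k₂ with 0 < k₁ < 1 gives E[V_k] ≤ (1 − k₁)^k V₀ + k₂/k₁, so k₂/k₁ is again the ultimate bound. A quick check of this limit, with illustrative constants:

```python
# The recursion E[V_{k+1}] <= (1 - k1) E[V_k] + k2 with 0 < k1 < 1
# converges to the ultimate bound k2 / k1 regardless of V_0.
k1, k2 = 0.2, 0.05
v = 100.0                  # arbitrary large initial value
for _ in range(500):
    v = (1 - k1) * v + k2
assert abs(v - k2 / k1) < 1e-12   # fixed point k2/k1 = 0.25
```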

Remark 4.1. In order to have k₁ < 1 and use Theorem 2.2, by (4.15), a less conservative inequality may be

    Q + 2(1 + ε)γ₈²λ_max(P)I + 2(1 + 1/η + 1/ε)γ₅²λ_max(L'PL)I < −αI,        (4.24)

where Q is defined by (4.12). For specific systems, we may assume that P and L'PL have special structures, such as diagonal matrices; then it is possible that inequality (4.24) can be solved using the Matlab LMI toolbox. Another numerical method is given in [27]; see (9), (26) and Figure 1 there. Note that γ₈ and γ₅ must be small enough in (4.20) and (4.21); otherwise the related LMIs are infeasible. This can be seen from (4.19), in which P must be larger than ξI and ζI. Another constraint is that B̄ must not be zero; otherwise (Ā, C̄) is not detectable, and then (4.19) has no solutions. This constraint comes from the augmentation method, which extends θ as a state.

Algorithm

Inequalities (4.19)–(4.21) can be transformed into LMIs. Let X = L'P. Then

    [ (γ₈ + α + ξ + ζ)I − P    a₁(Ā'P − C̄'X)    a₂(Ā'P − C̄'X)
      a₁(PĀ − X'C̄)            −P               0
      a₂(PĀ − X'C̄)            0                −I              ] < 0,

    2(1 + ε)γ₈²P < ξI,    [ −ζI    a₃X
                            a₃X'   −P   ] < 0,        (4.25)

where P > 0, α > 0, and a₁ = √(1 + η), a₂ = √(γ₈ + γ₉), a₃ = γ₅√(2(1 + 1/η + 1/ε)). By solving the above LMIs, we can obtain the estimator gain L.

A suboptimal criterion

Since

    k₂/k₁ = (λ_max(P)/α){[2(1 + ε)γ₉² + γ²n_w]λ_max(P) + [2(1 + 1/η + 1/ε)γ₆² + ‖D‖²_F n_v]λ_max(L'PL) + γ₉},        (4.26)

for a given α > 0, a suboptimal criterion for minimizing k₂/k₁ is

    min λ_max(P),    min λ_max(L'PL).        (4.27)

The solver gevp in the LMI toolbox provides a suboptimal solution for the optimization (4.27) by solving the following generalized eigenvalue minimization under LMI constraints:

    min λ  s.t.  P < λI,    Z < λI_{m×m},    [ −Z    X
                                              X'    −P ] < 0.        (4.28)

Solving (4.25) and (4.28) provides a suboptimal estimator gain L.
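The Schur-complement equivalence underlying (4.25) and (4.28) — X P⁻¹ X' < Z if and only if the block matrix [−Z, X; X', −P] is negative definite for P > 0 — can be checked numerically; the matrices below are random illustrative data:

```python
import numpy as np

def is_neg_def(M):
    """Negative definiteness via eigenvalues of the symmetric part."""
    return np.all(np.linalg.eigvalsh(0.5 * (M + M.T)) < 0)

rng = np.random.default_rng(2)
n, m = 3, 2
R = rng.standard_normal((n, n))
P = R @ R.T + n * np.eye(n)            # P > 0
X = rng.standard_normal((m, n))
S = X @ np.linalg.inv(P) @ X.T         # the Schur complement term X P^{-1} X'
Z = S + np.eye(m)                      # any Z with Z > X P^{-1} X'

# block matrix of (4.28): [-Z, X; X', -P] < 0  iff  X P^{-1} X' < Z
B = np.block([[-Z, X], [X.T, -P]])
assert is_neg_def(B)
assert is_neg_def(S - Z)
```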

5

Illustrative Examples

In this section, we give two examples to demonstrate that the adaptive state estimators proposed in the last two sections are applicable to some continuous-time and discrete-time nonlinear stochastic systems.

Example 5.1. We first consider a continuous-time system:

    dx₁(t) = (−0.7x₁(t) + x₂(t) − (1 + 0.1 sin(x₁(t)))θ) dt + 0.1 dW₁(t),
    dx₂(t) = (0.5x₁(t) − x₂(t) + 0.2 cos(x₂(t))θ) dt + 0.1 dW₂(t),
    dy(t) = (4x₁(t) + 0.5 sin(4x₁(t)) + 2x₂(t)) dt + 0.1 dV(t).
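This system can be simulated with the Euler–Maruyama scheme, which is the method used for all simulations in this section (cf. [6, p. 186]); a minimal sketch, with the true parameter and step size taken from the text:

```python
import numpy as np

rng = np.random.default_rng(3)

theta, dt, steps = 0.8, 1e-3, 20000    # true parameter and sampling period
x = np.zeros(2)                        # state (x1, x2), started at the origin
ys = 0.0                               # integrated measurement y(t)

# Euler-Maruyama discretization: each Brownian increment is sqrt(dt)*N(0,1)
for _ in range(steps):
    dW = np.sqrt(dt) * rng.standard_normal(3)   # increments of W1, W2, V
    dx1 = (-0.7 * x[0] + x[1] - (1 + 0.1 * np.sin(x[0])) * theta) * dt + 0.1 * dW[0]
    dx2 = (0.5 * x[0] - x[1] + 0.2 * np.cos(x[1]) * theta) * dt + 0.1 * dW[1]
    ys += (4 * x[0] + 0.5 * np.sin(4 * x[0]) + 2 * x[1]) * dt + 0.1 * dW[2]
    x += np.array([dx1, dx2])

assert np.all(np.isfinite(x)) and np.isfinite(ys)
```

The estimator equations (3.8)–(3.9) would be integrated alongside this loop in the same way.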

By this example, we illustrate how to use the suboptimal procedure to design the suboptimal estimator. If we directly use LMIs (3.34) and (3.35), the solver feasp of the Matlab LMI toolbox gives a very small M₁:

    M₁ = PM = Y' = 10⁻³ [0.1143  0.1018]',    L = [1.1466  0.829]'.

One can see from (3.8) that the rightmost term describes the "steady" value of the θ estimate. So, roughly speaking, if the estimator gain K is small, then the steady value of the estimate θ̂ is small too. For this example, we first choose the weighting Γ⁻¹ = 0.05 in (3.13). Then, using this Γ and applying the solver feasp to (3.34), we obtain

    Y = [0.19  0.0943],    L = [0.0112  0.3024]',    P = [0.8564  0.1871; 0.1871  0.5691].        (5.1)

Substituting P and Y into (3.35) and using the function setmvar and the solver mincx gives

    L_suboptimal = [0.0484  0.0741]'.

Obviously, the upper bound k₂/k₁ with Π₁ = (P, Y, L_suboptimal) is less than k₂/k₁ with Π₂ = (P, Y, L) in (5.1). Hence the upper bounds of E[‖θ̃‖²] and E[‖x̃‖²] with the set Π₁ are less than those with the set Π₂. This can be seen from (3.26) for a fixed P and is also verified by simulations. Figure 1 shows the sample averages of the estimate of θ over 100 sample paths. One can see that the suboptimal average is better than the average for the boundedness case. We also compare the small-noise case with the large-noise case. For the small-noise case, we let the coefficients of all Brownian motions be 0.01. We can see from Figure 2 that the estimate of θ with small noise is better than with large noise. In the simulations, all parameters in Assumption 3.1 are set as

    θ = 0.8, Γ⁻¹ = 0.05, γ = 0.1414, α = 0.01, γ₀ = 1, x₀ = 0, γ₁ = 0.2, θ₀ = 0,
    γ₂ = γ₃ = γ₄ = γ₅ = 0, γ₆ = 1, γ₇ = 1.118, seed = 23,
    A = [−0.7  1; 0.5  −1],    C = [4  2].

The sampling period is 0.001 s. We use Euler's method to numerically solve all stochastic differential equations; see [6, p. 186].

Example 5.2. The discrete-time system for simulation is

    x_{k+1} = 0.7x_k − (1 + 0.1 sin x_k)θ + w_k,        (5.2)
    y_k = 4x_k + 0.5 sin 4x_k + v_k.        (5.3)

The initial state x₀ = 1, w_k ∼ N(0, 0.01), and v_k ∼ N(0, 0.02). The estimator based on LMIs does not need the statistical properties of the noises. The true parameter is θ = 0.8.

Figure 1: Sample averages of the suboptimal estimate and the bounded estimate for θ.

Figure 2: Sample paths of θ estimate with large and small noises.

The extended Kalman filter is better than the estimator based on the Lyapunov method if the initial state and the variances of the noises are known. With significantly incorrect data, including the initial state and the statistical properties of the noises, the latter is better than the former. This can be seen from Figures 3 and 4. The estimator gain given by the solver gevp is

    L = [0.3822  −0.2072]'.

Figure 3: State estimation based on Lyapunov method with incorrect initial values: x₀ = 3, θ₀ = 5.

Figure 4: Extended Kalman filter with incorrect data: x₀ = 3, θ₀ = 5, N_w = 5.

In the simulations, all parameters in Assumption 3.1 and (4.4) are set as

    γ = 1, γ₀ = 1, γ₁ = 0.1, γ₂ = γ₃ = γ₄ = γ₅ = γ₉ = 0, γ₆ = 1, γ₇ = 0.1, γ₈ = 0.1414,
    θ = 0.8, Γ⁻¹ = 0.1, α = 0.01, x₀ = 1, initial estimator state 0,
    Ā = [0.7  −1; 0  1],    C̄ = [4  0].

6

Conclusion

In this paper, we have considered adaptive state estimation problems for a class of nonlinear stochastic systems with a linear-in-parameter structure, based on stochastic counterparts of Lyapunov theory. Ultimately exponentially bounded state and parameter estimators in the mean-square sense have been obtained for both continuous-time and discrete-time nonlinear stochastic systems. The sufficient conditions are described in terms of LMIs. A significant improvement is that we also present a suboptimization procedure for the estimator gains. The global Lipschitz conditions in Assumption 3.1 are very restrictive; for example, the function x² does not satisfy them. Hence, as a future direction, local Lipschitz-like conditions or other descriptions of the nonlinearity should be pursued.

References

[1] R.G. Agniel and E.I. Jury. Almost sure boundedness of randomly sampled systems. SIAM Journal on Control, 9(3):372–384, 1971.

[2] B.D.O. Anderson and J.B. Moore. Optimal Filtering. Prentice-Hall, 1979.

[3] L. Arnold. Stochastic Differential Equations: Theory and Applications. John Wiley & Sons, 1974.

[4] Y.M. Cho and R. Rajamani. A systematic approach to adaptive observer synthesis for nonlinear systems. IEEE Transactions on Automatic Control, 42:534–537, 1997.

[5] D. Dochain. State and parameter estimation in chemical and biochemical processes: a tutorial. Journal of Process Control, 13:801–808, 2003.

[6] T.C. Gard. Introduction to Stochastic Differential Equations. Marcel Dekker, 1988.

[7] I.I. Gihman and A.V. Skorohod. Stochastic Differential Equations. Springer-Verlag, 1972. Translation of the Russian edition, Kiev, 1968.

[8] R.Z. Has'minskii. Stochastic Stability of Differential Equations. Sijthoff & Noordhoff, Alphen aan den Rijn, The Netherlands, 1980. Translation of the Russian edition, Moscow, Nauka, 1969.

[9] B.H. Jansen and V.G. Rit. Electroencephalogram and visual evoked potential generation in a mathematical model of coupled cortical columns. Biological Cybernetics, 73:357–366, 1995.

[10] F. Kozin. On almost sure asymptotic sample properties of diffusion processes defined by stochastic differential equations. Journal of Mathematics of Kyoto University, 4(3):515–528, 1964.

[11] L. Ljung. Asymptotic behavior of the extended Kalman filter as a parameter estimator for linear systems. IEEE Transactions on Automatic Control, 24(1):36–50, 1979.

[12] X.R. Mao. Exponential Stability of Stochastic Differential Equations. Marcel Dekker, 1994. Monographs and Textbooks in Pure and Applied Mathematics Series.

[13] X.R. Mao. Stochastic versions of the LaSalle theorem. Journal of Differential Equations, 153(1):175–195, 1999.

[14] X.R. Mao and C.G. Yuan. Stochastic Differential Equations with Markovian Switching. Imperial College Press, 2005.

[15] R.K. Mehra. On the identification of variances and adaptive Kalman filtering. IEEE Transactions on Automatic Control, 15(2):175–184, 1970.

[16] P.A. Meyer. Probability and Potentials. Blaisdell Publishing Company, 1966.

[17] Y. Miyahara. Ultimate boundedness of the systems governed by stochastic differential equations. Nagoya Mathematical Journal, 47:111–144, 1972.

[18] M.B. Nevelson and R.Z. Hasminskii. Stochastic Approximation and Recursive Estimation. American Mathematical Society, 1973.

[19] B. Øksendal. Stochastic Differential Equations: An Introduction with Applications. Springer-Verlag, 6th edition, 2005.

[20] R. Rajamani and J.K. Hedrick. Adaptive observers for active automotive suspensions: theory and experiments. IEEE Transactions on Control Systems Technology, 3:86–93, 1995.

[21] Y. Rasis. Stochastic Observers for Nonlinear Dynamic Systems. PhD thesis, Washington University, 1974.

[22] K. Reif, S. Günther, E. Yaz, and R. Unbehauen. Stochastic stability of the discrete-time extended Kalman filter. IEEE Transactions on Automatic Control, 44(4):714–728, 1999.

[23] K. Reif, S. Günther, E. Yaz, and R. Unbehauen. Stochastic stability of the continuous-time extended Kalman filter. IEE Proceedings – Control Theory and Applications, 147(1):45–52, 2000.

[24] A.P. Sage and G.W. Husa. Adaptive filtering with unknown prior statistics. In Joint Automatic Control Conference, pages 760–769, 1969.

[25] T.J. Tarn and Y. Rasis. Observers for nonlinear stochastic systems. IEEE Transactions on Automatic Control, 21:441–448, 1976.

[26] F. Wendling, F. Bartolomei, J.J. Bellanger, and P. Chauvel. Epileptic fast activity can be explained by a model of impaired GABAergic dendritic inhibition. European Journal of Neuroscience, 15:1499–1508, 2002.

[13] X.R. Mao. Stochastic versions of the LaSalle theorem. Journal of Differential Equations, 153(1):175–195, 1999. [14] X.R. Mao and Yuan C.G. Stochastic Differential Equations with Markovian Switching. Imperial College Press, 2005. [15] R.K. Merha. On the identification of variances and adaptive kalman filtering. IEEE Transactions on Automatic Control, 42:534–537, 1970. [16] P.A. Meyer. Probability and Potentials. Blaisdell Publishing Company, 1966. [17] Y. Miyahara. Ultimate boundedness of the systems governed by stochastic differential equations. Nagoya Mathematical Journal, 47:111–144, 1972. [18] M.B. Nevelson and R.Z. Hasminskii. Stochastic Approximation and Recursive Estimation. American Mathematical Society, 1973. [19] Øksendal. Stochastic Differential Equations: an Introduction with Applications. SpringerVerlag, 6th edition, 2005. [20] R. Rajamani and J.K. Hedrick. Adaptive observer for active automotive suspensions: theory and experiments. IEEE Transactions on Control Systems Technology, 3:86–93, 1995. [21] Y. Rasis. Stochastic Observers for Nonlinear Dynamic Systems. PhD thesis, Washington University, 1974. [22] K. Reif, S. G¨ unther, E Yaz, and R. Unbehauen. Stochastic stability of the discrete-time extended Kalman filter. IEEE Transactions on Automatic Control, 44(4):714–728, 1999. [23] K. Reif, S. G¨ unther, E Yaz, and R. Unbehauen. Stochastic stability of the continuous-time extended Kalman filter. IEE Proceedings Control Theory & Applications, 147(1):45–52, 2000. [24] A.P. Sage and G.W. Husa. Adaptive filtering with unknown prior statistics. Joint Automatic Control Conference, pages 760–769, 1969. [25] T.J. Tarn and Y. Rasis. Observers for nonlinear stochastic systems. IEEE Transactions on Automatic Control, 21:441–448, 1976. [26] F. Wendling, F. Bartolomei, J.J. Bellanger1, and P. Chauvel. Epileptic fast activity can be explained by a model of impaired gabaergic dendritic inhibition. European Journal of Neuroscience, 15:1499–1508, 2002. 
[27] E. Yaz and A. Azemi. Observer design for discrete and continuous non-linear stochastic systems. International Journal of Systems Science, 24(12):2289–2302, 1993.

[28] M. Zakai. On the ultimate boundedness of moments associated with solutions of stochastic differential equations. SIAM Journal on Control, 5(4):390–397, 1967.

[29] M. Zakai. Some moment inequalities for stochastic integrals and for solutions of stochastic differential equations. Israel Journal of Mathematics, 5(3):170–176, 1967.