Parameter Estimation for Stochastic Evolution Equations with Non-commuting Operators
Sergey V. Lototsky*
Boris L. Rozovskii†
Abstract
A parameter estimation problem is considered for a stochastic evolution equation on a compact smooth manifold. Unlike previous works on the subject, no commutativity is assumed between the operators in the equation. The estimate is based on finite-dimensional projections of the solution. Under certain non-degeneracy assumptions the estimate is proved to be consistent and asymptotically normal as the dimension of the projections increases.
1 Introduction
Parameter estimation is a particular case of the inverse problem in which the solution of a certain equation is observed and conclusions must be made about the coefficients of the equation. In the deterministic setting, numerous examples of such problems in ecology, material sciences, biology, etc. are given in the book by Banks and Kunisch [1]. The stochastic term is usually introduced in the equation to take into account those components of the model that cannot be described exactly. In an abstract setting the parameter estimation problem is considered for an evolution equation
\[ du(t) + (A_0 + \theta A_1)u(t)\,dt = dW(t), \quad 0 < t \le T, \quad u(0) = u_0, \qquad (1.1) \]
where $\theta$ is the unknown parameter belonging to an open subset of the real line and $W = W(t)$ is a random perturbation. If $u$ is a random field, then a computable estimate of $\theta$ must be based on finite-dimensional projections of $u$ even if the whole trajectory is observed. A question that arises in this setting is to study the asymptotic properties of the estimate as the dimension of those projections increases while the length $T$ of the observation interval and the amplitude of the noise remain fixed. When $u$ is the solution of the Dirichlet boundary value problem in a domain of $\mathbb{R}^d$, this question was first studied by Huebner et al. [4] and further investigated by Huebner and Rozovskii [5], Huebner [3], and Piterbarg and Rozovskii [11]. The main assumption used in all those works was that the operators $A_0$ and $A_1$ in (1.1) have a common system of eigenfunctions.
* Institute for Mathematics and its Applications, University of Minnesota, Minneapolis, MN 55455. This work was partially supported by ONR Grant #N00014-95-1-0229 and the NSF through the Institute for Mathematics and its Applications.
† Center for Applied Mathematical Sciences, University of Southern California, Los Angeles, CA 90089. This work was partially supported by ONR Grant #N00014-95-1-0229 and ARO Grant DAAH 04-95-1-0164.
The objective of the current paper is to consider an estimate of $\theta$ for equation (1.1) without assuming anything about the eigenfunctions of the operators in the equation. For technical reasons the equation is considered on a compact smooth $d$-dimensional manifold so that there are no boundary conditions involved. The main assumption is that the operators $A_0$ and $A_1$ are of different orders and the operator $A_0 + \theta A_1$ is elliptic for all admissible values of $\theta$. The model is described in Section 2 and the main results are presented in Section 3. If $A_1$ is the leading operator, then the estimate of $\theta$ is consistent and asymptotically normal as the dimension $K$ of the projections tends to infinity. On the other hand, if $A_0$ is the leading operator, then the estimate of $\theta$ is consistent and asymptotically normal if
\[ \mathrm{order}(A_1) \ge \tfrac{1}{2}\big(\mathrm{order}(A_0 + \theta A_1) - d\big) \qquad (1.2) \]
and the operator $A_1$ satisfies a certain non-degeneracy property. In particular, condition (1.2) is necessary for consistency. When (1.2) does not hold, the asymptotic shift of the estimate is computed. The proof of the main theorem about the consistency and asymptotic normality is given in Section 5. In Section 4 an example is presented, illustrating how the obtained results can be applied to the estimation of either thermodiffusivity or the cooling coefficient in the heat balance equation with a variable velocity field.
It was shown in [5] that in the case of the Dirichlet problem in a domain of $\mathbb{R}^d$, if the operators $A_0$ and $A_1$ are self-adjoint elliptic with a common system of eigenfunctions, then condition (1.2) is necessary and sufficient for consistency, asymptotic normality, and asymptotic efficiency of the estimate.

2 The Setting
Let $M$ be a $d$-dimensional compact orientable $C^\infty$ manifold with a smooth positive measure $dx$. If $L$ is an elliptic positive definite self-adjoint differential operator of order $2m$ on $M$, then the operator $\Lambda = L^{1/(2m)}$ is elliptic of order 1 and generates the scale $\{H^s\}_{s\in\mathbb{R}}$ of Sobolev spaces on $M$ [7, 13]. All differential operators on $M$ are assumed to be non-zero with real $C^\infty(M)$ coefficients, and only real elements of $H^s$ will be considered. The variable $x$ will usually be omitted in the argument of functions defined on $M$.
In what follows, an alternative characterization of the spaces $\{H^s\}$ will be used. By Theorem I.8.3 in [13], the operator $L$ has a complete orthonormal system of eigenfunctions $\{e_k\}_{k\ge 1}$ in the space $L_2(M, dx)$ of square integrable functions on $M$. With no loss of generality it can be assumed that each $e_k(x)$ is real. Then for every $f \in L_2(M, dx)$ the representation
\[ f = \sum_{k\ge 1} \psi_k(f)\, e_k \]
holds, where
\[ \psi_k(f) = \int_M f(x)\, e_k(x)\, dx. \]
If $l_k > 0$ is the eigenvalue of $L$ corresponding to $e_k$ and $\lambda_k := l_k^{1/(2m)}$, then, for $s \ge 0$, $H^s = \{f \in L_2(M, dx) : \sum_{k\ge 1} \lambda_k^{2s}|\psi_k(f)|^2 < \infty\}$, and for $s < 0$, $H^s$ is the closure of $L_2(M, dx)$ in the norm $\|f\|_s = \sqrt{\sum_{k\ge 1}\lambda_k^{2s}|\psi_k(f)|^2}$. As a result, every element $f$ of the space $H^s$, $s \in \mathbb{R}$, can be identified with a sequence $\{\psi_k(f)\}_{k\ge 1}$ such that $\sum_{k\ge 1}\lambda_k^{2s}|\psi_k(f)|^2 < \infty$. The space $H^s$, equipped with the inner product
\[ (f, g)_s = \sum_{k\ge 1} \lambda_k^{2s}\,\psi_k(f)\,\psi_k(g), \quad f, g \in H^s, \qquad (2.1) \]
is a Hilbert space.
A cylindrical Brownian motion $W = (W(t))_{0\le t\le T}$ on $M$ is defined as follows: for every $t \in [0, T]$, $W(t)$ is the element of $\cup_s H^s$ such that $\psi_k(W(t)) = w_k(t)$, where $\{w_k\}_{k\ge 1}$ is a collection of independent one-dimensional Wiener processes on the given probability space $(\Omega, \mathcal{F}, \mathbb{F}, \mathbf{P})$ with a complete filtration $\mathbb{F} = \{\mathcal{F}_t\}_{0\le t\le T}$. Since by Theorem II.15.2 in [13]
\[ \lambda_k \asymp k^{1/d}, \quad k \to \infty, \]
it follows that $W(t) \in H^s$ for every $s < -d/2$. Direct computations show that $W$ is an $H^s$-valued Wiener process with the covariance operator $\Lambda^{2s}$. This definition of $W$ agrees with the alternative definitions of the cylindrical Brownian motion [9, 14].
Let $A$, $B$, and $N$ be differential operators on $M$ of orders $\mathrm{order}(A)$, $\mathrm{order}(B)$, and $\mathrm{order}(N)$ respectively. It is assumed that $\max(\mathrm{order}(A), \mathrm{order}(B), \mathrm{order}(N)) < 2m$. Consider the random field $u$ defined on $M$ by the evolution equation
\[ du(t) + [\theta_1(L + A) + \theta_2 B + N]u(t)\,dt = dW(t), \quad 0 < t \le T, \quad u(0) = u_0. \qquad (2.2) \]
Here $\theta_1 > 0$, $\theta_2 \in \mathbb{R}$, and the dependence of $u$ and $W$ on $x$ and $\omega$ is suppressed. If the trajectory $u(t)$, $0 \le t \le T$, is observed, then the following scalar parameter estimation problems can be stated:
1). estimate $\theta_1$ assuming that $\theta_2$ is known;
2). estimate $\theta_2$ assuming that $\theta_1$ is known.
Remark 2.1 The general model
\[ du(t) + [\theta_1 A_1 + \theta_2 A_2 + N]u(t)\,dt = dW(t), \quad 0 < t \le T, \quad u(0) = u_0 \]
is reduced to (2.2) if the operator $\theta_1 A_1 + \theta_2 A_2$ is elliptic of order $2m$ for all admissible values of the parameters $\theta_1, \theta_2$ and $\mathrm{order}(A_1) \ne \mathrm{order}(A_2)$. For example, if $\mathrm{order}(A_1) = 2m$, then $L = (A_1 + A_1^*)/2 + (c + 1)I$, $A = (A_1 - A_1^*)/2 - (c + 1)I$, $B = A_2$, where $c$ is the lower bound on eigenvalues of $(A_1 + A_1^*)/2$ and $I$ is the identity operator. Indeed, by Corollary 2.1.1 in [7], if an operator $P$ is of even order with real coefficients, then the operator $P - P^*$ is of lower order than $P$.
With obvious modifications the results presented below are also valid when the operators $A_1$, $A_2$ have the same order, under the additional assumption that $A_i = L_i + A_i'$, $i = 1, 2$, where the operators $L_i$ are elliptic of order $2m$ with a common system of eigenfunctions and $A_i'$ are operators of lower order.
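The statement that $W(t) \in H^s$ for every $s < -d/2$ rests on the finiteness of $E\|W(t)\|_s^2 = t\sum_{k\ge 1}\lambda_k^{2s}$. The following Python sketch illustrates this numerically. It is only an illustration under an assumption: the asymptotic relation $\lambda_k \asymp k^{1/d}$ is replaced by the hypothetical exact model $\lambda_k = k^{1/d}$, and the function name `sobolev_norm_sq_partial` is not taken from the paper.

```python
def sobolev_norm_sq_partial(s, d, K, t=1.0):
    """Partial sum t * sum_{k<=K} lambda_k^(2s) for the model lambda_k = k^(1/d).

    This approximates E||W(t)||_s^2 truncated to the first K modes.
    The exact power law for lambda_k is an assumption of this sketch.
    """
    return t * sum(k ** (2.0 * s / d) for k in range(1, K + 1))


if __name__ == "__main__":
    d = 2
    # s = -1.5 lies below -d/2 = -1 (partial sums stabilise);
    # s = -1.0 is the borderline case (partial sums grow like ln K).
    for s in (-1.5, -1.0):
        sums = [sobolev_norm_sq_partial(s, d, K) for K in (10**3, 10**4, 10**5)]
        print(s, [round(v, 4) for v in sums])
```

In this model the partial sums settle for $s < -d/2$ and keep growing at the borderline $s = -d/2$, which matches the threshold stated above.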
Before discussing possible solutions to the above parameter estimation problems, it seems appropriate to mention the analytical properties of the field $u$.
Notation: $a_k \asymp b_k$ means $0 < c_1 \le \liminf_{k\to\infty}(a_k/b_k) \le \limsup_{k\to\infty}(a_k/b_k) \le c_2 < \infty$.
0. Then ellipticity of the operator $L$ implies that for every $s \in \mathbb{R}$ there exist positive constants $C_1$ and $C_2$ so that for every $f \in C^\infty$
\[ -\big((\theta_1(L + A) + \theta_2 B + N)f, f\big)_s \le -C_1\|f\|_{s+m}^2 + C_2\|f\|_s^2, \]
which means that the operator $-(\theta_1(L + A) + \theta_2 B + N)$ is coercive in every normal triple $\{H^{s+m}, H^s, H^{s-m}\}$. The statement of the theorem now follows from Theorem 3.1.4 in [12].
3 The Estimate and Its Properties
Both parameter estimation problems for (2.2) can be stated as follows: estimate $\theta \in \Theta$ from the observations of
\[ du^\theta(t) + (A_0 + \theta A_1)u^\theta(t)\,dt = dW(t). \qquad (3.1) \]
Indeed, if $\theta_2$ is known, then $A_0 = \theta_2 B + N$, $\theta = \theta_1$, $\Theta = (0, +\infty)$, $A_1 = L + A$, and if $\theta_1$ is known, then $A_0 = \theta_1(L + A) + N$, $\theta = \theta_2$, $\Theta = \mathbb{R}$, $A_1 = B$. All main results will be stated in terms of (2.2), and (3.1) will play an auxiliary role.
It is assumed that the observed field $u^{\theta^0}$ satisfies (3.1) for some unknown but fixed value $\theta^0$ of the parameter $\theta$. Depending on the circumstances, $\theta^0$ can correspond to either $\theta_1$ or $\theta_2$ in (2.2), the other parameter being fixed and known. Even though the whole random field $u^{\theta^0}(t, x)$ is observed, the estimate of $\theta^0$ will be computed using only the finite-dimensional processes $\Pi^K u^{\theta^0}$, $\Pi^K A_0 u^{\theta^0}$, and $\Pi^K A_1 u^{\theta^0}$. The operator $\Pi^K$ used to construct the estimate is defined as follows: for every $f = \{\psi_k(f)\}_{k\ge 1} \in \cup_s H^s$,
\[ \Pi^K f = \sum_{k=1}^K \psi_k(f)\, e_k. \]
By (3.1),
\[ d\Pi^K u^\theta(t) + \Pi^K(A_0 + \theta A_1)u^\theta(t)\,dt = dW^K(t), \qquad (3.2) \]
where $W^K(t) = \Pi^K W(t)$. The process $\Pi^K u^\theta = (\Pi^K u^\theta(t), \mathcal{F}_t)_{0\le t\le T}$ is finite dimensional, continuous in the mean, and Gaussian, but not, in general, a diffusion process because the operators $A_0$ and $A_1$ need not commute with $\Pi^K$. Denote by $\mathbf{P}^{\theta,K}$ the measure in $C([0, T]; \Pi^K(H^0))$ generated by the solution of (3.2). The measure $\mathbf{P}^{\theta,K}$ is absolutely continuous with respect to the measure $\mathbf{P}^{\theta^0,K}$ for all $\theta \in \Theta$ and $K \ge 1$. Indeed, denote by $\mathcal{F}_t^{K,\theta}$ the $\sigma$-algebra generated by $\Pi^K u^\theta(s)$, $0 \le s \le t$, and let $U_t^{\theta,K}(X)$ be the operator from $C([0, T]; \Pi^K(H^0))$ to $C([0, T]; \Pi^K(H^0))$ such that for all $t \in [0, T]$ and $\theta \in \Theta$,
\[ U_t^{\theta,K}(\Pi^K u^\theta) = E\big(\Pi^K(A_0 + \theta A_1)u^\theta(t) \,\big|\, \mathcal{F}_t^{K,\theta}\big) \quad (\mathbf{P}\text{-a.s.}) \]
Then by Theorem 7.12 in [8] the process $\Pi^K u^\theta$ satisfies
\[ d\Pi^K u^\theta(t) = U_t^{\theta,K}(\Pi^K u^\theta)\,dt + d\tilde W^{\theta,K}(t), \quad \Pi^K u^\theta(0) = 0, \]
where $\tilde W^{\theta,K}(t) = \sum_{k=1}^K \tilde w_k^\theta(t)e_k$ and $\tilde w_k^\theta(t)$, $k = 1, \ldots, K$, are independent one-dimensional standard Wiener processes, in general different for different $\theta$. Since $\{\Pi^K(A_0 + \theta A_1)u^\theta, W^K\}$ is a Gaussian system for every $\theta \in \Theta$, it follows from Theorem 7.16 and Lemma 4.10 in [8] that
\[ \frac{d\mathbf{P}^{\theta,K}}{d\mathbf{P}^{\theta^0,K}}(\Pi^K u^{\theta^0}) = \exp\Big\{ \int_0^T \big(U_t^{\theta,K}(\Pi^K u^{\theta^0}) - U_t^{\theta^0,K}(\Pi^K u^{\theta^0}),\, d\Pi^K u^{\theta^0}(t)\big)_0 - \frac{1}{2}\int_0^T \big(\|U_t^{\theta,K}(\Pi^K u^{\theta^0})\|_0^2 - \|U_t^{\theta^0,K}(\Pi^K u^{\theta^0})\|_0^2\big)\,dt \Big\}. \]
By definition, the maximum likelihood estimate (MLE) of $\theta^0$ is then equal to $\arg\max_\theta\, (d\mathbf{P}^{\theta,K}/d\mathbf{P}^{\theta^0,K})(\Pi^K u^{\theta^0})$, but since, in general, the functional $U_t^{\theta,K}(X)$ is not known explicitly, this estimate cannot be computed. The situation is much simpler if the operators $A_0$ and $A_1$ commute with $\Pi^K$ so that $\Pi^K A_i = \Pi^K A_i\Pi^K$, $i = 0, 1$, and $U_t^{\theta,K}(X) = \Pi^K(A_0 + \theta A_1)X(t)$; in this case, the MLE $\hat\theta^K$ of $\theta^0$ is computable and, as shown in [5],
\[ \hat\theta^K = \frac{\int_0^T \big(\Pi^K A_1 u^{\theta^0}(t),\, d\Pi^K u^{\theta^0}(t) - \Pi^K A_0 u^{\theta^0}(t)\,dt\big)_0}{\int_0^T \|\Pi^K A_1 u^{\theta^0}(t)\|_0^2\,dt} \qquad (3.3) \]
with the convention $0/0 = 0$. Of course, expression (3.3) is well defined even when the operators $A_0$ and $A_1$ do not commute with $\Pi^K$, and if the whole trajectory $u^{\theta^0}$ is observed, then the values of $\Pi^K A_0 u^{\theta^0}(t)$ and $\Pi^K A_1 u^{\theta^0}(t)$ can be evaluated, making (3.3) computable. Even though (3.3) is not, in general, the maximum likelihood estimate of $\theta^0$, it is a natural estimate to consider.
To simplify the notation, the superscript $\theta^0$ will be omitted wherever possible so that $u(t)$ is the solution of (2.2) or (3.1) corresponding to the true value of the unknown parameter. To study the properties of (3.3), note first of all that for all sufficiently large $K$,
\[ \mathbf{P}\Big\{\int_0^T \|\Pi^K A_1 u(t)\|_0^2\,dt > 0\Big\} = 1. \qquad (3.4) \]
Indeed, by assumption, the operator $A_1$ is not identically zero and therefore $(\Pi^K A_1 W(t))_{0\le t\le T}$ is a continuous nonzero square integrable martingale, while $\big(\int_0^t \Pi^K A_1[\theta_1(L + A) + \theta_2 B]u(s)\,ds\big)_{0\le t\le T}$ is a continuous process with bounded variation. It then follows from (3.3) and (3.4) that
\[ \hat\theta^K = \theta^0 + \frac{\int_0^T (\Pi^K A_1 u(t),\, dW^K(t))_0}{\int_0^T \|\Pi^K A_1 u(t)\|_0^2\,dt} \quad (\mathbf{P}\text{-a.s.}) \qquad (3.5) \]
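The ratio-type estimate above can be tried out on a toy problem. The following Python sketch is a minimal illustration under stated assumptions, not the construction analysed in the paper: it takes the commuting (diagonal) case with the hypothetical choices $A_0 = 0$, $A_1 = L$, eigenvalues $l_k = k$, and zero initial condition, so that each Fourier coefficient solves $d\psi_k = -\theta l_k\psi_k\,dt + dw_k$. Because the drift in an equation of the form $du + \theta A_1 u\,dt = dW$ is $-\theta A_1 u$, the sketch uses the least-squares ratio with an explicit minus sign; this sign convention, the exact Ornstein-Uhlenbeck time stepping, and the name `simulate_estimate` are all choices of the sketch.

```python
import math
import random


def simulate_estimate(theta, K, T=1.0, n_steps=10_000, seed=0):
    """Estimate theta from the first K Fourier modes in a diagonal toy model.

    Assumptions of this sketch (hypothetical, for illustration only):
    A_0 = 0, A_1 = L with eigenvalues l_k = k, zero initial condition, so
    d psi_k = -theta * l_k * psi_k dt + d w_k for k = 1, ..., K.
    """
    rng = random.Random(seed)
    dt = T / n_steps
    num = 0.0  # Ito sum approximating sum_k int_0^T l_k psi_k d psi_k
    den = 0.0  # Riemann sum approximating sum_k int_0^T (l_k psi_k)^2 dt
    for k in range(1, K + 1):
        l = float(k)
        decay = math.exp(-theta * l * dt)
        std = math.sqrt((1.0 - decay * decay) / (2.0 * theta * l))
        psi = 0.0
        for _ in range(n_steps):
            # exact one-step transition of the Ornstein-Uhlenbeck process
            nxt = decay * psi + std * rng.gauss(0.0, 1.0)
            num += l * psi * (nxt - psi)
            den += (l * psi) ** 2 * dt
            psi = nxt
    # The drift is -theta * l_k * psi_k, hence the minus sign; 0/0 := 0.
    return -num / den if den > 0.0 else 0.0


if __name__ == "__main__":
    for K in (5, 15, 30):
        print(K, simulate_estimate(theta=1.0, K=K))
```

With a fixed observation window $T$, increasing the number of modes $K$ is what tightens the estimate in this toy model; the time step only needs to be small relative to $1/(\theta\, l_K)$ to keep the discretization bias of the Ito sum negligible.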
Representation (3.5) will be used to study the asymptotic properties of $\hat\theta^K$ as $K \to \infty$. To get a consistent estimate, it is intuitively clear that $\int_0^T \|\Pi^K A_1 u(t)\|_0^2\,dt$ should tend to infinity as $K \to \infty$, and this requires a certain non-degeneracy of the operator $A_1$.
Definition 3.1 A differential operator $P$ of order $p$ on $M$ is called essentially non-degenerate if
\[ \|Pf\|_s^2 \ge \varepsilon\|f\|_{s+p}^2 - L\|f\|_{s+p-\delta}^2 \qquad (3.6) \]
for all $f \in C^\infty(M)$, $s \in \mathbb{R}$, with some positive constants $\varepsilon, L, \delta$.
If the operator $P^*P$ is elliptic of order $2p$, then the operator $P$ is essentially non-degenerate because in this case the operator $P^*P$ is positive definite and self-adjoint so that the operator $(P^*P)^{1/(2p)}$ generates an equivalent scale of Sobolev spaces on $M$. In particular, every elliptic operator satisfies (3.6). Since, by Corollary 2.1.2 in [7], for every differential operator $P$ the operator $P^*P - PP^*$ is of order at most $2p - 1$, the operator $P$ is essentially non-degenerate if and only if $P^*$ is.
Let us now formulate the main result concerning the properties of the estimate (3.5). Recall that the observed field $u$ satisfies
\[ du(t) + [\theta_1^0(L + A) + \theta_2^0 B + N]u(t)\,dt = dW(t), \quad 0 < t \le T, \quad u(0) = u_0, \qquad (3.7) \]
with one of $\theta_1 = \theta_1^0$ or $\theta_2 = \theta_2^0$ known. According to (3.5), the estimate of the remaining parameter is given by
\[ \hat\theta_1^K = \frac{\int_0^T \big(\Pi^K(L + A)u(t),\, d\Pi^K u(t) - \Pi^K(\theta_2^0 B + N)u(t)\,dt\big)_0}{\int_0^T \|\Pi^K(L + A)u(t)\|_0^2\,dt}, \qquad (3.8) \]
\[ \hat\theta_2^K = \frac{\int_0^T \big(\Pi^K Bu(t),\, d\Pi^K u(t) - \Pi^K(\theta_1^0(L + A) + N)u(t)\,dt\big)_0}{\int_0^T \|\Pi^K Bu(t)\|_0^2\,dt}. \qquad (3.9) \]
The following assumptions will be in force throughout the rest of the section.
H1. Equation (3.7) is considered on a compact $d$-dimensional smooth manifold $M$;
H2. $\theta_1^0 > 0$, $\theta_2^0 \in \mathbb{R}$;
H3. $L$ is a positive definite self-adjoint elliptic operator of order $2m$;
H4. $\max(\mathrm{order}(A), \mathrm{order}(B), \mathrm{order}(N)) < 2m$;
H5. $u_0$ is $\mathcal{F}_0$-measurable, $u_0 \in L_2(\Omega; H^{-d/2})$, and $u_0$ is independent of $W$.
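Theorem 3.2 If $\theta_2^0$ is known, then the estimate (3.8) of $\theta_1^0$ is consistent and asymptotically normal:
\[ \mathbf{P}\text{-}\lim_{K\to\infty}|\hat\theta_1^K - \theta_1^0| = 0; \qquad \Psi_{K,1}(\theta_1^0 - \hat\theta_1^K) \xrightarrow{d} \mathcal{N}(0, 1), \]
where $\Psi_{K,1} = \sqrt{(T/(2\theta_1^0))\sum_{n=1}^K l_n}$.
If $\theta_1^0$ is known, then the estimate (3.9) of $\theta_2^0$ is consistent and asymptotically normal under the additional assumption that the operator $B$ is essentially non-degenerate and $\mathrm{order}(B) = b \ge m - d/2$. In that case,
\[ \mathbf{P}\text{-}\lim_{K\to\infty}|\hat\theta_2^K - \theta_2^0| = 0; \qquad \Psi_{K,2}(\theta_2^0 - \hat\theta_2^K) \xrightarrow{d} \mathcal{N}(0, 1), \]
where $\Psi_{K,2} \asymp \sqrt{\sum_{n=1}^K l_n^{(b-m)/m}}$.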
This theorem is proved in Section 5.
Remark 3.3
1. Since $l_k \asymp k^{2m/d}$, the rate of convergence for $\hat\theta_1^K$ is $\Psi_{K,1} \asymp K^{m/d + 1/2}$, and for $\hat\theta_2^K$ it is $\Psi_{K,2} \asymp K^{(b-m)/d + 1/2}$ if $b > m - d/2$, and $\Psi_{K,2} \asymp \sqrt{\ln K}$ if $b = m - d/2$.
2. All the statements of the theorem remain true if, instead of differential operators, pseudo-differential operators of class $S^{\cdot}_{\rho,\delta}$ are considered with $\rho > \delta$ [7, 13].
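The growth rates in Remark 3.3 follow from comparing partial sums of powers of $k$ with the corresponding powers of $K$. The Python sketch below illustrates this under simplifying assumptions: it uses the hypothetical exact model $l_k = k^{2m/d}$ in place of the asymptotic relation and drops the constant factor $T/(2\theta_1^0)$; the function names are illustrative and not taken from the paper.

```python
import math


def power_sum(exponent, K):
    """Partial sum sum_{k<=K} k^exponent.

    With the model l_k = k^(2m/d) (an assumption of this sketch) this equals
    sum_{k<=K} l_k^((p-m)/m) for exponent = 2(p-m)/d.
    """
    return sum(k ** exponent for k in range(1, K + 1))


def rate_theta1(m, d, K):
    """Model for Psi_{K,1} up to a constant: sqrt(sum_{k<=K} l_k)."""
    return math.sqrt(power_sum(2.0 * m / d, K))


def rate_theta2(m, d, b, K):
    """Model for Psi_{K,2} up to a constant: sqrt(sum_{k<=K} l_k^((b-m)/m))."""
    return math.sqrt(power_sum(2.0 * (b - m) / d, K))


if __name__ == "__main__":
    m, d = 1, 2
    for K in (10**2, 10**3, 10**4):
        # b = 0 is the borderline case b = m - d/2 for m = 1, d = 2
        print(K,
              rate_theta1(m, d, K) / K ** (m / d + 0.5),
              rate_theta2(m, d, 0, K) ** 2 / math.log(K))
```

Both printed ratios stay bounded away from zero and infinity as $K$ grows, which is what the relation $\asymp$ expresses in the two cases of the remark.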
Denote by $\mathcal{W}$ the set of real valued non-negative functions $h = h(x)$, $x \in \mathbb{R}$, that are nondecreasing for $x > 0$ and satisfy $h(0) = 0$, $h(-x) = h(x)$.
Theorem 3.4 Assume that $E\|u_0\|_{-d/2}^q < \infty$ for all $q > 0$. Let $h \in \mathcal{W}$ be a function so that $|h(x)| \le C(1 + |x|^\gamma)$ for some $C, \gamma > 0$. Denote by $g$ a Gaussian random variable with zero mean and unit variance. If $\theta_2^0$ is known, then the estimate (3.8) of $\theta_1^0$ satisfies
\[ \lim_{K\to\infty} E\,h\big(\Psi_{K,1}(\hat\theta_1^K - \theta_1^0)\big) = E\,h(g). \]
If $\theta_1^0$ is known, the operator $B$ is essentially non-degenerate, and $\mathrm{order}(B) \ge m - d/2$, then the estimate (3.9) of $\theta_2^0$ satisfies
\[ \lim_{K\to\infty} E\,h\big(\Psi_{K,2}(\hat\theta_2^K - \theta_2^0)\big) = E\,h(g). \]
The proof of Theorem 3.4 is based on the following result to be proved later.
Lemma 3.5 If $P$ is an essentially non-degenerate operator of order $p > m - d/2$ and
\[ \Psi_K = \sqrt{E\int_0^T \|\Pi^K Pu(t)\|_0^2\,dt}, \]
then for every $q > 0$ there exists a $K_0 = K_0(q) > 0$ so that
\[ \sup_{K\ge K_0} E\left|\Psi_K\,\frac{\int_0^T (\Pi^K Pu(t),\, dW^K(t))_0}{\int_0^T \|\Pi^K Pu(t)\|_0^2\,dt}\right|^q < \infty. \]
Proof of Theorem 3.4. With no loss of generality it can be assumed that the function $h$ is continuous. Indeed, the monotonicity assumption implies that $h$ has at most countably many discontinuities, while the random variables in question have densities with respect to the Lebesgue measure. After that the statements of the theorem follow from Theorem 3.2, since Lemma 3.5 implies that the families of random variables $\{h(\Psi_{K,1}(\hat\theta_1^K - \theta_1^0)),\ K \ge K_0(\gamma + 1)\}$ and $\{h(\Psi_{K,2}(\hat\theta_2^K - \theta_2^0)),\ K \ge K_0(\gamma + 1)\}$ are uniformly integrable. $\Box$
Theorem 3.6 If $\theta_1^0$ is known and $\mathrm{order}(B) = b < m - d/2$, then the measures generated in $C([0, T]; H^s)$, $s < -d/2$, by the solutions of (3.7) are equivalent for all $\theta_2 \in \mathbb{R}$ and
\[ \mathbf{P}\text{-}\lim_{K\to\infty}\hat\theta_2^K = \theta_2^0 + \frac{\int_0^T (Bu(t),\, dW(t))_0}{\int_0^T \|Bu(t)\|_0^2\,dt}. \qquad (3.10) \]
Proof. By (2.4),
\[ E\int_0^T \|Bu(t)\|_0^2\,dt < \infty \qquad (3.11) \]
for all $\theta_2 \in \mathbb{R}$, and therefore the stochastic integral $\int_0^T (Bu(t), dW(t))_0$ is well defined [9, 14]. Then (3.10) follows from (3.9) and the properties of the stochastic integral.
Next, denote by $\mathbf{P}^{\theta_2}$ the measure generated in $C([0, T]; H^s)$, $s < -d/2$, by the solution of (3.7) corresponding to the given value of $\theta_2$. Inequality (3.11) implies that
\[ \int_0^T \|Bu(t)\|_0^2\,dt < \infty \quad (\mathbf{P}\text{-a.s.}) \qquad (3.12) \]
and therefore by Corollary 1 in [9] the measures $\mathbf{P}^{\theta_2}$ are equivalent for all $\theta_2 \in \mathbb{R}$ with the likelihood ratio
\[ \frac{d\mathbf{P}^{\theta_2}}{d\mathbf{P}^{\theta_2^0}}(u) = \exp\Big\{(\theta_2 - \theta_2^0)\int_0^T (Bu(t),\, dW(t))_0 - \frac{1}{2}(\theta_2 - \theta_2^0)^2\int_0^T \|Bu(t)\|_0^2\,dt\Big\}, \qquad (3.13) \]
where $u(t)$ is the solution of (3.7) corresponding to $\theta_2 = \theta_2^0$. Note that
\[ \hat\theta_2 = \theta_2^0 + \frac{\int_0^T (Bu(t),\, dW(t))_0}{\int_0^T \|Bu(t)\|_0^2\,dt} \]
maximizes the likelihood ratio (3.13). $\Box$
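The closing observation of the proof, that the limit in (3.10) maximizes the likelihood ratio (3.13), is a one-line calculus argument. It can be spelled out as follows; the abbreviations $\ell$, $I$, and $J$ are introduced here only for this illustration and do not appear in the paper.

```latex
% Log-likelihood ratio from (3.13), with the shorthand
%   I = \int_0^T (Bu(t), dW(t))_0,   J = \int_0^T \|Bu(t)\|_0^2 \, dt .
\[
  \ell(\theta_2) = (\theta_2 - \theta_2^0)\, I - \tfrac{1}{2}(\theta_2 - \theta_2^0)^2 J .
\]
% On the event J > 0 the function \ell is a strictly concave quadratic in \theta_2:
\[
  \ell'(\theta_2) = I - (\theta_2 - \theta_2^0)\, J = 0
  \iff \theta_2 = \theta_2^0 + \frac{I}{J},
  \qquad
  \ell''(\theta_2) = -J < 0 .
\]
```

So, on the event $J > 0$, the unique maximizer is $\theta_2^0 + I/J$, which is the right-hand side of (3.10).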
If the operators $A$, $B$, $N$ have the same eigenfunctions as $L$, then the coefficients $\psi_k(u(t))$ are independent (for different $k$) Ornstein-Uhlenbeck processes and $\Pi^K Au(t) = \Pi^K A\Pi^K u(t)$, with similar relations for $B$ and $N$. As a result, other properties of (3.8) and (3.9) can be established, including strong consistency and asymptotic efficiency [3, 5, 11], and, in the case of continuous time observations, all estimates are computable explicitly in terms of $\psi_k(u(t))$, $k = 1, \ldots, K$.
In general, the computation of $\hat\theta_1^K$ and $\hat\theta_2^K$ using (3.8) and (3.9) respectively requires the knowledge of the whole field $u$ rather than its projection. Still, the operators $\Pi^K(L + A)$, $\Pi^K B$, and $\Pi^K N$ have finite-dimensional range, which should make the computations feasible. Another option is to replace $u$ by $\Pi^K u$. This can simplify the computations, but the result is, in some sense, even further from the maximum likelihood estimate, because some information is lost, and the asymptotic properties of the resulting estimate are more difficult to study. In general, the construction of an estimate depending only on the projection $\Pi^K u(t)$ is equivalent to parameter estimation for a partially observed system with observations given by (3.2). Without special assumptions on the operators $A_0$ and $A_1$, this problem is extremely difficult even in the finite-dimensional setting.
4 An Example
Consider the following stochastic partial differential equation:
\[ du(t, x) = \big(D\nabla^2 u(t, x) - (\vec v(x), \nabla)u(t, x) - \lambda u(t, x)\big)\,dt + dW(t, x). \qquad (4.1) \]
It is called the heat balance equation and describes the dynamics of the sea surface temperature anomalies [2]. In (4.1), $x = (x_1, x_2) \in \mathbb{R}^2$, $\vec v(x) = (v_1(x_1, x_2), v_2(x_1, x_2))$ is the velocity field of the top layer of the ocean (it is assumed to be known), $D$ is the thermodiffusivity, and $\lambda$ is the cooling coefficient. The equation is considered on a rectangle $|x_1| \le a$, $|x_2| \le c$ with periodic boundary conditions $u(t, -a, x_2) = u(t, a, x_2)$, $u(t, x_1, -c) = u(t, x_1, c)$ and zero initial condition. This reduces (4.1) to the general model (3.7) with $M$ being a torus, $d = 2$, $L = -\nabla^2 = -\partial^2/\partial x_1^2 - \partial^2/\partial x_2^2$, $A = 0$, $B = I$ (the identity operator), $N = (\vec v, \nabla) = v_1(x_1, x_2)\,\partial/\partial x_1 + v_2(x_1, x_2)\,\partial/\partial x_2$, $\theta_1 = D$, $\theta_2 = \lambda$. Then $\mathrm{order}(L) = 2$ (so that $m = 1$), $\mathrm{order}(A) = 0$, $\mathrm{order}(B) = 0$ (so that $b = 0$), and $\mathrm{order}(N) = 1$. The basis $\{e_k\}_{k\ge 1}$ is the suitably ordered collection of real and imaginary parts of
\[ g_{n_1,n_2}(x_1, x_2) = \frac{1}{\sqrt{4ac}}\exp\big\{\sqrt{-1}\,\pi(x_1 n_1/a + x_2 n_2/c)\big\}, \quad n_1, n_2 \ge 0. \]
By Theorem 3.2, the estimate of $D$ is consistent and asymptotically normal, and the rate of convergence is $\Psi_{K,1} \asymp K$; the estimate of $\lambda$ is also consistent and asymptotically normal with the rate of convergence $\Psi_{K,2} \asymp \sqrt{\ln K}$, since $b = 0 = m - d/2$ and (3.6) holds. Unlike the case of commuting operators, the proposed approach allows a non-constant velocity field. Still, a significant limitation is that the value of $\vec v(x)$ must be known.
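The rates quoted in the example rely on the eigenvalue growth $l_k \asymp k^{2m/d} = k$ for the Laplacian on the two-dimensional torus. The Python sketch below checks this growth numerically. It is an illustration under assumptions of its own: eigenvalues are enumerated over all integer pairs $(n_1, n_2)$, the zero mode is skipped, and the lattice is truncated at a hypothetical cutoff `n_max`, which must be large enough to contain the first $K$ eigenvalues.

```python
import math


def torus_eigenvalues(a, c, K, n_max=60):
    """First K positive eigenvalues of -Laplacian on the torus [-a, a] x [-c, c].

    Each integer pair (n1, n2) != (0, 0) contributes the eigenvalue
    pi^2 * (n1^2 / a^2 + n2^2 / c^2). Skipping the zero mode and truncating
    the lattice at |n1|, |n2| <= n_max are choices of this sketch.
    """
    vals = []
    for n1 in range(-n_max, n_max + 1):
        for n2 in range(-n_max, n_max + 1):
            if n1 == 0 and n2 == 0:
                continue
            vals.append(math.pi ** 2 * (n1 ** 2 / a ** 2 + n2 ** 2 / c ** 2))
    vals.sort()
    return vals[:K]


if __name__ == "__main__":
    a, c = 1.0, 1.0
    eigs = torus_eigenvalues(a, c, 5000)
    for k in (100, 1000, 5000):
        # Weyl-type comparison: l_k should be close to pi * k / (a * c)
        print(k, eigs[k - 1] / (math.pi * k / (a * c)))
```

The printed ratios stay close to one, consistent with linear growth of $l_k$ in $k$, so that $\sum_{k\le K} l_k$ grows like $K^2$ and $\sum_{k\le K} l_k^{-1}$ grows like $\ln K$.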
5 Proof of Theorem 3.2
Hereafter, $u(t)$ is the solution of (3.7) corresponding to the true value of the parameters ($\theta_1^0$ and $\theta_2^0$) and $C$ is a generic constant with possibly different values in different places. To prove the asymptotic normality of the estimate, the following version of the central limit theorem will be used. The proof can be found in [3].
Lemma 5.1 If $P$ is a differential operator on $M$ and
\[ \mathbf{P}\text{-}\lim_{K\to\infty}\frac{\int_0^T \|\Pi^K Pu(t)\|_0^2\,dt}{E\int_0^T \|\Pi^K Pu(t)\|_0^2\,dt} = 1, \qquad (5.1) \]
then
\[ \lim_{K\to\infty}\frac{\int_0^T (\Pi^K Pu(t),\, dW^K(t))_0}{\sqrt{E\int_0^T \|\Pi^K Pu(t)\|_0^2\,dt}} = \mathcal{N}(0, 1) \qquad (5.2) \]
in distribution.
Once (5.1) and (5.2) hold and
\[ \lim_{K\to\infty} E\int_0^T \|\Pi^K Pu(t)\|_0^2\,dt = +\infty, \]
the convergence
\[ \mathbf{P}\text{-}\lim_{K\to\infty}\frac{\int_0^T (\Pi^K Pu(t),\, dW^K(t))_0}{\int_0^T \|\Pi^K Pu(t)\|_0^2\,dt} = 0 \]
follows. Thus, it suffices to establish (5.1) and compute the asymptotics of $E\int_0^T \|\Pi^K Pu(t)\|_0^2\,dt$ for a suitable operator $P$.
If $\psi_k(t) := \psi_k(u(t))$, then (3.7) implies
\[ d\psi_k(t) = -\big(\theta_1^0 l_k\psi_k(t) + \psi_k((\theta_1^0 A + \theta_2^0 B + N)u(t))\big)\,dt + dw_k(t), \quad \psi_k(0) = \psi_k(u_0). \]
According to the variation of parameters formula, the solution of this equation is given by $\psi_k(t) = \xi_k(t) + \eta_k(t)$, where
\[ \xi_k(t) = \int_0^t e^{-\theta_1^0 l_k(t-s)}\,dw_k(s), \]
\[ \eta_k(t) = \psi_k(0)e^{-\theta_1^0 l_k t} - \int_0^t e^{-\theta_1^0 l_k(t-s)}\psi_k\big((\theta_1^0 A + \theta_2^0 B + N)u(s)\big)\,ds := \eta_k^1(t) + \eta_k^2(t). \]
If $\xi(t)$ and $\eta(t)$ are the elements of $\cup_s H^s$ defined by the sequences $\{\xi_k(t)\}_{k\ge 1}$ and $\{\eta_k(t)\}_{k\ge 1}$ respectively, then the solution of (3.7) can be written as $u(t) = \xi(t) + \eta(t)$. The following technical result will be used in the sequel. The proof is given in the Appendix.
Lemma 5.2 If $a > 0$ and $f(t) \ge 0$, then
\[ \int_0^T \Big(\int_0^t e^{-a(t-s)}f(s)\,ds\Big)^2 dt \le \frac{\int_0^T f^2(t)\,dt}{a^2}. \]
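The inequality of Lemma 5.2 can be sanity-checked numerically. The following Python sketch is only a numerical illustration, not a proof: it approximates $f$ by a piecewise constant function on a uniform grid (an assumption of the sketch), propagates the convolution exactly for that approximation, and compares Riemann sums of both sides. The function name `lemma_5_2_sides` is illustrative.

```python
import math


def lemma_5_2_sides(a, f, T=1.0, n=2000):
    """Approximate both sides of the inequality in Lemma 5.2.

    f is replaced by its midpoint values on a uniform grid with n cells
    (piecewise constant approximation, an assumption of this sketch).
    Returns (lhs, rhs) with
      lhs ~ int_0^T ( int_0^t exp(-a (t - s)) f(s) ds )^2 dt,
      rhs ~ int_0^T f(t)^2 dt / a^2.
    """
    dt = T / n
    decay = math.exp(-a * dt)
    conv = 0.0  # value of the inner convolution integral at the current grid point
    lhs = 0.0
    rhs = 0.0
    for i in range(n):
        ft = f((i + 0.5) * dt)
        # exact update of the convolution when f is constant on the cell
        conv = decay * conv + ft * (1.0 - decay) / a
        lhs += conv ** 2 * dt
        rhs += ft ** 2 * dt
    return lhs, rhs / a ** 2


if __name__ == "__main__":
    for a in (0.5, 5.0, 50.0):
        print(a, lemma_5_2_sides(a, lambda t: 1.0 + t * t))
```

In the proof of (5.4) the lemma is applied with $a = \theta_1^0 l_n$, so the factor $a^{-2}$ is what produces the gain of $l_n^{-2}$ for each mode.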
It is shown in the next lemma that under certain conditions on the operator $P$ the asymptotics of $E\int_0^T \|\Pi^K Pu(t)\|_0^2\,dt$ is determined by the asymptotics of $E\int_0^T \|\Pi^K P\xi(t)\|_0^2\,dt$.
Lemma 5.3 If $P$ is an essentially non-degenerate operator of order $p$ on $M$ and $p \ge m - d/2$, then
\[ E\int_0^T \|\Pi^K P\xi(t)\|_0^2\,dt \asymp \sum_{k=1}^K l_k^{(p-m)/m}, \quad K \to \infty, \qquad (5.3) \]
\[ \lim_{K\to\infty}\frac{E\int_0^T \|\Pi^K P\eta(t)\|_0^2\,dt}{E\int_0^T \|\Pi^K P\xi(t)\|_0^2\,dt} = 0, \qquad (5.4) \]
\[ \mathbf{P}\text{-}\lim_{K\to\infty}\frac{\int_0^T \|\Pi^K P\eta(t)\|_0^2\,dt}{E\int_0^T \|\Pi^K P\xi(t)\|_0^2\,dt} = 0, \qquad (5.5) \]
\[ \mathbf{P}\text{-}\lim_{K\to\infty}\frac{\int_0^T \|\Pi^K P\xi(t)\|_0^2\,dt}{E\int_0^T \|\Pi^K P\xi(t)\|_0^2\,dt} = 1. \qquad (5.6) \]
Proof.
Proof of (5.3). It follows from the independence of $\xi_k(t)$ for different $k$ that
\[ \sum_{k=1}^K E|\psi_k(P\xi(t))|^2 = \sum_{k=1}^K E\Big|\sum_{n\ge 1}\xi_n(t)(e_n, P^*e_k)_0\Big|^2 = \sum_{k=1}^K\sum_{n\ge 1}\frac{1}{2\theta_1^0 l_n}\big(1 - e^{-2\theta_1^0 l_n t}\big)|(e_n, P^*e_k)_0|^2. \]
Integration yields:
\[ E\int_0^T \|\Pi^K P\xi(t)\|_0^2\,dt = \sum_{k=1}^K\sum_{n\ge 1}\frac{1}{2\theta_1^0 l_n}\Big(T - \frac{1}{2\theta_1^0 l_n}\big(1 - e^{-2\theta_1^0 l_n T}\big)\Big)|(e_n, P^*e_k)_0|^2. \]
Since $l_k > 0$ and $\theta_1^0 > 0$, it follows that $1 - e^{-2\theta_1^0 l_k T} > 0$ for all $k$. Then the last equality and the definition of the norm $\|\cdot\|_s$ imply
\[ \frac{T}{2\theta_1^0}\sum_{k=1}^K \|P^*e_k\|_{-m}^2 - C\sum_{k=1}^K \|P^*e_k\|_{-2m}^2 \le E\int_0^T \|\Pi^K P\xi(t)\|_0^2\,dt \le \frac{T}{2\theta_1^0}\sum_{k=1}^K \|P^*e_k\|_{-m}^2. \]
Since $P^*$ satisfies (3.6),
\[ \|P^*e_k\|_{-m}^2 \ge \varepsilon\|e_k\|_{p-m}^2 - L\|e_k\|_{p-m-\delta}^2 = \varepsilon\lambda_k^{2(p-m)}\big(1 - (L/\varepsilon)\lambda_k^{-2\delta}\big). \]
In addition, $\|P^*e_k\|_r \le C\|e_k\|_{r+p}$ and $\lambda_k = l_k^{1/(2m)}$. The result (5.3) follows.
Proof of (5.4). Consider first $\eta^1(t) = \{\eta_k^1(t)\}$. With the notation $\gamma = 2(p - m)/d$,
\[ E\int_0^T \|\Pi^K P\eta^1(t)\|_0^2\,dt \le C\sum_{k=1}^K \frac{1}{l_k}E|\psi_k(Pu_0)|^2 \le C\sum_{k=1}^K k^{\gamma+1}\lambda_k^{-2(p+d/2)}E|\psi_k(Pu_0)|^2. \]
Note that
\[ \sum_{k\ge 1}\lambda_k^{-2(p+d/2)}E|\psi_k(Pu_0)|^2 \le C\,E\|u_0\|_{-d/2}^2 < \infty. \qquad (5.7) \]
If $\gamma = -1$, then
\[ \lim_{K\to\infty}\frac{E\int_0^T \|\Pi^K P\eta^1(t)\|_0^2\,dt}{E\int_0^T \|\Pi^K P\xi(t)\|_0^2\,dt} \le \lim_{K\to\infty}\frac{C\,E\|u_0\|_{-d/2}^2}{\ln K} = 0. \]
If $\gamma > -1$, then
\[ \lim_{K\to\infty}\frac{E\int_0^T \|\Pi^K P\eta^1(t)\|_0^2\,dt}{E\int_0^T \|\Pi^K P\xi(t)\|_0^2\,dt} \le C\lim_{K\to\infty}\frac{\sum_{k=1}^K k^{\gamma+1}\lambda_k^{-2(p+d/2)}E|\psi_k(Pu_0)|^2}{K^{\gamma+1}} = 0 \]
by (5.7) and the Kronecker lemma.
Next consider $\eta^2(t) = \{\eta_k^2(t)\}$. By assumption, $c := \max(\mathrm{order}(A), \mathrm{order}(B), \mathrm{order}(N)) < 2m$. By Lemma 5.2,
\[ \int_0^T |\eta_n^2(t)|^2\,dt \le \frac{1}{(\theta_1^0 l_n)^2}\int_0^T \big|\psi_n\big((\theta_1^0 A + \theta_2^0 B + N)u(t)\big)\big|^2\,dt, \]
which implies that for every $r \in \mathbb{R}$,
\[ \sum_{n\ge 1}\lambda_n^{2r}\int_0^T |\psi_n(P\eta^2(t))|^2\,dt = \int_0^T \|P\eta^2(t)\|_r^2\,dt \le C\int_0^T \|\eta^2(t)\|_{r+p}^2\,dt = C\sum_{n\ge 1}\lambda_n^{2(r+p)}\int_0^T |\eta_n^2(t)|^2\,dt \le C\int_0^T \|u(t)\|_{r-2m+c+p}^2\,dt. \]
If $c_1 := 2m - c > 0$ and $r = -x$, where $x = \max(0,\ d/2 + c_1/2 + p + c - 3m)$, then $-x - 2m + c + p \le m - d/2 - c_1/2$ and, by (2.3), $E\int_0^T \|u(t)\|_{-x-2m+c+p}^2\,dt < \infty$. As a result, since $\lambda_k \asymp k^{1/d}$,
\[ \frac{E\int_0^T \|\Pi^K P\eta^2(t)\|_0^2\,dt}{E\int_0^T \|\Pi^K P\xi(t)\|_0^2\,dt} = \frac{\sum_{n=1}^K \lambda_n^{-2x}\lambda_n^{2x}E\int_0^T |\psi_n(P\eta^2(t))|^2\,dt}{E\int_0^T \|\Pi^K P\xi(t)\|_0^2\,dt} \le \frac{CK^{2x/d}\sum_{n=1}^K \lambda_n^{-2x}E\int_0^T |\psi_n(P\eta^2(t))|^2\,dt}{E\int_0^T \|\Pi^K P\xi(t)\|_0^2\,dt} \le \frac{CK^{2x/d}}{\sum_{k=1}^K \lambda_k^{2(p-m)}} \to 0 \quad \text{as } K \to \infty, \]
because if $p - m = -d/2$, then $d/2 + c_1/2 + p + c - 3m = -c_1/2 < 0$ so that $x = 0$, while for $p - m > -d/2$ the sum $\sum_{k=1}^K \lambda_k^{2(p-m)}$ is of order $K^{2(p-m)/d+1}$ and $2(p - m)/d + 1 > \max\big(0,\ (d + 2(p - m) - c_1)/d\big) = 2x/d$. Equality (5.4) is proved. Then (5.5) follows from (5.4) and the Chebyshev inequality.
Proof of (5.6). There are two steps in the proof. Writing $X_K(t) := \|\Pi^K P\xi(t)\|_0^2$, the first step is to show that, for all $t \in [0, T]$,
\[ \mathrm{var}(X_K(t)) \le C\sum_{k=1}^K \lambda_k^{4(p-m)}. \qquad (5.8) \]
The second step is to show that (5.8) implies
\[ \mathbf{P}\text{-}\lim_{K\to\infty}\frac{\int_0^T X_K(t)\,dt}{E\int_0^T X_K(t)\,dt} = 1. \]
1). If $X_K^M(t) := \sum_{k=1}^K \big|\sum_{n=1}^M \xi_n(t)(e_n, P^*e_k)_0\big|^2$, then $X_K^M(t)$ is a quadratic form of the Gaussian vector $(\xi_1(t), \ldots, \xi_M(t))$. The matrix of the quadratic form is $A = [A_{nn'}]_{n,n'=1,\ldots,M}$ with
\[ A_{nn'} = \sum_{k=1}^K (e_n, P^*e_k)_0\,(e_{n'}, P^*e_k)_0, \]
and the covariance matrix of the Gaussian vector is
\[ R = \mathrm{diag}\Big(\frac{1 - e^{-2\theta_1^0 l_n t}}{2\theta_1^0 l_n},\ n = 1, \ldots, M\Big). \]
Direct computations yield
\[ E X_K^M(t) = \sum_{n=1}^M\sum_{k=1}^K \frac{1}{2\theta_1^0 l_n}\big(1 - e^{-2\theta_1^0 l_n t}\big)|(e_n, P^*e_k)_0|^2 = \mathrm{trace}(AR). \]
Analysis of the proof of (5.3) shows that for every $t \in [0, T]$ and $k = 1, \ldots, K$ the series $\sum_{n\ge 1}\xi_n(t)(e_n, P^*e_k)_0$ converges with probability one and in the mean square. Consequently,
\[ \lim_{M\to\infty} X_K^M(t) = X_K(t) \ (\mathbf{P}\text{-a.s.}); \qquad \lim_{M\to\infty} E X_K^M(t) = \sum_{k=1}^K\sum_{n\ge 1} E|\xi_n(t)|^2\,|(e_n, P^*e_k)_0|^2 = E X_K(t). \qquad (5.9) \]
Next,
\[ \mathrm{var}(X_K^M(t)) = 2\,\mathrm{trace}((AR)^2) \le C\sum_{n,n'}\frac{1}{l_n l_{n'}}A_{nn'}^2 \le C\sum_{k,k'=1}^K |(\tilde P e_k, e_{k'})_0|^2\lambda_k^{4(p-m)} \le C\sum_{k=1}^K \lambda_k^{4(p-m)}\|\tilde P e_k\|_0^2 \le C\sum_{k=1}^K \lambda_k^{4(p-m)}, \]
where $\tilde P := P\Lambda^{-2m}P^*\Lambda^{2(m-p)}$ is a bounded operator in $H^0$. After that, inequality (5.8) follows from (5.9) and the Fatou lemma:
\[ \mathrm{var}(X_K(t)) = E\lim_{M\to\infty}|X_K^M(t)|^2 - \big|E\lim_{M\to\infty}X_K^M(t)\big|^2 = E\lim_{M\to\infty}|X_K^M(t)|^2 - \lim_{M\to\infty}|E X_K^M(t)|^2 \le \liminf_{M\to\infty}E|X_K^M(t)|^2 - \lim_{M\to\infty}|E X_K^M(t)|^2 \le \liminf_{M\to\infty}\mathrm{var}(X_K^M(t)) \le C\sum_{k=1}^K \lambda_k^{4(p-m)}. \]
2). If $Y_K := \int_0^T (X_K(t) - E X_K(t))\,dt \,\big/\, E\int_0^T X_K(t)\,dt$, then
\[ \frac{\int_0^T X_K(t)\,dt}{E\int_0^T X_K(t)\,dt} = 1 + Y_K \]
and
\[ E Y_K^2 \le \frac{T\int_0^T \mathrm{var}(X_K(t))\,dt}{\big(E\int_0^T X_K(t)\,dt\big)^2} \le C\,\frac{\sum_{k=1}^K \lambda_k^{4(p-m)}}{\big(\sum_{k=1}^K \lambda_k^{2(p-m)}\big)^2} \to 0 \quad \text{as } K \to \infty. \]
By the Chebyshev inequality, $\mathbf{P}\text{-}\lim_{K\to\infty} Y_K = 0$, which implies (5.6). $\Box$
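Corollary 5.4 If $P$ is an essentially non-degenerate operator of order $p$ on $M$ and $p \ge m - d/2$, then
\[ E\int_0^T \|\Pi^K Pu(t)\|_0^2\,dt \asymp \frac{\varepsilon T}{2\theta_1^0}\sum_{k=1}^K l_k^{(p-m)/m}, \quad K \to \infty, \qquad (5.10) \]
and
\[ \mathbf{P}\text{-}\lim_{K\to\infty}\frac{\int_0^T \|\Pi^K Pu(t)\|_0^2\,dt}{E\int_0^T \|\Pi^K Pu(t)\|_0^2\,dt} = 1. \qquad (5.11) \]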
Proof. By the inequality $|2xy| \le \nu x^2 + \nu^{-1}y^2$, which holds for every $\nu > 0$ and every real $x, y$,
\[ (1 - \nu)E\int_0^T \|\Pi^K P\xi(t)\|_0^2\,dt + \Big(1 - \frac{1}{\nu}\Big)E\int_0^T \|\Pi^K P\eta(t)\|_0^2\,dt \le E\int_0^T \|\Pi^K Pu(t)\|_0^2\,dt \le (1 + \nu)E\int_0^T \|\Pi^K P\xi(t)\|_0^2\,dt + \Big(1 + \frac{1}{\nu}\Big)E\int_0^T \|\Pi^K P\eta(t)\|_0^2\,dt. \]
Since $\nu$ is arbitrary, (5.10) follows from (5.4) and (5.3). After that, (5.11) follows from (5.6). $\Box$
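The two-sided bound used in this proof comes from expanding the square of $\Pi^K Pu = \Pi^K P\xi + \Pi^K P\eta$ and controlling the cross term. A short derivation is sketched below; the free parameter is written as $\nu > 0$ here, and this letter is a notational choice of the sketch.

```latex
% Expansion of the square, with u = \xi + \eta:
\[
  \|\Pi^K P u\|_0^2
  = \|\Pi^K P \xi\|_0^2 + 2\,(\Pi^K P \xi,\ \Pi^K P \eta)_0 + \|\Pi^K P \eta\|_0^2 .
\]
% Cross term, by the Cauchy-Schwarz inequality and |2xy| \le \nu x^2 + \nu^{-1} y^2:
\[
  \big| 2\,(\Pi^K P \xi,\ \Pi^K P \eta)_0 \big|
  \le \nu\, \|\Pi^K P \xi\|_0^2 + \nu^{-1}\, \|\Pi^K P \eta\|_0^2 .
\]
% Integrate over [0,T], take expectations, divide by E\int_0^T \|\Pi^K P \xi\|_0^2 dt,
% and use (5.4) to discard the \eta-terms in the limit:
\[
  1 - \nu
  \le \liminf_{K\to\infty}
      \frac{E\int_0^T \|\Pi^K P u(t)\|_0^2\,dt}{E\int_0^T \|\Pi^K P \xi(t)\|_0^2\,dt}
  \le \limsup_{K\to\infty}
      \frac{E\int_0^T \|\Pi^K P u(t)\|_0^2\,dt}{E\int_0^T \|\Pi^K P \xi(t)\|_0^2\,dt}
  \le 1 + \nu .
\]
```

Letting $\nu \to 0$ shows that the two expectations are asymptotically equivalent, so the asymptotics in (5.3) carries over to $u$.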
To prove the first part of Theorem 3.2, it now suffices to apply Lemma 5.1 and Corollary 5.4 with $P = L + A$; the non-degeneracy condition (3.6) holds with $p = 2m$, $\varepsilon = 1$, $\delta = m - \mathrm{order}(A)/2$, because $\|Lf\|_s = \|f\|_{s+2m}$ and, since the order of the operator $A^*L$ is $4m - 2\delta$,
\[ (A^*Lf, f)_s = (\Lambda^{-(2m-\delta)}A^*Lf,\ \Lambda^{2m-\delta}f)_s \le \|\Lambda^{-(2m-\delta)}A^*Lf\|_s\,\|\Lambda^{2m-\delta}f\|_s \le C\|f\|_{s+2m-\delta}^2. \]
Similarly, the second part of the theorem follows with $P = B$; now (3.6) is assumed. Analysis of the proof shows that
\[ \lim_{K\to\infty}\frac{\Psi_{K,2}^2}{\sum_{k=1}^K l_k^{(b-m)/m}} \ge \frac{\varepsilon T}{2\theta_1^0}. \]
$\Box$
6 Proof of Lemma 3.5
The following notation will be used:
\[ \gamma = 2(p - m)/d \ge -1. \]
Since by Lemma 5.3
\[ \Psi_K^2 \asymp \sum_{k=1}^K k^\gamma, \]
it follows that it is sufficient to prove the inequalities
\[ E\Big|\int_0^T \big(\Pi^K Pu(t),\, dW^K(t)\big)_0\Big|^q \le C\Big(\sum_{k=1}^K k^\gamma\Big)^{q/2} \qquad (6.1) \]
and
\[ E\Big(\int_0^T \|\Pi^K Pu(t)\|_0^2\,dt\Big)^{-q} \le C\Big(\sum_{k=1}^K k^\gamma\Big)^{-q} \qquad (6.2) \]
for all $q > 0$ and all sufficiently large $K$. The numbers $C$ in the above inequalities do not depend on $K$ but can depend on everything else, including $q$ and $T$. By definition,
\[ \int_0^T \big(\Pi^K Pu(t),\, dW^K(t)\big)_0 = \sum_{k=1}^K \int_0^T \psi_k(Pu(t))\,dw_k(t), \qquad \|\Pi^K Pu(t)\|_0^2 = \sum_{k=1}^K |\psi_k(Pu(t))|^2, \]
and for each $t$ the coefficients $\psi_k(Pu(t))$ are Gaussian random variables. Indeed, denote by $P_t f$ the solution of the equation
\[ dv(t) + \big(\theta_1^0(L + A) + \theta_2^0 B + N\big)v(t)\,dt = 0, \quad 0 < t \le T, \quad v(0) = f. \]
The solution of (3.7) can then be written as
\[ u(t) = P_t u_0 + \int_0^t P_{t-s}\,dW(s) := u_1(t) + u_2(t), \]
and the properties of the stochastic integral [12, Chapter 2] imply that $\psi_k(Pu_2(t))$ are Gaussian random variables with zero mean and covariance
\[ E\,\psi_k(Pu_2(t))\,\psi_m(Pu_2(t)) = \int_0^t (P_s^*P^*e_k,\ P_s^*P^*e_m)_0\,ds := A_{km}(t). \]
Remark 6.1 For integers $K_0$ and $K > K_0$ denote by $a_k(K_0, K; t)$, $1 \le k \le K - K_0 + 1$, the eigenvalues of the matrix $[A_{km}(t),\ K_0 \le k, m \le K]$. If $\zeta_k$ are independent standard Gaussian random variables, then the random variable $\sum_{k=K_0}^K |\psi_k(Pu_2(t))|^2$ has the same distribution as
\[ \sum_{k=1}^{K-K_0+1} a_k(K_0, K; t)\,\zeta_k^2. \]
This follows from the general properties of Gaussian random vectors.
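The distributional identity in Remark 6.1 can be illustrated in the smallest non-trivial case. The Python sketch below takes a hypothetical $2\times 2$ covariance matrix, computes its eigenvalues in closed form, and compares by simulation the squared norm of a centred Gaussian vector with that covariance against the corresponding weighted sum of squared standard Gaussian variables. Matching means are only a partial check of equality in distribution; the matrix entries and all function names are assumptions of the sketch.

```python
import math
import random


def eig_sym_2x2(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]] in increasing order."""
    mean = 0.5 * (a + c)
    rad = math.sqrt(0.25 * (a - c) ** 2 + b * b)
    return mean - rad, mean + rad


def sample_norm_sq(a, b, c, n, seed=0):
    """n samples of X1^2 + X2^2 for a centred Gaussian vector with covariance [[a, b], [b, c]]."""
    rng = random.Random(seed)
    l11 = math.sqrt(a)
    l21 = b / l11
    l22 = math.sqrt(c - l21 * l21)  # Cholesky factor; needs a positive definite matrix
    out = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x1, x2 = l11 * z1, l21 * z1 + l22 * z2
        out.append(x1 * x1 + x2 * x2)
    return out


def sample_weighted_chi2(a1, a2, n, seed=1):
    """n samples of a1 * zeta_1^2 + a2 * zeta_2^2 with independent standard Gaussian zeta_k."""
    rng = random.Random(seed)
    return [a1 * rng.gauss(0.0, 1.0) ** 2 + a2 * rng.gauss(0.0, 1.0) ** 2 for _ in range(n)]


if __name__ == "__main__":
    a1, a2 = eig_sym_2x2(2.0, 1.0, 3.0)
    xs = sample_norm_sq(2.0, 1.0, 3.0, 100_000)
    ys = sample_weighted_chi2(a1, a2, 100_000)
    print(sum(xs) / len(xs), sum(ys) / len(ys), a1 + a2)
```

Both sample means should be close to the trace of the covariance matrix, which equals the sum of the eigenvalues; this is the identity used later when $\sum_k a_k(1, K; t)$ is replaced by $\sum_k A_{kk}(t)$.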
Proof of (6.1). With no loss of generality it will be assumed that $q = 2n$ is an even integer. By the Burkholder-Davis-Gundy inequality [6, Theorem IV.4.1],
\[ E\Big|\sum_{k=1}^K \int_0^T \psi_k(Pu(t))\,dw_k(t)\Big|^{2n} \le C\,E\Big(\int_0^T \sum_{k=1}^K |\psi_k(Pu(t))|^2\,dt\Big)^n \le C\Big(E\Big(\int_0^T \sum_{k=1}^K |\psi_k(Pu_1(t))|^2\,dt\Big)^n + E\Big(\int_0^T \sum_{k=1}^K |\psi_k(Pu_2(t))|^2\,dt\Big)^n\Big). \]
The properties of the operator $P_t$ imply that
\[ E\Big(\int_0^T \sum_{k=1}^K |\psi_k(Pu_1(t))|^2\,dt\Big)^n \le CK^{n(\gamma+1)}E\Big(\int_0^T \|PP_t u_0\|_{m-p-d/2}^2\,dt\Big)^n \le CK^{n(\gamma+1)}E\|u_0\|_{-d/2}^q \le C\Big(\sum_{k=1}^K k^\gamma\Big)^{q/2}. \]
Next, by the Hölder inequality,
\[ E\Big(\int_0^T \sum_{k=1}^K |\psi_k(Pu_2(t))|^2\,dt\Big)^n \le C\int_0^T E\Big(\sum_{k=1}^K |\psi_k(Pu_2(t))|^2\Big)^n dt. \]
By Remark 6.1 and the multinomial expansion formula,
\[ E\Big(\sum_{k=1}^K |\psi_k(Pu_2(t))|^2\Big)^n = E\Big(\sum_{k=1}^K a_k(1, K; t)\zeta_k^2\Big)^n = \sum_{m_1+\cdots+m_K=n}\frac{n!}{m_1!\cdots m_K!}\,a_1^{m_1}(1, K; t)\cdots a_K^{m_K}(1, K; t)\,E\,\zeta_1^{2m_1}\cdots\zeta_K^{2m_K} \]
\[ \le (2n - 1)!!\Big(\sum_{k=1}^K a_k(1, K; t)\Big)^n = (2n - 1)!!\Big(\sum_{k=1}^K \int_0^t \|P_s^*P^*e_k\|_0^2\,ds\Big)^n \le C\Big(\sum_{k=1}^K \|e_k\|_{p-m}^2\Big)^{q/2}, \]
where the last inequality is a consequence of (A.4). Since $\|e_k\|_{p-m}^2 = \lambda_k^{2(p-m)} \asymp k^\gamma$, inequality (6.1) follows.
Proof of (6.2). Note first of all that the Jensen inequality implies
\[ E\Big(\int_0^T \sum_{k=1}^K |\psi_k(Pu(t))|^2\,dt\Big)^{-q} \le E\Big(\int_{T/2}^T \sum_{k=K_0}^K |\psi_k(Pu(t))|^2\,dt\Big)^{-q} \le C\int_{T/2}^T E\Big(\sum_{k=K_0}^K |\psi_k(Pu(t))|^2\Big)^{-q} dt = C\int_{T/2}^T E\Big(\sum_{k=K_0}^K |\psi_k(Pu_1(t)) + \psi_k(Pu_2(t))|^2\Big)^{-q} dt, \]
and then in view of Lemma A.2 it is sufficient to consider the case $u_0 = 0$. According to Remark 6.1, if $u_0 = 0$, then inequality (6.2) will follow from
\[ E\Big(\sum_{k=1}^{K-K_0+1} a_k(K_0, K; t)\zeta_k^2\Big)^{-q} \le C\,(F(K))^{-q}, \quad T/2 \le t \le T, \]
where
\[ F(K) = \begin{cases} \ln K, & \text{if } \gamma = -1, \\ K^{1+\gamma}, & \text{if } \gamma > -1. \end{cases} \]
Assume for the moment that, when ordered appropriately, the numbers $a_k(K_0, K; t)$ have the following property: there exist an integer $K_0$ and a real number $C > 0$ so that for all $K > K_0$, $1 \le k \le K - K_0 + 1$, and $T/2 \le t \le T$,
\[ a_k(K_0, K; t) \ge C(k + K_0)^\gamma. \qquad (6.3) \]
If (6.3) holds, then for all sufficiently large $K$
\[ E\Big(\sum_{k=1}^{K-K_0+1} a_k(K_0, K; t)\zeta_k^2\Big)^{-q} \le C\,E\Big(\sum_{k=1}^{K/2} k^\gamma\zeta_k^2\Big)^{-q}, \]
and it remains to estimate the right hand side of the last inequality. Since for every non-negative random variable $\zeta$ and every $q > 0$
\[ E\,\zeta^{-q} = \frac{1}{\Gamma(q)}\int_0^\infty t^{q-1}E\,e^{-t\zeta}\,dt, \]
where $\Gamma(\cdot)$ is the Gamma function,
K X
E
k=1
k k2
!?q
Z1
C
Z1 0
=C
so that
1