LIDS-P-1174

January 1982

OPTIMAL CONTROL AND NONLINEAR FILTERING FOR NONDEGENERATE DIFFUSION PROCESSES

by

Wendell H. Fleming* Lefschetz Center for Dynamical Systems Division of Applied Mathematics Brown University Providence, Rhode Island 02912

and

Sanjoy K. Mitter** Department of Electrical Engineering and Computer Science and Laboratory for Information and Decision Systems Massachusetts Institute of Technology Cambridge, Massachusetts 02139

September 9, 1981

*The research of the first author was supported in part by the Air Force Office of Scientific Research under AF-AFOSR 76-3063D, in part by the National Science Foundation under MCS-79-03554, and in part by the Department of Energy under DOE/ET-76-A-012295.

**The research of the second author was supported by the Air Force Office of Scientific Research under Grant AFOSR 77-3281-D.

Submitted to STOCHASTICS.

1. Introduction

We consider an $n$-dimensional signal process $x(t) = (x_1(t),\dots,x_n(t))$ and a 1-dimensional observation process $y(t)$, obeying the stochastic differential equations

$$dx = b[x(t)]\,dt + \sigma[x(t)]\,dw, \tag{1.1}$$

$$dy = h[x(t)]\,dt + d\tilde w, \qquad y(0) = 0, \tag{1.2}$$

with $w$, $\tilde w$ independent standard Brownian motions of respective dimensions $n$, $1$. (The extension to vector-valued $y(t)$ needs only minor modifications.) The Zakai equation for the unnormalized conditional density $q(x,t)$ is

$$dq = A^* q\,dt + hq\,dy, \qquad t \ge 0, \tag{1.3}$$

where $A^*$ is the formal adjoint of the generator $A$ of the signal process $x(t)$. See [3] for example. By formally substituting

$$q(x,t) = \exp[y(t)h(x)]\,p(x,t) \tag{1.4}$$

one gets instead of the stochastic partial differential equation (1.3) a linear partial differential equation of the form

$$p_t = \tfrac12\operatorname{tr} a(x)p_{xx} + g^y(x,t)\cdot p_x + V^y(x,t)\,p, \qquad t > 0, \tag{1.5}$$

$$p(x,0) = p^0(x),$$

with $p^0$ the density of $x(0)$. Here $a(x) = \sigma(x)\sigma(x)'$, $p_x = (p_{x_1},\dots,p_{x_n})$, and

$$\operatorname{tr} a(x)p_{xx} = \sum_{i,j=1}^n a_{ij}(x)\,p_{x_i x_j}.$$

Explicit formulas for $g^y$, $V^y$ are given in §6. Equation (1.5) is the basic equation of the pathwise theory of nonlinear filtering. See [2] or [9]. The superscript $y$ indicates dependence on the observation trajectory $y = y(\cdot)$. Of course, the solution $p = p^y$ also depends on $y$.

We shall impose in (1.1) the nondegeneracy condition that the $n \times n$ matrix $\sigma(x)$ has a bounded inverse $\sigma^{-1}(x)$. Other assumptions on $\sigma$, $b$, $h$, $p^0$ will be stated later. Certain unbounded functions $h$ are allowed in the observation equation (1.2). For example, $h$ can be a polynomial in $x = (x_1,\dots,x_n)$ such that $|h(x)| \to \infty$ as $|x| \to \infty$.
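To make the model (1.1)-(1.2) concrete, the following is a minimal simulation sketch for the scalar case $n = 1$ using the Euler-Maruyama scheme. The discretization, the function name `simulate_signal_and_observation`, and the sample coefficients $b(x) = -x$, $\sigma(x) = 1$, $h(x) = x^3$ are illustrative assumptions and are not taken from the paper; the cubic $h$ is chosen only as an example of an unbounded polynomial observation function.

```python
import math
import random


def simulate_signal_and_observation(b, sigma, h, x0, t_end, n_steps, seed=0):
    """Euler-Maruyama sketch of (1.1)-(1.2) for scalar x and y (assumed discretization)."""
    rng = random.Random(seed)
    dt = t_end / n_steps
    sqrt_dt = math.sqrt(dt)
    xs = [x0]   # signal path x(t_k)
    ys = [0.0]  # observation path, y(0) = 0 as in (1.2)
    for _ in range(n_steps):
        x = xs[-1]
        dw = sqrt_dt * rng.gauss(0.0, 1.0)        # increment of w
        dw_tilde = sqrt_dt * rng.gauss(0.0, 1.0)  # independent increment of w-tilde
        xs.append(x + b(x) * dt + sigma(x) * dw)
        ys.append(ys[-1] + h(x) * dt + dw_tilde)
    return xs, ys


# Hypothetical coefficients: sigma with bounded inverse, polynomial (unbounded) h.
xs, ys = simulate_signal_and_observation(
    b=lambda x: -x, sigma=lambda x: 1.0, h=lambda x: x ** 3,
    x0=0.5, t_end=1.0, n_steps=200,
)
```

The two Gaussian increments are drawn separately at each step so that the discretized $w$ and $\tilde w$ are independent, as (1.1)-(1.2) require.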

The connection between filtering and control is made by considering the function $S = -\log p$. This logarithmic transformation changes (1.5) into a nonlinear partial differential equation for $S(x,t)$, of the form (2.2) below. We introduce a certain optimal stochastic control problem for which (2.2) is the dynamic programming equation. In §3 upper estimates for $S(x,t)$ as $|x| \to \infty$ are obtained, by using an easy Verification Theorem and suitably chosen comparison controls. Note that an upper estimate for $S$ gives a lower estimate for $p = \exp(-S)$. A lower estimate for $S(x,t)$ as $|x| \to \infty$ is obtained in §5 by another method from a corresponding upper estimate for $p(x,t)$. These results are applied to the pathwise nonlinear filter equation in §6.

2. The logarithmic transformation.

Let us consider a linear parabolic partial differential equation of the form

$$p_t = \tfrac12\operatorname{tr} a(x)p_{xx} + g(x,t)\cdot p_x + V(x,t)\,p, \qquad t > 0, \tag{2.1}$$

$$p(x,0) = p^0(x).$$

When $g = g^y$, $V = V^y$ this becomes the pathwise filter equation (1.5), to which we return in §6. By solution $p(x,t)$ to (2.1) we mean a "classical" solution $p \in C^{2,1}$, i.e. with $p_{x_i}$, $p_{x_i x_j}$, $p_t$ continuous, $i,j = 1,\dots,n$. If $p$ is a positive solution to (2.1), then $S = -\log p$ satisfies the nonlinear parabolic equation

$$S_t = \tfrac12\operatorname{tr} a(x)S_{xx} + H(x,t,S_x), \qquad t > 0, \tag{2.2}$$

$$S(x,0) = S^0(x) = -\log p^0(x),$$

$$H(x,t,S_x) = g(x,t)\cdot S_x - \tfrac12 S_x'\,a(x)\,S_x - V(x,t).$$

Conversely, if $S(x,t)$ is a solution to (2.2), then $p = \exp(-S)$ is a solution to (2.1). This logarithmic transformation is well known. For example, if $g = V = 0$, then it changes the heat equation into Burgers' equation [8].

We consider $0 \le t \le t_1$, with $t_1$ fixed but arbitrary. Let $Q = R^n \times [0,t_1]$. We say that a function $\varphi$ with domain $Q$ is of class $\mathscr{L}$ if $\varphi$ is continuous and, for every compact $K \subset R^n$, $\varphi(\cdot,t)$ satisfies a uniform Lipschitz condition on $K$ for $0 \le t \le t_1$. We say that $\varphi$ satisfies a polynomial growth condition of degree $r$, and write $\varphi \in \mathscr{P}_r$, if there exists $M$ such that

$$|\varphi(x,t)| \le M(1 + |x|^r), \qquad \text{all } (x,t) \in Q.$$

Throughout this section and §3 the following assumptions are made.

Somewhat different assumptions are made in

a

is of class -

such that

P(x,t)I
t0 > 0.

is best possible, and this is made in §5.

We first consider $m > 1$. By (2.3)-(2.6) and (2.9),

$$L(x,t,u) \le B_1(1 + |x|^{2m} + |u|^2), \qquad S^0(x) \le B_1(1 + |x|^{\ell}) \tag{3.1}$$

for some $B_1$. Given $x \in R^n$ we choose the following open loop control $u(\tau)$, $0 \le \tau \le t$. Let $u(\tau) = \dot\eta(\tau)$, where the components $\eta_i(\tau)$ satisfy the differential equation

$$\dot\eta_i = -(\operatorname{sgn} x_i)\,|\eta_i|^m, \qquad i = 1,\dots,n, \tag{3.2}$$

with $\eta(0) = x$. From (2.7)

$$\xi(\tau) = \eta(\tau) + \zeta(\tau), \qquad 0 \le \tau \le t,$$

f

·

C = Since

a

n(

'for each

r.

By explicitly integrating

m > 1, that 2md
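The open loop control built from (3.2) can be examined numerically. The sketch below integrates one component of (3.2) with a classical Runge-Kutta step and compares it with the closed form $|\eta_i(\tau)| = (|x_i|^{1-m} + (m-1)\tau)^{-1/(m-1)}$. That closed form is our own integration of (3.2) for $m > 1$ and $x_i \ne 0$, not a formula quoted from the paper; the function names and the step count are likewise illustrative assumptions.

```python
import math


def eta_closed_form(x_i, m, tau):
    """Closed-form solution of (3.2) for one component, valid for m > 1 (our derivation)."""
    if x_i == 0.0:
        return 0.0
    magnitude = (abs(x_i) ** (1.0 - m) + (m - 1.0) * tau) ** (-1.0 / (m - 1.0))
    return math.copysign(magnitude, x_i)


def eta_rk4(x_i, m, tau, n_steps=2000):
    """Integrate d(eta)/d(tau) = -sgn(eta)|eta|^m from eta(0) = x_i with RK4.

    The sign of eta never changes along the solution, so sgn(eta) = sgn(x_i).
    """
    def rhs(eta):
        return -math.copysign(abs(eta) ** m, eta)

    dt = tau / n_steps
    eta = x_i
    for _ in range(n_steps):
        k1 = rhs(eta)
        k2 = rhs(eta + 0.5 * dt * k1)
        k3 = rhs(eta + 0.5 * dt * k2)
        k4 = rhs(eta + dt * k3)
        eta += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return eta


def uniform_bound(m, tau):
    """Bound on |eta_i(tau)| that does not depend on the initial state x_i."""
    return ((m - 1.0) * tau) ** (-1.0 / (m - 1.0))
```

For $m > 1$ the closed form is bounded by `uniform_bound(m, tau)` for every initial state: the control drives the path into a bounded region within any positive time, however large $|x|$ is.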




1,

arbitrarily small.) We assume that $g \in \mathscr{P}_\mu$ with $\mu < m$, and that

$$a_1|x|^{2m} - a_2 \le -V(x,t) \le A(1 + |x|^{2m}) \tag{4.4}$$

for some positive $a_1$, $a_2$, $A$,

(4.5)

and that

gx E 0

3

S E C

We assume that

n A im

(4.7)

IS

for some positive Example.

Vx E

2m > 0, and

for some

.

(4.6)

m'

S (x) = + o

0

< CS

0

+

C2

C1, C 2 V(x,t) = -kV (x) + V1 (x,t) with

Suppose that

2m, k > 0, and

positive, homogeneous polynomial of degree polynomial in functions of < m-l in

x t

V (x) a Vl(x,t) a

of degree R1

Let

by (4.6).

R1

there exists

- x It

=

ll WI lt >

A3 =

II

with

lit

Al cA

U A

2

T)

For

.

P(A1 ) + P(A2 ) >

3

R 2 -R

.

IxI >R

3

S(x,t')

-

,

k m

EB1 4t (R2

Ix]

as

Ix

=

i

uk(O)dO

P(A3)


A

and hence

-I

For

Ixl>R 2

~2 -

R 1) P(A2 ) + XP(A 1) - ( t

S (x)

on

Rn .

+

3)

Since the right side does not This implies that

, uniformly for 0 < t < t

+

To obtain uniqueness, p(x,t) + 0

large enough,

1

satisfies the same inequality.

S

as

vk(T)

Since

Kk(t) I >R

a lower bound for

depend on

,

(R2 - R 1 )}

R 1 ) P(A2 ) < tEx

A 1,

Sk(x,t) > with

R 1 )}

From Cauchy-Schwarz

On

2

-

- x = vkT) + w(T), 0 < T < t,

4 (R2

Let

(R2

the sup norm on [O,t].

k(

S (x) > X

implies

and consider the events

A1 = {ik A2

IxI > R 1

such that

p = exp(-S) is a

uniformly for

2,1 C21

0 < t 1.

section we make the following assumptions.

(5.1)

U

'

, ax.bounded, a

1

for some

r > O.

For each

(5.2)

and

g

go gx.' i

Moreover,

gx

E

as

p(x,t).-In this a E C2

with

*,n,

j=l,

j

Moreover,

Q. 1

' gx.x gxr For each t j

Q.

For each

< m

are continuous on

4+

t

V

t

, V( ,t) E C

i j

V satisfies (4.4),

(5.3)

V i

and

i,

r

t , g(.,t) E C 2 .

,

for

We take

X.X.

-+

This is done by establishing 0

a corresponding exponential rate of decay to

S(x,t)

V, Vx, X. 1

Vx.x. X. . 1

E f

V, v

r

X

are continuous on

Q .

We assume that $p^0 \in C^2$ and that there exist positive $\beta$, $M$ such that

$$\exp[\beta|x|^{m+1}]\,\bigl[p^0(x) + |p^0_x(x)| + |p^0_{xx}(x)|\bigr] \le M. \tag{5.4}$$

Theorem 5.1. Let $p(x,t)$ be a $C^{2,1}$ solution to (2.1) such that $p(x,t) \to 0$ as $|x| \to \infty$, uniformly for $0 \le t \le t_1$. Then there exists $\delta > 0$ such that $\exp[\delta|x|^{m+1}]\,p(x,t)$ is bounded on $Q$.

Proof.

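Let $\psi(x) = (1 + |x|^2)^{\frac{m+1}{2}}$, $\pi(x,t) = \exp[\delta\psi(x)]\,p(x,t)$. Then $\pi$ is a solution to

$$\pi_t = \tfrac12\operatorname{tr} a\,\pi_{xx} + \tilde g\cdot\pi_x + \tilde V\pi, \tag{5.5}$$

$$\tilde g = g - \delta a\psi_x, \qquad \tilde V = V - \delta\,g\cdot\psi_x + \tfrac{\delta^2}{2}\,\psi_x' a\,\psi_x - \tfrac{\delta}{2}\operatorname{tr} a\,\psi_{xx}.$$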
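Equation (5.5) can be checked by direct substitution. The computation below is a sketch that uses only (2.1), the symmetry of $a$, and the definitions $\psi(x) = (1+|x|^2)^{(m+1)/2}$, $\pi = \exp[\delta\psi]\,p$; it is our own verification and is not quoted from the paper.

```latex
% Substitute p = e^{-\delta\psi}\pi into (2.1); a = a(x) is symmetric.
\begin{align*}
p_t &= e^{-\delta\psi}\pi_t, \qquad
p_x = e^{-\delta\psi}\bigl(\pi_x - \delta\psi_x\,\pi\bigr),\\
p_{xx} &= e^{-\delta\psi}\bigl(\pi_{xx} - \delta\psi_x\pi_x' - \delta\pi_x\psi_x'
          - \delta\psi_{xx}\,\pi + \delta^2\psi_x\psi_x'\,\pi\bigr),\\
\tfrac12\operatorname{tr} a\,p_{xx} &= e^{-\delta\psi}\Bigl(\tfrac12\operatorname{tr} a\,\pi_{xx}
          - \delta\,(a\psi_x)\cdot\pi_x - \tfrac{\delta}{2}\operatorname{tr}(a\psi_{xx})\,\pi
          + \tfrac{\delta^2}{2}\,\psi_x' a\,\psi_x\,\pi\Bigr),\\
g\cdot p_x &= e^{-\delta\psi}\bigl(g\cdot\pi_x - \delta\,g\cdot\psi_x\,\pi\bigr).
\end{align*}
% Collecting the coefficients of \pi_x and of \pi gives
% \tilde g = g - \delta a\psi_x and
% \tilde V = V - \delta g\cdot\psi_x + (\delta^2/2)\psi_x' a\psi_x - (\delta/2)\operatorname{tr} a\psi_{xx}.
```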

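Following an argument in [10], equation (5.5) with initial data $\pi^0 = \exp(\delta\psi)\,p^0$ has for small enough $\delta > 0$ the probabilistic solution

$$\tilde\pi(x,t) = E_x\Bigl\{\pi^0[X(t)]\exp\Bigl[\int_0^t \sigma^{-1}\tilde g\cdot dw - \tfrac12\int_0^t |\sigma^{-1}\tilde g|^2\,d\tau + \int_0^t \tilde V\,d\tau\Bigr]\Bigr\}, \tag{5.6}$$

where $X(t)$ satisfies

$$dX = \sigma[X(\tau)]\,dw, \quad \tau > 0, \qquad X(0) = x. \tag{5.7}$$

In the integrands $\sigma^{-1}$, $\tilde g$ and $\tilde V$ are evaluated at $(X(\tau),\tau)$. This solution $\tilde\pi$ is bounded and $C^{2,1}$. We sketch the proof of these facts below. Then $\tilde p = \exp(-\delta\psi)\,\tilde\pi$ is a $C^{2,1}$ solution to (2.1), with initial data $p^0$, and with $\tilde p(x,t)$ tending to $0$ as $|x| \to \infty$ uniformly for $0 \le t \le t_1$. By the maximum principle, $\tilde p = p$, which implies that $\exp[\delta|x|^{m+1}]\,p$ is bounded on $Q$.

It remains to indicate why $\tilde\pi$ is a solution to (5.5) with the required properties.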

We have

*x

i

is a solution to (5.5) with the required

E

e

1 (4.4),

a

is bounded,

there exist positive

and

-1

By assumption

.

V

satisfies

1j '

gE E9

al, a2

,

p < m

.

Hence, for

6

small enough

such that 2m

V(X,t) < a Moreover, for some

2

- all

2mXI

K 1 2 |(5 (x)g(x,t)l < K(1 + jxlj 2),

< m .

From these inequalities one can get a bound for

$$E_x\exp\Bigl\{\,j\Bigl[\int_0^t \sigma^{-1}\tilde g\cdot dw - \tfrac12\int_0^t |\sigma^{-1}\tilde g|^2\,d\tau + \int_0^t \tilde V\,d\tau\Bigr]\Bigr\}$$

for any $j > 0$. This gives a uniform integrability condition from which one gets that $\tilde\pi$ is a bounded $C^{2,1}$ solution of (5.5) by the usual technique of differentiating (5.6) twice with respect to the components $x_1,\dots,x_n$ of the initial state $x = X(0)$. This proves Theorem 5.1.

Since $S = -\log p$, we get by taking logarithms:

Corollary. For some positive $\delta$, $\delta_1$,

$$S(x,t) \ge \delta|x|^{m+1} - \delta_1. \tag{5.8}$$
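The passage from Theorem 5.1 to the lower bound on $S$ is a one-line computation. It is sketched here with $M_0$ denoting a bound for $\exp[\delta|x|^{m+1}]\,p$ on $Q$; the name $M_0$ is our notation, not the paper's, and $p > 0$ is assumed so that $S = -\log p$ is defined.

```latex
% M_0 is a hypothetical name for the bound supplied by Theorem 5.1; p > 0 is assumed.
\begin{align*}
\exp[\delta|x|^{m+1}]\,p(x,t) \le M_0
  &\;\Longrightarrow\; \delta|x|^{m+1} + \log p(x,t) \le \log M_0\\
  &\;\Longrightarrow\; S(x,t) = -\log p(x,t) \ge \delta|x|^{m+1} - \log M_0 ,
\end{align*}
% so the lower bound holds with any positive \delta_1 \ge \log M_0.
```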

6. Connection with the pathwise filter equation.

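The generator $A$ of the signal process in (1.1) satisfies for $\varphi \in C^2$

$$A\varphi = \tfrac12\operatorname{tr} a(x)\varphi_{xx} + b(x)\cdot\varphi_x.$$

The pathwise filter equation (1.5) for $p = p^y$ is

$$p_t = (A^y)^* p + \tilde V^y p, \tag{6.1}$$

where

$$A^y\varphi = A\varphi - y(t)\,a(x)h_x(x)\cdot\varphi_x,$$

$$\tilde V^y(x,t) = -\tfrac12 h(x)^2 - y(t)\,Ah(x) + \tfrac12\,y(t)^2\,h_x(x)'a(x)h_x(x).$$

Hence, in (1.5) we should take

$$g^y = -b + y(t)\,a h_x + \gamma, \qquad \gamma_j = \sum_{i=1}^n \frac{\partial a_{ij}}{\partial x_i}, \tag{6.2}$$

$$V^y = \tilde V^y - \operatorname{div}\bigl(b - y(t)\,a h_x\bigr) + \tfrac12\sum_{i,j=1}^n \frac{\partial^2 a_{ij}}{\partial x_i\,\partial x_j}. \tag{6.3}$$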
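The coefficients in (6.2) and (6.3) follow from expanding the formal adjoint of $A^y$. The sketch below writes $c = b - y(t)\,a h_x$ for the drift of $A^y$; the shorthand $c$ is introduced only for this computation and is not the paper's notation. It assumes $a$ is symmetric and the coefficients are smooth enough to differentiate.

```latex
% c := b - y(t) a h_x is shorthand introduced here (not the source's notation).
\begin{align*}
(A^y)^* p
 &= \tfrac12\sum_{i,j=1}^n \frac{\partial^2 (a_{ij}\,p)}{\partial x_i\,\partial x_j}
    - \operatorname{div}(c\,p)\\
 &= \tfrac12\operatorname{tr} a\,p_{xx}
    + \sum_{j=1}^n\Bigl(\sum_{i=1}^n \frac{\partial a_{ij}}{\partial x_i}\Bigr)p_{x_j}
    + \tfrac12\sum_{i,j=1}^n \frac{\partial^2 a_{ij}}{\partial x_i\,\partial x_j}\,p
    - c\cdot p_x - (\operatorname{div} c)\,p .
\end{align*}
% The coefficient of p_x is \gamma - c = -b + y(t) a h_x + \gamma, as in (6.2);
% the zeroth-order terms, added to the potential in (6.1), give (6.3).
```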

To satisfy the various assumptions about $g = g^y$, $V = V^y$ made above, suitable conditions on $\sigma$, $b$, and $h$ must be imposed. To obtain the local Hölder conditions needed in §4 we assume that $y(\cdot)$ is Hölder continuous on $[0,t_1]$. This is no real restriction, since almost all observation trajectories $y(\cdot)$ are Hölder continuous.

To avoid unduly complicating the exposition let us consider only the following special case. We take $\sigma$ = identity, an assumption already made for the existence theorem in §4. We assume that $b \in C^3$ with $b$, $b_x$ bounded, and all second, third order partial derivatives of $b$ of class $\mathscr{P}_r$ for some $r$. Let $h$ be a polynomial of degree $m$ and $S^0$ a polynomial of degree $\ell$, with

$$\lim_{|x|\to\infty} |h(x)| = \infty, \qquad \lim_{|x|\to\infty} S^0(x) = +\infty. \tag{6.4}$$

Then all of the hypotheses in §'s 2-4 hold. In (6.2), $g^y$ has polynomial growth of degree $m-1$ as $|x| \to \infty$, while in (6.3) $V^y$ is the sum of the degree $2m$ polynomial $-\tfrac12 h^2(x)$ and terms with polynomial growth of degree $< 2m$.

Let $S^y = -\log p^y$. From Theorem 3.1 we get the upper bounds

$$\begin{aligned}
&\text{(i)}\quad S^y(x,t) \le M_1(1 + |x|^{\rho}), && 0 \le t \le t_1,\ \ \rho = \max(m+1,\ell),\\
&\text{(ii)}\quad S^y(x,t) \le M_2(1 + |x|^{m+1}), && 0 < t_0 \le t \le t_1,\ \ m > 1,
\end{aligned} \tag{6.5}$$

where $M_1$, $M_2$ depend on $y$. For $p^0 = \exp(-S^0)$ to satisfy (5.4) we need $\ell \ge m+1$. The Corollary to Theorem 5.1 then gives the lower bound

$$S^y(x,t) \ge \delta|x|^{m+1} - \delta_1. \tag{6.6}$$

From (6.5)(ii) and (6.6) we see that Ix

m+ l

, at least for

0 < t
1

- 61

is a solution to the Zakai equation.

For $\varphi \in C_b$ (i.e., $\varphi$ continuous and bounded on $R^n$) let

$$\Lambda_t(\varphi) = \int_{R^n} \varphi(x)\,q(x,t)\,dx,$$

$$\tilde\Lambda_t(\varphi) = E^0\Bigl\{\varphi[x(t)]\exp\Bigl(\int_0^t h[x(\tau)]\,dy - \tfrac12\int_0^t h[x(\tau)]^2\,d\tau\Bigr)\ \Big|\ y(\tau),\ 0 \le \tau \le t\Bigr\},$$

where $E^0$ denotes expectation with respect to the probability measure $P^0$ obtained by eliminating the drift term in (1.2) by a Girsanov transformation. The measure $\tilde\Lambda_t$ is the unnormalized conditional distribution of $x(t)$. By a result of Sheu [10], $\Lambda_t = \tilde\Lambda_t$ and hence $q(\cdot,t)$ is the density of $\tilde\Lambda_t$. In fact, both $\Lambda_t$, $\tilde\Lambda_t$ are weak solutions to the Zakai equation. Moreover,

$$E^0\tilde\Lambda_t(1) = 1, \qquad E^0\Lambda_t(1) \le 1.$$

The inequality is seen by approximating $h$ by bounded $h_k$ with corresponding density $q_k(x,t)$ of the unnormalized conditional distribution $\Lambda^k_t$. Then (see [10]) $E^0\Lambda^k_t(1) = 1$, and $\Lambda^k_t(\varphi) \to \Lambda_t(\varphi)$ as $k \to \infty$ for any continuous $\varphi$ with compact support. Hence, $E^0\Lambda_t(1)$