INTERVAL CENSORING, CASE 2: ALTERNATIVE HYPOTHESES

by

Jon A. Wellner

TECHNICAL REPORT No. 289

January 1995

Department of Statistics, GN-22, University of Washington, Seattle, Washington 98195 USA


Interval Censoring, Case 2: Alternative Hypotheses

Jon A. Wellner¹

Department of Statistics, GN-22, University of Washington, Seattle, Washington 98195

January 31, 1995

"Interval censoring case 2" involves observation times $(U, V)$ with distribution $H$ concentrated on the set $u \le v$ and a time of interest $X$ with distribution $F$. The goal is to estimate $F$ based only on observation of i.i.d. copies of $(1_{[X \le U]}, 1_{[U < X \le V]}, U, V)$. Groeneboom (1991) initiated the study of the nonparametric maximum likelihood estimator $F_n$ of $F$; see Groeneboom and Wellner (1992), especially pages 43-50 and 100-108. Geskus (1992) and Geskus and Groeneboom (1994) have studied the estimation of smooth functionals (such as the mean of $F$) in case 2. Under hypotheses ensuring that the observation times $U$ and $V$ are close with (sufficiently) positive probability, Groeneboom (1991) showed that a one-step approximation $F_n^{(1)}$ to the nonparametric MLE satisfies

where $Z$ is the last time where standard two-sided Brownian motion $W$ minus the parabola $y(t) = t^2$ reaches its maximum. While it is conjectured in Groeneboom and Wellner (1992) that the nonparametric MLE $F_n$ has this same behavior, this conjecture is still unproved. The goal of this paper is to explore alternative hypotheses under which $U$ and $V$ are not close with high probability. Under these alternative hypotheses, the one-step approximation to the nonparametric MLE will be shown to converge at rate $n^{-1/3}$ rather than $(n \log n)^{-1/3}$, much as in interval censoring case 1 (current status data). We will also briefly discuss the behavior of the one-step NPMLE with $k > 2$ observation points and estimators of smooth functionals.

¹ Research supported in part by National Science Foundation grant DMS-9108409, NIAID grant 2R01 AI291968-04, NATO grant NWO Grant B 61-238. AMS 1980 60B12. words and processes, hypotheses, interval censored data, mean, nonparametric
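As an illustration of the limit variable $Z$ described above, the following Python sketch approximates its distribution by simulation. It is only a rough numerical aid under stated assumptions: the function name `simulate_chernoff`, the grid mesh `step`, and the truncation of the search to `[-half_width, half_width]` are choices made here and do not come from the report.

```python
import numpy as np


def simulate_chernoff(n_draws=2000, half_width=3.0, step=1e-3, seed=0):
    """Approximate draws of Z = last argmax over t of W(t) - t^2.

    Assumptions (illustrative, not from the report): W is discretized on a
    grid of mesh `step`, and the argmax is searched only on
    [-half_width, half_width].
    """
    rng = np.random.default_rng(seed)
    m = int(round(half_width / step))
    t_pos = step * np.arange(1, m + 1)
    t = np.concatenate((-t_pos[::-1], [0.0], t_pos))
    draws = np.empty(n_draws)
    for i in range(n_draws):
        # Two-sided Brownian motion: independent paths to the right and to
        # the left of zero, both started at W(0) = 0.
        right = np.cumsum(rng.normal(0.0, np.sqrt(step), size=m))
        left = np.cumsum(rng.normal(0.0, np.sqrt(step), size=m))
        w = np.concatenate((left[::-1], [0.0], right))
        y = w - t ** 2
        # np.argmax returns the first maximizer, so search the reversed
        # path to get the last grid time at which the maximum is attained.
        last = len(y) - 1 - int(np.argmax(y[::-1]))
        draws[i] = t[last]
    return draws
```

Empirical quantiles of the returned array (for instance via `np.quantile`) then give a crude approximation to the quantiles of $Z$; the accuracy depends on the mesh and on the number of draws.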

1. Interval Censoring: Models and Estimators. We begin with a review of interval censoring models starting with "case 1" or "current status data". Suppose that $X \sim F_0$ is a "time of interest", and that $U \sim H$ is an "observation time". We will assume that $X$ and $U$ are independent random variables. Unfortunately we do not observe $(X, U)$ but just $(1_{[X \le U]}, U) \equiv (\Delta, U)$. Thus $(\Delta \mid U = u) \sim \mathrm{Bernoulli}(F_0(u))$ and if $H$ has density function $h$ with respect to Lebesgue measure, then the joint density of $(\Delta, U)$ is

$$p(\delta, u; F_0) = F_0(u)^{\delta} (1 - F_0(u))^{1 - \delta} h(u)$$

for $\delta \in \{0, 1\}$. The goal is to estimate the distribution function $F_0$, or functions of $F_0$ such as the mean, based on observation of a sample $(\Delta_1, U_1), \ldots, (\Delta_n, U_n)$ i.i.d. as $(\Delta, U)$.

Another commonly arising observation scheme involves two observation times, and hence is called "case 2" interval censoring in Groeneboom (1990) and Groeneboom and Wellner (1992) (which we henceforth refer to as "GW (1992)"). Again $X \sim F_0$ is a "time of interest", but now suppose that $(U, V) \sim H$ is independent of $X$ where $P_H(U \le V) = 1$. In this case we observe not $(X, U, V)$ but just

$$(\Delta, U, V) \equiv (\Delta_1, \Delta_2, \Delta_3, U, V) \equiv (1_{[X \le U]}, 1_{[U < X \le V]}, 1_{[V < X]}, U, V).$$

Clearly

$$(\Delta \mid U = u, V = v) \sim \mathrm{Multinomial}_3(1, (F_0(u), F_0(v) - F_0(u), 1 - F_0(v)))$$

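and if $H$ has density $h$ with respect to Lebesgue measure on $R^2$, then the joint density of $(\Delta, U, V)$ is given by

$$p(\delta, u, v; F_0) = F_0(u)^{\delta_1} (F_0(v) - F_0(u))^{\delta_2} (1 - F_0(v))^{\delta_3} h(u, v)$$

where $\delta_i \in \{0, 1\}$ for $i = 1, 2, 3$ and $\delta_1 + \delta_2 + \delta_3 = 1$. For an application of this case 2 model to data involving AIDS survival times ($X$ = time from onset of AIDS to death) for 92 members of the U.S. Air Force, see Aragon and Eberly (1992). [This data set also suggests the need for regression methods for interval censored data. See Huang and Wellner (1994), Huang (1994a,b), Rabinowitz, Tsiatis, and Aragon (1993) for work in this direction.]

Cases 1 and 2 extend to observation schemes involving several observation times in a number of ways. We describe one natural and obvious extension. Suppose that $X \sim F_0$ is a "time of interest" and that $U = (U_1, \ldots, U_k) \sim H$ are observation times with $U_{j-1} \le U_j$ for $j = 1, \ldots, k$, where $U_0 \equiv 0$ and $U_{k+1} \equiv \infty$. We observe $(\Delta, U)$ where $\Delta_j \equiv 1_{(U_{j-1}, U_j]}(X)$, $j = 1, \ldots, k + 1$. Then

$$(\Delta \mid U) \sim \mathrm{Multinomial}_{k+1}(1, (F_0(U_1) - F_0(U_0), \ldots, F_0(U_{k+1}) - F_0(U_k)))$$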

and if $H$ has density $h$ with respect to Lebesgue measure on the subset $\{u \in R^k : 0 \le u_1 \le \cdots \le u_k\} \subset R^k$, then the joint density of $(\Delta, U)$ is given by

$$p(\delta, u; F_0) = \prod_{j=1}^{k+1} \{F_0(u_j) - F_0(u_{j-1})\}^{\delta_j} h(u)$$

where $\delta_j \in \{0, 1\}$ for $j = 1, \ldots, k + 1$ and $\sum_{j=1}^{k+1} \delta_j = 1$. Other models for interval censoring are also of interest: see e.g. Rabinowitz, Tsiatis, and Aragon (1993).

Now we turn to a description of the Nonparametric Maximum Likelihood Estimators (NPMLE's) for these models. For a problem slightly more general than case 1, the NPMLE of $F_0$ was described by Ayer, Brunk, Ewing, Reid, and Silverman (1955). The following characterization and computational method is from GW (1992), proposition 1.2, page 41: First, order the observation times as

$$U_{(1)} \le U_{(2)} \le \cdots \le U_{(n)}$$

and let $\Delta_{(1)}, \ldots, \Delta_{(n)}$ denote the corresponding $\Delta_i$'s. Plot $(i, \sum_{j=1}^{i} \Delta_{(j)})$, $i = 1, \ldots, n$ and $(0, 0)$. Form the Greatest Convex Minorant (GCM) $G^*$ of these points. Then the NPMLE $F_n$ of $F_0$ is given by: $F_n(U_{(i)})$ is the left derivative of the function $G^*$ at $i$, $i = 1, \ldots, n$. For example, if $n = 5$, $U_{(\cdot)} = (1.2, 1.8, 2.1, 3.0, 3.5)$, and $\Delta_{(\cdot)} = (1, 0, 1, 1, 0)$, then $F_n(1.2) = F_n(1.8) = 1/2$ and $F_n(2.1) = F_n(3.0) = F_n(3.5) = 2/3$. The NPMLE does not specify where to place the remaining mass, and we will leave this undefined.

Characterization of the NPMLE of $F_0$ in case 2 was accomplished by Groeneboom (1991), and is given in Groeneboom's part of GW (1992), pages 43-50. To state Groeneboom's characterization, we need the following motivation and notation. The part of the log-likelihood for $F$ divided by $n$ in case 2 is given by

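$$\ell_n(F) = \frac{1}{n} \sum_{i=1}^{n} \{\Delta_{1i} \log F(U_i) + \Delta_{2i} \log(F(V_i) - F(U_i)) + \Delta_{3i} \log(1 - F(V_i))\}$$

$$= \int \{\delta_1 \log F(u) + \delta_2 \log(F(v) - F(u)) + \delta_3 \log(1 - F(v))\}\, dP_n(\delta, u, v),$$

where $P_n$ is the empirical measure of $(\Delta_i, U_i, V_i)$, $i = 1, \ldots, n$.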
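The two computations described above, the case 1 NPMLE via the greatest convex minorant and the normalized case 2 log-likelihood, can be sketched in Python as follows. This is a minimal sketch under assumptions made here, not the report's own implementation: the names `npmle_case1` and `loglik_case2` are hypothetical, the slopes of the convex minorant are obtained by pool-adjacent-violators (which yields the same left derivatives as the GCM of the cumulative sum diagram), the observation times are assumed distinct, `F` is assumed to be a vectorized callable, and `delta` is assumed to be an n-by-3 array of one-hot rows.

```python
import numpy as np


def npmle_case1(u, delta):
    """Case 1 NPMLE at the ordered observation times.

    Returns (sorted u, left derivatives of the GCM of the cusum diagram),
    computed by pool-adjacent-violators on the ordered indicators.
    """
    order = np.argsort(u, kind="stable")
    u_sorted = np.asarray(u, dtype=float)[order]
    d = np.asarray(delta, dtype=float)[order]
    blocks = []  # each block is [sum of indicators, number of points]
    for x in d:
        blocks.append([x, 1])
        # Pool while the previous block mean is >= the current block mean
        # (compared by cross-multiplication to avoid division).
        while (len(blocks) > 1
               and blocks[-2][0] * blocks[-1][1] >= blocks[-1][0] * blocks[-2][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fhat = np.concatenate([np.full(c, s / c) for s, c in blocks])
    return u_sorted, fhat


def loglik_case2(F, u, v, delta):
    """Case 2 log-likelihood for F divided by n.

    `delta` has rows (1[X<=U], 1[U<X<=V], 1[X>V]); exactly one entry per
    row is 1, so only the log of the observed cell probability is summed.
    """
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    delta = np.asarray(delta)
    Fu, Fv = F(u), F(v)
    probs = np.column_stack((Fu, Fv - Fu, 1.0 - Fv))
    return float(np.sum(np.log(probs[delta > 0])) / len(u))
```

Applied to the five-point example from the text, `npmle_case1([1.2, 1.8, 2.1, 3.0, 3.5], [1, 0, 1, 1, 0])` returns the values 1/2, 1/2, 2/3, 2/3, 2/3 at the ordered times, in agreement with the values stated there.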

The process corresponding to the cumulative sums of the first derivatives with respect to $F$ of $\ell_n(F)$ is $W_F(t)$, $t \ge 0$, defined in GW (1992), page 45, (1.25).

Conjectured Theorem 4.1. Suppose that $0 < F_0(t_0), H(t_0, \ldots, t_0) < 1$, and let $F_n^{(1)}$ be the estimator of $F_0$ obtained at the first step of the iterative convex minorant algorithm. Suppose that $f_0(t_0) > 0$ and $h_j(t_0, t_0) > 0$ for $j \in J$ where $J$ is nonvoid and

$$h_j(t, t) \equiv \lim_{v \downarrow t} h_j(t, v)$$

is continuous in $t$ in a neighborhood of $t_0$ for all $j \in J$. Then

(4.10)

where $Z \sim$ Chernoff, and where

$$b(t_0) \equiv \frac{2}{3} \sum_{j \in J} h_j(t_0, t_0) / f_0(t_0).$$

When k = 2 and J = {2}, then conjectured theorem 4.1 reduces to the statement of GW (1992), Theorem 5.3, page 100. Note that the conclusion of the conjectured theorem 4.1 can be restated as


To state our conjectured theorem for $k$ observation points under hypotheses similar to those in section 2, we define

$$k_{1j}(u) = \int_u^M \frac{h_j(u, v)}{F_0(v) - F_0(u)}\, dv,$$

and

$$k_{2j}(v) = \int_0^v \frac{h_j(u, v)}{F_0(v) - F_0(u)}\, du$$

for $j = 2, \ldots, k$ where $h_j$ denotes the joint density of $(U_{j-1}, U_j)$, $j = 2, \ldots, k$, as before. We will suppose that all of the functions $k_{mj}$, $j = 2, \ldots, k$, $m = 1, 2$, are finite, and moreover that with

and

we have

(4.12)  $\alpha \int_{(t_0, t_0 + t/\alpha]} k_{mj}(u; \epsilon\alpha)\, du \to 0$ as $\alpha \to \infty$

for each $\epsilon > 0$ and $j = 2, \ldots, k$, $m = 1, 2$.

Conjectured Theorem 4.2. Let $F_0$ and $H$ have densities $f_0$ and $h$ respectively, and suppose that $h_1$, $h_k$, $k_{1j}$, and $k_{2j}$, $j = 2, \ldots, k$, are continuous at $t_0$ and $f_0(t_0) > 0$, where $h_1$, $h_k$ are the marginal densities of $U_1$, $U_k$ respectively. Suppose that (4.12) holds. Let $0 < F_0(t_0), H(t_0, t_0) < 1$, and let $F_n^{(1)}$ be the estimator of $F_0$ obtained at the first step of the iterative convex minorant algorithm. Then

where $Z \sim$ Chernoff, and where

(4.13)

5. Discussion and Further Problems. First a summary of the rationale for and possible advantages of the alternative hypotheses suggested here:

• Under the alternative hypotheses we obtain limit theorems for the NPMLE at a point (or at least the theoretical construct $F_n^{(1)}$) which are comparable with case 1, and give the possibility of addressing what is gained by two observation times over one.

• Study of the properties of smooth functionals such as $E_{F_0} X$ under case 2 may be easier under the separation type hypotheses and certainly will be considerably easier under the "strict separation hypothesis" $P(V - U \ge \epsilon) = 1$.

• Realism: in practice, separation of the observation times $U$ and $V$ may be forced by practical or economic considerations.

• Mathematical completeness: we need to understand how these estimators behave on as much of the parameter space

$$\Theta = \{(F_0, H) : F_0 \text{ a d.f. on } R^+, \ H \text{ a d.f. on } R^{+2}_{\le}\}$$

as possible.

Despite the slow rates of convergence of the NPMLE or the one-step NPMLE in cases 1, 2, and $k$, estimators of smooth functionals, such as means or other moments, enjoy $n^{-1/2}$ rates of convergence; this has been shown in GW (1992) and Huang and Wellner (1995) for case 1, and seems very likely to be true in case 2; see Geskus (1992), Geskus and Groeneboom (1994), and Groeneboom (1995). It might be worth considering smooth functionals in case 2 (and $k$) under hypotheses sufficiently broad to include the "separated" observation times formulation of sections 2 and 3.

Here are a few problems suggested by the above development.

A. Does the MLE itself have the same behavior as $F_n^{(1)}$ under the hypotheses of either theorem 2.1 or theorem 3.1? In other words, does the "working hypothesis" hold?

B. What is the behavior of the NPMLE of the mean and other smooth functionals under these hypotheses? [I conjecture that it will be easier to prove.]

C. What is the rate of convergence for $(F, H)$ pairs "between" the alternative hypotheses of theorem 3.1 and those of theorem 2.1? Is there a way of unifying the various cases by using a random norming?

D. Are the conjectured theorems 4.1 and 4.2 true? Do they remain true for the NPMLE $F_n$ itself?

6. Proofs for Section 3. Proceeding as in GW (1992), we introduce processes

where $W_{F_0}$ and $G_{F_0}$ are defined by GW (1992), page 45, (1.25), and page (1.29), respectively. The process $V_n^{(0)}$ is defined by

$$V_n^{(0)}(t) = W_n^{(0)}(t) + \int_{[0, t]} (F_0(t') - F_0(t_0))\, dG_n^{(0)}(t'), \qquad t \ge 0.$$

We have the following result for $V_n^{(0)}$ corresponding to GW (1992) lemma 5.5.

Lemma 6.1. Suppose that the hypotheses of theorem 3.1 hold, and define the process $U_n^{(0)}(t)$, $t \in R$, by

where $U_n^{(0)}(t) = 0$ if $t \le -t_0 n^{-1/3}$. Then $U_n^{(0)}$ converges in distribution, in the topology of uniform convergence on compacta on the space of locally bounded real-valued functions on $R$, to the process $U$, defined by

for $t \in R$, where $W$ is (standard) two-sided Brownian motion on $R$, originating from zero, and $c_2(t_0)$ is as defined in (3.7).

Proof. We first show that the process

(6.14)

converges, in the topology of uniform convergence on compacta, to the process

(6.15)  $t \mapsto \sqrt{c_2(t_0)}\, W(t), \qquad t \ge 0.$

To do this, we will use Kim and Pollard (1990), theorem 4.7, or equivalently lemmas 4.5 and 4.6. We first verify the hypotheses of lemma 4.5: note that $Z_n^{(0)}(t) = n^{2/3} P_n g(\cdot, t_0 + n^{-1/3} t)$ for the family of functions $\{g(\cdot, t_1)\}_{t_1 \ge t_0}$ defined by

(6.16)  $g(z, u, v, t_1) \equiv$

1 1 u) - 1[to0

functions $g(\cdot, t_1)$ are given for $t_1 \ge 0$ in (6.16); for $t_1 \le 0$ an obvious analogous expression holds. First note that an envelope function for $\mathcal{G}_R$ is

==

Fo~ u) + 1(1.£<x:S;v] -:::--:--:---=:--:--:-} 1

l(to-R:S;u:::;to+R] { 1[x::S;u] +1[to-R