Computing Chernoff's distribution - Semantic Scholar

Report 3 Downloads 52 Views
Computing Chernoff’s distribution Piet Groeneboom1 and Jon A. Wellner2 July 26, 1999

Abstract A distribution which arises in problems of estimation of monotone functions is that of the location of the maximum of two-sided Brownian motion minus a parabola. Using results of Groeneboom (1985), (1989), we present algorithms and programs for computation of this distribution and its quantiles. We also present some comparisons with earlier computations (Dykstra and Carolan (1996)) and simulations (Narayanan and Sager (1989), and Keiding, Begtrup, Scheike, and Hasibeder (1996)).

1

Research supported in part by National Science Foundation grant DMS-9743445 Research supported in part by National Science Foundation grant DMS-Research supported in part by National Science Foundation grant DMS-9532039 and NIAID grant 2R01 AI291968-04. AMS 1980 subject classifications. Primary: 62E17; secondary 65D20. Key words and phrases. Airy functions, Brownian motion, density, distribution function, monotone, parabolic drift, quadratic drift, quantiles. 2

1

1. Introduction Our goal here is to compute, table, and plot the density, distribution function, moments, and quantiles of the location Z of the maximum of two-sided Brownian motion B minus the parabola t2 . We also provide several examples of the application of this distribution to the construction of approximate confidence intervals in problems including interval censoring, monotone density estimation, and mode estimation. To be explicit, let B(t), −∞ < t < ∞, be two-sided standard Brownian motion with B(0) = 0. Then Z ≡ argmaxt (B(t) − t2 ) . It follows from Lemma 2.6 of Kim and Pollard (1990) that Z is uniquely defined with probability 1. The distribution of Z apparently first arose in work of Chernoff (1964) on the estimation of the mode of a distribution function, and hence we refer to the distribution of Z as Chernoff’s distribution. Prakasa Rao (1969) showed that the distribution of the slope at zero of the greatest convex minorant of B(t) + t2 is exactly 2Z. This follows from the “switching relation”; see Groeneboom (1985), (2.2), page 541 for the finite sample version of this relation. Groeneboom (1985), (1989) completely described the distribution of Z and characterized the process V (a) ≡ sup{t ∈ IR : B(t) − (t − a)2 is maximal} . In particular, Z has a density fZ with respect to Lebesgue measure on IR which is symmetric about zero, and which satisfies fZ (z) ∼

1 2

44/3|z| a1|z|) exp(− 23 |z|3 + 21/3˜ Ai0 (˜ a1 )

as

z → ∞.

where ˜a1 ≈ −2.3381 is the largest zero of the Airy function Ai and where Ai0 (˜ a1 ) ≈ 0.7022. In unpublished notes, Groeneboom and Sommeyer (1984) numerically computed the absolute moments E(|Z|k ), k = 1, . . ., 4. The first of these was reported by Devroye and Gyorfi (1985), page 214; and all four of them were reported by Keiding, Begtrup, Scheike, and Hasibeder (1996). Note that by symmetry of fZ it follows that E(Z k ) = 0 for k odd.

2. Applications and Examples. Here we present several examples showing how the distribution of Z enters in the construction of confidence intervals in several different statistical problems. Example 1. (Interval censoring, case 1). Suppose that X is a “survival time” with distribution function F on [0, ∞) and Y is an observation time which is independent of X and has distribution function G on [0, ∞). However we can observe only (Y, 1[X≤Y ] ) = (Y, ∆), and want to estimate F , the distribution function of X based on i.i.d. replications (Y1 , ∆1), . . . , (Yn , ∆n) of (Y, ∆). In b n of F is known from Ayer, Brunk, Ewing, Reid, and Silverman (1955). this case the NPMLE F 2

It was proved in Groeneboom and Wellner (1992) that if F has density f and G has density g at t0 ∈ (0, ∞) with g(t0) > 0, f (t0 ) > 0, then 1/3  1 1/3 b n (Fn (t0 ) − F (t0 )) →d 2Z . F (t0 )(1 − F (t0 ))f (t0)/g(t0) 2 Thus from Table 2 in section 2 below, it follows that an asymptotic 95% confidence interval for F (t0 ) is given by 1/3  −1/3 1 b b b b Fn (t0 )(1 − Fn (t0))f (t0 )/b Fn (t0) ± n g(t0 ) 2 · (.99818) 2 g(t0 ) are any consistent estimators of f (t0 ) and g(t0) respectively; e.g. based on where fb(t0 ) and b b n and G b n (t) ≡ n−1 Pn 1[Y ≤t] . kernel smoothing of F i=1 i In the particular application discussed by Keiding et al. (1996), Xi represents “age of immunization” of individual i against rubella, Yi represents “current age” of person i. Example 2. (Mode Estimation; Venter’s estimator). Suppose that X1 , . . ., Xn are i.i.d. with unimodal density f satisfying 1 1 f (x) = γ0 − γ(x − θ)2 + γ3(x − θ)3 + o(|x − θ|3 ) . 2 6 If we take rn = An4/5 , then as shown by Venter (1967), his estimator θbn of the mode θ satisfies n1/5(θbn − θ) →d 21/3A−2/3 γ −2/3γ0 Z . Thus, if γˆ0 and γˆ are consistent estimators of γ0 and γ respectively, then θbn ±

21/3γˆ0 · (.99818) n1/5A2/3γˆ 2/3

yields an approximate 95% confidence interval for the mode θ. Narayanan and Sager (1989) give several nice examples of mode estimation via both Chernoff’s estimators and Venter’s estimators and their (simulated) quantiles of the distribution of Z to form confidence intervals; see especially pages 46 - 50. Example 3. (Panel Count Data). Wellner and Zhang (1998) show that a pseudo-likelihood b n of the mean function Λ of a counting process with “panel count” data satisfies estimator Λ  2 1/3 σ (t)Λ0 (t) 1/3 b ps n (Λn (t) − Λ(t)) →d 2Z 2G0 (t) P Pk 0 b 0 , and G b 0 (t) are consistent estimators ˆ 2 (t), Λ where G0 (t) = ∞ k=1 P (K = k) j=1 Gk,j (t). Thus if σ of σ 2(t), Λ0 (t), and G0 (t) respectively, then ( )1/3 2 (t)Λ b 0 (t) σ ˆ 1 b ps Λ · 2 · (.99818) n (t) ± 1/3 b 0 (t) n 2G 3

yields an approximate 95% confidence interval for Λ(t). For a rather different approach to examples of the type presented here, see Politis and Romano (1994), especially their example 2.1.1, pages 2035-2036.

3. Computation of the density fZ and distribution function FZ . The density fZ can in principle be found by solving the following partial differential equation (heat equation), given in Chernoff (1964): ∂2 ∂ u(t, x) = − 12 2 u(t, x), ∂t ∂x

(3.1)

for x ≤ t2 , under the boundary conditions: def u(t, t2) = lim u(t, x) = 1,

lim u(t, x) = 0, t ∈ IR.

x↑t2

x↓−∞

(3.2)

In terms of the (smooth) solution u(t, x), the density fZ is given by fZ (t) = 12 u2 (−t)u2 (t), t ∈ IR,

(3.3)

where (as in Groeneboom (1985), the function u2 is defined by u2 (t) = lim

x↑t2

∂ u(t, x), t ∈ IR. ∂t

(3.4)

In fact, the original computations of the density were based on a numerical solution of this differential equation (this information is based on personal communications from professors Herman Chernoff and Willem van Zwet). The trouble with this approach is the behavior of the function u2 for negative values of t. In fact, since, by (4.25) in Groeneboom (1985),  u2 (t) ∼ c1 exp − 23 |t|3 − c|t| , t → −∞, where c ≈ 2.9458 . . . and c1 ≈ 2.2638 . . ., the function u2 tends to zero extremely rapidly, as t decreases away from zero. Some experiments with the numerical approach by the first author, in cooperation with B. Sommeyer, back in 1984, showed that this simple analytic fact invalidates any direct numerical approach, based on the partial differential equation: an error analysis showed that even with very fine grids the numerical solution was highly unstable. For this reason a more thorough analytic analysis of the problem was made, and the results of this analysis are given in Groeneboom (1985) and Groeneboom (1989) The following development is from Groeneboom (1985), section 4, pages 548 - 553. Define a function p : [0, ∞) → R as follows:  p π P∞ P∞ if y ∈ [0, 1]  − 2 k=0 ak y 3k + k=1 bk y 3(k−1/2) , p(y) = √ P  1/3 a ˜k y) , if y ∈ (1, ∞) . −y −3/2 + s 2π exp(−y 3 /6) ∞ k=1 exp(2 4

Here the a˜k ’s are the zeros of the Airy function Ai, and ak , bk are defined recursively as follows: set c0 = 1 and (2n − 3)(2n + 1) n = 1, 2, . . . . cn = −2−4 cn−1 , n2 (2n − 1) The recursive relations for the coefficients ak and bk follow from the integral equation (4.14) in Groeneboom (1985). The integral equation leads to an accurate and useful analytic representation of the density in a neighborhood of zero, whereas the expansion on the second line of (3.5) does similar job away from zero. Then with a0 = 1, b1 = 2/3, and B(p, q) ≡ Γ(p)Γ(q)/Γ(p + q), the standard Beta function, set an = cn −

n−1 X k=0

bn =

n−1 X k=0

(−1/2)k bn−k B(3n − 2k − 1/2, k + 3/2) , πk!

n = 1, 2, . . . ;

(−1/2)k an−k−1 B(3n − 2k − 2, k + 3/2) , k!

n = 2, 3, . . . .

The reason for treating the intervals [0, 1] separately is that the series using the zeros of the Airy function diverges at zero and gives a bad approximation in neighborhoods of zero. We then define g : R → R by Z ∞ 1 g(x) = 2x − √ p(y) exp(− 12 y(2x + y)2)dy 2π 0 r Z 2 ∞ {(2x + y 2 )y 2 + 12 (2x + y 2 )2} exp(− 12 y 2(2x + y 2 )2) dy +2 π 0 if x ∈ [−1, ∞), and g(x) = exp( 23 x3)41/3

∞ X

exp(−21/3a ˜k x)/Ai0 (˜ ak )

k=1

if x ∈ (−∞, −1]; here Ai0 is the derivative of the Airy function Ai. The reason for using y 2 in the integrand of the first part of the definition of g instead of y as in Groeneboom (1985), is purely √ numerical: the present change of variables avoids a factor of y in the denominator of the integrand. Finally, the density fZ is expressed in terms of g as fZ (z) = 12 g(z)g(−z), The distribution function FZ of Z is simply Z z Z fZ (w)dw = 12 FZ (z) = −∞

z ∈ (−∞, ∞) .

z

g(w)g(−w)dw , −∞

z ∈ (−∞, ∞) .

Because of the symmetry of fZ about 0, it suffices to calculate Z z Z z 1 fZ (w)dw = 2 g(w)g(−w)dw , z ∈ [0, ∞) . 0

0

5

0.7 0.6 0.5 0.4 0.3 0.2 0.1

-1.5

-1

-0.5

0.5

1

1.5

Figure 1: Density function of Z, fZ .

1

0.8

0.6

0.4

0.2

-1.5

-1

-0.5

0.5

1

1.5

Figure 2: Distribution function of Z, FZ .

6

Figures 1 and 2 show plots of the density function fZ and the distribution function FZ respectively; in these figures we used the first 18 terms of the series defining the function p, and the first 100 terms of the series defining g in the region (−∞, −1). All the figures and tables shown here were computed using Mathematica; see Wolfram (1996). Dykstra and Carolan (1997) computed the density function fZ by numerical Fourier inversion of formula (3.8), page 91, Groeneboom (1989). This section shows how fZ is computable without numerical Fourier inversion. Table 1 gives the distribution function FZ (z) and and Table 2 the density function fZ (z) for z = 0.0(.01)1.5. Table 1. Values of the distribution function FZ and density function fZ . z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09 .10 .11 .12 .13 .14 .15 .16 .17 .18 .19 .20 .21 .22 .23 .24 .25 .26 .27 .28 .29

FZ (z) .500000 .507583 .515163 .522739 .530306 .537863 .545408 .552937 .560448 .567938 .575406 .582848 .590263 .597647 .604998 .612314 .619593 .626832 .634029 .641182 .648289 .655347 .662354 .669309 .676208 .683052 .689836 .696560 .703222 .709820

fZ (z) 0.758345 0.758215 0.757828 0.757183 0.756281 0.755123 0.753709 0.752042 0.750122 0.747951 0.745532 0.742866 0.739957 0.736806 0.733416 0.729792 0.725935 0.721849 0.717539 0.713008 0.708260 0.703299 0.698131 0.692758 0.687187 0.681422 0.675469 0.669332 0.663017 0.656529

z .30 .31 .32 .33 .34 .35 .36 .37 .38 .39 .40 .41 .42 .43 .44 .45 .46 .47 .48 .49 .50 .51 .52 .53 .54 .55 .56 .57 .58 .59

7

FZ (z) .716352 .722817 .729213 .735538 .741792 .747972 .754077 .760107 .766059 .771933 .777728 .783442 .789075 .794626 .800094 .805477 .810776 .815991 .821119 .826161 .831117 .835985 .840766 .845460 .850066 .854584 .859014 .863357 .867612 .871779

fZ (z) 0.649874 0.643059 0.636088 0.628967 0.621704 0.614303 0.606771 0.599115 0.591341 0.583455 0.575464 0.567374 0.559192 0.550925 0.542578 0.534159 0.525674 0.51713 0.508533 0.49989 0.491208 0.482492 0.473749 0.464986 0.456209 0.447424 0.438638 0.429855 0.421083 0.412327

z .60 .61 .62 .63 .64 .65 .66 .67 .68 .69 .70 .71 .72 .73 .74 .75 .76 .77 .78 .79 .80 .81 .82 .83 .84 .85 .86 .87 .88 .89

FZ (z) .875858 .879851 .883756 .887575 .891308 .894955 .898517 .901994 .905388 .908698 .911925 .915071 .918136 .921120 .924025 .926852 .929601 .932273 .934870 .937392 .939841 .942218 .944523 .946759 .948925 .951024 .953056 .955023 .956926 .958766

fZ (z) .403594 .394887 .386214 .377580 .368989 .360447 .351960 .343531 .335166 .326870 .318646 .310499 .302433 .294452 .286560 .278760 .271057 .263452 .255950 .248553 .241264 .234086 .227020 .220070 .213237 .206523 .199931 .193460 .187113 .180891

z .90 .91 .92 .93 .94 .95 .96 .97 .98 .99 1.00 1.01 1.02 1.03 1.04 1.05 1.06 1.07 1.08 1.09 1.10 1.11 1.12 1.13 1.14 1.15 1.16 1.17 1.18 1.19

FZ (z) .960544 .962262 .963921 .965522 .967067 .968556 .969992 .971375 .972706 .973988 .975221 .976406 .977545 .978639 .979689 .980697 .981664 .982590 .983478 .984328 .985142 .985920 .986665 .987376 .988055 .988703 .989322 .989912 .990474 .991010

8

fZ (z) .174795 .168827 .162985 .157272 .151687 .146231 .140904 .135705 .130635 .125694 .120880 .116194 .111633 .107199 .102889 .0987031 .0946394 .0906969 .0868741 .0831697 .0795821 .0761096 .0727506 .0695033 .0663658 .0633362 .0604126 .0575930 .0548753 .0522574

z 1.20 1.21 1.22 1.23 1.24 1.25 1.26 1.27 1.28 1.29 1.30 1.31 1.32 1.33 1.34 1.35 1.36 1.37 1.38 1.39 1.40 1.41 1.42 1.43 1.44 1.45 1.46 1.47 1.48 1.49 1.50

FZ (z) .991520 .992005 .992466 .992905 .993321 .993717 .994092 .994448 .994785 .995105 .995407 .995693 .995964 .996220 .996461 .996689 .996904 .997107 .997298 .997478 .997648 .997807 .997957 .998097 .998229 .998353 .998469 .998578 .998680 .998775 .998864

fZ (z) .0497372 .0473125 .0449811 .0427408 .0405894 .0385246 .0365441 .0346458 .0328273 .0310864 .0294208 .0278282 .0263065 .0248534 .0234666 .0221441 .0208837 .0196831 .0185404 .0174534 .0164201 .0154385 .0145066 .0136226 .0127844 .0119902 .0112384 .0105269 .00985428 .0092187 .00861855

4. Quantiles of FZ and some comparisons. Dykstra and Carolan (1997) suggested that fZ and FZ are closely approximated by the N (0, (.52)2) density and distribution functions respectively. While this results in a simple approximation for the corresponding quantiles FZ−1 (p), the differences between the exact quantiles and the approximate quantiles, or exact distribution function and approximate distribution function based on the normal approximation can be substantial. Table 2 compares a few exact quantiles computed directly by inverting the distribution function computed in the preceding section, with exact, approximate, and Monte carlo quantiles as computed by Dykstra and Carolan (1997), Narayanan and Sager (1989), and Keiding et al. (1996). The Dykstra and Carolan (1997) approach seems to fail in the tail, and we indeed believe that it is absolutely necessary to use different representations in a neighborhood of zero and in the tail (which is a common phenomenon in the numerical evaluation of special functions), whereas Dykstra and Carolan (1997) essentially use the same representation for the whole domain. The normal approximation clearly cannot be good for the whole domain; is this case it is reasonable for the outer values, but not so good for the intermediate values. The results of the Monte Carlo simulation of Narayanan and Sager seem pretty good for the values tabulated here and slightly better than the Monte Carlo simulation results, reported by Keiding et al. Neverthless, there is no need for Monte Carlo simulations and these will generally deteriorate rapidly if one goes further out in the tail. Table 3 gives further quantiles of the distribution FZ . Table 2. Comparison of several computations and estimators of the quantiles FZ−1 (p) for certain p’s. Percentile p

Exact FZ−1 (p)

.9 .95 .975 .99 .995 .999

.66423 .84508 .99818 1.17153 1.28666 1.51666

Dykstra and Carolan; Fourier inversion .664 .846 .998 1.156 1.270 1.452

N (0, (.52)2)

.666 .856 1.018 1.205 1.314 1.515

9

Narayanan and Sager; Monte-Carlo .658 .838 .986 1.156 1.281 1.510

Keiding et al. Monte-Carlo .66 .836 1.009 1.176 1.278 NA

Table 3. Quantiles of FZ−1 (p) for p = .5(.01).99 and p = .99(.001).999. p .50 .51 .52 .53 .54 .55 .56 .57 .58 .59 .60 .61 .62 .63 .64 .65 .66 .67 .68 .69 .70 .71 .72 .73 .74 .75 .76 .77 .78 .79

FZ−1 (p) 0.00 0.013187 0.026383 0.039595 0.052830 0.066096 0.079402 0.092757 0.106168 0.119645 0.133196 0.146831 0.160560 0.174393 0.188342 0.202418 0.216633 0.230999 0.24553 0.260242 0.275151 0.290274 0.305629 0.321238 0.337123 0.353308 0.369821 0.386694 0.403959 0.421656

p .80 .81 .82 .83 .84 .85 .86 .87 .88 .89 .90 .91 .92 .93 .94 .95 .96 .97 .98 .99 .991 .992 .993 .994 .995 .996 .997 .998 .999 .9999

10

FZ−1 (p) 0.439828 0.458525 0.477804 0.497731 0.518383 0.539855 0.562252 0.585706 0.610378 0.636468 0.664235 0.694004 0.726216 0.761477 0.800658 0.845081 0.896904 0.960057 1.043030 1.171530 1.18981 1.20990 1.23224 1.25750 1.28666 1.32137 1.36464 1.42302 1.51666 1.78406

5. Moments of Z. As remarked in the introduction, the first four moments were computed by Groeneboom and Sommeijer (1984). Table 3 shows the first 10 moments of Z. Table 4. Absolute moments of Z; E(|Z|k ), k = 1(1)10.

k 1 2 3 4 5 6 7 8 9 10

E(|Z|k ) .41273655 .26355964 .21135025 .197157 .205733 .234550 .287604 .375099 .516042 .744103

6. Mathematica Code. (* (* (* (* (*

The following Mathematica code computes the density function, distribution function, quantiles, and moments of the random variable Z = argmax_t ( B(t) - t^2) where B(t) is two-sided Brownian motion starting from zero. It is based on the results of Groeneboom, 1985, 1989.