

The Annals of Statistics, 1989, Vol. 17, No. 3, 1070-1086

ON THE RELATIONSHIP BETWEEN STABILITY OF EXTREME ORDER STATISTICS AND CONVERGENCE OF THE MAXIMUM LIKELIHOOD KERNEL DENSITY ESTIMATE

BY MICHEL BRONIATOWSKI, PAUL DEHEUVELS AND LUC DEVROYE

Université Pierre et Marie Curie, Université Pierre et Marie Curie and McGill University

Let $f$ be a density on the real line and let $f_n$ be the kernel estimate of $f$ in which the smoothing factor is obtained by maximizing the cross-validated likelihood product according to the method of Duin and Habbema, Hermans and Vandenbroek. Under mild regularity conditions on the kernel and $f$, we show, among other things, that $\int |f_n - f| \to 0$ almost surely if and only if the sample extremes of $f$ are strongly stable.

1. Introduction. Let $X_1, X_2, \ldots$ be an i.i.d. sequence of random variables with distribution function $F$ and density $f$, and consider the kernel density estimate [Parzen (1962) and Rosenblatt (1956)]

$$f_{n,h}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right), \tag{1.1}$$

where the kernel $K$ is a nonnegative function such that

(K1) $\int K(t)\,dt = 1$

and

(K2) for some $r > 0$, $R > 0$, $m > 0$ and $M > 0$, $m\,1_{[-r,r]}(x) \le K(x) \le M\,1_{[-R,R]}(x)$ for all $-\infty < x < \infty$.
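As an illustration of the estimate (1.1), the following minimal sketch evaluates $f_{n,h}$ with the Epanechnikov kernel, which is one kernel satisfying (K1) and (K2). The function names, the choice of kernel and the sample values are assumptions made for this example only and do not come from the paper.

```python
def epanechnikov(t):
    """Illustrative kernel K(t) = 0.75 * (1 - t^2) on [-1, 1], zero elsewhere.

    It integrates to 1 (K1) and satisfies (K2) with, for example,
    R = 1, M = 0.75, r = 0.5 and m = 0.5625.
    """
    return 0.75 * (1.0 - t * t) if abs(t) <= 1.0 else 0.0


def kernel_estimate(x, sample, h, kernel=epanechnikov):
    """Evaluate f_{n,h}(x) = (1 / (n h)) * sum_i K((x - X_i) / h), as in (1.1)."""
    if h <= 0:
        raise ValueError("the smoothing factor h must be positive")
    n = len(sample)
    return sum(kernel((x - xi) / h) for xi in sample) / (n * h)


# Hypothetical data, chosen only to exercise the function.
sample = [-1.2, -0.3, 0.1, 0.4, 1.5]
print(kernel_estimate(0.0, sample, h=1.0))
```

Only the three sample points within distance $hR = 1$ of $x = 0$ contribute to the printed value, which is the compact-support behaviour that (K2) encodes.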

Inspired by maximum likelihood theory, Habbema, Hermans and Vandenbroek (1974) and Duin (1976) proposed selecting the smoothing factor $h > 0$ by maximizing

$$L_n(h) = \frac{1}{n} \sum_{j=1}^{n} \log f_{n,h}^{(j)}(X_j), \tag{1.2}$$

where, for $j = 1, \ldots, n$ and $n \ge 2$,

$$f_{n,h}^{(j)}(x) = \frac{1}{nh} \sum_{1 \le i \ne j \le n} K\left(\frac{x - X_i}{h}\right). \tag{1.3}$$

... where $D_N = \max_i \min_{j \ne i} |X_i - X_j|$, with the $X_i$'s and $X_j$'s restricted to $[-A, A]$. This, jointly with the inequality $D_n \ge D_N$, implies (3.1) as sought. $\square$
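The selection rule (1.2)-(1.3) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the normalisation $1/(nh)$ in the leave-one-out estimate (using $1/((n-1)h)$ instead only shifts $L_n$ by a constant and does not change the maximizer), it uses the uniform kernel on $[-1, 1]$, and it maximizes over a small hypothetical grid. It also makes visible why isolated sample points matter: for a kernel supported on $[-R, R]$, $L_n(h) = -\infty$ as soon as some $X_j$ has no other sample point within distance $hR$.

```python
import math


def uniform_kernel(t):
    """K(t) = 1/2 on [-1, 1]; satisfies (K1) and (K2) with m = M = 1/2, r = R = 1."""
    return 0.5 if abs(t) <= 1.0 else 0.0


def cv_log_likelihood(sample, h, kernel=uniform_kernel):
    """L_n(h) = (1/n) * sum_j log f^{(j)}_{n,h}(X_j), with leave-one-out estimates.

    Returns -inf when some X_j has no other sample point inside the kernel support.
    """
    n = len(sample)
    total = 0.0
    for j, xj in enumerate(sample):
        s = sum(kernel((xj - xi) / h) for i, xi in enumerate(sample) if i != j)
        if s <= 0.0:
            return -math.inf
        total += math.log(s / (n * h))  # assumed normalisation 1 / (n h)
    return total / n


def select_bandwidth(sample, grid, kernel=uniform_kernel):
    """Return the grid value of h that maximizes the cross-validated criterion."""
    return max(grid, key=lambda h: cv_log_likelihood(sample, h, kernel))


def max_nearest_neighbour_gap(sample):
    """max_i min_{j != i} |X_i - X_j|: the largest nearest-neighbour distance."""
    return max(
        min(abs(xi - xj) for j, xj in enumerate(sample) if j != i)
        for i, xi in enumerate(sample)
    )


# Hypothetical sample with one isolated point in the right tail.
sample = [0.0, 0.1, 0.25, 0.4, 3.0]
grid = [0.5, 1.0, 2.0, 2.7, 3.5, 5.0]
print(max_nearest_neighbour_gap(sample))
print([cv_log_likelihood(sample, h) for h in grid])
print(select_bandwidth(sample, grid))
```

In this toy sample every grid value of $h$ below the largest nearest-neighbour gap gives $-\infty$, so the selected smoothing factor is driven by the isolated extreme point rather than by the bulk of the data.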

THEOREM 4. Assume that $K$ satisfies (K1) and (K2). Then, for any density $f$ such that the sample extremes $X_{1,n}$ and $X_{n,n}$ are strongly stable together with $(1 - F)/f$ being monotone in the upper tail and $F/f$ monotone in the lower tail, the cross-validated choice $h_n$ of $h$ satisfies $h_n \to 0$ almost surely as $n \to \infty$.

PROOF. We follow the proof of Theorem 6.4 of DG (1985) corresponding to the case where the support $S = \{x : f(x) > 0\}$ of $f$ is bounded. We use the same notation, with the exception of $T = S \cap [-A, A]$, where $A$ is a large constant. We see that:

(i) Lemmas 6.10 and 6.12 remain valid if all integrals are taken over $T$.
(ii) Lemma 6.11 remains valid without change.
(iii) Lemma 6.13 is in general false when $S$ is not compact. It is replaced by Lemma 3 below.
(iv) Lemma 6.14 is crucial to the proof. It is restated and proved in a more general setting in Lemma 4 below.

We now conclude the proof of Theorem 4 by mimicking the proof of Lemma 6.15 without change. Note that this proof requires parts (C), (E), (G) and (H) of Lemma 3, the monotonicity condition on $(1 - F)/f$ and $F/f$, and Lemma 4 presented below. We have to verify that property (H) of Lemma 3 can be applied. To do so, we note that the ultimate monotonicity of $(1 - F)/f$ in the right tail and the stability of $X_{n,n}$ together imply that $(1 - F)/f \to 0$ as $x \to \infty$ (see Lemma 5 and Remark 1 below).

The following Lemma 3 states some useful properties of the entropy needed in our proofs. In the sequel, we assume without loss of generality that $R = 1$ in (K2).

LEMMA 3. Assume (K1) and (K2) and that $f$ is a density with strongly stable extremes. Then:

(A) $\int f \log_-(f * K_h) > -\infty$ and $\int f \log_-(f * u_h) > -\infty$ for all $h > 0$, where $\log_- = \min(\log, 0)$, $K_h = h^{-1} K(\cdot/h)$ and $u_h$ is the uniform density on $[-h, h]$.

(B) $\int f \log_- f > -\infty$.

(C) $\int f \log(f * K_h) < \int f \log f$ for all $h > 0$.

(D) For a fixed $h > 0$ and a sequence $h_n \to h$, we have $\sup_x |f * K_{h_n} - f * K_h| \to 0$ as $n \to \infty$.

(E) $\int f \log(f * K_h)$ is continuous in $h$ on $(0, \infty)$.

(F) $\lim_{h \downarrow 0} \int f \log(f * K_h) = \int f \log f$ whenever

$$\lim_{A \uparrow \infty} \liminf_{h \downarrow 0} \left( \int_A^\infty + \int_{-\infty}^{-A} \right) f \log_-(f * K_h) = 0.$$

(G) $\lim_{h \to \infty} \int f \log(f * K_h) = -\infty$.

(H) Property (F) holds when conditions (i) and (ii) below hold simultaneously:
(i) $f$ has bounded support on the right, or the right tail of $f$ is infinite and $f/(1 - F)$ is ultimately nondecreasing.
(ii) $f$ has bounded support on the left, or the left tail of $f$ is infinite and $f/F$ is ultimately nonincreasing.
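Parts (C) and (G) of Lemma 3 can be checked numerically in a simple case. The sketch below is an illustration under assumptions that are not in the paper: $f$ is taken to be the standard normal density, the kernel is the uniform density $u_h$ on $[-h, h]$, so that $f * u_h(x) = (F(x+h) - F(x-h))/(2h)$, and the integrals are approximated by the trapezoidal rule on $[-8, 8]$. A finite computation of this kind suggests the inequalities but proves nothing.

```python
import math


def normal_pdf(x):
    """Standard normal density f(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_sf(z):
    """Upper tail probability 1 - F(z) of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def smoothed_density(x, h):
    """(f * u_h)(x) = (F(x + h) - F(x - h)) / (2 h) for the standard normal f."""
    a = abs(x)  # f * u_h is symmetric; working with |x| avoids cancellation
    return (normal_sf(a - h) - normal_sf(a + h)) / (2.0 * h)


def integral_f_log(g, lo=-8.0, hi=8.0, steps=16000):
    """Trapezoidal approximation of the integral of f(x) * log g(x) over [lo, hi]."""
    dx = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * dx
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * normal_pdf(x) * math.log(g(x))
    return total * dx


entropy_term = integral_f_log(normal_pdf)  # approximates the integral of f log f
for h in (0.5, 1.0, 2.0, 5.0):
    value = integral_f_log(lambda x, h=h: smoothed_density(x, h))
    print(h, value, value < entropy_term)
```

For each $h$ the smoothed integral stays below $\int f \log f$, in line with (C), and it keeps decreasing as $h$ grows, in line with the divergence to $-\infty$ stated in (G).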

PROOF. If $S$ is bounded, then Lemma 3 is contained in Lemma 6.13 of DG (1985). So we assume without loss of generality that $F(x) < 1$ for all $x$.

(A) The first statement of (A) follows from the second one and (K2). For the second one, we will make use of the fact (see Lemma 6 in the sequel) that if $Q(u) = \inf\{x : 1 - F(x) < u\}$, then, for any $C > 1$,

$$Q(1/(Cx)) - Q(1/x) \to 0 \quad \text{as } x \to \infty. \tag{3.2}$$

Let $A = kh$, where $k \ge 1$ is an arbitrary integer. By partitioning the interval $[-A, A]$ in $2k$ disjoint intervals of length $h$ having probabilities $p_i$, $i = 1, \ldots, 2k$, we see that

$$\int_{-A}^{A} f \log(f * u_h) \ge \sum_{i=1}^{2k} p_i \log\left(\frac{p_i}{2h}\right) \ge -\frac{2k}{e} - \log(2h) > -\infty, \tag{3.3}$$

where we have used the fact that $\inf_{x > 0} x \log x = -1/e$. Choose by (3.2) $m \ge 1$ so large that $Q(2^{-i-1}) - Q(2^{-i}) < h$ for all $i \ge m$, where $m \ge 1$ is such that $Q(2^{-m}) \le A$. Then

$$\int_A^\infty f \log(f * u_h) = \int_A^\infty f(x) \log\left(\frac{F(x+h) - F(x-h)}{2h}\right) dx \ge \sum_{i=m}^{\infty} \int_{Q(2^{-i})}^{Q(2^{-i-1})} f(x) \log_-\bigl(F(x+h) - F(x-h)\bigr)\,dx - \log(2h) \int_A^\infty f(x)\,dx.$$

Since $F(x+h) - F(x-h) \ge F(Q(2^{-i-1})) - F(Q(2^{-i})) = 2^{-i-1}$ for $Q(2^{-i}) \le x < Q(2^{-i-1})$, this last expression is greater than or equal to

$$\sum_{i=m}^{\infty} 2^{-i-1} \log_- 2^{-i-1} - \log(2h) = -\sum_{i=m}^{\infty} \frac{(i+1) \log 2}{2^{i+1}} - \log(2h) > -\infty.$$

This, jointly with (3.3) and a similar argument used in the lower tail, completes the proof of (A).

(B), (C) By Jensen's inequality, $\int_S f \log\bigl((f * K_h)/f\bigr) \le \log\bigl(\int_S (f/f)\, f * K_h\bigr) = \log\bigl(\int_S f * K_h\bigr)$. We are done if $\int_S f * K_h < 1$, so assume that $\int_S f * K_h = 1$. Since equality in Jensen's inequality occurs if $(f * K_h)/f = \int_S f * K_h$ a.e. in $S$, we must have $f * K_h(x)/f(x) = 1$ whenever $f * K_h(x) > 0$. Thus $\int |f - f * K_h| = 0$, which by the arguments used in the proof of Lemma 1 implies $h = 0$, a contradiction. In view of (A) and of the inequality (C) so obtained, we have $\int f \log f > \int f \log(f * K_h) \ge \int f \log_-(f * K_h) > -\infty$, and hence $\int f \log_- f > -\infty$. This establishes (B).

(D) By (K2), and using the same "o(1)" convention as in Lemma 1 in DG (1985), page 156,

$$|f * (K_{h_n} - K_h)| \le C \int |K_{h_n} - K_h| + \sup_x \bigl(K_{h_n}(x) + K_h(x)\bigr) \int_{f > C} f \le o(1) + \frac{2M + o(1)}{h} \int_{f > C} f,$$

where $C > 0$ is an arbitrary fixed constant. (D) follows from the fact that we can make the last term in the bound above as small as desired.

(E) This part follows from Lebesgue's dominated convergence theorem if, for some $0 < \delta < 1$,

$$\int f \log_+\Bigl(\sup_{v \in H} f * K_v\Bigr) < \infty \quad \text{and} \quad \int f \log_-\Bigl(\inf_{v \in H} f * K_v\Bigr) > -\infty, \tag{3.4}$$

where $\log_+ = \max(\log, 0)$ and $H = [h(1 - \delta), h(1 + \delta)]$. By (K2), routine arguments show that

(3.5) $a\, f * u_b \le \inf_{v \in H} f * K_v \le$ ... $\ge \int f \liminf_{h \to \infty} \bigl(-\log_-(f * K_h)\bigr) = \infty$.

(H) We limit ourselves to show that $\lim_{A \uparrow \infty} \liminf_{h \downarrow 0} \int_A^\infty f \log_-(f * K_h) = 0$ under (i). A similar proof holds for $\int_{-\infty}^{-A}$ under (ii). The case of bounded support is again proved in DG (1985). In the second case, we have, for $A$ large enough and $y \ge A - 1$, $1 - F(y) < f(y)/C$ for some constant $C$. Also, we can choose $A$ such that $(1 - F)/f$ is nonincreasing above $A - 1$. Thus by (K2),

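$$\int_A^\infty f \log(f * K_h) \ge \int_A^\infty f \log\left(\frac{m r h C (1 - F(x))}{2h}\right) = \log\left(\frac{mrC}{2}\right) \int_A^\infty f + \int_A^\infty f \log(1 - F),$$

for $0 <$ ... $> 0$ there exists an $A$ large enough so that

$$\inf_{h \in I} \int_A^\infty f \log(f * K_h) > -\varepsilon \quad \text{and} \quad \sup_{h \in I} \int_A^\infty f \log(f * K_h) < \varepsilon \tag{3.9}$$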




together with

$$\limsup_{n \to \infty} \sup_{h \in I} L_{n2}(h) < \varepsilon \quad \text{and} \quad \liminf_{n \to \infty} \inf_{h \in I} L_{n2}(h) > -\varepsilon \quad \text{a.s.} \tag{3.10}$$

In the first place, (3.9) follows from parts (A) and (E) of Lemma 3, jointly with (3.5) (in which we replace $H$ by suitable intervals in terms of $c_1$ and $c_2$). Next, we note by (K2) (recall that $R = 1$) that $f_{n,h}^{(j)}(X_j) \le M N_j/(n c_1)$, where $N_j$ is the number of points (not including $X_j$) falling in $[X_j - c_2, X_j + c_2]$ and $h \in I$. Since $N_j \le n$, we have

$$L_{n2}(h) \le \frac{1}{n} \sum_{j=1}^{n} 1_{[A,\infty)}(X_j) \log\left(\frac{M N_j}{n c_1}\right) \le \frac{1}{n} \log\left(\frac{M}{c_1}\right) \sum_{j=1}^{n} 1_{[A,\infty)}(X_j) \quad \text{for all } h \in I, \tag{3.11}$$

which in turn can be made almost surely less than $\varepsilon$ in the upper tail if we choose $A$ in such a way that $\int_A^\infty f \le \tfrac{1}{2} \varepsilon / \log(M/c_1)$. This proves the first statement in (3.10). To complete our proof, assume without loss of generality that $r = 1$ in (K2) ($R$ being now arbitrary). By a similar argument as used for (3.11) we have

$$L_{n2}(h) \ge \frac{1}{n} \sum_{j=1}^{n} 1_{[A,\infty)}(X_j) \log\left(\frac{m N_j'}{n c_2}\right). \tag{3.12}$$

By choosing $A$ in such a way that $\int_A^\infty f \le \tfrac{1}{2} \varepsilon / \log(m/c_2)$, we see that all we need is to prove that

$$\liminf_{n \to \infty} \frac{1}{n} \sum_{j=1}^{n} 1_{[A,\infty)}(X_j) \log\left(\frac{N_j'}{n}\right) \ge -\tfrac{1}{2} \varepsilon \quad \text{a.s.} \tag{3.13}$$

In view of (3.13) and (3.12), the proof of (3.10) completes the proof of Lemma 4. $\square$

Observe that in the proofs of Lemmas 3 and 4, we have used the stability of extremes in (3.2) and (3.13). In the remainder of this section, we prove these two statements.

Our next lemma captures some useful properties of distributions with stable extremes. Its proof follows from routine Karamata-type representations used jointly with characterizations such as given in de Haan and Hordijk (1972) and Deheuvels (1984). We omit details [see also Seneta (1975), Barndorff-Nielsen (1963) and Geffroy (1958)].

LEMMA 5. Let $Q(u) = \inf\{x : 1 - F(x) < u\}$ ... $X_{n,n}$ is equivalent to: ... $> 0$ is a constant and $\varepsilon(s)$ is a continuous nonnegative function with limit zero as $s \to \infty$. Assume further that $f$ is a density such that $f/(1 - F)$ is ultimately monotone and that $F(x) < 1$ for all $x$. Then if $X_{n,n}$ is strongly stable, $\lim_{x \to \infty} f(x)/(1 - F(x)) = \infty$, and:

(B) $Q$ can be represented in a right neighborhood of zero by

$$Q(u) = \theta(u) + \int_c^{1/u} \frac{\alpha(s)}{s \log\log s}\,ds, \tag{3.15}$$

where $\theta(u)$ is bounded with finite limit $\theta_0$ as $u \downarrow 0$, $c > e$ is a constant and $\alpha(s)$ is a continuous nonnegative function with limit zero as $s \to \infty$. Conversely, if (B) holds, then $X_{n,n}$ is strongly stable.

REMARK 1. Since $Q'(u) = -1/f(Q(u))$, the change of variable $u = 1 - F(x)$ used jointly with (3.14) leads to the sufficient condition for stability of $X_{n,n}$ [Geffroy (1958)],

$$\lim_{x \to \infty} \frac{1 - F(x)}{f(x)} = 0. \tag{3.16}$$

Likewise, (3.15) gives the sufficient condition for strong stability of $X_{n,n}$ [de Haan and Hordijk (1972)],

$$\lim_{x \to \infty} \frac{1 - F(x)}{f(x)} \log\log\left(\frac{1}{1 - F(x)}\right) = 0. \tag{3.17}$$

Using (3.17) it is easily verified that the normal distributions have strongly stable extremes. Moreover (3.16) motivates the monotone-failure-rate-type assumptions in Theorem 1.

PROOF OF (3.2). This statement follows directly from (3.14). $\square$
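The claim in Remark 1 that the normal distribution satisfies (3.16) and (3.17) can be explored numerically. The following sketch is only an illustration: the function names and the evaluation points are choices made for this example, and values at finitely many $x$ can suggest, but not establish, that the limits are zero.

```python
import math


def normal_pdf(x):
    """Standard normal density f(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_sf(x):
    """Upper tail probability 1 - F(x) of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def stability_ratio(x):
    """(1 - F(x)) / f(x), the quantity whose limit appears in (3.16)."""
    return normal_sf(x) / normal_pdf(x)


def strong_stability_ratio(x):
    """((1 - F(x)) / f(x)) * log log(1 / (1 - F(x))), the quantity in (3.17)."""
    tail = normal_sf(x)
    return (tail / normal_pdf(x)) * math.log(math.log(1.0 / tail))


for x in (2.0, 5.0, 10.0, 20.0):
    print(x, stability_ratio(x), strong_stability_ratio(x))
```

Both columns decrease along these points. The second one decreases slowly, roughly like $2 \log x / x$, because $(1 - F(x))/f(x)$ behaves like $1/x$ for the normal distribution while the double logarithm grows like $2 \log x$.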

PROOF OF (3.13). We make use of the representation in (3.15), assuming without loss of generality that $Q(1) = 0$, $Q(0) = \infty$ and that for $i = 1, \ldots, n$, $X_{n-i+1,n} = Q(U_{i,n})$, where $U_{1,n} < \cdots < U_{n,n}$ are the order statistics of i.i.d. uniform $(0,1)$ random variables with empirical distribution function $U_n(x) = n^{-1} \#\{1 \le i$ ... $> 0$ and let $p(u) = 1 - F(Q(u) - t)$ and $a(u) = 1 - F(Q(u) + t)$ for $0 < u < 1$. If (3.14) holds, then for any $0 <$ ... $< 1$, there exists a $u_0 > 0$ such that for all $0 < u < u_0$, (3.18) $p(u) < u$