The Annals of Statistics 1989, Vol. 17, No. 3, 1070-1086
ON THE RELATIONSHIP BETWEEN STABILITY OF EXTREME ORDER STATISTICS AND CONVERGENCE OF THE MAXIMUM LIKELIHOOD KERNEL DENSITY ESTIMATE

BY MICHEL BRONIATOWSKI, PAUL DEHEUVELS AND LUC DEVROYE

Université Pierre et Marie Curie, Université Pierre et Marie Curie and McGill University

Let $f$ be a density on the real line and let $f_n$ be the kernel estimate of $f$ in which the smoothing factor is obtained by maximizing the cross-validated likelihood product according to the method of Duin and Habbema, Hermans and Vandenbroek. Under mild regularity conditions on the kernel and $f$, we show, among other things, that $\int |f_n - f| \to 0$ almost surely if and only if the sample extremes of $f$ are strongly stable.
1. Introduction. Let $X_1, X_2, \ldots$ be an i.i.d. sequence of random variables with distribution function $F$ and density $f$ and consider the kernel density estimate [Parzen (1962) and Rosenblatt (1956)]

(1.1) $f_{n,h}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - X_i}{h}\right)$,

where the kernel $K$ is a nonnegative function such that

(K1) $\int K(t)\,dt = 1$

and

(K2) for some $r > 0$, $R > 0$, $m > 0$ and $M > 0$, $m 1_{[-r,r]}(x) \le K(x) \le M 1_{[-R,R]}(x)$ for all $-\infty < x < \infty$.
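As an illustration of (1.1), the following minimal sketch evaluates the kernel estimate for one particular kernel. The uniform kernel on $[-1, 1]$ is chosen here only for illustration (it satisfies (K1) and (K2) with $r = R = 1$ and $m = M = 1/2$); the function names and the toy sample are hypothetical and do not come from the paper.

```python
def uniform_kernel(t):
    """K = (1/2) * indicator of [-1, 1]; an illustrative kernel with r = R = 1, m = M = 1/2."""
    return 0.5 if -1.0 <= t <= 1.0 else 0.0


def kernel_estimate(x, sample, h, kernel=uniform_kernel):
    """Evaluate f_{n,h}(x) = (1 / (n h)) * sum_i K((x - X_i) / h)."""
    n = len(sample)
    return sum(kernel((x - xi) / h) for xi in sample) / (n * h)


# Toy data (hypothetical): three of the four points lie within h = 1 of x = 0.25.
sample = [0.0, 0.5, 1.0, 3.0]
value = kernel_estimate(0.25, sample, h=1.0)  # 3 * 0.5 / (4 * 1.0)
```

Because this kernel has compact support, the estimate is exactly zero at any point farther than $h$ from every observation, which is the feature that ties the choice of $h$ to the spacings of the sample.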
Inspired by maximum likelihood theory, Habbema, Hermans and Vandenbroek (1974) and Duin (1976) proposed selecting the smoothing factor $h > 0$ by maximizing

(1.2) $L_n(h) = \frac{1}{n} \sum_{j=1}^{n} \log f_{n,h}^{(j)}(X_j)$,

where, for $j = 1, \ldots, n$ and $n \ge 2$,

(1.3) $f_{n,h}^{(j)}(x) = \frac{1}{nh} \sum_{1 \le i \ne j \le n} K\left(\frac{x - X_i}{h}\right)$.

$\ldots$ $0$ as $N \to \infty$, where $D_N = \max_i \min_{j \ne i} |X_i - X_j|$, with the $X_i$'s and $X_j$'s restricted to $[-A, A]$. This, jointly with the inequalities $D_n \ge D_n' = D_N$, implies (3.1) as sought. $\Box$
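The cross-validated criterion (1.2)-(1.3) and the maximal nearest-neighbor spacing $\max_i \min_{j \ne i} |X_i - X_j|$ can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the kernel is the uniform kernel on $[-1, 1]$, the leave-one-out estimate is normalized by $nh$ (using $(n-1)h$ instead shifts $L_n(h)$ by a constant and leaves the maximizer unchanged), the maximization is a plain grid search, and all function names and data are hypothetical.

```python
import math


def uniform_kernel(t):
    """Illustrative kernel: K = (1/2) * indicator of [-1, 1]."""
    return 0.5 if -1.0 <= t <= 1.0 else 0.0


def loo_log_likelihood(sample, h, kernel=uniform_kernel):
    """L_n(h) = (1/n) * sum_j log f^{(j)}_{n,h}(X_j); returns -inf if any term vanishes."""
    n = len(sample)
    total = 0.0
    for j, xj in enumerate(sample):
        s = sum(kernel((xj - xi) / h) for i, xi in enumerate(sample) if i != j)
        if s <= 0.0:
            return -math.inf  # X_j has no other observation inside the kernel window
        total += math.log(s / (n * h))
    return total / n


def max_nn_distance(sample):
    """Largest nearest-neighbor distance: max_i min_{j != i} |X_i - X_j|."""
    return max(
        min(abs(xi - xj) for j, xj in enumerate(sample) if j != i)
        for i, xi in enumerate(sample)
    )


def select_h(sample, grid):
    """Grid-search maximizer of the cross-validated log likelihood."""
    return max(grid, key=lambda h: loo_log_likelihood(sample, h))


# Toy data (hypothetical) with one isolated point at 2.0.
sample = [0.0, 0.1, 0.25, 0.4, 2.0]
d_n = max_nn_distance(sample)              # about 1.6, due to the isolated point
grid = [0.05 * k for k in range(1, 81)]    # candidate smoothing factors in (0, 4]
h_cv = select_h(sample, grid)
```

For this compactly supported kernel, $L_n(h) = -\infty$ whenever $h$ is smaller than the largest nearest-neighbor distance, so a single isolated observation forces the selected smoothing factor to be large. This is the mechanism through which the behavior of the sample extremes enters the analysis.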
THEOREM 4. Assume that $K$ satisfies (K1) and (K2). Then, for any density $f$ such that the sample extremes $X_{1,n}$ and $X_{n,n}$ are strongly stable together with $(1 - F)/f$ being monotone in the upper tail and $F/f$ monotone in the lower tail, the cross-validated choice $h_n$ of $h$ satisfies $h_n \to 0$ almost surely as $n \to \infty$.

PROOF. We follow the proof of Theorem 6.4 of DG (1985) corresponding to the case where the support $S = \{x : f(x) > 0\}$ of $f$ is bounded. We use the same notation, with the exception of $T = S \cap [-A, A]$, where $A$ is a large constant. We see that:
(i) Lemmas 6.10 and 6.12 remain valid if all integrals are taken over $T$.
(ii) Lemma 6.11 remains valid without change.
(iii) Lemma 6.13 is in general false when $S$ is not compact. It is replaced by Lemma 3 below.
(iv) Lemma 6.14 is crucial to the proof. It is restated and proved in a more general setting in Lemma 4 below.
We now conclude the proof of Theorem 4 by mimicking the proof of Lemma 6.15 without change. Note that this proof requires parts (C), (E), (G) and (H) of Lemma 3, the monotonicity condition on $(1 - F)/f$ and $F/f$, and Lemma 4
presented below. We have to verify that property (H) of Lemma 3 can be applied. To do so, we note that the ultimate monotonicity of $(1 - F)/f$ in the right tail and the stability of $X_{n,n}$ together imply that $(1 - F)/f \to 0$ as $x \to \infty$ (see Lemma 5 and Remark 1 below).

The following Lemma 3 states some useful properties of the entropy needed in our proofs. In the sequel, we assume without loss of generality that $R = 1$ in (K2).

LEMMA 3. Assume (K1) and (K2) and that $f$ is a density with strongly stable extremes. Then:
(A) $\int f \log_-(f * K_h) > -\infty$ and $\int f \log_-(f * u_h) > -\infty$ for all $h > 0$, where $\log_- = \min(\log, 0)$, $K_h = h^{-1} K(\cdot/h)$ and $u_h$ is the uniform density on $[-h, h]$.
(B) $\int f \log_- f > -\infty$.
(C) $\int f \log(f * K_h) < \int f \log f$ for all $h > 0$.
(D) For a fixed $h > 0$ and a sequence $h_n \to h$, we have $\sup_x |f * K_{h_n}(x) - f * K_h(x)| \to 0$ as $n \to \infty$.
(E) $\int f \log(f * K_h)$ is continuous in $h$ on $(0, \infty)$.
(F) $\lim_{h \downarrow 0} \int f \log(f * K_h) = \int f \log f$ whenever
$\lim_{A \uparrow \infty} \liminf_{h \to 0} \left(\int_A^\infty + \int_{-\infty}^{-A}\right) f \log_-(f * K_h) = 0$.
(G) $\lim_{h \to \infty} \int f \log(f * K_h) = -\infty$.
(H) Property (F) holds when conditions (i) and (ii) below hold simultaneously:
(i) $f$ has bounded support on the right, or the right tail of $f$ is infinite and $f/(1 - F)$ is ultimately nondecreasing.
(ii) $f$ has bounded support on the left, or the left tail of $f$ is infinite and $f/F$ is ultimately nonincreasing.
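Parts (C), (F) and (G) can be checked numerically in a simple special case. The sketch below is only an illustration under assumptions that are not in the paper: $f$ is the standard normal density, the kernel is the uniform density $u_h$ on $[-h, h]$ (so that $f * u_h(x) = (F(x+h) - F(x-h))/(2h)$), and the integral is approximated by a trapezoidal rule on $[-8, 8]$. All function names are hypothetical.

```python
import math


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def normal_sf(x):
    """Survival function 1 - Phi(x), computed with erfc for accuracy in the upper tail."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def smoothed_density(x, h):
    """(f * u_h)(x) = (F(x + h) - F(x - h)) / (2h), written with the survival function and symmetry."""
    a = abs(x)
    return (normal_sf(a - h) - normal_sf(a + h)) / (2.0 * h)


def entropy_functional(h, limit=8.0, step=0.01):
    """Trapezoidal approximation of the integral of f * log(f * u_h) over [-limit, limit]."""
    n = int(round(2.0 * limit / step))
    total = 0.0
    for k in range(n + 1):
        x = -limit + k * step
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * normal_pdf(x) * math.log(smoothed_density(x, h))
    return total * step


# Exact value of the integral of f log f for the standard normal density.
neg_entropy = -0.5 * math.log(2.0 * math.pi * math.e)

values = {h: entropy_functional(h) for h in (0.05, 0.5, 1.0, 5.0, 50.0)}
```

In this example the computed values stay below `neg_entropy` as in (C), approach it for small $h$ as in (F), and decrease roughly like $-\log(2h)$ for large $h$, in line with (G).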
PROOF. If $S$ is bounded, then Lemma 3 is contained in Lemma 6.13 of DG (1985). So we assume without loss of generality that $F(x) < 1$ for all $x$.

(A) The first statement of (A) follows from the second one and (K2). For the second one, we will make use of the fact (see Lemma 6 in the sequel) that if $Q(u) = \inf\{x : 1 - F(x) \le u\}$, then, for any $C > 1$,

(3.2) $Q(1/(Cx)) - Q(1/x) \to 0$ as $x \to \infty$.

Let $A = kh$, where $k \ge 1$ is an arbitrary integer. By partitioning the interval $[-A, A]$ in $2k$ disjoint intervals of length $h$ having probabilities $p_i$, $i = 1, \ldots, 2k$, we see that

(3.3) $\int_{-A}^{A} f \log(f * u_h) \ge \sum_{i=1}^{2k} p_i \log\left(\frac{p_i}{2h}\right) \ge -\frac{2k}{e} - \log(2h) > -\infty$,

where we have used the fact that $\inf_{x > 0} x \log x = -1/e$. Choose, by (3.2), $m \ge 1$ so large that $Q(2^{-i-1}) - Q(2^{-i}) \le h$ for all $i \ge m$, where $m \ge 1$ is such that
$Q(2^{-m}) \le A$:

$\int_A^\infty f \log(f * u_h) = \int_A^\infty f(x) \log\left(\frac{F(x+h) - F(x-h)}{2h}\right) dx \ge \sum_{i=m}^{\infty} \int_{Q(2^{-i})}^{Q(2^{-i-1})} f(x) \log_-(F(x+h) - F(x-h))\,dx - \log(2h) \int_A^\infty f(x)\,dx$.

Since $F(x+h) - F(x-h) \ge F(Q(2^{-i-1})) - F(Q(2^{-i})) = 2^{-i-1}$ for $Q(2^{-i}) \le x < Q(2^{-i-1})$, this last expression is greater than or equal to

$\sum_{i=m}^{\infty} 2^{-i-1} \log_- 2^{-i-1} - \log(2h) = -\sum_{i=m}^{\infty} \frac{(i+1) \log 2}{2^{i+1}} - \log(2h) > -\infty$.

This, jointly with (3.3) and a similar argument used in the lower tail, completes the proof of (A).

(B), (C) By Jensen's inequality, $\int_S f \log((1/f)\, f * K_h) \le \log(\int_S (f/f)\, f * K_h) = \log(\int_S f * K_h)$. We are done if $\int_S f * K_h < 1$, so assume that $\int_S f * K_h = 1$. Since equality in Jensen's inequality occurs if $(f * K_h)/f = \int_S f * K_h$ a.e. in $S$, we must have $f(x) = f * K_h(x)$ whenever $f * K_h(x) > 0$. Thus $\int |f - f * K_h| = 0$, which by the arguments used in the proof of Lemma 1 implies $h = 0$, a contradiction. In view of (A) and of the inequality (C) so obtained, we have $\int f \log f > \int f \log(f * K_h) \ge \int f \log_-(f * K_h) > -\infty$, and hence $\int f \log_- f > -\infty$. This establishes (B).

(D) By (K2), and using the same "o(1)" convention as in Lemma 1 in DG (1985), page 156,
$|f * (K_{h_n} - K_h)| \le C \int |K_{h_n} - K_h| + \sup_x (K_{h_n}(x) + K_h(x)) \int_{f > C} f \le o(1) + \frac{2M + o(1)}{h} \int_{f > C} f$,

where $C > 0$ is an arbitrary fixed constant. (D) follows from the fact that we can make the last term in the bound above as small as desired.

(E) This part follows from Lebesgue's dominated convergence theorem if, for some $0 < \delta < 1$,

(3.4) $\int f \log_+\left(\sup_{v \in H} f * K_v\right) < \infty$ and $\int f \log_-\left(\inf_{v \in H} f * K_v\right) > -\infty$,

where $\log_+ = \max(\log, 0)$ and $H = [h(1 - \delta), h(1 + \delta)]$. By (K2), routine arguments show that

(3.5) $a\, f * u_b \le \inf_{v \in H} f * K_v \le \ldots$

$\ldots$ $\int f \liminf_{h \to \infty} (-\log_-(f * K_h)) = \infty$.

(H) We limit ourselves to show that $\lim_{A \uparrow \infty} \liminf_{h \to 0} \int_A^\infty f \log_-(f * K_h) = 0$ under (i). A similar proof holds for $\int_{-\infty}^{-A}$ under (ii). The case of bounded support is again proved in DG (1985). In the second case, we have, for $A$ large enough and $y \ge A - 1$, $1 - F(y) < f(y)/C$ for some constant $C$. Also, we can choose $A$ such that $(1 - F)/f$ is nonincreasing above $A - 1$. Thus by (K2),
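$\int_A^\infty f \log(f * K_h) \ge \int_A^\infty f(x) \log\left(\frac{m r h C (1 - F(x))}{2h}\right) dx = \log\left(\frac{mrC}{2}\right) \int_A^\infty f + \int_A^\infty f \log(1 - F)$, for $0 < \ldots$

$\ldots$ $> 0$ there exists an $A$ large enough so that

(3.9) $\inf_{h \in I} \int_A^\infty f \log(f * K_h) > -\varepsilon$ and $\sup_{h \in I} \int_A^\infty f \log(f * K_h) < \varepsilon$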
together with

(3.10) $\limsup_{n \to \infty} \sup_{h \in I} L_{n2}(h) < \varepsilon$ and $\liminf_{n \to \infty} \inf_{h \in I} L_{n2}(h) > -\varepsilon$ a.s.
In the first place, (3.9) follows from parts (A) and (E) of Lemma 3, jointly with (3.5) (in which we replace $H$ by suitable intervals in terms of $c_1$ and $c_2$). Next, we note by (K2) (recall that $R = 1$) that $f_{n,h}^{(j)}(X_j) \le (M N_j)/(n c_1)$, where $N_j$ is the number of points (not including $X_j$) falling in $[X_j - c_2, X_j + c_2]$ and $h \in I$. Since $N_j < n$, we have

(3.11) $L_{n2}(h) \le \frac{1}{n} \sum_{j=1}^{n} 1_{[A,\infty)}(X_j) \log\left(\frac{M N_j}{n c_1}\right) \le \log\left(\frac{M}{c_1}\right) \frac{1}{n} \sum_{j=1}^{n} 1_{[A,\infty)}(X_j)$ for all $h \in I$,
which in turn can be made almost surely less than $\varepsilon$ in the upper tail if we choose $A$ in such a way that $\int_A^\infty f \le \frac{1}{2} \varepsilon / \log(M/c_1)$. This proves the first statement in (3.10). To complete our proof, assume without loss of generality that $r = 1$ in (K2) ($R$ being now arbitrary). By a similar argument as used for (3.11) we have

(3.12) $L_{n2}(h) \ge \frac{1}{n} \sum_{j=1}^{n} 1_{[A,\infty)}(X_j) \log\left(\frac{m N_j'}{n c_2}\right)$ for all $h \in I$.

By choosing $A$ in such a way that $\int_A^\infty f \le \frac{1}{2} \varepsilon / \log(m/c_2)$, we see that all we need is to prove that

(3.13) $\liminf_{n \to \infty} \frac{1}{n} \sum_{j=1}^{n} 1_{[A,\infty)}(X_j) \log\left(\frac{N_j'}{n}\right) \ge -\frac{1}{2} \varepsilon$ a.s.
In view of (3.13) and (3.12), the proof of (3.10) completes the proof of Lemma 4. $\Box$

Observe that in the proofs of Lemmas 3 and 4, we have used the stability of extremes in (3.2) and (3.13). In the remainder of this section, we prove these two statements.
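The quantile condition (3.2) can be illustrated numerically. The sketch below reads $Q$ as the upper-tail quantile function, $Q(u) = \inf\{x : 1 - F(x) \le u\}$, and compares two examples chosen here for illustration only: the standard normal distribution, for which the gap $Q(1/(Cx)) - Q(1/x)$ shrinks as $x$ grows, and the standard exponential distribution, whose maximum is not stable and for which the gap equals $\log C$ for every $x$. The function names and the choice $C = 2$ are hypothetical.

```python
import math
from statistics import NormalDist


def q_normal(u):
    """Upper-tail quantile of N(0,1): the x with 1 - Phi(x) = u, obtained by symmetry."""
    return -NormalDist().inv_cdf(u)


def q_exponential(u):
    """Upper-tail quantile of the standard exponential, where 1 - F(x) = exp(-x)."""
    return math.log(1.0 / u)


def quantile_gap(q, x, c=2.0):
    """Q(1/(c x)) - Q(1/x), the quantity appearing in (3.2)."""
    return q(1.0 / (c * x)) - q(1.0 / x)


xs = [1e2, 1e4, 1e8, 1e16, 1e32]
normal_gaps = [quantile_gap(q_normal, x) for x in xs]
exponential_gaps = [quantile_gap(q_exponential, x) for x in xs]
```

The normal gaps decrease along `xs`, while the exponential gaps stay at $\log 2$, so the exponential tail fails (3.2) even though it is light-tailed in the usual sense.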
Our next lemma captures some useful properties of distributions with stable extremes. Its proof follows from routine Karamata-type representations used jointly with characterizations such as given in de Haan and Hordijk (1972) and Deheuvels (1984). We omit details [see also Seneta (1975), Barndorff-Nielsen (1963) and Geffroy (1958)].

LEMMA 5. Let $Q(u) = \inf\{x : 1 - F(x) \le u\}$ $\ldots$ $X_{n,n}$ is equivalent to: $\ldots$ $> 0$ is a constant and $\varepsilon(s)$ is a continuous nonnegative function with limit zero as $s \to \infty$. Assume further that $f$ is a density such that $f/(1 - F)$ is ultimately monotone and that $F(x) < 1$ for all $x$. Then, if $X_{n,n}$ is strongly stable, $\lim_{x \to \infty} f(x)/(1 - F(x)) = \infty$, and:
(B) $Q$ can be represented in a right neighborhood of zero by

(3.15) $Q(u) = \theta(u) + \int_C^{1/u} \frac{\alpha(s)}{s \log\log s}\,ds$,

where $\theta(u)$ is bounded with finite limit $\theta_0$ as $u \downarrow 0$, $C > e$ is a constant and $\alpha(s)$ is a continuous nonnegative function with limit zero as $s \to \infty$. Conversely, if (B) holds, then $X_{n,n}$ is strongly stable.

REMARK 1. Since $Q'(u) = -1/f(Q(u))$, the change of variable $u = 1 - F(x)$ used jointly with (3.14) leads to the sufficient condition for stability of $X_{n,n}$ [Geffroy (1958)],

(3.16) $\lim_{x \to \infty} \frac{1 - F(x)}{f(x)} = 0$.

Likewise, (3.15) gives the sufficient condition for strong stability of $X_{n,n}$ [de Haan and Hordijk (1972)],

(3.17) $\lim_{x \to \infty} \frac{1 - F(x)}{f(x)} \log\log\frac{1}{1 - F(x)} = 0$.

Using (3.17) it is easily verified that the normal distributions have strongly stable extremes. Moreover (3.16) motivates the monotone-failure-rate-type assumptions in Theorem 1.

PROOF OF (3.2).
This statement follows directly from (3.14). $\Box$
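The claim in Remark 1 that the normal distributions satisfy (3.17) can be made concrete with a short numerical check. The sketch below is an illustration only, for the standard normal distribution: it evaluates the ratio $(1 - F(x))/f(x)$ from (3.16) and the product in (3.17) at a few points. The function names and the evaluation points are hypothetical choices, and a finite table of values is of course not a proof of the limit.

```python
import math


def normal_sf(x):
    """1 - Phi(x) for the standard normal, via erfc for accuracy in the far tail."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def stability_ratio(x):
    """(1 - F(x)) / f(x), the quantity in (3.16)."""
    return normal_sf(x) / normal_pdf(x)


def strong_stability_ratio(x):
    """((1 - F(x)) / f(x)) * log log (1 / (1 - F(x))), the quantity in (3.17)."""
    sf = normal_sf(x)
    return (sf / normal_pdf(x)) * math.log(math.log(1.0 / sf))


xs = [4.0, 8.0, 16.0, 30.0]
mills = [stability_ratio(x) for x in xs]
ratios = [strong_stability_ratio(x) for x in xs]
```

For the normal distribution the first ratio behaves like $1/x$ and the iterated logarithm grows like $\log(x^2/2)$, so the product in (3.17) decays like $2 \log x / x$, which is consistent with the slowly decreasing values in `ratios`.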
PROOF OF (3.13). We make use of the representation in (3.15), assuming without loss of generality that $Q(1) = 0$, $Q(0) = \infty$ and that for $i = 1, \ldots, n$, $X_{n-i+1,n} = Q(U_{i,n})$, where $U_{1,n} < \cdots < U_{n,n}$ are the order statistics of i.i.d. uniform $(0,1)$ random variables with empirical distribution function $U_n(x) = n^{-1} \#\{1 \le i$ $\ldots$ $> 0$ and let $p(u) = 1 - F(Q(u) - t)$ and $a(u) = 1 - F(Q(u) + t)$ for $0 < u < 1$. If (3.14) holds, then for any $0 < \ldots < 1$, there exists a $u_0 > 0$ such that for all $0 < u < u_0$, (3.18) $p(u) < u$