Non Parametric Maximum Likelihood Estimation of Features in Spatial ...

Report 2 Downloads 62 Views
Non Parametric Maximum Likelihood Estimation of Features in Spatial Point Processes Using Vorono Tessellation Denis Allard and Chris Fraley December 3, 1996

Abstract

This paper addresses the problem of estimating the support domain of a bounded point process in presence of background noise. This situation occurs for example in the detection of a mineeld from aerial observations. A Maximum Likelihood Estimator for a mixture of uniform point processes is derived using a natural partition of the space dened by the data themselves: the Vorono tessellation. The methodology is tested on simulations and compared to a model-based clustering technique.

KEY WORDS: Mixture Poisson process Uniform distribution Boundary Estimation Mineeld detection Sesimic faults.

1 Introduction The problem of contouring clusters of points belonging to a spatial point process in the presence of background noise arises in many practical situations. A typical example is the detection of a mineeld from aerial observations. An image returned by an electronic device contains putative mine locations and is corrupted by clutter. The objectives are to nd the boundaries Denis Allard is Researcher at the Institut National de la Recherche Agronomique, Unite de Biometrie, INRA Saint-Paul, Site Agroparc, 84914 Avignon, France. This work was done while he was Acting Assistant Professor at the Department of Statistics, University of Washington, and was supported in part by the O ce of Naval Research under Contract N00014-91-J-1074. Chris Fraley is a Research Associate at the Department of Statistics, University of Washington, Seattle, WA 98195. Her work was supported entirely by the O ce of Naval Research under contracts N00014-96-1-0192 and N00014-96-1-0330. The authors are grateful to Adrian Raftery, Julian Besag and Christian Lantuejoul for helpful comments and productive discussions, and to the editor, an associate editor and two anonymous referees for constructive criticism that led to considerable improvements. The Vorono tessellation procedure was provided by Christian Lantuejoul from the Centre de Geostatistique, Fontainebleau, France.

1

of the mineeld, to estimate the number of mines inside the mineeld, and to delineate a safe area. Typically, some technique for density estimation is applied, followed by thresholding of the density contours to dene the clusters, see for example Silverman (1986). A di erent approach, which has also been proposed in the context of mineeld detection, is that of model-based clustering. Here the models are usually mixtures of Gaussian densities, although other densities may be included to model noise see for example Baneld and Raftery (1993) and Celeux and Govaert (1992). This idea has several limitations: a point process is rarely a Gaussian mixture and model-based clustering is sensitive to departures from the model: highly nonlinear features are given a piecewise linear representation in this framework, which can often lead to suboptimal classications. In this paper, a nonparametric approach is proposed to the contouring problem, with particular attention to specic requirements in mineeld detection. We make the following assumptions: 1. The problem of testing for the presence of two populations is not addressed here since many standard methods are applicable, see for example Diggle (1983), Cressie (1993) and the references cited there. It will therefore be assumed that the data are known to contain two populations, and our focus will be on the estimation of the domain of the relevant point process in the presence of background noise. 2. The basic model is a mixture of two uniform point processes, with the support domain of one being included in the support domain of the other. An unusual and important feature of this model is that its likelihood allows direct estimation of the boundaries. In the univariate case, Cherno and Rubin (1954) solved this problem completely as a special instance of the estimation of a discontinuity in a density. They proved the consistency of the estimator and showed that the estimates di er from the optimal parameters by an amount which is of the order of 1=n where n is the sample size. Later, Gupta, Taneja and Janardan (1986) worked out the estimation for arbitrary mixtures of uniform univariate distributions using moment estimators. The estimator proposed here is the union of Vorono polygons maximizing the likelihood of the model. The method is very general and requires no assumptions about the shape of the domain, but some geometrical constraints can be added, depending on the phenomenon under study. For example, the domain can be restricted to be connected (in one piece), without holes and with a regular boundary. Figure 1 shows a simulated arrow-headed mineeld with its estimate. Data points are depicted in Figure 1a: they are a mixture of two uniform point processes. The mineeld is in the lower part of the image. Figure 1b shows the true mineeld (interior of the solid line). Black squares are points identied as being within the mineeld, whereas small dots are those identied as being outside it. Black squares outside the solid line are false alarms, while small dots inside the mineeld are points that could possibly be mines but are not identied as such. The estimator based on Vorono tessellations is introduced in Section 2 and applied to the mineeld detection problem in Section 3. Performance and robustness are tested through a simulation study, in which we compare our method to a model-based approach via an EM 2

1.0 0.0

0.2

0.4

0.6

0.8

1.0 0.8 0.6 0.4 0.2 0.0 0.0

0.2

0.4

0.6

0.8

1.0

.. . . . . . . . . . .... . . . . . . . .. . . .... . .. . . . . . .. . . . ... . .. .. . .. . . . . . .. ... . . . .. . . . . . . . . . ... . .. . . . . . . . . . . .. .. . . . . .. .. .. . . . . ... .. .... . . .. . .. ..... . .. . . . . . . . . .. . . . . . . . . . . . . .. . . .. .. . . . . . . . . . . . . . . .. . . . . .. .. . . . .. . .. . . . . . . . . . . . .. .. . . . . . . .. . . . .. . . .. . . . . ... . . . . . . . . . . . . . . . . . . .. . . .. . .. . . . . . .. . . .. .. . ... . . . . .. . . . . . . .. . . . . ... . . . . . .. . . . . . . . . .. . .. . .. . .. . . . . . .. . . . . ... . .. . . . .. . . .. . . . .. . . . . . . . . . . .. . . . . . . ... . .. . .. . . . . . .. . . . . .. . . . . . . . . . . . .. .. . . . .. . .. . . . . . .. . . . . . . . . . . . . . .. . . . . . . . . ...... . . . . . . .. . ... . . .. .. . . ... . . . . . . .. .... . . . . . .. . .. . .. . . .. . . . . . . . .. . . .. ... ....... . ... . . .... . .. . . . . . . . . . . . . .. . . . . . . . . . . . ... .. . . . . . .. . . . . . . .. . .. ... . ... .......... ... . .... .. . .. .. . . . . . . . . . . . . . . . . . . .. . .... . .. . .. . . .. .. ... . . .. . . . . . . . . .. . ... ..... . . . .. . . . .. . ..... .. ..... ... . . . . . . . .. .. . . . ..... .... . .. .. . ... . ...... . . . . . . .. . . .. . . . . . . . . . .. . . . . .. .. .. . .. . . . . . . . . . . . .. ... . .. . . . . .. . .. .... . . ... . . .. . . .. . . . . . . .. .. . .. . .. . .. . . . . .. .. . . . . . . . . .... .. .. .. .. . . . . . . . . . . .. . .. . . . . . . . . . . .. . . ..... . . . . ... . . .. .. .. . . . . . . . . . .. .. . . . . .. . . . ... . . . . . . . . . . . . . . .. .. . . . . . . . . . . . ... . . .. .. . . . . . . . . . . . . . .. .. . .. . . .. . .. . . . . . . ... . . . . . . 0.0

(a)

0.2

0.4

0.6

0.8

1.0

(b)

Figure 1: Example of the Vorono - Maximum Likelihood Estimator: detection rate is 97%, false alarm rate is 4.5%. (a) Data points (b) solid lines indicate the true domain, black squares are points identied as mines. algorithm. In Section 4 we apply the method to a geological example involving the analysis of earthquake data. Finally, a discussion of the advantages and limitations of this method is given in Section 5, along with suggestions for its extension and future work.

2 The Maximum Likelihood Estimator 2.1 A prole likelihood

Consider a bounded domain K of size 1 in R d on which we observe a random sample fxi g from a mixture of two uniform random variables: UA with a support A  K and probability p and UK with support K and complementary probability 1 ; p. Both the support A and the mixture parameter p are unknown and are to be estimated simultaneously. The density associated with a point x 2 K is f (x) = jAp j 1A (x) + 1 ; p (1) where 1A denotes the indicator function of A (1A = 1 if x 2 A and 1A = 0 otherwise), and jAj denotes the Lebesgue measure of A (length, area, volume for d = 1 2 3 respectively). This mixture is always identiable if p > 0, which will be assumed henceforth. If p = 1, there is no 3

background noise, and the problem is the estimation of the boundary of a point process. This problem has been widely addressed, see for example Ripley and Rasson (1977). The likelihood based on the data fx1  : : :  xn g is n Y (2) L(x A p) = ( jAp j 1A (xi ) + 1 ; p) = ( jAp j + 1 ; p)#(A) (1 ; p)n;#(A) i=1 P where #(A) = ni=1 1A (xi ) is the number of points in A. We do not work with this likelihood directly, since the mixture parameter p is not known and has to be estimated along with A instead, we reduce it to a partial maximized (prole) likelihood. For a xed A, the MLE of p is p^ = (#(A) ; jAjn)=(n ; jAjn) and the partial maximized likelihood becomes  n  #(A) #(A)  n ; #(A) n;#(A) L(x A) = n1  (3) jAj 1 ; jAj which is more conveniently expressed as a log-likelihood A) : (4) `(x A) = ;n ln n + #(A) ln #(jAAj ) + (n ; #(A)) ln n1;;#( jAj In one dimension, if A  0 1] is an interval, it can be shown (Cherno and Rubin, 1954) that the MLE is an interval whose endpoints may include 0 or 1 but must otherwise be one of the data points, and that the estimator is consistent with a variance of the order of 1=n2 . In two dimensions, if A is restricted to be a rectangle parallel to the coordinate axes, the MLE is a rectangle dened by four corners chosen from among the data points and the corners dening the larger domain. This can be explained as follows: given a set of m data points contained in A, the rectangle maximizing the likelihood and containing these m points is the smallest possible, dened by the two opposite corners whose coordinates are equal to the smallest abscissa and ordinate and the largest abscissa and ordinate respectively. There are C4n+4 of those rectangles to test (where Cji denotes the number of combination of j elements from a set of i elements) the MLE of A is the rectangle among these that maximizes `. The argument can be extendedd to higher dimension: if A is a parallepiped parallel to the coordinate axes, there are C2nd+2 parallepipeds to compute. By a generalization of Cherno and Rubin's reasoning, it can be shown that in these cases the MLE is consistent and that the estimators of the bounds of the intervals dening A have a variance of the order of 1=n2 . Our goal however is to nd an estimator for A in the most general case, that is i) for arbitrary spatial dimensions and ii) for an arbitrary and unknown shape. As a result, the partial maximized likelihood (3) does not provide a way to nd a MLE without additional constraints on A this can be seen by surrounding each data point by a disc of arbitrary small radius, thus creating a likelihood that is arbitrary large. One approach is then to consider a restricted class of candidate estimators on which a maximum is well dened, and nd the candidate within that class maximizing the likelihood. We propose an estimator for which A^ is made up of components of a nite class of template sets dened by the data themselves: the Vorono cells. 4

2.2 The Vorono tessellation

A collection of disjoint points fxi g in R d induces a natural division of the space into disjoint \territories": we assign to xi the part of R d closer to xi than to any other xj . This subdivision of the space is referred to as the Vorono (or Dirichlet) tessellation. Each cell contains one and exactly one point xi called the \nucleus" of the cell, and is the intersection of the n ; 1 halfspaces closer to xi than to xj . The cells are convex polytopes: line segments for d = 1, convex polygons for d = 2, convex polyhedra for d = 3, and so on. Vorono tessellations are widely used to model natural phenomena in agriculture, astrophysics, metallurgy, biology, physics, and other elds. They provide an intuitively appealing means of dening the concept of \area of in!uence" of a single point among a set of points, see Okabe, Boots and Sugihara (1992) and the references cited there. Vorono tessellations have been used in spatial statistics for interpolation (Okabe et al., 1992) and for declustering data (Isaaks and Srivastava, 1989). A theoretical review of random Vorono tessellation in the case of a planar homogeneous Poisson point process can be found in M"ller (1994). Hinde and Miles (1980) performed careful Monte-Carlo simulations to estimate higher moments and to t the distribution of some quantities of interest using Vorono tessellations. Although not much is known about the analytical properties of Vorono tessellations, it is well known that the number of vertices per cell is insensitive to the intensity , whereas the distribution of S (S being the area of a typical cell) does not depend on . In particular, E S ] = ;1 and (S ) ' 0:53;1 , so that, the higher the density of points, the smaller the cells. Hence the size of the Vorono cells can be used for local density estimation.

2.3 The template MLE

Before we dene the MLE using the Vorono tessellation, we rst consider the case in which no restrictions are made on the shape of A. Because the density of points is higher in A than in K , #(A) > jAjn and L(A) is maximal when jAj is minimal since L(A) / jAj#(A) (1 ; jAj)n;#(A) . Hence, for a xed m, the subregion maximizing (4) is the union of the m smallest cells. As a consequence, the MLE for A, hereafter denoted A^, will be the union of the m^ smallest cells, where m^ is such that it maximizes the log-likelihood (4). In summary, the MLE nds the solution separating the cells in two groups, the `smaller' cells and the `larger' cells, by maximizing the contrast between them. The MLE will not necessarily be connected since it could nd (relatively) small Vorono cells outside the region of interest. This is not always satisfactory, as illustrated in Figure 2. Two simulations for di erent values of n and p have been performed for a rectangular mineeld located on the left. Each small dot indicates a data point and a black square is superimposed in the Vorono cells belonging to A^. In both gures, the boundaries of the MLE are very irregular. The contrast between the density of the mineeld and the density of the noise is high in Figure 2a and the mineeld is reasonably well estimated, although A^ does contain some remote cells and misses a few cells within the 5

0.2

.

.

.

. .. . ..

.

.

.

.

.

. .. .

. . ..

. . . ..

.

0.6

.

.

.

. . .

.

.. .

.

.

.

.

. . .. .

. .. .

. . . .

.

.

..

.

. .

. .

.

. .

...

..

.

.

. 0.8

.. . .

. . . . . . . . . . ... . .. . . .. . .

. .

. . .

. . ..

. 1.0

. 0.0

(a)

.. . . . . . .. . . . .. . . . . . . . . . . .. . . . . . . . . . . .. . . . . .. . .. .. . . . .. . .. . . . . .. . . . . . . .. . . . . . .. . . .. .... . . . .. .. . . . . . . . . .. . . . . . . . . . . . . . .. . . .. . . . . . .. . . . . . . . . . ... . . . . . .. . ... . .. . . . .. . . .. . . . . . ... . . . ... . . .. .. . . . .. . . . . . . . .. . .. .. . . . . .. . . . ... . . . . . . . .. . .. . .. . .. . . . . . . . . . . . . .. ..... . . .. . . . . ... . . .. . . . . .. . . .. .. . . . . . .. . . . . .. . . . . . . . . . .. . . . . . ..... .. . . ... . . . .. .. . . . . . .. . . . . .. . . . .. .. ... . . . . . . .. . . . . . . . . . . .. .. . . . .... .. . . .. . . . . . . . .. . . ... .. . . . . . .. ...... .. .. .. . . . . .. . .. . . . . . .. .. . . . . . . . . . . .. . . . ... .. . . . . . . . . .. . . ... . . .. .. . . . . . . . . . . . . . . . .. .. . . . .. . . . .. .

.

.

.

.. ..

.

. .

.

.

. .

.

0.4

. . . . .

. ..

. . . .

. .

.

0.8

. .

.

.

.

.

0.6

. .. . . . .. . . .. . .. . .. . . . . .. .. . .. . . ... . . . .. . . .. .. . . . . . . . . . . . . .. ... . . .. ..... .. .. .. . . . .. .. . . . . . . . . . . .. .. . . .. . .. . . .. . . .. . . . . . . . .. ... . . . . . . . . .. . . .. . . . . . ... . . . . . .. . . .. . . ..... . . . . .. . . .. . . . . . ... . . .. .. . .. . . . . .. . .. . . .. . . .. . . . .. . .. . . . . . ... . . . . . .. .. . . . . 0.0

.

.

1.0

..

. .

0.4

. .

0.2

0.8 0.6 0.4 0.2 0.0

.

0.0

1.0

. .

.

. . . . .. . . . . . .

.

.

0.2

0.4

0.6

0.8

.

.. . . . .

.

. .

. . .. . . . .

.

. . . . . .

.. . . . ... . . .. . . .. . . . .. . . . .

1.0

(b)

Figure 2: Two examples of unrestricted MLE. The true area is indicated by the rectangle black squares indicate points identied as mines. mineeld A. In Figure 2b, the two densities are not as well separated. The MLE A^ is made up of many of clusters, one being much larger than the others. Although it does contain almost all of the cells of the true mineeld A, it also includes a signicant number of small cells outside A. Some characteristics of A are often known even if its precise shape is not. In many applications, for example, it is reasonable to expect the mineeld to be in one piece without holes, and to have regular boundaries. The MLE needs to re!ect these properties, and they should consequently play a role in the estimation procedure. On the line, imposing connectedness is equivalent to restricting the Vorono MLE to be an interval. The Vorono cells and hence the MLE are in this case intervals whose endpoints are midpoints between data points. In this context, Cherno and Rubin's arguments can be used to prove the consistency of the estimator and show that it has a variance of the order of 1=n2 . Moreover, we can show through simulations that the Vorono based estimator has lower bias and variance than the true MLE. Table 1 illustrates the performances of the two estimators for 500 simulations performed with the following parameters: p = 1=3 and A = a b] = 0:25 0:5]. Both estimators are biased, overestimating a and underestimating b, with a bias decreasing with n. Their standard deviation behaves roughly as 1=n. The Vorono estimator has better performance, both in terms of bias and variance. The di erence in performance diminishes with increasing n. In higher dimension, approximate solutions must be found, and no formal proofs of consis6

n = 60

E ^a] E ^b] E ^p] (^a) (^b) (^p)

MLE 0.315 0.481 0.277 0.149 0.127 0.186

Vor. 0.274 0.492 0.328 0.095 0.097 0.133

n = 120

MLE 0.260 0.490 0.330 0.064 0.062 0.094

Vor. 0.255 0.494 0.333 0.049 0.043 0.071

n = 240

MLE 0.253 0.498 0.335 0.021 0.019 0.044

Vor. 0.253 0.498 0.334 0.014 0.018 0.039

NOTE: K = 0 1] A = a b] = 0:25 0:5] p = 1=3 500 simulations.

tency are available. In the next section we discuss inclusion of characteristics such as connectedness in the maximum likelihood estimation in two dimensions, illustrated by the mineeld detection problem.

3 The mineeld detection problem 3.1 Presentation of the problem

In the most general setting, the mineeld detection problem is as follows. An area has been surveyed using some imaging system. There may be one or several mineelds contained within the surveyed area, or none. Often, the area surveyed includes shallow water, beach area and a surface zone with vegetation. Processed images that return a list of putative mine locations are corrupted by clutter, due to objects that may look like mines at the image's level of resolution. The aim of the analysis is to determine whether there are mineelds present, and what their boundaries are. It is anticipated that in the mineeld there may be as many as one false alarm for every mine. We have restricted ourselves (see section 1) to the case where it is known that the image contains exactly one mineeld. The focus is on the estimation of its boundaries. In section 5, we will brie!y explain how this work can be extended to estimate the number of mineelds. A second assumption was that clutter points on the image and mines within the mineeld are uniformly distributed. It is possible that some sort of inhibition process with a more regular pattern would be more appropriate for the mineeld this issue will be addressed in section 3.3.

3.2 Approximate MLE algorithms for the mineeld detection problem

If A is known to be simply connected with regular boundaries, as is the case in the mineeld problem, solutions satisfying these constraints should be preferred. One possible strategy is to consider a likelihood in which a term penalizing irregular solutions is included. Another strategy 7

is to restrict the MLE to be a union of Vorono polygons sharing the same properties as those assumed for A in other words, A^ must belong to the subset of unions of Vorono polygons for which all geometrical constraints are satised. In addition to the connectivity constraints that i) A is made of one piece and ii) A is simply connected (that is, without holes), the mineeld detection problem requires some regularity conditions on the border, to be detailed below. Some additional notation is needed before we dene the regularity condition. The complementary set of A^ (relative to K ) is denoted A^c . We dene @i A^, the inside border of A^, to be the union of the polygons of A^ adjacent to at least one polygon of A^c . Similarly, @o A^, the outside border of A^, is dened as the union of the polygons of A^c adjacent to at least one polygon of A^. Among the polygons of @oA^, some are in contact with border polygons of @o A^ or @i A^ only. These polygons of Ac are called \surrounded" border cells since there are not in contact with Ac n@o A. The regularity condition for the mineeld detection problem is the following: iii) the outside border @o A^ contains no \surrounded" polygons. This condition ensures that there will be no narrow bay in the domain of A^. A condition on the inside border could have been chosen, constraining @i A^ to contain only polygons adjacent at least to one polygon in An@i A. The former is more conservative than the latter, and for obvious reasons it should be preferred in the mineeld context. There is a close relationship between this regularity assumption and the Closing operation on graphs in mathematical morphology (Heijmans and Vincent 1993, Serra 1988). To see this, consider the Erosion and Dilation operators from mathematical morphology, denoted Er and Di, which map one set of Vorono polygons into another: ^ Er(A^) = A^ n @i A^ and Di(A^) = A^  @o A: Erosion removes a layer of polygons, namely the inside border, from A^, while dilation merges a layer of polygons, namely the outside border, into A^. The closing Cl(A^) of A^ is the mapping dened as a dilation followed by an erosion: Cl(A^) = Er(Di(A^)). The set Cl(A^) is usually referred to as the closure of A^. Clearly, A^  Cl(A^). It can be shown (Serra 1988, pp. 20) that closing is idempotent, i.e. Cl(Cl(A^)) = Cl(A^). Sets not altered by a closing are the result of the closing of another set, and are called closed sets. With these denitions, the regularity condition becomes: iii) A^ is a closed set of polygons. Finding A^ The only certain way to nd the estimator A^ satisfying constraints i) through iii) is to try all 2n possible solutions. This is completely unrealistic for large n, so that one must instead nd good approximate solutions. Here we propose several methods for doing so, and compare them through simulations. Method # 1 uses as a rst guess the unconstrained solutions for A^. More precisely, for each m from 1 to n, the set of the m smallest polygons is considered (the unconstrained MLE 8

for each value of m). Each such solution is then transformed so that it satises the three constraints, thereby dening A~m : all holes in the clusters are lled, only the largest cluster is kept, and its closure is computed. The approximate MLE is then the set A~m with the highest ~ log-likelihood over all values of m: A^ = A~m^ with m^ = argfmax m `(Am )g. We call this method the morphological algorithm. Method # 2 is a classical greedy algorithm. A^ is initialized as the empty set, and new cells are merged into A one at a time. At each stage each cell of the outside border is considered a candidate for merging, and the increase in likelihood of the closure after it is merged is computed. The merge corresponding to the maximum-likelihood closure is ultimately selected. After all cells are merged into A, the stage at which the log-likelihood is maximized is chosen as the approximate MLE. This greedy algorithm generally produces a di erent approximate solution for A^ than the morphological algorithm. Method # 3 is a combination of these two methods. The morphological algorithm is rst run, and its output is used as input for the greedy algorithm. If this does not succeed in nding a higher log-likelihood solution, the morphological solution is retained. Method # 4, also a hybrid, selects the highest log-likelihood solution from among those produced by the three methods above. These four methods nd approximate solutions to the maximum likelihood problem. Although they are generally accurate in detecting the core of the mineeld, there may be some di%culties at the border, depending on the ratio of the densities. To see this, we dene the signal/noise ratio r as the following ratio of probabilities: p is a mine) r  P (X isP (aXclutter point in A) = jAj(1 ; p) : In the mineeld terminology, r is the number of mines per clutter point within the mineeld (it can be as low as 1 on real images).

If r is low, random clumps of the noise process may have a density close to the density of

points in A. If one of these clumps is close to A, the MLE solution will include it. On the other hand, random voids of the mine process could easily have a local density comparable to the clutter density. If such regions occur near the boundary, they will not be included in A^. If r is high, the distributions of the area of the Vorono polygons inside and outside A will be very di erent and discrimination based on size should work well. Around the border however, the cells will have an intermediate size, and could be excluded in the MLE, causing A to be underestimated.

It should be clear that, in the mineeld detection problem, the estimator should be very conservative. It must detect mines with a high probability and accurately estimate the mine 9

density within the mineeld. False alarms are not so serious, but should be avoided if possible. A straightforward way to dene a safe area around the mineeld is to include its outside border, and consider the complementary set as safe. We call this the border method (Method # 5). In mathematical morphology terms, the alleged mineeld is no longer A^ but rather its dilation Di(A^). The border method will be applied to the morphological algorithm. We compare these 5 methods in simulations, for various values of the sample size n and the signal/noise ratio r. Assessing the performance of the approximate Vorono MLE The estimator A^ is dened to be consistent if jA'A^j !p 0 as n ! 1, where jA'A^j is the area of the symmetric di erence of A and A^. In the mineeld detection problem we are actually more interested in the classication of the points as being inside or outside the mineeld. Within the mineeld, a mine cannot be distinguished from a clutter point. This is true for real images as well as for the model, and as a consequence it is customary to consider every point within the mineeld as a potential mine and every point outside the mineeld as clutter. If we dene N1 = #(A^ \ A), the number of points within A belonging to A^ and N2 = #(A^ \ Ac), the number of points in A^ not in A (complementation is with respect to the larger domain K ), then, in the mineeld problem, N1 is the number of detected mines and N2 is the number of false alarms. It follows that DR = N1 =#(A) is the detection rate, and FAR = N2 =#(Ac ) the false alarm rate. If the estimator is consistent, then DR ! 1 and FAR ! 0 as n ! 1, since jA'A^j = jA \ A^cj + jAc \ A^j. As n ! 1, it follows from the law of large numbers and the fact the expectation of the area of a typical Vorono cell is equal to the inverse of the density of points, that jA'A^j is asymptotically equal to  #(A \ A^c) + #(Ac \ A^) = 1  #(A) ; N1 + N2 (p + jAj(1 ; p))n (1 ; jAj)(1 ; p)n n p + jAj(1 ; p) (1 ; jAj)(1 ; p) : Therefore, jA'A^j ! 0 is equivalent to DR ! 1 and FAR ! 0. The total misclassication rate ERR, is dened by ERR = 1 ; DR + FAR, so that if the estimator is consistent, ERR ! 0 as n ! 1. Other statistics of interest are the area jA^j and the estimated number of mines, np^ = (#(A^) ; jA^jn)=(1 ; jA^j). Results of simulations

Table 2 and 3 show the in!uence of the signal/noise ratio and sample size in 200 simulations carried out on a rectangular eld A = 0:2 0:4] 0:2 0:8]. As one would expect, performance improves as n and r increase. In Table 2, the detection rate (DR) ranges from 80% to 92% and the false alarm rate (FAR) ranges from 12% to 2% for the morphological algorithm. Including the border increases the detection rate dramatically (it ranges from 97% to 99%), at the cost of a higher false alarm rate (ranging from 31% to 8%). The total misclassication rate (ERR) is 10

1.0 0.8

1.0 0.8

.

.

. .

.

.

0.6

0.6

. .

. .

. . .

. .

.

0.2

0.2

.

.

0.0

0.0

.

1.0

. 0.0

(a)

.

.. .

0.2

.

.

..

.

.

. . .

..

.

.

.

. .

.

.

.

.

.

.

.

.

. .

.

.

.

.

. ..

.

. .

. ..

.

.

.. .

. .

. .

. ..

.

. . .

.

..

.

..

. . . .

. .

.

. .

.

.

.

0.8

.

. .

.

.

.

.

. .

0.6

. . . .

.

0.4

0.4

.

0.4

. . . .

. .

.

.

.

.

.

.

0.2

.

.

.

0.0

.

.

..

.

.

.

. .

.

.

.

. .

. 0.4

0.6

0.8

.

.

. . ..

. . 1.0

(b)

Figure 3: Simulated data and morphological estimator in the worst case: (a) simulated data points with 16 mines and 134 clutter points (r = 1) (b) true mineeld (rectangle) and estimate (union of polygons containing a black square). relatively insensitive to the algorithm chosen to nd A^, although it is highly dependent on the sample size and to the signal/noise ratio. From 33% for n = 150 and r = 2, the misclassication rate drops to 12% for n = 816 and r = 3. It is remarkable that including the border has no e ect on ERR. Instead, the error is transferred from DR to FAR. The choice of the algorithm for nding an approximate solution for A^ is then a matter of deciding how DR and FAR are to be allocated within a given error ERR. The number of mines is generally well estimated by all of the methods. The area of the mineeld is usually overestimated, a direct consequence of the use of the Vorono tessellation for measuring the area. Table 3 shows the in!uence of the signal/noise ratio in the case of very few data points: n = 150. The worst case considered is a ratio r = 1, for which there are only 16 mines for 134 clutter points. Figure 3 shows a simulation in which visual identication of the mineeld is very di%cult, together with the Vorono MLE via the morphological algorithm. In this case, the algorithm detects an average of 71% of the points within the mineeld with an average false alarm rate of 21%. The total misclassication rate is about 50%. As the signal/noise ratio increases, there is of course a marked improvement in performance. For r = 4, the detection rate is as high as 84% and 99% when including the border. The misclassication rate is 21%. The morphological algorithm leads generally to the lowest total misclassication rate (although there is not much di erence among the methods) and to less 11

biased estimators for the number of mines and the area (except when both n and r are very high). If the morphological algorithm is followed by a greedy algorithm (Method # 3), the area of the alleged mineeld cannot decrease any increase in area is accompanied by an increase in both the detection and false alarm rates, although the misclassication rate is not generally a ected. For a conservative algorithm, this is probably the best combination. A safe area, containing on average more than 98% of the mines, can then be drawn by including the outside border. The morphological algorithm is also the fastest. On a typical workstation (e.g. a Sun Sparc 4), it takes about 10 seconds for 500 points.

3.3 Robustness of the estimator

Robustness is dened as the sensitivity of the estimator to departures from the model assumptions. In the mineeld detection problem, robustness is crucial since mines are usually laid out in a more regular pattern than the uniform distribution. It is possible that the mineeld could be more accurately modeled by Markov point processes, or even by regular grids. In this section, the estimator is tested both for a regular grid and for a special case of the Markov point process: a pure inhibition (or hard core) process. This last process is dened in the following way: conditional on the number of points inside A, the joint density of the n points is uniform provided that no two points are closer than some distance . Simulations were done using a Gibbs sampling technique (Ripley, 1987 p.112). A typical realization of these point processes (with noise) is shown on Figure 4a and 4c, with their morphological estimates (4b and 4d) . Results on 200 simulations (65 mines, 270 clutter points, r = 2) are summarized on Table 4 for the morphological method, with and without inclusion of the border. The results for the other methods are similar. In general the morphological algorithm performs remarkably well for these two non-uniform processes. The detection rate is even better than in the uniform case, but at the cost of a higher false alarm rate and a higher total misclassication rate (a very slight increase in the case of the inhibition process). The number of mines is well estimated in all cases, but the area of the mineeld jAj is overestimated, which is consistent with the high false alarm rate. When including the borders, the detection rate increases up to 99:6% for the inhibition process, with a false alarm rate of 28%. These results show that the algorithm is quite robust if the point process is more regular than a uniform point process, especially with regard to the detection rate and the estimate of the number of mines, which are the two quantities of most interest in the mineeld detection problem. The algorithm is not expected to be as robust for point processes less regular than uniform point processes (like cluster processes), since then the situation would be more similar to the presence of several mineelds.

12

Table 1: Results of 200 simulations. Detection rate DR (%), false alarm rate FAR (%), total error rate ERR (%), estimated number of mines, and estimated area of the mineeld for increasing values of the sample size n and the signal/noise ratio r. DR FAR ERR Mines jA^j Morph. 80 12 32 30 0.15 n = 150 Greedy 82 16 34 33 0.17 29 mines M. + G. 85 17 32 34 0.19 r = 2 High. Lik. 82 15 33 33 0.17 Border 97 31 34 33 0.37 Morph. Greedy 50 mines M. + G. r = 2 High. Lik. Border

84 84 88 84 98

7 9 11 9 22

23 25 23 25 24

49 51 53 51 53

0.13 0.14 0.15 0.14 0.30

Morph. n = 500 Greedy 100 mines M. + G. r = 2 High. Lik. Border

89 87 91 87 98

6 8 8 7 16

17 21 17 20 18

99 102 107 102 105

0.13 0.14 0.15 0.13 0.25

Morph. n = 816 Greedy 216 mines M. + G. r = 3 High. Lik. Border

92 90 93 90 99

2 2 3 2 8

10 12 10 12 9

202 202 206 201 213

0.12 0.11 0.12 0.11 0.19

n = 250

NOTE: M. + G. stands for method # 3. A = 0:2 0:4]0:2 0:8] A = 0:12 j

13

j

Table 2: In!uence of the signal/noise ratio r.

jA^j

DR FAR ERR # Mines Morph. 71 21 50 25 Greedy 76 30 54 30 r=1 M. + G. 82 33 51 33 16 mines High. Lik. 78 30 52 31 Border 93 47 54 28

0.20 0.26 0.30 0.26 0.50

Morph. Greedy r=2 M. + G. 29 mines High. Lik. Border

80 82 85 82 97

12 16 17 15 32

32 34 32 33 35

30 33 34 33 33

0.16 0.17 0.19 0.17 0.37

Morph. Greedy r=3 M. + G. 40 mines High. Lik. Border

84 83 86 83 99

8 9 10 8 23

24 26 24 25 24

37 37 39 37 41

0.13 0.14 0.15 0.13 0.31

Morph. Greedy r=4 M. + G. 50 mines High. Lik. Border

84 84 86 84 99

5 5 6 5 21

21 21 20 21 22

43 43 45 43 49

0.13 0.12 0.13 0.31 0.31

NOTE: Same statistics as in Table 2 computed on 200 simulations. The sample size is constant: n = 150. A = 0:2 0:4]0:2 0:8] A = 0:12. j

j

14

1.0

1.0

.

.

..

0.8 0.6 0.4 0.2 0.0

0.0

0.2

0.4

0.6

0.8

..

0.0

0.2

0.4

0.6

0.8

1.0

.

.

. ..

. . . . .. . . . . . . .. . . . . . .. . . . . . . . . . . . . .. . . . . . . . . . . . .. . . . . . .. . . . . ... . . . . .. . . . . . . . . . . . .... . . . . .. . . . . . . . . . . ... . . . . . . .. . . .. .. . . . . . . . . . . . . . . .. . . . ... .. . . .. . . .. . .. . . . . . . . . .. . .. . . . . .. . .. . . . . .. .. . . .. . . . . . . . . .. . . . . . . . . . . .

0.0

0.2

1.0

1.0

0.8 0.0

0.2

0.4

0.6

0.8 0.6 0.4 0.2 0.0

0.4

.

0.4

.

.

..

. .

.

. . . ... .

.. . .

. . . .

.

.

.

. . . . . .

. . .

.

..

. .

. . . .

.. . . .. . . . . . . . . . . . . . .

.

. .

. . .

..

. .

.

.

..

.

.

.

.

.

. . . .

. . . . ..

.

.

.

0.6

.

.

. . .

.

0.8

1.0

(b) .

.

..

..

0.2

..

.

. . .

.

.

(a)

0.0

. .

0.6

0.8

1.0

..

.

..

.

.

. . . . .. . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . .. . . . . . . . . . . . . . .. . . . . . . . .. . .. . .. . . . . .. . .. . .. . . . . . . . . . .. . .. . . . . . . . . .. . . . . . . . . . . . .. . . . . . . ... . . . . . . . . . . . . .. . . . . . . .. . . .. . . . . . ... . . . . . . . . .. .. . . .. . . . . . . . . .. . . . . . . . . . . .

.

.

0.2

0.4

.

..

.

. .

. . . . ... .

.. . .

.

0.0

(c)

.

.

.

. .

.

.

.

. . .

. .

. . . . . .

.

. .

. . . . . .

.

0.6

..

. . . . . .

..

.

. .

.

..

.

.

.. . . .. . . . . . . . . . . . . . .

.

.

.

.

. . . . . .. .

. 0.8

.

. . .

. . 1.0

(d)

Figure 4: Simulated data and Morphological Estimate for two di erent mine processes (65 mines and 270 clutter points, r = 2): (a) and (b) hard-core process with = 0:035, (c) and (d) regular grid.

15

Table 3: Robustness of the morphological estimator, without and with border cells. DR (%) FAR (%) Err (%) p^ jA^j Unif. 87 7.3 20 0.20 0.14 Morph. Inhib. 90 9.4 20 0.20 0.16 Grid 88 13 25 0.20 0.19 Unif. Border Inhib. Grid

98.1 99.6 98.5

21 28 33

23 28 34

0.21 0.29 0.22 0.35 0.22 0.39

NOTE: Statistics computed on 200 simulations with 65 mines and 270 clutter points: n = 335, r = 2, p = 0:194 and A = 0:12. The same rectangular mineeld as in Table 2 and 3 was used. j

j

3.4 Comparison with Model Based Clustering

The method is now compared to a parametric approach, based on model-based Gaussian clustering with noise, developed by Baneld and Raftery (1993). The mineeld detection problem as specied here is a relatively simple instance of the class of problems considered by Baneld and Raftery, in that the there are known to be only two groups present (mines and clutter). The relevant model consists of a Poisson term for the clutter and a Gaussian term for the mineeld:     #(K );#(A)] Y p1 exp ; 12 (xi ; )T (;1 (xi ; ) : L(X  () = jK1 j i2A 2 det(()

In order to maximize the likelihood, we applied a classical EM algorithm to the augmented likelihood #  1 n " (1 ; z ) Y i + pzi T ;1 exp ; 2 (xi ; ) ( (xi ; ) 2 det(() i=1 jK j where ( xi is a mine zi = 10 ifotherwise,

see for example Celeux and Govaert (1992). As a starting value, we used ( the rst coordinate of xi does not exceed :5, zi = 10 ifotherwise, since a higher density can be visually detected on the left side of the image than on the right. The group corresponding to zi = 1 includes all of the mines, along with considerable clutter. We observed, however, that the method is relatively insensitive to the starting value. 16

Table 4: Comparison with model-based clustering using the EM algorithm. Case 1 Vor. EM DR 71 45 FAR 21 3 Err 50 57

Case 2 Vor. EM 87 78 7 1 20 25

Case 3 Vor. EM 92 91 2 2 10 11

Case 4 Vor. EM 90 71 10 1 20 30

Case 5 Vor. EM 88 63 13 1 21 38

NOTE: Case 1: uniform mine process, 16 mines, 134 clutter points, r = 1. Case 2 (uniform mine process), Case 4 (inhibition process) and Case 5 (regular grid): 65 mines, 270 clutter points, r = 2. Case 3: uniform mine process, 216 mines, 600 clutter points, r = 3.

Table 5 compares the model-based clustering approach to the morphological algorithm for the three di erent models of mine process and three di erent values of (n r) considered in the simulations of section 3.2. For uniform processes, our method has superior detection rates, particularly for small numbers of data points, while the EM method has fewer false alarms. EM also has somewhat larger error rates. For the more regular point processes, the Vorono estimator substantially outperforms EM in terms of both the detection rate and overall error rate, although again EM produces fewer false alarms. In the model-based method, cluster shapes are elliptical, and in these examples the ellipse tends to stay within the boundaries of the rectangular mineeld given su%cient data. When the mineeld has a more irregular shape, we would anticipate that more favorable performance from the Vorono method, because of the geometrical constraints inherent in the Gaussian model.

4 A Geological Example We now apply the algorithm to the problem of detecting geological faults from an earthquake record. The basic premise is that earthquakes usually occur on active faults and their epicenters are clustered along them. Figure 5a shows the location of all earthquake epicenters with a Richter intensity above 2.5 in the Central California area from 1962 to 1981 (McKenzie, Miller, and Uhrhammer, 1982). The San Francisco Bay area is visible in the upper-left part. There are a total of 2049 observations, for which time, longitude, latitude, depth and intensity are recorded. Here we use only the longitude and latitude. The algorithm was modied slightly so as to allow data points to have identical coordinates. Points that coincide share the same Vorono polygon and the counting variable #(A) is the number of data points in A. The data suggest that there are several faults, and that they are connected. We will apply the algorithm to extract the clusters dening the 17

• •

• •

• ••••••••••• ••• ••••••••• •

• ••• • •• • •• •• •••• • • ••• • ••• ••••••••••• • •••• • • • ••••••••••••••• •••••••••••• •••• ••••• •• •••••••• •••••••••••••• •• ••• •••••• ••• • ••••••• ••••••••• • ••••••• •• •••• ••••••••••• • •••• ••••••••• ••••• •••• • ••• •••• •••• • •• • • • • • •••• •• • • ••••••••• • • •• ••••••••••••••••• •• • • •••••••• • ••• ••• • • • • •••••••••••••• • •• •• • • ••••••••• • •••••••••••••••••• • • •••• •• •• ••• • • •••••••••••••••••••••• ••••••••••••• • • •••• • •••••••••••••••••••••••••••••••• ••• ••• • • ••• • • •• • ••••••••••• • • •• ••••• • • •••••••••••••••••••••••••••••••••••••••••••••••••••• ••••• ••••• • •••••••••• ••••••••• • • • •••• • ••• • •• •• • • • ••• •• •• • • • • •• ••• •••••••••••••••••••••• •••••••••••••••••••••••• •••• ••••• • • •• ••• • • ••• ••••••••••••••••••••••••• ••••••• •••••••••• •••• ••••• •••••••• • • ••• • • • • ••••••••••• • •••• • ••• •••••••••• • • • • • •• • • • •••••• • • • •• • •••••••• • • • •••• • • • • • • •••••• • • • ••••••••••• • • • •• • • • • • • • •••••

• •• •••• •••••••••••• • ••••••••• ••••••••••• •••••••• • ••••• •••••••••••••• ••••••• • •••• • ••• •••• • • • ••••• ••••• ••• ••••• •• •••• •• • ••• ••• • ••••••• • • ••••••••••• ••••••••••••• •••••••••••• •• •• ••••••• •••••••••••••••••••••••• •••••••••••• •••••••••••••••••••••• • ••• • • •• •••••••••••••••••••••••••••••••••••••••••••••• •••••••••••• ••••••••••••••• • • ••••••••••••• •••••••• •• ••••••••••••••••••••••••••••••••••• •• ••••••• •••••• •••• ••••••••• •••••

(a)

(b)

Figure 5: Geological example: (a) Earthquake epicenters in Northern California and the main faults (b) Cluster of epicenters extracted by the morphological algorithm. Note how the cluster matches the main faults. faults. Figure 5b shows the morphological solution (with one connected component) with documented faults. The algorithm detects the main faults quite e%ciently, despite of the nonuniformity of the point process. The only exception is the Northern part of the San Andreas fault (near San Francisco) where the activity is less pronounced. There is a cluster captured by the algorithm which does not correspond to any of the documented faults in the North-Eastern part of the map. A reason might be that those earthquakes are clustered along an underground fault, not documented on a surface map. This example shows that the Vorono based MLE procedure is an e%cient algorithm to extract features from a noisy real-life point processes.

5 Discussion and Conclusions We introduced a new method for estimating the boundaries of a point process in the presence of background noise and for extracting features of interest by removing noise. The method is based on the Vorono tessellation of the data points, and nds a Maximum Likelihood Estimator of the support domain for mixtures of uniform point processes. The method is nonparametric and requires no assumptions on the shape of the domain, although constraints can be added 18

if necessary. For example, good solutions were obtained in the mineeld detection problem by requiring that the support domain be simply connected and closed (i.e. with regular boundaries). The method is accurate and computationally e%cient. A simulation study for a rectangular mineeld has shown our method to be superior to a model-based method using the EM algorithm when there are small to moderate numbers of data points. The di erence in performance is particularly striking for the more regular point processes studied. Our algorithm would be expected to perform even better on irregularly-shaped domains, because in the EM algorithm the mineeld is assumed to have a Gaussian distribution, implying an elliptical shape. A typical real-life example is the detection of features in the seismic data set of Northern California. Although the uniform hypothesis is strongly violated, the Vorono procedure nevertheless accurately delineates the main faults. The Vorono MLE o ers an alternative in the spatial statistics context, where the area of in!uence around a data point has some signicance. Our method not only leads to a classication of the data points, but also yields a region dened by the data. Moreover, it is easy to obtain a conservative estimate of the feature by including its border, an important consideration in applications such as mineeld detection, Some limitations remain. If geometrical constraints are included, we can only approximate the MLE for data sets of any size. We proposed two basic approximation algorithms: a greedy algorithm merging cells so as to increase the likelihood by the maximum amount at each iteration, and a morphological approach which uses the unconstrained solution as a starting point. These lead to di erent solutions, although the total misclassication rate seems to be relatively insensitive to the choice of algorithm. For the moment the number of clusters is not estimated but must be provided by the user (one cluster in this paper). However, the morphological algorithm can be easily generalized to nd solutions with k clusters. It could be applied for various values of k and the log-likelihood of A^k be compared. A \signicant" di erence between `(A^k ) and `(A^k;1 ) would indicate that the feature is much more likely to have k clusters than k ; 1. A key issue is the quantication of \signicance", which would have to be assessed by further studies.

References 1] Baneld, J. D. and Raftery, A. E. (1993), \Model-Based Gaussian and non-Gaussian Clustering," Biometrics, 49, 803 { 821. 2] Celeux, G. and Govaert, G. (1992), \A Classication EM Algorithm for Clustering and Two Stochastic Versions," Computational Statistics and Data Analysis, 14, 315 { 332. 3] Cherno , H. and Rubin, H. (1954), \The Estimation of the Location of a Discontinuity in Density," Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability, University of California Press, 19 { 37. 19

4] Cressie, N. (1993), Statistics for spatial data, Wiley-Interscience. 5] Diggle, P. J. (1983), Statistical analysis of spatial point patterns, Academic Press. 6] Gupta, A. K., Taneja, V. S. and Janardan, K. G. (1986), \Estimation of an Arbitrary Mixture of Uniform Models," Metron, 44, 247 { 256. 7] Heijmans, H. J. A. M and Vincent, L. (1993), \Graph Morphology in Image Analysis," in Mathematical Morphology in Image Processing, E. R. Dougherty, ed., Marcel Dekker (Ch. 6, 171 { 203). 8] Hinde, A. L. and Miles, R. E. (1980), \Monte Carlo Estimates of the Distributions of the Random Polygons of the Vorono Tessellation with Respect to a Poisson Process", Journal of Statistical Computation and Simulation, 10, 205 { 223. 9] Isaaks, E. H. and Srivastava, R. M. (1989), An Introduction to Applied Geostatistics, Oxford University Press. 10] McKenzie, M., Miller, R. and Uhrhammer, R. (1982), Bulletin of the Seismological Stations University of California-Berkeley. 53, No. 1-2. 11] M"ller, J. (1994), Lectures on Random Vorono Tessellations, Series Lecture Notes in Statistics, Springer Verlag. 12] Okabe, A., Boots, B. and Sugihara, K. (1992), Spatial Tessellations. Concepts and Applications of Vorono Diagrams, Wiley & Sons. 13] Ripley, B. D. and Rasson, J. P. (1977), \Finding the edge of a Poisson forest," Journal of Applied Probability, 14, 483 { 491. 14] Ripley, B. D. (1987), Stochastic Simulation, Wiley & Sons. 15] Serra, J., ed. (1988), Image Analysis and Mathematical Morphology II: Theoretical Advances, Academic Press. 16] Silverman, B. (1986), Density Estimation for Statistics and Data Analysis, Chapman & Hall, London.

20