Pattern Recognition Letters 21 (2000) 805–816

www.elsevier.nl/locate/patrec

Contextual and non-contextual performance evaluation of edge detectors

T.B. Nguyen, D. Ziou *

Dept. de Math. & d'Informatique, Faculté des Sciences, Université de Sherbrooke, Sherbrooke, Que., Canada J1K 2R1

Received 23 April 1999; received in revised form 29 May 2000

Abstract

This paper presents two new evaluation methods for edge detectors. The first is non-contextual and concerns the evaluation of edge detector performance in terms of detection errors. The second, contextual, method evaluates the performance of edge detectors in the context of image reconstruction. Both methods study the influence of image characteristics and edge detector properties on detector performance. Five detectors are evaluated and their performance is compared. © 2000 Published by Elsevier Science B.V. All rights reserved.

Keywords: Edge detection; Performance evaluation; Detector properties; Image characteristics

1. Introduction

Several edge detectors have been proposed, with different goals and mathematical and algorithmic properties (Ziou and Tabbone, 1998). Consequently, one problem encountered by vision system developers is the selection of an edge detector to be used in a given application. This selection is primarily based on the definition of the influence of image characteristics and the properties of the detectors on their performance, a process we call edge detector performance evaluation (Ziou and Koukam, 1998). Several performance evaluation methods have already been proposed. Certain authors (Heath et al., 1997; Cho et al., 1997; Bowyer and Phillips, 1998; Dougherty et al., 1998)

* Corresponding author. Tel.: +1-819-821-3031; fax: +1-819-821-8200. E-mail address: [email protected] (D. Ziou).

group the existing methods according to the presence or absence of ground truth. Such grouping is based on the complexity of the images (e.g., real images, synthetic images) and the performance criteria used. For example, without ground truth, it is difficult to measure the displacement of an edge from its true location. Existing methods that rely on ground truth use either synthetic images or simple real images for which it is easy to specify the ground truth. This grouping of existing evaluation methods takes into account neither the subsequent use of edges nor the intervention of humans (i.e., subjectivity vs objectivity) during the evaluation process. Thus, we propose to group the existing work into two classes according to whether it considers the subsequent use of edges in a given application (contextual and non-contextual) and the type of performance criteria used (subjective, objective; with or without ground truth). Both contextual and non-contextual evaluation methods can be either objective or subjective. The



contextual method involves evaluating edge detectors taking into account the requirements of a particular application. It has been used by a few authors in the areas of object recognition (Heath et al., 1997; Sanocki et al., 1998) and motion (Shin et al., 1998). Non-contextual evaluation is carried out independently of any application. Subjective methods (Nair et al., 1995; Heath et al., 1997), borrowed from the field of psychology, use human judgment to evaluate the performance of edge detectors. More precisely, these methods involve presenting a series of edge images to several individuals and asking them to assign scores on a given scale (Nair et al., 1995). Even if these methods seem easy to put into practice, they have some drawbacks. The number of characteristics a human eye can distinguish is limited; for example, the eye cannot differentiate between two gray levels that are slightly different. As well, the judgment depends on the individual's experience and attachment to the method, as well as on the image type (e.g., multi-spectral, X-ray). The basic idea behind the objective methods (Abdou, 1978; Kitchen and Rosenfeld, 1981; Pratt, 1991; Kitchen and Venkatesh, 1992; Kanungo et al., 1995) is to measure the performance of the detector according to predefined criteria. This can be accomplished empirically or theoretically. In the presence of a ground truth, the criteria concern the difference between detected edges and the ground truth, measured by errors of omission, localization, multiple responses, sensitivity, orientation, continuity, and thinness. If there is no ground truth, the methods evaluate the likelihood of a detected edge being a true edge and the scattering of its orientation and location; usually the criteria concern the continuity, thinness, and the variance of the location and orientation of edges. This paper presents two new performance evaluation methods for edge detectors: contextual and non-contextual.
The significant difference between our methods and earlier ones is that ours take into account the image features, the properties of the detectors, and the parameters used in the evaluation method. The non-contextual method involves evaluating the performance of edge detectors in terms of detection errors. Detection errors include classical errors (of omission, localization, multiple responses, sensitivity, and orientation) as well as a new error related to false-edge suppression. The basic idea behind this method consists of running a given detector several times on an image with a known structure, varying the parameters of the detector and the image, and then measuring its performance. The drawback of this approach is that it does not completely characterize the performance of edge detectors; it seems important to take into account the subsequent use of edges, that is, to know whether they satisfy the requirements of a particular application. For this reason, we propose a contextual performance evaluation method, which involves evaluating the performance of edge detectors in the context of image reconstruction. It consists of measuring the performance of an edge detector using the mean square difference between the reconstructed image and the original one. Both methods study the influence of image characteristics and detector properties on detector performance. This paper is divided into six sections. Section 2 describes the non-contextual method of performance evaluation. Section 3 describes the experimental results yielded by this evaluation method. Sections 4 and 5 are devoted to the contextual performance evaluation method and the experimental results obtained, and Section 6 summarizes the main results.

2. Non-contextual performance evaluation method

The non-contextual method consists of running an edge detector several times on a synthetic image, varying the image characteristics and the detector properties. We then determine the influence of these parameters on the performance of the edge detector. Detector performance is determined by comparing the obtained edges with the ideal edges, which are assumed to be known. For this purpose, a given edge pixel is assigned to one of the following four classes: ideal, unambiguous, ambiguous, or false.
False edge pixels do not belong to the support region of the ideal edge (i.e., the pixels in the vicinity of the ideal edge). Fig. 1(a) presents an example of an ideal edge, identified in black. The pixels identified in gray belong to the support


Fig. 1. Detection errors: (a) ideal edges, black pixels are in the ideal edge and gray pixels belong to the support region of the ideal edge; (b) omission error; (c) localization error; (d) multiple-response error; (e) sensitivity error. Suppression and orientation errors are not easy to depict.

region of the ideal edge. Among the edge pixels detected within the support region, there are ambiguous edge pixels corresponding to multiple responses. All edge pixels detected within the support region which are not ambiguous are called unambiguous edge pixels. Among the multiple responses of a detector to an ideal edge, the detected edge closest to the ideal edge belongs to the unambiguous edge. The performance of an edge detector is defined by six types of errors: omission, localization, multiple-response, sensitivity, suppression, and orientation errors. Suppression and orientation errors concern only gradient detectors. A good detector must minimize all of these errors. Definitions of these performance measures are as follows:
· Omission error. This error occurs when the detector fails to find an ideal edge (Fig. 1(b)). The error is measured by dividing the total number of omitted edge pixels by the total number of ideal edge pixels.
· Localization error. This error occurs when the location of the unambiguous edge is different from the location of the ideal edge (Fig. 1(c)). The error is measured by dividing the total distance between unambiguous edge pixels and ideal edge pixels by the total number of unambiguous edge pixels.
· Multiple-response error. This error occurs when multiple edges are detected in the vicinity of an ideal edge (Fig. 1(d)). The error is defined by dividing the total number of ambiguous edge pixels by the total number of unambiguous edge pixels.
· Sensitivity error. This error occurs when the detector localizes edges that do not belong

to the support region of the ideal edge (Fig. 1(e)). The error is defined by dividing the total number of false edge pixels by the total number of edge pixels detected.
· Suppression error. Usually, false-edge suppression is done by a thresholding operation: the edges that have a gradient modulus below a given threshold are suppressed. However, the gradient modulus of an unambiguous edge may be lower than the gradient modulus of a false edge. Suppression errors occur when unambiguous edges are suppressed while false edges persist. Consider the distribution of the gradient modulus of false edges and the distribution of the gradient modulus of unambiguous edges; the suppression error is measured by the overlap between these distributions.
· Orientation error. This error occurs when the estimated orientation of the detected edge is not equal to the given orientation. The error is defined by dividing the sum of the absolute differences between the estimated and given orientations of unambiguous edge pixels by the total number of unambiguous detected edge pixels.
The parameters considered as influencing detector performance relate to the detector properties, the image characteristics, and the performance evaluation method itself. The parameter related to the performance evaluation method is the size of the edge support region. The parameters related to image characteristics concern edge characteristics such as type, sharpness, signal-to-noise ratio and subpixel. The edge types we considered are the step, staircase and pulse (Fig. 2). Fig. 3 presents an example of a synthetic image containing


Fig. 2. Profiles of (a) step; (b) staircase; (c) pulse.

256 × 256 pixels and 256 gray levels used in the evaluation. The image contains five edges, their type being (from left to right) step, ascending staircase, descending staircase, pulse and inverted pulse. The vertical step edge is given by the following equation:

    I(x, y) = (c/2) e^(l(x − Loc_edge))            if x ≤ Loc_edge,
    I(x, y) = c (1 − (1/2) e^(−l(x − Loc_edge)))   if x > Loc_edge,

where c is the contrast, l the sharpness and Loc_edge the location of the edge. This location can be real (subpixel). The staircase and pulse edges are formed by the combination of two steps, I(x, y) + a I(x − D, y), where a < 0 gives a pulse and a > 0 a staircase. To this image, we added white noise of a given standard deviation. The edge detectors used are the gradient of Gaussian (DGG) (Canny, 1983), gradient of Deriche (DGD) (Deriche, 1987), gradient of Shen (DGS) (Shen and Castan, 1992), Laplacian of Gaussian (DLG) (Marr and Hildreth, 1980), and Laplacian of Deriche (DLD). The parameters of these detectors are the scale, the order of the differentiation operator, and the smoothing filter. The scale for all of these detectors is controlled by one parameter. In the cases of DGG and DLG, the scale corresponds to the standard deviation of the Gaussian used as the smoothing filter. For DGD, DLD, and DGS, the scale is equal to 2 divided by the filter parameter (Lacroix, 1990). Thus, it is easy to run all of these detectors at a similar scale. All of these detectors have analytical definitions and they obey the convolution theorem because they can be written as the differentiation of the

Fig. 3. Synthetic image.
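To make the edge model concrete, here is a minimal numerical sketch of the blurred step profile above and of the two-step combination used for staircases and pulses. The function names and default parameter values are our choices, not the authors':

```python
import numpy as np

def step_profile(x, c=100.0, l=2.0, loc=0.0):
    """Blurred step edge of contrast c, sharpness l, located at loc."""
    x = np.asarray(x, dtype=float)
    left = (c / 2.0) * np.exp(l * (x - loc))           # branch for x <= loc
    right = c * (1.0 - 0.5 * np.exp(-l * (x - loc)))   # branch for x > loc
    return np.where(x <= loc, left, right)

def two_step_profile(x, c=100.0, l=2.0, loc=0.0, a=0.5, d=5.0):
    """Combination of two steps, I(x) + a*I(x - d):
    a > 0 gives a staircase, a < 0 a pulse."""
    return step_profile(x, c, l, loc) + a * step_profile(x, c, l, loc + d)

# A noisy 1-D cross-section, analogous to one row of the test image
x = np.arange(-20.0, 20.0, 0.5)
profile = step_profile(x) + np.random.normal(scale=2.0, size=x.shape)
```

The profile is continuous (both branches equal c/2 at loc) and rises monotonically from 0 to the contrast c.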

convolution of the image with the filter. Conceptually, they include three operations: smoothing, differentiation, and false-edge suppression. The importance of the last operation depends on the properties of the two others and on the image characteristics. For gradient detectors, the previously defined suppression criterion takes the false-edge suppression step into account; in other words, false edges can be cleaned easily for detectors having a low value for this criterion. The false-edge suppression step is omitted for Laplacian detectors. Since the five detectors share either the differentiation operators (gradient and Laplacian) or the filters (Gaussian and exponential), by studying their effects on the performance measures it is possible to characterize the behavior of each operation independently of the other. This makes it possible to build an edge detector that fulfills the given requirements by selecting the differentiation operator and the filter. For example, the difference in performance between the Laplacian of Gaussian and the gradient of Gaussian is due to the differentiation operator. Similarly, the difference in performance between the gradient of Deriche and the gradient


of Shen is due to the filter. To reduce the effect of the implementation method, all detectors have been implemented using convolution masks. The evaluation method can be summarized as follows. Recall that in order to obtain performance measures for the edge detectors, we ran them, varying the parameters mentioned above. Parameters that take their values in continuous intervals were sampled. We considered large intervals for the parameters and a small step for the sampling, i.e., subpixel ∈ [0.0, 0.5], sharpness ∈ [1, 10], signal-to-noise ratio ∈ [2, 5], scale ∈ [1, 2.5], and support region size ∈ [3, 11]. Each performance measure is a function of eight discrete variables: differentiation operator, filter, scale, edge type, sharpness, signal-to-noise ratio, subpixel, and size of the support region. There is no efficient way to analyze this function. Thus, to carry out the performance analysis, we chose to reduce the number of variables by using two comparison techniques. Firstly, detector performance measures were compared by computing the correlation between them. Secondly, the mean performance measures were computed by varying the parameters over the entire intervals given above and the detectors were ranked. The two techniques gave similar results. The quantity of data generated by the non-contextual method is overwhelming; only the part of the data related to the mean performance measures is given in this paper. The reader will find the complete results in (Nguyen, 1998).

3. Experimental results

In this section, we present the general observations derived from the results for the non-contextual method. We will start by describing the mean


performance measures, from which the effects of the filters and differentiation operators used are deduced. Then, we will present the effect of the other parameters considered on performance. The mean performance measures are computed separately for each edge type by varying the other parameters. Table 1 presents the experimental results obtained in the case of a step edge. The first number indicates the mean error. To facilitate comparison of the detectors, we have normalized the errors by dividing each of them by the highest; the normalized errors are presented in brackets. For example, in Table 1, for DGS, the omission error is 0.05, the normalized omission error 0.56, the suppression error 0.26, and the normalized suppression error 0.93. Performances for staircase and pulse edges are given in Tables 2 and 3. By analyzing the results obtained, we conclude that:
· The ranking of detectors is the same for the multiple-response, sensitivity, suppression and orientation errors. The sensitivity error of all detectors is comparable. The DGG has the lowest multiple-response and sensitivity errors, while the DLD has the highest. A detector with low multiple-response and sensitivity errors has a high omission error, and vice versa.
· Performance is influenced by the differentiation operator. Laplacian detectors have lower omission errors than their corresponding gradient detectors. However, the latter have lower multiple-response and sensitivity errors than the corresponding Laplacian detectors. This explains why Laplacian detectors are not suitable for noisy or textured images. A Laplacian detector is more suitable for the localization of staircase and pulse edges, whereas a gradient detector is more suitable for the localization of step edges.
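The normalization described above (each error divided by the highest error among the five detectors) can be sketched as follows; the dictionary layout is our choice:

```python
def normalize_errors(errors):
    """Map detector -> error to detector -> (error, normalized error),
    dividing each error by the largest one, as in Tables 1-3."""
    worst = max(errors.values())
    return {d: (e, round(e / worst, 2)) for d, e in errors.items()}

# Omission errors for a step edge (first column of Table 1)
omission = {"DLD": 0.02, "DLG": 0.04, "DGS": 0.05, "DGD": 0.06, "DGG": 0.09}
```

For DGS this reproduces the pair 0.05 (0.56) shown in Table 1.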

Table 1
Mean values of performance measures in the case of a step edge

Omission          Localization      Multiple-response   Sensitivity       Suppression       Orientation
DLD 0.02 (0.22)   DGS 0.70 (0.87)   DGG 1.21 (0.52)     DGG 0.93 (0.97)   DGG 0.18 (0.64)   DGG 31.22 (0.86)
DLG 0.04 (0.44)   DGD 0.73 (0.92)   DGD 1.58 (0.68)     DGD 0.94 (0.98)   DGD 0.21 (0.75)   DGD 33.42 (0.92)
DGS 0.05 (0.56)   DGG 0.77 (0.97)   DGS 1.65 (0.71)     DGS 0.94 (0.98)   DGS 0.26 (0.93)   DGS 35.12 (0.97)
DGD 0.06 (0.67)   DLG 0.78 (0.99)   DLG 1.75 (0.76)     DLG 0.95 (0.99)
DGG 0.09 (1.00)   DLD 0.79 (1.00)   DLD 2.31 (1.00)     DLD 0.96 (1.00)
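Under simplifying assumptions (integer 1-D edge locations and a symmetric support region of a given radius), the pixel-count errors defined in Section 2 can be computed roughly as below. This is our illustrative sketch, not the authors' implementation:

```python
def detection_errors(ideal, detected, radius=2):
    """Illustrative pixel-count errors for 1-D edge locations (integers).
    The support region of an ideal edge pixel p is [p - radius, p + radius]."""
    support = {q for p in ideal for q in range(p - radius, p + radius + 1)}
    in_support = [q for q in detected if q in support]
    # The in-support detection closest to each ideal edge is unambiguous;
    # remaining in-support detections are ambiguous (multiple responses).
    unambiguous = {min(in_support, key=lambda q: abs(q - p)) for p in ideal if in_support}
    ambiguous = set(in_support) - unambiguous
    false = set(detected) - support
    omitted = [p for p in ideal if not any(abs(q - p) <= radius for q in detected)]
    return {
        # omitted ideal pixels / all ideal pixels
        "omission": len(omitted) / len(ideal),
        # total distance of unambiguous pixels to the ideal edge / unambiguous pixels
        "localization": (sum(min(abs(q - p) for p in ideal) for q in unambiguous)
                         / len(unambiguous)) if unambiguous else 0.0,
        # ambiguous pixels / unambiguous pixels
        "multiple_response": len(ambiguous) / len(unambiguous) if unambiguous else 0.0,
        # false pixels / all detected pixels
        "sensitivity": len(false) / len(detected) if detected else 0.0,
    }
```

For example, an ideal edge at 10 detected at 9 and 11 with a spurious response at 20 yields one unambiguous pixel, one ambiguous pixel and one false pixel.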


Table 2
Mean values of performance measures in the case of a staircase edge

Omission          Localization      Multiple-response   Sensitivity       Suppression       Orientation
DLD 0.02 (0.18)   DLD 0.25 (0.36)   DGG 0.74 (0.43)     DGG 0.86 (0.96)   DGG 0.16 (0.57)   DGG 24.27 (0.77)
DLG 0.06 (0.55)   DLG 0.49 (0.70)   DGD 0.97 (0.57)     DGD 0.88 (0.98)   DGD 0.19 (0.68)   DGD 26.85 (0.85)
DGS 0.07 (0.64)   DGS 0.67 (0.96)   DGS 1.08 (0.63)     DGS 0.88 (0.98)   DGS 0.26 (0.93)   DGS 30.37 (0.96)
DGD 0.08 (0.73)   DGG 0.69 (0.99)   DLG 1.38 (0.81)     DLG 0.89 (0.99)
DGG 0.11 (1.00)   DGD 0.70 (1.00)   DLD 1.71 (1.00)     DLD 0.90 (1.00)

Table 3
Mean values of performance measures in the case of a pulse edge

Omission          Localization      Multiple-response   Sensitivity       Suppression       Orientation
DLD 0.02 (0.18)   DLD 0.28 (0.41)   DGG 0.74 (0.47)     DGG 0.86 (0.95)   DGG 0.16 (0.57)   DGG 24.27 (0.77)
DLG 0.03 (0.27)   DLG 0.50 (0.72)   DGD 0.98 (0.62)     DGD 0.88 (0.97)   DGD 0.20 (0.71)   DGD 27.17 (0.86)
DGS 0.07 (0.64)   DGS 0.67 (0.97)   DGS 1.08 (0.68)     DGS 0.88 (0.97)   DGS 0.26 (0.93)   DGS 30.43 (0.96)
DGD 0.08 (0.73)   DGG 0.68 (0.99)   DLG 1.17 (0.74)     DLG 0.89 (0.98)
DGG 0.11 (1.00)   DGD 0.69 (1.00)   DLD 1.59 (1.00)     DLD 0.91 (1.00)
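The suppression error reported in the tables was defined earlier as the overlap between the gradient-modulus distributions of false and unambiguous edges. A histogram-based sketch of that overlap (the binning over a shared range is our assumption):

```python
import numpy as np

def suppression_error(false_moduli, unambiguous_moduli, bins=32):
    """Overlap between the two gradient-modulus distributions, measured as
    the sum over shared bins of the smaller normalized frequency."""
    lo = min(np.min(false_moduli), np.min(unambiguous_moduli))
    hi = max(np.max(false_moduli), np.max(unambiguous_moduli))
    hf, _ = np.histogram(false_moduli, bins=bins, range=(lo, hi))
    hu, _ = np.histogram(unambiguous_moduli, bins=bins, range=(lo, hi))
    pf = hf / hf.sum()
    pu = hu / hu.sum()
    return float(np.minimum(pf, pu).sum())  # 0 = separable, 1 = identical
```

A value near 0 means a single threshold can remove the false edges without losing unambiguous ones; a value near 1 means suppression necessarily sacrifices true edges.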

· Performance is influenced by the filter. The filter that has lower suppression and orientation errors has a higher omission error and lower multiple-response and sensitivity errors. The filter of Shen has the lowest omission error for all the edges, followed by the filters of Deriche and Gauss. For the localization error, the ranking varies according to the scale and the edge type.
We will now deal with the influence of the parameters considered on the performance of a detector. As mentioned above, the quantity of data generated by the non-contextual method is overwhelming. In order to analyze the variations of performance, we decided to define a ``language'' to describe the behavior of the detectors. Fig. 4 shows the increasing curves (see Nguyen, 1998, for the decreasing curves). Fig. 4(a)–(d) present curves that increase linearly. Fig. 4(e) presents a curve

that increases exponentially and Fig. 4(f) presents a curve that increases logarithmically. Borders 1 and 2 show the interval of variation of the error. To complete our language, we needed to add three curves. The first represents quasi-linear measures (Fig. 5(a)) and the second, oscillating measures (Fig. 5(b)). Finally, it is possible that a measure may oscillate between the two borders, that is, it is neither increasing nor decreasing (Fig. 5(c)). Fig. 6 presents an example of detector performance as a function of the signal-to-noise ratio in the case of a step. The results below concern all types of edges, since the behavior of the detectors is the same for step, staircase and pulse edges:
· When the subpixel increases, the omission error oscillates for gradient detectors and increases linearly with oscillations for Laplacian detectors. The localization error increases linearly

Fig. 4. Increasing curves: (a), (b), (c) and (d) are linear; (e) is exponential; (f) is logarithmic.


Fig. 5. Other curves.

Fig. 6. Performance of detectors as a function of the signal-to-noise ratio in the case of a step edge. Results obtained are rounded to two decimals; this explains why some borders are equal (e.g., in the sensitivity column).

with oscillations for all detectors. The multiple-response error oscillates for gradient detectors and decreases linearly for Laplacian detectors. The sensitivity error oscillates for all detectors. The suppression and orientation errors oscillate for all gradient detectors.
· When the sharpness increases, the omission error decreases exponentially for all detectors. The localization and multiple-response errors decrease exponentially for gradient detectors and decrease linearly with oscillations for Laplacian detectors. The sensitivity error decreases linearly with oscillations for all detectors. The suppression and orientation errors decrease exponentially for all gradient detectors.
· When the signal-to-noise ratio increases (Fig. 6), the omission and multiple-response errors

decrease quasi-linearly for all detectors. The localization error decreases quasi-linearly for gradient detectors and decreases linearly with oscillations for Laplacian detectors. The sensitivity error decreases linearly with oscillations for all detectors. The suppression error decreases exponentially for DGG, DGD and DGS. The orientation error decreases quasi-linearly for all gradient detectors.
· When the size of the support region increases, the omission error decreases exponentially and the localization error increases logarithmically for all detectors. The multiple-response error increases linearly for DLG and DLD, increases quasi-linearly for DGD and DGS, and increases exponentially for DGG. The sensitivity error decreases linearly for all detectors.


The suppression error increases quasi-linearly for all detectors, except for DGG, where it increases exponentially. The orientation error increases logarithmically for all detectors. We conclude that the non-contextual evaluation method is sensitive to the size of the support region.
· When the scale increases, the omission error increases logarithmically for gradient detectors and increases linearly for Laplacian detectors. The localization error increases exponentially for gradient detectors and increases logarithmically for Laplacian detectors. The multiple-response error decreases linearly for gradient detectors and decreases exponentially for Laplacian detectors. The sensitivity error decreases linearly for all detectors. The suppression error decreases linearly with oscillations and the orientation error increases logarithmically for DGG, DGD and DGS.

4. Contextual performance evaluation method

This method consists of measuring the performance of the detectors in the context of image reconstruction from edges. Carlsson (1988) proposed an algorithm for image compression from

coded edges. This image-coding algorithm is based on the principle that important features like edges should be coded and reproduced as exactly as possible, and that no spurious features should be introduced in the image reconstruction process. The reconstructed image is smooth and is obtained as the solution to a heat diffusion equation. The drawback is that the decompressed image is degraded, because there is a loss of information during the edge detection process. As we will show, the reconstructed image is influenced by the detector used and the image characteristics. More recently, the Carlsson algorithm has been used to reconstruct images from the representation of edges in scale space (Elder and Zucker, 1998). We are not interested in the image compression process; rather, our primary interest lies in using the reconstructed image to characterize the performance of the edge detector used. The performance evaluation method is applied in two steps. The first consists of obtaining edges by performing an edge detection with a given detector. Fig. 7 presents the eight images used in the evaluation. These images have a size of 256 × 256 pixels and contain 256 gray levels. We also considered different types of edges in order to determine the influence of image characteristics on detector performance (Fig. 7(a)–(c)). The second step consists of reconstructing the

Fig. 7. Synthetic images: (a) step; (b) staircase; (c) pulse. Real images: (d) nuts; (e) glasses; (f) Lena; (g) back; (h) Sherbrooke.


original image from the edge image, using the diffusion process. The edge detectors considered are DGG, DGD, DGS, DLG and DLD. The interval used for the scale is between 0.95 and 5.0. The performance of the detector is defined by the mean square difference between the reconstructed image I_rec and the original I_ori:

    E_quadratic = sqrt( Σ_x Σ_y (I_rec(x, y) − I_ori(x, y))² ) / n,

where n is the image size. When E_quadratic equals 0, the reconstructed image is identical to the original one. The greater the value of E_quadratic, the more degraded the reconstructed image.

5. Experimental results

This section presents the experimental results for the contextual method. To provide the reader with a reference point, Fig. 8 shows an example of a reconstructed image obtained by the five detectors we

Fig. 8. Image reconstruction: (a) and (b) edges obtained by DGS and the reconstructed image; (c) and (d) DGD; (e) and (f) DGG; (g) and (h) DLD; (i) and (j) DLG.


Fig. 9. Mean square errors. Synthetic images: (a) step; (b) staircase; (c) pulse. Real images: (d) nuts; (e) glasses; (f) Lena; (g) back; (h) Sherbrooke.


have used. The scale of all these detectors is two. As we mentioned earlier, the scale of the Deriche and Shen filters is equal to two divided by the filter parameter. Edges used in the reconstruction process have not been cleaned: experimentation showed that the cleaning operation has no effect on the rank of the five detectors, which allowed us to avoid taking the threshold into account in our study. Subjectively, according to the sharpness of the reconstructed images, the five detectors are ranked as follows (Fig. 8): DGS, DGD, DLD, DGG, and DLG. For each image, Fig. 9 presents the mean square difference between the reconstructed image and the original, as a function of the scale. We conclude that:
· The quadratic error depends on the scale. It increases when the scale increases, for all detectors. In fact, at a high scale, there are few detected edges, so the diffusion process, which is iterative, has few edges to start from.
· Performance depends on the edge type. In the case of a step edge, the ranking for a small scale ∈ [1.0, 1.6] is DGG, DGD, DGS, DLG, and DLD (see Fig. 9(a)). The ranking for a larger scale ∈ [1.6, 5.0] is DGS, DGD, DLD, DGG, DLG. At a high scale, a gradient detector produces a lower error than the corresponding Laplacian detector. In the case of staircase and pulse edges, the ranking for a small scale ∈ [1.0, 1.6] is DGG, DGD, DGS, DLG and DLD (see Fig. 9(b) and (c)); we noticed that this ranking is similar to the one for a step edge. The ranking for a larger scale ∈ [1.6, 5.0] is DGS, DLD, DLG, DGG. In this case, a Laplacian detector has a lower error than the corresponding gradient detector.
· For Fig. 9(d) and (e), a gradient detector has a lower error than a Laplacian one. For Fig. 9(f) and (g), a gradient detector has a lower error than the corresponding Laplacian detector. For Fig. 9(h), a gradient detector has a lower error than the corresponding Laplacian detector, except for the Gaussian detectors. We conclude that performance is influenced by the differentiation operator.
· Performance is influenced by the filter. The filter of Shen gives the best results, followed by the filters of Deriche and Gauss.
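The contextual pipeline can be sketched as a clamped heat-diffusion reconstruction followed by the E_quadratic measure. This is a crude stand-in for Carlsson's coder: edge pixels keep their gray levels, and every other pixel is iteratively replaced by the mean of its four neighbours (toroidal boundary for simplicity). The function names and iteration count are our choices:

```python
import numpy as np

def reconstruct(image, edge_mask, iterations=500):
    """Fill non-edge pixels by iterated 4-neighbour averaging (heat diffusion),
    keeping the gray levels at edge pixels fixed. np.roll wraps the borders."""
    rec = np.where(edge_mask, image, image.mean()).astype(float)
    for _ in range(iterations):
        avg = 0.25 * (np.roll(rec, 1, 0) + np.roll(rec, -1, 0) +
                      np.roll(rec, 1, 1) + np.roll(rec, -1, 1))
        rec = np.where(edge_mask, image, avg)  # clamp edge pixels
    return rec

def e_quadratic(original, reconstructed):
    """Mean square difference used as the contextual performance measure."""
    n = original.size
    return np.sqrt(np.sum((reconstructed - original) ** 2)) / n
```

Running the five detectors at the same scale, reconstructing from each edge map, and comparing e_quadratic values reproduces the kind of ranking shown in Fig. 9.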


6. Conclusion

In this paper, we have presented two methods for measuring the performance of edge detectors. The first one is non-contextual and evaluates the performance of edge detectors in terms of detection errors. The main features of this method are:
· The detection errors include classical errors (of omission, localization, multiple responses, sensitivity, and orientation) as well as a new error related to false-edge suppression.
· The influence of image characteristics and of the properties of detectors on their performance is determined. The image characteristics used are the edge type, subpixel, sharpness, and signal-to-noise ratio. The detector parameters are the scale, the order of the differentiation operator, and the filter. A last parameter, the size of the ideal-edge support region, is used to measure the performance of the detectors.
Most quantitative evaluation methods are non-contextual. However, these methods do not completely characterize the performance of an edge detector. It is important to take into account the subsequent use of the detector, to know whether it satisfies the requirements of a particular application. This is why we proposed a second evaluation method, which evaluates the performance of edge detectors in the context of image reconstruction. It involves measuring the performance of an edge detector according to a mean square difference between the reconstructed image and the original one. In both methods, we studied the influence of the image characteristics and detector properties on detector performance. The results of this study will be helpful in selecting an edge detector for a given application. However, several improvements can be made in order to make these performance evaluation methods more complete. These include defining a better synthesis of the experimental results and considering other types of edges.

References

Abdou, I.E., 1978. Quantitative methods of edge detection. Technical Report No. 830, Image Processing Institute, University of Southern California.


Bowyer, K.W., Phillips, P.J., 1998. Empirical Evaluation Techniques in Computer Vision. IEEE Computer Society Press, Los Alamitos, CA.
Canny, J.F., 1983. Finding edges and lines in images. Technical Report No. 720, Massachusetts Institute of Technology.
Carlsson, S., 1988. Sketch based coding of grey level images. Signal Process. 15 (1), 57–83.
Cho, K., Meer, P., Cabrera, J., 1997. Performance assessment through bootstrap. IEEE Trans. Pattern Anal. Machine Intell. 19, 1185–1198.
Deriche, R., 1987. Using Canny's criteria to derive a recursively implemented optimal edge detector. Internat. J. Comput. Vision 1 (2), 167–187.
Dougherty, S., Bowyer, K.W., Kranenburg, C., 1998. ROC curve evaluation of edge detector performance. In: Proc. IEEE Internat. Conf. Image Process.
Elder, J.H., Zucker, S.W., 1998. Local scale control for edge detection and blur estimation. IEEE Trans. Pattern Anal. Machine Intell. 20 (7), 699–716.
Heath, M.D., Sarkar, S., Sanocki, T., Bowyer, K.W., 1997. A robust visual method for assessing the relative performance of edge-detection algorithms. IEEE Trans. Pattern Anal. Machine Intell. 19 (12), 1338–1359.
Kanungo, T., Jaisimha, M.Y., Palmer, J., Haralick, R.M., 1995. A methodology for quantitative performance evaluation of detection algorithms. IEEE Trans. Image Process. 4 (12), 1667–1674.
Kitchen, L., Rosenfeld, A., 1981. Edge evaluation using local edge coherence. IEEE Trans. Systems Man Cybernet. SMC-11 (9), 597–605.
Kitchen, L.J., Venkatesh, S., 1992. Edge evaluation using necessary components. CVGIP: Graphical Models and Image Processing 54 (1), 23–30.
Lacroix, V., 1990. Edge detection: what about rotation invariance? Pattern Recognition Letters 11, 797–802.
Marr, D., Hildreth, E.C., 1980. Theory of edge detection. In: Proc. Roy. Soc. London B 207, pp. 187–217.
Nair, D., Mitiche, A., Aggarwal, J.K., 1995. On comparing the performance of object recognition systems. In: Internat. Conf. Image Process., pp. 631–634.
Nguyen, T.B., 1998. Évaluation des algorithmes d'extraction de contours dans des images à niveaux de gris. Mémoire de Maîtrise, Université de Sherbrooke.
Pratt, W.K., 1991. Digital Image Processing, second ed. Wiley-Interscience, New York.
Sanocki, T., Bowyer, K.W., Heath, M.D., Sarkar, S., 1998. Are edges sufficient for object recognition? J. Exp. Psychol. 24 (1), 340–349.
Shen, J., Castan, S., 1992. An optimal linear operator for edge detection. CVGIP: Graphical Models and Image Processing 54 (2), 122–133.
Shin, M.C., Goldgof, D., Bowyer, K.W., 1998. An objective comparison methodology of edge detection algorithms using a structure from motion task. In: Bowyer, K.W., Phillips, P.J. (Eds.), Empirical Evaluation Techniques in Computer Vision. IEEE Computer Society Press, Los Alamitos, CA.
Ziou, D., Koukam, A., 1998. Knowledge-based assistant for the selection of edge detectors. Pattern Recognition 31 (5), 587–596.
Ziou, D., Tabbone, S., 1998. Edge detection techniques – an overview. Internat. J. Pattern Recognition Image Anal. 8, 537–559.