Image and Vision Computing 23 (2005) 661–669 www.elsevier.com/locate/imavis

Robust watershed segmentation using wavelets

Cláudio Rosito Jung^a, Jacob Scharcanski^b,*

^a UNISINOS—Universidade do Vale do Rio dos Sinos, Ciências Exatas e Tecnológicas, Av. UNISINOS, 950. São Leopoldo, RS 93022-000, Brazil
^b UFRGS—Universidade Federal do Rio Grande do Sul, Instituto de Informática, Av. Bento Gonçalves, 9500. Porto Alegre, RS 91501-970, Brazil

Received 6 February 2004; received in revised form 8 November 2004; accepted 30 March 2005

Abstract

The use of watersheds in image segmentation relies mostly on a good estimation of image gradients. However, background noise tends to produce spurious gradients, causing over-segmentation and degrading the result of the watershed transform. Also, low-contrast edges generate small magnitude gradients, causing distinct regions to be erroneously merged. In this paper, a new technique is presented to improve the robustness of the segmentation using watersheds, which attenuates the over-segmentation problem. A redundant wavelet transform is used to de-noise the image, enhance edges in multiple resolutions, and obtain an enhanced version of image gradients. Then, the watershed transform is applied to the obtained gradient image, and the segmented regions that do not satisfy specific criteria are removed or merged. Applications of our segmentation approach to noisy and/or blurred images are discussed, emphasizing a case study in fingerprint segmentation.
© 2005 Elsevier B.V. All rights reserved.

Keywords: Image de-noising; Image enhancement; Watersheds; Segmentation; Wavelets

1. Introduction

Image segmentation is a challenging task in image analysis, which consists of partitioning an image into distinct regions, generally corresponding to meaningful objects in a scene. In particular, the watershed transform is a well-known image segmentation approach [1–3], which is based on the following morphological principles. If we regard a grayscale image as a topographic relief, the gray value at a given location represents the elevation at that point. If this relief is to be flooded, starting at the surface global minima, the water would fill up lower elevation points first, and then the water level would increase. At locations where water coming from different minima would meet, a 'dam' is built. Finally, when the whole surface is flooded, each minimum becomes completely surrounded by 'dams' (i.e. the watersheds). These watersheds delimit the segmented regions, which are the minima catchment basins.
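The flooding analogy above can be illustrated with a small sketch. This is a hypothetical example using scikit-image's `watershed`, which implements this immersion principle; the toy image and marker positions are invented for illustration:

```python
# Minimal watershed illustration of the flooding analogy: two dark
# "basins" on a bright background, flooded from three seed points.
import numpy as np
from skimage.filters import sobel
from skimage.segmentation import watershed

img = np.full((64, 64), 200.0)
img[10:30, 10:30] = 50.0
img[35:55, 35:55] = 80.0

gradient = sobel(img)                      # topographic relief to be flooded
markers = np.zeros_like(img, dtype=int)    # seeds at the minima
markers[20, 20] = 1
markers[45, 45] = 2
markers[0, 0] = 3                          # background seed
labels = watershed(gradient, markers)      # dams form where floods meet
print(np.unique(labels))                   # three catchment basins
```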

* Corresponding author. Tel.: +55 3316 7128; fax: +55 3316 7308. E-mail addresses: [email protected] (C.R. Jung), jacobs@inf.ufrgs.br (J. Scharcanski).

0262-8856/$ - see front matter © 2005 Elsevier B.V. All rights reserved. doi:10.1016/j.imavis.2005.03.001

If this process is applied to a gradient image, where each pixel value is the local gradient modulus, then the watersheds correspond to the gradient image crest lines. In this case, the catchment basins correspond to the segmented image objects. Unfortunately, images are inherently noisy, and contain graylevel fluctuations that generate spurious gradients. Such gradients generate spurious watersheds, which are the main cause of over-segmentation, a known limitation of the watershed segmentation approach. Therefore, the application of watershed segmentation requires a robust image gradient computation technique. Usually, thresholding methods are not sufficient to eliminate the gradients associated with noise, especially in images where the amount of noise is large and/or low-contrast edges occur. Some methods have been proposed in the literature to simplify an image by removing small details before applying the watershed transform. Meyer [4] introduced the levelings approach, which consists of applying morphological filters to remove small details of the image. This approach works well for images with small amounts of noise, but has limitations when the noise is intense and/or when low-contrast edges occur. Haris et al. [5] proposed an edge-preserving statistical noise reduction approach as a pre-processing stage for the watershed transform, and a hierarchical merging process


as a post-processing stage. The results obtained with their approach are satisfactory, but edge enhancement is not explored, allowing regions with weak borders to be erroneously merged. Weickert [6] proposed partial differential equations for image de-noising or edge enhancement, but did not show how to combine de-noising and edge enhancement. Kim and Kim [7,8] proposed a wavelet-based watershed image segmentation technique that reduces the over-segmentation problem and provides noise suppression, but allows low-contrast regions to be erroneously merged. Nguyen et al. [9] proposed a combination of energy-based segmentation and watersheds, called watersnakes. This technique is useful for removing wrong limbs attached to the objects of interest, but does not address the issue of noisy image segmentation.

In our approach, a relevant dyadic scale 2^J is selected by the user (later, the gradient magnitudes will be computed at this scale). A wavelet-based multi-scale technique is then applied to de-noise and enhance image edges, using a wavelet decomposition of the image in Jmax = J + 1 levels. Then, gradients of the de-noised and enhanced image are estimated at scale 2^J using the wavelet transform, and the watershed transform is applied to the gradient magnitude image. A post-processing stage may be applied to merge some remaining small over-segmented regions, which usually have weak borders and were erroneously sub-divided. Later, we discuss the application of our method to images containing artificial and inherent noise. Among different applications, we discuss in detail the problem of fingerprint segmentation. In this case, the foreground is understood as the image parts that originated from the contact of a fingertip with the sensor. Such segmentation is crucial for automatic fingerprint classification.
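The overall pipeline just outlined (de-noise/enhance, estimate gradient magnitudes, threshold, watershed) can be sketched at a high level as follows. This is a simplified stand-in, not the paper's method: the wavelet de-noising/enhancement stage is replaced here by Gaussian smoothing, and the names and parameters are illustrative only:

```python
# High-level sketch of the segmentation pipeline; the smoothing step
# stands in for the paper's wavelet de-noising and edge enhancement.
import numpy as np
from scipy import ndimage as ndi
from skimage.segmentation import watershed

def segment(img, sigma=2.0, k=0.2):
    smoothed = ndi.gaussian_filter(img, sigma)   # stand-in for wavelet stage
    gy, gx = np.gradient(smoothed)
    mag = np.hypot(gx, gy)                       # gradient magnitude image
    mag[mag < k * mag.max()] = 0                 # suppress spurious gradients
    markers, _ = ndi.label(mag == 0)             # seeds in flat regions
    return watershed(mag, markers)               # flood the gradient relief

img = np.zeros((64, 64)); img[16:48, 16:48] = 100.0
labels = segment(img)
print(len(np.unique(labels)))
```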

2. Pre-processing

To improve the quality of the watershed segmentation, we need to obtain an image with well-defined borders, avoiding situations where distinct regions are erroneously merged because of gaps in their boundaries. Also, this image should contain little noise contamination, avoiding over-segmentation caused by (false) noisy edges. In a previous work [10], we proposed a multi-resolution de-noising technique based on the wavelet transform. In the present work, we extend that approach to include adaptive edge enhancement, such that edges and background noise are discriminated more easily. Next, our previous work on image de-noising is briefly described, and the proposed enhancement function is introduced.

2.1. Wavelet shrinkage

As in our previous work [10], we use a discrete image wavelet decomposition which is non-decimated, has only

two detail images (horizontal and vertical), and relies on the same mother wavelet proposed by Mallat and Zhong [11]. As a result, detail images W^1_{2^j} f[n,m] and W^2_{2^j} f[n,m], as well as smoothed images S_{2^j} f[n,m], are obtained at each scale 2^j, for j = 1, ..., Jmax. In the present work, a scale 2^J is selected for image gradient estimation, and we define Jmax = J + 1. Since the mother wavelet has approximately the shape of the derivative of a Gaussian, the detail images W^1_{2^j} f and W^2_{2^j} f are considered as local differences along the x and y directions, providing good approximations for the local image gradients at scale 2^j [12]. Consequently, edge magnitudes at scale 2^j are calculated based on these local image gradients, as follows [11]:

M_{2^j} f = \sqrt{ (W^1_{2^j} f)^2 + (W^2_{2^j} f)^2 }.    (1)

For each scale, we estimate a non-negative non-decreasing shrinkage function g_j(x), with 0 <= g_j(x) <= 1, and the wavelet coefficients W^1_{2^j} f and W^2_{2^j} f are updated according to the following rule:

NW^i_{2^j} f[n,m] = W^i_{2^j} f[n,m] g_j[n,m],    i = 1, 2,    (2)

where g_j[n,m] = g_j(M_{2^j} f[n,m]) is a shrinkage factor. To preserve edges during the noise removal process, the shrinkage factors should be close to 1 near edges, and close to 0 in homogeneous regions. Details on the calculation of the shrinkage function g_j(x) are provided in [10].

2.2. Edge enhancement

In this work, we extend the approach described above to enhance edges in noisy images. The factors g_j[n,m] assume values between 0 and 1, and are used for wavelet shrinkage in image de-noising. However, for edge enhancement purposes, we allow these shrinkage factors to be greater than 1. Therefore, when the inverse transform is applied, wavelet coefficients corresponding to edges are amplified, and relevant image edges are enhanced as well. We introduce a monotonically increasing edge enhancement function h_j : [0,1] -> [0,+inf), which is used for updating the shrinkage factors g_j[n,m] according to:

g_j^enh[n,m] = h_j(g_j[n,m]).    (3)

These functions h_j should enhance shrinkage factors differently, taking into account their magnitudes and scale. Independent of scale, small shrinkage factors, which are usually associated with small coefficient magnitudes, should be de-emphasized; at the same time, larger shrinkage factors, usually associated with relevant image edges, should be emphasized. Also, maximum enhancement should occur at the selected scale 2^J, where gradient magnitudes are estimated. Our choice for h_j is:

h_j(v) = \beta_j v^2,    (4)


where β_j is a scale-dependent parameter. It should be noticed that the quadratic function has the property of providing extra de-noising for small shrinkage factors, because h_j(v) <= v for 0 <= v <= 1/β_j, at any scale 2^j. This property helps reduce residual noise. On the other hand, h_j(v) > v for 1/β_j < v <= 1, at any scale 2^j, enhancing coefficients with larger shrinkage factors. The parameter β_j defines a Gaussian window in the scale domain, and is chosen to provide maximum enhancement at scale 2^J, where gradient magnitudes will be estimated:

β_j = β_max w(j),    (5)

where w(j) = e^{-(j-J)^2/σ^2}, σ^2 controls the window aperture in the scale domain, and β_max defines the maximum enhancement allowed at scale 2^J. It should be noted that σ^2 controls the attenuation of β_j with respect to the reference scale 2^J, and larger values of σ^2 provide smaller β_j attenuation in the scale domain (i.e. less dominance of scale 2^J). Also, the parameter β_max controls the local contrast at boundaries. In order to enhance low-contrast image boundaries, it is recommended to use larger β_max values. However, β_max should be decreased in images containing large amounts of noise, so that residual noise is not enhanced along with the image boundaries. Therefore, our approach offers three control parameters, namely: (a) the scale 2^J, selected based on the size of the objects of interest; (b) the scale-domain window aperture σ^2, chosen to provide enhancement of scales neighboring 2^J, so that a range of object sizes is also emphasized; and (c) the parameter β_max, which determines the maximum local contrast at image boundaries. Although these parameters are image dependent, our experimental results show that a good gradient estimate for a variety of images can be obtained using σ^2 = 4 and β_max = 7. The selection of the scale 2^J will be discussed in Section 5.2.

The final step of the pre-processing stage is to compute updated wavelet coefficients NW^i_{2^j} f[n,m] through Eq. (2), using g_j^enh[n,m] instead of g_j[n,m], and to apply the inverse wavelet transform to obtain the de-noised/enhanced image.
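Eqs. (1)–(5) can be sketched numerically as follows. Note that the sigmoid stand-in for the data-driven shrinkage function g_j of [10], and its parameters m0 and slope, are assumptions made here purely for illustration:

```python
# Numerical sketch of the shrinkage/enhancement rules (Eqs. (1)-(5)),
# using the paper's defaults beta_max = 7 and sigma^2 = 4.
import numpy as np

def beta(j, J, beta_max=7.0, sigma2=4.0):
    # Eq. (5): Gaussian window in the scale domain, maximum at j = J
    return beta_max * np.exp(-((j - J) ** 2) / sigma2)

def enhanced_coefficients(W1, W2, j, J, m0=10.0, slope=1.0):
    M = np.sqrt(W1 ** 2 + W2 ** 2)               # Eq. (1): edge magnitudes
    g = 1.0 / (1.0 + np.exp(-slope * (M - m0)))  # stand-in shrinkage factors in (0, 1)
    g_enh = beta(j, J) * g ** 2                  # Eqs. (3)-(4): h_j(g) = beta_j * g^2
    return W1 * g_enh, W2 * g_enh                # Eq. (2) with g_enh in place of g_j

# Noise-like (small) vs. edge-like (large) coefficients at the reference scale
W1 = np.array([0.5, 20.0]); W2 = np.array([0.5, 15.0])
NW1, NW2 = enhanced_coefficients(W1, W2, j=3, J=3)
# The small coefficient is attenuated; the large one is amplified (beta_3 = 7)
```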

3. Computing the watershed transform

After applying the technique described above, we obtain a de-noised image with enhanced edges (with maximum


enhancement at scale 2^J). The wavelet transform is recomputed for the enhanced image, and the detail images WW^1_{2^J} f[n,m], WW^2_{2^J} f[n,m] are used to obtain the enhanced gradient magnitudes M^enh at scale 2^J:

M^enh = \sqrt{ (WW^1_{2^J} f)^2 + (WW^2_{2^J} f)^2 }.    (6)

Even after the pre-processing stage, some spurious gradients still remain in the image. To remove these undesired small-magnitude gradients, a threshold T is applied to the gradient image, and coefficient values with M^enh smaller than T are set to zero. The threshold T is selected as a standard value, such as T = k max(M^enh), where max(M^enh) is the maximum of the gradient magnitudes, and 0 < k < 1 is a constant (we used k = 0.2 in this work).

For example, let us consider the bacteria image shown in Fig. 1(a). It should be noticed that background noise is intense, and some edges are fuzzy. Fig. 1(b) shows the de-noising and edge enhancement of the bacteria image, using Jmax = 3. The enhanced magnitudes M^enh of the bacteria image (after thresholding) are shown in Fig. 1(c). The watersheds of M^enh are then computed, and the segmented image is obtained. Segmentation results for the bacteria image are shown in Fig. 1(d). It can be seen that all bacteria were segmented from the background, but some small spurious regions also appeared in the segmented image. Therefore, we use a post-processing stage to further reduce over-segmentation, which is described next.
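The thresholding rule above can be sketched as follows (a minimal example; the toy magnitude values are invented):

```python
# Zero out gradient magnitudes below T = k * max(M_enh), with k = 0.2
# as used in the paper.
import numpy as np

def threshold_gradients(M_enh, k=0.2):
    T = k * M_enh.max()
    out = M_enh.copy()
    out[out < T] = 0.0
    return out

M = np.array([[1.0, 3.0], [10.0, 0.5]])
out = threshold_gradients(M)   # T = 2.0, so the entries 1.0 and 0.5 are zeroed
print(out)
```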

4. Post-processing

There are several approaches for merging watershed regions and obtaining larger, more meaningful image segments. For example, Weickert [6] used the contrast difference between adjacent regions as a merging criterion. Haris et al. [5] used a hierarchical merging process as a post-processing stage. Other authors [8,13] used markers during watershed segmentation. In this work, we studied two criteria to merge regions: minimum valid region size, and minimum edge intensity separating adjacent regions. Often, within the context of an application, realistic objects in an image exist within a range of sizes. Then, it is possible to impose a minimum valid

Fig. 1. (a) Noisy bacteria image. (b) De-noised/enhanced image. (c) Gradients of the de-noised/enhanced image. (d) Watersheds computed using gradient image.


Fig. 2. (a) Segmentation using our proposed technique with post-processing. (b) Segmentation using Gaussian filtering and Prewitt operator. (c) Segmentation using morphological filtering and gradient. (d) Segmentation using Edge Flow.

object area for segmented regions, which is denoted by T_A. If a certain region has an area smaller than T_A, then its borders with the neighboring regions are searched, and this region is merged with the neighboring region with which it shares the widest border. Also, if the border between two adjacent regions is weak (i.e. has low contrast), then such regions are merged. To decide if a border is strong enough to keep two regions apart, a hysteresis approach similar to Canny's [12] is used. Two thresholds T1 < T2 are chosen, and a particular border is kept intact if all the gradients along this border are greater than T1, and at least a fraction p of the gradients are greater than T2. Otherwise, the border is removed, and the two regions are merged. The two thresholds T1 and T2 define an interval which can accommodate a range of inhomogeneous gradients. Within our framework, it is also possible to increase gradient magnitudes by adjusting β_max. In some situations, this resource could facilitate handling inhomogeneous gradients with the two thresholds T1 and T2.

It is clear that the threshold T_A is inherently application dependent, because the minimum object area can vary significantly across applications (for instance, T_A should be smaller for the bacteria image segmentation, and larger for isolating fingerprint images). The values T1, T2 and p depend on the strength of the edges between adjacent regions. To preserve weaker borders, these values should be smaller. On the other hand, to preserve only borders with higher contrast, these threshold values should be larger. In all experiments conducted in this work, we used T1 and T2 as 25 and 40% of the maximum gradient, respectively, and p = 0.5. As an illustration of the efficiency of our post-processing technique, the final segmentation of the bacteria image is displayed in Fig. 2(a). In this example, we used T_A = 50, so that regions with an area smaller than 50 pixels were merged.
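The hysteresis border test above can be sketched as follows, with T1 and T2 taken as 25 and 40% of the maximum gradient and p = 0.5 as in the experiments (the example gradient values are invented):

```python
# A border survives only if all its gradients exceed T1 and at least a
# fraction p of them exceed T2 (T1 < T2).
import numpy as np

def border_is_strong(border_gradients, g_max, p=0.5):
    g = np.asarray(border_gradients, dtype=float)
    T1, T2 = 0.25 * g_max, 0.40 * g_max
    return bool(np.all(g > T1) and np.mean(g > T2) >= p)

g_max = 100.0
print(border_is_strong([30, 50, 45, 60], g_max))  # True: all > 25, and 3/4 > 40
print(border_is_strong([20, 50, 45, 60], g_max))  # False: 20 fails T1
```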

5. Experimental results

In this section, we compare the proposed technique with two other pre-processing techniques known to improve the watershed segmentation: namely, a Gaussian kernel followed by the Prewitt operator [14,15], and

the morphological noise-reduction OCCO filter followed by the morphological gradient [16,2]. We also compare our technique with the Edge Flow segmentation method [17], which is not based on watersheds. In another experiment, we analyze the influence of the scale 2^J on the segmentation of a synthetic image containing several squares with different sizes, contrasts and noise contamination levels. Finally, we discuss the segmentation of fingerprints with the proposed technique as a case study.

5.1. Comparison with other techniques

The watersheds obtained using the Prewitt operator and the morphological filter to estimate the gradients of the bacteria image are shown in Fig. 2(b) and (c). Fig. 2(d) shows the segmentation result obtained by applying the Edge Flow method. Compared with Fig. 2(a), it is clear that our proposed technique produces a much more accurate segmented image.

In order to better illustrate our approach, we applied our technique to real low-contrast and/or noisy images. Results were analyzed from both qualitative and quantitative points of view. Qualitative analysis was performed through visual inspection. The chosen quantitative metric is a percentage error that takes into account the difference between the segmented image and the ground truth. More specifically, if B_g represents a binary image of the ground truth and B_s represents a binary image of the segmented result, the error measure is given by

E = #((B_g ∪ B_s) − (B_g ∩ B_s)) / #B_g,    (7)

where the cardinality symbol # represents the number of pixels of the corresponding binary image.

For example, Fig. 3(a) shows a computerized tomographic (CT) image of the human spinal column, which is inherently noisy and presents low contrast. In this image, brighter regions are bones, which are the goal of the segmentation process. Fig. 3(b) shows the enhanced image using Jmax = 4, and the corresponding magnitudes are illustrated in Fig. 3(c). The results of the proposed technique without and with post-processing (using T_A = 150) are shown in Fig. 3(d) and (e), respectively.
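The error measure of Eq. (7) can be computed directly from binary masks, since the union minus the intersection is the symmetric difference (XOR). A minimal sketch, with invented toy masks:

```python
# Eq. (7): symmetric difference between ground-truth and segmented
# masks, normalised by the ground-truth area.
import numpy as np

def segmentation_error(Bg, Bs):
    Bg = Bg.astype(bool); Bs = Bs.astype(bool)
    return np.count_nonzero(Bg ^ Bs) / np.count_nonzero(Bg)

Bg = np.zeros((10, 10), bool); Bg[2:8, 2:8] = True   # 36 ground-truth pixels
Bs = np.zeros((10, 10), bool); Bs[3:8, 2:8] = True   # misses one row (6 pixels)
err = segmentation_error(Bg, Bs)
print(err)   # 6/36 ~ 0.167
```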


Fig. 3. (a) CT image of the column. (b) Enhanced image. (c) Magnitudes. (d) Result of the proposed method without post-processing. (e) Result of the proposed method with post-processing. (f) Segmentation using Gaussian filtering and the Prewitt operator to estimate the gradient image. (g) Segmentation using morphological filtering and gradient. (h) Segmentation using Edge Flow.

Fig. 4. (a) Noisy industrial image. (b) Segmentation using our proposed technique without post-processing. (c) Segmentation using Gaussian filtering and Prewitt operator. (d) Segmentation using morphological filtering and gradient. (e) Segmentation using Edge Flow.

Segmentation results using the Prewitt operator and the morphological operator to estimate image gradients are shown in Fig. 3(f) and (g). Finally, the result of Edge Flow is shown in Fig. 3(h). Visual inspection shows that the proposed technique produced the best result. A quantitative analysis (according to Eq. (7)) indicates that the segmentation errors corresponding to Fig. 3(e)–(h) are, respectively, 14.57, 75.06, 75.84 and 21.55% (ground truth was obtained by manual delineation). The proposed technique produced the smallest segmentation error; its residual error is mostly due to under-segmentation of the low-contrast bone at the right of the image. It should be noticed that accurate measurements in medical images are often obtained with the help of interactive segmentation methods (e.g. deformable contours), which require human intervention in the initialization step. Our method could provide an initial estimate to initialize such segmentation techniques.

Another experiment is shown in Fig. 4. Fig. 4(a) illustrates a noisy industrial image, and Fig. 4(b)–(e) shows segmentation results using our method (without post-processing), the Prewitt operator, the morphological operator and Edge Flow, respectively. Our method correctly segmented all objects in the image, and post-processing was not necessary. All other techniques showed deficiencies: using the Prewitt operator resulted in over-segmentation; using the morphological gradient and Edge Flow resulted in under-segmentation. The quantitative analysis applied to Fig. 4(b)–(e) yields segmentation errors of 20.47, 27.50, 55.42 and 64.87%, respectively. Again, the proposed technique produced the smallest error.¹

5.2. Influence of the selected scale 2^J

Let us consider the synthetic image shown in Fig. 5(a). This image has a resolution of 256×512 pixels, and contains several squares with different sizes and background contrasts. Segmentation results using J = 2 and J = 3 for estimating gradient magnitudes are shown in Fig. 5(b) and (c), respectively.² In these images no post-processing was

Although 20.47% may appear to be a relatively high segmentation error, it is mostly due to border pixels related to watershed lines. 2 We did not show the results for the finest resolution JZ1 because this scale is very sensitive to noise and/or image artifacts, and produces very poor segmentation results.


Fig. 5. (a) Synthetic image containing squares with varying sizes and contrast. (b) Segmentation results using JZ2. (c) Segmentation results using JZ3.

used, and the interior of each segmented region was replaced by the average graylevel of the original image for visualization purposes. It can be observed that a lower resolution 2^J produces a 'rounding' effect at the corners of the squares. Also, small low-contrast objects may be lost, such as the small square at the bottom-right of the image.

A noisy version (PSNR³ = 18.92 dB) of the synthetic image is shown in Fig. 6(a), and segmentation results using J = 2 and J = 3 are illustrated in Fig. 6(b) and (c), respectively. All squares with very low contrast were missed by the segmentation procedure at both scales (in fact, they are barely visible). Selecting J = 2 allows all other clearly visible squares to be detected with good border definition; choosing J = 3 results in one small square being missed in the segmentation, and in the rounding effect on the other squares.

An even noisier version (PSNR = 12.49 dB) of the synthetic image is shown in Fig. 7(a), and segmentation

³ The peak-to-peak signal-to-noise ratio is defined as PSNR = 20 log₁₀(255/σ_noise), where σ_noise is the standard deviation of the noise corrupting the image.
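This PSNR definition can be checked numerically. In the sketch below, the σ value is chosen here (as an assumption) to roughly reproduce the 18.92 dB figure quoted for Fig. 6:

```python
# Peak-to-peak PSNR for an 8-bit image from the noise standard deviation.
import numpy as np

def psnr_from_sigma(sigma_noise):
    return 20.0 * np.log10(255.0 / sigma_noise)

print(psnr_from_sigma(28.9))   # roughly the 18.92 dB of the noisy example
```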

Fig. 6. (a) Noisy synthetic image (PSNR = 18.92 dB). (b) Segmentation results using J = 2. (c) Segmentation results using J = 3.

results using J = 2 and J = 3 are illustrated in Fig. 7(b) and (c), respectively. In this example, noise corruption is very high, and our pre-processing technique cannot make a clear distinction between noise and edges. Nevertheless, structures with higher contrast are still detected. We can also observe that even low-contrast objects can be detected using J = 3, if they are sufficiently large.

In general, larger J values result in a more intense 'rounding' effect on the detected object shapes, and in small objects being missed in the segmentation; however, larger objects can be detected under more intense noise contamination. On the other hand, smaller J values produce more accurate contours and allow the detection of smaller structures, but cause detection errors if noise corruption is high.

Table 1 presents a quantitative analysis of the segmentation results obtained for the noisy squares images, considering all segmented pixels and the ground truth. For clean images, selecting J = 2 produces a slightly better result than J = 3, because fewer objects are missed by the segmentation process. To illustrate this behavior in more detail, let us perform a local analysis considering two specific squares of the images.


Let us first concentrate on a square with relatively small size and contrast, under the different noise contamination levels, namely the one located on the third row (from top to bottom) and third column (from left to right) of Fig. 5(a). Segmentation errors for this square using J = 2 are, respectively, 9.07, 9.30 and 100%. Analogously, results for J = 3 are 9.12, 10.20 and 16.55%, respectively. Now, let us concentrate on the square on the second row (from top to bottom) and third column (from left to right), which has a larger size but contrast similar to the previous case. Segmentation errors using J = 2 are, respectively, 4.82, 4.88 and 100%. Results for J = 3 are 4.85, 5.41 and 8.86%, respectively. Therefore, our general guidelines for the selection of J are the following:

† For larger objects, J should be larger.
† For higher noise contamination, J should be larger.
† For smaller objects, J should be smaller.
† For better contour definition of segmented objects, J should be smaller.

5.3. Fingerprint segmentation

Fig. 7. (a) Noisier synthetic image (PSNR = 12.49 dB). (b) Segmentation results using J = 2. (c) Segmentation results using J = 3.

Table 1
Segmentation errors (in %) for the squares images shown in Figs. 5–7

                                   Noisy squares image (PSNR)
                                   Clean      18.92 dB    12.49 dB
Segmentation error for J = 2       3.56       27.70       52.09
Segmentation error for J = 3       3.86       37.73       36.69


An important step in an automatic fingerprint recognition system is the segmentation of fingerprints in images [18–20]. Basically, a captured fingerprint image consists of two regions: foreground and background. The first corresponds to the contact of the fingertip with the sensor, while the second is the surrounding neighborhood (which often is noisy). We tested a version of our segmentation approach on several fingerprint images of database DB3, obtained from the Fingerprint Verification Competition (FVC 2002) [21]. This database contains 80 fingerprint images obtained with a capacitive sensor (with 300×300 pixels, and 500 dpi resolution). In such images, the regions formed by the contact of a fingertip with the sensor appear as dark pixels. Thus, the region of interest (i.e. the foreground) is formed by dark valleys separated by ridges (due to the fingertip saliencies), while the background typically is brighter (and noisier).

Our first step to segment these images is to increase the thickness of the dark valleys in the foreground (in order to increase the average contrast between the foreground and background). This can be achieved by eroding the original image with a 5×5 mask. Fig. 8(a) shows the image 104_1.tif from the database DB3. The result of the erosion can be seen in Fig. 8(b). It should be noticed that larger

Fig. 8. (a) Original fingerprint image. (b) Eroded image. (c) Enhanced image. (d) Gradient magnitudes.


Fig. 9. Segmentation results of the fingerprint images.

masks could be used to further increase the thickness of the dark valleys. However, this would also increase the size of the noisy black spots in the background. The next step is to de-noise and enhance the eroded image, and to compute the enhanced magnitudes. Considering the average size of the objects of interest (the foreground) in those images, we used a larger value for J (in fact, we used J = 5) to estimate gradient magnitudes. The enhanced image can be seen in Fig. 8(c), and the corresponding enhanced magnitudes are shown in Fig. 8(d). Fig. 9 shows our segmentation results for the images 101_1.tif, 102_1.tif, ..., 110_1.tif, from left to right, top to bottom. In all these images, the foreground is effectively segmented from the noisy background, and important features for fingerprint matching (such as minutiae [22]) are enclosed within the segmented region. In these experiments, we used T_A = 2005 for region merging in the post-processing stage.
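The 5×5 grey-level erosion pre-processing step described above can be sketched with SciPy's `grey_erosion` (the toy ridge-like pattern below is invented, standing in for a real fingerprint image):

```python
# Grey-level erosion with a 5x5 structuring element thickens the dark
# valleys of the (toy) fingerprint image before de-noising/enhancement.
import numpy as np
from scipy import ndimage as ndi

img = np.full((32, 32), 220, dtype=np.uint8)   # bright background
img[::4, :] = 40                               # dark valley-like stripes

eroded = ndi.grey_erosion(img, size=(5, 5))    # dark structures grow thicker
print((img < 100).sum(), (eroded < 100).sum()) # more dark pixels after erosion
```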

6. Conclusions

In this paper, a new method for improving the robustness of watershed segmentation was proposed. It is based on a specific pre-processing stage that simultaneously performs image de-noising and edge enhancement. In our approach, the watershed transform is applied to the gradient image obtained at a selected scale 2^J, allowing the user to choose the segmentation resolution. Also, an optional post-processing procedure may be utilized to remove small (spurious) regions, and/or to merge regions presenting low-contrast boundaries. It should be emphasized that the only user-provided parameter is the desired scale 2^J (and, optionally, parameters for post-processing, which are application dependent). All other parameters used in our experiments were pre-defined (k = 0.2, σ² = 4, β_max = 7).

Our experimental results indicate that the over-segmentation problem, which is typical of the watersheds technique, can be significantly attenuated. Also, false contours due to low-contrast edges within the regions of interest are effectively reduced with our approach. The proposed technique is robust when applied to noisy and/or blurred images, performing better than other segmentation techniques proposed in the literature. The post-processing stage effectively eliminates the remaining over-segmented regions. We also showed that our algorithm can be used in practical applications, and developed a case study: fingerprint image segmentation can be performed successfully using the proposed framework.

Future work will concentrate on choosing an adaptive threshold T. Also, we intend to further investigate the application of our approach to medical images (including segmentation of three-dimensional objects by applying our technique to each slice of the volume), and to extend this work to color image segmentation and analysis.

Acknowledgements

The authors would like to thank the anonymous reviewers for their valuable comments. The authors also thank CNPq (Brazilian Research Council) for financial support.

References

[1] S. Beucher, C. Lantuéjoul, Use of watersheds in contour detection, in: Proceedings of the IEEE International Workshop on Image Processing, Real-Time Edge and Motion Detection/Estimation, Rennes, France, 1979.
[2] L. Vincent, P. Soille, Watersheds in digital spaces: an efficient algorithm based on immersion simulations, IEEE Transactions on Pattern Analysis and Machine Intelligence 13 (6) (1991) 583–598.
[3] A. Bieniek, A. Moga, An efficient watershed algorithm based on connected components, Pattern Recognition 33 (6) (2000) 907–916.
[4] F. Meyer, Levelings and morphological segmentation, in: Proceedings of SIBGRAPI'98, Rio de Janeiro, Brazil, 1998, pp. 28–35.
[5] K. Haris, S.N. Efstratiadis, N. Maglaveras, A.K. Katsaggelos, Hybrid image segmentation using watersheds and fast region merging, IEEE Transactions on Image Processing 7 (12) (1998) 1684–1699.
[6] J. Weickert, Efficient image segmentation using partial differential equations and morphology, Pattern Recognition 34 (9) (2001) 1813–1824.
[7] J.B. Kim, H.J. Kim, A wavelet-based watershed image segmentation for VOP generation, in: IEEE International Conference on Pattern Recognition, Québec City, Canada, 2002, pp. 505–508.
[8] J.B. Kim, H.J. Kim, Multiresolution-based watersheds for efficient image segmentation, Pattern Recognition Letters 24 (2003) 473–488.
[9] H.T. Nguyen, M. Worring, R. van den Boomgaard, Watersnakes: energy-driven watershed segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence 25 (3) (2003) 330–342.
[10] J. Scharcanski, C.R. Jung, R.T. Clarke, Adaptive image denoising using scale and space consistency, IEEE Transactions on Image Processing 11 (9) (2002) 1092–1101.
[11] S.G. Mallat, S. Zhong, Characterization of signals from multiscale edges, IEEE Transactions on Pattern Analysis and Machine Intelligence 14 (7) (1992) 710–732.
[12] J. Canny, A computational approach to edge detection, IEEE Transactions on Pattern Analysis and Machine Intelligence 8 (1986) 679–698.
[13] F. Meyer, S. Beucher, Morphological segmentation, Journal of Visual Communication and Image Representation 1 (1990) 21–46.
[14] A.K. Jain, Fundamentals of Digital Image Processing, Prentice Hall, Englewood Cliffs, NJ, 1989.
[15] W.K. Pratt, Digital Image Processing, Wiley, New York, 1991.
[16] R.A. Peters, A new algorithm for noise reduction using mathematical morphology, IEEE Transactions on Image Processing 4 (3) (1995) 554–568.
[17] W.Y. Ma, B.S. Manjunath, Edge Flow: a framework of boundary detection and image segmentation, in: IEEE Conference on Computer Vision and Pattern Recognition, 1997, pp. 744–749. Available for download at http://vision.ece.ucsb.edu/segmentation/edgeflow/software/index.htm
[18] A.K. Jain, L. Hong, R. Bolle, On-line fingerprint verification, IEEE Transactions on Pattern Analysis and Machine Intelligence 19 (4) (1997) 302–314.
[19] A.M. Bazen, S.H. Gerez, Segmentation of fingerprint images, in: ProRISC 2001 Workshop on Circuits, Systems and Signal Processing, Veldhoven, The Netherlands, 2001.
[20] A.M. Bazen, S.H. Gerez, Systematic methods for the computation of the directional fields and singular points of fingerprints, IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (7) (2002) 905–919.
[21] D. Maio, D. Maltoni, R. Cappelli, J. Wayman, A.K. Jain, FVC2002: second fingerprint verification competition, in: International Conference on Pattern Recognition, Québec City, Canada, 2002.
[22] N. Ratha, S. Chen, A.K. Jain, Adaptive flow orientation based texture extraction in fingerprint images, Pattern Recognition 28 (1995) 1657–1672.