Efficient Edge Detection and Object Segmentation Using Gabor Filters Yiming Ji
Kai H. Chang
Chi-Cheng Hung
Depart of Computer Science and Software Engineering Auburn University 334-8446333
Depart of Computer Science and Software Engineering Auburn University 334-8446329
School of Computing and Software Engineering Southern Polytechnic State University 770-5283574
[email protected] [email protected] [email protected] ABSTRACT
F
(ω
) e
2 π iω t
(ω
∫
) =
f ( x ) e
− 2 π ix ω
dx
R
In signal analysis, f(x) describes the temporal behavior, and F(ω) describes the frequency behavior. In general, Fourier expansion is difficult to recognize features between f and F, so it is an ideal tool only for stationary (or time invariant) signals. Gabor expansion is a time-frequency analysis method, which was introduced in 1946 by Dennis Gabor [2]. This expansion introduces a time-localization Gaussian window function (g(t-x), where x is a shifting parameter used to translate the window in order to cover the whole time domain) for extracting local information of signal with the form similar to the Fourier transformation [3][4][5]:
Categories and Subject Descriptors H.2.0 [Image]: segmentation, denoising
General Terms
C ( x , ω ) =
Algorithms, Theory
∫
f (t ) g (t −
x )e
− 2 π it ω
dt
R
where g(t − x ) is the conjugate form of g(t-x), and coefficient C(x, ω) depends linearly on the original signal f(t). Some properties of the Gabor transform (energy preservation, inversion formula) are similar to those of the Fourier transformation. The Gabor filter tries to search and investigate the intermediate representations which combine the information of both time/space information f and frequency information F [6-7]. The goal is a simultaneous description of the temporal and spectral behavior of a function or signal f. Such a representation is essentially twodimensional, measuring both behavior of the frequency w and time/space t.
Keywords segmentation,
F
ω ∈ R
Gabor filter is a widely used feature extraction method, especially in image texture analysis. The selection of optimal filter parameters is usually problematic and unclear. This study analyzes the filter design essentials and proposes two different methods to segment the Gabor filtered multi-channel images. The first method integrates Gabor filters with labeling algorithm for edge detection and object segmentation. The second method uses the K-means clustering with simulated annealing for image segmentation of a stack of Gabor filtered multi-channel images. Various experiments with real images demonstrate the effectiveness of these approaches.
Gabor filter, annealing.
∑
f ( x ) =
labeling,
K-means,
simulated
1. INTRODUCTION Image segmentation continues to be an important and active research area in image analysis. In order to analyze complicated image signals, engineers and mathematicians have been searching for a simple, yet well-understood representation. Fourier expansions, Gabor expansions, and Wavelet expansions are good examples.
There are two main approaches for Gabor transform computation [7]. One focuses on the matrix factorization, which is based on the periodic and sparse structure of Gabor operator, and the factorization holds in general and does not depend on number theoretic condition on the lattice parameters of time/space and frequency. The second and efficient method is an iterative approach, which uses the Conjugate Gradient (CG) method to approximate the optimal dual window. In general CG method is more computationally efficient for large size matrix [7]. However, the classic Gabor expansion is computationally expensive [3-5], and since it combines all the space and frequency details of the original signal, it is difficult to take advantage of the gigantic amount of numbers.
Fourier expansion [1] represents signals in a series of sine waves, and the classical Fourier analysis employs two complementary representations: the function f itself and its transforms F:
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ACMSE ’04, April 2-3, 2004, Huntsville, Alabama, USA. Copyright 2004 ACM 1-58113-870-9/04/04…$5.00.
Fortunately, these problems can be partially overcome by using a multi-channel filtering technique, which uses specified filters to select the information at particular space/frequency points.
454
When an image is processed by a Gabor filter, the output is the convolution of the image i(x,y) and the Gabor function g(x,y), i.e., r ( x , y ) = g ( x , y ) * i ( x , y ) , where * denotes the two dimensional convolution. This process can be used at different frequencies and different orientations, and the result is a multichannel filter bank. Figure 1 illustrates the multi-channel filtering system.
Research has demonstrated that Gabor expansion can be implemented as a multi-channel, wavelet-like filter and this multichannel filtering process mimics the characteristics of the human visual system (HVS) [8-9]. If an image containing more than two object regions is fed to a Gabor filter bank, then local spatialfrequency differences between these objects hopefully will show variation in the filtered sub-images. This filtering approach is a natural way to manipulate variation between image objects. Gabor filters have been used in detecting texture edge [10, 11]. However, it is based on traditional Gradient method. It does not specify the choice of frequency/scale range, and as a result, it can not take advantage of the edge information for further analysis. Wavelet expansion is an excellent technique which is now commonly used in image compressions and image restorations. Comparing Gabor expansion with Wavelet analysis technique, Oscar Nesta, Rafael Navarro, Javier Portilla and Antonio Tabernero [12] concluded that Gabor expansion is the only biologically plausible filters with orientation selectivity that can be exactly expressed as a sum of only two separable filters [12, 13]. This unique property has made Gabor filter an important transformation in image processing.
Figure 1 Gabor multi-channel system
In Fig. 1, operator | · | is the magnitude operator, and gk(x,y) is the Gabor function in the kth channel, which denotes a specific frequency and orientation. With the multi-channel filtering system, an image will be processed by all the channels simultaneously. The result is a stack of filtered images which are defined at various frequencies and orientations corresponding to all the channels in the system. Hence, characteristic image information at each particular frequency and individual orientation is obtained.
While there are many image segmentation algorithms and much research has successfully used Gabor filters for texture analysis [10, 11, 14, 16], this study tries to provide additional insight into the design of optimal Gabor filters, and proposes two different segmentation algorithms. The first approach integrates images from all the channels and applies labeling algorithm for segmentation. The second approach analyzes multi-channel images using the K-means with simulated annealing algorithm. Both methods have unique applications: the first one is very efficient while the second is especially good for noisy image analysis.
2.2 Choice of Gabor Filter Parameters Although it is stated by David A. Clausi [8] that the collection of results of every 30o will provide a robust and universal feature set, this study uses four values of 0 o, 45 o, 90 o, and 135 o to save computation time. According to [12], [14], [16], and [18], the central frequency is selected according to the image dimension. That is, the following set of values is used: cycles/image-width, with N being N
The next section presents the Gabor filter design. Section 3 applies the designed filters to images and discusses how Gabor filters can capture image features by analyzing the edge information. Then, in section 4, the labeling segmentation and the K-means simulated annealing algorithm are briefly described. Experiments on images are shown in section 5, and discussions and conclusions then follow.
2 ,2
2 ,..., and
4
2
the number of the image columns. The radial frequencies are all 1 octave apart [19]. Low frequency corresponds to smooth variations and constitutes the base of an image and high frequency presents the edge information which gives the detailed information in the image. Hence, this study neglects the very low radial frequencies, e.q., 2 , and 2 2 . Using these frequencies and orientations, the Gabor filter Multi-channel system can present an image in various orientations and frequencies.
2. Gabor filter Design 2.1 Gabor function and multi-channel system There are several forms of 2-D Gabor filter, and this study uses a version similar to Daugman’s model which is used in spatial summation properties (of the receptive fields) of simple cells in the visual cortex [17]. It is defined as:
3. Edge detection specifics This section uses an image to analyze the boundary effect of the Gabor filters. Figure 2 (2.a) shows a test image of size 100×154. When the original image (2.a) is fed to the above Gabor multichannel system (no post-processing for this analysis), the integration image is (2.b). In order to illustrate the details, image (2.a) and (2.b) are filtered by the same low-pass filter, and the results are (2.c) and (2.d) respectively. (The kernel length of the low-pass filter is 5).
αβ g ( x, y ) = exp( −α 2 x ' 2 + β 2 y ' 2 ) exp( j 2π f 0 x ' ) π x ' = x cos θ + y sin θ y ' = − x sin θ + y cos θ where the arguments, x and y, specify the position of a scene, f0 is the central frequency of a sinusoidal plane wave, θ is the counter clockwise rotation of the Gaussian plane wave, and α and β are the sharpness values of the major and minor axes of the elliptic Gaussian.
Image (2.a) contains 4 texture patterns of same size and each pattern has 50X78 pixels. In order to compare the Gabor effect along the texture boundaries, pixel intensities of images (2.c) and (2.d) of Figure 2 along different rows and columns are
455
This study fully takes advantages of this property and labels distinct components based on the labeling algorithm [20, 21, 22]. Then the connectivity of each labeled region is measured accordingly. Here, a simple method is to count the number of the pixels within a region: if there are few pixels in a labeled region, it can be assumed that the pixels within the region may represent noisy, unrelated signals and thus can be neglected; otherwise, the region is considered to contain part of the edge information. In this study, a threshold of 50 is used for all test images.
highlighted. Specifically, the pixel intensities along column No. 25 and column No. 85, and along rows 25 and 75 are marked with the dotted lines.
4.2 Segmentation by simulated annealing with K-means While the first method takes advantage of the overall effect of the Gabor filtering images, this second method tries to analyze all the muti-channel sub-images directly. Here, the K-means works on a pixel vector instead of a single pixel value.
Figure 2. A test image and results. Fig. 3 shows the pixel intensities comparison result along the specified rows and columns. The circled lines in Fig. 3 represent the row/column pixel intensities of Gabor filtered image for (2.d) and the continuous lines correspond to original image (2.c).
Simulated annealing is an optimization algorithm that is based on the process of annealing metals. The integration of simulated annealing with K-means is to search optimal clustering solution by exploring most promising path similar to the way a metal cooling down to optimal state [25]. Below details the process involved in this algorithm. Step 1: Determine the number of clusters (K) and initialize center for the clusters. Step 2: Define the temperature, Tmax for highest temperature and Tmin for the cooling down value.
Figure 3. Intensities along specified rows/columns
Step 3: Distribute pixel vectors into K clusters and calculate the total variance among them.
It is obvious from Fig. 3 that the Gabor filters successfully capture and optimize the distinctions along the border of object regions (as circled by dotted ovals in Figure 3). For charts (3.a), (3.b) and (3.c), the difference of intensities for the original image is less that 0.15, while it is more than 0.4 for the Gabor processed images. The reason is that Gabor filters only choose higher frequency information (which stands for edge/boundary property) from original images, and as a result, the edge is maximized and different objects are separated. It is this property that provides level information among objects and, therefore, basis for many different segmentation algorithms. For example, some researchers blur the Gabor filtered images one more step and apply threshold to classify the texture images [13, 14, 16], and other researchers use gradient to locate the boundary [10].
Step 4: If the total variance is less than the first loop, keep the distribution; otherwise, select a random number r by tossing a dice and calculate the annealing probability: P ( c ) = exp( − c / kT ) , where P(c) is the probability of a change of c in the objective function and k is a constant analogous to Boltzmann’s constant that must be chosen for the classification. If annealing probability P(c) is greater than the random number, keep the change, otherwise discard the update. Step 5: Increase the iteration and update the temperature with: T = T max /( 1 + iteration ) , repeat steps 3, 4, and 5 until T