Fundamental Frequency Gabor Filters for Object Recognition

Joni Kamarainen, Ville Kyrki and Heikki Kälviäinen
Department of Information Technology, Lappeenranta University of Technology
P.O. Box 20, FIN-53851 Lappeenranta, FINLAND
{Joni.Kamarainen,Ville.Kyrki,Heikki.Kalviainen}@lut.fi

Abstract

Gabor filters are a widely used feature extraction method in image analysis. In this study, a new method is presented that utilises Gabor filters for extracting fundamental frequencies of objects. The fundamental frequencies represent the shape of an object and can be used to classify objects with dissimilar spatial dimensions. Theoretical results are verified by experiments with real images of electronic components. Experiments indicate that the fundamental frequency Gabor filters are a robust tool for rotation and translation invariant object recognition.

1. Introduction

In 1946 Dennis Gabor introduced the Gabor elementary functions (GEFs), which have the smallest joint uncertainty in time and frequency [4]. Thirty years later, Granlund presented a general two-dimensional image processing operator which is a 2-D counterpart of a GEF [5]. After Daugman gave the generalisation of the uncertainty theorem along with biological motivations to use Gabor filters [3], Gabor filters have become popular in digital image processing.

In image processing, the most attractive and active application area for Gabor filters has been texture segmentation (e.g. [1, 7]). Based on the results on texture segmentation, Gabor filtering has been successfully applied in several applications (e.g. [6, 8]). Besides textures, Gabor filters have been used in edge detection [11], line segmentation [2], and shape recognition [9, 10].

The mentioned applications utilise Gabor filters to extract mainly the high frequency components of images, such as textures, lines, and edges. In this study, a novel theory is presented to extract shape information of objects using the low frequency components. The theory is based on the fact that the overall shape of an object determines its fundamental frequencies, which can be used to classify objects with different dimensions. The expressive power of fundamental frequencies is demonstrated by an experiment where images of electronic components are classified in a rotation and translation invariant manner.

2. Fundamental frequency Gabor filters

2.1. One-dimensional filters

Gabor presented the one-dimensional Gabor elementary function [4] as

$$\psi(t) = e^{-\alpha^2 (t - t_0)^2} e^{j(2\pi f_0 t + \phi)}, \tag{1}$$

where $j$ is the imaginary unit. Equation (1) represents a complex sinusoid of frequency $f_0$ modulated by a Gaussian with sharpness $\alpha$. This family of functions has the smallest joint uncertainty in time and frequency domains [4]. Therefore, it seems to be a natural tool to examine the width of a pulse or a similar function. In order to examine the width, a concept of a fundamental frequency needs to be introduced. Without loss of generality, the time shift $t_0$ and the phase shift $\phi$ can be set to zero to form a centred Gabor elementary function, called the Gabor filter function. Let us define the response of a Gabor filter function to a function $\xi(t)$ as the convolution

$$\mathrm{resp}_\xi(t) = \psi(t) * \xi(t) = \int_{-\infty}^{\infty} \psi(t - \tau)\,\xi(\tau)\,d\tau. \tag{2}$$
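The definitions in (1) and (2) can be sketched numerically as follows. This is a minimal illustration rather than the authors' implementation: the function names `gabor_1d` and `gabor_response`, the sampling step `dt`, and the truncation of the Gaussian at $|t| \le 4/\alpha$ are all assumptions made for the example.

```python
import numpy as np

def gabor_1d(t, f0, alpha, t0=0.0, phi=0.0):
    """Gabor elementary function of Eq. (1), evaluated at the times t."""
    envelope = np.exp(-alpha**2 * (t - t0)**2)
    carrier = np.exp(1j * (2 * np.pi * f0 * t + phi))
    return envelope * carrier

def gabor_response(signal, dt, f0, alpha):
    """Discrete approximation of the convolution in Eq. (2).

    The kernel is the centred filter (t0 = 0, phi = 0), truncated where the
    Gaussian has decayed to exp(-16) (an assumption of this sketch).
    """
    half = int(np.ceil(4.0 / (alpha * dt)))
    tau = np.arange(-half, half + 1) * dt
    kernel = gabor_1d(tau, f0, alpha)
    # mode="same" keeps the output aligned with the input samples;
    # multiplying by dt turns the sum into a Riemann approximation.
    return np.convolve(signal, kernel, mode="same") * dt
```

For a sampled complex exponential of the same frequency as the filter, the magnitude of the returned response away from the signal borders should approach the Gaussian integral $\sqrt{\pi}/\alpha$.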

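If the signal $\xi(t)$ is amplitude limited, the family of complex exponential functions has the maximal response. Defining

$$\mathrm{cis}(2\pi f_1 t) = e^{j(2\pi f_1 t)}, \tag{3}$$

the response to this complex exponential is

$$\mathrm{resp}_{\mathrm{cis}}(t) = \underbrace{e^{j 2\pi f_1 t}}_{\text{phase}} \; \underbrace{\frac{\sqrt{\pi}}{\alpha}\, e^{-\left(\frac{\pi}{\alpha}\right)^2 (f_0 - f_1)^2}}_{\text{amplitude}}. \tag{4}$$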

1051-4651/02 $17.00 (c) 2002 IEEE

Now, the maximum will occur when the frequencies $f_0$ and $f_1$ are equal, with an absolute value

$$\max_{f_1} \left| \mathrm{resp}_{\mathrm{cis}}(t) \right| = \frac{\sqrt{\pi}}{\alpha}, \tag{5}$$

that is constant over time. This value can be used to normalise the response of the Gabor filter. Defining a normalised response $r_\xi(t)$ as

$$r_\xi(t) = \frac{\alpha}{\sqrt{\pi}} \int_{-\infty}^{\infty} \psi(t - \tau)\,\xi(\tau)\,d\tau, \tag{6}$$

and a unit pulse of width $w$ as

$$p(t) = \begin{cases} 1, & |t| \le \frac{w}{2} \\ 0, & |t| > \frac{w}{2} \end{cases} \tag{7}$$

the normalised response of the pulse is

$$r_p(t) = \frac{1}{2}\, e^{-\left(\frac{\pi f_0}{\alpha}\right)^2} \left[ \mathrm{erf}\!\left( \alpha\left(t + \frac{w}{2}\right) - j\frac{\pi f_0}{\alpha} \right) - \mathrm{erf}\!\left( \alpha\left(t - \frac{w}{2}\right) - j\frac{\pi f_0}{\alpha} \right) \right], \tag{8}$$

where $\mathrm{erf}(t)$ is the complex error function. The absolute value of (8) is plotted in Figure 1. It can be seen that there is a maximum at $t = 0$, and the frequency where it occurs can be solved by finding the zero crossings of the partial derivative of (8) with respect to $f_0$ at $t = 0$. That is, keeping the ratio $f_0/\alpha$ fixed,

$$\frac{\partial}{\partial f_0} r_p(0) = \frac{\alpha w}{\sqrt{\pi}\, f_0}\, e^{-\frac{\alpha^2 w^2}{4}} \cos(\pi f_0 w) = 0 \;\Leftrightarrow\; \cos(\pi f_0 w) = 0 \;\Leftrightarrow\; f_0 = \frac{1}{2w} + \frac{n}{w}, \quad n \in \{0, 1, 2, \ldots\}. \tag{9}$$

Now, it is evident that there is a response maximum when the width of the pulse equals half of the wavelength of the Gabor filter, and at the odd multiples of the corresponding frequency. The base frequency $f_0$ for $n = 0$ in (9) is called the fundamental frequency of the pulse.

Figure 1. $|r_p(t)|$ as a function of $f_0$ and $t$ at $w = 2$ (maximum at $f_0 = 1/4$).

2.2. Filtering in 2-D

Several forms of the Gabor filter have been given for the 2-D case [5, 3]. However, we now present a form that corresponds to Gabor's 1-D representation (1). A two-dimensional Gabor filter $\psi(x, y)$ can be defined as

$$\psi(x, y) = e^{-(\alpha^2 x'^2 + \beta^2 y'^2)}\, e^{j 2\pi f_0 x'}, \qquad x' = x\cos\theta + y\sin\theta, \qquad y' = -x\sin\theta + y\cos\theta, \tag{10}$$

where $f_0$ is the central frequency of a sinusoidal plane wave, $\theta$ is the anti-clockwise rotation of the Gaussian and the plane wave, and $\alpha$ and $\beta$ are the sharpness values of the major and minor axes of the elliptic Gaussian. We can use normalisation similar to the 1-D case to define the normalised response $r_\xi(x, y)$ of a 2-D Gabor filter:

$$r_\xi(x, y) = \frac{\alpha\beta}{\pi}\, \psi(x, y) * \xi(x, y). \tag{11}$$
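As an illustration of the 2-D filter in (10), the kernel can be sampled on a pixel grid as sketched below. The function name `gabor_2d`, the square grid with unit pixel spacing, and the scale factor $\alpha\beta/\pi$ (chosen here so that the envelope integrates to one, in analogy with the 1-D normalisation) are assumptions of this sketch, not details confirmed by the text.

```python
import numpy as np

def gabor_2d(half_size, f0, theta, alpha, beta):
    """Sample the 2-D Gabor filter of Eq. (10) on a square grid.

    The grid spans -half_size..half_size in both x and y, so the result has
    shape (2*half_size + 1, 2*half_size + 1). Rows index y, columns index x.
    """
    y, x = np.mgrid[-half_size:half_size + 1,
                    -half_size:half_size + 1].astype(float)
    # Rotated coordinates x', y' of Eq. (10).
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    envelope = np.exp(-(alpha**2 * xr**2 + beta**2 * yr**2))
    carrier = np.exp(1j * 2 * np.pi * f0 * xr)
    # Assumed normalisation: the Gaussian envelope integrates to pi/(alpha*beta).
    return (alpha * beta / np.pi) * envelope * carrier
```

With this scaling, the samples of a kernel with $f_0 = 0$ sum to approximately one as long as the grid covers the Gaussian, which gives a quick sanity check for the chosen $\alpha$, $\beta$ and grid size.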

2-D widths can be examined in the direction of $\theta$ using the result in (9) when the ratio between the wavelength and the axes of the Gaussian is fixed. This can be accomplished by defining constants $\gamma = f_0/\alpha$ (width of the Gaussian along the major axis) and $\eta = f_0/\beta$ (width along the minor axis). In addition, these constants guarantee similar behaviour regardless of the frequency.
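The half-wavelength result in (9) can be checked numerically for a unit pulse. The sketch below assumes the sharpness is tied to the frequency as $\alpha = f_0/\gamma$ with $\gamma$ fixed, evaluates the normalised response at $t = 0$ by a plain trapezoid rule, and searches a frequency grid for the maximum. The function names, the grid limits and the number of samples are illustrative choices only.

```python
import numpy as np

def pulse_response_at_centre(f0, w, gamma=1.0, n=4001):
    """Normalised response r_p(0) of a unit pulse of width w, Eqs. (6)-(7).

    Assumes alpha = f0 / gamma, i.e. a fixed ratio between wavelength and
    Gaussian width. The integral is taken over the pulse support only.
    """
    alpha = f0 / gamma
    u = np.linspace(-w / 2, w / 2, n)
    integrand = np.exp(-alpha**2 * u**2) * np.exp(1j * 2 * np.pi * f0 * u)
    du = u[1] - u[0]
    integral = np.sum((integrand[:-1] + integrand[1:]) / 2) * du  # trapezoid rule
    return alpha / np.sqrt(np.pi) * integral

def fundamental_frequency(w, gamma=1.0, n_freqs=1000):
    """Grid search for the frequency maximising |r_p(0)| on (0, 1/w]."""
    f_grid = np.linspace(0.01 / w, 1.0 / w, n_freqs)
    mags = np.array([abs(pulse_response_at_centre(f, w, gamma)) for f in f_grid])
    return f_grid[np.argmax(mags)]
```

Calling `fundamental_frequency(2.0)` should return a value close to $1/(2w) = 1/4$, the case plotted in Figure 1, and the same relation should hold for other widths because $\gamma$ keeps the filter shape independent of the frequency.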

3. Fundamental frequencies in object recognition

An object recognition scheme based on the fundamental frequencies is now presented. An assumption is made that the theories apply directly to discrete images. An obvious restriction is that objects have relatively clear shapes, otherwise the interpretation of the fundamental frequencies as spatial dimensions will fail. Also, if different objects have the same dimensions, their fundamental frequencies may be equal and they cannot be discriminated.

Responses of Gabor filters on different spatial locations and frequencies form an information diagram. The information diagram, first introduced by Gabor for 1-D [4] and later by Daugman for the 2-D case [3], shows the minimal information quanta simultaneously in spatial and frequency domains. As an example, based on (9), to inspect objects whose outer dimensions are from 10 to 100 pixels, Gabor filter wavelengths of 20 to 200 must be used ($f_0 = \frac{1}{20}, \ldots, \frac{1}{200}$). To inspect the widths in $n$ orientations, Gabor filters on orientations $\theta_i = \frac{i\pi}{n}$ for $i = 0, \ldots, n-1$ can be used.

An information diagram for the 4th electronic component (Fig. 4) at the centroid is shown in Fig. 2. Clearly, there are local maxima at $(\theta, f_0) = (0, \frac{1}{160})$ (horizontal fundamental frequency) and $(\frac{\pi}{2}, \frac{1}{70})$ (vertical) when the approximate dimensions of the component are respectively 80 and 35 pixels. It is now evident that the local maximum at each orientation corresponds to the width of the object along that direction.
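The information diagram at a single point can be sketched as follows. This is only an illustration under stated assumptions: the names `gabor_response_at` and `information_diagram` are hypothetical, the sharpness values are tied to the frequency as $\alpha = f_0/\gamma$ and $\beta = f_0/\eta$, the response is scaled by an assumed factor $\alpha\beta/\pi$, and the filter is evaluated directly over the whole image instead of using a fast convolution.

```python
import numpy as np

def gabor_response_at(image, x0, y0, f0, theta, gamma=1.0, eta=0.1):
    """Magnitude-ready complex response of a 2-D Gabor filter at (x0, y0).

    Evaluates sum_{x,y} psi(x0 - x, y0 - y) * image[y, x], i.e. the
    convolution at one location, with alpha = f0/gamma and beta = f0/eta.
    """
    alpha, beta = f0 / gamma, f0 / eta
    rows, cols = np.indices(image.shape, dtype=float)
    dx, dy = x0 - cols, y0 - rows
    xr = dx * np.cos(theta) + dy * np.sin(theta)
    yr = -dx * np.sin(theta) + dy * np.cos(theta)
    psi = np.exp(-(alpha**2 * xr**2 + beta**2 * yr**2)) \
        * np.exp(1j * 2 * np.pi * f0 * xr)
    return (alpha * beta / np.pi) * np.sum(psi * image)

def information_diagram(image, x0, y0, wavelengths, n_orient,
                        gamma=1.0, eta=0.1):
    """Response magnitudes for all wavelengths (rows) and orientations (columns)."""
    thetas = np.arange(n_orient) * np.pi / n_orient  # theta_i = i * pi / n
    diagram = np.empty((len(wavelengths), n_orient))
    for k, wavelength in enumerate(wavelengths):
        for l, theta in enumerate(thetas):
            r = gabor_response_at(image, x0, y0, 1.0 / wavelength, theta,
                                  gamma, eta)
            diagram[k, l] = abs(r)
    return diagram, thetas
```

For a synthetic vertical bar that is 80 pixels wide and much taller than the filter window, the column of the diagram for $\theta = 0$ should peak at the wavelength 160 when wavelengths from 20 to 200 in steps of 10 are used, in line with the half-wavelength rule of (9).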

Figure 2. Information diagram $I(\theta, f_0)$ for the component 4 at (x, y) = (324, 263). Axes: orientation (0.0 to 157.5 degrees) and frequency (1/30 to 1/190).

The maximal responses at each orientation can be used as a feature. For objects that differ in shape, these frequencies and responses are unique and the feature vector thus represents the dimensions of the object in different orientations (Algorithm 1).

Algorithm 1 Create a class from the input image $f(x, y)$ at $(x_0, y_0)$
1. Compute the information diagram $I_{m \times n}$ for frequencies $f_0, \ldots, f_{m-1}$ and orientations $\theta_0, \ldots, \theta_{n-1}$ at $(x_0, y_0)$.
2. For all orientations find the maximal responses $r_{1 \times n}$ and the frequencies $f_{1 \times n}$ they appear in ($r_l = \max_k I(k, l)$, $f_l = \arg\max_k I(k, l)$).
3. Create a feature matrix $F_{2 \times n}$ using the normalised maximal responses $r_{norm} = r / \|r\|$ and the corresponding frequencies $f$.

To find an object in an unknown location and orientation in the image, the information diagram must be computed for all locations, and then the distance to the predefined classes can be measured by an orientation invariant distance measure as presented in [10] (Algorithm 2).

Algorithm 2 Find an object from the input image $f(x, y)$
1. Compute a 4-D information diagram $I_{m \times n}(x, y)$ for all $(x, y)$.
2. Measure the distances $D(x, y, i) = \mathrm{dist}(I(x, y), F_i)$ for each class $i$ and location $(x, y)$ using a rotation invariant Euclidean distance and normalised responses.
3. Find the minimal distance and return the class $i$ and location $(x, y)$ where $D(x, y, i) = \min_{x, y, i} D(x, y, i)$.

4. Experiments

The expressive power of the fundamental frequency Gabor features was studied by experiments on real world images. An object recognition application used to carry out the experiments is shown in Fig. 3. The application was used to create a class as in Algorithm 1 (approximate centroid pointed by the user) and objects were located from new images as in Algorithm 2.

Figure 3. User interface.
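Steps 2 and 3 of Algorithm 1 and the distance step of Algorithm 2 can be sketched as below. The rotation invariant distance of [10] is not specified in this paper, so the sketch substitutes a simple stand-in: the minimum Euclidean distance over all circular shifts of the orientation axis. The function names `create_class`, `rotation_invariant_distance` and `classify` are hypothetical, and responses and frequencies are compared without any relative weighting.

```python
import numpy as np

def create_class(diagram, frequencies):
    """Build the 2 x n feature matrix from an m x n information diagram."""
    k_max = np.argmax(diagram, axis=0)               # best frequency index per orientation
    r = diagram[k_max, np.arange(diagram.shape[1])]  # maximal responses r_l
    f = np.asarray(frequencies)[k_max]               # corresponding frequencies f_l
    return np.vstack([r / np.linalg.norm(r), f])     # r_norm = r / ||r||

def rotation_invariant_distance(F_a, F_b):
    """Assumed stand-in: minimum Euclidean distance over orientation shifts."""
    n = F_a.shape[1]
    return min(np.linalg.norm(F_a - np.roll(F_b, s, axis=1)) for s in range(n))

def classify(F, classes):
    """Return the name and distance of the closest class feature matrix."""
    dists = {name: rotation_invariant_distance(F, F_c)
             for name, F_c in classes.items()}
    best = min(dists, key=dists.get)
    return best, dists[best]
```

Shifting the columns works here because the orientations are sampled uniformly over half a turn, so a rotation of the object by a multiple of $\pi/n$ permutes the columns cyclically. Rotations between the sampled orientations are only approximated by this scheme.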

For the experiment, a set of images of electronic components in random locations and orientations was used. The set consisted of images of 8 different components shown in Fig. 4. There were 6 images of each component (48 in total); one image was used to create the feature matrix, and all images were used to test the classification accuracy. The data is available at the project homepage (http://www.it.lut.fi/project/gabor/gabor.html).

Figure 4. Images of electronic components 1–8 (6 × 8 images in total).

The selection of parameters for the Gabor filters is a crucial issue that is often neglected. In the experiment, $\gamma = 1.0$ was used to have only one significant peak at the fundamental frequency. Because some components are quite long and thin, $\eta = 0.1$ was used to measure the widths using a narrow window. To capture the major dimensions of the components, wavelengths from 20 to 200 were used with a sampling interval of 10, and $n = 8$ orientations were used.

The classification accuracy shown in Table 1 indicates that with a sufficient number of features (8) a very accurate classification can be made. An important thing to notice is that the classification is based on a simple distance measure and uses only one example per class. Thus, the experiments seem to verify that fundamental frequencies represent the shape of the object and can be used in recognition.

Table 1. Results.
Number of features    Classification accuracy
4                     96%
8                     100%

It should be noted that the method is quite stable in terms of parameters. However, it should be ensured that the selected frequencies cover the range of object dimensions, otherwise the fundamental frequencies cannot be found. Furthermore, the parameters can be analytically selected if application constraints, such as maximum and minimum sizes, can be defined.

5. Conclusion

In this paper, the theory of fundamental frequency Gabor filters was presented. The theory was applied to a real world problem of recognising electronic components, where a correct recognition of all objects was achieved. While the theory was presented for the pulse function, it should be noted that it also applies to smoother shapes. The Gabor filter responses have continuous and smooth behaviour in both the spatial and the frequency domain. Also, the noise tolerance of Gabor features is well known, e.g. [9]. Thus, the translation and orientation invariant approach can be considered to be robust, as compared to other similar methods.

In future work, it would be interesting to study how to combine the information on fundamental frequencies and higher frequencies to recognise objects with more complicated features, e.g., texture. Also, when inspecting complex objects, an important question is how to combine several filter responses at different spatial locations to generate efficient representations. The answer probably lies in using more complex classification methods. This study concentrated especially on the definition and reliability of fundamental frequency Gabor features. It seems that fundamental frequencies extracted by Gabor filtering are a promising feature extraction method and thus present a new application area for Gabor filtering.

References

[1] A. C. Bovik, M. Clark, and W. S. Geisler. Multichannel texture analysis using localized spatial filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(1):55–73, January 1990.
[2] J. Chen, Y. Sato, and S. Tamura. Orientation space filtering for multiple orientation line segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(5):417–429, May 2000.
[3] J. G. Daugman. Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. Journal of the Optical Society of America A, 2(7):1160–1169, July 1985.
[4] D. Gabor. Theory of communications. Journal of International Electrical Engineers, 93:427–457, 1946.
[5] G. H. Granlund. In search of a general picture processing operator. Computer Graphics and Image Processing, 8:155–173, 1978.
[6] A. K. Jain and S. K. Bhattacharjee. Address block location on envelopes using Gabor filters. Pattern Recognition, 25(12):1459–1477, 1992.
[7] A. K. Jain and F. Farrokhnia. Unsupervised texture segmentation using Gabor filters. Pattern Recognition, 24(12):1167–1186, 1991.
[8] A. K. Jain, N. K. Ratha, and S. Lakshmanan. Object detection using Gabor filters. Pattern Recognition, 30(2):295–309, 1997.
[9] V. Kyrki, J.-K. Kamarainen, and H. Kälviäinen. Content-based image matching using Gabor filtering. In ACIVS'2001 3rd International Conference on Advanced Concepts for Intelligent Vision Systems Theory and Applications, pages 45–49, Baden-Baden, Germany, July 2001.
[10] V. Kyrki, J.-K. Kamarainen, and H. Kälviäinen. Invariant shape recognition using global Gabor features. In 12th Scandinavian Conference on Image Analysis, pages 671–678, Bergen, Norway, June 2001.
[11] R. Mehrotra, K. Namuduri, and N. Ranganathan. Gabor filter-based edge detection. Pattern Recognition, 25(12):1479–1494, 1992.
