Face Recognition Using Optimal Non-orthogonal Wavelet Basis Evaluated by Information Complexity

Xiaoling Wang, Hairong Qi
The University of Tennessee, Department of Electrical and Computer Engineering, Knoxville, TN 37996
{xwang1, hqi}@utk.edu

Abstract

Detecting and recognizing face images automatically is a difficult task due to variability in illumination, presentation angle, facial expression, and other common problems of machine vision. In this paper, we represent face images as combinations of 2-D Gabor wavelet basis functions, which are non-orthogonal. A genetic algorithm (GA) is used to find an optimal basis derived from a combination of frequencies and orientation angles in the 2-D Gabor wavelet transform. Instead of the widely used within- and between-class scatter evaluation as the fitness function in GA, we use entropy to measure the information complexity of the wavelet transform. Compared to the well-known "eigenface" algorithm, which represents face images based on an orthogonal basis, this Gabor wavelet representation with an optimal basis provides a more accurate and efficient projection scheme and therefore a better classification result.

1. Introduction

In recent years, considerable progress has been made on the problems of face detection and recognition, especially in the processing of "mug shots", i.e., head-on face pictures with controlled illumination and scale [4, 7]. But the problem of recognizing a human face from a general view remains largely unsolved, since transformations such as position, orientation, scale, and illumination cause the face appearance to vary substantially. So far, most practical methods use a basis-function representation, also called a dictionary or kernel method, to represent the original face images. Many early algorithms use orthogonal basis functions derived from the eigensystem method [9]. Recently, it has been found that, unlike in signal representation, orthogonality is not a requirement for pattern recognition, and one can expect better performance from non-orthogonal bases than from orthogonal ones, since they lead to an overcomplete and robust representational space [4]. Two sets of algorithms have been developed so far to generate non-orthogonal bases. In [4], Liu and Wechsler use whitening and rotation to transform the orthogonal eigenvectors into non-orthogonal bases and then use GA, with within- and between-class scatter evaluation as its fitness function, to choose an optimal basis. In [5], Lyons et al. project face images directly onto a non-orthogonal 2-D Gabor wavelet basis space with a fixed basis.

In this paper, we develop a face representation method that uses 2-D Gabor wavelet filters as the non-orthogonal projection basis and GA as the optimal basis selection method. Instead of within- and between-class scatter evaluation, we use entropy, a measure of information complexity, as the fitness function in GA. The advantage of this algorithm is that it combines the stability and invariance of the Gabor wavelet representation with respect to face orientation and dilation with the searching power of GA. By using entropy as the fitness function in GA, we do not need to perform classification in each generation, which would be required if the within- and between-class scatter matrix were used as the fitness function.

2. Feature Vector Extraction by the 2-D Gabor Wavelet Transform

The use of the 2-D Gabor wavelet representation in image analysis was pioneered by Daugman in the 1980s [2]. The parameterized family of 2-D Gabor filters is taken from actual neurophysiological measurements of the two-dimensional anisotropic receptive field profiles describing single neurons in mammalian visual cortex [1]. The general functional form of the 2-D Gabor filter family over the image domain (x, y) is specified in terms of the spatial-domain function given in Eq. 1.

G(x, y) = A \exp\{-\pi[(x - x_0)^2 \alpha^2 + (y - y_0)^2 \beta^2]\} \exp\{-2\pi i[u_0(x - x_0) + v_0(y - y_0)]\}    (1)

where A denotes the magnitude, (x_0, y_0) specifies the reference pixel coordinate in the image, (α, β) specifies the effective width and length, and (u_0, v_0) specifies the modulation, which has spatial frequency ω_0 = \sqrt{u_0^2 + v_0^2} and direction θ_0 = \tan^{-1}(v_0/u_0). Suppose the total number of pixels in the face image is N. By substituting each pixel coordinate (x_t, y_t), t = 0, ..., N-1, into Eq. 1, we form a 2-D Gabor wavelet basis vector, Φ_r^{(ω,θ)}, for a specific reference pixel r and a specific modulation, i.e., a frequency and orientation pair (ω, θ), as shown in Eq. 2.

\Phi_r^{(\omega,\theta)} = [\, G_r^{(\omega,\theta)}(x_0, y_0), \; G_r^{(\omega,\theta)}(x_1, y_1), \; \ldots, \; G_r^{(\omega,\theta)}(x_{N-1}, y_{N-1}) \,]^T    (2)
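As an illustration of Eqs. 1 and 2, the following minimal NumPy sketch builds one Gabor wavelet basis vector for a given reference pixel and frequency/orientation pair. The function names and the default envelope parameters α = β = 0.25 are our own assumptions for the sketch; the mapping u_0 = ω cos θ, v_0 = ω sin θ simply inverts the frequency/direction relations stated above.

```python
import numpy as np

def gabor_kernel(x, y, x0, y0, alpha, beta, u0, v0, A=1.0):
    """Eq. 1: complex 2-D Gabor filter value at pixel (x, y), centered at the
    reference pixel (x0, y0), with envelope parameters (alpha, beta) and
    modulation (u0, v0)."""
    envelope = np.exp(-np.pi * ((x - x0) ** 2 * alpha ** 2 +
                                (y - y0) ** 2 * beta ** 2))
    carrier = np.exp(-2j * np.pi * (u0 * (x - x0) + v0 * (y - y0)))
    return A * envelope * carrier

def gabor_basis_vector(coords, x0, y0, omega, theta, alpha=0.25, beta=0.25):
    """Eq. 2: stack the filter values at every pixel coordinate into one basis
    vector Phi for a single reference pixel and one (omega, theta) pair.
    The modulation is recovered as u0 = omega*cos(theta), v0 = omega*sin(theta)."""
    u0, v0 = omega * np.cos(theta), omega * np.sin(theta)
    return np.array([gabor_kernel(x, y, x0, y0, alpha, beta, u0, v0)
                     for x, y in coords])

# Example: basis vector over a 4x4 patch for omega = pi/4, theta = 30 degrees.
coords = [(x, y) for x in range(4) for y in range(4)]
phi = gabor_basis_vector(coords, x0=2, y0=2, omega=np.pi / 4, theta=np.radians(30))
```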

Let h be the lexicographic representation of the pixel values in the face image, i.e., h is an N × 1 column vector. The 2-D Gabor coefficients can, in essence, be expressed as a convolution between h and the wavelet basis Φ_r^{(ω,θ)},

c_r^{(\omega,\theta)} = h \ast \Phi_r^{(\omega,\theta)}    (3)

The feature vector of this image, given a frequency and orientation angle pair (ω, θ), can then be expressed as Eq. 4,

F^{(\omega,\theta)} = [\, \|c_1^{(\omega,\theta)}\|, \; \|c_2^{(\omega,\theta)}\|, \; \ldots, \; \|c_P^{(\omega,\theta)}\| \,]^T    (4)

where P is the total number of reference pixels and ‖·‖ denotes the vector norm.
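To make Eqs. 3 and 4 concrete, here is a small sketch that reuses the hypothetical gabor_basis_vector helper from the previous sketch. The paper states Eq. 3 as a convolution; in the landmark-based setting of Sec. 4, where the basis is evaluated only at the chosen reference points, this reduces to one inner product per reference pixel, which is what the code computes. The function name and argument layout are ours.

```python
import numpy as np

def gabor_feature_vector(h, coords, ref_coords, omega, theta):
    """Eqs. 3-4: Gabor feature vector of one image for one (omega, theta) pair.

    h          -- lexicographic pixel vector of the image (length N)
    coords     -- the N pixel coordinates, in the same order as h
    ref_coords -- the P reference pixel coordinates
    Assumes gabor_basis_vector from the previous sketch is in scope.
    """
    features = []
    for (x0, y0) in ref_coords:
        phi = gabor_basis_vector(coords, x0, y0, omega, theta)  # Eq. 2
        c = np.dot(h, phi)          # Eq. 3, here an inner product per reference pixel
        features.append(np.abs(c))  # Eq. 4 keeps the coefficient magnitude
    return np.array(features)
```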

3. Optimal Basis Selection Using Genetic Algorithm (GA) and Information Complexity



In Sec. 2, the non-orthogonal basis Φ_r^{(ω,θ)} can be constructed from different combinations of the spatial frequency and orientation angle pair (ω, θ). In particular, if we specify n spatial frequencies ω_0, ..., ω_{n-1} and m orientation angles evenly spaced between 0 and 180 degrees, then there are in total n·m possible (ω, θ) pairs. By combining different numbers of these pairs, we can form a set Φ of possible non-orthogonal wavelet bases. The total number of possible bases in the set is the sum of the binomial coefficients, C(nm, 1) + C(nm, 2) + C(nm, 3) + ... + C(nm, nm), i.e., 2^{nm} - 1. The task for GA in our algorithm is to find the optimal basis from Φ based on the evaluation of information complexity.

GA is a search algorithm based on concepts of biological evolution and natural selection. It can be successfully used to solve problems where vast numbers of possible solutions exist. By allocating more reproductive opportunities to above-average individual solutions, the overall effect is to increase the population's average fitness [10]. There are three components involved in GA: the initial population, the fitness function, and the recombination operators.

For the first component, the basis set Φ forms the initial population. Each member of the initial population (known as a "chromosome") is encoded as a binary string, where each locus is a binary code indicating the presence (1) or absence (0) of a given frequency and orientation pair (ω, θ). For example, if we specify n = 1 and m = 2, then there are two possible frequency and orientation pairs and in total 2^2 - 1 = 3 possible bases; the 3 chromosomes can be represented as 10, 01, and 11.

For the second component, fitness values are used to guide GA in choosing offspring for the next generation from the current parent generation. [4] uses the widely employed within- and between-class scatter evaluation as the fitness function in GA. However, in order to evaluate the within- and between-class scatter, the classification result is needed. In other words, classification has to be performed in every generation during the evolution, which places a large computational burden on GA. Instead, we use entropy as the fitness function in our algorithm to avoid this additional classification step. We define the information content of each image, given a specific frequency and orientation angle pair (ω, θ), as in Eq. 5 and Eq. 6,

I_i = \|F_i^{(\omega,\theta)}\|    (5)

\tilde{I}_i = I_i / \sum_{j=1}^{M} I_j    (6)

The entropy H is used as the fitness function to evaluate the bases,

H = -\sum_{i=1}^{M} \tilde{I}_i \log \tilde{I}_i    (7)

where M is the total number of images in the training set. GA keeps the basis with the largest entropy as the selection result.
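A compact sketch of the entropy fitness in Eqs. 5-7 is given below. It assumes that, for the candidate basis encoded by a chromosome, one feature vector per training image has already been computed (for a basis with several (ω, θ) pairs we assume, for illustration, the concatenation of the per-pair feature vectors of Eq. 4). The small constant guarding log(0) is also our addition.

```python
import numpy as np

def entropy_fitness(feature_vectors):
    """Eqs. 5-7: entropy of the normalized information content of the
    M training images under one candidate basis."""
    I = np.array([np.linalg.norm(F) for F in feature_vectors])   # Eq. 5
    I_norm = I / I.sum()                                         # Eq. 6
    return float(-np.sum(I_norm * np.log(I_norm + 1e-12)))       # Eq. 7
```

In each generation, the chromosomes whose bases yield the largest entropy are favored for reproduction.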

Two kinds of chromosome-processing methods are used as the recombination operators: crossover and mutation. Mating is performed as a crossover process: we randomly pick a position along each pair of parent individuals as the crossover point, and the two portions to the right of this point in both parents are interchanged to form two offspring strings. Mutation is used in GA to create new combinations of central frequency and orientation pairs in the 2-D Gabor wavelet representation so that the search process can jump to another area of the fitness landscape. Through mutation, a randomly selected variable in a chromosome can be either added to or removed from it.
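The two recombination operators just described might be sketched as follows for chromosomes stored as lists of 0/1 loci; the mutation rate is an assumed illustrative parameter, not a value from the paper.

```python
import random

def crossover(parent_a, parent_b):
    """Single-point crossover: swap the tails of two parent chromosomes."""
    point = random.randrange(1, len(parent_a))   # random crossover point
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

def mutate(chromosome, rate=0.05):
    """Flip randomly selected loci, i.e., add or remove (omega, theta) pairs."""
    return [bit ^ 1 if random.random() < rate else bit for bit in chromosome]
```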

GA is a global optimization algorithm; it approaches the global optimum more efficiently than simulated annealing. The optimal non-orthogonal basis generated by GA is then used to project the face images and derive the feature vectors for classification.

4. Experiments and Results

In our experiments, we use the face images captured in a pattern recognition class taught in Spring 2001 [8]. There are 11 persons in the image database. Each person has three images taken under different illumination and with different facial sizes and facial expressions, as shown in Fig. 1.

Figure 1. Sample face images in the database.

In order to simplify the experimental procedure, a fiducial grid is positioned by manually clicking on 34 easily identifiable points of each face image, as shown in Fig. 2. Therefore, N in Eq. 2 is equal to 34. We also use all 34 pixel positions as the reference coordinates, that is, N = P = 34. In the experiment, we specify n spatial frequencies and m = 6 orientation angles, ranging from 0 to 150 degrees at a step of 30 degrees.

Figure 2. Selection of landmarks.

Two experiments are designed. The first compares the performance of the classical orthogonal "eigenface" algorithm (Turk and Pentland used Principal Component Analysis (PCA) directly in face recognition in [9] and called the orthogonal basis derived from PCA the "eigenface") with that of the proposed non-orthogonal optimal basis approach. This experiment also validates the effectiveness of using information complexity as the fitness function. The second experiment compares the performance obtained with different numbers of training images (either one or two images per person are chosen to form the training set). After projecting the original face images with either the optimal basis generated by GA or the orthogonal basis derived by the "eigenface" method, we use the minimum-distance approach to cluster the face images: the classification result of a test image is the class label of the training image at the smallest distance, as sketched after Fig. 3. Fig. 3 shows the performance of the classical "eigenface" algorithm as a function of the number of eigenvectors retained when the training set is composed of one image per person. If we instead choose two images per person to generate the training set, the recognition rate of the "eigenface" algorithm increases noticeably. We can see that, because the "eigenface" algorithm derives its projection axes from the variations of all the training samples, the classification performance depends heavily on the choice of the training set.

Figure 3. Performance of the "eigenface" algorithm for different numbers of eigenvectors (correct rate, roughly 68-88%, versus the number of eigenvectors retained, 5-25).
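The minimum-distance rule used above can be sketched as follows; the choice of Euclidean distance is our assumption, since the text only specifies a minimum-distance approach.

```python
import numpy as np

def min_distance_classify(test_feature, train_features, train_labels):
    """Return the label of the training feature vector closest to the test
    feature vector (nearest-neighbour / minimum-distance rule)."""
    distances = [np.linalg.norm(test_feature - f) for f in train_features]
    return train_labels[int(np.argmin(distances))]
```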

Comparatively, we use GA to choose an optimal non-orthogonal basis from the 2-D Gabor wavelet function family by evaluating different combinations of spatial frequency and orientation angle pairs so as to maximize the entropy during the GA process. Some sample results of GA are shown in Table 1, where * marks the basis with the largest entropy. Based on this basis, we implement the classification algorithm; its performance with one training image per person already exceeds that of the "eigenface" algorithm, and its averaged performance with two different poses per person in the training set can be as high as 100% on this relatively small face database, as shown in Fig. 4.

Table 1. Choice of the optimal basis by using GA and entropy evaluation. Each row lists a candidate basis, i.e., a set of frequency and orientation angle pairs, together with its entropy; * marks the basis with the largest entropy.

Figure 4. The overall performance rate of the "eigenface" algorithm and the GA search, each with one or two images per person in the training set.

We observe that, by using an optimal basis, the algorithm can choose the most significant components in the spatial-frequency domain of the original images (an important characteristic of the 2-D Gabor wavelet transformation) and at the same time efficiently eliminate the influence of interference behind the images.

5. Conclusion

This paper presents an algorithm for automatically classifying face images using feature vectors derived in a 2-D Gabor wavelet basis space. In order to eliminate redundancy and even some interference in the basis, we use GA to search the Gabor wavelet basis space for an optimal basis generated by specifying a combination of spatial frequency and orientation angle pairs in the 2-D Gabor wavelet transformation. In the GA process, entropy is chosen as the fitness function so that the child chromosomes carry more information content; this has the advantage that we do not need to perform classification in each generation of the evolution, which would be required if the within- and between-class scatter evaluation were used. Compared to the classical "eigenface" algorithm, the optimal basis found by GA can improve the overall performance by as much as 10%.

References

[1] J. G. Daugman. Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. Journal of the Optical Society of America, 2(7):1160-1169, 1985.
[2] J. G. Daugman. Complete discrete 2-D Gabor transforms by neural networks for image analysis and compression. IEEE Transactions on Acoustics, Speech, and Signal Processing, 36(7):1169-1179, July 1988.
[3] J. G. Daugman. An information-theoretic view of analog representation in striate cortex. In Computational Neuroscience, pages 403-424. MIT Press, 1990.
[4] C. Liu and H. Wechsler. Evolutionary pursuit and its application to face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(6):570-582, June 2000.
[5] M. J. Lyons, J. Budynek, and S. Akamatsu. Automatic classification of single facial images. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(12):1357-1362, December 1999.
[6] S. G. Mallat and Z. Zhang. Matching pursuits with time-frequency dictionaries. IEEE Transactions on Signal Processing, 41:3397-3415, 1993.
[7] B. Moghaddam and A. Pentland. Face recognition using view-based and modular eigenspaces. In Automatic Systems for the Identification and Inspection of Humans, SPIE, vol. 2277, July 1994.
[8] H. Qi. http://panda.ece.utk.edu/~hqi/ece471-571, Pattern Classification, Spring 2001.
[9] M. Turk and A. Pentland. Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3(1):71-86, 1991.
[10] D. Whitley. A Genetic Algorithm Tutorial. Computer Science Department, Colorado State University, Fort Collins, CO 80523.