Face Recognition by Curvelet Based Feature Extraction

Tanaya Mandal¹, Angshul Majumdar², and Q.M. Jonathan Wu¹

¹ Department of Electrical and Computer Engineering, University of Windsor, Canada
² PricewaterhouseCoopers, India
[email protected], [email protected], [email protected]

Abstract. This paper proposes a new method for face recognition based on a multiresolution analysis tool called the Digital Curvelet Transform. Multiresolution ideas, notably the wavelet transform, have been profusely employed to address the problem of face recognition. However, theoretical studies indicate that the digital curvelet transform can be an even better representation than wavelets. In this paper, features are extracted by taking the curvelet transform of each original image and of its quantized 4 bit and 2 bit representations. The curvelet coefficients thus obtained act as the feature set for classification. These three sets of coefficients, from the three different versions of the images, are then used to train three Support Vector Machines. During testing, the results of the three SVMs are fused to determine the final classification. The experiments were carried out on three well known databases, viz., the Georgia Tech Face Database, the AT&T "Database of Faces" and the Essex Grimace Face Database.

1 Introduction

Face Recognition has been studied for over 20 years in computer vision. Since the beginning of the 1990s, the subject has become a major issue, mainly due to important real-world applications of face recognition such as smart surveillance, secure access, telecommunications, digital libraries and medicine. Faces are very specific objects: their most common appearances (frontal faces) roughly look alike, yet subtle changes make faces different. The different face recognition techniques have been discussed in the work of Zhao et al. [1]. In recent years, the success of wavelets in other branches of computer vision inspired face recognition researchers to apply wavelet based multiresolution techniques to face recognition [2, 3]. Over the past two decades, following wavelets, other multiresolution tools such as contourlets [4], ridgelets [5] and curvelets [6], to name a few, were developed. These tools have better directional decomposition capabilities than wavelets. These new techniques were used for image processing problems like image compression [7] and denoising [8], but not for addressing problems related to computer vision. In some recent works, Majumdar showed that a new multiresolution tool, curvelets, can serve as bases for pattern recognition problems. Using curvelets, he obtained very good results for character recognition [9]. In a comparative study [10], Majumdar showed that curvelets can indeed supersede wavelets as bases for face recognition. In this paper we propose to go one step further towards a curvelet based face recognition system by fusing results from multiple classifiers trained with curvelet coefficients from images of different gray scale resolutions.

In Section 2, the curvelet transform and our proposed feature extraction technique are discussed. A brief overview of the Support Vector Machine (SVM), the classifier we have used, is given in Section 3. Section 4 covers the three databases on which we carried out our experiments. Finally, Section 5 lists the experimental results and Section 6 concludes with the future prospects of this technique.

M. Kamel and A. Campilho (Eds.): ICIAR 2007, LNCS 4633, pp. 806–817, 2007. © Springer-Verlag Berlin Heidelberg 2007

2 Curvelet Based Feature Extraction

Wavelets and related classical multiresolution ideas exploit a limited dictionary made up of roughly isotropic elements occurring at all scales and locations. These dictionaries do not exhibit highly anisotropic elements, and there are only a fixed number of directional elements (the usual orthogonal wavelet transforms have wavelets with primarily vertical, primarily horizontal and primarily diagonal orientations), independent of scale. Images do not always exhibit isotropic scaling and thus call for other kinds of multi-scale representation. Computer vision researchers of the '80s and early '90s [11, 12] were inspired by two biological properties of the visual cortex: that it functions in (a) a multi-scale and (b) a multi-orientational mode. The multi-scale aspect has been captured by scale-space analysis as well as by the wavelet transform. However, standard wavelet transforms of two-dimensional functions f(x1, x2) have only very crude capabilities to resolve directional features. The limitations of the wavelet transform inspired vision researchers to propose new transforms with improved directional representation, such as the 'Steerable Pyramids' and 'Cortex Transforms'. The curvelet transform by Candes and Donoho [6] is the latest multi-directional multi-scale transform. Field and Olshausen [13] set up a computer experiment for empirically discovering the basis that best represents a database of 16 by 16 image patches. Although this experiment was limited in scale, they discovered that the best basis is a collection of needle shaped filters occurring at various scales, locations and orientations. The interested reader will find a stark similarity between curvelets, which derive from mathematical analysis, and these empirical basis elements arising from data analysis [14]. It is not possible to go into the details of the digital curvelet transform within this paper.
The interested reader can refer to the works of Candes and Donoho [6, 15]. A brief procedural definition of the curvelet transform is provided here for ready reference; the detailed discussion can be found in [15].

Ridgelet Transform - A basic tool for calculating ridgelet coefficients is to view ridgelet analysis as a form of wavelet analysis in the Radon domain. The Radon transform

$R : L^2(\mathbb{R}^2) \to L^2([0, 2\pi], L^2(\mathbb{R}))$

is defined by

$Rf(\theta, t) = \int f(x_1, x_2)\, \delta(x_1 \cos\theta + x_2 \sin\theta - t)\, dx_1\, dx_2$    (1)

where $\delta$ is the Dirac delta. The ridgelet coefficients $\Re_f(a, b, \theta)$ of an object $f$ are given by analysis of the Radon transform via

$\Re_f(a, b, \theta) = \int Rf(\theta, t)\, a^{-1/2}\, \psi\!\left(\frac{t - b}{a}\right) dt$    (2)

Hence, the ridgelet transform is precisely the application of a one-dimensional wavelet transform to the slices of the Radon transform where the angular variable $\theta$ is constant and $t$ is varying.

Discrete Ridgelet Transform - A basic strategy for calculating the continuous ridgelet transform is first to compute the Radon transform $Rf(\theta, t)$ and second to apply a one-dimensional wavelet transform to the slices $Rf(\theta, t)$. A fundamental fact about the Radon transform is the projection formula [16]:

$\hat{f}(\omega \cos\theta, \omega \sin\theta) = \int Rf(\theta, t)\, e^{-2\pi i \omega t}\, dt$    (3)

This says that the Radon transform can be obtained by applying the one-dimensional inverse Fourier transform to the two-dimensional Fourier transform restricted to radial lines through the origin. This enables one to arrive at an approximate Radon transform for digital data based on the FFT. The steps to obtain this are as follows:

1. 2D-FFT: Compute the two-dimensional Fast Fourier Transform (FFT) of $f$.
2. Cartesian to polar conversion: Using an interpolation scheme, substitute the sampled values of the Fourier transform obtained on the square lattice with sampled values of $\hat{f}$ on a polar lattice, i.e., on a lattice where the points fall on lines through the origin.
3. 1D-IFFT: Compute the one-dimensional Inverse Fast Fourier Transform (IFFT) on each line, i.e., for each value of the angular parameter.

Digital Curvelet Transform - The digital curvelet transform of $f$ is achieved by the following implementation steps.

Subband Decomposition: A bank of filters $P_0, (\Delta_s, s > 0)$ is defined. The image $f$ is filtered into subbands with the à trous algorithm:

$f \to (P_0 f, \Delta_1 f, \Delta_2 f, \ldots)$    (4)

The different sub-bands $\Delta_s f$ contain details about $2^{-2s}$ wide.

Smooth Partitioning: Each subband is smoothly windowed into "squares" of an appropriate scale:

$\Delta_s f \to (w_Q \Delta_s f)_{Q \in Q_s}$    (5)

where $w_Q$ is a collection of smooth windows localized around the dyadic squares

$Q = [k_1/2^s, (k_1 + 1)/2^s] \times [k_2/2^s, (k_2 + 1)/2^s]$


Renormalization: Each resulting square is renormalized to unit scale:

$g_Q = (T_Q)^{-1}(w_Q \Delta_s f), \quad Q \in Q_s$    (6)

Fig. 1. Overview of Curvelet Transform

The following images show the curvelet transform coefficients of a face from the AT&T database at one approximate and eight detailed decompositions.

Fig. 2. Curvelet Transform of face images. The first one is that of approximate coefficients. The rest are detailed coefficients at eight different angles.


Fig. 2. (continued)

Curvelets are good at representing edge discontinuities in two-dimensional functions. In this work we exploit this property of curvelets in a novel way for facial feature extraction. Human faces are three-dimensional objects that are represented in two dimensions in ordinary images. As a result, when a face is photographed, different parts of the face reflect the incident light differently and we find differential shades in the face image. We human beings are able to get a rough idea of the three-dimensional structure from these differential shades. Black and white digital images are represented in 8 bits or 16 bits, resulting in 256 or 65536 gray levels. Let us suppose that the images are represented by 256 gray levels (indeed, the image databases we used are all 8 bit images). In such an image, two very near regions can have differing pixel values. Such a gray scale image will have a lot of "edges", and consequently the curvelet transform will capture this edge information. But if we quantize the gray levels, say to 128 or 64, nearby regions that had very little difference in pixel values and formed edges in the original 8 bit image will be merged, and as a result only the bolder edges in the face image will be represented. Now if these gray-level quantized images are curvelet transformed, the transformed domain coefficients will contain information about these bolder curves. Images of the same person from the AT&T face database, quantized to 4 bits and 2 bits from the original 8 bit representation, are shown below.
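The gray-level quantization used here amounts to discarding the least significant bits of each pixel. A minimal sketch, assuming 8-bit inputs (the helper name `quantize_gray` is ours):

```python
import numpy as np

def quantize_gray(img, n_bits):
    """Requantize an 8-bit grayscale image (values 0-255) to n_bits of
    gray-level resolution by zeroing the 8 - n_bits low-order bits."""
    shift = 8 - n_bits
    return (img.astype(np.uint8) >> shift) << shift
```

For example, `quantize_gray(img, 4)` leaves 16 gray levels and `quantize_gray(img, 2)` leaves 4, merging nearby pixel values so that only the bolder edges survive.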

Fig. 3. The first images of the rows are the original 8 bit representations. The second images are the 4 bit images while the last ones are the 2 bit images of the original 8 bit images.


Fig. 3. (continued)

From the above images it can be clearly seen how the edge information varies as the gray scale of the face images is quantized. In this work we vary the number of gray levels from 256 to 16 and 4. As the images are quantized, only the bolder curves of the face image remain. When we take the curvelet transform of these 8 bit, 4 bit and 2 bit images, the bolder curves in a person's face are captured. We train three classifiers with the curvelet coefficients from the three gray scale representations of the images. During testing, the test images are quantized in the same manner, and the quantized test images are classified by the three corresponding classifiers. Finally the outputs of these three classifiers are fused to arrive at the final decision. The idea behind this scheme is that, even if a person's face fails to be recognized by the fine curves present in the original 8 bit image, it may be recognized by the bolder curves at a lower bit resolution. To show how the curvelet based scheme fares against wavelets, the same exercise as depicted in the preceding paragraphs is also carried out in the wavelet domain, i.e., the wavelet transform is used on the bit quantized images instead of the curvelet transform. The results of the two schemes are then compared.
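The fusion step just described, majority voting with rejection when there is no clear winner, can be sketched as follows (`fuse_votes` is our illustrative name):

```python
from collections import Counter

def fuse_votes(labels):
    """Fuse the per-resolution classifier decisions by simple majority
    voting; return None (reject) when the top vote count is tied."""
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # no clear winner -> reject the test image
    return counts[0][0]
```

For three classifiers, `fuse_votes(["A", "A", "B"])` returns `"A"`, while `fuse_votes(["A", "B", "C"])` rejects.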

3 Support Vector Classification

Support Vector Machine (SVM) [17] models are a close cousin of classical neural networks. Using a kernel function, SVMs are an alternative training method for polynomial, radial basis function and multi-layer perceptron classifiers, in which the weights of the network are found by solving a quadratic programming problem with linear constraints, rather than by solving the non-convex, unconstrained minimization problem of standard neural network training. The two most popular approaches to multi-class classification are the One-Against-All (OAA) method and the One-Against-One (OAO) method. For our purpose we used a One-Against-All (OAA) SVM because it constructs g binary classifiers, as against the g(g-1)/2 classifiers required by a One-Against-One SVM, when addressing a g-class problem.
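The classifier-count argument for choosing OAA can be made concrete with a small sketch (the helper name is ours):

```python
def n_binary_classifiers(g, scheme):
    """Number of binary SVMs needed for a g-class problem:
    one-against-all trains g machines, one-against-one g*(g-1)/2."""
    if scheme == "OAA":
        return g
    return g * (g - 1) // 2
```

For the 50-class Georgia Tech problem this means 50 binary machines for OAA against 1225 for OAO.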

4 Databases

We tried our face recognition approach on several well known databases. A brief discussion of the databases used follows.


Georgia Tech Face Database [18]. This database contains images of 50 people taken in two or three sessions between 06/01/99 and 11/15/99 at the Center for Signal and Image Processing at the Georgia Institute of Technology. Each person in the database is represented by 15 color JPEG images with cluttered backgrounds, taken at a resolution of 640x480 pixels. The average size of the faces in these images is 150x150 pixels. The pictures show frontal and/or tilted faces with different facial expressions, lighting conditions and scales. Each image is manually labeled to determine the position of the face in the image. Of these 15 images, 9 were used as the training set and the remaining 6 as the testing set. The random division into training and testing sets was done thrice.

Fig. 4. Faces from the Georgia Tech Face Database

AT&T "The Database of Faces" [19]. This database contains ten different images of each of 40 distinct subjects. For some subjects, the images were taken at different times, varying the lighting, facial expressions (open / closed eyes, smiling / not smiling) and facial details (glasses / no glasses). All the images were taken against a dark homogeneous background with the subjects in an upright, frontal position (with tolerance for some side movement). For this database, 6 images per person served as the training set and the remaining 4 constituted the testing set. This random segregation into training and testing sets was done thrice.

Fig. 5. Faces from the AT&T Face Database

Essex Face Database [20]. A sequence of 20 images each was taken for 18 individuals, male and female, using a fixed camera. The images are of size 180x200 pixels. During the sequence the subject moves his/her head and makes grimaces which get more extreme towards the end of the sequence. There is about 0.5 seconds between successive frames in the sequence. Of the 20 images of each individual, 12 were randomly chosen for the training set and the rest for testing. This random segregation into training and testing sets was done thrice.


Fig. 6. Faces from the Essex Grimace Face Database

5 Results

While doing this work, we converted the colour images from the Georgia Tech database and the Essex Grimace database to black and white. Apart from this we did not apply any pre-processing step such as cropping or slant correction. However, [21] showed that the recognition accuracy on face images does not degrade if the size of the image is reduced before any feature extraction is done. Following this work, as a pre-processing step, we reduced all the images by four times in length and width. Images from the databases were divided into training and testing sets randomly; for each database, this segregation was done thrice. For each of the three pairs of training and testing sets thus obtained, three sets of experiments were performed. As discussed in Section 2, we converted each of the images to two other gray scale resolutions: all the images were originally 8 bit, and we converted them to 4 bit and 2 bit images. With these three versions of the training images we trained three SVM classifiers, one each for the 8 bit, 4 bit and 2 bit images. The test images too were converted to 4 bit and 2 bit versions apart from the original 8 bit one, and each version of the test images was classified by the corresponding classifier. The final decision was made by fusing the outputs of these three classifiers using simple majority voting; if there was no clear winner, the test image was rejected. As mentioned at the end of Section 2, a comparative study between wavelets and curvelets is presented here. The results of the two schemes on the Georgia Tech database can be compared from Tables 1-6.

Table 1. Curvelet based results of Set 1

No. of bits in image:            8      4      2
Accuracy of each classifier:     83.3   83.3   79.7
Accuracy after majority voting:  88.7
Rejection rate:                  7
Incorrect classification rate:   4.3

Table 2. Wavelet based results of Set 1

No. of bits in image:            8      4      2
Accuracy of each classifier:     82.3   82.7   78
Accuracy after majority voting:  86
Rejection rate:                  8.7
Incorrect classification rate:   5.3

Table 3. Curvelet based results of Set 2

No. of bits in image:            8      4      2
Accuracy of each classifier:     78.7   78     74
Accuracy after majority voting:  81.3
Rejection rate:                  10.7
Incorrect classification rate:   8

Table 4. Wavelet based results of Set 2

No. of bits in image:            8      4      2
Accuracy of each classifier:     78.0   76.7   73.3
Accuracy after majority voting:  80.7
Rejection rate:                  11
Incorrect classification rate:   8.3

Table 5. Curvelet based results of Set 3

No. of bits in image:            8      4      2
Accuracy of each classifier:     85.7   85.3   84.3
Accuracy after majority voting:  89.7
Rejection rate:                  5.7
Incorrect classification rate:   4.6

Table 6. Wavelet based results of Set 3

No. of bits in image:            8      4      2
Accuracy of each classifier:     84.3   84.0   83.7
Accuracy after majority voting:  87.3
Rejection rate:                  6.3
Incorrect classification rate:   6

From the above six tables it can be seen that the curvelet based scheme shows better results than the wavelet based scheme. The experimental results on the AT&T database are tabulated in Tables 7, 8 and 9.

Table 7. Results of Set 1

No. of bits in image:            8      4      2
Accuracy of each classifier:     96.9   95.6   93.7
Accuracy after majority voting:  98.8
Rejection rate:                  1.2
Incorrect classification rate:   0


Table 8. Results of Set 2

No. of bits in image:            8      4      2
Accuracy of each classifier:     98.8   98.1   97.5
Accuracy after majority voting:  99.4
Rejection rate:                  0.6
Incorrect classification rate:   0

Table 9. Results of Set 3

No. of bits in image:            8      4      2
Accuracy of each classifier:     96.9   96.2   95.6
Accuracy after majority voting:  100
Rejection rate:                  0
Incorrect classification rate:   0

The following three tables tabulate the results on the Essex Grimace database.

Table 10. Results of Set 1

No. of bits in image:            8      4      2
Accuracy of each classifier:     100    100    100
Accuracy after majority voting:  100
Rejection rate:                  0
Incorrect classification rate:   0

Table 11. Results of Set 2

No. of bits in image:            8      4      2
Accuracy of each classifier:     95.8   95.8   94.4
Accuracy after majority voting:  97.2
Rejection rate:                  1.4
Incorrect classification rate:   1.4

Table 12. Results of Set 3

No. of bits in image:            8      4      2
Accuracy of each classifier:     96.7   95.6   95.6
Accuracy after majority voting:  97.2
Rejection rate:                  2.1
Incorrect classification rate:   0.9

6 Conclusion

The technique introduced in this paper appears to be robust to changes in facial expression, as it shows good results on the Essex and AT&T databases. However,


we are still trying to improve the recognition accuracy for sideways tilted images, like those in the Georgia Tech database. Further work is suggested in improving the recognition accuracy by cropping images, making tilt corrections, and exploring other voting schemes as well.

Acknowledgement. This work is supported in part by the Canada Research Chair program and the Natural Sciences and Engineering Research Council of Canada.

References

1. Zhao, W., Chellappa, R., Rosenfeld, A., Phillips, P.J.: Face Recognition: A Literature Survey. ACM Computing Surveys, 399–458 (2003)
2. Chen, C.F., Tseng, Y.S., Chen, C.Y.: Combination of PCA and Wavelet Transforms for Face Recognition on 2.5D Images. In: Proc. Image and Vision Computing NZ, pp. 343–347 (2003)
3. Tian, G.Y., King, S., Taylor, D., Ward, S.: Wavelet based Normalisation for Face Recognition. In: Proceedings of the International Conference on Computer Graphics and Imaging (CGIM) (2003)
4. Do, M.N., Vetterli, M.: The contourlet transform: an efficient directional multiresolution image representation. IEEE Transactions on Image Processing 14(12), 2091–2106 (2005)
5. Do, M.N., Vetterli, M.: The finite ridgelet transform for image representation. IEEE Transactions on Image Processing 12(1), 16–28 (2003)
6. Donoho, D.L., Duncan, M.R.: Digital curvelet transform: strategy, implementation and experiments. Tech. Rep., Department of Statistics, Stanford University (1999)
7. Belbachir, A.N., Goebel, P.M.: The Contourlet Transform for Image Compression. Physics in Signal and Image Processing, Toulouse, France (January 2005)
8. Li, A., Li, X., Wang, S., Li, H.: A Multiscale and Multidirectional Image Denoising Algorithm Based on Contourlet Transform. In: International Conference on Intelligent Information Hiding and Multimedia, pp. 635–638 (2006)
9. Majumdar, A.: Bangla Basic Character Recognition using Digital Curvelet Transform. Journal of Pattern Recognition Research (accepted for publication)
10. Majumdar, A.: Curvelets: A New Approach to Face Recognition. Journal of Machine Learning Research (submitted)
11. Simoncelli, E.P., Freeman, W.T., Adelson, E.H., Heeger, D.J.: Shiftable multi-scale transforms. IEEE Transactions on Information Theory, Special Issue on Wavelets 38(2), 587–607 (1992)
12. Watson, A.B.: The cortex transform: rapid computation of simulated neural images. Computer Vision, Graphics, and Image Processing 39(3), 311–327 (1987)
13. Olshausen, B.A., Field, D.J.: Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381, 607–609 (1996)
14. Candès, E.J., Guo, F.: New Multiscale Transforms, Minimum Total Variation Synthesis: Applications to Edge-Preserving Image Reconstruction. Signal Processing 82, 1519–1543 (2002)
15. Candès, E., Demanet, L., Donoho, D., Ying, L.: Fast Discrete Curvelet Transforms, http://www.curvelet.org/papers/FDCT.pdf


16. Deans, S.R.: The Radon Transform and Some of Its Applications. John Wiley & Sons, New York (1983)
17. Cortes, C., Vapnik, V.: Support-Vector Networks. Machine Learning 20(3), 273–297 (1995)
18. http://www.anefian.com/gt_db.zip
19. http://www.cl.cam.ac.uk/Research/DTG/attarchive:pub/data/att_faces.zip
20. http://cswww.essex.ac.uk/mv/allfaces/grimace.zip
21. Ruiz-Pinales, J., Acosta-Reyes, J.J., Salazar-Garibay, A., Jaime-Rivas, R.: Shift Invariant Support Vector Machines Face Recognition System. Transactions on Engineering, Computing and Technology 16, 167–171 (2006)