Face Recognition Using New Image Representations

Zhiming Liu and Qingchuan Tao

Zhiming Liu and Qingchuan Tao are with the School of Electronics and Information Engineering, Sichuan University, Chengdu, Sichuan 610064, P.R. China. Zhiming Liu is also with the Department of Computer Science, New Jersey Institute of Technology, Newark, New Jersey 07102, USA (email: [email protected]).

Abstract— This paper presents a novel face recognition method based on new image representations. The commonly used gray-scale image is derived from a linear combination of the R, G, and B color component images, whereas the new image representations are derived by applying the Principal Component Analysis (PCA) transform to hybrid configurations of different color component images. Compared with the correlated RGB color space, the correlations among the other configurations of color components (such as RCrQ, YIQ, YCbQ, and so on) are reduced, and the diversity of their classification outputs is therefore enhanced. Thus, the new image representations, which inherit advantages from all the individual color components, are more robust to image variations than the gray-scale image in the face recognition task. Furthermore, we propose to encode the facial information in the new image representations with an effective Local Binary Pattern (LBP) feature extraction method, which extracts and fuses multi-resolution LBP features. Finally, the resulting LBP features undergo Fisher discriminant analysis for face recognition. On the most challenging Face Recognition Grand Challenge (FRGC) version 2 Experiment 4, the proposed method achieves a face verification rate of 83.41% at a false accept rate of 0.1%, outperforming some recent face recognition methods.
I. INTRODUCTION

Pattern recognition relies heavily on the particular choice of features used by the classifiers. Face recognition, a representative pattern recognition problem, has attracted substantial attention from researchers in computer vision, pattern recognition, and machine learning. Most current face recognition methods perform feature extraction on the gray-scale image, which is derived from a linear combination of the three primary color components R, G, and B by either I = (R + G + B)/3 or Y = 0.299R + 0.587G + 0.114B. However, more theoretical evidence is needed to support the claim that such transformations are optimal for image classification. Moreover, since the color components R, G, and B are strongly correlated with each other, can more discriminating information really be created by a simple combination of them?

Recently, color information has been demonstrated to be effective not only for face detection but also for face recognition. The different color spaces defined by transformations from the RGB color space possess different color characteristics and hence different discriminating capabilities for pattern recognition. The R component image and the V component image of the HSV color space, for example, have been shown to be more effective for face recognition than the component images of several other color spaces [1]. The complementary characteristics of color spaces can be exploited to improve face recognition performance. The results in [2] show that the YUV color space provides better face recognition performance than the RGB color space. A hybrid configuration of color components, for example RIQ, is more discriminating for face recognition than conventional color spaces such as YIQ [3]. In addition, meaningful color transformations have been investigated in [4], [5], [6] using statistical learning methods, such as PCA, Fisher Linear Discriminant analysis (FLD), and Independent Component Analysis (ICA), in order to derive effective image representations for face recognition.

In this paper, we conduct a preliminary investigation into deriving new image representations from hybrid configurations of color component images by the PCA transformation. Unlike the correlated color components R, G, and B, the color components of several other color spaces have reduced correlations with each other, and hence the diversity of classification outputs among them can be enhanced. Thus, combinations of these color components should yield higher classification performance than the gray-scale image Y, which is derived directly from R, G, and B. Moreover, the optimal color transformation can be learned by means of principal component analysis, also called the Karhunen-Loeve transform. PCA seeks a principal subspace of lower dimensionality that best preserves the variation energy, so that the features in this subspace represent the original data accurately. The basis vector associated with the largest eigenvalue is the principal axis of the data distribution, which encodes the maximum variance of the distribution. Thus, the projections of the color component images onto this axis capture most of the original data information. In particular, a new image representation such as U_YCrQ, derived from the color components Y, Cr, and Q, is strongly complementary to other new image representations, such as U_YCbQ, derived from the color components Y, Cb, and Q. As a result, fusing these new image representations, at either the image level or the decision level, can boost classification performance significantly. For face recognition, we present a multi-resolution Local Binary Pattern (LBP) feature fusion scheme, which extracts and fuses LBP features at three different spatial resolutions, providing much more discriminating capability than a single LBP operator can for face recognition.

The feasibility of the proposed method is evaluated on the Face Recognition Grand Challenge (FRGC) version 2 Experiment 4, which contains 12,776 training images, 16,028 controlled target images, and 8,014 uncontrolled query images. The Biometric Experimentation Environment (BEE) system reveals that the proposed method achieves a Face Verification Rate (FVR) of 83.41% (ROC III) at a False Accept Rate (FAR) of 0.1%, better than some methods that use the conventional gray-scale image.
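For concreteness, the color component images discussed above can be obtained as sketched below. This is a minimal sketch assuming the standard NTSC YIQ and JPEG YCbCr transform coefficients; the paper does not state which variants of the transforms were used.

```python
import numpy as np

def rgb_to_components(rgb):
    """Split an H x W x 3 RGB image (float array, range [0, 255]) into the
    R, G, B, Y, I, Q, Cb, Cr component images used throughout this paper.
    Standard NTSC YIQ and JPEG YCbCr coefficients are assumed; the +128
    offsets of 8-bit YCbCr are omitted, since a constant shift affects
    neither correlation coefficients nor PCA."""
    R, G, B = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    Y = 0.299 * R + 0.587 * G + 0.114 * B
    I = 0.596 * R - 0.274 * G - 0.322 * B
    Q = 0.211 * R - 0.523 * G + 0.312 * B
    Cb = -0.169 * R - 0.331 * G + 0.500 * B
    Cr = 0.500 * R - 0.419 * G - 0.081 * B
    return {"R": R, "G": G, "B": B, "Y": Y, "I": I,
            "Q": Q, "Cb": Cb, "Cr": Cr}
```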
TABLE I
CORRELATION COEFFICIENTS BETWEEN THE DIFFERENT COLOR COMPONENTS.

        R      G      B      Y      I      Q      Cb     Cr
R       1.0    0.93   0.87   0.97   0.78   0.03  -0.77   0.72
G       0.93   1.0    0.94   0.99   0.55  -0.22  -0.67   0.46
B       0.87   0.94   1.0    0.93   0.42  -0.06  -0.46   0.37
Y       0.97   0.99   0.93   1.0    0.64  -0.11  -0.71   0.56
I       0.78   0.55   0.42   0.64   1.0    0.28  -0.84   0.97
Q       0.03  -0.22  -0.06  -0.11   0.28   1.0    0.22   0.46
Cb     -0.77  -0.67  -0.46  -0.71  -0.84   0.22   1.0   -0.72
Cr      0.72   0.46   0.37   0.56   0.97   0.46  -0.72   1.0
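The coefficients in Table I can be reproduced, in principle, along the following lines. This is a sketch under the assumption that each component image is flattened and all pixels of all training images are pooled into a single variable per component; the paper does not specify the exact pooling.

```python
import numpy as np

def component_correlations(images, names):
    """images: dict mapping a component name (e.g. 'R', 'Y', 'Q') to an
    array of shape (num_images, H, W) over the training set. Returns the
    matrix of pairwise Pearson correlation coefficients between the
    flattened components, as in Table I."""
    data = np.stack([images[n].reshape(-1) for n in names])  # (K, pixels)
    return np.corrcoef(data)  # (K, K) symmetric matrix
```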
TABLE II
QUANTITIES OF CRITERION J4 OF THE DIFFERENT COLOR COMPONENTS.

Criterion   R      G      B      Y      I      Q      Cb     Cr
J4          0.433  0.382  0.333  0.394  0.512  0.426  0.476  0.508

II. MOTIVATION

For color face image recognition, the RGB color space is commonly used [7], [8], [9], [10]. However, the correlations among the R, G, and B components are strong, and combining them therefore contributes little to improving performance in pattern recognition tasks such as face recognition. Consequently, other color spaces transformed from the RGB space, such as YIQ, HSV, and YCbCr, have been adopted for face recognition [2], [1]. Furthermore, hybrid configurations of different color components have been shown to be more effective than some conventional color spaces for face recognition [11], [12], [3], [13].

First, we calculate the correlation coefficients between the individual components of the RGB, YIQ, and YCbCr color spaces. Note that in this paper we consider only the YIQ and YCbCr color spaces besides RGB, owing to space limitations. Table I lists the correlation coefficients, computed from the color components of the FRGC training database. The values in Table I indicate that the R, G, and B components are strongly correlated with each other, while the other color components usually have reduced correlations with each other. This observation implies that better recognition performance could be obtained from the other color spaces than from the RGB color space, since the decorrelation property is essential for a pattern recognition system. Table I thus provides evidence for choosing configurations of color components on the basis of their correlation characteristics.
Next, we examine the discriminative power of the individual color components by analyzing a class separability criterion. Based on the within-class scatter matrix Sw and the between-class scatter matrix Sb of the training database, we evaluate class separability using the Fisher criterion J4 = tr(Sb)/tr(Sw). Table II gives the results, which indicate that the color components G and B have the weakest image classification power, at least on the FRGC training database. As a result, the G and B components are excluded from the configurations constructed with the other color components. We can therefore choose color components, by considering both the correlations among them and their discriminative power, to form color spaces such as the representative RCrQ, YIQ, and YCrQ, which will undergo the PCA transformation to create the new image representations for face recognition.
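As a concrete illustration, a minimal sketch of the J4 criterion follows. We assume each sample is a flattened component image with an identity label; the exact sample representation used for Table II is not specified in the paper.

```python
import numpy as np

def fisher_criterion_J4(X, labels):
    """Class separability J4 = tr(Sb) / tr(Sw) for samples X of shape
    (num_samples, dim) with integer class labels."""
    dim = X.shape[1]
    mean_all = X.mean(axis=0)
    Sw = np.zeros((dim, dim))  # within-class scatter
    Sb = np.zeros((dim, dim))  # between-class scatter
    for c in np.unique(labels):
        Xc = X[labels == c]
        mean_c = Xc.mean(axis=0)
        diff = Xc - mean_c
        Sw += diff.T @ diff
        d = (mean_c - mean_all)[:, None]
        Sb += Xc.shape[0] * (d @ d.T)
    return np.trace(Sb) / np.trace(Sw)
```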
III. NEW IMAGE REPRESENTATION VIA PCA TRANSFORMATION
An effective new image representation can be derived from a hybrid configuration of color components via the PCA transformation, whose most significant basis vector captures most of the energy of the input data. In a C1C2C3 color space, an image of resolution m × n consists of three color components C1, C2, and C3. Without loss of generality, we assume that C1, C2, and C3 are column vectors: C1, C2, C3 ∈ R^N, where N = m × n. We can form a data matrix X ∈ R^(Nl×3) from all the training images:

    X = [ C1^1  C2^1  C3^1
          C1^2  C2^2  C3^2
           ...   ...   ...
          C1^l  C2^l  C3^l ],                                (1)
where l is the number of training images. The covariance matrix ΣX may be formulated as

    ΣX = E{[X − E(X)]^t [X − E(X)]},                         (2)

where E(·) is the expectation operator, t denotes transposition, and ΣX ∈ R^(3×3). PCA factorizes the covariance matrix ΣX into the form

    ΣX = P Λ P^t,                                            (3)

where P = [Φ1, Φ2, Φ3] ∈ R^(3×3) is an orthonormal eigenvector matrix and Λ = diag{λ1, λ2, λ3} ∈ R^(3×3) is a diagonal eigenvalue matrix with diagonal elements in decreasing order (λ1 ≥ λ2 ≥ λ3). Φ1, Φ2, Φ3 and λ1, λ2, λ3 are the eigenvectors and eigenvalues of ΣX, respectively. A new image representation U ∈ R^N is then derived by projecting the three color component images of a C1C2C3 image onto Φ1:

    U = [C1, C2, C3] Φ1.                                     (4)
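A minimal NumPy sketch of Eqs. (1)–(4) follows; the function names are ours. Since ΣX is only 3 × 3, its eigendecomposition is exact and cheap.

```python
import numpy as np

def learn_color_transform(train_components):
    """Estimate Phi_1 of Eq. (3). 'train_components' is a list of
    (C1, C2, C3) tuples, one per training image, where each Ck is a
    length-N vector; stacking them gives the matrix X of Eq. (1)."""
    X = np.vstack([np.column_stack(c) for c in train_components])  # (N*l, 3)
    Xc = X - X.mean(axis=0)                   # centering, as in Eq. (2)
    Sigma = Xc.T @ Xc / (len(Xc) - 1)         # 3 x 3 covariance Sigma_X
    eigvals, eigvecs = np.linalg.eigh(Sigma)  # eigenvalues in ascending order
    return eigvecs[:, -1]                     # Phi_1: largest eigenvalue

def new_image_representation(C1, C2, C3, Phi1):
    """Eq. (4): project the three components of one image onto Phi_1."""
    return np.column_stack([C1, C2, C3]) @ Phi1  # U in R^N
```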
IV. EXPERIMENTS

This section assesses the proposed method on the Face Recognition Grand Challenge (FRGC) version 2 database [14]. The most challenging FRGC experiment, FRGC version 2 Experiment 4 [14], is used to evaluate our method. The training set contains 12,776 images, both controlled and uncontrolled. The target set has 16,028 controlled images, and the query set has 8,014 uncontrolled images. While the faces in the controlled images have good resolution and illumination, the faces in the uncontrolled images have lower resolution and larger illumination variations. The FRGC baseline algorithm, in essence a PCA algorithm, reveals that these uncontrolled factors pose a grand challenge to face recognition performance. The BEE system generates three ROC curves (ROC I, ROC II, and ROC III), corresponding to images collected within semesters, within a year, and between semesters, respectively [14]. The images used in our experiments are of size 64 × 64.

A. Effectiveness of New Image Representations for Face Recognition

According to Tables I and II, we choose several promising configurations of color components: RCrQ, RCbQ, RIQ, YCrQ, YCbQ, and YIQ; the other configurations, RGB, RCbCr, RCbI, RCrI, YCbCr, YCbI, and YCrI, are also included for comparison. New image representations, such as U_RCrQ, U_RCbQ, and so on, can then be generated using the transformation (namely, Φ1 in (4)) derived from PCA.
Fig. 1. The image representations. First row: R, G, and B. Second row: Y, I, and Q. Third row: Cb and Cr. Fourth row: U_YCrQ, U_RCrQ, U_YCbQ, and U_RGB.
The first set of experiments applies the Enhanced Fisher Model (EFM) [16] with the cosine similarity measure to evaluate the new image representations. Table III shows the face verification rates (FVR) at 0.1% false accept rate (FAR); only image representations with FVR above 60% are listed, and R, Y, and U_RGB are included for comparison. The experimental results indicate that (i) the new image representations generated from hybrid color configurations usually perform better than the commonly used Y and than U_RGB, the representation derived from RGB; (ii) although the correlation between Cr and I is strong, the new image U_RCrI still performs well for face recognition; and (iii) the Y component image is not ideally suited to face recognition, at least on the FRGC database, since its performance falls short of the R component image.

Fig. 1 shows some color component images and the resulting new image representations obtained with the learned transformation coefficients. The interesting findings from Fig. 1 are that (i) the new image representations, especially U_YCrQ, borrow information from the chromatic components, such as Cr and Q; this information, which is not available in the luminance components, helps resist image variations; and (ii) U_RGB is very similar to Y, because the R, G, and B components are correlated with each other and their combination cannot generate anything new.
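The scoring stage of this first set of experiments can be sketched as follows. The EFM projection itself (PCA followed by Fisher discriminant analysis [16]) is assumed to have been applied to the representations already.

```python
import numpy as np

def cosine_similarity_matrix(query, target):
    """query: (num_query, d) and target: (num_target, d) EFM feature
    vectors. Entry (i, j) of the result is the cosine of the angle
    between query vector i and target vector j."""
    q = query / np.linalg.norm(query, axis=1, keepdims=True)
    t = target / np.linalg.norm(target, axis=1, keepdims=True)
    return q @ t.T
```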
TABLE III
FACE VERIFICATION RATE (ROC III) AT 0.1% FALSE ACCEPT RATE (FAR) OF SOME NEW IMAGE REPRESENTATIONS, R, AND Y.

Image   U_YCrQ   U_RCrQ   U_YIQ    U_RIQ    U_RCrI   U_YCbQ
FVR     66.59%   65.75%   64.95%   64.77%   64.63%   61.71%

Image   U_YCrI   U_RCbQ   R        Y        U_RGB
FVR     61.65%   60.14%   62.12%   56.41%   58.82%
TABLE IV
CORRELATION COEFFICIENTS BETWEEN THE DIFFERENT NEW IMAGES.

          U_RCrQ  U_RCbQ  U_RIQ   U_YCrQ  U_YCbQ  U_YIQ   U_YCrI  U_RCrI
U_RCrQ     1.0     0.68    0.97   -0.99   -0.55    0.94    0.95    0.95
U_RCbQ     0.68    1.0     0.82   -0.69   -0.98    0.87    0.86    0.84
U_RIQ      0.97    0.82    1.0    -0.97   -0.71    0.99    0.99    0.99
U_YCrQ    -0.99   -0.69   -0.97    1.0     0.56   -0.94   -0.94   -0.95
U_YCbQ    -0.55   -0.98   -0.71    0.56    1.0    -0.78   -0.76   -0.74
U_YIQ      0.94    0.87    0.99   -0.94   -0.78    1.0     0.98    0.98
U_YCrI     0.95    0.86    0.99   -0.94   -0.76    0.98    1.0     0.99
U_RCrI     0.95    0.84    0.99   -0.95   -0.74    0.98    0.99    1.0
TABLE V
FACE VERIFICATION RATE (ROC III) AT 0.1% FALSE ACCEPT RATE BY FUSING THE CLASSIFICATION OUTPUTS BETWEEN THE DIFFERENT NEW IMAGES AT THE DECISION LEVEL.

          U_RCrQ  U_RCbQ  U_RIQ   U_YCrQ  U_YCbQ  U_YIQ   U_YCrI  U_RCrI
U_RCrQ    -       74.64%  67.19%  66.26%  76.65%  69.46%  67.42%  69.11%
U_RCbQ    74.64%  -       70.83%  75.16%  61.77%  69.10%  67.79%  75.34%
U_RIQ     67.19%  70.83%  -       67.72%  73.55%  65.82%  64.10%  69.82%
U_YCrQ    66.26%  75.16%  67.72%  -       77.10%  69.68%  67.98%  69.01%
U_YCbQ    76.65%  61.77%  73.55%  77.10%  -       71.86%  70.72%  76.81%
U_YIQ     69.46%  69.10%  65.82%  69.68%  71.86%  -       64.27%  70.67%
U_YCrI    67.42%  67.79%  64.10%  67.98%  70.72%  64.27%  -       70.53%
U_RCrI    69.11%  75.34%  69.82%  69.01%  76.81%  70.67%  70.53%  -

The second set of experiments assesses face recognition performance by fusing, at the decision level, the similarity matrices generated by any two new image representations. We first calculate the correlation coefficients between the new images. The results in Table IV show that U_YCbQ is only weakly correlated with U_YCrQ and U_RCrQ, so better performance can be expected when their classification outputs are fused. Next, we employ the z-score method to normalize the similarity matrices and then fuse them by the sum rule. The fused classification results are detailed in Table V, which shows that the best performance, 77.10%, is reached by fusing U_YCrQ and U_YCbQ.
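A minimal sketch of this decision-level fusion follows. We assume the z-score statistics are computed over the whole similarity matrix; the paper does not specify per-row versus global normalization.

```python
import numpy as np

def fuse_similarity_matrices(S_a, S_b):
    """Decision-level fusion of two similarity matrices, e.g. those of
    U_YCrQ and U_YCbQ: z-score normalize each matrix, then apply the
    sum rule."""
    def zscore(S):
        return (S - S.mean()) / S.std()
    return zscore(S_a) + zscore(S_b)
```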
B. LBP-Based Face Recognition Using the New Image Representations

In this section, we present an effective way to use LBP features for face recognition. In a 3 × 3 neighborhood of an image, the basic LBP operator assigns a binary label 0 or 1 to each surrounding pixel i_p by thresholding at the gray value of the central pixel i_c, and replaces the central value with the decimal number converted from the resulting 8-bit binary string. Formally, the LBP operator is defined as

    LBP = Σ_{p=0}^{7} 2^p s(i_p − i_c),                      (5)

where s(i_p − i_c) = 1 if i_p − i_c ≥ 0, and 0 otherwise.

Two extensions of the basic LBP were further developed [15]. The first allows LBP to handle neighborhoods of any size by using circular neighborhoods and bilinearly interpolating the pixel values. The second defines the so-called uniform patterns: treating the binary string as circular, an LBP is called uniform if it contains at most two bitwise transitions from 0 to 1 or vice versa. With these extensions, the operator is written LBP^{u2}_{P,R}, where P is the number of sampling points on a circle of radius R.
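A direct NumPy sketch of the basic operator of Eq. (5) follows. The starting neighbor and scan order are our assumption; any fixed circular ordering yields an equivalent code book.

```python
import numpy as np

def basic_lbp(image):
    """Basic 3 x 3 LBP of Eq. (5) for a 2-D grayscale array, evaluated at
    every interior pixel. Neighbors are scanned clockwise from the
    top-left corner (our choice; the paper does not fix the ordering)."""
    c = image[1:-1, 1:-1]  # central pixels i_c
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros(c.shape, dtype=np.int32)
    h, w = image.shape
    for p, (dy, dx) in enumerate(offsets):
        neighbor = image[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]  # pixels i_p
        code += (neighbor >= c).astype(np.int32) << p  # 2^p * s(i_p - i_c)
    return code
```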
Fig. 2. Multi-resolution LBP feature fusion.
Note that smaller-scale LBP operators extract more detailed information (the microstructure) while maintaining a histogram profile similar to that of larger-scale LBP operators (the macrostructure). These operators therefore provide complementary information to each other, and we propose a multi-resolution LBP feature fusion, shown in Fig. 2. First, three LBP operators, LBP^{u2}_{8,1}, LBP^{u2}_{8,2}, and LBP^{u2}_{8,3}, are used to extract multi-resolution histograms. Second, each of the three histograms undergoes the EFM for dimensionality reduction, and the resulting features are concatenated into an augmented feature vector. Third, the EFM is applied to the augmented feature vector to derive the most discriminating features for face recognition.

The third set of experiments evaluates face recognition performance using the proposed multi-resolution LBP feature fusion on the new image representations. To extract the LBP features, we divide a 64 × 64 face image into 144 (12 × 12) overlapping windows of 9 × 9 pixels (3 pixels of overlap). For each of the three global LBP histograms, the EFM first derives lower-dimensional discriminant features. After feature concatenation, the EFM is applied again to the augmented feature vector to select the most discriminant features for classification; a sketch of the extraction stage is given below. The proposed LBP method is applied to the U_YCrQ, U_YCbQ, R, and Y images; the experimental results are shown in Table VI. These results demonstrate that the proposed method utilizes the LBP features effectively to improve face recognition performance. Next, we fuse the classification outputs of the different images at the decision level. The corresponding results are given in Table VII, which shows that the best FVR of 83.41% at 0.1% FAR is achieved by fusing the classification outputs of the U_YCrQ and Y images. Fig. 3 shows the corresponding ROC curves for the best FVR obtained by our method.

To compare the proposed method with recent methods evaluated on the FRGC version 2 Experiment 4, we cite [17] and [18]. Hwang et al. [17] propose a hybrid Fourier features method that achieves an FVR of 74.33% (ROC III). Tan and Triggs [18] use LBP features and a kernel discriminative common vector method to achieve an FVR of 73.5% (ROC III). Our method, which uses LBP features only, achieves an FVR of 83.41% (ROC III), comparing favorably with these published methods on the FRGC version 2 Experiment 4.
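The following sketch covers the histogram-extraction stage of this pipeline, using scikit-image's 'nri_uniform' mode as a stand-in for LBP^{u2}. The EFM stages are omitted, and a window stride of 5 pixels is assumed because it reproduces the stated 12 × 12 grid exactly on a 64-pixel side, whereas a uniform 3-pixel overlap does not tile 64 pixels into 12 windows.

```python
import numpy as np
from skimage.feature import local_binary_pattern

def multiresolution_lbp_features(image):
    """Spatial LBP^{u2}_{8,R} histograms for R = 1, 2, 3 on a 64 x 64
    grayscale face. Returns one concatenated histogram per radius; each
    would then pass through EFM, be concatenated with the others, and
    pass through EFM again, as in Fig. 2."""
    n_bins = 59  # distinct nri_uniform codes for P = 8
    per_radius = []
    for radius in (1, 2, 3):
        lbp = local_binary_pattern(image, P=8, R=radius, method="nri_uniform")
        hists = []
        for y in range(0, 64 - 9 + 1, 5):   # 12 vertical window positions
            for x in range(0, 64 - 9 + 1, 5):  # 12 horizontal positions
                patch = lbp[y:y + 9, x:x + 9]
                h, _ = np.histogram(patch, bins=n_bins, range=(0, n_bins))
                hists.append(h)
        per_radius.append(np.concatenate(hists))  # 144 windows x 59 bins
    return per_radius
```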
Fig. 3. FRGC version 2 Experiment 4 face recognition performance (ROC curves) using the proposed method. The FRGC baseline algorithm, which uses the gray-scale image, is included for comparison.
V. CONCLUSIONS

In this paper, we conduct a preliminary investigation into producing novel, effective image representations for face recognition. The experiments show that satisfactory results are achieved using these new images together with LBP features. Future work will focus on seeking more reliable criteria for choosing the color component images, as well as new learning methods, to generate the new image representations.

REFERENCES

[1] P. Shih and C. Liu, "Comparative assessment of content-based face image retrieval in different color spaces," International Journal of Pattern Recognition and Artificial Intelligence, vol. 19, no. 7, pp. 873–893, 2005.
[2] L. Torres, J.Y. Reutter, and L. Lorente, "The importance of color information in face recognition," in Proc. IEEE Int. Conf. Image Processing, October 24–28, 1999.
[3] Z. Liu and C. Liu, "A hybrid color and frequency features method for face recognition," IEEE Trans. on Image Processing, vol. 17, no. 10, pp. 1975–1980, 2008.
[4] C. Jones III and A. Abbott, "Optimization of color conversion for face recognition," EURASIP J. Appl. Signal Process., no. 4, pp. 522–529, 2004.
[5] J. Yang and C. Liu, "A general discriminant model for color face recognition," in Proc. IEEE International Conference on Computer Vision (ICCV'07), 14–20 Oct 2007.
[6] C. Liu, "Learning the uncorrelated, independent, and discriminating color space for face recognition," IEEE Trans. Information Forensics and Security, vol. 3, no. 2, pp. 213–222, 2008.
[7] C. Xie and V. Kumar, "Quaternion correlation filters for color face recognition," in Proc. SPIE-IS&T, 2005, vol. 5681, pp. 486–494.
[8] A. Ross and R. Govindarajan, "Feature level fusion using hand and face biometrics," in Proc. of SPIE Conference on Biometric Technology for Human Identification II, 2005, pp. 196–204.
[9] C. Jones III and A. Abbott, "Color face recognition by hypercomplex Gabor analysis," in Proc. IEEE the 7th International Conference on Automatic Face and Gesture Recognition (FGR'06), 2006.
[10] Y. Kim and S. Choi, "Color face tensor factorization and slicing for illumination-robust recognition," in Proc. the 2nd International Conference on Biometrics (ICB), August 2007.
TABLE VI
FACE VERIFICATION RATE (ROC III) AT 0.1% FALSE ACCEPT RATE OF U_YCrQ, U_YCbQ, R, AND Y IMAGES BY USING THE LBP METHOD.

                 U_YCrQ   U_YCbQ   R        Y
LBP^{u2}_{8,1}   62.87%   58.25%   65.33%   65.75%
LBP^{u2}_{8,2}   66.08%   60.59%   66.47%   65.45%
LBP^{u2}_{8,3}   57.26%   52.18%   58.05%   57.45%
EFM fusion       73.33%   68.03%   71.55%   71.16%
TABLE VII
FACE VERIFICATION RATE (ROC III) AT 0.1% FALSE ACCEPT RATE BY FUSING THE CLASSIFICATION OUTPUTS OF U_YCrQ, U_YCbQ, R, AND Y IMAGES BASED ON THE LBP METHOD.

          U_YCrQ   U_YCbQ   R        Y
U_YCrQ    -        81.80%   82.42%   83.41%
U_YCbQ    81.80%   -        76.97%   76.35%
R         82.42%   76.97%   -        74.34%
Y         83.41%   76.35%   74.34%   -
[11] J. Yang and C. Liu, "Horizontal and vertical 2DPCA-based discriminant analysis for face verification on a large-scale database," IEEE Trans. Information Forensics and Security, vol. 2, no. 4, pp. 781–792, 2007.
[12] M. Sadeghi, S. Khoshrou, and J. Kittler, "SVM-based selection of colour space experts for face authentication," in Proc. The 2nd IAPR International Conference on Biometrics (ICB'07), 2007.
[13] Z. Liu and C. Liu, "Fusing frequency, spatial and color features for face recognition," in Proc. IEEE on Biometrics: Theory, Applications and Systems (BTAS'08), 2008.
[14] P.J. Phillips, P.J. Flynn, T. Scruggs, K.W. Bowyer, J. Chang, K. Hoffman, J. Marques, J. Min, and W. Worek, "Overview of the face recognition grand challenge," in Proc. IEEE Conference on Computer Vision and Pattern Recognition, 2005.
[15] T. Ojala, M. Pietikainen, and T. Maenpaa, "Multiresolution gray-scale and rotation invariant texture classification with local binary patterns," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 971–987, 2002.
[16] C. Liu and H. Wechsler, "Robust coding schemes for indexing and retrieval from large face databases," IEEE Trans. on Image Processing, vol. 9, no. 1, pp. 132–137, 2000.
[17] W. Hwang, G. Park, J. Lee, and S.C. Kee, "Multiple face model of hybrid Fourier feature for large face image set," in Proc. 2006 IEEE Conf. Computer Vision and Pattern Recognition (CVPR), 2006.
[18] X. Tan and B. Triggs, "Fusing Gabor and LBP feature sets for kernel-based face recognition," in 2007 IEEE International Workshop on Analysis and Modeling of Faces and Gestures (AMFG'07), Oct 2007.