Face Image Retrieval Using Sparse Representation Classifier with Gabor-LBP Histogram*

Hansung Lee1, Yunsu Chung1, Jeongnyeo Kim1, and Daihee Park2,**

1 Electronics and Telecommunications Research Institute, Korea
{mohan,yoonsu,jnkim}@etri.re.kr
2 Dept. of Computer and Information Science, Korea University, Korea
[email protected]

Abstract. Face image retrieval is an important issue in practical applications such as mug shot searching and surveillance systems. However, it is still a challenging problem because face images are fairly similar to one another, sharing the same geometrical configuration of facial features. In this paper, we present a face image retrieval method that is robust to variations in face imaging conditions and achieves high accuracy. First, we choose the Gabor-LBP histogram for face image representation. Second, we use the sparse representation classifier for face image retrieval. Using the Gabor-LBP histogram and the sparse representation classifier, we achieve effective and robust retrieval results with high accuracy. Finally, experiments are conducted on the ETRI and XM2VTS databases to verify the proposed method. It shows a rank 1 retrieval accuracy of 98.9% on the ETRI face set and 99.3% on the XM2VTS face set.

Keywords: face retrieval, face recognition, Gabor filter, local binary patterns, sparse representation classifier.
* This research was financially supported by the Electronics and Telecommunications Research Institute (ETRI) through the project of “Development of CCTV Face Recognition and Identification Technology under Unconstrained Environment”. This research was partially supported by the Ministry of Education, Science and Technology (MEST) and Korea Industrial Technology Foundation (KOTEF) through the Human Resource Training Project for Regional Innovation.
** Corresponding author.
Y. Chung and M. Yung (Eds.): WISA 2010, LNCS 6513, pp. 273–280, 2010. © Springer-Verlag Berlin Heidelberg 2010

1 Introduction

With the growing popularity of digital images, the number of digital image collections has exploded in recent years. Consequently, retrieving an image manually from a large image database has become increasingly difficult, and content-based image retrieval with query by example has been an important research subject since the early 1990s [1]. In particular, face image retrieval (FIR) is a significant research issue in many practical applications such as mug shot searching and surveillance systems. FIR is still a challenging problem because face images are fairly similar due to the same geometrical configuration of facial features [1]. This makes FIR more difficult than traditional content-based image retrieval. To resolve this difficulty, human face recognition (HFR) techniques are generally employed in FIR. HFR is an automated biometric method to verify or recognize the identities of persons based on their physiological characteristics. Because of its non-aggressive and non-intrusive nature, HFR has been the subject of extensive research over the past decades, and it spans numerous fields and disciplines [2].

HFR models can be broadly divided into two categories: the global approach and the component-based approach. Global methods use the whole face region as the raw input to a face recognition system. In general, these methods apply dimensionality reduction to the raw input and then conduct subspace analysis. The most widely used methods are based on principal component analysis (PCA) and linear discriminant analysis (LDA). Most contemporary subspace analysis methods are either inspired by or improvements on these original works [2-4]. The global approach works well for classifying frontal face images. However, many of these methods fail under varying illumination and are not robust to pose and expression changes [2-4].

An alternative to the global approach is the component-based approach. These methods extract local features or landmarks such as the eyes, nose, nostrils, and corners of the mouth from the face image and apply statistical modeling for face recognition. Component-based methods are invariant to similarity transformations and robust to pose, illumination, and expression changes. However, it is very difficult to detect the exact landmarks of a face image, and doing so is time consuming [2-4].

According to the recent literature, in contrast to the mainstream face representation approaches based on statistical learning, such as subspace analysis, SVM, and AdaBoost, there are many ongoing attempts to represent the face image with non-statistical learning methods.
Gabor filter and local binary patterns (LBP) based methods have been successfully applied to many face recognition applications [5-8]. In particular, hybrid methods that combine the Gabor filter and LBP have recently attracted significant attention in face recognition and retrieval. These methods are invariant to illumination and expression variability [6-8].

On the other hand, a supervised classification framework based on sparse representation, viz. the sparse representation classifier (SRC), has recently been proposed for face recognition [9-10]. Sparsity is an important way to encode domain knowledge and thus generally improves the generalization capability of the model [11]. Wright et al. [10] showed that the classifier based on sparse representation is remarkably effective and achieves the best recognition rate on some face databases. In particular, it is robust to partial occlusion and corruption of face images.

In this paper, we propose a hybrid method for face retrieval that is not only robust to variations in imaging conditions but also highly accurate. We choose the Gabor-LBP histogram for face image representation and SRC for face image retrieval. The proposed face image retrieval system is given in Fig. 1. It consists of the following three stages: face detection, face representation, and face image retrieval. In this paper, we focus on the second and last stages. Using the Gabor-LBP histogram and the sparse representation classifier, we achieve effective and robust retrieval results with high accuracy. To evaluate the performance of the proposed method, we conduct experiments on the ETRI and XM2VTS face databases. Our experiments show that the retrieval accuracy at rank 1 reaches 98.9% on the ETRI face set and 99.3% on the XM2VTS face set.

The rest of this paper is organized as follows. In Section 2, we describe the face image representation with the Gabor-LBP histogram. A face image retrieval method based on the sparse representation classifier is presented in Section 3. In Section 4, we show experimental results. Finally, in Section 5, we conclude with a brief summary and suggest future research directions.
Fig. 1. Architecture of proposed face image retrieval system
2 Face Representation with Gabor-LBP Histogram

In this section, we describe the face representation based on the Gabor filter and LBP. Gabor feature based face representations are well known as some of the most successful methods. Recently, with the success of LBP, there have been ongoing attempts to combine Gabor features and LBP for face description [6-8]. In this paper, we employ the face representation method proposed in [8]. It is robust to noise and to local image transformations due to variations of lighting, occlusion, and pose. Combining Gabor and LBP enhances the representation power of the spatial histogram. A face image is represented as a histogram sequence by the following steps [8]. First, an input face image is normalized and then transformed into multiple Gabor images by convolving the face image with Gabor filters. Let $f(x, y)$ be the face image. Its convolution with a Gabor filter $\psi_{o,s}(z)$ is defined as follows:
$$G_{\psi f}(x, y, o, s) = f(x, y) * \psi_{o,s}(z) \qquad (1)$$
where $o$ and $s$ are the orientation and scale of the Gabor filters, $z = (x, y)$, and $*$ denotes the convolution operator. Gabor filters with five scales and eight orientations are used. Second, each Gabor image is converted to a local binary pattern map using the LBP operator. The LBP operator labels the image pixels by thresholding the $3 \times 3$ neighborhood $p_i$ $(i = 0, 1, \ldots, 7)$ of each pixel against the center value $p_c$ and interpreting the result as a binary number [8]:
$$S(p_i - p_c) = \begin{cases} 1, & p_i \ge p_c \\ 0, & p_i < p_c \end{cases} \qquad (2)$$
Then, by summing the thresholded values weighted by powers of two, the LBP pattern at each pixel is obtained, which characterizes the spatial structure of the local image texture:

$$LBP = \sum_{i=0}^{7} S(p_i - p_c) \cdot 2^i \qquad (3)$$
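Equations (2) and (3) can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the function names `lbp_code` and `lbp_map` are hypothetical, images are assumed to be plain nested lists of gray values, and the neighbor ordering (clockwise from the top-left corner) is an assumption, since the text does not fix which neighbor is $p_0$.

```python
def lbp_code(patch):
    """Return the LBP code of the center pixel of a 3x3 patch (list of 3 rows)."""
    center = patch[1][1]
    # Neighbors p_0..p_7, visited clockwise from the top-left corner.
    # This ordering is an assumption; any fixed ordering gives a valid LBP variant.
    neighbors = [patch[0][0], patch[0][1], patch[0][2], patch[1][2],
                 patch[2][2], patch[2][1], patch[2][0], patch[1][0]]
    code = 0
    for i, p in enumerate(neighbors):
        s = 1 if p >= center else 0   # thresholding, Eq. (2)
        code += s * (2 ** i)          # weighted sum, Eq. (3)
    return code


def lbp_map(image):
    """Apply lbp_code to every interior pixel of a 2-D image (list of rows)."""
    rows, cols = len(image), len(image[0])
    return [[lbp_code([row[c - 1:c + 2] for row in image[r - 1:r + 2]])
             for c in range(1, cols - 1)]
            for r in range(1, rows - 1)]


example = [[6, 5, 2],
           [7, 6, 1],
           [9, 8, 7]]
print(lbp_code(example))
```

Border pixels are skipped in this sketch, so the resulting map is two rows and two columns smaller than the input image.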
Third, each LBP map is divided into non-overlapping sub-regions with predefined bin size, and histograms of sub-regions are computed. Finally, the LBP histograms of all the LBP maps are concatenated to form the final histogram sequence as the face representation (or description).
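The third and final steps can be sketched in the same spirit. The following Python fragment is only an illustration under stated assumptions: the names `region_histogram` and `histogram_sequence` are hypothetical, the region size is a free parameter, each histogram has one bin per LBP code (256 for the 8-neighbor operator), and trailing rows or columns that do not fill a complete sub-region are simply dropped.

```python
def region_histogram(lbp, top, left, height, width, n_bins=256):
    """Histogram of the LBP codes inside one rectangular sub-region."""
    hist = [0] * n_bins
    for r in range(top, top + height):
        for c in range(left, left + width):
            hist[lbp[r][c]] += 1
    return hist


def histogram_sequence(lbp_maps, region_h, region_w, n_bins=256):
    """Concatenate the sub-region histograms of all LBP maps into one vector."""
    sequence = []
    for lbp in lbp_maps:
        rows, cols = len(lbp), len(lbp[0])
        # Step through non-overlapping sub-regions; incomplete ones are skipped.
        for top in range(0, rows - region_h + 1, region_h):
            for left in range(0, cols - region_w + 1, region_w):
                sequence.extend(
                    region_histogram(lbp, top, left, region_h, region_w, n_bins))
    return sequence


# Two toy 2x2 "LBP maps" with codes in 0..3, one 2x2 region each.
toy_maps = [[[0, 1], [2, 3]], [[3, 3], [3, 3]]]
print(histogram_sequence(toy_maps, 2, 2, n_bins=4))
```

With five scales and eight orientations there would be 40 LBP maps per face, so the length of the final vector is 40 times the number of sub-regions times the number of bins.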
3 Face Retrieval Based on Sparse Representation Classifier

In this section, we present the face image retrieval method based on the sparse representation classifier. Sparse representation (SR) was first proposed for signal representation. In the past few years, SR has been successfully applied in many practical applications such as signal compression and coding, image de-noising, and compressive sensing. Recently, a supervised classification method based on sparse representation, viz. the sparse representation classifier (SRC), has been proposed for face recognition. It is especially robust to partial occlusion and corruption of face images [9-11]. The problem of face image classification based on sparse representation can be formulated as follows [10-11]. Given a face image with vector pattern $y \in \mathbb{R}^m$ and a matrix $A = [x_1, x_2, \ldots, x_n] \in \mathbb{R}^{m \times n}$, the objective of SR is to represent $y$ using as few columns of $A$ as possible. This can be formally expressed as follows:
$$x_0 = \arg\min \|x\|_0 \quad \text{subject to} \quad y = Ax \qquad (4)$$
where $x \in \mathbb{R}^n$ is the coefficient vector. Unfortunately, finding the sparsest solution of (4) is NP-hard. However, it can be shown that if the solution $x_0$ is sparse enough, the solution of the $\ell_0$ minimization problem is equal to the solution of the following $\ell_1$ minimization problem [10-11]:

$$x_1 = \arg\min \|x\|_1 \quad \text{subject to} \quad y = Ax \qquad (5)$$
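For reference, one standard way to cast the $\ell_1$ problem (5) as a linear program is to split the coefficient vector into nonnegative parts. The auxiliary variables $u$ and $v$ below are introduced here purely for illustration and do not appear elsewhere in this paper:

```latex
\begin{aligned}
\min_{u,\, v \in \mathbb{R}^n} \quad & \mathbf{1}^{T}(u + v) \\
\text{subject to} \quad & A(u - v) = y, \quad u \ge 0, \quad v \ge 0 .
\end{aligned}
```

At an optimum, $x = u - v$ and $\|x\|_1 = \mathbf{1}^{T}(u + v)$, so the objective and the constraint are both linear in $(u, v)$.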
This problem can be solved in polynomial time by standard linear programming methods [10]. Given a new test sample $y$ from one of the classes in the training set, its sparse representation $x_1$ is computed by (5). Ideally, the nonzero entries in the estimate $x_1$ are associated with the columns of $A$ from a single object class $i$, and we can easily assign the test sample $y$ to that class. In the context of face image retrieval, however, we assume that training images and test images are taken under different conditions, e.g., changes of illumination, hair style, facial hair, shape, facial expression, and the presence or absence of glasses. This may lead to nonzero entries associated with multiple object classes. Therefore, a face image retrieval system generally outputs the retrieved results in sorted order according to a score value. In this paper, we use the coefficient values in the solution $x_1$ of the $\ell_1$ minimization problem as the score values. Algorithm 1 below summarizes the face image retrieval procedure.

Algorithm 1. Face Image Retrieval based on Sparse Representation
1. Input: a matrix of training samples $A = [A_1, A_2, \ldots, A_k] \in \mathbb{R}^{m \times n}$ for $k$ classes, and a test sample $y \in \mathbb{R}^m$.
2. Normalize the columns of $A$ to have unit $\ell_2$ norm.
3. Solve the $\ell_1$ minimization problem: $x_1 = \arg\min \|x\|_1$ subject to $y = Ax$.
4. Compute the mean coefficient value of each class:
   $$mc_i = \frac{1}{n_i} \sum_{1}^{n_i} \delta_i(x_1) \quad \text{for } i = 1, 2, \ldots, k$$
   where $mc_i$ is the mean coefficient value of the $i$-th class, $n_i$ is the number of elements in the $i$-th class, and $\delta_i(x_1)$ is the characteristic function that selects the coefficients associated with the $i$-th class; the sum runs over these $n_i$ selected coefficients.
5. Sort the classes in descending order of mean coefficient value.
6. Output: the $p$ face images with the largest mean coefficient values.
4 Experimental Results

To evaluate the performance of the proposed method, we conducted experiments on the Electronics and Telecommunications Research Institute (ETRI) and XM2VTS face databases. The ETRI face database contains images of 55 different subjects with 20 images of each subject. We used 10 images per subject as the training dataset and the remaining images as the test dataset. The XM2VTS database is a standard benchmark dataset that consists of 295 subjects, with eight images for each subject. We used four images per subject as the training dataset and the remaining images as the test dataset. The proposed system was implemented in Matlab, and SparseLab [12] was used as the sparse representation solver. To show the effectiveness of the proposed method, we compared it with typical subspace analysis based methods such as PCA and NMF. For this experiment, we use the k-nearest neighbor classifier for face image retrieval. The experimental results for the ETRI face database are shown in Fig. 2. The proposed method has the best retrieval accuracy, 98.9% at rank 1, on the ETRI face set. Both the proposed method and NMF find the correct person before rank 10.
Fig. 2. Cumulative accuracy of retrieval on ETRI faceset
Figure 3 shows the experimental results for the XM2VTS face database. The proposed method achieves a rank 1 retrieval accuracy of 99.3% on the XM2VTS face set. On this face set, the proposed method and NMF show similar performance.
Fig. 3. Cumulative accuracy of retrieval on XM2VTS faceset
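Cumulative accuracy curves such as those in Figs. 2 and 3 report, for each rank r, the fraction of queries whose correct identity appears among the top r retrieved results. A minimal Python sketch of this measure is given below. It is an illustration only: the function name `cumulative_accuracy` is hypothetical, and the toy ranked lists are made up for the example and are not data from our experiments.

```python
def cumulative_accuracy(ranked_lists, true_labels, max_rank):
    """Fraction of queries whose true label is within the top r results, r = 1..max_rank."""
    hits = [0] * max_rank
    for ranked, truth in zip(ranked_lists, true_labels):
        if truth in ranked[:max_rank]:
            first = ranked.index(truth)        # zero-based rank of the correct identity
            for r in range(first, max_rank):   # a hit at rank r also counts for all later ranks
                hits[r] += 1
    n = len(true_labels)
    return [h / n for h in hits]


# Toy example: four queries whose correct identity is "a".
ranked_lists = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"], ["b", "c", "a"]]
true_labels = ["a", "a", "a", "a"]
print(cumulative_accuracy(ranked_lists, true_labels, max_rank=3))
```

The first entry of the returned list corresponds to the rank 1 accuracy quoted in the text.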
5 Conclusions

In this paper, we introduced a face image retrieval method that is not only robust to variations in face imaging conditions but also highly accurate. First, we used the Gabor-LBP histogram for face image representation. Second, we adopted the sparse representation classifier for face image retrieval. Using the Gabor-LBP histogram and the sparse representation classifier, we achieved effective and robust retrieval results with high accuracy. To evaluate the performance of the proposed method, we conducted experiments on the ETRI and XM2VTS face databases. Our experiments showed that the retrieval accuracy at rank 1 reaches 98.9% on the ETRI face set and 99.3% on the XM2VTS face set. As future work, we plan to develop a prototype system based on the proposed mechanism for face image recognition and retrieval from large face image databases.
References

1. Megherbi, D.B., Miao, Y.: A Distributed Technique for Recognition and Retrieval of Faces with Time-Varying Expressions. In: IEEE International Conference on Computational Intelligence for Measurement Systems and Applications, pp. 8–13. IEEE Press, Hong Kong (2009)
2. Tolba, A.S., El-Baz, A.H., El-Harby, A.A.: Face Recognition: A Literature Review. International Journal of Signal Processing 2(2), 88–103 (2006)
3. Zhao, W.: Face Recognition: A Literature Survey. ACM Computing Surveys 35(4), 339–458 (2003)
4. Vikram, T.N., Chidananda, G.K., Guru, D.S., Shalini, R.U.: Face Indexing and Retrieval by Spatial Similarity. In: International Congress on Image and Signal Processing, pp. 543–547. IEEE Press, Hainan (2008)
5. Shen, L., Bai, L.: A Review on Gabor Wavelets for Face Recognition. Pattern Anal. Applic. 9, 273–292 (2006)
6. Xie, S., Shan, S., Chen, X., Chen, J.: Fusing Local Patterns of Gabor Magnitude and Phase for Face Recognition. IEEE Trans. on Image Processing 19(5), 1349–1361 (2010)
7. Gao, T., He, M.: A Novel Face Description by Local Multi-Channel Gabor Histogram Sequence Binary Pattern. In: International Conference on Audio, Language and Image Processing, Shanghai, China, pp. 1240–1244 (2008)
8. Zhang, W., Shan, S., Gao, W., Chen, X., Zhang, H.: Local Gabor Binary Pattern Histogram Sequence (LGBPHS): A Novel Non-Statistical Model for Face Representation and Recognition. In: International Conference on Computer Vision, Beijing, China, vol. 1, pp. 786–791 (2005)
9. Yang, A.Y., Wright, J., Ma, Y., Sastry, S.S.: Feature Selection in Face Recognition: a Sparse Representation Perspective. UC Berkeley Technical Report UCB/EECS-2007-99 (2007)
10. Wright, J., Yang, A.Y., Ganesh, A., Sastry, S.S., Ma, Y.: Robust Face Recognition via Sparse Representation. IEEE Trans. on Pattern Analysis and Machine Intelligence 31(2), 210–227 (2009)
11. Qiao, L., Chen, S., Tan, X.: Sparsity Preserving Projections with Applications to Face Recognition. Pattern Recognition 43, 331–341 (2010)
12. Stanford SparseLab, http://sparselab.stanford.edu/