Multiblock-Fusion Scheme for Face Recognition

Shuyan Zhao, Rolf-Rainer Grigat
Vision Systems, 4-08/1, Technical University Hamburg-Harburg
Harburger Schloßstr. 20, 21079 Hamburg, Germany
[email protected], [email protected]

Abstract

A multiblock-fusion scheme for face recognition is proposed in this paper. Three face recognition algorithms, namely probabilistic match, Linear Discriminant Analysis (LDA) and the Discrete Cosine Transform (DCT), are compared under the fusion strategy. By combining global and local features, the multiblock-fusion scheme enhances robustness against variations of illumination, facial expression and pose. Different partitions and combinations yield method-specific performance. The experimental results demonstrate that the fusion outperforms each single method. Further characteristics of the three methods are also verified by the experiments.

1. Introduction

Face recognition has a wide variety of applications in commercial and law enforcement domains [12]. In particular, owing to its nonintrusive character, it has emerged as an active research direction in biometrics and has recently attracted considerable research effort. Face recognition technology falls into two main categories: feature-based and holistic [4]. Feature-based approaches rely on individual facial features, such as the eyes, nose and mouth, and the geometrical relationships among them. Holistic methods take the entire face into account. Among the holistic methods, appearance-based modelling methods are the most popular, for example Eigenface [11] [8], Fisherface [3], and Independent Component Analysis (ICA) [2]. These methods utilize the intensity or intensity-derived features. In [7], Moghaddam argued that methods using the raw visual data or its manifold representation do not exploit knowledge of which types of variation are critical for discrimination. He therefore proposed the probabilistic match for face recognition, which formulates a similarity measure over two kinds of variation, intrapersonal and extrapersonal. Recently, Hafed and Levine [5] presented an alternative holistic face recognition approach based on the Discrete Cosine Transform (DCT). The main advantages of the method are its relationship to the Karhunen-Loève Transform, an optimal transform in the least squared-error sense, and its computational speed. Zhao et al. [12] pointed out that both global and local features are crucial for face recognition. Methods combining global and local information are more robust against variations of illumination, pose and facial expressions. However, the detection of local features is also challenging and error-prone. Subblock decomposition provides a way to represent local information without explicit feature detection. In this paper, we propose a multiblock decomposition and fusion strategy for face recognition and apply it to three approaches. In the next section, the three methods, probabilistic match, LDA and DCT, are described briefly. Our multiblock-fusion scheme is presented in Section 3. Section 4 shows the experimental results. Conclusions are drawn in the last section.

2. Methods

Three methods are compared in the multiblock-fusion framework, namely the probabilistic match, Linear Discriminant Analysis, and the Discrete Cosine Transform. Note that this paper does not aim at combining different recognition approaches, but rather at investigating the possibility of making full use of each individual method, thus saving the investment of training different classifiers.

0-7695-2128-2/04 $20.00 (C) 2004 IEEE

2.1. Probabilistic Face Recognition

This method characterizes the intensity difference of two images from a probabilistic point of view [7]. Two kinds of variation, intrapersonal and extrapersonal, are formulated for face recognition. In this paper, only the intrapersonal difference is exploited. The intensity difference of two images of the same person, ∆ = x_i − x_j, forms the intrapersonal space Ω_I, which is assumed to be Gaussian distributed, i.e. Ω_I ∼ N(0, Σ_I). Recognition is performed using the maximum-likelihood similarity measure

S(\Delta) = p(\Delta|\Omega_I) = \frac{\exp(-\frac{1}{2}\Delta^T \Sigma_I^{-1} \Delta)}{(2\pi)^{D/2} |\Sigma_I|^{1/2}}    (1)

The whitening transformation

y_i = \Lambda_I^{-1/2} V_I^T x_i    (2)

can be used to make the computation more efficient, where Λ_I and V_I are the matrices of the leading eigenvalues and the corresponding eigenvectors of Σ_I. The evaluation of the similarity then simplifies to the Euclidean distance \|y_i - y_j\|^2.
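As a concrete illustration, the whitening step of Eq. (2) and its equivalence to the Mahalanobis form in the exponent of Eq. (1) can be sketched in a few lines of NumPy. The data here are synthetic stand-ins, not face images, and the dimensions are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy intrapersonal difference samples (rows play the role of Delta = x_i - x_j);
# D and N are illustrative, not the paper's values.
D, N = 8, 500
deltas = rng.normal(size=(N, D)) * np.linspace(1.0, 0.1, D)

# Estimate Sigma_I and take its leading eigenpairs (here: all of them).
sigma_I = deltas.T @ deltas / N
evals, evecs = np.linalg.eigh(sigma_I)
order = np.argsort(evals)[::-1]
lam, V = evals[order], evecs[:, order]

def whiten(x):
    """y = Lambda_I^{-1/2} V_I^T x, as in Eq. (2)."""
    return (V.T @ x) / np.sqrt(lam)

# With whitened features, the Mahalanobis form Delta^T Sigma_I^{-1} Delta in
# Eq. (1) reduces to the squared Euclidean distance ||y_i - y_j||^2.
x_i, x_j = rng.normal(size=D), rng.normal(size=D)
d_euclid = np.sum((whiten(x_i) - whiten(x_j)) ** 2)
d_mahal = (x_i - x_j) @ np.linalg.solve(sigma_I, x_i - x_j)
print(np.isclose(d_euclid, d_mahal))  # the two distances agree
```

Since the exponent in Eq. (1) is monotonic in ∆^T Σ_I^{-1} ∆, ranking matches by the Euclidean distance of whitened features is equivalent to ranking by the likelihood itself.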

2.2. Linear Discriminant Analysis

Face recognition using Linear Discriminant Analysis has been very successful [3] [10]. Belhumeur et al. [3] compared LDA (Fisherface) with PCA (Eigenface) and concluded that the LDA method is less sensitive to illumination and facial expressions. Suppose a set of N images {x_1, x_2, ..., x_N}, each with n pixels, where each image belongs to one of C classes. The between-class scatter S_B and the within-class scatter S_W are defined as

S_B = \sum_{i=1}^{C} N_i (\mu_i - \mu)(\mu_i - \mu)^T    (3)

S_W = \sum_{i=1}^{C} \sum_{j=1}^{N_i} (x_j - \mu_i)(x_j - \mu_i)^T    (4)

where N_i is the number of samples in class i, \mu_i is the class-specific mean, and \mu is the mean of the whole data set. LDA selects the linear transform W_{lda} that maximizes the ratio between the determinant of the between-class scatter of the projected samples and the determinant of their within-class scatter:

W_{lda} = \arg\max_W \frac{|W^T S_B W|}{|W^T S_W W|}    (5)

Because of insufficient data (N < n), S_W is singular; therefore PCA is first used to reduce the dimension of the input data to N − C. The nearest neighbor rule is employed for classification.

2.3. Discrete Cosine Transform

The DCT method [5] shares a closely related mathematical background with Eigenface. Its merit over Eigenface is that it needs less training time; moreover, it is deterministic and does not require the specification of the data set, which is desirable when new faces are added to the database frequently. Only the DCT coefficients at the low and mid frequencies are selected as features. Classification is performed with the nearest neighbor classifier.

3. Multiblock-Fusion

Each image x_i is partitioned into K local parts or subblocks {x_i^k}, k = 1, ..., K. Features F_i^k are extracted in every subblock and merged into a single vector F_i, which is taken as the feature of the whole image. To distinguish the divided image from its subblocks, we call it a superblock (superblock 0 is the original image). Figure 1 illustrates the multiblock-fusion strategy. Three different partitions are exploited to make use of the local information at different scales. The decomposed subblocks break down, to some extent, the geometrical relations within the image; it is therefore necessary to combine the superblocks with the holistic image. The classifications of the superblocks are assumed to be independent, so the final identification is the fusion of the classifications of the different superblocks. Kittler et al. [6] compared a variety of classifier combination rules and showed through exhaustive experiments that the sum rule outperforms the other combination rules. In this paper the sum rule, S = \sum_l S_l, is utilized for the fusion of superblocks, where S_l denotes the similarity score of superblock l.

Figure 1. The multiblock-fusion scheme

For the probabilistic match, suppose that the intrapersonal difference of each subblock satisfies a Gaussian distribution Ω_I^k ∼ N(0, Σ_I^k). The whitened coefficients y_i^k of each subblock are concatenated into the feature vector y_i. The Euclidean distances of the features of the different superblocks are added, and the classification is then concluded. For the LDA method, the LDA coefficients of each subblock are extracted separately and concatenated to form the feature vector. The similarity score is the sum of the feature distances for each superblock involved in the classification. Similar steps are taken for the DCT method: the features of a superblock consist of the low- and mid-frequency DCT coefficients of each subblock.
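The decomposition and sum-rule fusion described in this section can be sketched as follows, here with DCT features (the 2-D DCT is implemented directly so the sketch is self-contained). The partition sizes and the random test images are illustrative assumptions, not the paper's data.

```python
import numpy as np

def dct2(block):
    """Orthonormal 2-D DCT-II via the DCT matrix (no SciPy needed)."""
    def dct_matrix(n):
        k = np.arange(n)[:, None]
        M = np.cos(np.pi * (2 * np.arange(n)[None, :] + 1) * k / (2 * n))
        M *= np.sqrt(2.0 / n)
        M[0] /= np.sqrt(2.0)
        return M
    Mr, Mc = dct_matrix(block.shape[0]), dct_matrix(block.shape[1])
    return Mr @ block @ Mc.T

def superblock_features(img, rows, cols, keep):
    """Split img into rows x cols subblocks, keep the low-frequency
    keep x keep DCT corner of each, and concatenate (one superblock)."""
    feats = []
    for band in np.split(img, rows, axis=0):
        for b in np.split(band, cols, axis=1):
            feats.append(dct2(b)[:keep, :keep].ravel())
    return np.concatenate(feats)

def fused_score(img_a, img_b, partitions):
    """Sum rule: add the per-superblock similarity scores S_l
    (negative feature distances, so larger means more similar)."""
    return sum(-np.linalg.norm(superblock_features(img_a, r, c, k)
                               - superblock_features(img_b, r, c, k))
               for r, c, k in partitions)

rng = np.random.default_rng(1)
img1 = rng.random((56, 48))
img2 = img1 + 0.05 * rng.random((56, 48))   # stand-in for a same-person image
img3 = rng.random((56, 48))                  # stand-in for a different person

# Superblock 0 = whole image (8x8 coefficients);
# superblock 1 = 2x2 subblocks (6x6 coefficients each), as in Figure 1.
partitions = [(1, 1, 8), (2, 2, 6)]
print(fused_score(img1, img2, partitions) > fused_score(img1, img3, partitions))
```

Identification then simply picks the gallery image with the highest fused score against the probe.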

4. Experiments

Two publicly available face databases, Yale [1] and ORL [9], are chosen for the experiments. The Yale database contains 15 persons, each with 11 images showing different illuminations and facial expressions. There are 40 subjects in the ORL database; the 10 images of each person include variations of pose and facial expression, while the illumination is almost constant. All images are normalized to 56 × 48 pixels with the left eye at coordinate (12, 22) and the right eye at (34, 22). For the ORL face set, the eyes in the normalized images are not strictly located at the positions given above. Figure 2 shows examples from both databases (the first row from the Yale database and the second row from the ORL database).
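The paper does not detail its alignment procedure; one common way to obtain such eye-anchored normalization is a two-point similarity transform that maps the detected eye positions onto the canonical coordinates (12, 22) and (34, 22). A minimal sketch, with hypothetical detected eye positions:

```python
import numpy as np

def eye_alignment_transform(left_eye, right_eye,
                            left_ref=(12.0, 22.0), right_ref=(34.0, 22.0)):
    """Similarity transform (scale + rotation + translation) mapping the
    detected eye coordinates onto the canonical positions of the 56x48
    normalized images. Returns A (2x2) and t (2,) with p' = A @ p + t."""
    src = np.array([left_eye, right_eye], dtype=float)
    dst = np.array([left_ref, right_ref], dtype=float)
    # Represent the similarity as complex multiplication: z' = a*z + b.
    zs = src[:, 0] + 1j * src[:, 1]
    zd = dst[:, 0] + 1j * dst[:, 1]
    a = (zd[1] - zd[0]) / (zs[1] - zs[0])
    b = zd[0] - a * zs[0]
    A = np.array([[a.real, -a.imag], [a.imag, a.real]])
    t = np.array([b.real, b.imag])
    return A, t

# Hypothetical example: eyes detected at tilted positions in a raw image.
A, t = eye_alignment_transform(left_eye=(40, 60), right_eye=(84, 52))
print(A @ np.array([40.0, 60.0]) + t)  # close to [12, 22]
print(A @ np.array([84.0, 52.0]) + t)  # close to [34, 22]
```

The resulting A and t would then be used to resample the raw image into the 56 × 48 canonical frame.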

Figure 2. Examples from the Yale and ORL databases

Half of the samples are used for training and the other half for testing. The images for learning the subspaces of the probabilistic match and the LDA method are strictly disjoint from the test images. In the test phase, the features of all the images in the test set are compared with each other. The subspace dimensionality d of the probabilistic method is selected to be 40, 20, 20, and 12 for the corresponding subblocks in each partition. The features of the DCT method are 8 × 8, 6 × 6, 6 × 6 and 4 × 4 subsets of DCT coefficients, respectively (see Figure 1). Table 1 and Table 2 list the recognition accuracy of the single superblocks. LDA achieves the best result on the Yale database (86.42%) using the original image; in contrast to the other two methods, it is insensitive to illumination. When more local information is evaluated, the performance of the probabilistic match and DCT improves, while that of LDA becomes worse, because with smaller subblocks LDA cannot obtain sufficient information to find the discriminant directions. When dealing with data sets with pose variations (the ORL database), the performance of the probabilistic match and LDA degrades, particularly that of LDA, since pose variations are quite nonlinear. DCT is more robust to pose changes.
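The all-pairs test protocol amounts to leave-one-out nearest-neighbor identification over the test features. A minimal sketch with synthetic features (the two-class toy data is an assumption for illustration):

```python
import numpy as np

def nn_identification_accuracy(features, labels):
    """Leave-one-out nearest-neighbor identification over the test set:
    each test image is matched against all the others, as in the
    all-pairs comparison described above."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(labels)
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)   # never match an image to itself
    nearest = np.argmin(dists, axis=1)
    return float(np.mean(y[nearest] == y))

# Toy check: two well-separated classes should be identified perfectly.
rng = np.random.default_rng(2)
feats = np.vstack([rng.normal(0.0, 0.1, size=(5, 3)),
                   rng.normal(5.0, 0.1, size=(5, 3))])
labels = [0] * 5 + [1] * 5
print(nn_identification_accuracy(feats, labels))  # -> 1.0
```

The recognition rates in Tables 1-4 would be computed this way, with the features replaced by the whitened, LDA, or DCT feature vectors of the corresponding superblocks.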

Table 1. Recognition rate (%) of the single superblock on the Yale database

Superblock   Probabilistic   LDA     DCT
0            79.01           86.42   77.78
1            81.48           79.01   81.48
2            80.25           79.01   81.48
3            81.48           40.74   82.72

Table 2. Recognition rate (%) of the single superblock on the ORL database

Superblock   Probabilistic   LDA     DCT
0            74.00           63.50   79.50
1            71.50           67.00   81.50
2            70.50           60.00   78.50
3            75.00           27.00   81.50

Multiblock-fusion (see Table 3, Table 4, Figure 3 and Figure 4) improves the recognition accuracy in comparison with using a single superblock. For the Yale database, the best-case accuracy increases from 81.48% to 82.72% with the probabilistic match and from 86.42% to 88.89% with LDA, with no change for DCT; for the ORL database, the best-case recognition rate rises from 75.00% to 78.00% with the probabilistic match and from 67.00% to 76.50% with LDA, again with no change for DCT. Multiblock-fusion gives LDA a 9.5 percentage-point improvement, which compensates its weakness in handling nonlinear pose changes. However, a fine partition into subblocks is not helpful; superblock 3 performs very poorly. An important observation is that DCT has a more local characteristic and shows an ability to handle pose variation. As for computational complexity, both the probabilistic match and LDA involve eigenanalysis in the training phase, which requires significant processing, while DCT needs less computation and can be sped up with fast algorithms. For online feature extraction, the complexity depends on the dimension of the feature vectors.

Table 3. Recognition rate (%) of multiblock-fusion on the Yale database

Superblocks   Probabilistic   LDA     DCT
0+1           81.48           82.72   81.48
0+2           80.25           88.89   81.48
0+3           80.25           49.38   82.72
0+1+2         82.72           88.89   82.72
0+1+2+3       82.72           50.62   82.72

Table 4. Recognition rate (%) of multiblock-fusion on the ORL database

Superblocks   Probabilistic   LDA     DCT
0+1           74.50           74.00   81.50
0+2           74.50           71.50   79.50
0+3           76.50           32.00   80.00
0+1+2         76.50           76.50   80.50
0+1+2+3       78.00           40.50   80.50

Figure 3. Comparison of the three methods within the multiblock-fusion scheme on the Yale database

Figure 4. Comparison of the three methods within the multiblock-fusion scheme on the ORL database

5. Conclusion

Face classification is a challenging visual recognition task due to the large variations in illumination, pose and facial expressions. The proposed multiblock decomposition and fusion scheme aims to exploit the potential of three face recognition approaches, i.e. probabilistic match, LDA and DCT, and to compare their performance within the multiblock-fusion framework. The experimental results demonstrate a notable improvement from fusing the multiblocks. It is also verified that DCT shows a more local characteristic and more robustness to pose variations. Although multiblock-fusion enhances LDA, smaller subblocks are not useful because they carry less discriminant information.

References

[1] Yale face database. http://cvc.yale.edu/projects/yalefaces/yalefaces.html.
[2] M. S. Bartlett, H. M. Lades, and T. J. Sejnowski. Independent component representations for face recognition. In Proc. of SPIE Symposium on Electronic Imaging, pages 528-539, 1998.
[3] P. N. Belhumeur, J. P. Hespanha, and D. J. Kriegman. Eigenfaces vs. Fisherfaces: Recognition using class specific linear projection. IEEE Trans. on Pattern Analysis and Machine Intelligence, 19(7):711-720, July 1997.
[4] R. Chellappa, C. Wilson, and S. Sirohey. Human and machine recognition of faces: A survey. Proceedings of the IEEE, 83(5):705-740, 1995.
[5] Z. M. Hafed and M. D. Levine. Face recognition using the discrete cosine transform. International Journal of Computer Vision, 43(3):167-188, 2001.
[6] J. Kittler, M. Hatef, R. P. W. Duin, and J. Matas. On combining classifiers. IEEE Trans. on Pattern Analysis and Machine Intelligence, 20(3):226-239, 1998.
[7] B. Moghaddam. Principal manifolds and probabilistic subspaces for visual recognition. IEEE Trans. on Pattern Analysis and Machine Intelligence, 24(6):780-788, 2002.
[8] A. Pentland, B. Moghaddam, and T. Starner. View-based and modular eigenspaces for face recognition. In Proc. of IEEE Conf. on Computer Vision and Pattern Recognition (CVPR'94), pages 84-91, Seattle, WA, June 1994.
[9] F. Samaria and A. Harter. Parameterisation of a stochastic model for human face identification. In 2nd IEEE Workshop on Applications of Computer Vision, Florida, 1994.
[10] D. L. Swets and J. Weng. Using discriminant eigenfeatures for image retrieval. IEEE Trans. on Pattern Analysis and Machine Intelligence, 18(8):831-836, 1996.
[11] M. Turk and A. Pentland. Eigenfaces for recognition. Journal of Cognitive Neuroscience, 3(1):71-86, March 1991.
[12] W. Zhao, R. Chellappa, A. Rosenfeld, and P. J. Phillips. Face recognition: A literature survey. Technical Report CAR-TR-948, University of Maryland, 2000.
