Proceedings of 2010 IEEE 17th International Conference on Image Processing
September 26-29, 2010, Hong Kong
AUTOMATIC AND ROBUST 3D FACE REGISTRATION USING MULTIRESOLUTION SPHERICAL DEPTH MAP

Peijiang Liu1, Yunhong Wang1, Zhaoxiang Zhang1 and Yiding Wang2
1 School of Computer Science and Engineering, Beihang University, China
2 School of Information Engineering, North China University of Technology, China
[email protected]; {yhwang, zxzhang}@buaa.edu.cn; [email protected]
ABSTRACT

Face registration is a necessary preprocessing step for 3D face recognition. This paper proposes an entirely automatic method for 3D face registration with high accuracy and good robustness to pose and facial expression variations. Our method consists of three stages. Firstly, the face shape is represented by the Fitting Sphere Representation (FSR). Secondly, the Spherical Depth Map (SDM) of the face shape is generated, with the pose roughly normalized to a common position. Finally, the nose tip is accurately localized on the multiresolution SDM by a coarse-to-fine method. In conjunction with the face orientation derived by linear fitting and the center of the fitting sphere, the face can then be registered completely. Extensive experiments are conducted on five popular 3D face databases; the registration accuracy is near 100 percent. Experimental results demonstrate the high robustness of the proposed method to pose and expression variations.

Index Terms— Face Registration, Face Representation, Depth Map, Template Matching
1. INTRODUCTION

Face registration detects faces in images and builds a dense correspondence between faces by accurately localizing facial feature points. The performance of face registration strongly influences subsequent face processing steps such as feature extraction and recognition. Until now, automatic and robust face registration has remained a challenge for existing techniques. In [1], the authors reviewed algorithms for 3D face registration and recognition based solely on 3D shape information. D. Colbry et al. [2] used the shape index to detect key anchor points in 3D face scans, achieving a 99% success rate on frontal images and an 86% success rate on scans with large variations in pose and expression. In [3], the authors found the reference point and nose line for face registration by computing the symmetry plane of the nose region under prior anatomical knowledge. In contrast, the method proposed here, based on SDM, is insensitive to noise and more robust to pose and expression variation.

Due to the complexity and diversity of 3D face images, a good descriptor of face shape greatly facilitates face registration. P. J. Besl and R. C. Jain [4] surveyed techniques for obtaining, processing, and characterizing range images (or depth maps), a useful shape descriptor that can serve as the input of subsequent processing instead of ordinary 3D images. Range images are usually acquired with optical range sensors in computer vision and suffer from variations in scale and pose [5]. To avoid these problems, we address the generation of a novel depth map, the Spherical Depth Map, computed purely from 3D point coordinates. SDM can be applied to any 3D data format, with the advantages of noise immunity, effectiveness, and invariance to scale and rotation. In this paper, we introduce a novel method for accurate localization of the nose tip using multiresolution SDM in a coarse-to-fine manner. We first detect convexities on the low-resolution SDM as candidates for the nose tip, and then filter out the nose tip from the candidates by template matching on the high-resolution SDM. The nose tip, the face orientation and the center of the fitting sphere can then be used to register the 3D face completely.
978-1-4244-7994-8/10/$26.00 ©2010 IEEE
2. SPHERICAL DEPTH MAP

We obtain the SDM of a 3D face by the following steps, as shown in Fig. 1. Firstly, a sphere is fitted to the input 3D face without preprocessing, following the prior knowledge that the shape of the human head is roughly a ball. After coordinate transforms and scale normalization, we obtain the FSR of the 3D face; details of generating the FSR can be found in our earlier paper [6]. Secondly, due to the distortions of spherical parameterization [7], the face pose is roughly normalized to the position with least distortion, according to the centroid of the face shape and the face orientation derived by linear fitting. Finally, the Spherical Depth Map of the 3D face is obtained by resampling the point depths on the surface of the fitting sphere. The resolution of the SDM is determined by the resampling rate on the fitting sphere.
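The pipeline above — sphere fitting, orientation by linear fitting, and depth resampling — can be sketched in numpy as follows. This is a minimal illustration, not the paper's method: a simple algebraic least-squares sphere fit stands in for the equipartition fitting sphere of the FSR [6], the rough pose normalization is omitted, and the bin layout is a hypothetical choice.

```python
import numpy as np

def fit_sphere(points):
    """Algebraic least-squares sphere fit: solve |p|^2 = 2 c.p + (r^2 - |c|^2)
    for center c and radius r (a simple stand-in for the equipartition
    fitting sphere used by the FSR [6])."""
    A = np.hstack([2.0 * points, np.ones((len(points), 1))])
    b = (points ** 2).sum(axis=1)
    w = np.linalg.lstsq(A, b, rcond=None)[0]
    center = w[:3]
    radius = np.sqrt(w[3] + center @ center)
    return center, radius

def face_orientation(points):
    """Linear fit via SVD, as in Eq. (1): the right singular vector belonging
    to the largest singular value is the direction cosine of the fitted line."""
    centroid = points.mean(axis=0)
    _, _, Vt = np.linalg.svd(points - centroid, full_matrices=False)
    return Vt[0]

def spherical_depth_map(points, center, n_theta=100):
    """Resample point depths on the fitting sphere: partition the sphere into
    bins of width pi/n_theta in theta (inclination) and phi (azimuth); each
    bin stores the mean depth of its points, or zero when empty."""
    step = np.pi / n_theta
    p = points - center
    depth = np.linalg.norm(p, axis=1)
    theta = np.arccos(np.clip(p[:, 2] / depth, -1.0, 1.0))  # inclination in [0, pi]
    phi = np.arctan2(p[:, 1], p[:, 0]) + np.pi              # azimuth in [0, 2*pi]
    ti = np.minimum((theta / step).astype(int), n_theta - 1)
    pj = np.minimum((phi / step).astype(int), 2 * n_theta - 1)
    sdm = np.zeros((n_theta, 2 * n_theta))
    cnt = np.zeros((n_theta, 2 * n_theta))
    np.add.at(sdm, (ti, pj), depth)   # accumulate depths per partition
    np.add.at(cnt, (ti, pj), 1.0)
    np.divide(sdm, cnt, out=sdm, where=cnt > 0)  # mean depth; empty bins stay zero
    return sdm
```

Taking the mean depth per partition, rather than a single nearest sample, is what gives the SDM its immunity to discrete outlier points and tiny holes.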
Fig. 1. The flowchart of SDM generation.

2.1. Rough Normalization of Face Pose

A sphere cannot be projected onto a plane without distortion [7]. In FSR, the human face occupies an area of less than π/2 × π/2 on the fitted sphere. Orthographic projection is therefore employed to parameterize the fitting sphere, and the distortion of the face is minimized by rotating it to the center of projection. The rough normalization of face pose is performed according to two factors: the centroid of the face shape and the face orientation derived by linear fitting. The head shape can be fitted by a cylinder whose axis is the face orientation. However, cylindrical fitting is complicated and time-consuming, and only the orientation is needed in our method. We therefore employ linear fitting instead of cylindrical fitting to find the orientation of the face shape, since a straight line can be viewed as a cylinder with zero radius. We conduct linear fitting on the 3D points using Singular Value Decomposition:

Centroid = mean(H)
A = H − Centroid
U S V = svd(A)                                              (1)

where H denotes the points of the face shape and the matrix S contains the singular values. The right singular vector in V corresponding to the largest value in S is the direction cosine of the fitted line. For convenience and intuition, we choose the point (π, π/2) on the sphere as the center of orthographic projection. The face pose is then normalized in two steps based on the FSR of the face. Firstly, the centroid of the face shape is placed on the negative x axis by rotating around the z and y axes. Then, the face is rotated around the x axis to a pose in which the face orientation is perpendicular to the z axis, as shown in Fig. 1.

2.2. Resampling on Fitting Sphere

After rough normalization of face pose, we obtain the SDM of the 3D face in two steps: (1) Segment the whole fitting sphere by a fixed increment such as π/100; every partition is a sampling point. (2) The sampling value is the mean depth of all points belonging to the partition; the key advantage of the mean is its immunity to noise such as discrete points or tiny holes. If no point falls in a sampling position, the value is zero. Although face shapes contain an arbitrary number of points, the SDM has a fixed number of points after sampling on the fitting sphere.

3. FACE REGISTRATION

An obvious and useful property exploited in nose localization is that the nose always sticks out from the rest of the face. In this paper, a novel concept named convexity is defined to capture this property based on the SDM of the 3D face. Firstly, all convexities are detected on the low-resolution SDM as candidates for the nose tip. Then, the nose tip is filtered out from the candidates by template matching on the high-resolution SDM.

3.1. Convexity Detection

Fig. 2 shows a 3 × 3 subrange of an SDM, where pi (i ∈ [0, 8]) denotes a sampling point in the subrange. p0 is considered a convexity only when p0 > pi for all i ∈ [1, 8]. Three convexities are detected in the face in Fig. 2(a).
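The convexity test of Sec. 3.1 — p0 a strict maximum over its 3 × 3 subrange — can be sketched as follows; a minimal sketch assuming the SDM is a 2D numpy array in which empty partitions are zero.

```python
import numpy as np

def detect_convexities(sdm):
    """Return (row, col) indices of strict local maxima: a sampling point p0
    is a convexity only when it exceeds all 8 neighbours in its 3x3 subrange.
    Empty (zero) bins are skipped, an assumption of this sketch."""
    convexities = []
    rows, cols = sdm.shape
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            center = sdm[i, j]
            patch = sdm[i - 1:i + 2, j - 1:j + 2]
            # strict maximum: only p0 itself satisfies patch >= center
            if center > 0 and np.sum(patch >= center) == 1:
                convexities.append((i, j))
    return convexities
```

At the π/25 increment used for detection this scans only a small grid, so the brute-force double loop is cheap; the strict inequality rejects plateaus, so each detected convexity is an isolated peak.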
Fig. 2. Convexity detection and relocation. θ and ϕ denote, respectively, inclination (or elevation) and azimuth in spherical coordinates.

The sampling increment of the SDM for convexity detection is π/25 in this paper, for the following two reasons. On the one hand, the number of detected convexities increases with the sampling rate; to reduce complexity, the lowest resolution of SDM is adopted under which the nose tip can still be detected. On the other hand, due to the biometric characteristics of human beings, the area of the human nose on the fitting sphere is similar across subjects (about π/10 × π/10). To improve precision, nose localization should be conducted on the high-resolution SDM. Therefore, before nose localization, we relocate the detected convexities in the high-resolution SDM by the following two steps:
(1) Locate the subrange in the high-resolution SDM corresponding to the detected convexity. (2) Take the highest sampling point in the subrange as the new convexity in the high-resolution SDM. Fig. 2(b) shows a relocated convexity in the high-resolution SDM. These convexities are the candidates for the nose tip.

3.2. Template Matching

The nose tip is accurately localized by matching the candidates to a prepared template derived from a generalized face [8]. Fig. 3 shows the template face and its SDM at a π/100 sampling increment. Based on the manually labeled nose tip, we choose a square along the nose orientation as the nose template (a square matrix). After rough normalization of face pose, the face orientation is perpendicular to the z axis; in other words, template matching needs to be performed in only two directions. The opposite-direction template is the rotation of the square matrix by 180°.
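The relocation step and the two-direction matching can be sketched as below. This is a hedged sketch, not the paper's implementation: the candidate region is anchored at the candidate index for simplicity, absolute differences are assumed inside the means of the Diff measure of Eq. (2), and `scale` is the hypothetical ratio between the high- and low-resolution sampling rates (e.g. 4 for π/25 versus π/100).

```python
import numpy as np

def relocate(conv_lo, scale, sdm_hi):
    """Map a low-resolution convexity to its subrange in the high-resolution
    SDM and take the highest sampling point as the new convexity."""
    i, j = conv_lo
    sub = sdm_hi[i * scale:(i + 1) * scale, j * scale:(j + 1) * scale]
    di, dj = np.unravel_index(np.argmax(sub), sub.shape)
    return i * scale + di, j * scale + dj

def diff_measure(template, region):
    """Mean-normalized directional differences after Eq. (2): compare the
    theta- and phi-direction gradients of template T and region R, each
    normalized by its own mean depth."""
    t_dt, t_dp = np.diff(template, axis=0), np.diff(template, axis=1)
    r_dt, r_dp = np.diff(region, axis=0), np.diff(region, axis=1)
    return (np.mean(np.abs(t_dt / template.mean() - r_dt / region.mean()))
            + np.mean(np.abs(t_dp / template.mean() - r_dp / region.mean())))

def best_candidate(candidates, template, sdm_hi):
    """Score each candidate region in both directions (the opposite template
    is the 180-degree rotation) and return the candidate with least Diff."""
    k = template.shape[0]
    flipped = np.rot90(template, 2)
    best, best_d = None, np.inf
    for (i, j) in candidates:
        region = sdm_hi[i:i + k, j:j + k]
        if region.shape != template.shape:
            continue  # candidate too close to the SDM border
        d = min(diff_measure(template, region), diff_measure(flipped, region))
        if d < best_d:
            best, best_d = (i, j), d
    return best
```

Normalizing each patch by its own mean before differencing is what makes the comparison insensitive to the absolute depth scale of a particular face.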
Fig. 3. Template face and its SDM. θ and ϕ denote, respectively, inclination (or elevation) and azimuth in spherical coordinates.

Based on every convexity, two squares of the same size along the horizontal direction are selected as matching regions. Finally, we compute the similarity between the template and each matching region by:

Diff = mean( Tdθ / mean(T) − Rdθ / mean(R) ) + mean( Tdϕ / mean(T) − Rdϕ / mean(R) )        (2)

where T and R denote the nose template and the matching region, respectively, and dθ and dϕ denote the differences calculated along the θ and ϕ dimensions. The convexity with the least Diff is the nose tip. The nose tip, in conjunction with the face orientation and the center of the fitting sphere, registers the 3D face completely.

4. EXPERIMENTAL RESULTS AND ANALYSIS

Our experiments are conducted on five popular 3D face databases: 3DRMA [9], BJUT-3D, FRAV3D [10], FRGC2.0 [11] and GavabDB [12]. To measure the accuracy of face registration, we manually labeled the nose tip and the inner eye corners as ground truth; the ground-truth face orientation is derived from the nose tip and the midpoint of the two inner eye corners. Labeling all faces in the five databases is burdensome and unnecessary, so a subset of 3D faces was selected randomly from each database for our experiments. Two strict criteria of successful face registration are required. Firstly, the nose tip derived from the experiments must lie within one sampling point of the labeled nose tip. Secondly, the angle between the face orientation derived from linear fitting and the labeled orientation must be less than 5 degrees. Registration is counted as successful only when both criteria are satisfied simultaneously. The two accuracies are also calculated separately to analyze each aspect. Experimental results are listed in Table 1, where A1 denotes the accuracy of nose-tip localization and A2 the accuracy of finding the face orientation. Ten examples of successful nose-tip localization are illustrated in Fig. 4.
Fig. 4. Examples of nose localization on five 3D face databases. The first column shows neutral faces from the five databases, and the second shows faces with expressions. In the corresponding SDM of every face, the blue point between a pair of squares denotes a detected convexity, and the green square is the localized nose region.

We can draw the following conclusions from Table 1 and Fig. 4. Experiments performed on five different face databases demonstrate the wide applicability of our method. The proposed method is immune to noise, because its input is the raw point cloud without preprocessing such as spike removal or hole filling, and it is robust to pose and expression variation. Analyzing the failure cases, the errors on FRGC2.0 are caused by the shoulder area in the input being confused with the nose area, and the errors on GavabDB by the half faces contained in that database.

Table 1. Experimental results on five 3D face databases

Database   Quantity   Face area              Noise            Pose      Expressions                       A1      A2
3DRMA      50         face, neck             splashes, hole   uniform   neutral                           100%    100%
BJUT-3D    50         face, neck, ear        spike            uniform   neutral                           100%    94%
FRAV3D     30 × 16    face, neck, ear        spike, hole      various   neutral, smile, open mouth        100%    100%
FRGC2.0    300        face, neck, shoulder   spike, hole      uniform   neutral, smile, surprise, angry   99.3%   100%
GavabDB    30 × 9     face, neck, hair       spike, hole      various   neutral, smile, laugh, gesture    97.8%   100%

Table 2. The comparison with related works

Method   Input                                 Preproc.   Accuracy
Ours     Point coordinates                     No         99%
[2]      Shape index                           Yes        99%, 86%
[3]      Point cloud                           Yes        Not mentioned
[13]     Point cloud and color texture image   Yes        96.5%

We compare our face registration method with several state-of-the-art works [2] [3] [13]. As shown in Table 2, the comparison demonstrates the wider applicability, better robustness and higher accuracy of the proposed method.

5. CONCLUSIONS

A novel descriptor of 3D face shape has been proposed in this paper, with which automatic face registration was conducted with high accuracy and good robustness to noise and pose variation. We conducted experiments on five 3D face databases which vary in scale, content and acquisition constraints. The accuracy of face registration is near 100 percent. Comparison with state-of-the-art works demonstrates the wider applicability, better robustness and higher accuracy of the proposed method.

6. ACKNOWLEDGEMENTS

This work is funded by the National Natural Science Foundation of China (No. 60873158), the National Basic Research Program of China (No. 2010CB327902), the Fundamental Research Funds for the Central Universities, and the Opening Funding of the State Key Laboratory of Virtual Reality Technology and Systems. Portions of the research in this paper use the BJUT-3D Face Database collected under the joint sponsorship of the National Natural Science Foundation of China, the Beijing Natural Science Foundation Program, and the Beijing Science and Educational Committee Program.

7. REFERENCES

[1] B. Gokberk, M. O. Irfanoglu, and L. Akarun, "3D shape-based face representation and feature extraction for face recognition," Image and Vision Computing, vol. 24, pp. 857–869, 2006.
[2] D. Colbry, G. Stockman, and A. K. Jain, "Detection of anchor points for 3D face verification," in IEEE CVPR, 2005.
[3] X. Tang, J. Chen, and Y. Moon, "Towards more accurate 3D face registration under the guidance of prior anatomical knowledge on human faces," in IEEE FGR, 2008.
[4] P. J. Besl and R. C. Jain, "Three-dimensional object recognition," ACM Computing Surveys, vol. 17, no. 1, pp. 75–145, March 1985.
[5] J. C. Lee and E. Milios, "Matching range images of human faces," in IEEE ICCV, December 1990, pp. 722–726.
[6] P. Liu and Y. Wang, "3D face pose normalization using equipartition fitting sphere representation of shape," in Fifth International Conference on Image and Graphics, September 2009.
[7] M. S. Floater and K. Hormann, "Surface parameterization: a tutorial and survey," in Advances in Multiresolution for Geometric Modelling, Springer, Berlin Heidelberg, 2005, pp. 157–186.
[8] V. Blanz and T. Vetter, "A morphable model for the synthesis of 3D faces," in SIGGRAPH 1999, Computer Graphics Proceedings, 1999, pp. 187–194.
[9] "Signal and Image Centre. 3D RMA: 3D database," http://www.sic.rma.ac.be/~beumier/DB/3d_rma.html.
[10] C. Conde, A. Serrano, and E. Cabello, "Multimodal 2D, 2.5D, 3D face verification," in IEEE ICIP, 2006, pp. 2061–2064.
[11] P. J. Phillips, P. J. Flynn, T. Scruggs, K. W. Bowyer, J. Chang, K. Hoffman, J. Marques, J. Min, and W. Worek, "Overview of the face recognition grand challenge," in IEEE CVPR, 2005.
[12] "GavabDB: a 3D Face Database," http://gavab.escet.urjc.es/recursos_en.html.
[13] T. C. Faltemier, K. W. Bowyer, and P. J. Flynn, "Rotated profile signatures for robust 3D feature detection," in IEEE FGR, 2008.