Isometric Deformation Modelling for Object Recognition

Dirk Smeets, Thomas Fabry, Jeroen Hermans, Dirk Vandermeulen, and Paul Suetens

K.U. Leuven, Faculty of Engineering, Department of Electrical Engineering, Center for Processing Speech and Images, Medical Imaging Research Center, Universitair Ziekenhuis Gasthuisberg, Herestraat 49 bus 7003, B-3000 Leuven, Belgium

Abstract. We present two methods for isometrically deformable object recognition. The methods are built upon the use of geodesic distance matrices (GDM) as an object representation. The first method compares these matrices by using histogram comparisons. The second method is a modal approach. The largest singular values or eigenvalues appear to be an excellent shape descriptor, based on the comparison with other methods also using the isometric deformation model and a general baseline algorithm. The methods are validated using the TOSCA database of non-rigid objects and a rank 1 recognition rate of 100% is reported for the modal representation method using the 50 largest eigenvalues. This is clearly higher than other methods using an isometric deformation model.

1 Introduction

During the last decades, developments in 3D modelling and 3D capturing techniques have increased the interest in the use of 3D objects for a number of applications, such as CAD/CAM, architecture, computer games, archaeology, medical applications and biometrics. Because of this growing use of 3D objects, 3D databases are emerging, which leads to a new research question: 3D object retrieval. The yearly SHREC contests [1] bear witness to this: for the last three years, the 3D SHape REtrieval Contest has aimed to evaluate the effectiveness of 3D shape retrieval algorithms.

Our contribution considers 3D object recognition that copes with non-rigid deformations in particular. Examples of such deformations are the expression variations of a human face, the movement of the different subparts of a fabrication robot, or simply the movement of a walking human. Based on the assumption that geodesic distances¹ remain approximately constant during natural non-rigid deformations, we propose a technique for non-rigid object recognition based on the geodesic distance matrix (GDM), a matrix summarizing all point-to-point geodesic distances on the object mesh.

Corresponding author: [email protected]. Joint first author.
¹ The geodesic distance between two points is the length of the shortest path between them on the object surface.

X. Jiang and N. Petkov (Eds.): CAIP 2009, LNCS 5702, pp. 757–765, 2009. © Springer-Verlag Berlin Heidelberg 2009

2 Related Work

Some 3D object recognition methods that deal with non-rigid objects and make use of the geodesic distance matrix can already be found in the literature. The one that received the most attention is probably the method of Elad and Kimmel [2]. Here, the GDM is computed using the fast marching on triangulated domains (FMTD) method. Then, the GDM is processed using multidimensional scaling (MDS), converting the non-rigid objects into rigid invariant signature surfaces. These can be compared using simpler algorithms for rigid matching. We will use an implementation of this method for comparison.

Another 3D object recognition method that shows some similarity to one of the methods we propose here is the Geodesic Object Representation of Hamza and Krim [3]. Here, the shape descriptor is a global geodesic shape function. This shape function is defined at each point on the surface and measures the normalized integral of squared geodesic distances to the other points on the surface. These global geodesic shape functions are then used to construct geodesic shape distributions: kernel density estimates (KDE) of the (discretised) global geodesic shape functions of a particular object. For the actual recognition, these KDEs are compared using the Jensen-Shannon divergence. The similarity of this method to our modal representation lies in the use of the geodesic distance matrix, which, in the method of Hamza and Krim, is used for the computation of the geodesic shape functions.

Finally, a method similar to our modal representation approach is the one presented in Jain and Zhang's work [4]. This method measures the inter-object distance by taking the χ²-distance between the 20 largest eigenvalues of a weighted GDM. We will show that the weighting of the GDM has an adverse effect on the accuracy of the method.

3 Isometric Deformation Modelling

In mathematics, an isometry is a distance-preserving isomorphism between metric spaces. The basis of the isometric deformation model is therefore the invariance of distances measured along the surface, called geodesic distances. An appropriate object representation to exploit the advantages of an isometric model is thus the geodesic distance matrix (GDM). We call $G$ a GDM for a particular object if $G = [g_{ij}]$, with $g_{ij}$ the geodesic distance between points $i$ and $j$ on the object surface. This matrix is symmetric and is defined up to an arbitrary simultaneous permutation of its rows and columns, because the ordering of the points on the represented object surface is arbitrary. Figure 1 shows a 3D object and the associated GDM.

For the calculation of the GDM, a fast marching algorithm for triangulated meshes is used [5]. The algorithm computes the length of the shortest (discrete) path between each pair of surface points. The complexity of this computation is $O(n^2)$, with $n$ the dimension of the GDM. Besides the geodesic distance matrix ($G_1 = [g_{ij}]$), other affinity matrices closely related to the GDM are also examined: the squared GDM ($G_2 = [g_{ij}^2]$), the Gaussian weighted GDM ($G_3 = [\exp(-g_{ij}^2/(2\sigma^2))]$) and the increasing weighting function GDM ($G_4 = [(1 + \frac{1}{\sigma} g_{ij})^{-1}]$) [6].

Fig. 1. 3D mesh of an object (a) and its geodesic distance matrix representation (b)
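To make the four matrices concrete, the following is a minimal sketch under stated assumptions. It does not implement fast marching [5]; as a simpler stand-in it runs Dijkstra's algorithm along mesh edges, which restricts paths to edges and therefore generally overestimates the true geodesic distance. The function names, the toy mesh and the value of `sigma` are illustrative and not taken from the paper.

```python
import heapq
import math


def geodesic_distance_matrix(vertices, edges):
    """Approximate GDM: shortest path lengths along mesh edges.

    Stand-in for fast marching: Dijkstra is run from every vertex.
    """
    n = len(vertices)
    adj = [[] for _ in range(n)]
    for i, j in edges:
        w = math.dist(vertices[i], vertices[j])  # Euclidean edge length
        adj[i].append((j, w))
        adj[j].append((i, w))
    G = []
    for src in range(n):
        dist = [math.inf] * n
        dist[src] = 0.0
        heap = [(0.0, src)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:
                continue  # stale heap entry
            for v, w in adj[u]:
                if d + w < dist[v]:
                    dist[v] = d + w
                    heapq.heappush(heap, (d + w, v))
        G.append(dist)
    return G


def weighted_variants(G1, sigma):
    """Return the affinity matrices G2, G3 and G4 derived from G1."""
    G2 = [[g * g for g in row] for row in G1]
    G3 = [[math.exp(-g * g / (2.0 * sigma ** 2)) for g in row] for row in G1]
    G4 = [[1.0 / (1.0 + g / sigma) for g in row] for row in G1]
    return G2, G3, G4


# Toy "mesh": the boundary of a unit square (hypothetical data).
vertices = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
G1 = geodesic_distance_matrix(vertices, edges)
G2, G3, G4 = weighted_variants(G1, sigma=1.0)
```

In this toy example opposite corners are two edge lengths apart, so `G1[0][2]` is 2.0 rather than the Euclidean diagonal, which illustrates the difference between a distance measured along the surface and one measured through space.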

3.1 Multidimensional Scaling

Multidimensional scaling (MDS) is a technique that allows visualisation of the proximity between points with respect to some kind of dissimilarity (distance) matrix. For the Euclidean distance matrix of a 3D object, three-dimensional MDS recovers the configuration of the original object, up to a rigid transformation and reflection. In [7], MDS is applied to the GDM in order to obtain a configuration of points whose pointwise Euclidean distances approximately equal the original pointwise geodesic distances. Figure 2 shows the resulting 2D and 3D configurations, called canonical forms, calculated using classical MDS.

Because the geodesic distances remain constant under isometric transformations, the GDM of an object is invariant with respect to isometric transformations up to an arbitrary simultaneous permutation of rows and columns. The canonical forms, however, have the same shape regardless of this permutation. Therefore, objects can be compared by rigidly aligning the canonical forms and comparing the registration error.

Fig. 2. 2D (a) and 3D (b) canonical form of the same object as shown in Fig. 1
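The classical MDS step can be sketched as follows: square the distances, double-centre the result, and scale the leading eigenvectors by the square roots of their eigenvalues. This is an illustrative sketch, not the implementation used in the experiments. The eigenpairs are obtained here by plain power iteration with deflation, which assumes that the dominant eigenvalues are positive and well separated; the function names and the toy rectangle are assumptions made for the example.

```python
import math
import random


def top_eigenpairs(B, k, iters=500, seed=0):
    """Dominant k eigenpairs of a symmetric matrix (power iteration + deflation)."""
    n = len(B)
    B = [row[:] for row in B]  # work on a copy
    rng = random.Random(seed)
    pairs = []
    for _ in range(k):
        v = [rng.random() + 0.1 for _ in range(n)]
        for _ in range(iters):
            w = [sum(B[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient gives the eigenvalue estimate
        lam = sum(v[i] * sum(B[i][j] * v[j] for j in range(n)) for i in range(n))
        pairs.append((lam, v))
        for i in range(n):  # deflate the component just found
            for j in range(n):
                B[i][j] -= lam * v[i] * v[j]
    return pairs


def classical_mds(D, dim):
    """Embed n points in `dim` dimensions from an n x n distance matrix D."""
    n = len(D)
    sq = [[d * d for d in row] for row in D]
    row_mean = [sum(r) / n for r in sq]
    total_mean = sum(row_mean) / n
    # Double centring: B = -1/2 * J * D^2 * J, with J = I - (1/n) * ones
    B = [[-0.5 * (sq[i][j] - row_mean[i] - row_mean[j] + total_mean)
          for j in range(n)] for i in range(n)]
    coords = [[0.0] * dim for _ in range(n)]
    for axis, (lam, v) in enumerate(top_eigenpairs(B, dim)):
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][axis] = scale * v[i]
    return coords


# Toy check on a Euclidean distance matrix (corners of a 2 x 1 rectangle).
points = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 1.0)]
D = [[math.dist(p, q) for q in points] for p in points]
X = classical_mds(D, dim=2)
max_err = max(abs(math.dist(X[i], X[j]) - D[i][j])
              for i in range(4) for j in range(4))
```

For a Euclidean input such as this rectangle the embedding reproduces the pairwise distances, so `max_err` is numerically negligible. For a GDM the reproduction is only approximate, since geodesic distances on a curved surface are in general not exactly embeddable in a low-dimensional Euclidean space.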

3.2 Histogram Comparison

We propose another way to compare deformable objects: by comparing histograms of the values contained in the geodesic distance matrices. The resulting representation is invariant to simultaneous permutations of the rows and columns of the matrix. Experiments were conducted with two kinds of histograms. The first are histograms calculated from all values in the upper triangle of the GDM. The second are histograms of mean geodesic distances per point. Examples of those histograms for the object in Fig. 1 are shown in Fig. 3. Other histogram variants are possible but are not considered here.

Fig. 3. Histograms of all (a) and point-wise averaged (PWA) (b) geodesic distances of the same object as shown in Fig. 1
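The two histogram types can be sketched in a few lines. The sketch below makes several assumptions that the text does not specify: the bins are equal-width over a fixed range `[0, upper]`, the histograms are normalised to sum to one, and the point-wise average is taken over the other `n - 1` points. The small matrix is hypothetical data (four points on a line), used only to show the permutation invariance.

```python
def histogram(values, bins, upper):
    """Normalised histogram with `bins` equal-width bins on [0, upper]."""
    counts = [0] * bins
    for v in values:
        idx = min(int(v * bins / upper), bins - 1)  # value == upper goes in last bin
        counts[idx] += 1
    return [c / len(values) for c in counts]


def all_value_histogram(G, bins, upper):
    """Histogram of all values in the upper triangle of the GDM."""
    n = len(G)
    values = [G[i][j] for i in range(n) for j in range(i + 1, n)]
    return histogram(values, bins, upper)


def pwa_histogram(G, bins, upper):
    """Histogram of the mean geodesic distance per point (PWA)."""
    n = len(G)
    means = [sum(row) / (n - 1) for row in G]  # the diagonal entry is zero
    return histogram(means, bins, upper)


# Hypothetical GDM of four points on a line at positions 0, 1, 3 and 6.
G = [[0, 1, 3, 6],
     [1, 0, 2, 5],
     [3, 2, 0, 3],
     [6, 5, 3, 0]]
perm = [2, 0, 3, 1]
G_perm = [[G[a][b] for b in perm] for a in perm]  # rows and columns permuted together

h_all = all_value_histogram(G, bins=3, upper=6.0)
h_pwa = pwa_histogram(G, bins=3, upper=6.0)
same_all = h_all == all_value_histogram(G_perm, bins=3, upper=6.0)
same_pwa = h_pwa == pwa_histogram(G_perm, bins=3, upper=6.0)
```

Both `same_all` and `same_pwa` are true here, because a simultaneous row and column permutation only reorders the values that enter each histogram.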

The histograms $S^j$ ($j = 1, \ldots, n$), with $n$ the number of objects, can be thought of as $m$-dimensional vectors, with $m$ the number of bins. They can be compared with a plethora of dissimilarity measures; we have tested 8 different ones. Histograms can be compared using the Jensen-Shannon divergence [8]:

$$\mathrm{JSD}(S^1, S^2, \ldots, S^n) = H\Big(\sum_{j=1}^{n} \pi_j S^j\Big) - \sum_{j=1}^{n} \pi_j H(S^j), \qquad (1)$$

with $\pi_j$ the weight for the histogram vector $S^j$ and $H(S^j)$ the Shannon entropy, given by $H(S) = -\sum_{i=1}^{m} S_i \log_b S_i$. In this work only pair-wise comparisons are considered, and both histograms receive equal weighting ($\pi_1 = \pi_2 = 1/2$). The other dissimilarity measures need less explanation and are listed in Tab. 1.
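As an illustration of the pair-wise case, the sketch below evaluates Eq. 1 with equal weights, together with two of the measures from Tab. 1 (the mean normalized Manhattan distance and the mean normalized absolute difference of square root vectors). Two choices are assumptions of this sketch rather than statements from the text: the logarithm base is fixed to $b = 2$, and bins that are empty in both histograms are skipped so that no 0/0 term arises.

```python
import math


def entropy(S):
    """Shannon entropy with base 2; empty bins contribute nothing."""
    return -sum(s * math.log2(s) for s in S if s > 0)


def jensen_shannon(Sk, Sl):
    """Pair-wise Jensen-Shannon divergence with weights 1/2 and 1/2."""
    mix = [0.5 * (a + b) for a, b in zip(Sk, Sl)]
    return entropy(mix) - 0.5 * (entropy(Sk) + entropy(Sl))


def mean_normalized_manhattan(Sk, Sl):
    """Sum over bins of 2|a - b| / (a + b)."""
    return sum(2.0 * abs(a - b) / (a + b) for a, b in zip(Sk, Sl) if a + b > 0)


def mean_normalized_sqrt_difference(Sk, Sl):
    """Same form, applied to the element-wise square roots of both vectors."""
    total = 0.0
    for a, b in zip(Sk, Sl):
        ra, rb = math.sqrt(a), math.sqrt(b)
        if ra + rb > 0:
            total += 2.0 * abs(ra - rb) / (ra + rb)
    return total


# Hypothetical two-bin histograms.
identical = jensen_shannon([0.5, 0.5], [0.5, 0.5])
disjoint = jensen_shannon([1.0, 0.0], [0.0, 1.0])
manhattan = mean_normalized_manhattan([1.0, 0.0], [0.0, 1.0])
sqrt_diff = mean_normalized_sqrt_difference([0.5, 0.5], [0.5, 0.5])
```

With base 2 the pair-wise divergence lies between 0 (identical histograms) and 1 (histograms with disjoint support), which the two toy inputs reproduce.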

3.3 Modal Representation

Table 1. Dissimilarity measures

Dissimilarity measure                                        Formula
Jensen-Shannon divergence                                    $D_1 = H(\frac{1}{2} S^k + \frac{1}{2} S^l) - (\frac{1}{2} H(S^k) + \frac{1}{2} H(S^l))$
Mean normalized Manhattan distance                           $D_2 = \sum_{i=1}^{m} \frac{2 |S_i^k - S_i^l|}{S_i^k + S_i^l}$
Mean normalized maximum norm                                 $D_3 = \max_i \frac{2 |S_i^k - S_i^l|}{S_i^k + S_i^l}$
Mean normalized absolute difference of square root vectors   $D_4 = \sum_{i=1}^{m} \frac{2 |\sqrt{S_i^k} - \sqrt{S_i^l}|}{\sqrt{S_i^k} + \sqrt{S_i^l}}$
Correlation                                                  $D_5 = 1 - \frac{S^k \cdot S^l}{\|S^k\| \, \|S^l\|}$
Euclidean distance                                           $D_6 = \sqrt{\sum_{i=1}^{m} (S_i^k - S_i^l)^2}$
Normalized Euclidean distance                                $D_7 = \sqrt{\sum_{i=1}^{m} (S_i^k - S_i^l)^2 / \sigma_i^2}$
Mahalanobis distance                                         $D_8 = \sqrt{(S^k - S^l)^T \, \mathrm{cov}(S)^{-1} \, (S^k - S^l)}$

A third approach for object comparison using the isometric model is based on a modal representation. Here, the information in the geodesic distance matrix is separated into a matrix that contains intrinsic shape information and a matrix with information about corresponding points. This is done with an eigenvalue decomposition (EVD) or a singular value decomposition (SVD) of the GDM. Both decompositions give similar results because the GDM is a symmetric matrix, for which the singular values are the absolute values of the eigenvalues. The eigenvalues and singular values can be used as intrinsic shape descriptors, while the eigenvectors and singular vectors give information about correspondences. For numerical reasons, only the largest eigenvalues or singular values are computed.

Because we do not know anything about the order of the points, $G$ and all possible simultaneous permutations of rows and columns of $G$ determine the configuration of the object. Let $P$ be an arbitrary permutation matrix, such that $G' = P G P^T$ is a GDM with rows and columns permuted, and let $G = U \Sigma V^T$ be a singular value decomposition. Then

$$G' = P G P^T = P U \Sigma (P V)^T. \qquad (2)$$

Because $PU$ and $PV$ remain unitary matrices and $\Sigma$ is still a diagonal matrix with non-negative real numbers on the diagonal, the right-hand side of Eq. 2 is a valid singular value decomposition of $G'$. A common convention is to sort the singular values in non-increasing order. In this case, the diagonal matrix $\Sigma$ is uniquely determined by $G'$. Therefore, $\Sigma = \Sigma'$, with $\Sigma'$ the singular value matrix of $G'$. From this, we can see that $\Sigma$ contains the intrinsic information about geometry, while $U$ and $V$ contain the information about correspondences between points.

This justifies our approach of object recognition using $S = \{\sigma_1, \sigma_2, \ldots, \sigma_k\}$, with $\sigma_1, \sigma_2, \ldots, \sigma_k$ the $k$ largest singular values of the GDM, as a shape descriptor. As such, the computational complexity of the singular value calculation is limited to $O(k n^2)$, with $n$ the dimension of the GDM. For comparing these singular value vectors, we can use the same dissimilarity measures as described in Sect. 3.2 (see Tab. 1).
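The permutation invariance argued with Eq. 2 can be checked numerically on a small example. The sketch below is illustrative only: it computes the full spectrum with cyclic Jacobi rotations, which is practical for tiny matrices but is not the truncated computation of the $k$ largest values described in the text, and it uses the fact that for a symmetric matrix the singular values are the absolute eigenvalues. The function names and the toy matrix are assumptions of the example.

```python
import math


def symmetric_eigenvalues(A, sweeps=50, tol=1e-20):
    """All eigenvalues of a symmetric matrix via cyclic Jacobi rotations."""
    n = len(A)
    A = [row[:] for row in A]  # work on a copy
    for _ in range(sweeps):
        off = sum(A[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(A[p][q]) < 1e-150:
                    continue
                theta = (A[q][q] - A[p][p]) / (2.0 * A[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.sqrt(theta * theta + 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                for k in range(n):  # rotate columns p and q
                    akp, akq = A[k][p], A[k][q]
                    A[k][p] = c * akp - s * akq
                    A[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q
                    apk, aqk = A[p][k], A[q][k]
                    A[p][k] = c * apk - s * aqk
                    A[q][k] = s * apk + c * aqk
    return [A[i][i] for i in range(n)]


def modal_descriptor(G, k):
    """The k largest singular values of a symmetric GDM, in non-increasing order."""
    singular_values = sorted((abs(lam) for lam in symmetric_eigenvalues(G)), reverse=True)
    return singular_values[:k]


# Hypothetical GDM of four points on a line at positions 0, 1, 3 and 6.
G = [[0, 1, 3, 6],
     [1, 0, 2, 5],
     [3, 2, 0, 3],
     [6, 5, 3, 0]]
perm = [2, 0, 3, 1]
G_perm = [[G[a][b] for b in perm] for a in perm]  # G' = P G P^T

desc = modal_descriptor(G, k=3)
desc_perm = modal_descriptor(G_perm, k=3)
max_diff = max(abs(a - b) for a, b in zip(desc, desc_perm))
```

Up to rounding, `desc` and `desc_perm` coincide, so `max_diff` is numerically negligible: reordering the points changes the singular vectors but not the singular values.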

4 Experimental Validation

To examine the deviation from the isometric deformation assumption in a realistic situation, we looked at the change in geodesic distance between four fingertips in three different configurations of a hand. This results in a mean coefficient of variation (CV) of 5.3% for the geodesic distances, while the CV for the Euclidean distances is 27.6%.

For the validation of the three proposed object recognition approaches, we use the TOSCA database [9]. This database consists of various 3D non-rigid shapes in a variety of poses and is intended for non-rigid shape similarity and correspondence experiments. We use 133 objects, i.e. 9 cats, 11 dogs, 3 wolves, 17 horses, 21 gorillas, 1 shark, 24 female figures, and two different male figures, containing 15 and 20 poses. Each object contains approximately 3000 vertices.

We compare the three GDM-based methods with a baseline algorithm: the standard iterative closest point (ICP) algorithm [10]. This is a well-known and extensively used rigid object registration method that minimizes the sum of squared Euclidean distances between closest points. After rigid registration the objects can be compared using the value of the employed registration objective function.

After roughly tuning the parameters, we used 100 bins for the histogram comparison with all values and 80 bins for the pointwise averaged (PWA) value histogram comparison. These numbers were determined experimentally.

The different approaches are validated using standard recognition experiments, i.e. the verification and the identification scenario. The performance in these scenarios is measured with the receiver operating characteristic (ROC) curve and the cumulative matching curve (CMC), respectively. The former is a curve plotting the false rejection rate (FRR) against the false acceptance rate (FAR), while the latter gives the recognition rate for several ranks. These curves can be found in Fig. 4.
Here, we plotted the best combination of GDM weighting, dissimilarity measure and, for the modal representation approach, the optimal number of eigenvalues (see below). The equal error rate (EER) and rank 1 recognition rate (R1 RR) are characteristic points on the ROC and CMC, respectively. These are tabulated in Tab. 2. Figure 5 plots the R1 RR against the number of eigenvalues (logarithmic scale) used in the shape descriptor. A plateau of maximum recognition is observed for shape descriptors using between 35 and 430 eigenvalues. In Tab. 3, the different dissimilarity measures are compared, showing that the best results are obtained with the mean normalized absolute difference of square root vectors of the 50 largest eigenvalues.

Table 2. Results of different isometric deformation model methods on TOSCA database experiment

Method                      R1 RR     EER
MDS                         39.34%    29.49%
histogram of PWA values     63.11%    16.93%
histogram of all values     72.13%    14.90%
modal representation        100.0%    2.43%
ICP                         35.29%    40.07%
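The two summary figures reported in Tab. 2 can be computed from a matrix of pair-wise dissimilarities, as the following sketch shows. The exact protocol of the experiments (for instance how probes and gallery are chosen) is not described in the text, so the sketch assumes a leave-one-out identification and a verification over all unordered pairs; the EER is approximated by the threshold where FAR and FRR are closest. All names and the toy descriptors are hypothetical.

```python
def rank1_recognition_rate(D, labels):
    """Fraction of objects whose nearest other object has the same label."""
    n = len(labels)
    hits = 0
    for i in range(n):
        nearest = min((j for j in range(n) if j != i), key=lambda j: D[i][j])
        hits += labels[nearest] == labels[i]
    return hits / n


def equal_error_rate(D, labels):
    """Approximate EER: mean of FAR and FRR at the threshold where they are closest."""
    n = len(labels)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    genuine = [D[i][j] for i, j in pairs if labels[i] == labels[j]]
    impostor = [D[i][j] for i, j in pairs if labels[i] != labels[j]]
    best_gap, best_rate = None, None
    for t in sorted(set(genuine + impostor)):
        frr = sum(g > t for g in genuine) / len(genuine)      # genuine pairs rejected
        far = sum(m <= t for m in impostor) / len(impostor)   # impostor pairs accepted
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_rate = gap, (far + frr) / 2.0
    return best_rate


# Hypothetical one-dimensional descriptors for two classes of two objects each.
labels = ["a", "a", "b", "b"]
descriptors = [0.0, 0.1, 1.0, 1.2]
D = [[abs(x - y) for y in descriptors] for x in descriptors]

r1 = rank1_recognition_rate(D, labels)
eer = equal_error_rate(D, labels)
```

In this well-separated toy example every object's nearest neighbour belongs to its own class and a threshold exists that separates genuine from impostor pairs, so `r1` is 1.0 and `eer` is 0.0.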

Fig. 4. Results of standard validation experiments on the TOSCA database with CMC (a; recognition rate [%] against rank) and ROC (b; FRR against FAR). Object recognition with a baseline algorithm (thin solid line) is compared to object recognition using MDS (dash-dot line), histogram comparison of PWA (dotted line) and all values (dashed line) and modal representation (thick solid line).

Fig. 5. The R1 RR (rank 1 recognition rate [%]) is plotted against the number of eigenvalues (in log scale) used in the shape descriptor

Table 3. Comparison of different dissimilarity measures as defined in Tab. 1

Dissimilarity measure             D1       D2       D3       D4       D5       D6       D7       D8
PWA value histogram comparison
  R1 RR                           45.08%   54.92%   45.08%   54.92%   46.72%   56.56%   63.11%   20.49%
  EER                             18.68%   15.83%   25.31%   15.69%   34.68%   23.13%   16.93%   42.07%
All value histogram comparison
  R1 RR                           67.21%   69.67%   47.54%   69.67%   58.20%   72.13%   66.39%   20.49%
  EER                             14.95%   15.26%   21.01%   15.26%   19.63%   14.90%   16.94%   48.37%
Modal representation
  R1 RR                           84.43%   100.0%   85.25%   100.0%   54.92%   76.23%   97.54%   33.79%
  EER                             10.11%   2.43%    10.09%   2.44%    20.33%   10.74%   7.74%    34.18%

To show the influence of different weightings of the GDM, we also tabulate the rank 1 recognition rate and the equal error rate for the weighting functions introduced in Sect. 3. The results can be found in Tab. 4, using the abbreviations G1 to G4 from Sect. 3. Every weighting reduces the rank 1 recognition rate of both methods, sometimes quite drastically; only the EER of the all value histogram comparison improves slightly for G2 and G3.

Table 4. Comparison of different weighting functions of the GDM as defined in Sect. 3

                                  G1       G2       G3       G4
All value histogram comparison
  R1 RR                           72.13%   70.49%   70.49%   69.67%
  EER                             14.90%   14.01%   14.14%   15.42%
Modal representation
  R1 RR                           100.0%   97.54%   71.31%   90.98%
  EER                             2.43%    3.47%    17.79%   12.25%

All results clearly show that the modal representation of the geodesic distance matrices provides the highest performance. We also note that all methods using geodesic distance matrices perform better than the baseline algorithm.

5 Conclusions

In this article, different methods using geodesic distance matrices are compared. Amongst all the representations and methods, the modal approach outperforms the others: the geodesic histogram based representation, the MDS approach and the baseline ICP algorithm. For the TOSCA database a rank 1 recognition rate of 100% is obtained. As future work, we propose to further exploit the modal decomposition method in order to obtain correspondences between different objects. This can be done using the eigenvectors or singular vectors, based on the method of Shapiro and Brady [11].

References

1. AIM@SHAPE: SHREC - 3D shape retrieval contest, http://www.aimatshape.net/event/SHREC
2. Elad, A., Kimmel, R.: On bending invariant signatures for surfaces. IEEE Transactions on Pattern Analysis and Machine Intelligence 25(10), 1285–1295 (2003)
3. Hamza, A.B., Krim, H.: Geodesic object representation and recognition. In: Nyström, I., Sanniti di Baja, G., Svensson, S. (eds.) DGCI 2003. LNCS, vol. 2886, pp. 378–387. Springer, Heidelberg (2003)
4. Jain, V., Zhang, H.: A spectral approach to shape-based retrieval of articulated 3D models. Computer-Aided Design 39(5), 398–407 (2007)
5. Peyré, G., Cohen, L.D.: Heuristically driven front propagation for fast geodesic extraction. Intl. Journal for Computational Vision and Biomechanics 1(1), 55–67
6. Carcassoni, M., Hancock, E.R.: Spectral correspondence for point pattern matching. Pattern Recognition 36, 193–204 (2003)
7. Bronstein, A.M., Bronstein, M.M., Kimmel, R.: Expression-invariant 3D face recognition. In: Kittler, J., Nixon, M.S. (eds.) AVBPA 2003. LNCS, vol. 2688, pp. 62–69. Springer, Heidelberg (2003)
8. Lin, J.: Divergence measures based on the Shannon entropy. IEEE Transactions on Information Theory 37(1), 145–151 (1991)
9. Bronstein, A., Bronstein, M., Kimmel, R.: Numerical Geometry of Non-Rigid Shapes. Springer, Heidelberg (2008)
10. Besl, P.J., McKay, H.D.: A method for registration of 3-D shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence 14(2), 239–256 (1992)
11. Shapiro, L.S., Brady, J.M.: Feature-based correspondence: an eigenvector approach. Image Vision Comput. 10(5), 283–288 (1992)