
Reducing the time complexity of Minkowski sum based similarity calculations by using geometric inequalities

Henk Bekker, Axel Brink

Institute for Mathematics and Computing Science, University of Groningen, P.O.B. 800 9700 AV Groningen, The Netherlands. [email protected], [email protected]

Abstract. The similarity of two convex polyhedra A and B may be calculated by evaluating the volume or mixed volume of their Minkowski sum over a specific set of relative orientations. These relative orientations are characterized by the fact that faces and edges of A and B are parallel as much as possible. For one of these relative orientations the similarity measure is optimal. In this article we propose and test a method to reduce the number of relative orientations to be considered by using geometric inequalities in the slope diagrams of A and B. In this way the time complexity is reduced from $O(n^6)$ to $O(n^{4.5})$. This is derived, and verified experimentally.

1 Introduction: Minkowski-sum based similarity measures

Because shape comparison is of fundamental importance in many fields of computer vision, many families of methods to calculate the similarity of two shapes have been proposed. Well-known families are based on the Hausdorff metric, on contour descriptors and on moments of the object; see [1] for an overview. Recently, a new family of methods has been introduced, based on the Brunn-Minkowski inequality and its descendants. The central operation of these methods is the minimization of a volume or mixed volume functional over a set of relative orientations [2]. The approach is defined for convex objects, and can be used to calculate many types of similarity measures. Moreover, it is invariant under translation and rotation and, when desired, under scaling and reflection. The methods may be used in a space of any dimension, but we will concentrate on the 3D case. Experiments with these methods have been performed on 2D polygons and 3D polyhedra [3,4], and show that for polygons the time consumption is low. However, already for 3D polyhedra of moderate complexity in terms of the number of faces, edges and vertices the time consumption is prohibitive. In this article we present a method to reduce the time complexity of these calculations by reducing the number of relative orientations to be considered.

The structure of this article is as follows. In this section we introduce the Minkowski sum, the notion of mixed volume, the Brunn-Minkowski inequalities, and derive some example similarity measures. In section two we introduce the slope diagram representation of convex polyhedra, define the set of critical orientations to be considered, present the current algorithm to calculate a similarity measure, and discuss its time complexity. In section three we introduce and test the new and more efficient algorithm, and we derive its theoretical time complexity.

Fig. 1. Two polyhedra A and B and their Minkowski sum C. C is drawn on half the scale of A and B.

Let us consider two convex polyhedra A and B in 3D. The Minkowski sum C of A and B is another polyhedron, generally with more faces, edges and vertices than A and B, see figure 1. It is defined as

$$C \equiv A \oplus B \equiv \{a + b \mid a \in A,\ b \in B\}. \tag{1}$$
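The definition in (1) can be tried out numerically. The following is a minimal sketch for the 2D case only (convex polygons instead of polyhedra), assuming that each polygon is given by its vertex list; it relies on the fact that the vertices of the Minkowski sum of two convex polygons are among the pairwise sums of their vertices. The function names `convex_hull`, `minkowski_sum` and `polygon_area` are illustrative and do not come from the article.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns the hull in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def minkowski_sum(A, B):
    """Minkowski sum of two convex polygons: hull of all pairwise vertex sums."""
    return convex_hull([(ax + bx, ay + by) for ax, ay in A for bx, by in B])


def polygon_area(poly):
    """Shoelace formula for a simple polygon."""
    closed = poly[1:] + poly[:1]
    return abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(poly, closed))) / 2


square = [(0, 0), (1, 0), (1, 1), (0, 1)]
triangle = [(0, 0), (1, 0), (0, 1)]
C = minkowski_sum(square, triangle)
print(C, polygon_area(C))
```

Forming all pairwise sums is quadratic in the number of vertices, which is acceptable for illustration; it is not meant as an efficient construction.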

This definition does not give much geometrical insight into how C is formed from A and B. To get some feeling for that, we separately look at two properties of C, namely its shape and its position. The shape of C may be defined by a sweep process as follows. Choose some point p in A, and sweep space with translates of A such that p is in B. C consists of all points that are swept by translates of A. The same shape C results when A and B are interchanged. The position of C is, roughly speaking, the vectorial sum of the positions of A and B. More precisely, the rightmost coordinate of C is the sum of the rightmost coordinates of A and B, and analogously for the leftmost, uppermost and lowermost coordinates of C. In this article only the shape of C plays a role, not its position. Obviously, the shape and volume of C depend on the relative orientation of A and B. The volume of C may be written as

$$V(C) = V(A \oplus B) = V(A) + 3V(A, A, B) + 3V(A, B, B) + V(B). \tag{2}$$

Here, $V(A)$ and $V(B)$ are the volumes of A and B, and $V(A, A, B)$ and $V(A, B, B)$ are the mixed volumes, introduced by Minkowski [6]. Geometrically it is not obvious how the volumes of A and B and the mixed volumes add up to the volume of C. However, it can be shown that $V(A, A, B)$ is proportional to the area of A and the linear dimension of B, and $V(A, B, B)$ is proportional to the linear dimension of A and the area of B.
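For a special case the decomposition (2) can be checked by hand. The sketch below assumes that A and B are axis-aligned boxes with side lengths $(a_1, a_2, a_3)$ and $(b_1, b_2, b_3)$; their Minkowski sum is then the box with side lengths $a_i + b_i$, and expanding the product $\prod_i (a_i + b_i)$ gives the mixed volumes as the terms that are quadratic in one box and linear in the other. The function names are illustrative only.

```python
def box_volume(a):
    """Volume of an axis-aligned box with side lengths a = (a1, a2, a3)."""
    return a[0] * a[1] * a[2]


def mixed_volumes_of_boxes(a, b):
    """Return (V(A,A,B), V(A,B,B)) for two axis-aligned boxes.

    3*V(A,A,B) collects the terms with two sides of A and one side of B,
    i.e. 'area of A times linear dimension of B'; V(A,B,B) is the mirror case.
    """
    v_aab = (a[0] * a[1] * b[2] + a[0] * a[2] * b[1] + a[1] * a[2] * b[0]) / 3.0
    v_abb = (a[0] * b[1] * b[2] + a[1] * b[0] * b[2] + a[2] * b[0] * b[1]) / 3.0
    return v_aab, v_abb


def minkowski_sum_volume_of_boxes(a, b):
    """Volume of A (+) B: the sum of two aligned boxes is a box with sides a_i + b_i."""
    return (a[0] + b[0]) * (a[1] + b[1]) * (a[2] + b[2])


a, b = (1, 2, 3), (4, 5, 6)
v_aab, v_abb = mixed_volumes_of_boxes(a, b)
lhs = minkowski_sum_volume_of_boxes(a, b)
rhs = box_volume(a) + 3 * v_aab + 3 * v_abb + box_volume(b)
print(lhs, rhs)
```

For general polyhedra no such closed form is available; the box case only serves to make the structure of (2) concrete.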

As an example we derive two typical similarity measure expressions, based on the following two theorems [3,6]:

Theorem 1: For two arbitrary convex polyhedra A and B in $\mathbb{R}^3$,

$$V(A, A, B)^3 \geq V(A)^2\, V(B) \tag{3}$$

with equality if and only if A = B.

Theorem 2: For two arbitrary convex polyhedra A and B in $\mathbb{R}^3$,

$$V(A \oplus B) \geq 8\, V(A)^{1/2}\, V(B)^{1/2} \tag{4}$$

with equality if and only if A = B.

From these theorems the similarity measures $\sigma_1$ and $\sigma_2$ respectively may be derived in a straightforward way,

$$\sigma_1(A, B) \equiv \max_{R \in \mathcal{R}} \frac{V(A)^{2/3}\, V(B)^{1/3}}{V(R(A), R(A), B)} \tag{5}$$

$$\sigma_2(A, B) \equiv \max_{R \in \mathcal{R}} \frac{8\, V(A)^{1/2}\, V(B)^{1/2}}{V(R(A) \oplus B)}. \tag{6}$$

Here the maximum ranges over the set of all spatial rotations, and R(A) denotes A rotated by a rotation R from this set. Because the volumes in these equations are always positive, $\sigma_1$ and $\sigma_2$ are always positive and ≤ 1, with equality if and only if A = B. Besides the inequalities in theorem 1 and theorem 2 many other inequalities exist, some based on the volume of the Minkowski sum, some on the mixed volume, some on the area of the Minkowski sum or the mixed area. From each of these inequalities a similarity measure may be derived. In this article we concentrate on computing $\sigma_1$; the technique presented here to speed up this computation may be applied to other Minkowski sum based similarity calculations as well.
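To see why the maximization over rotations matters, the ratio inside (5) can be evaluated for a toy case. The sketch below assumes axis-aligned boxes, for which the mixed volume has a closed form, and tries only the six axis permutations of A (each corresponds to a rotation by multiples of 90 degrees). It therefore yields a lower bound on $\sigma_1$, not $\sigma_1$ itself; all function names are hypothetical.

```python
from itertools import permutations


def box_volume(a):
    """Volume of an axis-aligned box with side lengths a = (a1, a2, a3)."""
    return a[0] * a[1] * a[2]


def mixed_volume_aab(a, b):
    """V(A, A, B) for two axis-aligned boxes (closed form, boxes only)."""
    return (a[0] * a[1] * b[2] + a[0] * a[2] * b[1] + a[1] * a[2] * b[0]) / 3.0


def sigma1_ratio(a, b):
    """The ratio inside the max of (5) for one fixed relative orientation."""
    return box_volume(a) ** (2 / 3) * box_volume(b) ** (1 / 3) / mixed_volume_aab(a, b)


def sigma1_lower_bound(a, b):
    """Maximum of the ratio over the six axis permutations of A only."""
    return max(sigma1_ratio(p, b) for p in permutations(a))


a, b = (1, 2, 3), (3, 2, 1)   # the same box, stored with a different axis order
print(sigma1_ratio(a, b))       # unaligned orientation: clearly below 1
print(sigma1_lower_bound(a, b)) # best axis permutation aligns the boxes
```

In the unaligned orientation the ratio is about 0.69, while the best axis permutation brings the two boxes into coincidence and the ratio reaches 1 up to rounding.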

2 Calculating the similarity measure straightforward

To find the maximum in (5), in principle an infinite number of orientations of A has to be checked. That would make this similarity measure useless for practical purposes. Fortunately, as is shown in [3], to find the maximum value only a finite number of relative orientations of A and B has to be checked. Roughly speaking, these orientations are characterized by the fact that edges of B are as much as possible parallel to faces of A. To formulate this more precisely we use the slope diagram representation (SDR) of polyhedra. We denote face i of polyhedron A by $F_i(A)$, edge j by $E_j(A)$, and vertex k by $V_k(A)$. The SDR of a polyhedron A, denoted by SDR(A), is a subdivision of the unit sphere. A vertex of A is represented in SDR(A) by the interior of a spherical polygon, an edge by an arc of a great circle, and a face by a vertex of a spherical polygon, see figure 2. To be more precise:

– Face representation. $F_i(A)$ is represented on the sphere by a point $\mathrm{SDR}(F_i(A))$, located at the intersection of the outward unit normal vector $u_i$ of $F_i(A)$ with the unit sphere.
– Edge representation. An edge $E_j(A)$ is represented by the arc of the great circle connecting the two points corresponding to the two faces adjacent to $E_j(A)$.
– Vertex representation. A vertex $V_k(A)$ is represented by the interior of the polygon bounded by the arcs corresponding to the edges of A meeting at $V_k(A)$.

Some remarks. From this description it can be seen that the graph representing SDR(A) is the dual of the graph representing A. SDR(A) is not a complete description of A; it only contains angle information about A. Obviously, when A is rotated by a rotation R, the slope diagram representation rotates in the same way, i.e., SDR(R(A)) = R(SDR(A)). In the following, when speaking about distance in an SDR we mean spherical distance, i.e. the length of an arc on the unit sphere. Because the angle between two adjacent faces of a polyhedron is always less than π, the length of the arcs in an SDR is always less than π.
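The face and edge representations above can be sketched in a few lines. The following is an illustrative construction, not the authors' implementation: it assumes a convex polyhedron given as a vertex list plus faces as lists of vertex indices (in order around each face), takes the outward unit normals as SDR points, and creates an arc for every pair of faces that share an edge. The function name `slope_diagram` and the data layout are assumptions.

```python
import math
from itertools import combinations


def sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def slope_diagram(vertices, faces):
    """Return (points, arcs): unit outward normals and (i, j, arc_length) triples."""
    n = len(vertices)
    centroid = tuple(sum(v[k] for v in vertices) / n for k in range(3))
    points = []
    for face in faces:
        p0, p1, p2 = (vertices[i] for i in face[:3])
        normal = cross(sub(p1, p0), sub(p2, p0))
        # Flip the normal if it points towards the interior of the convex body.
        if dot(normal, sub(p0, centroid)) < 0:
            normal = tuple(-c for c in normal)
        length = math.sqrt(dot(normal, normal))
        points.append(tuple(c / length for c in normal))
    arcs = []
    for i, j in combinations(range(len(faces)), 2):
        # Two faces of a convex polyhedron are adjacent when they share an edge.
        if len(set(faces[i]) & set(faces[j])) == 2:
            cos_angle = max(-1.0, min(1.0, dot(points[i], points[j])))
            arcs.append((i, j, math.acos(cos_angle)))
    return points, arcs


# Unit cube; vertex index = 4*x + 2*y + z.
cube_vertices = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
cube_faces = [[0, 1, 3, 2], [4, 5, 7, 6], [0, 1, 5, 4],
              [2, 3, 7, 6], [0, 2, 6, 4], [1, 3, 7, 5]]
points, arcs = slope_diagram(cube_vertices, cube_faces)
print(len(points), len(arcs))
```

For the cube this yields six points and twelve arcs, each of length π/2, in line with the remark that the SDR graph is the dual of the polyhedron's graph.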

Fig. 2. (a): A polyhedron A. (b): The slope diagram representation of A. The orientations of A and SDR(A) are the same, so with some patience it should be possible to see how they are related.

The slope diagram representation is useful to represent situations where faces and edges of A are parallel to faces and edges of B. It is easily verified that the faces $F_i(A)$ and $F_j(B)$ are parallel when in the overlay of SDR(A) and SDR(B) the point $\mathrm{SDR}(F_i(A))$ coincides with the point $\mathrm{SDR}(F_j(B))$. Also, an edge $E_i(B)$ is parallel to $F_j(A)$ when the point $\mathrm{SDR}(F_j(A))$ lies on the arc $\mathrm{SDR}(E_i(B))$. The description given earlier, stating that (5) attains its maximum value when edges of B are as much as possible parallel to faces of A, can now be made more precise in terms of their slope diagrams:

Theorem 3: When $\sigma_1$ is maximal then three points of SDR(R(A)) lie on three arcs of SDR(B).

This theorem is derived in [3]. Unfortunately, it does not tell for which three points in SDR(R(A)) and which three arcs in SDR(B) $\sigma_1$ is maximal. Thus, to find the maximum, all rotations R have to be considered for which three points of SDR(R(A)) lie on three arcs of SDR(B). So, for three given points $p_1, p_2, p_3$ in SDR(A) and three arcs $a_1, a_2, a_3$ in SDR(B), an algorithm is needed that calculates a spatial rotation R such that $R(p_1)$ lies on $a_1$, $R(p_2)$ lies on $a_2$ and $R(p_3)$ lies on $a_3$. We developed such an algorithm [5], and implemented it in the function tvt(). It takes as arguments three points and three arcs and calculates a rotation R. It is called as tvt(p1, p2, p3, a1, a2, a3, R). The function tvt() first calculates a rotation R with the property that $R(p_1)$ lies on $c_1$, $R(p_2)$ lies on $c_2$ and $R(p_3)$ lies on $c_3$, where $c_1, c_2, c_3$ are the great circles carrying the arcs $a_1, a_2, a_3$ respectively. When $R(p_1)$ lies on $a_1$, $R(p_2)$ lies on $a_2$ and $R(p_3)$ lies on $a_3$, tvt() returns "true", else it returns "false". The time complexity of tvt() is constant.

Notice that the rotation returned by the call tvt(p1, p2, p3, a1, a2, a3, R) is the same as the rotation returned by the calls tvt(p1, p3, p2, a1, a3, a2, R), tvt(p2, p1, p3, a2, a1, a3, R), tvt(p3, p1, p2, a3, a1, a2, R), tvt(p3, p2, p1, a3, a2, a1, R) and tvt(p2, p3, p1, a2, a3, a1, R). That is because the order of the statements "$R(p_1)$ lies on $a_1$, $R(p_2)$ lies on $a_2$, $R(p_3)$ lies on $a_3$" is irrelevant. In the implementation this observation may be used to gain a factor of six.

Now calculating $\sigma_1(A, B)$ consists of running through all triples of points in SDR(A) and all triples of arcs in SDR(B), calculating for every combination the rotation R, and evaluating $\sigma_1$ for every valid R. The maximum value is the similarity measure $\sigma_1(A, B)$. Assuming that SDR(A) and SDR(B) have been calculated, this results in the following algorithm outline, called algorithm1.

    for all points p1                  // of SDR(A)
      for all points p2 > p1
        for all points p3 > p2
          for all arcs a1              // of SDR(B)
            for all arcs a2
              for all arcs a3
                if (tvt(p1, p2, p3, a1, a2, a3, R)) {
                  sigma1 = Vol(A)^{2/3} Vol(B)^{1/3} / Vol(R(A),R(A),B)
                  if (sigma1 > sigma1_max) { sigma1_max = sigma1 }
                }
    return sigma1_max;

In the implementation it is assumed that the arcs and points are stored in a linearly ordered data structure.
In this data structure, the variable p1 runs through all points, the variable p2 runs through all points greater than p1, and the variable p3 runs through all points greater than p2. In this way irrelevant permutation evaluations are avoided. The time complexity of algorithm1 is easily derived. We assume that A and B are approximately of the same complexity, i.e. have approximately the same number of vertices, edges and faces. We denote the number of faces of A and B as f, and the number of edges of A and B as e. So, the number of points in SDR(A) equals f, and the number of arcs in SDR(B) equals e. Because e is proportional to f, the inner loop is evaluated $O(f^6)$ times. For polyhedra of small and medium complexity the time consumption of tvt() by far exceeds the time consumption of calculating the mixed volume, so the time complexity of the complete algorithm is $O(f^6)$.
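The loop structure of algorithm1 and its call count can be sketched as follows. This is only an outline under stated assumptions: `tvt` and `sigma1_of` are hypothetical callables supplied by the caller (here `tvt` returns a rotation or `None` instead of the boolean plus output argument described above), and neither the rotation solver nor the mixed volume computation is implemented.

```python
from itertools import combinations, product


def algorithm1(points, arcs, tvt, sigma1_of):
    """Brute-force search over point triples and arc triples.

    points, arcs : linearly ordered sequences (SDR(A) points, SDR(B) arcs)
    tvt          : hypothetical callable, returns a rotation or None
    sigma1_of    : hypothetical callable, evaluates the ratio in (5) for a rotation
    Returns (sigma1_max, number_of_tvt_calls).
    """
    sigma1_max = 0.0
    calls = 0
    # combinations() yields p1 < p2 < p3, which removes the six permutations.
    for p1, p2, p3 in combinations(points, 3):
        # All ordered arc triples are tried, as in the nested arc loops.
        for a1, a2, a3 in product(arcs, repeat=3):
            calls += 1
            rotation = tvt(p1, p2, p3, a1, a2, a3)
            if rotation is not None:
                sigma1_max = max(sigma1_max, sigma1_of(rotation))
    return sigma1_max, calls


# Stub run with the sizes of a tetrahedron: f = 4 points, e = 6 arcs.
_, calls = algorithm1(range(4), range(6),
                      tvt=lambda *args: None,
                      sigma1_of=lambda rotation: 0.0)
print(calls)  # C(4, 3) * 6**3 = 864
```

The count is C(f, 3) · e³, which grows as f³e³ and hence as f⁶ when e is proportional to f.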

3 Using geometric inequalities to skip orientations

As explained before, the function tvt() calculates a rotation R with the property that $R(p_1)$ lies on $a_1$, $R(p_2)$ lies on $a_2$ and $R(p_3)$ lies on $a_3$. However, without calling tvt(), it is possible to detect cases where no such R exists. As an example, let us look at two points $p_1$ and $p_2$ with spherical distance $d(p_1, p_2)$, and at two arcs $a_1$ and $a_2$, where $d_{\min}(a_1, a_2)$ and $d_{\max}(a_1, a_2)$ are the minimal and maximal distance between the arcs. Here, $d_{\min}(a_1, a_2)$ is defined as the minimum distance between points $q_1$ and $q_2$ where $q_1$ lies on $a_1$ and $q_2$ lies on $a_2$, i.e., $d_{\min}(a_1, a_2) \equiv \min\{d(q_1, q_2) \mid q_1 \text{ on } a_1,\ q_2 \text{ on } a_2\}$; $d_{\max}(a_1, a_2)$ is defined analogously. Because a rotation preserves spherical distances, $p_1$ can lie on $a_1$ while at the same time $p_2$ lies on $a_2$ only when $d_{\min}(a_1, a_2) \leq d(p_1, p_2) \leq d_{\max}(a_1, a_2)$, see figure 3. This observation may be used to skip calls of tvt(). Of course, the same principle may be used for the other two pairs of points and arcs, i.e., tvt() should only be called when

$$d_{\min}(a_1, a_2) \leq d(p_1, p_2) \leq d_{\max}(a_1, a_2) \tag{7}$$

and

$$d_{\min}(a_2, a_3) \leq d(p_2, p_3) \leq d_{\max}(a_2, a_3) \tag{8}$$

and

$$d_{\min}(a_3, a_1) \leq d(p_3, p_1) \leq d_{\max}(a_3, a_1). \tag{9}$$
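The test (7)-(9) amounts to three table lookups per candidate. The sketch below assumes that points and arcs are identified by integer indices, that the point distances are tabulated from unit vectors, and that tables `d_min` and `d_max` for the arc pairs are already available (their computation is not shown; if the same arc may appear twice in a triple, the tables must also contain entries for the pair (i, i)). All names are illustrative.

```python
import math
from itertools import combinations


def spherical_distance(p, q):
    """Arc length between two unit vectors on the unit sphere."""
    cos_angle = sum(a * b for a, b in zip(p, q))
    return math.acos(max(-1.0, min(1.0, cos_angle)))


def point_distance_table(points):
    """Distances between all pairs of SDR points, keyed by (i, j) with i < j."""
    return {(i, j): spherical_distance(points[i], points[j])
            for i, j in combinations(range(len(points)), 2)}


def lookup(table, i, j):
    """Symmetric lookup: the tables store each unordered pair once."""
    return table[(i, j) if i <= j else (j, i)]


def may_fit(p, a, d_pts, d_min, d_max):
    """Check inequalities (7), (8), (9) for point triple p and arc triple a."""
    for i, j in ((0, 1), (1, 2), (2, 0)):
        d = lookup(d_pts, p[i], p[j])
        if not (lookup(d_min, a[i], a[j]) <= d <= lookup(d_max, a[i], a[j])):
            return False  # no rotation can exist, so tvt() can be skipped
    return True


points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
d_pts = point_distance_table(points)           # all distances equal pi/2
d_min = {(0, 1): 1.0, (1, 2): 1.0, (0, 2): 1.0}  # made-up arc tables
d_max = {(0, 1): 2.0, (1, 2): 2.0, (0, 2): 2.0}
print(may_fit((0, 1, 2), (0, 1, 2), d_pts, d_min, d_max))
```

The test is a necessary condition only: a triple that passes still has to be handed to the rotation solver, which may report that no valid rotation exists.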


Fig. 3. (a): SDR(A) with three marked points $p_1, p_2, p_3$. (b): SDR(B) with three marked arcs $a_1, a_2, a_3$. SDR(A) may be rotated so that in the overlay $R(p_2)$ lies on $a_2$ and $R(p_3)$ lies on $a_3$, but clearly then $R(p_1)$ cannot lie on $a_1$.

In the implementation we calculate the distance between all pairs of points of SDR(A) in a preprocessing phase, and store these distances in a table indexed by two points. In the same way we store the minimal and maximal distance between all pairs of arcs in SDR(B) in tables indexed by two arcs. Now we can give algorithm2.

    fill_distance_tables()
    for all points p1                  // of SDR(A)
      for all points p2 > p1
        for all points p3 > p2
          for all arcs a1              // of SDR(B)
            for all arcs a2
              for all arcs a3
                if (dmin(a1, a2)