Int J Comput Vis (2009) 81: 138–154 DOI 10.1007/s11263-008-0149-1

An Alternative Approach to Computing Shape Orientation with an Application to Compound Shapes

Joviša Žunić · Paul L. Rosin

Received: 9 January 2008 / Accepted: 9 June 2008 / Published online: 28 August 2008 © Springer Science+Business Media, LLC 2008

Abstract We consider the method that computes the shape orientation as the direction α that maximises the integral of the lengths, taken to the power of 2N, of the projections of all the straight line segments whose end points belong to the shape onto a line that has the slope α. We show that for N = 1 such a definition of shape orientation is consistent with the shape orientation defined by the axis of the least second moment of inertia. For N > 1 this is not the case, and consequently our new method can produce different results. As an additional benefit, our approach leads to a new method for computing the orientation of compound objects.

Keywords Shape · Orientation · Image processing · Early vision

1 Introduction

Determining the orientation of a shape is often performed in computer vision to enable subsequent analysis to be carried out in the shape's local frame of reference (thereby

J. Žunić
Department of Computer Science, University of Exeter, Exeter EX4 4QF, UK
e-mail: [email protected]

J. Žunić
Mathematical Institute, Serbian Academy of Sciences and Arts, Belgrade, Serbia

P.L. Rosin
School of Computer Science, Cardiff University, Cardiff CF24 3AA, Wales, UK
e-mail: [email protected]

simplifying that analysis). Some typical examples are: reorienting images so that descriptors from subsequent texture analysis are made rotation invariant (Jafari-Khouzani and Soltanian-Zadeh 2005); simplification of model matching for automatic recognition of military tank targets in infrared images (Chandra 1998); pose estimation of parts for 3D object retrieval (Kim and Kweon 2008); detection of the orientation of nano-objects from scanning electron microscope (SEM) images to enable nanomanipulation for rapid prototyping in the nano-world (Fahlbusch et al. 2005); and estimating the orientation of aggregates in fine-grained soil in order to determine soil properties such as deformation during compaction (Shi et al. 1998).

While the orientations of many shapes are obvious and can be computed easily, the orientation of other shapes may be ambiguous, subtle, or ill defined. Problems related to shape orientability are considered in Žunić et al. (2006). The difficulty of the task can be seen from the multiplicity of mechanisms used in human perception, in which orientation can be determined by axes of symmetry and elongation (Boutsen and Marendaz 2001), gravitational cues (Marendaz et al. 1993), as well as cues from local contour, texture, and context (Palmer 1999). Moreover, shape perception has been shown to be dependent on orientation (Rock et al. 1994), emphasising the importance of orientation estimation.

The most common computational method for determining a shape's orientation is based on the axis of the least second moment of inertia (Jain et al. 1995; Klette and Rosenfeld 2004). Although straightforward and efficient to compute, it breaks down in some circumstances; for example, problems arise when working with symmetric shapes (Tsai and Chou 1991; Žunić et al. 2006). This has encouraged the development of other competing methods (Cortadellas et al. 2004; Ha and Moura 2005; Jain et al. 1995; Kim and Kim 1999; Klette and Rosenfeld 2004; Lin 1993, 1996; Shen and Ip 1996; Tsai and Chou 1991). The suitability of those methods strongly depends on the particular situation in which they are applied. It is not possible to say which of them is the best, or to establish a strict ranking among them, as they each have their relative strengths and weaknesses (e.g. relating to robustness to noise, classes of shape that can be oriented, number of parameters, computational efficiency). A method dominant for one application could fail for another.

In this paper we introduce a new approach to the shape orientation problem. We consider a line that maximises the integral of the squared lengths of the projections, onto this line, of the line segments whose end points belong to the shape. Then we define the orientation of the shape by the slope of such a line. It turns out that such a method for computing shape orientation is consistent with the standard method based on the axis of the least second moment of inertia. More precisely, the standard method defines the shape orientation by the line that minimises the integral of the squared distances of the points in the shape to the line. However, it is well known (Tsai and Chou 1991; Žunić et al. 2006) that the standard method must be modified in order to be applicable to a wider class of shapes. A modification is introduced in Tsai and Chou (1991), where the squared distances are replaced with distances taken to a higher power. If a similar modification (for the same reasons) is applied to our method, then the integral of projections taken to a power higher than 2 should be considered instead of the integral of the squared lengths of projections. It will be shown that the modified standard method and the modification of the method presented here do not coincide; as such, the method presented here is essentially new.
The new approach has the additional benefit that it easily leads to a method for the computation of the orientation of compound shapes; there is little work in the scientific literature concerning this topic. The paper is organised as follows. A short overview of the standard method for computation of shape orientation is given in Sect. 2. Section 3 introduces an alternative approach to the shape orientation problem. The new approach considers the projections of edges whose end points belong to the considered shape rather than the distance of shape points to a line. A proof that the new method for computation of shape orientation is equivalent to the standard method is given as well. In Sect. 4 we extend the method to compound shapes. A formula for the computation of orientation of compound shapes is derived and several notes describing how the new method works in practice are given. Two modifications of the method are also given. Those modifications enable the control of the impact of the component sizes on the overall shape orientation. Experiments and detailed


discussion are provided as well. Section 5 deals with situations where the method does not work and proposes a modification of the method in order to overcome such situations. Such a modified method could be applied to a wider class of shapes but needs optimization of a more complex function. Two properties of such a modified method are proven and several experiments are given. Concluding comments are in Sect. 6.

2 The Standard Method

The most common method for computation of shape orientation is based on the axis of the least second moment of inertia. The axis of the least second moment of inertia (Jain et al. 1995; Klette and Rosenfeld 2004) is the line which minimises the integral of the squares of the distances of the points (belonging to the shape) to the line. The integral is

I(α, S, ρ) = ∫_S r²(x, y, α, ρ) dx dy   (1)

where r(x, y, α, ρ) is the perpendicular distance from the point (x, y) to the line given in the form

X · sin α − Y · cos α = ρ.   (2)

It is well known that the axis of the least moment of inertia passes through the shape centroid. The centroid is computed as (m1,0(S)/m0,0(S), m0,1(S)/m0,0(S)), where m0,0(S) is the zeroth order moment of S and m1,0(S) and m0,1(S) are the first order moments of S. In general, the moment mp,q(S) and the centralised moment m̄p,q(S) are defined as

mp,q(S) = ∫_S x^p · y^q dx dy,

m̄p,q(S) = ∫_S (x − m1,0(S)/m0,0(S))^p · (y − m0,1(S)/m0,0(S))^q dx dy.   (3)

Both mp,q(S) and m̄p,q(S) have the order p + q. Thus, if the shape S′ is the translation of S by the vector −(m1,0(S)/m0,0(S), m0,1(S)/m0,0(S)) (such that the centroid of S′ coincides with the origin) then it is possible to set ρ = 0 and consider the minimisation of I(α, S′, ρ = 0) instead of the minimisation of I(α, S, ρ). Since the squared distance r²(x, y, α, ρ = 0) of a point (x, y) to the line X · sin α − Y · cos α = 0 is

(x · sin α − y · cos α)²   (4)
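On a rasterised shape the moment integrals in (3) reduce to sums over foreground pixels. The following sketch (not from the paper; a NumPy illustration in which each pixel contributes unit area at its centre) computes the moments used throughout:

```python
import numpy as np

def central_moments(mask):
    """Moments of a binary image via (3): pixel sums approximate the
    integrals, each foreground pixel contributing unit area."""
    ys, xs = np.nonzero(mask)
    m00 = float(len(xs))
    cx, cy = xs.mean(), ys.mean()          # centroid (m10/m00, m01/m00)
    dx, dy = xs - cx, ys - cy
    return {"m00": m00, "cx": cx, "cy": cy,
            "mbar20": float(np.sum(dx * dx)),
            "mbar02": float(np.sum(dy * dy)),
            "mbar11": float(np.sum(dx * dy))}
```

For an axis-aligned rectangle the product moment m̄1,1 vanishes, as expected from symmetry.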

and, to shorten notation, we let F(α, S) = I(α, S′, ρ = 0), the function that should be minimised in order to compute the orientation of S is

F(α, S) = ∫_S ((x − m1,0(S)/m0,0(S)) · sin α − (y − m0,1(S)/m0,0(S)) · cos α)² dx dy

= sin²α · ∫_S (x − m1,0(S)/m0,0(S))² dx dy + cos²α · ∫_S (y − m0,1(S)/m0,0(S))² dx dy − sin(2α) · ∫_S (x − m1,0(S)/m0,0(S)) · (y − m0,1(S)/m0,0(S)) dx dy

= sin²α · m̄2,0(S) + cos²α · m̄0,2(S) − sin(2α) · m̄1,1(S).   (5)

The angle α for which the function F(α, S) (i.e. the integrals I(α, S′, ρ = 0) and I(α, S, ρ)) reaches its minimum defines the orientation of the shape S. We give a formal definition.

Definition 1 The orientation of a given shape S is given by the angle α for which the function F(α, S) reaches its minimum.

Shape orientation, as given by Definition 1, is easy to compute. It is enough to set the first derivative dF(α, S)/dα equal to zero, i.e.

dF(α, S)/dα = m̄2,0(S) · sin(2α) − m̄0,2(S) · sin(2α) − 2 · m̄1,1(S) · cos(2α) = 0   (6)

which yields that the angle α defining the orientation of S satisfies the following equation:

sin(2α)/cos(2α) = 2 · m̄1,1(S) / (m̄2,0(S) − m̄0,2(S)).   (7)
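Equation (7) determines α directly from three central moments; using a two-argument arctangent on the numerator and denominator of (7) resolves the quadrant. A minimal sketch (an assumed NumPy helper, not the authors' code):

```python
import numpy as np

def shape_orientation(mbar11, mbar20, mbar02):
    """Angle alpha minimising F(alpha, S), from (7):
    tan(2*alpha) = 2*mbar11 / (mbar20 - mbar02)."""
    return 0.5 * np.arctan2(2.0 * mbar11, mbar20 - mbar02)
```

For instance, a shape spread mainly along the x-axis (m̄2,0 > m̄0,2, m̄1,1 = 0) yields α = 0, and equal spreads with positive m̄1,1 yield α = π/4.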

3 An Alternative Approach

In this section we start with a different motivation for the definition of shape orientation. It will turn out that this approach in its basic variant (where squared lengths of edge projections are considered) leads to the same shape orientation as that computed from the axis of the least moment of inertia, but the use of a higher exponent on the lengths of the edge projections leads to an essentially new method. As an additional benefit, this new approach leads to a method for the computation of the orientation of compound objects.

Let S be a given shape, and consider the line segments [AB] whose end points A and B are points from S. Let a⃗ denote the unit vector in the direction α, i.e., a⃗ = (cos α, sin α). Also, let pr_a⃗[AB] be the projection of the line segment [AB] onto a line that makes an angle α with the x-axis, while |pr_a⃗[AB]| denotes the length of such a projection (the notation is illustrated in Fig. 1). Then we define the orientation of the shape S by the direction that maximises the integral of the squared lengths of the projections of all such segments onto a line having this direction. We give a formal definition.

Fig. 1 Projections of all the line segments whose endpoints lie in S are considered, irrespective of whether the line segment intersects the boundary of S (e.g. the line segment [CD]) or not (e.g. [AB])

Definition 2 The orientation of a given shape S is defined by the angle α where the function

G(α, S) = ∫_{A=(x,y)∈S} ∫_{B=(u,v)∈S} |pr_a⃗[AB]|² dx dy du dv   (8)

reaches its maximum.

Informally speaking, Definition 2 defines the orientation of a given shape S by the slope of a line that maximises the total sum of squared lengths of the projections of all straight line segments whose end points belong to S (see Fig. 1). Even though Definition 1 and Definition 2 come from different motivations, it turns out that they are equivalent. Indeed, Theorem 1 (the proof is given in the Appendix) shows that for a fixed shape S the quantity

G(α, S) + 2 · m0,0(S) · F(α, S)   (9)

is the same for all α ∈ [0, 2π). Furthermore, this implies that the maximum of G(α, S) and the minimum of F(α, S) are reached at the same α values. In other words, the orientations computed by Definition 1 and Definition 2 are consistent even though the functions F(α, S) and G(α, S) are different.

Theorem 1 For a given shape S,

G(α, S) + 2 · m0,0(S) · F(α, S) = 2 · m0,0(S) · (m̄2,0(S) + m̄0,2(S))   (10)
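Theorem 1 can be checked numerically on a discrete point set, where the number of points plays the role of m0,0(S) and the integrals in (5) and (8) become sums. A small self-contained check (an illustration of the identity, not the paper's proof):

```python
import numpy as np

rng = np.random.default_rng(0)
pts = rng.normal(size=(60, 2)) * np.array([3.0, 1.0])   # a discrete "shape"
n = len(pts)                                            # plays the role of m00
d = pts - pts.mean(axis=0)

def F(alpha):
    """Sum of squared point-to-line distances, cf. (5)."""
    return float(np.sum((d[:, 0] * np.sin(alpha) - d[:, 1] * np.cos(alpha)) ** 2))

def G(alpha):
    """Sum of squared projection lengths over all point pairs [AB], cf. (8)."""
    proj = pts @ np.array([np.cos(alpha), np.sin(alpha)])
    return float(np.sum((proj[:, None] - proj[None, :]) ** 2))

# Theorem 1: G + 2*m00*F does not depend on alpha, hence argmax G = argmin F
vals = [G(t) + 2 * n * F(t) for t in np.linspace(0.0, np.pi, 7)]
assert np.allclose(vals, vals[0])
```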

holds for each α.

To close this section we note that the functions G(α, S) and F(α, S) are connected by the following relation as well:

G(α, S) = 2 · m0,0(S) · F(α + π/2, S).   (11)

(It is easy to verify the above equation.) A consequence of (11) is that the new approach does not lead to a new shape elongation measure. Indeed, if we follow the standard approach, where the elongation is defined as the ratio between the maximum and the minimum of F(α, S), then, analogously, the elongation based on the use of the function G(α, S) can be defined as

max_{α∈[0,2π)} G(α, S) / min_{α∈[0,2π)} G(α, S).   (12)

By using (11) we find that such a computed elongation is equal to the standard elongation measure

max_{α∈[0,2π)} F(α, S) / min_{α∈[0,2π)} F(α, S).   (13)

4 Orientation of Compound Objects

Image analysis often deals with groups of shapes or with shapes that are composed of several parts. The desired properties of the computed orientation of such compound shapes can vary. Sometimes it is reasonable that the computed orientation is derived from the orientations of the components of the compound shape, whereas in other cases it is preferable that the orientation is a global property of the whole compound object. In this section we introduce a new definition for computing the orientation of such compound shapes and give some examples of computed orientations. The new definition is actually a generalisation of Definition 2.

Definition 3 Let S be a compound object which consists of m disjoint shapes S1, S2, …, Sm. Then the orientation of S is defined by the angle that maximises the function Gcomp(α, S) defined by

Gcomp(α, S) = Σ_{i=1}^{m} ∫_{A=(x,y)∈Si} ∫_{B=(u,v)∈Si} |pr_a⃗[AB]|² dx dy du dv.   (14)

The above definition enables an easy computation of the defined orientation. That is the result of the following theorem, whose proof is given in the Appendix.

Theorem 2 The angle α where the function Gcomp(α, S) reaches its maximum satisfies the following equation:

sin(2α)/cos(2α) = (2 · Σ_{i=1}^{m} m̄1,1(Si) · m0,0(Si)) / (Σ_{i=1}^{m} (m̄2,0(Si) − m̄0,2(Si)) · m0,0(Si)).   (15)

The following five notes point out some properties of the Gcomp(α, S) measure.

Note 1 The standard method (i.e. the one based on F(α, S), and therefore the equivalent one based on G(α, S) as well) for computing orientation breaks down when m̄1,1(S) = m̄2,0(S) − m̄0,2(S) = 0, because under these conditions the functions F(α, S) and G(α, S) become constant functions (see (5) and (10)). Consequently no direction can be selected as the shape orientation. Analogously, when the following conditions hold

Σ_{i=1}^{m} m0,0(Si) · m̄1,1(Si) = Σ_{i=1}^{m} m0,0(Si) · (m̄2,0(Si) − m̄0,2(Si)) = 0   (16)

then (see (35)) Gcomp(α, S) = Σ_{i=1}^{m} 2 · m0,0(Si) · m̄2,0(Si) does not depend on α and, consequently, the orientation of the compound shape S = S1 ∪ ⋯ ∪ Sm cannot be computed by Gcomp(α, S). For a compound shape S = S1 ∪ ⋯ ∪ Sm, if G(α, Si) is constant for all Si (1 ≤ i ≤ m) then Gcomp(α, S) is also constant. However, it is possible to find shapes whose components, or whose overall shape, are unorientable by the method based on G(α, S) (i.e. F(α, S)) but not by the method based on Gcomp(α, S). Examples¹ of all these cases are shown in Table 1.

Note 2 Any component Si of a compound shape S = S1 ∪ ⋯ ∪ Sm that is considered unorientable by G(α, S) (i.e. G(α, Si) = const.) will not contribute to (15), and is therefore ignored in the computation of Gcomp(α, S). That is because G(α, Si) = const. implies m̄1,1(Si) = 0 and m̄2,0(Si) = m̄0,2(Si).

¹ All the examples in Table 1 are simple shapes that are unorientable due to their local or global symmetry. Only the third bottom shape required precise dimensions in order to make G(α, Si) constant. The horizontal bars are unit thickness, two units apart, and √7 units wide.


Table 1 Examples of multiple-component shapes that are orientable (√) or unorientable (×) according to G(α, S) and Gcomp(α, S). Also shown is whether the individual components are orientable by G(α, Si). Situations presented in the second and fourth row are impossible

[Table 1: the example shapes and their √/× entries under the columns "G(α, S) per component", "G(α, S)" and "Gcomp(α, S)" are not recoverable from this extraction; only the caption survives.]
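Theorem 2's formula (15), and the re-weighted variants (18) and (20) introduced in Note 4 below, differ only in the power of m0,0(Si) used to weight each component's central moments. A hypothetical NumPy helper covering all three (weight exponents w = 1, −1, −2 respectively; a sketch operating on point sets, not the authors' code):

```python
import numpy as np

def compound_orientation(components, w=1):
    """Orientation of a compound shape: w = 1 follows (15),
    w = -1 follows (18), w = -2 follows (20).
    `components` is a list of (N_i, 2) arrays of points (rasterised shapes),
    so N_i plays the role of the area m00(S_i)."""
    num = den = 0.0
    for pts in components:
        m00 = float(len(pts))
        d = pts - pts.mean(axis=0)
        mbar20 = np.sum(d[:, 0] ** 2)
        mbar02 = np.sum(d[:, 1] ** 2)
        mbar11 = np.sum(d[:, 0] * d[:, 1])
        num += 2.0 * mbar11 * m00 ** w
        den += (mbar20 - mbar02) * m00 ** w
    return 0.5 * np.arctan2(num, den)
```

Two horizontal components give orientation 0 wherever they are placed, since translation does not affect the central moments.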

Note 3 If all components of S have identical orientation according to G(α, S) then this same orientation is also computed by Gcomp(α, S).

Note 4 From (15) it can be seen that the components of S contribute with a weight proportional to m0,0(Si)³. The cubic weighting given to the components in (15) seems excessive since it tends to cause the larger components to strongly dominate the computed orientation. For instance, let a compound object S consist of a shape S1 and a shape S2 which is the dilation of a shape S2′ by a factor r, i.e. S2 = r · S2′ = {(r · x, r · y) | (x, y) ∈ S2′}; consequently, the size of S2 increases as r increases. Then m0,0(S2) = r² · m0,0(S2′), m̄1,1(S2) = r⁴ · m̄1,1(S2′), m̄2,0(S2) = r⁴ · m̄2,0(S2′), and m̄0,2(S2) = r⁴ · m̄0,2(S2′). Entering these identities into (15) we obtain that the orientation α of the compound shape S should be computed from

sin(2α)/cos(2α) = (2 · m̄1,1(S1) · m0,0(S1) + 2 · m̄1,1(S2) · m0,0(S2)) / ((m̄2,0(S1) − m̄0,2(S1)) · m0,0(S1) + (m̄2,0(S2) − m̄0,2(S2)) · m0,0(S2))

= (2 · m̄1,1(S1) · m0,0(S1) + 2 · r⁶ · m̄1,1(S2′) · m0,0(S2′)) / ((m̄2,0(S1) − m̄0,2(S1)) · m0,0(S1) + r⁶ · (m̄2,0(S2′) − m̄0,2(S2′)) · m0,0(S2′))   (17)

and obviously the influence of S2 on the computed orientation of S could be very big if the dilation factor r is much bigger than 1. This suggests a modification of (15) to enforce instead a linear weighting by area, namely:

sin(2α)/cos(2α) = (2 · Σ_{i=1}^{m} m̄1,1(Si)/m0,0(Si)) / (Σ_{i=1}^{m} (m̄2,0(Si) − m̄0,2(Si))/m0,0(Si)).   (18)

If the orientation α of S = S1 ∪ S2 = S1 ∪ r · S2′ is computed by (18) then it satisfies

sin(2α)/cos(2α) = (2 · m̄1,1(S1)/m0,0(S1) + 2 · r² · m̄1,1(S2′)/m0,0(S2′)) / ((m̄2,0(S1) − m̄0,2(S1))/m0,0(S1) + r² · (m̄2,0(S2′) − m̄0,2(S2′))/m0,0(S2′))   (19)

and obviously the impact of an increase of r on both the numerator and the denominator is smaller than if (15) is applied directly.

It is not difficult to imagine situations where a size change of the components of a compound object should have no effect on the computed orientation. For instance, objects (i.e. components of a compound object) may be the same size in nature, but appear to be of different sizes in the image due to varying distances from the camera. If we would like to avoid any impact of the size of the components on the computed orientation of a compound object then we can use the following formula:

sin(2α)/cos(2α) = (2 · Σ_{i=1}^{m} m̄1,1(Si)/(m0,0(Si))²) / (Σ_{i=1}^{m} (m̄2,0(Si) − m̄0,2(Si))/(m0,0(Si))²).   (20)

In view of the previous simple example, the computed orientation of the compound object S = S1 ∪ S2 = S1 ∪ r · S2′ is then the same for each r > 0. Indeed, if the last formula is applied then the computed orientation α satisfies:


Fig. 2 Orientations of trademarks computed by (15), (18), (7) and the circular mean are listed. They are also displayed as arrows of diminishing size relative to the order: (15), (18), (7), circular mean


sin(2α)/cos(2α) = (2 · m̄1,1(S1)/(m0,0(S1))² + 2 · m̄1,1(S2)/(m0,0(S2))²) / ((m̄2,0(S1) − m̄0,2(S1))/(m0,0(S1))² + (m̄2,0(S2) − m̄0,2(S2))/(m0,0(S2))²).
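The scaling behaviour behind Note 4 can be checked arithmetically: dilating S2′ by r multiplies m0,0 by r² and the second-order central moments by r⁴, so a component's weight in the orientation formula scales by r⁴ · (r²)^w for area exponent w. A small check with illustrative (made-up) moment values:

```python
import numpy as np

# Illustrative moments of an undilated component S2'
r, m00, mbar11 = 3.0, 10.0, 4.0

ratios = {}
for w, eq in [(1, "(15)"), (-1, "(18)"), (-2, "(20)")]:
    # weight of the dilated component relative to the undilated one
    scaled = (r ** 4 * mbar11) * (r ** 2 * m00) ** w
    ratios[eq] = scaled / (mbar11 * m00 ** w)

assert np.isclose(ratios["(15)"], r ** 6)   # cubic area weighting: 729
assert np.isclose(ratios["(18)"], r ** 2)   # linear area weighting: 9
assert np.isclose(ratios["(20)"], 1.0)      # size-invariant
```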

Notes 1–3 relating to (15) also hold for (18), but for the latter the following now applies.

Note 5 The effect of n identical copies of a component Si is equivalent to a single instance of Si with a weighting n. Following on from Note 2, any component that is almost unorientable by G(α, Si) will have little effect on the computation of Gcomp(α, S). Thus we can say that Gcomp(α, S) is not the same as computing a simple circular mean² of the orientations produced by G(α, Si) (i.e. F(α, Si)), since Gcomp(α, S) weights the contributions of components according to both their area and their orientability.

² The circular mean μ of a set of n orientation samples θi is defined as (Mardia and Jupp 1999):

μ = μ′ if S > 0 and C > 0;  μ′ + π if C < 0;  μ′ + 2π if S < 0 and C > 0,   (21)

where μ′ = tan⁻¹(S/C), C = (1/n) · Σ_{i=1}^{n} cos θi, and S = (1/n) · Σ_{i=1}^{n} sin θi. Since there is a 180° ambiguity in orientation a double angle representation is used; this simply requires all orientation angles θi to be doubled.

4.1 Compound Object Orientation Examples

Here we illustrate by several examples how the new method (given by (15)) and its variations described by (18) and (20) work in practice.

First, the new methods for computing orientation are demonstrated on some trademarks in Fig. 2.³ The images were scanned, and so there are minor irregularities. For instance, in Fig. 2a the three components are not identical: they have areas {3807, 3788, 3765} and the orientations computed by G(α, Si) are {62.0°, 62.1°, 61.7°}. Since Gcomp(α, S) gives the orientation 61.9°, this demonstrates that Note 3 also holds approximately for noisy data. The orientation of a perfect circle is impossible to compute. Due to digitisation error, the circles in Fig. 2c are imperfect, and their computed orientations {−12.5°, 2.5°, 59.0°, −18.5°, −0.0°, −8.9°} are fairly random and do not reflect the nature of the original shape. Such "unorientability" by G(α, Si) can be concluded from the values of m̄1,1(Si)/m0,0(Si) and (m̄2,0(Si) − m̄0,2(Si))/m0,0(Si), which are all close to zero: {0.28, 1.19}, {−0.03, 0.61}, {−0.59, −0.63}, {0.36, 0.94}, {0.00, 0.09}, {0.24, 1.48}, whereas the remaining two curvilinear shapes have well defined orientations (both are at −44.4°), as shown by their larger values of m̄1,1/m0,0 and (m̄2,0 − m̄0,2)/m0,0: {534.85, 22.20}, {31.61, 20.60}. When computing (15) and (18) the circular components have negligible effect, and so the resulting orientation is identical to the orientation of the curvilinear components.

³ The images (intensity and binary) used for processing here and elsewhere in this paper are available at http://users.cs.cf.ac.uk/Paul.Rosin.
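The weighted circular mean used for comparison, with the double-angle representation from footnote 2, can be sketched as follows (an assumed helper, not the authors' code):

```python
import numpy as np

def circular_mean_orientation(thetas, weights=None):
    """Weighted circular mean of orientations (radians); the double-angle
    representation handles the 180-degree ambiguity of orientation."""
    thetas = np.asarray(thetas, dtype=float)
    w = np.ones_like(thetas) if weights is None else np.asarray(weights, dtype=float)
    S = np.sum(w * np.sin(2.0 * thetas))
    C = np.sum(w * np.cos(2.0 * thetas))
    return 0.5 * np.arctan2(S, C)
```

Orientations just either side of ±90° average to 90° rather than to 0°, which is the behaviour the double-angle trick is there to guarantee.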


Fig. 3 Natural images containing multiple components with orientations overlaid. Orientations were computed by (18)—shorter dark arrow and (7)—long light arrow

The trademark in Fig. 2g has reflective symmetry, and so the orientation given by Gcomp(α, S) is along the symmetry axis, as opposed to the standard method for estimating orientation. The individual orientations of the components are {52.0°, −52.4°}. When one component is reduced in size (Fig. 2h) the larger component dominates (15) (i.e. the computed orientation (52.22°) of the compound object is very close to the computed orientation (52.0°) of the larger component), whereas this effect is reduced by (18).

Remark It is important to notice that both the compound shapes in Fig. 2g and Fig. 2h have the computed (nearly vertical) orientations −87.9° and −89.3° if (20) is applied. That is because the formula (20) ignores the size of the components of the considered compound shape.

Note 5 is demonstrated in Fig. 2i; four quarter-area components combine to have an identical effect to a single full size component, and the orientation given by (18) is close to −90° again. The larger component still dominates (15). Figure 2j gives an example of a shape that is unorientable by (15) and (18)—see Note 1. The individual component orientations of {65.5°, 185.6°, 305.3°} are well defined; the values (|m̄2,0(Si) − m̄0,2(Si)| + |m̄1,1(Si)|)/m0,0(Si) are around 400. However, |Σ_{i=1}^{3} (m̄2,0(Si) − m̄0,2(Si))/m0,0(Si)| + |Σ_{i=1}^{3} m̄1,1(Si)/m0,0(Si)| is less than 6. This explains the inconsistency between the orientation estimates 24.0° and −4.8° given by (15) and (18), despite the component areas being almost equal. The given orientations 24.0° and −4.8° are caused by the digitisation error rather than by the nature of the object itself. That is also in accordance with the statement of Lemma 1, since the shape presented in Fig. 2j is 3-fold rotationally symmetric. For comparison, the circular mean was calculated, with each component weighted by its area.
Naturally when the orientation is consistent among the components then the circular mean, (15), (18) and (20) will all be approximately the same, e.g. Fig. 2a. Elsewhere there will be substantial differences, for instance in Fig. 2c in which the circular mean

combines the (unreliable) orientations of the circles with the two curvilinear components, resulting in a poor estimate. The new method is further demonstrated on several natural scenes in Fig. 3. Despite the variations in shape, size and orientation of the individual components (e.g. in Fig. 3a the birds have areas {1503, 2311, 2055, 1904, 1942} and orientations {4.8°, 14.4°, 7.7°, 4.2°, 8.0°}), the overall orientations computed by (18) are appropriate. Namely, the computed orientation of each single bird is orthogonal to its flying direction, which is consistent with the computed orientation of the flock of birds—i.e. again the computed orientation is orthogonal to the flying direction. Further difficulties are apparent in Fig. 3b, in which most of the components are at best only moderately oriented, while many have no distinct orientation, leading to considerable variability in the individual orientation estimates. Nevertheless, the overall orientation (formula (18) is used) is correctly determined. In Fig. 3c a shoal is presented. The orientation of each fish in the shoal coincides with the direction of the shoal's motion. The same orientation is obtained if formula (18) is applied to the shoal as a compound object. Similar results hold for the silhouettes of men presented in Fig. 3d. It can be seen that the standard method fails when applied to any of the situations presented in Fig. 3, since it depends on the global arrangement of the components. This is demonstrated on the two images in Fig. 4; the central pair of arrows are the orientations shown previously in Fig. 3. In addition, each image was partitioned into two halves (denoted by the vertical line), and the orientations were calculated separately for each half set of components. Whereas the new method produces a consistent orientation in all three cases, the standard method's orientation varies considerably.
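Rotating a shape into a canonical frame, as used in the experiments that follow, amounts to estimating α by (7) and rotating by −α. A point-set sketch (an assumed representation; the paper itself operates on images):

```python
import numpy as np

def normalise_orientation(pts):
    """Rotate a point-set shape so that the orientation computed by (7)
    becomes horizontal, i.e. move the shape into a canonical frame."""
    d = pts - pts.mean(axis=0)
    alpha = 0.5 * np.arctan2(2.0 * np.sum(d[:, 0] * d[:, 1]),
                             np.sum(d[:, 0] ** 2) - np.sum(d[:, 1] ** 2))
    c, s = np.cos(-alpha), np.sin(-alpha)
    R = np.array([[c, -s], [s, c]])        # rotation by -alpha
    return d @ R.T
```

After normalisation the product moment m̄1,1 of the rotated points is (numerically) zero and the spread lies along the x-axis.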
It was stated earlier that a typical application of orientation estimation is to rotate an object or image into a canonical orientation. We show here two examples of this. The first is in the context of texture classification, in which rotational invariance is generally required. One solution found in the literature (Martinez-Alajarin et al. 2005) is to compute the directional texture features over a range of orientations and then average them. However, this is wasteful since it loses important information regarding the anisotropy of the texture, and therefore suffers a loss in discriminative power. A better approach is to avoid the need for averaging over orientation by pre-processing the texture images to standardise their orientation (Chandra 1998; Jafari-Khouzani and Soltanian-Zadeh 2005). We demonstrate the application of our method for orientation normalisation on images from the texture database provided by Lazebnik et al. (2005). The idea is simply to threshold each texture image and treat the resulting set of binary regions as a compound shape. First, some pre-processing is performed to correct for global illumination effects: an inverted Gaussian filter centred on the DC term is applied in the frequency domain to attenuate the magnitude of very low frequencies. More precisely, the filter is 1 − N(0, 3), where N is scaled to [0, 1]. Next, the image is thresholded using Tsai's moment-preserving algorithm (Tsai 1985). As shown in Fig. 5 the regions exhibit the directionality present in the texture, although there is substantial noise and variability. This is reduced somewhat by performing Gaussian smoothing (σ = 4) on the image prior to thresholding. The orientation estimated from this image (Fig. 5e) is used to rotate the original image into its canonical orientation (Fig. 5f). The algorithm is not sensitive to the selected threshold: a graph of computed orientation versus threshold level in Fig. 6 shows that, without even performing the smoothing step, the orientation hardly varies over a wide range of intensity thresholds. Further examples are shown in Figs. 7 and 8, in which the first ten images from each of the tree bark and brick classes are normalised. Although the bricks are very regular objects, their images still show considerable variability due to the different patterns within the bricks. Although there is no ground truth available for the canonical orientation, it can be seen that the recovered orientations appear mostly accurate.

Fig. 4 Orientations computed separately for the left and right halves of the images, and also for the complete images. Equations: (18)—shorter dark arrow, (7)—long light arrow

Fig. 5 Pre-processing of texture images: (a) original image, (b) original image thresholded, (c) image after background correction, (d) corrected image thresholded, (e) corrected and blurred image thresholded, (f) final rotated image

Fig. 6 Orientations computed at different threshold values for the image in Fig. 5c

A different application of the method is to find components with atypical orientation. For instance, Wei et al. (2006) give examples of using data mining methods to find "shape discords" in various domains, including zoology, anthropology, and medicine. In our case, the simplest approach would be to directly compare a component's orientation against the overall orientation of the full set of components and flag those components whose orientation differs substantially. However, the problem with such a scheme is that shapes with low orientability, and therefore unreliable orientation estimates, may be incorrectly identified as outliers. We suggest a better approach in which each component is tested by excluding it, and then computing the orientation of the remaining components. The difference in the orientation with and without the component then indicates the level of consistency of that component with respect to the other components. The advantage is that shapes with low orientability have little effect on the estimated orientation of the compound shape, as discussed previously. Therefore, only shapes whose orientation differs from that of the compound shape and whose orientation is also well defined (i.e. the shape is orientable) will be detected as outliers. This process

146

Int J Comput Vis (2009) 81: 138–154

Fig. 7 The first 10 images from the tree bark set before and after orientation normalisation plus the binary regions

Fig. 8 The first 10 images from the brick set before and after orientation normalisation plus the binary regions

Fig. 9 Deleting outlier components: (a) the effect of removing a fish on the shoal’s orientation; (b) the first 7 components which are selected to be removed are drawn in gray

can be repeated, so that the most inconsistent component is detected and eliminated, and the reference orientation is recomputed at each iteration.

We return to the shoal of fish in Fig. 3 for the first example. The graph of the difference in orientation caused by removing the least consistent component is shown for up to half the number of components (Fig. 9a). Since there is a wide variation in the size of the fish, we have discounted component area to avoid closer/larger fish dominating the process, and used (20) to compute orientation. Deleting the first seven components has a significant effect on the shoal's orientation; thereafter the magnitudes of the differences stabilise, indicating that the first seven components are outliers while the orientation of the remaining components is consistent with the shoal's orientation. These outliers are outlined in Fig. 9b, and the reasons for their removal can be determined on inspection. One component has been truncated by the image border at the bottom, and therefore has an erroneous shape and orientation. The large double fish above it is a consequence of faulty segmentation; the two fish have been merged into a single component, again with an overall orientation inconsistent with the shoal's. The remaining five outliers can generally be seen to be orientated differently from the shoal, although the discrepancy is subtle in some cases.

Fig. 10 Deleting outlier components: (a) original image; (b) the effect of removing a bird on the flock's orientation; (c) the first 38 components which are selected to be removed are drawn in black

Fig. 11 Two gait sequences. (a), (b) For each sequence the extracted silhouettes are displayed and underneath is an intensity coding of each silhouette to show its degree of being an outlier (dark means high likelihood). (c), (d) A magnified view of the most outlying silhouette from each sequence and its neighbours

A second example of outlier deletion is shown in Fig. 10, in which it is used to refine image segmentation. Since the input image (Fig. 10a) is of low resolution, and the birds are often close or overlapping, the extraction of the bird components by thresholding using Tsai's algorithm is often faulty. Ideally these faults should at least be identified, and if possible rectified by re-segmenting the image in those regions of interest. Since many of the incorrect components can be identified as orientation outliers, this provides a means of performing the first step. The size of the birds is more consistent than in the previous fish example, and so (18) can be used to compute orientation. From the graph in Fig. 10b a cutoff is selected at a change of orientation of 0.3, and the resulting outliers are displayed in gray in Fig. 10c. It can be seen that the majority of the detected outliers correspond to merged birds or other malformed components.

A third and somewhat different example of outlier detection is given in Fig. 11. Here the application is gait recognition, and the binary data is taken from the NLPR Gait Database (Wang et al. 2003). The binary masks were generated and provided by Wang et al. (2003) using background subtraction but, as noted by the authors, many segmentation errors remain. These mainly consist of small holes and spurious background blobs, and as such the silhouettes are often fragmented into multiple components. Although most of these can be readily corrected using standard mathematical morphological operations, there remain larger errors that would need to be identified and processed separately. To apply our method here, the set of blobs in each image frame is considered as a single component. Whereas previously we considered multiple components distributed spatially within the image, in this application the multiple components are distributed temporally across the sequence. Since each frame in the sequence is considered as important as any other, the weighting for each component is set to

be independent of size, and therefore (20) is used to compute orientation. As with the previous fish and bird examples, the expectation is that if the components are fairly consistently oriented, then faulty segmentation is likely to result in atypical component orientations. Two examples are given in Fig. 11. The difference in orientation caused by removing the least consistent component (i.e. image frame) is computed for up to half the number of components, and the frames are replotted with darkness proportional to their difference values. The remaining half of the frames are considered as inliers and their differences are ignored. It can be seen that in Fig. 11a there has been some kind of error in the original processing chain that produced the binary images, and the person’s leading leg has been displaced. In Fig. 11b the quality of change detection at the beginning of the sequence is poor. Figure 11c and Fig. 11d shows these errors more clearly. In both cases these segmentation errors have been identified as orientation outliers (indicated by the darker frames below). Of course, this approach is not guaranteed to find all instances of outliers, since faulty segmentation does not necessarily change a component’s orientation. In Fig. 11b there is a second instance of poor change detection two thirds of the way through the sequence, which has not been clearly identified as containing outliers.
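The leave-one-out test used throughout these experiments is easy to prototype once the compound orientation is available in closed form (cf. the proof of Theorem 2 in the Appendix). The sketch below is ours, not the paper's implementation: components are dense point sets, the function names are invented, and the scale-free weighting is an assumption standing in for the paper's exact formulas (18) and (20):

```python
import numpy as np

def central_moments(pts):
    """Area m00 (point count) and second-order central moments for one
    component given as an (n, 2) array of pixel coordinates."""
    m00 = float(len(pts))
    d = pts - pts.mean(axis=0)
    mu20 = float(np.sum(d[:, 0] ** 2))
    mu02 = float(np.sum(d[:, 1] ** 2))
    mu11 = float(np.sum(d[:, 0] * d[:, 1]))
    return m00, mu20, mu02, mu11

def compound_orientation(components, area_weighted=True):
    """Angle (radians) maximising the compound-shape criterion, via the
    closed form implied by the proof of Theorem 2: tan(2a) = num / den."""
    num = den = 0.0
    for pts in components:
        m00, mu20, mu02, mu11 = central_moments(pts)
        # w = m00 matches the area-weighted sums in (35); the 1/m00**2
        # alternative is one plausible scale-free weighting, not the paper's (20)
        w = m00 if area_weighted else 1.0 / m00 ** 2
        num += w * 2.0 * mu11
        den += w * (mu20 - mu02)
    return 0.5 * np.arctan2(num, den)

def outlier_scores(components, **kw):
    """Leave-one-out consistency: the change in compound orientation when a
    component is excluded (a larger change marks a more atypical component)."""
    ref = compound_orientation(components, **kw)
    scores = []
    for i in range(len(components)):
        rest = components[:i] + components[i + 1:]
        d = compound_orientation(rest, **kw) - ref
        d = (d + np.pi / 2) % np.pi - np.pi / 2   # orientation difference mod pi
        scores.append(abs(d))
    return scores
```

Iterating as in the experiments then amounts to removing the component with the largest score and recomputing the scores on the remainder.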


5 Orientation by a Higher Exponent

In this paper we have used an alternative approach for defining shape orientation. It turned out that the new method, which defines the shape orientation by the line that maximises the integral of the squared lengths of projections of line segments whose end points belong to the shape, coincides with the standard method based on the axis of the least second moment of inertia. The function G(α, S) (introduced by Definition 2) that should be maximised in order to compute the orientation of S can be written (see the proof of Theorem 1 in the Appendix) in the form

\[ G(\alpha, S) = A\cos^2\alpha + B\sin^2\alpha + C\sin 2\alpha, \tag{22} \]

where (by a straightforward use of (32) and (33) and Definition 2)

\[ A = \int_{S\times S} (x^2 - 2xu + u^2)\,dx\,dy\,du\,dv = 2m_{2,0}(S)m_{0,0}(S) - 2m_{1,0}(S)^2, \tag{23} \]

\[ B = \int_{S\times S} (y^2 - 2yv + v^2)\,dx\,dy\,du\,dv = 2m_{0,2}(S)m_{0,0}(S) - 2m_{0,1}(S)^2, \tag{24} \]

\[ C = \int_{S\times S} (xy - xv - yu + uv)\,dx\,dy\,du\,dv = 2m_{1,1}(S)m_{0,0}(S) - 2m_{1,0}(S)m_{0,1}(S). \tag{25} \]

Since (22) is of such a simple form, it is to be expected that there are shapes for which the method cannot be used to compute the shape orientation. It is enough to set the first derivative of G(α, S) equal to zero,

\[ \frac{dG(\alpha, S)}{d\alpha} = (B - A)\sin 2\alpha + 2C\cos 2\alpha = 0, \tag{26} \]

to see that for all shapes with A = B and C = 0 (where A, B, and C are given by (23)–(25)) the function G(α, S) is constant and, consequently, does not tell us which direction should be used to define the shape orientation. Many-fold rotationally symmetric shapes are the most studied examples in the literature of shapes that cannot be oriented by the standard method. To overcome such problems, Tsai and Chou (1991) suggest the use of an exponent higher than 2 in (1). In such a way a more complex function is obtained, and the class of shapes that can be oriented by the method is expanded. More details about the shape orientation computed by use of the functions F2N(α, S), defined as

\[ F_{2N}(\alpha, S) = \int_S \left( \left(x - \frac{m_{1,0}(S)}{m_{0,0}(S)}\right)\sin\alpha - \left(y - \frac{m_{0,1}(S)}{m_{0,0}(S)}\right)\cos\alpha \right)^{2N} dx\,dy, \tag{27} \]

can be found in Tsai and Chou (1991), Žunić et al. (2006). In this paper we use an analogous approach: instead of G(α, S) we suggest the use of the more complex functions G2N(α, S), defined as

\[ G_{2N}(\alpha, S) = \int_{\substack{A=(x,y)\in S\\ B=(u,v)\in S}} |pr_{\vec a}[AB]|^{2N}\,dx\,dy\,du\,dv, \tag{28} \]

and give the following definition, which is a modification of Definition 2.

Definition 4 Let S be a shape; then the orientation of S is defined by the angle α for which the function

\[ G_{2N}(\alpha, S) = \int_{\substack{A=(x,y)\in S\\ B=(u,v)\in S}} |pr_{\vec a}[AB]|^{2N}\,dx\,dy\,du\,dv \tag{29} \]

reaches its maximum.
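As a quick illustration (our sketch, not the paper's code), the N = 1 case admits the closed form tan 2α = 2C/(A − B) implied by (26); with a shape given as a dense (n, 2) array of pixel coordinates, the moments in (23)–(25) reduce to sums:

```python
import numpy as np

def orientation_n1(pts):
    """Orientation maximising G(alpha, S) for N = 1 via the closed form
    tan(2*alpha) = 2C / (A - B), with A, B, C as in (23)-(25) and the
    shape S given as a dense (n, 2) array of pixel coordinates."""
    n = float(len(pts))                      # m_{0,0}
    x, y = pts[:, 0], pts[:, 1]
    m10, m01 = x.sum(), y.sum()              # m_{1,0}, m_{0,1}
    A = 2.0 * (x * x).sum() * n - 2.0 * m10 ** 2       # (23)
    B = 2.0 * (y * y).sum() * n - 2.0 * m01 ** 2       # (24)
    C = 2.0 * (x * y).sum() * n - 2.0 * m10 * m01      # (25)
    # arctan2 selects the maximising root of (26) rather than the minimising one
    return 0.5 * np.arctan2(2.0 * C, A - B)
```

Since A, B and C reduce to 2m00 times the central moments, this returns exactly the axis-of-least-second-moment orientation, as Theorem 1 requires.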

Fig. 12 The orientation computed by G(α, S) = G2(α, S) and F(α, S) = F2(α, S) must coincide. Orientations computed by G2N(α, S) and F2N(α, S) may differ if N > 1. The orientations are also displayed as sets of arrows superimposed on the left (G2N(α, S)) and right (F2N(α, S)) of the centre of each shape, with increasing N denoted by decreasing arrow length

Shape    G2      F2      G4      F4      G6      F6      G8      F8      G10     F10
(a)     −56.6   −56.6   −57.8   −60.8   −59.1   −65.8   −59.8   −67.4   −60.0   −67.8
(b)     −85.0   −85.0   −87.6   −79.3    89.7   −79.3    87.8   −74.4    86.3    46.9
(c)      48.0    48.0    46.9    51.6    45.8    52.6    44.6    52.9    43.5    53.1
(d)      87.8    87.8    86.4    90.0    83.2   −89.2    78.3   −88.9    74.9   −88.7


Fig. 13 Examples of deforming shapes: galloping horse by Eadweard Muybridge. The standard method is given by the long gray arrow, and the new method by the shorter black arrow

This paper has shown that the orientations computed from G(α, S) and F(α, S) coincide. However, the shapes presented in Fig. 12 have different computed orientations when G2N(α, S) and F2N(α, S) are applied for several values of N > 1. This shows that the method introduced in this paper and the standard method, together with its modification studied in Tsai and Chou (1991), Žunić et al. (2006), can be understood as essentially different. Of course, a final judgement as to which of the orientations computed by G2N(α, S) and F2N(α, S) is better cannot be given in general; the preference for either G2N(α, S) or F2N(α, S) should be made depending on the particular application. The equivalence for N = 1 can be understood as a coincidence rather than an expected consequence of the nature of the approaches. The introduction of a higher exponent in (8) (as discussed above) leads to a larger class of shapes that can be oriented by the modified method. But no matter how big a value of N is selected, there will always be shapes that cannot be oriented using G2N(α, S). The next lemma shows that M-fold rotationally symmetric shapes with M > 2N cannot be oriented by the use of G2N(α, S).

Lemma 1 Let S be an M-fold rotationally symmetric shape. If M > 2N then G2N(α, S) is a constant function.

Lemma 1 can be proved in a similar way to the proof of the analogous statement for the function F2N(α, S) presented in Žunić et al. (2006). A proof is given here as well (see the Appendix), in order to make the paper self-contained. A benefit of using a higher exponent in (29) is that shapes from a wider class can be orientated. On the other hand, a disadvantage is that a closed formula (similar to (3) with N = 1, for example) for the computation of orientation cannot be derived. This disadvantage is not too great, because Definition 4 enables a reasonably efficient numerical computation of the maximum of G2N(α, S). But overcoming the problems associated with computing the orientation of many-fold symmetric shapes, or of some regular shapes whose orientation cannot be computed by optimising G(α, S) (i.e. F(α, S)), is not the only benefit of the use of a higher exponent in (29). In view of the examples presented in Fig. 13 we can see that using different exponents in (29) can lead to more realistic computed orientations. Indeed, in this particular case, it seems that the best results, in accordance with our perception, are obtained if G8(α, S) is used (the computed orientations are nearly horizontal). Less appropriate orientations are obtained if F8(α, S) is used.

Further, we prove a desirable property of the shape orientation computed by using G2N(α, S). We show that the first derivatives dG2N(α, S)/dα and dG2N(α + π/2, S)/dα vanish if α equals the angle between the x-axis and one of the symmetry axes of a given reflective symmetric shape S. This shows that a symmetry axis of S has a 'good chance' of coinciding with the computed shape orientation. Of course, such a coincidence cannot be required in general. Just consider a triangle ABC such that |AC| = |BC|. If |AC| (i.e. |BC|) is much bigger than |AB|, then it is expected that the computed orientation coincides with the symmetry axis of the triangle ABC. On the other hand, if |AC| (i.e. |BC|) is close to |AB|/2, then the expected orientation is 0 degrees, i.e. the assigned orientation is orthogonal to the symmetry axis. In the limit case where |AC| (i.e. |BC|) tends to |AB|/2, the triangle degenerates into a straight line segment and consequently the computed orientation is consistent with such a line (the line passing through A and B).
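The numerical computation that Definition 4 enables can be sketched in a few lines (our sketch; the helper names are not from the paper). For a discrete set of shape points, G2N(α, S) reduces to the sum over all point pairs of (p_i − p_j)^{2N}, where p_i = x_i cos α + y_i sin α; expanding in power sums of the p_i makes each angle cost O(nN) rather than O(n²), and the maximum is located on a grid over [0, π):

```python
import numpy as np
from math import comb

def g2n(alpha, pts, N):
    """Evaluate G_2N(alpha, S) for a discrete point set S (up to the constant
    pixel-area factor): sum over all point pairs of (p_i - p_j)^(2N),
    expanded in power sums of the projections p_i."""
    p = pts[:, 0] * np.cos(alpha) + pts[:, 1] * np.sin(alpha)
    p = p - p.mean()   # pairwise differences are unchanged; improves conditioning
    S = [np.sum(p ** k) for k in range(2 * N + 1)]
    return sum(comb(2 * N, k) * (-1.0) ** k * S[k] * S[2 * N - k]
               for k in range(2 * N + 1))

def orientation_g2n(pts, N, samples=360):
    """Definition 4, numerically: the angle on a grid over [0, pi) that
    maximises G_2N(alpha, S)."""
    alphas = np.linspace(0.0, np.pi, samples, endpoint=False)
    vals = [g2n(a, pts, N) for a in alphas]
    return alphas[int(np.argmax(vals))]
```

A coarse grid like this can of course be refined around the best angle, or replaced by any one-dimensional optimiser, since G2N is smooth and π-periodic.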


Fig. 14 Examples of deforming shapes: Help! by The Beatles. The standard method is given by the long gray arrow, and the new method by the shorter black arrow

We give the following lemma, whose proof is in the Appendix.

Lemma 2 Let S be a reflective symmetric shape whose symmetry axis has a slope α0. Then the first derivative of G2N(α, S) vanishes for α = α0 and α = α0 + π/2, i.e.

\[ \frac{dG_{2N}(\alpha_0, S)}{d\alpha} = 0 \quad\text{and}\quad \frac{dG_{2N}(\alpha_0 + \pi/2, S)}{d\alpha} = 0. \tag{30} \]

The statement of the previous lemma is illustrated by the silhouettes of the first and second men presented in Fig. 14. All orientations computed by G2N(α, P) for N = 1, 2, 3, 4 are very close to 90° and they coincide with the "symmetry" axes of the silhouettes. Note that in the case of the second silhouette both orientations (measured by G2N(α, P) and F2N(α, P)) are vertical, as expected. In the case of the first silhouette the orientations measured by G2N(α, P) remain vertical (as preferred), while the orientations computed by F4(α, P), F6(α, P) and F8(α, P) are less appropriate. The new measure also performs better on the third silhouette presented in Fig. 14. It keeps (for N = 1, 2, 3, 4) the computed orientations nearly vertical even when some shape parts change position, while the orientations computed by F2N(α, P) vary.
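Lemma 2 can be sanity-checked numerically (our sketch; `g2n` is a hypothetical power-sum evaluator of G2N for a discrete point set, not code from the paper): for a point set mirror-symmetric about the x-axis, a central finite difference of G2N at α = 0 should vanish:

```python
import numpy as np
from math import comb

def g2n(alpha, pts, N):
    # power-sum evaluation of G_2N for a discrete point set (pairwise
    # projection differences raised to the power 2N)
    p = pts[:, 0] * np.cos(alpha) + pts[:, 1] * np.sin(alpha)
    p = p - p.mean()
    S = [np.sum(p ** k) for k in range(2 * N + 1)]
    return sum(comb(2 * N, k) * (-1.0) ** k * S[k] * S[2 * N - k]
               for k in range(2 * N + 1))

# a point set mirror-symmetric about the x-axis: an upper half plus its reflection
rng = np.random.default_rng(0)
upper = rng.random((400, 2)) * np.array([3.0, 1.0])   # points with y >= 0
pts = np.vstack([upper, upper * np.array([1.0, -1.0])])

# central finite difference of G_2N at the symmetry axis (alpha_0 = 0)
h, N = 1e-5, 2
deriv = (g2n(h, pts, N) - g2n(-h, pts, N)) / (2 * h)
scale = g2n(0.0, pts, N)
print(abs(deriv) / abs(scale))    # close to 0, as Lemma 2 predicts
```

By the reflection symmetry, G2N is an even function of α here, so the finite difference is zero up to floating-point noise.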

All orientations of the fourth silhouette computed by F2N(α, P) and G2N(α, P), N = 1, 2, 3, 4, are coincident. This indicates that it has a very distinct orientation, and consequently the similarity of the orientations computed by different means is no surprise. To compare the robustness of the orientation estimates, the shapes in Figs. 13 and 14 were deformed; examples are shown in Fig. 15. This was achieved by adding salt and pepper noise with an impulse probability of 0.2 to the binary images, retaining the largest component, and filling any holes. The circular variance (Mardia and Jupp 1999), which lies in the range [0, 1], of 50 noisy instances of each shape was computed; see Table 2. In almost all cases G8(α, P) has a lower variance than F8(α, P), indicating that it is more robust. To close this section, let us mention that a higher exponent in G2N(α, S) obviously increases the computational complexity, because the number of summands in G2N(α, S) increases. But it is worth mentioning that G2N(α, S) can still be expressed in terms of the geometric moments mp,q(S), even though this is not explicitly visible from Definition 4 (i.e. from (29)), where quadruple integrals appear. Indeed,

by expressing G2N(α, S) as in (37) and writing \(\binom{2N}{k} = \frac{(2N)!}{(2N-k)!\,k!}\), we derive:

\[
\begin{aligned}
G_{2N}(\alpha, S) &= \int_{\substack{A=(x,y)\in S\\ B=(u,v)\in S}} ((x-u)\cos\alpha + (y-v)\sin\alpha)^{2N}\,dx\,dy\,du\,dv \\
&= \sum_{k=0}^{2N} \binom{2N}{k} \left( \int_{\substack{A=(x,y)\in S\\ B=(u,v)\in S}} (x-u)^k\,(y-v)^{2N-k}\,dx\,dy\,du\,dv \right) \cos^k\alpha\,\sin^{2N-k}\alpha \\
&= \sum_{k=0}^{2N} \binom{2N}{k} \left( \int_{\substack{A=(x,y)\in S\\ B=(u,v)\in S}} \sum_{i=0}^{k} \sum_{j=0}^{2N-k} \binom{k}{i}\binom{2N-k}{j}\, x^{k-i}(-u)^i\, y^{2N-k-j}(-v)^j\,dx\,dy\,du\,dv \right) \cos^k\alpha\,\sin^{2N-k}\alpha \\
&= \sum_{k=0}^{2N} \binom{2N}{k} \sum_{\substack{0\le i\le k\\ 0\le j\le 2N-k}} (-1)^{i+j} \binom{k}{i}\binom{2N-k}{j} \left( \int_{\substack{A=(x,y)\in S\\ B=(u,v)\in S}} x^{k-i} u^i\, y^{2N-k-j} v^j\,dx\,dy\,du\,dv \right) \cos^k\alpha\,\sin^{2N-k}\alpha \\
&= \sum_{k=0}^{2N} \binom{2N}{k} \sum_{\substack{0\le i\le k\\ 0\le j\le 2N-k}} (-1)^{i+j} \binom{k}{i}\binom{2N-k}{j}\, m_{k-i,2N-k-j}(S)\cdot m_{i,j}(S)\, \cos^k\alpha\,\sin^{2N-k}\alpha,
\end{aligned}
\]


Fig. 15 The first shape from Fig. 14 with different instances of added noise

Table 2 Circular variances of orientation estimates for each of the four shapes in Figs. 13 and 14 (one of each figure per row) calculated over 50 noisy instances of each

        Muybridge's horse        Beatles
        G8(α, P)   F8(α, P)      G8(α, P)   F8(α, P)
        0.003      0.004         0.000      0.055
        0.006      0.007         0.004      0.003
        0.003      0.011         0.000      0.002
        0.000      0.035         0.000      0.000

showing that only two-dimensional geometric moments are needed for the computation of G2N(α, S). Of course, standard techniques for the fast computation of two-dimensional geometric moments (see, for example, Jiang and Bunke 1991) can be used here as well.
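This closing expansion can be prototyped directly. The sketch below is our illustration (the function names are assumptions), with S given as a dense set of pixel coordinates so that the integrals become sums over pixels:

```python
import numpy as np
from math import comb

def geometric_moments(pts, order):
    """Raw geometric moments m_{p,q}, p + q <= order, for a shape given as a
    dense (n, 2) array of pixel coordinates (integrals become sums)."""
    x, y = pts[:, 0], pts[:, 1]
    return {(p, q): float(np.sum(x ** p * y ** q))
            for p in range(order + 1) for q in range(order + 1 - p)}

def g2n_from_moments(alpha, m, N):
    """G_2N(alpha, S) via the moment expansion above: only the moments
    m_{p,q} with p + q <= 2N are needed."""
    c, s = np.cos(alpha), np.sin(alpha)
    total = 0.0
    for k in range(2 * N + 1):
        coef = 0.0
        for i in range(k + 1):
            for j in range(2 * N - k + 1):
                coef += ((-1) ** (i + j) * comb(k, i) * comb(2 * N - k, j)
                         * m[(k - i, 2 * N - k - j)] * m[(i, j)])
        total += comb(2 * N, k) * coef * c ** k * s ** (2 * N - k)
    return total
```

The moment dictionary is computed once per shape; evaluating G2N at any number of candidate angles then costs O(N³) arithmetic per angle, independent of the shape's size.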

6 Conclusion

An alternative method for computing the orientation α of a shape S has been introduced. The approach considers the projections of the line segments, whose end points belong to the shape, onto a line. In contrast, the standard approach uses the distances of the shape points to a line to define a shape's orientation. The effective computation of shape orientation is based on maximising the term G2N(α, S). We have proven that when N = 1 the method yields orientations identical to those found by the standard approach, defined by the axis of the least second moment of inertia. However, for N > 1 the results differ, and thus the proposed approach provides a fundamentally new method for the computation of shape orientation. Situations in which G2N(α, S) does not determine what the orientation of S should be were considered as well. A desirable property related to reflective symmetric shapes was also proven. The computed orientations are given for several examples and they are reasonable. In several of the presented examples the new method performs better than the standard method. Yet, due to the diversity of shapes and of possible applications, it is not possible to say that the orientations computed by one of the methods are better than the orientations computed by the other. A final judgement can be given only based on knowledge of the particular situation in which the methods are applied. This is one of the reasons that many methods have already been developed for the computation of shape orientation (see references); each of them fits well with particular kinds of problems and shapes. Furthermore, use of one method does not exclude the use of another. For example, if both methods give the same computed orientation, this could be an indicator that the considered shape has a very distinct orientation.

An important benefit of our new approach is that it leads to a new method for computing the orientation of compound shapes, which consist of several components; the majority of the paper is related to this method. The orientation of compound shapes is based on maximising the term Gcomp(α, S), and a closed formula for computing the maximum of Gcomp(α, S) is given. It was shown that Gcomp(α, S) has several attractive properties. Several easily provable statements are formulated as five notes (see Sect. 4), and they give an impression of how Gcomp(α, S) acts in some particular situations. In particular, the method takes into account the size of each component and its degree of orientability. Thus, both small components that arise due to noise, and near many-fold symmetric objects whose individual orientations are unreliable, will have little effect on the overall orientation computed over the compound object. A modification of the method that gives more weight to small components, and a modification that ignores the components' sizes, are described as well. Such a modification could be useful when the size of an object in an image does not correspond properly to its size (or importance) in reality (due to the object's position with respect to the camera, for example).

The standard method for computing orientation provides an estimate based on the global characteristics of the shape, and there is no straightforward modification of the standard method which would be applicable to compound shapes. The new method based on Gcomp(α, S) proposed in this paper provides an estimate of orientation based on more local information, namely the shape's connected components. Several experiments are provided to illustrate the suitability of the new measure, together with a discussion of the results. An interesting topic for future research would be to investigate how the two estimates could be combined, in a similar way to how context plays a role in human perception. For instance, whereas an equilateral triangle in isolation is directionally ambiguous, in Fig. 16 the two sets of triangles have distinct and well defined orientations (Palmer 1980).


Fig. 16 The set of equilateral triangles on the left appear to point to the right, whereas the set on the right appear to point upwards

Appendix

Proof of Theorem 1 Since the shape S is fixed, all moments that appear in the proof are constant and do not change when α varies. Also, we will use the following trivial equalities:

\[
\overline{m}_{2,0}(S) = m_{2,0}(S) - \frac{m_{1,0}(S)^2}{m_{0,0}(S)}, \qquad
\overline{m}_{0,2}(S) = m_{0,2}(S) - \frac{m_{0,1}(S)^2}{m_{0,0}(S)}, \qquad
\overline{m}_{1,1}(S) = m_{1,1}(S) - \frac{m_{1,0}(S)\,m_{0,1}(S)}{m_{0,0}(S)}
\tag{31}
\]

(these follow easily from (3)),

\[
|pr_{\vec a}[AB]|^2 = ((x-u)\cos\alpha + (y-v)\sin\alpha)^2, \quad\text{where } A=(x,y),\ B=(u,v),
\tag{32}
\]

\[
\int_{S\times S} x^p y^q u^r v^t\,dx\,dy\,du\,dv = m_{p,q}(S)\cdot m_{r,t}(S).
\tag{33}
\]

Now, we derive

\[
\begin{aligned}
G(\alpha, S) &= \int_{S\times S} ((x-u)\cos\alpha + (y-v)\sin\alpha)^2\,dx\,dy\,du\,dv \\
&= \cos^2\alpha \int_{S\times S} (x^2 - 2xu + u^2)\,dx\,dy\,du\,dv
 + \sin^2\alpha \int_{S\times S} (y^2 - 2yv + v^2)\,dx\,dy\,du\,dv \\
&\quad + \sin 2\alpha \int_{S\times S} (xy - xv - yu + uv)\,dx\,dy\,du\,dv \\
&= (2m_{2,0}(S)m_{0,0}(S) - 2m_{1,0}(S)^2)\cos^2\alpha + (2m_{0,2}(S)m_{0,0}(S) - 2m_{0,1}(S)^2)\sin^2\alpha \\
&\quad + (2m_{1,1}(S)m_{0,0}(S) - 2m_{1,0}(S)m_{0,1}(S))\sin 2\alpha \\
&= 2m_{0,0}(S)\overline{m}_{2,0}(S)(1 - \sin^2\alpha) + 2m_{0,0}(S)\overline{m}_{0,2}(S)(1 - \cos^2\alpha) + 2m_{0,0}(S)\overline{m}_{1,1}(S)\sin 2\alpha \\
&= 2m_{0,0}(S)\big(\overline{m}_{2,0}(S) + \overline{m}_{0,2}(S)\big) - 2m_{0,0}(S)\,F(\alpha, S).
\end{aligned}
\tag{34}
\]

This establishes the proof. □

Proof of Theorem 2 Similarly as in the proof of Theorem 1, we have

\[
G_{comp}(\alpha, S) = \cos^2\alpha \sum_{i=1}^{m} 2m_{0,0}(S_i)\overline{m}_{2,0}(S_i)
 + \sin^2\alpha \sum_{i=1}^{m} 2m_{0,0}(S_i)\overline{m}_{0,2}(S_i)
 + \sin 2\alpha \sum_{i=1}^{m} 2m_{0,0}(S_i)\overline{m}_{1,1}(S_i).
\tag{35}
\]

Setting the first derivative dGcomp(α, S)/dα equal to zero, we obtain

\[
-\sin 2\alpha \sum_{i=1}^{m} 2m_{0,0}(S_i)\big(\overline{m}_{2,0}(S_i) - \overline{m}_{0,2}(S_i)\big)
 + 2\cos 2\alpha \sum_{i=1}^{m} 2m_{0,0}(S_i)\overline{m}_{1,1}(S_i) = 0,
\tag{36}
\]

which is equivalent to (15) from the statement of Theorem 2. □

Proof of Lemma 1 Let S be an M-fold rotationally symmetric shape. First, we show that there are no more than 4N values of α for which dG2N(α, S)/dα vanishes. Indeed, in accordance with (28) and (32), the function G2N(α, S) can be expressed as

\[
G_{2N}(\alpha, S) = \int_{\substack{A=(x,y)\in S\\ B=(u,v)\in S}} ((x-u)\cos\alpha + (y-v)\sin\alpha)^{2N}\,dx\,dy\,du\,dv
 = \sum_{k=0}^{2N} A_k \cos^k\alpha\,\sin^{2N-k}\alpha,
\tag{37}
\]

where

\[
A_k = \frac{(2N)!}{k!\,(2N-k)!} \int_{S\times S} (x-u)^k (y-v)^{2N-k}\,dx\,dy\,du\,dv
\]

for k = 0, 1, …, 2N. Furthermore, since

\[
\frac{dG_{2N}(\alpha, S)}{d\alpha} = \sum_{k=0}^{2N} A_k \big({-k}\cos^{k-1}\alpha\,\sin^{2N-k+1}\alpha + (2N-k)\cos^{k+1}\alpha\,\sin^{2N-k-1}\alpha\big),
\tag{38}
\]

we distinguish two situations (denoted by (a) and (b) below), depending on the values of dG2N(α = 0, S)/dα and dG2N(α = π, S)/dα:

(a) If α = 0 and α = π (i.e. sin α = 0) are not solutions of dG2N(α, S)/dα = 0, then we have from (38)

\[
\frac{dG_{2N}(\alpha, S)}{d\alpha} = 0
\iff
\sin^{2N}\alpha \cdot \sum_{k=0}^{2N} A_k \big({-k}\cot^{k-1}\alpha + (2N-k)\cot^{k+1}\alpha\big) = 0.
\tag{39}
\]

Since the quantity

\[
\sum_{k=0}^{2N} A_k \big({-k}\cot^{k-1}\alpha + (2N-k)\cot^{k+1}\alpha\big)
\tag{40}
\]

is a polynomial of degree 2N in cot α, it cannot have more than 2N real zeros

\[
\cot\alpha_1 = z_1,\ \cot\alpha_2 = z_2,\ \ldots,\ \cot\alpha_k = z_k \quad (k \le 2N).
\]

Because cot α = cot(α + π), we can deduce that the equation dG2N(α, S)/dα = 0 has no more than 4N solutions, assuming that G2N(α, S) is not a constant function.

(b) If α = 0 and α = π (i.e. sin α = 0) are solutions of dG2N(α, S)/dα = 0, then (see (38))

\[
A_{2N-1}\big(2N - (2N-1)\big) = 0, \quad\text{i.e. } A_{2N-1} = 0.
\tag{41}
\]

But in such a situation the quantity (40) is a polynomial of degree 2N − 1 in cot α, since the coefficient of (cot α)^{2N} vanishes (because of (41)). Consequently, the polynomial (40) cannot have more than 2N − 1 real zeros

\[
\cot\alpha_1 = z_1,\ \cot\alpha_2 = z_2,\ \ldots,\ \cot\alpha_k = z_k \quad (k \le 2N-1),
\]

i.e. there are no more than 2(2N − 1) = 4N − 2 values of α for which the quantity (40) (and consequently dG2N(α, S)/dα) vanishes. So, again, including α = 0 and α = π, the total number of zeros of dG2N(α, S)/dα is not bigger than 4N.

Thus, in both cases (a) and (b), the number of zeros of dG2N(α, S)/dα is not bigger than 4N. On the other hand, if S is a fixed M-fold rotationally symmetric shape, then G2N(α, S) must have (because of the symmetry) at least M local minima and M local maxima (one minimum and one maximum in each interval of the form [α, α + 2π/M)), or it must be a constant function. That means that dG2N(α, S)/dα must have (at least) 2M zeros α1, α2, …, α2M. Since the assumption 2N < M does not allow 2M zeros of dG2N(α, S)/dα if G2N(α, S) is not a constant function, we have derived a contradiction. So, G2N(α, S) must be constant for all 2N less than M. □

Proof of Lemma 2 Without loss of generality we can assume α0 = 0, i.e. the x-axis is one of the symmetry axes of S. Also, we express S as the union S = S⁺ ∪ S⁻, where S⁺ and S⁻ are the parts of S that lie above and below the x-axis, respectively. To prove dG2N(α0, S)/dα = 0 it is enough to substitute α = 0 into

\[
\frac{dG_{2N}(\alpha, S)}{d\alpha}
 = 2N \int_{\substack{A=(x,y)\in S\\ B=(u,v)\in S}} ((x-u)\cos\alpha + (y-v)\sin\alpha)^{2N-1}\,((u-x)\sin\alpha + (y-v)\cos\alpha)\,dx\,dy\,du\,dv
\tag{42}
\]

and apply the trivial identities

\[
\int_{\substack{(x,y)\in S^+\\ (u,v)\in S^+}} (x-u)^{2N-1}(y-v)\,dx\,dy\,du\,dv
 = -\int_{\substack{(x,y)\in S^-\\ (u,v)\in S^-}} (x-u)^{2N-1}(y-v)\,dx\,dy\,du\,dv
\tag{43}
\]

and

\[
\int_{\substack{(x,y)\in S^+\\ (u,v)\in S^-}} (x-u)^{2N-1}(y-v)\,dx\,dy\,du\,dv
 = -\int_{\substack{(x,y)\in S^-\\ (u,v)\in S^+}} (x-u)^{2N-1}(y-v)\,dx\,dy\,du\,dv.
\tag{44}
\]

The equality dG2N(α0 + π/2, S)/dα = 0 can be proven in an analogous way. □

References

Boutsen, L., & Marendaz, C. (2001). Detection of shape orientation depends on salient axes of symmetry and elongation: Evidence from visual search. Perception & Psychophysics, 63(3), 404–422.
Chandra, D. V. S. (1998). Target orientation estimation using Fourier energy spectrum. IEEE Transactions on Aerospace and Electronic Systems, 34(3), 1009–1012.
Cortadellas, J., Amat, J., & De la Torre, F. (2004). Robust normalization of silhouettes for recognition applications. Pattern Recognition Letters, 25(5), 591–601.
Fahlbusch, S., et al. (2005). Nanomanipulation in a scanning electron microscope. Journal of Materials Processing Technology, 167(2–3), 371–382.
Ha, V. H. S., & Moura, J. M. F. (2005). Affine-permutation invariance of 2-d shapes. IEEE Transactions on Image Processing, 14(11), 1687–1700.
Jafari-Khouzani, K., & Soltanian-Zadeh, H. (2005). Radon transform orientation estimation for rotation invariant texture analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(6), 1004–1008.
Jain, R., Kasturi, R., & Schunck, B. G. (1995). Machine vision. New York: McGraw-Hill.
Jiang, X. Y., & Bunke, H. (1991). Simple and fast computation of moments. Pattern Recognition, 24(8), 801–806.
Kim, S., & Kweon, I. S. (2008). Scalable representation for 3D object recognition using feature sharing and view clustering. Pattern Recognition, 41(2), 754–773.
Kim, W. Y., & Kim, Y. S. (1999). Robust rotation angle estimator. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(8), 768–773.
Klette, R., & Rosenfeld, A. (2004). Digital geometry. San Mateo: Morgan Kaufmann.
Lazebnik, S., Schmid, C., & Ponce, J. (2005). A sparse texture representation using local affine regions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(8), 1265–1278.
Lin, J.-C. (1993). Universal principal axes: An easy-to-construct tool useful in defining shape orientations for almost every kind of shape. Pattern Recognition, 26(4), 485–493.
Lin, J.-C. (1996). The family of universal axes. Pattern Recognition, 29(3), 477–485.
Mardia, K. V., & Jupp, P. E. (1999). Directional statistics. New York: Wiley.
Marendaz, C., Stivalet, P., Barraclough, L., & Walkowiac, P. (1993). Effect of gravitational cues on visual search for orientation. Journal of Experimental Psychology: Human Perception and Performance, 19, 1266–1277.
Martinez-Alajarin, J., Luis-Delgado, J. D., & Tomás-Balibrea, L.-M. (2005). Automatic system for quality-based classification of marble textures. IEEE Transactions on Systems, Man and Cybernetics, Part C, 35(4), 488–497.
Palmer, S. E. (1980). What makes triangles point: Local and global effects in configurations of ambiguous triangles. Cognitive Psychology, 12, 285–305.
Palmer, S. E. (1999). Vision science: Photons to phenomenology. Cambridge: MIT Press.
Rock, I., Schreiber, C., & Ro, T. (1994). The dependence of two-dimensional shape perception on orientation. Perception, 23, 1409–1426.
Shen, D., & Ip, H. H. S. (1996). Optimal axes for defining the orientations of shapes. Electronics Letters, 32(20), 1873–1874.
Shi, B., Murakami, Y., & Wu, Z. (1998). Orientation of aggregates of fine-grained soil: Quantification and application. Engineering Geology, 50, 59–70.
Tsai, W. H. (1985). Moment-preserving thresholding. Computer Vision, Graphics and Image Processing, 29, 377–393.
Tsai, W. H., & Chou, S. L. (1991). Detection of generalized principal axes in rotationally symmetric shapes. Pattern Recognition, 24(1), 95–104.
Wang, L., Tan, T. N., Hu, W. M., & Ning, H. Z. (2003). Automatic gait recognition based on statistical shape analysis. IEEE Transactions on Image Processing, 12(9), 1120–1131.
Wei, L., Keogh, E., & Xi, X. (2006). SAXually explicit images: Finding unusual shapes. In Int. conf. on data mining (pp. 711–720).
Žunić, J., Kopanja, L., & Fieldsend, J. E. (2006). Notes on shape orientation where the standard method does not work. Pattern Recognition, 39(5), 856–865.
Žunić, J., Rosin, P. L., & Kopanja, L. (2006). On the orientability of shapes. IEEE Transactions on Image Processing, 15(11), 3478–3487.