Variational Shape Matching For Shape Classification and Retrieval

Kamal Nasreddine, Abdesslam Benzinou
Ecole Nationale d'Ingénieurs de Brest, laboratoire RESO - 29238 BREST (FRANCE)

Ronan Fablet
Telecom Bretagne, LabSTICC - 29238 BREST (FRANCE)

Abstract

In this paper we define a distance between shapes based on geodesics in shape space. The proposed distance, which is robust to outliers, uses shape matching to compare shapes locally. A multiscale analysis is introduced to handle both local and global variability. The resulting similarity measure is invariant to translation, rotation and scaling without requiring constraints or landmarks, but constraints can be added to the formulation when needed. The proposed approach is evaluated for shape classification and shape retrieval on a complex benchmark shape database, and in both cases it outperforms previously reported results.

Key words: Shape classification, shape retrieval, contour matching, shape geodesics, multiscale analysis, robustness.

Email addresses: {nasreddine,benzinou}@enib.fr (Kamal Nasreddine, Abdesslam Benzinou), [email protected] (Ronan Fablet)

Preprint to be submitted to Pattern Recognition Letters

November 13, 2009

1. Introduction and related work

This work is concerned with the definition of a robust distance between shapes based on shape geodesics. The proposed distance is shown to serve for shape classification and shape retrieval. Computer vision has recently studied object recognition extensively and made significant progress, but current techniques do not provide entirely satisfactory solutions [Daliri and Torre, 2008; Veltkamp and Hagedoorn, 2001].

Regarding shape analysis and classification, similarity measures may be defined from information extracted from the whole area of the object (region-based techniques) [Kim and Kim, 2000], or from features which describe only the object boundary (boundary-based techniques) [Costa and Cesar, 2001]. The latter category may also comprise skeleton descriptions [Lin and Kung, 1997; Sebastian and Kimia, 2005]. Skeleton descriptions of shapes are less sensitive to articulation than boundary and region descriptions, but at the cost of a higher computational complexity due to tree or graph matching [Sebastian and Kimia, 2005; Sebastian et al., 2003]. On the other hand, boundary-based object description is considered more important than region-based description because an object's shape is mainly discriminated by its boundary. In most cases, the central part of the object contributes little to shape recognition.

The boundary-based approach described in this paper relies on a comparison between matched contours. Contour matching has already been widely applied to object recognition based on shape boundary [Diplaros and Milios, 2002]. In general, contour matching methods are divided into two major classes: those based on rigid transformations, and those based on non-rigid deformations [Veltkamp and Hagedoorn, 2001]. Methods of the first type look for optimal parameters which align feature points, assuming that the transformation is composed of translation, rotation and scaling only. They may lack accuracy. Methods based on elastic deformations rely on the minimization of some appropriate matching criterion. They may suffer from an asymmetric treatment of the two curves and in many cases lack rotation and scaling invariance [Veltkamp and Hagedoorn, 2001]. Moreover, existing techniques typically take advantage of application-specific constraints or use shape landmarks. These landmark points are generally defined as points of minimal or maximal shape curvature [Del Bimbo and Pala, 1999; Super, 2006], as points of zero curvature [Mokhtarian and Bober, 2003], at a given distance from specific points [Zhang et al., 2003], on convex or concave segments [Diplaros and Milios, 2002], or by any other criterion suited to the shapes involved.

Shape analysis from geodesics in shape space has emerged as a powerful tool to develop geometrically invariant shape comparison methods [Younes, 2000]. Using shape geodesics, we can state contour matching as a variational non-rigid formulation ensuring a symmetric treatment of curves. The resulting similarity measure is invariant to translation, rotation and scaling independently of constraints or landmarks, but constraints can be added to the formulation when needed. This paper extends the work presented in [Younes, 2000] to the tasks of shape classification and shape retrieval.

The contributions of our work are summarized below:

− Geodesics in shape space have been introduced to develop efficient shape warping methods [Younes, 2000]. We exploit the corresponding similarity measure to define a new distance for shape classification and shape retrieval. This distance, derived from a shape matching procedure based on shape geodesics, takes advantage of local shape features while ensuring invariance to geometric transformations (e.g. translation, rotation and scaling). In addition, a hierarchical approach based on the resolution of the shape sampling is considered to deal with local and global variabilities.

− As an optimization method, besides dynamic programming, which is generally used to solve variational problems, we propose a new minimization technique based on an incremental iterative scheme.

− To ensure more robustness against outliers, we introduce a robust criterion as a modification of the similarity measure derived from shape geodesics. Evaluation results show that this modification ensures a faster convergence of the iterative scheme and avoids convergence to a local minimum.

− We establish the superiority of the proposed method over state-of-the-art methods already used for shape classification and shape retrieval. The test is carried out on a complex benchmark shape database, part B of the MPEG-7 Core Experiment CE-Shape-1 data set [Jeannin and Bober, 1999]. This database is the largest and the most widely tested among available test shape databases.

The remainder of the paper is organized as follows. Section 2 details the proposed framework for shape matching based on shape space, from which a robust similarity measure between two shapes is derived. For numerical implementation, Section 3 describes a new optimization technique, based on an incremental iterative scheme, alongside dynamic programming, which is classically used for this purpose. Section 4 discusses the benefit of the proposed similarity measure for shape matching performance. Sections 5 and 6 present the derived multiscale distance proposed for shape classification and shape retrieval. Finally, Section 7 evaluates the proposed distance in shape classification and shape retrieval experiments on part B of the MPEG-7 shape database and compares the results to other state-of-the-art schemes.

2. Proposed contour matching

In this paper a boundary-based approach is considered. The comparison between shapes is based on a similarity measure using shape geodesics. The proposed similarity measure is exploited to define a new distance between shapes for classification and retrieval purposes. A multiscale analysis is performed to take into account both local and global differences between the shapes.

2.1. Shape geodesics

There are various ways to solve the shape matching problem, and many similarity measures have been proposed in the case of planar shapes [Veltkamp, 2001]. Recently, shape geodesics have emerged as a powerful tool [Younes, 2000]. They are widely used in analyses concerned with variations and changes in the shape of organisms, for instance in morphometrics and image warping.

{Figure 1 goes here}

{Figure 2 goes here}

Geodesics in the shape space are defined as paths between two shapes (Figure 1) with respect to some metric. This metric is chosen to be invariant to a given set of transformations (e.g. translation, rotation, scaling, ...). Most often, shapes are considered as points on an infinite-dimensional Riemannian manifold, and distances between shapes as the lengths of minimal geodesic paths. Retrieving the geodesic path between any two closed shapes amounts to a matching problem (Figure 2) with respect to the considered metric. Let us consider two shapes Γ and Γ̃ locally characterized by the angle between the tangent to the curve and the horizontal axis (θ and θ̃ respectively). Following [Younes, 2000], the matching problem is stated as the minimization of a shape similarity measure given by:

$$ SM(\theta, \tilde{\theta}(\phi)) = 2 \arccos \int_{s \in [0,1]} \sqrt{\phi_s(s)} \, \cos\left( \frac{\theta(s) - \tilde{\theta}(\phi(s))}{2} \right) ds \qquad (1) $$

where s refers to the normalized curvilinear abscissa defined on [0, 1], φ is a mapping function that maps the curvilinear abscissa on Γ̃ to the curvilinear abscissa on Γ, and φ_s = dφ/ds. The similarity measure considered here includes a measure of the difference between the two orientations θ and θ̃, cos((θ(s) − θ̃(φ(s)))/2), and a term that penalizes the torsion and stretching along the curve, √(φ_s(s)).

Curve parametrization via the angle function θ(s) defined on the normalized arc-length s leads to a representation which complies with the expected invariance properties (translation and scaling). A translation of the curve has no effect on θ, and a homothety with factor λ has no effect on the normalized parameter s. Thus curves that differ only by a translation and a homothety are represented by the same angle function θ(s). A rotation of angle c transforms the function θ(s) into the function θ(s) + c modulo 2π. To add rotation invariance, Younes proposes to minimize SM(θ, θ̃(φ)) (Equation 1) over all choices for the origins of the curve parametrizations.
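The invariance properties described above can be checked numerically. The following is a minimal sketch, assuming the contour is given as a closed polygon of sampled points; the function name `angle_function` is hypothetical and not taken from the paper.

```python
import numpy as np

def angle_function(points):
    """Tangent angle theta(s) of a closed polygonal contour (illustrative helper).

    points: (n, 2) array of vertices in traversal order; the last vertex is
    implicitly connected back to the first one.
    Returns (s, theta): the normalized arc-length at the start of each edge
    and the direction angle of that edge, as returned by arctan2.
    """
    pts = np.asarray(points, dtype=float)
    edges = np.roll(pts, -1, axis=0) - pts            # edge vectors, closing edge included
    lengths = np.hypot(edges[:, 0], edges[:, 1])
    # cumulative length before each edge, divided by the perimeter
    s = np.concatenate(([0.0], np.cumsum(lengths)[:-1])) / lengths.sum()
    theta = np.arctan2(edges[:, 1], edges[:, 0])
    return s, theta

def wrapped_difference(a, b):
    """Angle difference a - b wrapped to (-pi, pi]."""
    return np.angle(np.exp(1j * (np.asarray(a) - np.asarray(b))))
```

Under these assumptions, translating or uniformly scaling `points` leaves both `s` and `theta` unchanged, while rotating the contour by an angle c shifts `theta` by c modulo 2π.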

2.2. Robust variational formulation

Given two shapes Γ and Γ̃ respectively encoded in θ(s) and θ̃(s), the matching problem comes down to the registration of two 1D signals. The registration consists in retrieving the transformation that best matches points of similar characteristics (Figure 2). Formally, it amounts to determining the transformation function φ(s) such that θ(s) = θ̃(φ(s)). Here, we propose to state this problem as the minimization of an energy E(φ) involving a data-driven term, E_D, that evaluates the similarity between the reference and aligned signals, and a regularization term, E_R. The regularization term is included in order to obtain a smooth transformation function.

$$ E(\phi) = (1 - \alpha) E_D(\phi) + \alpha E_R(\phi) \qquad (2) $$

$$ E_R(\phi) = \int_{s \in [0,1]} |\phi_s(s)|^2 \, ds \qquad (3) $$

where α is a parameter that controls the regularity. By analogy with time causality, the minimization of E(φ) has to be carried out under the constraint φ_s > 0.

The similarity measure we propose is derived from the similarity measure based on shape geodesics given in Equation 1 [Younes, 2000]. To improve its robustness to outliers, we introduce a robust norm ‖θ(s) − θ̃(φ(s))‖_ρ instead of the simple difference (θ(s) − θ̃(φ(s))). The principle is to use a function that adjusts a weight ω so as to penalize the data points with high variation compared to other points. Several forms of the robust estimator ρ have been proposed [Black and Rangarajan, 1996]. We use the Leclerc estimator given by:

$$ \|r\|_\rho = 1 - \exp\left( -\frac{r^2}{2\sigma^2} \right) \qquad (4) $$

where σ is the standard deviation of the data errors r. Using the above data-driven term in the functional E(φ) and after adding the robust estimator, the shape registration problem amounts to minimizing:

$$ E(\phi) = (1 - \alpha) \arccos \int_{s \in [0,1]} \sqrt{\phi_s(s)} \, \cos\left( \frac{\|r(s)\|_\rho}{2} \right) ds + \alpha \int_{s \in [0,1]} |\phi_s(s)|^2 \, ds \qquad (5) $$

where r(s) = θ(s) − θ̃(φ(s)).

3. Numerical implementation

To minimize E(φ), two methods are considered: dynamic programming and an iterative scheme.
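Both methods minimize the same robust energy. As a rough illustration of what is being minimized, the sketch below evaluates a discretized version of Equation 5 on a uniform grid. The discretization choices (forward differences for φ_s, periodic linear interpolation of θ̃, closed curves with φ increasing by 1 over a full turn) and the names `leclerc_norm` and `robust_energy` are assumptions made for this example, not details given in the paper.

```python
import numpy as np

def leclerc_norm(r, sigma):
    """Leclerc robust norm of Equation 4: 1 - exp(-r^2 / (2 sigma^2))."""
    return 1.0 - np.exp(-np.asarray(r) ** 2 / (2.0 * sigma ** 2))

def robust_energy(theta, theta_tilde, phi, alpha, sigma):
    """Discretized energy of Equation 5 (illustrative sketch).

    theta, theta_tilde: angle functions sampled at s_i = i / n, i = 0..n-1.
    phi: increasing mapping sampled at the same s_i (phi_s > 0 is assumed).
    """
    theta = np.asarray(theta, dtype=float)
    theta_tilde = np.asarray(theta_tilde, dtype=float)
    phi = np.asarray(phi, dtype=float)
    n = len(theta)
    s = np.arange(n) / n
    ds = 1.0 / n
    # forward differences; the closing sample assumes phi gains 1 over a full turn
    phi_s = np.diff(np.append(phi, phi[0] + 1.0)) / ds
    # theta_tilde evaluated at phi(s) by periodic linear interpolation (assumption)
    theta_warped = np.interp(phi % 1.0, s, theta_tilde, period=1.0)
    r = theta - theta_warped
    integral = np.sum(np.sqrt(phi_s) * np.cos(leclerc_norm(r, sigma) / 2.0)) * ds
    data_term = np.arccos(np.clip(integral, -1.0, 1.0))   # clip guards against round-off
    regularization = np.sum(phi_s ** 2) * ds
    return (1.0 - alpha) * data_term + alpha * regularization
```

With identical signals and the identity mapping, the data term vanishes and the energy reduces to the regularization contribution α.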

The dynamic programming algorithm is applied as follows. Given a discretisation step and the discretized θ(s_i)_{i=1..N} and θ̃(s̃_j)_{j=1..M}, the algorithm considers in the plane [s_1, s_N] × [s̃_1, s̃_M] the grid G which contains the points p = (x, y) such that either x = s_i and y ∈ [s̃_1, s̃_M], or y = s̃_j and x ∈ [s_1, s_N]. We seek a continuous and increasing matching function that is linear on each portion that does not cut the grid. The value of the energy E(φ) is calculated at each point of the grid depending on the values at previous points, and the minimum is chosen. This procedure is iterated over all choices for the origins of the curves. The algorithm is described in more detail in [Trouvé and Younes, 2000].
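To make the idea concrete, here is a deliberately simplified dynamic programming sketch. It is not the algorithm of [Trouvé and Younes, 2000]: it restricts the matching to grid nodes with a small, hypothetical set of allowed steps (`STEPS`), uses the non-robust measure of Equation 1 with α = 0, and assumes fixed origins. The function name `dp_match` is likewise an assumption of this example.

```python
import numpy as np

# Allowed moves (di, dj) between grid nodes; a hypothetical, small step set.
STEPS = [(1, 1), (1, 2), (2, 1), (1, 3), (3, 1)]

def dp_match(theta, theta_tilde):
    """Monotone piecewise-linear matching by dynamic programming (sketch).

    Maximizes a discretization of the integral in Equation 1, which is
    equivalent to minimizing 2 * arccos(integral).
    Returns (distance, path) with path a list of grid nodes (i, j).
    """
    n, m = len(theta), len(theta_tilde)
    score = np.full((n + 1, m + 1), -np.inf)
    back = np.zeros((n + 1, m + 1, 2), dtype=int)
    score[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            for di, dj in STEPS:
                pi, pj = i - di, j - dj
                if pi < 0 or pj < 0 or score[pi, pj] == -np.inf:
                    continue
                r = theta[pi] - theta_tilde[pj]
                # sqrt(phi_s) * ds on a linear piece equals sqrt(di * dj / (n * m))
                gain = np.sqrt(di * dj / (n * m)) * np.cos(r / 2.0)
                if score[pi, pj] + gain > score[i, j]:
                    score[i, j] = score[pi, pj] + gain
                    back[i, j] = (pi, pj)
    if score[n, m] == -np.inf:
        raise ValueError("end node unreachable with the allowed steps")
    path = [(n, m)]
    while path[-1] != (0, 0):
        i, j = path[-1]
        path.append((int(back[i, j][0]), int(back[i, j][1])))
    path.reverse()
    distance = 2.0 * np.arccos(np.clip(score[n, m], -1.0, 1.0))
    return distance, path
```

Matching a signal with itself should return the diagonal path and a distance close to zero; iterating over the origins of the curves, as described above, would wrap this function in a loop over circular shifts of one of the two signals.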

Here we propose to use an incremental iterative minimization, which is shown to be computationally more efficient than dynamic programming in the case of registration without landmarks (see Section 4 for a comparison). At iteration k, given φ^k, we solve for an incremental update φ^{k+1} = φ^k + δφ^k such that δφ^k = argmin_{δφ} E(φ^k + δφ). The algorithm is initialized with the identity function, taken in turn for all choices for the origins of the curves. For each of these initializations, the algorithm iterates two steps:

1. the computation of the robust weights ω_i^k issued from the robust estimator ρ. For instance, the weight issued from the Leclerc estimator is ω_i^k = (2/σ²) exp(−r²(s_i)/σ²), where r(s_i) = θ(s_i) − θ̃(φ^k(s_i)) and σ is the standard deviation of the data errors r,

2. the estimation of δφ^k = {δφ^k(s_i)} as successive solutions of the linearized minimization δφ^k = argmin_{δφ} Σ_i E_i^k. The key approximation of this linearization is: θ̃(φ^{k+1}) = θ̃(φ^k + δφ^k) ≈ θ̃(φ^k) + θ̃_s(φ^k) · δφ^k. For α = 0, the equation we obtain does not have a unique solution. The resulting δφ^k(s_i) for α ≠ 0 is given by:

$$
\begin{aligned}
\delta\phi^k(s_i) &= \frac{N(s_i)}{D(s_i)} \\
S(s_i) &= \sqrt{\phi^k(s_{i+1}) - \phi^k(s_{i-1})} \\
g(s_i) &= (1 - \alpha) \sin\left( \frac{\omega_i^k \, r(s_i)}{2} \right) \left[ \tilde{\theta}(\phi^k(s_i)) - \tilde{\theta}(\phi^k(s_{i-1})) \right] \\
N(s_i) &= -S(s_i) \, g(s_i) \cos\left( \frac{\omega_i^k \, r(s_i)}{2} \right) + 2\alpha \left[ 2\phi^k(s_i) - \phi^k(s_{i-1}) - \phi^k(s_{i+1}) - \delta\phi^k(s_{i-1}) - \delta\phi^{k-1}(s_{i+1}) \right] \\
D(s_i) &= \frac{1}{2} S(s_i) \, g^2(s_i) - 4\alpha
\end{aligned}
\qquad (6)
$$

4. Shape matching performances

To study the impact of adding the robust criterion and the regularization term, we test the matching process on synthetic contours (one contour is obtained by applying a known transformation to the other one). Some examples of these synthetic shapes are given in Figure 3, along with a representation of the transformation function φ that was used.

{Figure 3 goes here}

In Figure 4 we report the mean square error MSE = E[(θ − θ̃(φ))²] obtained for different values of α ∈ [0, 1]. This result is obtained with the dynamic programming algorithm. For high values of α, the regularity term is favored over the similarity measure and the alignment is attained with high values of MSE. For small values of α, the robust algorithm yields solutions with smaller errors, corresponding to MSE_φ = E[|φ_applied − φ_estimated|²] ≈ 0.001. The gain (footnote 1) due to the robust solution is shown in Figure 4(b); this gain is largest for α = 0 and reaches 90%. The aligned shapes given in Figures 4(c) and 4(d) show the superiority of the robust solution. This behavior has been consistently verified by testing many transformation functions with different shapes.

{Figure 4 goes here}

Using the incremental iterative scheme, the minimization leads to the same optimum as dynamic programming except for α = 0 (Figure 5). For the iterative scheme the regularity term is necessary: α must have a nonzero value to lead to a unique solution. Experimentally, a value of α in the range [0.1, 0.2] is optimal.

{Figure 5 goes here}

{Figure 6 goes here}

In Figure 6, we report another synthetic test shape, obtained by applying an occlusion to the shape given in Figure 3(c). The results of its matching with the reference shape given in Figure 3(a) are reported in Figures 7 and 8. The robust algorithm copes better with the occlusion: it is still able to align the curves and to find the applied transformation with minor errors. The transformation found by the non-robust algorithm (Figure 7(b)) is far from the real one (Figure 3(b)).

{Figure 7 goes here}

{Figure 8 goes here}

The effect of the robust solution is more visible when we analyze the evolution of the algorithm through the initializations taken in turn for all choices for the origins of the curves. For initializations at points which are far from the correct solution, we have noticed that the mean square error MSE decreases through iterations to attain the optimum with the robust algorithm, while it stays high with the non-robust algorithm. Without the robust estimator, the minimization converges to a local minimum. Hence, the robust algorithm can be carried out with only one initialization for one choice for the origins of the curves. For the synthetic shapes given in Figure 3, we report in Table 1 the optimal MSEs and the gain due to the robust solution for some initializations at points which are far from the correct solution, at different angles (30°, 45°, 90° and 135°). One can see that the robust algorithm always converges to the global minimum, in contrast to the non-robust one.

Footnote 1: the gain is defined as (MSE_NonRobust − MSE_Robust) / MSE_NonRobust × 100.
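For reference, the mean square error and the relative gain of the robust solution over the non-robust one (in percent) are straightforward to compute. The helper names below are hypothetical, and the numbers in the usage note are illustrative values, not results from the paper.

```python
import numpy as np

def mse(theta, theta_tilde_warped):
    """Mean square error between a reference signal and an aligned signal."""
    diff = np.asarray(theta, dtype=float) - np.asarray(theta_tilde_warped, dtype=float)
    return float(np.mean(diff ** 2))

def robust_gain(mse_non_robust, mse_robust):
    """Relative MSE reduction of the robust solution, in percent."""
    return (mse_non_robust - mse_robust) / mse_non_robust * 100.0
```

For instance, with made-up errors of 0.010 without the robust estimator and 0.001 with it, `robust_gain(0.010, 0.001)` evaluates to approximately 90.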

Finally, the iterative method is also computationally more efficient in the case of shapes without landmarks. Dynamic programming needs a relatively longer time. For example, for the synthetic contours considered here, this time reaches 9.7 times that required by the robust iterative scheme.

{Table 1 goes here}

5. Distance-based shape classification

Here, we exploit shape geodesics for shape classification and propose to compare shapes on the basis of a metric that takes shape matching into consideration. The similarity measure used in Equation 5 is taken as the cost of deformation of the aligned shape. On the basis of a general algebraic and variational framework, [Younes, 2000] has proved that the constructed cost function meets all the conditions necessary for a true distance between planar curves. Formally, the distance between two shapes S1 and S2 is defined as:

$$ d(S_1, S_2) = E_D(S_1, S_2(\phi^*)) \quad \text{where} \quad \phi^* = \arg\min_{\phi} E(S_1, S_2, \phi) \qquad (7) $$

In this work, a hierarchical characterization is obtained by combining shape matching at different sampling resolutions. Here, the scale is related to the resolution of shape sampling, as in [Attalla and Siy, 2005].

In order to handle both local and global variabilities, the distance used for shape comparison is a combination of distances measured at different scales. The final distance between shapes S1 and S2 used for shape classification is defined as follows:

$$ d(S_1, S_2) = \frac{1}{N} \sum_{k=1}^{N} d_k(S_1, S_2) \qquad (8) $$

where d_k is the distance defined in Equation 7 between the same shapes at the k-th scale and N is the number of considered scales.
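The multiscale combination of Equation 8 can be sketched as follows, assuming each scale corresponds to resampling the closed contour with a given number of equally spaced points. The function names and the `single_scale_distance` callable (standing in for the distance of Equation 7) are hypothetical; the default scales are the sampling resolutions used in the experiments of Section 7.

```python
import numpy as np

def resample_closed(points, n):
    """Resample a closed contour to n points equally spaced in arc length."""
    pts = np.asarray(points, dtype=float)
    closed = np.vstack([pts, pts[:1]])                 # repeat first vertex to close the loop
    seg = np.hypot(*np.diff(closed, axis=0).T)         # segment lengths
    cum = np.concatenate(([0.0], np.cumsum(seg)))      # cumulative arc length at each vertex
    targets = np.linspace(0.0, cum[-1], n, endpoint=False)
    x = np.interp(targets, cum, closed[:, 0])
    y = np.interp(targets, cum, closed[:, 1])
    return np.column_stack([x, y])

def multiscale_distance(shape1, shape2, single_scale_distance, scales=(32, 48, 64, 192)):
    """Average of single-scale distances over several sampling resolutions (Equation 8)."""
    distances = [
        single_scale_distance(resample_closed(shape1, n), resample_closed(shape2, n))
        for n in scales
    ]
    return sum(distances) / len(distances)
```

Passing the distance as a callable keeps the sketch independent of how the matching itself is solved.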

Assuming we are provided with a set of categorized shapes (S_l, C_l), where S_l is the shape of the l-th sample in the database and C_l its class, a new shape S is classified using a nearest neighbor criterion.

6. Distance-based shape retrieval

In addition to shape classification performance, we also address shape retrieval [Del Bimbo and Pala, 1999]. A retrieval problem consists in determining which shapes in the considered database are the most similar to a query shape. The classification accuracy of a shape descriptor does not necessarily give a relevant indication of its retrieval efficiency [Kunttu et al., 2006]. As for classification, the distance used for shape retrieval is the distance defined in Equation 8.

7. Comparison to other schemes

To compare the proposed approach to state-of-the-art shape recognition approaches, we evaluate it in shape classification and retrieval experiments on part B of the MPEG-7 shape database [Jeannin and Bober, 1999]. This database is composed of a large number of different types of shapes: 70 classes of shapes with 20 examples of each class, for a total of 1400 shapes. The classes of shapes include natural and artificial objects. Shape recognition on this database is not simple because it contains outliers: some samples are visually dissimilar from other members of their own class (Figure 9). Furthermore, some shapes are highly similar to examples of other classes (Figure 10).

{Figure 9 goes here}

{Figure 10 goes here}

We do not discuss edge detection here; it is a standard step in image analysis. The dataset of shape outlines is obtained by an automated extraction of the outlines using the Matlab image processing toolbox (website: http://www.mathworks.com/products/image/).

To be invariant to flip transformations, the optimal matching between two shapes results from Equation 5, where matching costs are computed between the first shape and the second shape, either flipped or not.
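Flip invariance of this kind can be sketched by keeping the smaller of the two matching costs. In the example below, `matching_cost` is a hypothetical callable standing in for the minimized energy of Equation 5, and mirroring about the vertical axis is an arbitrary choice, since the matching is already invariant to rotation.

```python
import numpy as np

def flip_contour(points):
    """Mirror a contour about the vertical axis, keeping its traversal orientation."""
    pts = np.asarray(points, dtype=float)
    mirrored = pts * np.array([-1.0, 1.0])   # x -> -x
    return mirrored[::-1]                    # mirroring reverses orientation; undo it

def flip_invariant_cost(shape1, shape2, matching_cost):
    """Smallest matching cost between shape1 and shape2, with shape2 flipped or not."""
    return min(matching_cost(shape1, shape2),
               matching_cost(shape1, flip_contour(shape2)))
```

Applying `flip_contour` twice returns the original point sequence, so the operation does not need to be applied to both shapes.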

Shape representation is given by points equally sampled along the boundary. Shape sampling at different scales with 32, 48, 64 and 192 points is considered.

Classification rates are obtained with the leave-one-out method, where each shape in turn is left out of the training set and used as a query image. Retrieval accuracy is measured by the so-called Bull's eye test [Jeannin and Bober, 1999]: for every image in the database, the top 40 most similar shapes are retrieved. At most 20 of the 40 retrieved shapes are correct hits. The retrieval accuracy is measured as the ratio of the total number of correct hits over all images to the highest possible number of hits, which is 20 × 1400.
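Both evaluation protocols can be computed from a precomputed pairwise distance matrix. The sketch below assumes such a matrix `dist` (for example filled with the distance of Equation 8) and an array of class labels; the function names are hypothetical. In the Bull's eye computation the query itself is counted among the retrieved shapes, which is consistent with a maximum of 20 hits for classes of 20 shapes.

```python
import numpy as np

def leave_one_out_accuracy(dist, labels):
    """Nearest-neighbor classification rate with each shape used in turn as the query."""
    d = np.array(dist, dtype=float)           # copy, so the caller's matrix is untouched
    np.fill_diagonal(d, np.inf)               # a query may not be matched to itself
    labels = np.asarray(labels)
    nearest = np.argmin(d, axis=1)
    return float(np.mean(labels[nearest] == labels))

def bulls_eye_score(dist, labels, top=40, per_class=20):
    """Ratio of correct hits among the `top` nearest shapes to the maximum possible."""
    d = np.asarray(dist, dtype=float)
    labels = np.asarray(labels)
    order = np.argsort(d, axis=1, kind="stable")[:, :top]   # includes the query itself
    hits = int((labels[order] == labels[:, None]).sum())
    return hits / (per_class * len(labels))
```

On the database described above, `top=40` and `per_class=20` reproduce the denominator 20 × 1400.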

As mentioned in Section 4, the best shape matching in terms of mean square error is obtained for α = 0.1. The results of shape classification carried out on this database do not change significantly (±0.01%) when α is taken in the range [0.05, 0.2]. Note that the value of α intervenes in the convergence of the shape matching and not in the expression of the distance of Equation 8. In Figure 11 we report the variation of the correct shape classification rate with respect to α.

{Figure 11 goes here}

{Table 2 goes here}

The proposed approach based on shape geodesics has been compared to state-of-the-art schemes on the benchmark dataset, as reported in Table 2. The proposed approach outperforms the reported schemes with a correct classification rate of 98.86%, corresponding to a gain in correct classification rate of between 0.3% and 17%. Regarding the bull's eye test, a score of 89.05% is reached. This is 1.35% higher than the best result reported previously. The highest scores among previous works are those of methods based on shape matching and/or hierarchical analysis; this supports the design choices of the proposed approach.

In order to analyse the results presented in Table 2, we describe the methods listed above, specifying their similarities to and differences from the proposed method.

Zernike moments [Kim and Kim, 2000] are the most potent moments for shape description among the region-based descriptors. They are orthogonal moments which represent the shape information optimally. However, the computation of Zernike polynomials remains difficult and complex. The shape representation is global in this approach, in the sense that each moment holds information about all shape points and the shape comparison is not spatially local.

The curvature scale space (CSS) is a boundary representation introduced in [Mokhtarian and Mackworth, 1986]. It is invariant under affine transforms. This method is based on finding points of inflection on the curve at various levels of detail. The CSS representation uses multiple resolutions resulting from an iterated smoothing of the boundary. Compared to other tools, the CSS has a relatively low shape recognition accuracy and efficiency [Zhang and Lu, 2003]. Although this shape representation is local and multiscale, a key difference between this method and the one described in this paper is that the comparison between shapes in the CSS representation considers points of zero curvature only, not all points.

Many techniques based on Fourier descriptors have been proposed for shape recognition. The method proposed in [Arbter et al., 1990] transforms a parametrized boundary description into the Fourier domain to get a set of coefficients. These coefficients are normalized to eliminate dependencies on affine transformations and on the starting point. A multiscale Fourier-based approach has been proposed in [Kunttu et al., 2006] to improve the shape classification rate. Besides, elliptic Fourier descriptors [Nixon and Aguado, 2007] are among the robust boundary-based shape descriptors. Despite the fact that some Fourier methods are multiscale and invariant under affine transformations, they remain global, as the corresponding descriptors are derived from a calculation including all points of the boundary, or the entire object in the case of 2D Fourier descriptors.

Visual parts are used for shape matching in [Latecki and Lakamper, 2000]. This approach is boundary-based and uses a local representation of shapes. The comparison between shapes also relies on shape matching, but the matching in this approach is a correspondence between convex/concave arcs of the studied boundaries.

Shape context [Belongie et al., 2002] was developed as a local descriptor for finding correspondences between point sets. A shape is represented by a discrete set of points sampled from the contour of the object. Given a set of points, the shape context captures the relative distribution (distance and orientation) of points in the plane relative to each point in the shape. Shape contexts have been used as attributes for a weighted bipartite matching problem. In order to improve the classification of articulated shapes, shape contexts have been modified [Ling and Jacobs, 2007] by considering the geodesic distance of the contour instead of the Euclidean distance. This object-based approach requires the definition of landmarks in the objects for the correspondence.

The inner-distance is defined as the length of the shortest path between shape landmarks. It has been used to characterize shapes in [Ling and Jacobs, 2007].

A recent technique [Daliri and Torre, 2008] represents shapes using a string of symbols, and shape recognition is performed by operations on this string of symbols. It is a local boundary-based approach.

The shape tree approach [Felzenszwalb and Schwartz, 2007] is based on a hierarchical representation of the sampled points of the curves. A shape-tree is constructed for each curve and the curves are matched by looking for a mapping from points in a curve to points in the other one such that the shape-tree of the curve is deformed as little as possible.

The fixed correspondence approach [Super, 2006] and the Racer algorithm [Super, 2003] are boundary-based methods with a local description of curves. Boundary matching is carried out in these approaches using key points: points of local maximum or minimum curvature. These methods analyse shapes at a single scale. Chance probability functions [Super, 2006] are used for learning the classification process in order to improve recognition performance.

Wavelet transforms have been widely used in image analysis as multiscale tools [Chuang and Kuo, 1996]. For shape representation, wavelets are boundary-based local descriptors. However, they are not suitable for describing shapes because the corresponding descriptors are not rotation invariant [Yang et al., 1998].

The approach proposed in this paper is invariant to geometric transformations (translation, rotation and scaling) and exploits local shape features. In particular, high curvature points play a key role. This local setting also makes the use of landmarks simpler when needed. Landmarks are simply considered as points for which φ(s) is known. These landmarks can be detected automatically or set by experts depending on the application.

Another important property of the proposed metric, compared to others proposed for shape matching, is that it is symmetric: registering one shape onto the other yields the same matching as the reverse registration. In both cases we look for the path of minimal deformation cost aligning the two shapes, which ensures a symmetric treatment of the curves.

In Figure 12 we report images of some objects from different classes. These shapes are highly similar; their curvature differs at a small number of data points only. Experimentally, we notice that the use of the robust criterion leads to these data points being treated as outliers. For example, if we focus on the 20 nearest neighbors of the samples of the class spoon, more than 50% are elements of the classes watch, pencil, key and bottle. However, if we use the similarity measure without the robust weights, 95% of the 20 nearest neighbors are of the same class, spoon. Using robust weights, the average retrieval accuracy is penalized by the low accuracies obtained for these 6 classes, but it remains higher than without the robust weights.

{Figure 12 goes here}

Future work will explore the combination of the proposed approach with kernel-based statistical learning. Recently, the authors of [Yang et al., 2008] proposed to combine classical metrics with learning through graph transduction. It has been shown that this approach yields significant improvements in retrieval accuracy. For example, the retrieval rate using the IDSC [Ling and Jacobs, 2007] is improved by 5.6% when combined with learning by graph transduction. This research direction will be investigated in future work.

Acknowledgement

The authors would like to thank Jean Le Bihan for fruitful discussions.

References

Arbter, K., Snyder, W., Burkhardt, H., Hirzinger, G., 1990. Application of affine-invariant Fourier descriptors to recognition of 3-d objects. IEEE Transactions on Pattern Analysis and Machine Intelligence 12 (7), 640–647.

Attalla, E., Siy, P., 2005. Robust shape similarity retrieval based on contour segmentation polygonal multiresolution and elastic matching. Pattern Recognition 38 (12), 2229–2241.

Belongie, S., Malik, J., Puzicha, J., 2002. Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (4), 509–522.

Black, M., Rangarajan, A., 1996. On the unification of line processes, outlier rejection and robust statistics with applications in early vision. Computer Vision 19 (5), 57–92.

Chuang, G. C., Kuo, C., 1996. Wavelet descriptor of planar curves: theory and applications. IEEE Transactions on Image Processing 5 (1), 56–70.

Costa, L. F., Cesar, R. M., 2001. Shape analysis and classification, theory and practice. CRC Press, Boca Raton, Florida.

Daliri, M. R., Torre, V., 2008. Robust symbolic representation for shape recognition and retrieval. Pattern Recognition 41 (5), 1799–1815.

Del Bimbo, A., Pala, P., 1999. Shape indexing by multiscale representation. Image and Vision Computing 17 (3), 245–261.

Diplaros, A., Milios, E., 2002. Matching and retrieval of distorted and occluded shapes using dynamic programming. IEEE Transactions on Pattern Analysis and Machine Intelligence 24 (11), 1501–1516.

Direkoglu, C., Nixon, M., 2008. Shape classification using multiscale Fourier-based description in 2-d shape. In: ICSP’08: Proceedings of the 9th International Conference on Signal Processing. Vol. 1. pp. 820–823.

Felzenszwalb, P. F., Schwartz, J. D., 2007. Hierarchical matching of deformable shapes. In: CVPR’07: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 1–8.

Jeannin, S., Bober, M., 1999. Description of Core Experiments for MPEG-7 Motion/Shape. MPEG7, ISO/IEC JTC1/SC29/WG11 N2690, document N2690, Seoul.

Kim, W. Y., Kim, Y. S., 2000. A region-based shape descriptor using Zernike moments. Signal Processing: Image Communication 16, 95–102.

Kunttu, I., Lepistö, L., Rauhamaa, J., Visa, A., 2006. Multiscale Fourier descriptors for defect image retrieval. Pattern Recognition Letters 27 (2), 123–132.

Latecki, L. J., 2002. Application of planar shape comparison to object retrieval in image databases. Pattern Recognition 35 (1), 15–29.

Latecki, L. J., Lakamper, R., 2000. Shape similarity measure based on correspondence of visual parts. IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (10), 1185–1190.

Lin, I. J., Kung, S. Y., 1997. Coding and comparison of DAG’s as a novel neural structure with applications to on-line handwriting recognition. IEEE Transactions on Signal Processing 45 (11), 2701–2708.

Ling, H., Jacobs, D. W., 2007. Shape classification using the inner-distance. IEEE Transactions on Pattern Analysis and Machine Intelligence 29 (2), 286–299.

McNeill, G., Vijayakumar, S., 2006. Hierarchical Procrustes matching for shape retrieval. In: CVPR’06: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Vol. 1. pp. 885–894.

Mokhtarian, F., Abbasi, S., Kittler, J., 1996. Efficient and robust retrieval by shape content through curvature scale space. In: Proceedings of International Workshop on Image DataBases and Multimedia Search. pp. 35–42.

Mokhtarian, F., Bober, M., 2003. Curvature scale space representation: theory, applications, and MPEG-7 standardization. Kluwer Academic Publishers, Norwell, MA, USA.

Mokhtarian, F., Mackworth, A. K., 1986. Scale-based description and recognition of planar curves and two-dimensional shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence 8 (1), 34–43.

Nixon, M. S., Aguado, A., 2007. Feature extraction and image processing. Academic Press.

Sebastian, T. B., Kimia, B. B., 2005. Curves vs. skeletons in object recognition. Signal Processing 85 (2), 247–263.

Sebastian, T. B., Klein, P. N., Kimia, B. B., 2003. On aligning curves. IEEE Transactions on Pattern Analysis and Machine Intelligence 25 (1), 116–125.

Super, B., 2003. Improving object recognition accuracy and speed through nonuniform sampling. In: SPIE’03: Proceedings of the Society of Photo-Optical Instrumentation Engineers Conference. Vol. 5267. pp. 228–239.

Super, B. J., 2006. Retrieval from shape databases using chance probability functions and fixed correspondence. Pattern Recognition and Artificial Intelligence 20 (8), 1117–1138.

Trouvé, A., Younes, L., 2000. Diffeomorphic matching problems in one dimension: Designing and minimizing matching functionals. In: ECCV ’00: Proceedings of the 6th European Conference on Computer Vision-Part I. Springer-Verlag, London, UK, pp. 573–587.

Veltkamp, R. C., 2001. Shape matching: similarity measures and algorithms. In: SMI 2001: International Conference on Shape Modeling and Applications. pp. 188–197.

Veltkamp, R. C., Hagedoorn, M., 2001. State of the art in shape matching, 87–119.

Yang, H. S., Lee, S. U., Lee, K. M., 1998. Recognition of 2d object contours using starting-point-independent wavelet coefficient matching. Visual Communication and Image Representation 9 (2), 171–181.

Yang, X., Bai, X., Latecki, L. J., Tu, Z., 2008. Improving shape retrieval by learning graph transduction. In: ECCV’08: Proceedings of the European Conference on Computer Vision. Vol. 4. pp. 788–801.

Younes, L., 2000. Optimal matching between shapes via elastic deformations. Image and Vision Computing 17 (5), 381–389.

Zhang, D., Lu, G. A., 2003. Comparative study of curvature scale space and Fourier descriptors for shape-based image retrieval. Visual Communication and Image Representation 14 (1), 41–60.

Zhang, J., Zhang, X., Krim, H., Walter, G., 2003. Object representation and recognition in shape spaces. Pattern Recognition 36 (5), 1143–1154.

(a) Starting curve (b) Interpolated curves: geodesic path from 1(a) to 1(c) in the shape space (c) Final curve

Figure 1: Deformation path from Fig. 1(a) to Fig. 1(c).

(a) The mapping function for the two shape outlines Γ and Γ̃ depicted in 2(b), as a monotonic function which matches a curvilinear abscissa between 0 and 1 on Γ̃ to a curvilinear abscissa on Γ (b) The visualisation of the mapping function φ as a 2D outline matching

Figure 2: Example of contour matching.

(a) Reference curve (b) Applied transformation (c) Curve to be aligned

Figure 3: Test on synthetic shapes. We have applied a known transformation (3(b)) to the shape of 3(a) to get the shape 3(c).

Table 1: Optimal MSEs obtained by the robust and the non-robust algorithms, with the gain due to the robust solution, for initializations of φ at points which are far from the correct solution, from different angles. This experiment is carried out on the synthetic shapes given in Figure 3. The gain is defined as $\mathrm{Gain} = \dfrac{\mathrm{MSE}_{NonRobust} - \mathrm{MSE}_{Robust}}{\mathrm{MSE}_{NonRobust}} \times 100$.

| Angle | MSE_NonRobust | MSE_Robust | Gain |
|---|---|---|---|
| 35° | 0.293 | 0.087 | 70.30% |
| 45° | 8.66 | 0.089 | 98.97% |
| 90° | 0.296 | 0.085 | 71.28% |
| 135° | 1.78 | 0.086 | 95.17% |
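As a consistency check on the gain definition used in Table 1, the gains can be recomputed from the tabulated MSE values. The short Python sketch below is only illustrative (the function name `robust_gain` is ours). Because the tabulated MSEs are rounded, the recomputed gains may differ from the reported ones in the last digit.

```python
def robust_gain(mse_non_robust, mse_robust):
    """Relative MSE reduction in percent, following the gain definition of Table 1."""
    return (mse_non_robust - mse_robust) / mse_non_robust * 100.0


# (angle in degrees, MSE non-robust, MSE robust, gain reported in Table 1)
TABLE_1 = [
    (35, 0.293, 0.087, 70.30),
    (45, 8.66, 0.089, 98.97),
    (90, 0.296, 0.085, 71.28),
    (135, 1.78, 0.086, 95.17),
]

recomputed = {angle: robust_gain(nr, r) for angle, nr, r, _ in TABLE_1}
for angle, _, _, reported in TABLE_1:
    print(f"{angle:>3} deg: recomputed {recomputed[angle]:.2f}%, reported {reported:.2f}%")
```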

(a) MSE (rad²) versus α values (b) Gain due to the robust algorithm (c) Aligned curve with the robust algorithm for α = 0.1 (d) Aligned curve with the non-robust algorithm for α = 0.1

Figure 4: Results of shape matching on the synthetic contours depicted in Figure 3 using dynamic programming for different values of α ∈ [0, 1].

Figure 5: Results of shape matching on synthetic contours depicted in Figure 3 using the iterative scheme for different values of α ∈]0, 1]. The iterative algorithm leads to the same optimum as the dynamic programming (Figure 4(a)).

Figure 6: Test on synthetic shapes. Occluded shape obtained from the shape 3(c).


(a) Transformation found with the robust algorithm for α = 0.1 (b) Transformation found with the non-robust algorithm for α = 0.1 (c) MSE versus α values (d) Gain due to the robust algorithm

Figure 7: Results of shape matching using the iterative scheme for different values of α ∈ ]0, 1]. Here we register the occluded shape of Figure 6 with respect to the reference 3(a).

(a) Aligned curve with the robust algorithm (b) Aligned curve with the non-robust algorithm

Figure 8: Results of shape matching. Shapes aligned by the robust and non-robust algorithms; the reference shape is given in Figure 3(a) and the shape to be aligned in Figure 6.

(a) Dogs

(b) Apples

(c) Beetles

(d) Elephants

(e) Flies

(f) Hats

(g) Horses

(h) Spoons

Figure 9: Examples of shapes that are visually dissimilar from other samples of their own class.


(a) Apple/octopus (b) Sea snake/lizzard (c) Deer/horse (d) Hat/device3

Figure 10: Examples of pairs of shapes from different classes that are highly similar.

Figure 11: The correct classification rate (in %) on the MPEG-7 shape database versus the values of α (α is the coefficient that controls the regularity of the solution).

(a) Watch (b) Spoon (c) Pencil (d) Lmfish (e) Key (f) Bottle

Figure 12: Examples of shapes from different classes with highly similar curvature.

Table 2: Recognition accuracy measured as nearest neighbor classification rate and retrieval accuracy measured by the bull’s eye test on the MPEG-7 shape database.

| Method | Retrieval accuracy | Classification rate |
|---|---|---|
| Proposed scheme | 89.05% | 98.86% |
| String of symbols [Daliri and Torre, 2008] | 85.92% | 98.57% |
| Zernike moments [Direkoglu and Nixon, 2008; Kim and Kim, 2000] | 70.22% | 90% |
| Multiscale FD 2D [Direkoglu and Nixon, 2008] | NA | 95.5% |
| Elliptic FD [Direkoglu and Nixon, 2008; Nixon and Aguado, 2007] | NA | 82% |
| Shape tree [Felzenszwalb and Schwartz, 2007] | 87.7% | NA |
| Inner-distance shape context (IDSC) [Ling and Jacobs, 2007] | 85.40% | NA |
| Fixed correspondence + aggregated-pose chance probability functions [Super, 2006] | 84% | 97.4% |
| Fixed correspondence + chance probability functions [Super, 2006] | 83.04% | 97.2% |
| Fixed correspondence [Super, 2006] | 80.78% | 97% |
| Hierarchical Procrustes matching [McNeill and Vijayakumar, 2006] | 86.35% | 95.71% |
| Multilayer eigenvectors [Super, 2006] | 70.33% | NA |
| Normalized squared distance [Super, 2003] | 79.36% | 96.9% |
| Racer [Super, 2003] | 79.09% | 96.8% |
| Optimized CSS [Mokhtarian and Bober, 2003] | 81.12% | NA |
| Curve edit distance [Sebastian et al., 2003] | 78.17% | NA |
| Shape context [Belongie et al., 2002] | 76.51% | NA |
| Parts correspondence [Latecki, 2002; Latecki and Lakamper, 2000] | 76.45% | NA |
| Visual parts [Latecki and Lakamper, 2000] | 76.45% | NA |
| Curvature Scale Space [Mokhtarian et al., 1996] | 75.44% | NA |
| Wavelet [Chuang and Kuo, 1996] | 67.76% | NA |
| Skeleton DAG [Lin and Kung, 1997] | 60% | NA |
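The two scores reported in Table 2 can be computed from a matrix of pairwise shape distances. In the bull’s eye test as commonly applied to the MPEG-7 shape database (1400 shapes, 70 classes of 20 shapes), each shape is used as a query, the same-class shapes among its 40 most similar shapes (query included) are counted, and the score is the ratio of the total count to the best possible total. The Python sketch below illustrates this under stated assumptions: the function names, the toy data, the default window of twice the class size, and the leave-one-out form of the nearest neighbor rate are ours and may differ in detail from the protocols used by the methods compared in Table 2.

```python
import numpy as np


def bulls_eye_score(dist, labels, window=None):
    """Bull's eye retrieval score from an (n, n) distance matrix.

    For each query, count same-class shapes among the `window` closest shapes
    (the query itself included). By default the window is twice the size of
    the query's class, i.e. 40 for the MPEG-7 classes of 20 shapes.
    """
    dist = np.asarray(dist, dtype=float)
    labels = np.asarray(labels)
    hits, possible = 0, 0
    for i in range(len(labels)):
        class_size = int(np.sum(labels == labels[i]))
        w = window if window is not None else 2 * class_size
        top = np.argsort(dist[i], kind="stable")[:w]
        hits += int(np.sum(labels[top] == labels[i]))
        possible += class_size
    return hits / possible


def nearest_neighbor_rate(dist, labels):
    """Leave-one-out 1-NN classification rate (assumed protocol)."""
    d = np.asarray(dist, dtype=float).copy()
    labels = np.asarray(labels)
    np.fill_diagonal(d, np.inf)  # a shape may not be matched to itself
    return float(np.mean(labels[np.argmin(d, axis=1)] == labels))


# Toy example: scalar "shapes" whose distance is the absolute difference.
x = np.array([0.0, 1.0, 10.0, 11.0, 20.0, 3.0])
toy_labels = ["a", "a", "b", "b", "c", "c"]
toy_dist = np.abs(x[:, None] - x[None, :])
print(bulls_eye_score(toy_dist, toy_labels))
print(nearest_neighbor_rate(toy_dist, toy_labels))
```

Both functions only need the distance matrix, so the same evaluation applies to any of the similarity measures compared in the table.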