From Contours to 3D Object Detection and Pose Estimation

Nadia Payet and Sinisa Todorovic

Wednesday, November 30, 11


Problem Statement

Given a single image:
1. Detect an object of interest
2. Delineate its boundaries
3. Estimate its continuous 3D pose

Prior Work

Generative models, e.g., aspect graphs:
Koenderink & van Doorn 79; Kushal et al. 04; Savarese & Fei-Fei 07-09; Arie-Nachimson & Basri 09; Hu & Zhu 10

Discriminative models, e.g., structured prediction:
Hoiem et al. 07; Su et al. ICCV 09; Ozuysal et al. 09; Liebelt & Schmid 08-10; Gu & Ren 10

Main characteristics of recent work:
• Local image features
• Sophisticated models
• 3D pose = Interpolation of viewpoint classes

To Bridge the Semantic Gap...

[Diagram, built up over four slides: two stacks of representation levels, listed top to bottom]

Recent work, typically: semantic level, model, gap, local features, pixels

Our approach: semantic level, model, BoBs, contours (mid-level features), pixels

Prior work (contours): Lowe & Binford 85; Cyr & Kimia 04

Prior work (BoBs): Zhu et al. 08; Zhang et al. 11

Bags of Boundaries = BoBs

If an object occurs, it must lie in the spotlight of many BoBs that jointly support the occurrence hypothesis.

Bags of Boundaries = BoBs

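$$
s \;=\; \underbrace{\big[\;\cdot\;\big]}_{\#\,\text{bins} \;\times\; \#\,\text{contours}} \;\; \underbrace{\big[\;\cdot\;\big]}_{\#\,\text{contours} \;\times\; 1}
$$

s: shape context histogram of boundaries
vector (# contours × 1): latent indicator of boundaries

Zhu et al. 08, Zhang et al. 11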
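As a minimal sketch of the descriptor on this slide: if the # bins × # contours matrix is assumed to hold one shape-context histogram per contour in its columns, the BoB histogram is a matrix-vector product with the latent indicator. The names `V`, `x`, and `bob_descriptor` are hypothetical, and the column layout is an assumption rather than something the slides state.

```python
def bob_descriptor(V, x):
    """Return s = V x.

    V: list of rows, shape (#bins x #contours); column c is assumed to be
       the shape-context histogram contributed by contour c.
    x: list of length #contours; x[c] in [0, 1] indicates whether contour c
       is treated as an object boundary.
    """
    n_contours = len(x)
    if any(len(row) != n_contours for row in V):
        raise ValueError("each row of V needs one entry per contour")
    return [sum(row[c] * x[c] for c in range(n_contours)) for row in V]


# Toy example: 3 histogram bins, 4 contours.
V = [
    [2, 0, 1, 5],
    [0, 3, 1, 0],
    [1, 1, 0, 4],
]
x = [1, 1, 0, 0]  # only the first two contours are selected as boundaries
s = bob_descriptor(V, x)
print(s)  # [2, 3, 2]
```

Under the same assumption, setting every entry of `x` to 1 gives the histogram over all contours in the window, with nothing left to infer.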

Bags of Boundaries vs. Bags-of-Words

BoBs: Histogram of hidden features that must be inferred

BoWs: Histogram of observable features

Approach

[Pipeline figure, built up over six slides. Stages shown, in order of appearance:]
• input
• contour extraction (Zhu et al. ICCV07)
• grid of BoBs
• object model
• estimate of 3D pose
• selected boundaries
• grid warping
• output

Object Model = Shape Templates

2D probabilistic maps of shape for a set of viewpoints


Learning

[Figure: training images arranged by viewpoint (view 1, view 2, view 3, ..., view n) and by image (image 1, ..., image m)]

Table top dataset, Sun et al. 10
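One simple way to picture a 2D probabilistic map of shape is as the per-pixel frequency of boundary pixels across the m training images of a single view. The sketch below assumes binary, pre-aligned edge maps of equal size. This is an illustrative assumption and not necessarily the learning procedure used by the authors; the name `shape_template` is hypothetical.

```python
def shape_template(edge_maps):
    """Average binary edge maps into a per-pixel boundary probability map.

    edge_maps: list of m images for one view, each a list of rows of 0/1.
    Returns a map of the same size with values in [0, 1].
    """
    if not edge_maps:
        raise ValueError("need at least one edge map")
    h, w = len(edge_maps[0]), len(edge_maps[0][0])
    for e in edge_maps:
        if len(e) != h or any(len(row) != w for row in e):
            raise ValueError("all edge maps must share one size")
    m = len(edge_maps)
    return [[sum(e[r][c] for e in edge_maps) / m for c in range(w)]
            for r in range(h)]


# Toy example: four 2x2 edge maps of the same view.
view_1 = [
    [[0, 1], [1, 0]],
    [[0, 1], [0, 0]],
    [[1, 1], [0, 0]],
    [[0, 1], [1, 0]],
]
template = shape_template(view_1)
print(template)  # [[0.25, 1.0], [0.5, 0.0]]
```

Repeating this per viewpoint would give one map per view, matching the "set of viewpoints" wording on the object model slide.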

Example Shape Templates

AUTOCAD dataset, Liebelt & Schmid 08-10

Representation of the Shape Template

Regular grid of shape-context descriptors + Affine projection matrix T
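The sketch below illustrates the geometric half of this representation: a regular grid of descriptor locations and its image under an affine map. It assumes T is written as a 2×3 matrix (linear part plus translation); the slides do not give the shape of T, so this parameterization and the names `regular_grid` and `apply_affine` are hypothetical.

```python
def regular_grid(rows, cols, spacing=1.0):
    """Return grid points as (x, y) tuples, listed row by row."""
    return [(c * spacing, r * spacing) for r in range(rows) for c in range(cols)]


def apply_affine(T, points):
    """Map each (x, y) with a 2x3 matrix T = [[a, b, tx], [c, d, ty]]."""
    (a, b, tx), (c, d, ty) = T
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]


grid = regular_grid(2, 3, spacing=10.0)
T = [[2.0, 0.0, 5.0],    # stretch x by 2, shift by 5
     [0.0, 0.5, -1.0]]   # squash y by 0.5, shift by -1
warped = apply_affine(T, grid)
print(warped)
# [(5.0, -1.0), (25.0, -1.0), (45.0, -1.0), (5.0, 4.0), (25.0, 4.0), (45.0, 4.0)]
```

Each warped point would carry the shape-context descriptor of the grid point it came from, so only the locations change under T.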


Inference = Matching of BoBs

Matching against template 1, template 2, ..., template n, under an arbitrary affine projection.
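A minimal sketch of choosing among templates, assuming BoB descriptors at corresponding grid cells are already available and are compared with the chi-square histogram distance. Both the fixed cell-to-cell correspondence and the chi-square distance are simplifying assumptions standing in for the actual matching, which also searches over correspondences and the affine projection. All function names are hypothetical.

```python
def chi_square(h1, h2, eps=1e-12):
    """Chi-square distance between two histograms of equal length."""
    return 0.5 * sum((a - b) ** 2 / (a + b + eps) for a, b in zip(h1, h2))


def matching_cost(image_bobs, template_bobs):
    """Sum of descriptor distances over corresponding grid cells."""
    if len(image_bobs) != len(template_bobs):
        raise ValueError("grids must have the same number of cells")
    return sum(chi_square(a, b) for a, b in zip(image_bobs, template_bobs))


def best_template(image_bobs, templates):
    """Return (index of the lowest-cost template, list of all costs)."""
    costs = [matching_cost(image_bobs, t) for t in templates]
    best = min(range(len(costs)), key=costs.__getitem__)
    return best, costs


# Toy example: a 2-cell grid with 3-bin descriptors and two templates.
image_bobs = [[4, 0, 2], [1, 3, 0]]
templates = [
    [[0, 4, 2], [3, 1, 0]],  # template 1
    [[4, 0, 2], [1, 2, 1]],  # template 2
]
best, costs = best_template(image_bobs, templates)
print(best, costs)  # index 1 (template 2) has the lower cost
```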

Example Problem: Object Recognition

Given a set of edges in the image, detect and localize all object instances and estimate their 3D pose.

Payet & Todorovic, ICCV11

Matching Formulation

$$
\min_{X,\,F,\,T} \;\; \mathrm{tr}\{ C^{\mathsf T}(X)\, F \} \;+\; \alpha \,\big\| T Q F^{\mathsf T} - P \big\| \;+\; \beta \,\big\| (T Q F^{\mathsf T} - P)\, W \big\|
$$

$$
\text{s.t.} \quad X \in [0,1]^{N}; \quad T \in \mathcal{T}; \quad F \ge 0; \quad F^{\mathsf T} \mathbf{1}_N = \mathbf{1}_M; \quad F\, \mathbf{1}_M \le \mathbf{1}_N
$$
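To make the terms concrete, the sketch below evaluates an objective of the form tr{CᵀF} + α‖T Q Fᵀ − P‖ + β‖(T Q Fᵀ − P) W‖ for fixed variables and checks the constraints on F. It rests on several assumptions that the slides do not confirm: the Frobenius norm is used, T is 2×2, Q is 2×M, P is 2×N, F is N×M, W is N×N, and C is passed in as a precomputed N×M cost matrix instead of a function of X. All helper names are hypothetical, and the code only evaluates the objective; it does not solve the minimization.

```python
import math


def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def frobenius(A):
    return math.sqrt(sum(v * v for row in A for v in row))


def objective(C, F, T, Q, P, W, alpha, beta):
    # Data term tr(C^T F): elementwise product of C and F, summed.
    data = sum(c * f for c_row, f_row in zip(C, F) for c, f in zip(c_row, f_row))
    # Residual R = T Q F^T - P, a 2 x N matrix.
    R = matmul(matmul(T, Q), transpose(F))
    R = [[r - p for r, p in zip(r_row, p_row)] for r_row, p_row in zip(R, P)]
    return data + alpha * frobenius(R) + beta * frobenius(matmul(R, W))


def is_feasible(F, tol=1e-9):
    """Check F >= 0, F^T 1_N = 1_M (columns sum to 1), F 1_M <= 1_N (rows at most 1)."""
    nonneg = all(v >= -tol for row in F for v in row)
    cols_ok = all(abs(sum(col) - 1.0) <= tol for col in zip(*F))
    rows_ok = all(sum(row) <= 1.0 + tol for row in F)
    return nonneg and cols_ok and rows_ok


# Toy example with N = 3 and M = 2.
C = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]
F = [[1, 0], [0, 1], [0, 0]]
T = [[1, 0], [0, 1]]
Q = [[0, 1], [0, 0]]
P = [[0, 1, 0], [1, 0, 0]]
W = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
value = objective(C, F, T, Q, P, W, alpha=2.0, beta=0.5)
print(round(value, 6), is_feasible(F))  # 2.8 True
```

In this toy case the data term is 0.3 and the residual has unit norm, so the two weighted penalty terms contribute 2.0 and 0.5.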

Results: Object Detection

PASCAL VOC 2006 car dataset

Car show dataset

Results: Viewpoint Classification

3D Object dataset: Cars

Results: 3D Pose Estimation

Correct detection, localization, and pose estimation

Conclusion



Recent work:
• Pre-selected local features
• Sophisticated object models and algorithms

Our approach:
• Mid-level features allow for:
  • Abstracting low-level features
  • Synergistic bottom-up/top-down interaction
• Simple models and algorithms