Depth-Encoded Hough Voting for Joint Object Detection and Shape Recovery
Min Sun
Advisor: Prof. Silvio Savarese
Vision Lab 1
Tea Party
2
Tea Party: Object Detection
Cups, Tea pot
3
Tea Party: Shape is the key
Object 2D Location
Object 3D Shape
4
Robotics approach
1. Identify the object in 3D using active sensors
2. Grasp the object using motion planning
Images from Willow Garage, Inc
5
Our goals
1. Jointly detect an object
2. Infer its 3D shape
Venetian fort
Sun et al. ECCV’10 at Crete, Greece
6
Outline
• Related Work
• Our Method
• Experiments
• Applications
• Conclusion & Future Work
7
Related Work: Part-based models
• Constellation Model: Fergus et al. CVPR’03
• Deformable Part-based Model: Felzenszwalb et al. PAMI’08
• Multi-View Model (Key-views 1, 2, 3): Sun et al. ICCV’09
• Implicit Shape Model: Leibe ECCV’04 workshop
• Pictorial Structure: Felzenszwalb & Huttenlocher IJCV’05
8
Multi-View Model
Key-views 1, 2, 3
• Representation:
 - Dense representation
 - Multi-view generative part-based model
• Learning:
 - Weakly-supervised learning
 - Incremental
Sun et al. ICCV’09
9
Multi-View Model
Sun et al. ICCV’09
10
Related Work: Part-based models
• Constellation Model: Fergus et al. CVPR’03
• Deformable Part-based Model: Felzenszwalb et al. PAMI’08
• Multi-View Model (Key-views 1, 2, 3): Sun et al. ICCV’09
• Implicit Shape Model: Leibe ECCV’04 workshop
• Pictorial Structure: Felzenszwalb & Huttenlocher IJCV’05
11
Related Work: Hough Voting Scheme
• Voting is a general technique where we let the parts vote for all hypotheses that are compatible with them.
• Popular for detecting parameterized shapes
 – Hough’59, Duda & Hart’72, Ballard’81, …
12
Slide Modified from S. Maji
Hough transform P.V.C. Hough, Machine Analysis of Bubble Chamber Pictures, Proc. Int. Conf. High Energy Accelerators and Instrumentation, 1959
Given a set of points, find the curve or line that explains the data points best
[Figure: a line y = mx + n in image space (x, y) corresponds to a point (m, n) in Hough space]
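The voting idea behind the line case can be sketched in a few lines of Python. This is a minimal illustration rather than the original 1959 procedure: the function name `hough_lines`, the discrete list of candidate slopes, and the intercept bin width `n_step` are all assumptions made for the example.

```python
from collections import Counter

def hough_lines(points, slopes, n_step=1.0):
    """Vote in (m, n) space: a point (x, y) supports every line
    y = m*x + n passing through it, i.e. every pair (m, n = y - m*x)."""
    accumulator = Counter()
    for x, y in points:
        for m in slopes:
            n = y - m * x
            # Quantize the intercept so that nearby votes share one cell.
            n_cell = round(n / n_step) * n_step
            accumulator[(m, n_cell)] += 1
    return accumulator

# Four points on y = 2x + 1 plus one outlier (hypothetical data).
points = [(0, 1), (1, 3), (2, 5), (3, 7), (2, 0)]
slopes = [-2, -1, 0, 1, 2, 3]

acc = hough_lines(points, slopes)
(best_m, best_n), votes = acc.most_common(1)[0]
print(best_m, best_n, votes)
```

The cell with the most votes is taken as the line estimate. The slope-intercept form cannot represent vertical lines, which is one reason the polar (rho, theta) parameterization is common in practice.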
13
Hough transform
[Figure: points in image space (x, y) and the accumulator of vote counts over Hough space (m, n)]
14
Generalized Hough transform
• What if we want to detect arbitrary shapes defined by boundary points and a reference point?
[Dana H. Ballard, Generalizing the Hough Transform to Detect Arbitrary Shapes, 1981]
Credit slide: C. Grauman
15
Example: Circle model
angle   rx      ry
0       1       0
45      0.7     0.7
90      0       1
135     -0.7    0.7
        0.7     -0.7
…
270
Query
P1 = 0: R = [rx,ry] = [1,0], C1 = P1 + R
P2 = 45: R = [rx,ry] = [.7,.7], C2 = P2 + R
…
Pk = -180: R = [rx,ry] = [-1,0], Ck = Pk + R
16
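The query step of the circle example can be sketched as a table lookup followed by a vote. This is a minimal sketch under stated assumptions: the table index is treated as an angle in degrees measured at each boundary point, the table holds only the entries listed on the slide, and the names `R_TABLE`, `vote_for_centers` and the query points are hypothetical.

```python
from collections import Counter

# R-table for the circle model: angle (degrees) -> displacement vectors
# [rx, ry] from a boundary point to the reference point.
R_TABLE = {
    0: [(1.0, 0.0)],
    45: [(0.7, 0.7)],
    90: [(0.0, 1.0)],
    135: [(-0.7, 0.7)],
    -180: [(-1.0, 0.0)],
}

def vote_for_centers(edge_points, r_table):
    """Each boundary point P with a known angle looks up R and votes
    for the reference point C = P + R."""
    votes = Counter()
    for (px, py), angle in edge_points:
        for rx, ry in r_table.get(angle, []):
            cx, cy = px + rx, py + ry
            # Round to integer cells so near-identical votes accumulate.
            votes[(round(cx), round(cy))] += 1
    return votes

# Hypothetical query: boundary points of a circle centered at (5, 5),
# plus one spurious point.
edge_points = [
    ((4.0, 5.0), 0),
    ((4.3, 4.3), 45),
    ((5.0, 4.0), 90),
    ((5.7, 4.3), 135),
    ((6.0, 5.0), -180),
    ((1.0, 1.0), 90),
]

votes = vote_for_centers(edge_points, R_TABLE)
center, count = votes.most_common(1)[0]
print(center, count)
```

All consistent boundary points agree on one cell, while the spurious point casts an isolated vote elsewhere.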
Related Work: Implicit shape models
parts with displacement vectors
training image
• Instead of indexing displacements by manually defined parts, index by “visual codeword”
B. Leibe, A. Leonardis, and B. Schiele, Combined Object Categorization and Segmentation with an Implicit Shape Model, ECCV Workshop on Statistical Learning in Computer Vision 2004
Credit slide: S. Lazebnik
17
Implicit shape models: Training
1. Build codebook of patches around extracted interest points using clustering
Credit slide: S. Lazebnik
18
Implicit shape models: Training
1. Build codebook of patches around extracted interest points using clustering
2. For each codebook entry, store all positions relative to object center [center is given]
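The two training steps can be sketched as follows. This is a rough illustration, not the procedure of Leibe et al.: a simple greedy threshold clustering stands in for the clustering step, descriptors are plain tuples, and the names `build_codebook` and `radius` are hypothetical.

```python
import math

def build_codebook(patches, object_center, radius=1.0):
    """patches: list of (descriptor, (x, y)) from one training image.
    Returns codebook entries, each with a representative descriptor and
    the stored displacements from patch position to object center."""
    codebook = []
    cx, cy = object_center
    for descriptor, (x, y) in patches:
        # Step 1 (stand-in clustering): join the nearest codeword within
        # `radius`, otherwise start a new codeword.
        best = None
        for entry in codebook:
            dist = math.dist(descriptor, entry["descriptor"])
            if dist <= radius and (best is None or dist < best[0]):
                best = (dist, entry)
        if best is None:
            entry = {"descriptor": descriptor, "offsets": []}
            codebook.append(entry)
        else:
            entry = best[1]
        # Step 2: store the displacement to the given object center.
        entry["offsets"].append((cx - x, cy - y))
    return codebook

# Hypothetical training image with the object centered at (20, 30).
patches = [
    ((0.0, 0.0), (10, 20)),
    ((0.1, 0.0), (30, 20)),
    ((5.0, 5.0), (20, 40)),
]
codebook = build_codebook(patches, object_center=(20, 30))
for entry in codebook:
    print(entry["descriptor"], entry["offsets"])
```

Storing the displacement from patch to center (rather than from center to patch) is a sign convention chosen here so that a vote at test time is a simple addition.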
19
Implicit shape models
1. Given test image, extract patches, match to codebook entry
2. Cast votes for possible positions of object center
3. Search for maxima in Hough voting space
test image
20
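The three test-time steps can be sketched as below. This is a simplified illustration under stated assumptions: the codebook is hand-made, every vote has unit weight, matching is nearest-neighbor, and the maxima search is a plain argmax over a coarse grid. The names `CODEBOOK`, `match_codeword`, `detect_center` and `cell` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical codebook: each entry has a descriptor and the stored
# displacements from patch position to object center.
CODEBOOK = [
    {"descriptor": (0.0, 0.0), "offsets": [(10, 10), (-10, 10)]},
    {"descriptor": (5.0, 5.0), "offsets": [(0, -10)]},
]

def match_codeword(descriptor, codebook):
    """Step 1: match a test patch to its nearest codebook entry."""
    return min(codebook, key=lambda e: math.dist(descriptor, e["descriptor"]))

def detect_center(test_patches, codebook, cell=5):
    votes = Counter()
    for descriptor, (x, y) in test_patches:
        entry = match_codeword(descriptor, codebook)
        # Step 2: cast one vote per stored displacement.
        for dx, dy in entry["offsets"]:
            votes[((x + dx) // cell, (y + dy) // cell)] += 1
    # Step 3: take the cell with the most votes as the detection.
    (bx, by), count = votes.most_common(1)[0]
    return (bx * cell, by * cell), count

# Hypothetical test image with the object centered at (120, 130).
test_patches = [
    ((0.1, 0.0), (110, 120)),
    ((0.0, 0.1), (130, 120)),
    ((5.0, 5.1), (120, 140)),
]
center, count = detect_center(test_patches, CODEBOOK)
print(center, count)
```

Codewords with several stored displacements also cast votes away from the true center, but those votes do not agree with each other, so the true center still collects the maximum.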
Outline
• Related Work
• Our Method
• Experiments
• Applications
• Conclusion & Future Work
21
Depth Encoded Hough Voting Codebook match
Object hypothesis
Image patch
Detection Score
Position Posterior
Voting Confidence
Codeword From Object Probability
Scale Prior Given Depth
22
Depth Encoded Hough Voting
• Design the scale-to-depth relation s = m(d, l), where m is a 1-to-1 mapping between s & d
23
Object (x) and Patch (l,s) Example
• Object location x
• Patch with center location l and scale s
24
Inferring Depth from Scale
[Figure: scale s versus depth d]
d = m^{-1}(s, l)
25
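A 1-to-1 mapping between scale and depth can be illustrated with a toy model. The slides do not give the form of m, so the sketch below assumes a pinhole-style relation in which patch scale is inversely proportional to depth; the constant `k`, the function names, and the fact that the location l is ignored are all assumptions made for the example.

```python
def scale_from_depth(d, l=None, k=100.0):
    """Hypothetical s = m(d, l): patch scale shrinks as 1/depth.
    The location l mirrors the slide's signature but is unused here."""
    if d <= 0:
        raise ValueError("depth must be positive")
    return k / d

def depth_from_scale(s, l=None, k=100.0):
    """Inverse mapping d = m^{-1}(s, l) for the same assumed model."""
    if s <= 0:
        raise ValueError("scale must be positive")
    return k / s

depth = 4.0
scale = scale_from_depth(depth)
recovered = depth_from_scale(scale)
print(scale, recovered)
```

Because the assumed mapping is strictly monotonic in d, each scale corresponds to exactly one depth, which is the property needed to decode depth from a matched patch scale.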
Inferred Depth Issue: Quantization Error
Object Box
Patch Center
26
Inferred Depth Issue: Phantom Objects
Object Box
Patch Center
S = h/w
 - h: object 2D height
 - w: object height to patch scale ratio
27
Issue of Depth Decoding
Object Box
Patch Center
28
Given Depth Helps Detection
• Using depth to scale mapping s = m(d, l)
Without Depth
With Depth
29
Outline
• Related Work
• Our Method
• Experiments
• Applications
• Conclusion & Future Work
30
Experiments
• Table-top Dataset (New dataset proposed by Sun et al. ECCV’10)
 – 200 table-top objects with dense depths
 – 3~5 object instances, 3 object categories (mice, mugs, staplers)
…
31
Results: Table-top Object
Implicit shape model [Leibe et al. 2004] = baseline method
32
Results: Table-top Object
33
Experiments
• ETHZ Shape mugs (proposed by Ferrari et al.)
• PASCAL VOC 2007 cars
34
Results
•ETHZ Shape Mugs
•PASCAL VOC’07 Cars
35
Outline
• Related Work
• Our Method
• Experiments
• Applications
• Conclusion & Future Work
36
Applications: 6DOF & Pop-Up CAD Model Registration
37
Results
38
Application: Scene Understanding
Object Detector
Layout Estimator
Bao, Sun, and Savarese, CVPR’10
39
Assumptions about objects and scenes
1) objects and their supporting surfaces
2) objects and observer
40
Results
[Figure: precision versus recall]
• 13% improvement over our original detector (41%)
41
Conclusion
• Joint object detection and shape recovery
• Improve detection performance given:
 - depth in training
 - depth both in training and testing
• Applications:
 - 6DOF pose estimation
 - Object pop-up
 - Scene understanding
42
Future work
• Use more 3D information, such as curvature and surface normals, to improve detection
• Build a system that lets users easily generate visually pleasing object pop-ups
43
Vision Lab
Thank You
Acknowledgements
Work partially supported by: NSF (Grant CNS 0931474) and Gigascale Systems Research Center.
44